From tobias.hartmann at oracle.com  Mon Jan  4 08:15:28 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 4 Jan 2016 09:15:28 +0100
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
Message-ID: <568A2A20.7030601@oracle.com>

Hi,

please approve and review the following backports to 8u.

8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
https://bugs.openjdk.java.net/browse/JDK-8144487
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407

8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
https://bugs.openjdk.java.net/browse/JDK-8145754
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522

Nightly testing showed no problems and the changes apply cleanly to 8u-dev.

Thanks,
Tobias

From tobias.hartmann at oracle.com  Mon Jan  4 09:30:33 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 4 Jan 2016 10:30:33 +0100
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
In-Reply-To: <568A2EE7.4030600@oracle.com>
References: <568A2A20.7030601@oracle.com> <568A2EE7.4030600@oracle.com>
Message-ID: <568A3BB9.1010501@oracle.com>

Hi David,

sure, I included the links to the code review:

8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
https://bugs.openjdk.java.net/browse/JDK-8144487
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020503.html
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407

8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
https://bugs.openjdk.java.net/browse/JDK-8145754
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020502.html
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522

Thanks,
Tobias

On 04.01.2016 09:35, david buck wrote:
> Hi Tobias!
> 
> Would you please include links to the code review threads on mail.openjdk.java.net?
> 
> [ JDK 8 Updates: Push Approval Request Template ]
> http://openjdk.java.net/projects/jdk8u/approval-template.html
> 
> Cheers,
> -Buck
> 
> On 2016/01/04 17:15, Tobias Hartmann wrote:
>> Hi,
>>
>> please approve and review the following backports to 8u.
>>
>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>> https://bugs.openjdk.java.net/browse/JDK-8144487
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>
>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>> https://bugs.openjdk.java.net/browse/JDK-8145754
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>
>> Nightly testing showed no problems and the changes apply cleanly to 8u-dev.
>>
>> Thanks,
>> Tobias
>>

From tobias.hartmann at oracle.com  Mon Jan  4 10:04:06 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 4 Jan 2016 11:04:06 +0100
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
In-Reply-To: <568A41D7.2030503@oracle.com>
References: <568A2A20.7030601@oracle.com> <568A2EE7.4030600@oracle.com>
	<568A3BB9.1010501@oracle.com> <568A41D7.2030503@oracle.com>
Message-ID: <568A4396.7070301@oracle.com>

Thanks, David! I will push this to 8u-dev as soon as I get a peer review for the backport.

Best,
Tobias

On 04.01.2016 10:56, david buck wrote:
> approved for backport to 8u-dev
> 
> Thank you for adding the review links.
> 
> Cheers,
> -Buck
> 
> On 2016/01/04 18:30, Tobias Hartmann wrote:
>> Hi David,
>>
>> sure, I included the links to the code review:
>>
>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>> https://bugs.openjdk.java.net/browse/JDK-8144487
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020503.html
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>
>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>> https://bugs.openjdk.java.net/browse/JDK-8145754
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020502.html
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>
>> Thanks,
>> Tobias
>>
>> On 04.01.2016 09:35, david buck wrote:
>>> Hi Tobias!
>>>
>>> Would you please include links to the code review threads on mail.openjdk.java.net?
>>>
>>> [ JDK 8 Updates: Push Approval Request Template ]
>>> http://openjdk.java.net/projects/jdk8u/approval-template.html
>>>
>>> Cheers,
>>> -Buck
>>>
>>> On 2016/01/04 17:15, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please approve and review the following backports to 8u.
>>>>
>>>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>>>> https://bugs.openjdk.java.net/browse/JDK-8144487
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>>>
>>>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>>>> https://bugs.openjdk.java.net/browse/JDK-8145754
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>>>
>>>> Nightly testing showed no problems and the changes apply cleanly to 8u-dev.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>

From tobias.hartmann at oracle.com  Mon Jan  4 11:35:43 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 4 Jan 2016 12:35:43 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
	<A1E3C15B-7034-4987-9591-72C7706AD126@oracle.com>
	<C7AC53D7-1AA3-425C-9DD1-784A11DDFE8B@oracle.com>
	<56139149.5080906@oracle.com>
	<B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
Message-ID: <568A590F.6030104@oracle.com>

Hi Roland,

sorry for the delay.

On 07.10.2015 11:06, Roland Westrelin wrote:
>>> Maybe we could add an IfProjNode::Ideal method that disconnects the other branch of the If when this branch is always taken and that does so even during parsing. Given Ideal is called before Identity, that would guarantee the next call to Identity optimizes the If out.
>>
>> As you suggested, I added an IfProjNode::Ideal that disconnects the never taken branch from the IfNode. The subsequent call to Identity then removes the IfNode:
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.03/
>>
>> However, I wondered if this is "legal" because the comment in Node::ideal says:
>>
>> // The Ideal call almost arbitrarily reshape the graph rooted at the 'this'
>> // pointer.
>>
>> But we are changing the graph "above" the this pointer. I executed tests with -XX:+VerifyIterativeGVN and everything seems to work fine.
>> Another solution would be to cut the *current* branch if it is never taken:
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.02/
>>
>> But this solution depends on the assumption that we execute the identity() of the other ProjNode which is not guaranteed by GVN (I think).
>>
>> Therefore I would like to go for webrev.03. I verified that this solves the problem and tested the fix with JPRT.
> 
> I thought about this more and I don't think either works OK.
> 
> The problem with webrev.02 is that, depending on the order in which the projection nodes are allocated and transformed, the optimization may not happen:
> 
> Node* never_taken = new IfTrueNode(..);
> Node* always_taken = new IfFalseNode(..);
> always_taken = gvn.transform(always_taken);
> never_taken = gvn.transform(never_taken);
> 
> The problem with webrev.03 is that we may change a node that is not yet transformed (never_taken changed by the call to gvn.transform(always_taken)). Not sure if it could break existing code, but it's clearly unexpected behavior.

Right, that could be a problem.

> Another way would be to remove the in(0)->outcnt() == 1 check from IfProjNode::Identity() and, in an IfProjNode::Ideal method, do what you do in webrev.03 but only when can_reshape is true.

Here is the new webrev:
http://cr.openjdk.java.net/~thartmann/8136469/webrev.04/

However, I'm afraid that this re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Right?

Thanks,
Tobias

From paul.sandoz at oracle.com  Mon Jan  4 11:42:15 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Mon, 4 Jan 2016 12:42:15 +0100
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
Message-ID: <0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>

Hi,

> On 31 Dec 2015, at 22:33, John Rose <john.r.rose at oracle.com> wrote:
> 
> When performing explicit range checks in pre-intrinsic code,
> let's try to use the new intrinsic functions in java.util.Objects,
> called checkIndex, checkFromToIndex, and checkFromIndexSize.

At the moment only checkIndex is a C2 intrinsic; we could revisit making the others intrinsics as well, based on use cases.


> These are simpler, safer, and more maintainable than our previous
> practice of using hand-written "random logic", such as in this bug:
>  http://hg.openjdk.java.net/jdk9/hs-comp/jdk/rev/cb31a76eecd1#l1.52
> 

Yes, in this case I believe the calls to cryptBlockCheck

 cryptBlockCheck(in, inOff, len);
 cryptBlockCheck(out, outOff, len);
 return implCrypt(in, inOff, len, out, outOff);

could be replaced with:

 Objects.checkFromIndexSize(inOff, len, in.length, <BiFunction>);
 Objects.checkFromIndexSize(outOff, len, out.length, <BiFunction>);
 return implCrypt(in, inOff, len, out, outOff);
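
As a minimal sketch of that replacement (not the pushed code), the following uses the three-argument overload Objects.checkFromIndexSize(fromIndex, size, length), which is public in java.util.Objects as of JDK 9, rather than the <BiFunction> form, so a failed check throws the default IndexOutOfBoundsException. The class name CryptRangeCheckSketch and the body of implCrypt are placeholders made up for illustration.

```java
import java.util.Objects;

final class CryptRangeCheckSketch {

    // Placeholder for the real implCrypt: it only copies bytes so the sketch can run.
    static int implCrypt(byte[] in, int inOff, int len, byte[] out, int outOff) {
        System.arraycopy(in, inOff, out, outOff, len);
        return len;
    }

    // Validate both array ranges up front, then call the worker method.
    static int crypt(byte[] in, int inOff, int len, byte[] out, int outOff) {
        Objects.checkFromIndexSize(inOff, len, in.length);
        Objects.checkFromIndexSize(outOff, len, out.length);
        return implCrypt(in, inOff, len, out, outOff);
    }

    public static void main(String[] args) {
        byte[] in = {1, 2, 3, 4};
        byte[] out = new byte[4];
        System.out.println("processed: " + crypt(in, 1, 2, out, 0));
        try {
            crypt(in, 3, 2, out, 0); // 3 + 2 exceeds in.length, so the first check fails
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the checks in a small Java wrapper and the work in a separate method follows the pattern discussed in this thread: the wrapper stays ordinary Java, and only the worker is a candidate for replacement by an intrinsic.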

Paul.


> Depending on the documented API, it is usually enough that the
> thrown exception be a RuntimeException of any sort.  By default,
> the methods throw a generic IndexOutOfBoundsException.
> In cases where a particular exception must be thrown, the Objects
> methods provide an optional "hook" for building the desired exception.
> 
> In this case, since the code is already pushed, we should clean it
> up as part of this bug:
>  https://bugs.openjdk.java.net/browse/JDK-8135250
> 
> - John
> 
> On Dec 29, 2015, at 9:33 AM, Kharbas, Kishor <kishor.kharbas at intel.com> wrote:
>> 
>> That's great.. Thank you!
>> 
>> I will keep the jcheck tip in mind for next time :)
>> 
>> - Kishor
>> 
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Tuesday, December 29, 2015 12:47 AM
>> To: Kharbas, Kishor; hotspot-compiler-dev at openjdk.java.net
>> Cc: Anthony Scarpino
>> Subject: Re: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
>> 
>> Hi Kishor,
>> 
>> There were coding style problems, which I fixed. Please do this cleanup in the future (use jcheck).
>> 
>> src/cpu/x86/vm/stubGenerator_x86_32.cpp:2144: Trailing whitespace
>> src/cpu/x86/vm/stubGenerator_x86_64.cpp:3061: Trailing whitespace
>> src/cpu/x86/vm/stubRoutines_x86.hpp:36: Trailing whitespace
>> src/cpu/x86/vm/vm_version_x86.cpp:709: Trailing whitespace
>> src/share/vm/opto/library_call.cpp:702: Trailing whitespace
>> src/share/vm/opto/runtime.hpp:317: Trailing whitespace
>> 
>> src/share/vm/opto/library_call.cpp:5789: Carriage return (^M)
>> 
>> I submitted the push job. Let's see how it goes.
>> 
>> Regards,
>> Vladimir
>> 
>> On 12/28/15 8:48 PM, Kharbas, Kishor wrote:
>>> Vladimir, sorry that file was added accidentally.
>>> Here is an updated patch -
>>> http://cr.openjdk.java.net/~vdeshpande/8143925/webrev.01/
>>> 
>>> This patch includes,
>>> 1. Changes to some comments.
>>> 2. Small correction in vm_version_x86.cpp.
>>> 3. Removal of version.rc file.
>>> 
>>> Thanks for reviewing the code.
>>> 
>>> Kishor
>>> 
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Thursday, December 24, 2015 4:36 PM
>>> To: Kharbas, Kishor; hotspot-compiler-dev at openjdk.java.net
>>> Cc: Anthony Scarpino
>>> Subject: Re: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
>>> 
>>> What are the changes in src/os/windows/vm/version.rc?
>>> 
>>> Otherwise this looks good.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 12/24/15 2:26 PM, Kharbas, Kishor wrote:
>>>> Hello all,
>>>> 
>>>> Thank you Vladimir and Anthony for your inputs so far.
>>>> I have updated the hotspot based on the suggestions and also added CTR mode to jtreg test.
>>>> 
>>>> During testing I also noticed that the Java code for CounterMode.crypt() uses the partially used encrypted counter from previous invocation and also saves the last encryptedCounter for next invocation.
>>>> This case was not handled by the intrinsic. I have fixed this in the latest patch.
>>>> 
>>>> Summary of changes:
>>>> 1. Proper disabling of UseAESCTRIntrinsics flag based on hardware support
>>>> 2. Adding the missing support explained above.
>>>> 3. Added CTR mode in jtreg test 7184394
>>>> 4. Added and changed some encodings (pextr and pinsr) in assembler_x86.cpp
>>>> 
>>>> The updated hotspot webrev is at :
>>>> http://cr.openjdk.java.net/~vdeshpande/8143925/webrev.00/
>>>> There is no update to jdk webrev posted earlier which is
>>>> http://cr.openjdk.java.net/~mcberg/8143925/jdk/webrev.02/
>>>> Bug id : https://bugs.openjdk.java.net/browse/JDK-8143925
>>>> 
>>>> Much appreciated!
>>>> 
>>>> Happy holidays!
>>>> Kishor
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Friday, December 04, 2015 3:59 PM
>>>> To: Kharbas, Kishor; hotspot-compiler-dev at openjdk.java.net
>>>> Cc: Anthony Scarpino
>>>> Subject: Re: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
>>>> 
>>>> jdk: http://cr.openjdk.java.net/~mcberg/8143925/jdk/webrev.02/
>>>> 
>>>> JDK changes look good to me.
>>>> 
>>>> hotspot:
>>>> http://cr.openjdk.java.net/~mcberg/8143925/hotspot/webrev.04/
>>>> 
>>>> Please set the flag to 'false' on platforms which do not support this
>>>> intrinsic:
>>>> 
>>>>    if (UseAESCTRIntrinsics) {
>>>>      warning("AES/CTR intrinsics are not available on this CPU");
>>>>      FLAG_SET_DEFAULT(UseAESCTRIntrinsics, false);
>>>>    }
>>>> 
>>>> Also Anthony asked to add test for this intrinsic. Please do it:
>>>> 
>>>> "2) It would be good to add CTR to the TestAES tests.  It's in hotspot/test/compiler/codegen/7184394/.  The test currently has CBC, ECB, and GCM in it, so it should be easy.  It's also the only test I know of that tests the intrinsic.  None of the tests in the jdk repo that I know of loop enough to trigger the intrinsic."
>>>> 
>>>> Thanks,
>>>> Vladimir
>>>> 
>>>> On 12/4/15 1:40 PM, Kharbas, Kishor wrote:
>>>>> Thanks Vladimir for the feedback!
>>>>> 
>>>>> I have updated the jbs entry with the new patch.
>>>>> 
>>>>> JDK changes : added range checks in the JDK using additional methods.
>>>>> Hotspot changes : renamed the UseCTRAESIntrinsics flag to
>>>>> UseAESCTRIntrinsics
>>>>> 
>>>>> Further review and feedback is appreciated!
>>>>> 
>>>>> - Kishor
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>> Sent: Tuesday, December 01, 2015 5:32 PM
>>>>> To: Kharbas, Kishor; hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
>>>>> 
>>>>> Hotspot changes seem fine. But the JDK changes should have an additional method for range checks - this is a new requirement for intrinsics which access arrays. See, for example, cryptBlockCheck() in AESCrypt.java.
>>>>> 
>>>>> Thanks,
>>>>> Vladimir
>>>>> 
>>>>> On 11/24/15 2:33 PM, Kharbas, Kishor wrote:
>>>>>> Hello all,
>>>>>> 
>>>>>> I request the community to review a patch for enhancing
>>>>>> CounterMode.crypt() for AES. This patch defines intrinsic for
>>>>>> CounterMode.crypt() to leverage the parallel nature of AES in
>>>>>> Counter (CTR) Mode.
>>>>>> 
>>>>>> This is achieved by operating on 6 blocks in parallel to issue
>>>>>> independent x86 AES-NI instructions and keep the CPU pipeline full.
>>>>>> 
>>>>>> Testing on micro-benchmark has shown a speedup of 4x-6x.
>>>>>> 
>>>>>> Bug id:
>>>>>> 
>>>>>>      https://bugs.openjdk.java.net/browse/JDK-8143925
>>>>>> 
>>>>>> Webrev:
>>>>>> 
>>>>>>     hotspot:
>>>>>> http://cr.openjdk.java.net/~mcberg/8143925/hotspot/webrev.02/
>>>>>> 
>>>>>>     jdk:
>>>>>> http://cr.openjdk.java.net/~mcberg/8143925/jdk/webrev.01/
>>>>>> 
>>>>>> Much appreciated!
>>>>>> 
>>>>>> Kishor Kharbas
>>>>>> 
> 


From vladimir.kozlov at oracle.com  Mon Jan  4 15:48:08 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 4 Jan 2016 07:48:08 -0800
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
In-Reply-To: <568A3BB9.1010501@oracle.com>
References: <568A2A20.7030601@oracle.com> <568A2EE7.4030600@oracle.com>
	<568A3BB9.1010501@oracle.com>
Message-ID: <568A9438.3010400@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/4/16 1:30 AM, Tobias Hartmann wrote:
> Hi David,
>
> sure, I included the links to the code review:
>
> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
> https://bugs.openjdk.java.net/browse/JDK-8144487
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020503.html
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>
> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
> https://bugs.openjdk.java.net/browse/JDK-8145754
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020502.html
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>
> Thanks,
> Tobias
>
> On 04.01.2016 09:35, david buck wrote:
>> Hi Tobias!
>>
>> Would you please include links to the code review threads on mail.openjdk.java.net?
>>
>> [ JDK 8 Updates: Push Approval Request Template ]
>> http://openjdk.java.net/projects/jdk8u/approval-template.html
>>
>> Cheers,
>> -Buck
>>
>> On 2016/01/04 17:15, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please approve and review the following backports to 8u.
>>>
>>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>>> https://bugs.openjdk.java.net/browse/JDK-8144487
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>>
>>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>>> https://bugs.openjdk.java.net/browse/JDK-8145754
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>>
>>> Nightly testing showed no problems and the changes apply cleanly to 8u-dev.
>>>
>>> Thanks,
>>> Tobias
>>>

From christian.thalinger at oracle.com  Mon Jan  4 17:16:59 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 4 Jan 2016 07:16:59 -1000
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
Message-ID: <2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>


> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
> 
> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support.
> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
> 
> This change removes the JVMCI command line option support.
> 
> https://bugs.openjdk.java.net/browse/JDK-8146001
> http://cr.openjdk.java.net/~dnsimon/8146001/

+    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);

+    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);

We should either use the jvmci. prefix or not.

src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethod.java:

-        @Option(help = "", type = OptionType.Debug)
-        public static final OptionValue<Boolean> UseProfilingInformation = new OptionValue<>(true);

We are using this flag so we need to keep it.

> 
> -Doug


From christian.thalinger at oracle.com  Mon Jan  4 17:19:32 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 4 Jan 2016 07:19:32 -1000
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
Message-ID: <A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>


> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>> 
>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support.
>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>> 
>> This change removes the JVMCI command line option support.
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8146001
>> http://cr.openjdk.java.net/~dnsimon/8146001/
> 
> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
> 
> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
> 
> We should either use the jvmci. prefix or not.

Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.

> 
> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethod.java:
> 
> -        @Option(help = "", type = OptionType.Debug)
> -        public static final OptionValue<Boolean> UseProfilingInformation = new OptionValue<>(true);
> 
> We are using this flag so we need to keep it.
> 
>> 
>> -Doug


From christian.thalinger at oracle.com  Mon Jan  4 17:41:39 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 4 Jan 2016 07:41:39 -1000
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
	<A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
Message-ID: <8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>


> On Jan 4, 2016, at 7:19 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>>> 
>>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support.
>>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>>> 
>>> This change removes the JVMCI command line option support.
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8146001
>>> http://cr.openjdk.java.net/~dnsimon/8146001/
>> 
>> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
>> 
>> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>> 
>> We should either use the jvmci. prefix or not.
> 
> Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.

I think we should prefix the property name in getBooleanProperty:

+    public static boolean getBooleanProperty(String name, boolean def) {
+        String value = VM.getSavedProperty("jvmci." + name);

and I put UseProfilingInformation back:

diff -r 0fcfe4b07f7e src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
--- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Tue Dec 29 18:30:51 2015 +0100
+++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 04 07:40:46 2016 -1000
@@ -24,7 +24,6 @@ package jdk.vm.ci.hotspot;
 
 import static jdk.vm.ci.hotspot.CompilerToVM.compilerToVM;
 import static jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime;
-import static jdk.vm.ci.hotspot.HotSpotResolvedJavaMethod.Options.UseProfilingInformation;
 import static jdk.vm.ci.hotspot.HotSpotVMConfig.config;
 import static jdk.vm.ci.hotspot.UnsafeAccess.UNSAFE;
 
@@ -65,6 +64,11 @@ import jdk.vm.ci.meta.TriState;
 final class HotSpotResolvedJavaMethodImpl extends HotSpotMethod implements HotSpotResolvedJavaMethod, HotSpotProxified, MetaspaceWrapperObject {
 
     /**
+     * Whether to use profiling information.
+     */
+    private static final boolean UseProfilingInformation = HotSpotJVMCIRuntime.getBooleanProperty("UseProfilingInformation", true);
+
+    /**
      * Reference to metaspace Method object.
      */
     private final long metaspaceMethod;
@@ -424,7 +428,7 @@ final class HotSpotResolvedJavaMethodImp
     public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
         ProfilingInfo info;
 
-        if (UseProfilingInformation.getValue() && methodData == null) {
+        if (UseProfilingInformation && methodData == null) {
             long metaspaceMethodData = UNSAFE.getAddress(metaspaceMethod + config().methodDataOffset);
             if (metaspaceMethodData != 0) {
                 methodData = new HotSpotMethodData(metaspaceMethodData, this);
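
For illustration only, a self-contained sketch of a helper that applies the jvmci. prefix itself, so that call sites pass the bare option name as in the diff above. It is not the code in the webrev: System.getProperty stands in for sun.misc.VM.getSavedProperty (which is JDK-internal), and the class name JvmciPropertySketch and the parsing rule for non-boolean values are assumptions.

```java
final class JvmciPropertySketch {

    // Prefix applied inside the helper so callers pass only the bare option name.
    private static final String PREFIX = "jvmci.";

    static boolean getBooleanProperty(String name, boolean def) {
        // Assumption: System.getProperty stands in for sun.misc.VM.getSavedProperty,
        // which is JDK-internal and reads the values saved at VM startup.
        String value = System.getProperty(PREFIX + name);
        if (value == null) {
            return def; // property not set: fall back to the default
        }
        // Assumption: any value other than "true" (ignoring case) reads as false.
        return Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        System.setProperty("jvmci.UseProfilingInformation", "false");
        System.out.println(getBooleanProperty("UseProfilingInformation", true));
        System.out.println(getBooleanProperty("NoSuchOption", true));
    }
}
```

With the prefix added in one place, a caller cannot end up with one property spelled with the prefix and another without it.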

> 
>> 
>> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethod.java:
>> 
>> -        @Option(help = "", type = OptionType.Debug)
>> -        public static final OptionValue<Boolean> UseProfilingInformation = new OptionValue<>(true);
>> 
>> We are using this flag so we need to keep it.
>> 
>>> 
>>> -Doug


From john.r.rose at oracle.com  Mon Jan  4 20:12:28 2016
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 4 Jan 2016 12:12:28 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
Message-ID: <DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>

On Jan 4, 2016, at 3:42 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> 
> Hi,
> 
>> On 31 Dec 2015, at 22:33, John Rose <john.r.rose at oracle.com> wrote:
>> 
>> When performing explicit range checks in pre-intrinsic code,
>> let's try to use the new intrinsic functions in java.util.Objects,
>> called checkIndex, checkFromToIndex, and checkFromIndexSize.
> 
> At the moment only checkIndex is a C2 intrinsic; we could revisit making the others intrinsics as well, based on use cases.

Corrected, thanks.  They don't need to be intrinsics if they optimize well.
The point is that the library functions have code shapes which work well
with the JIT.  For example, the multi-index checks might (as in Kishor's code)
be implemented on top of the single-index check, without themselves being
intrinsics.

> 
>> These are simpler, safer, and more maintainable than our previous
>> practice of using hand-written "random logic", such as in this bug:
>>  http://hg.openjdk.java.net/jdk9/hs-comp/jdk/rev/cb31a76eecd1#l1.52
>> 
> 
> Yes, in this case I believe the calls to cryptBlockCheck
> 
>  cryptBlockCheck(in, inOff, len);
>  cryptBlockCheck(out, outOff, len);
>  return implCrypt(in, inOff, len, out, outOff);
> 
> could be replaced with:
> 
>  Objects.checkFromIndexSize(inOff, len, in.length, <BiFunction>);
>  Objects.checkFromIndexSize(outOff, len, out.length, <BiFunction>);
>  return implCrypt(in, inOff, len, out, outOff);

Yes.  And if that doesn't produce clean code, it's a JIT bug, not a JDK bug.

One caveat:  If the BiFunction must produce a message with the index,
it is not a constant and we might have potential capture costs.
The correct trade-off here is to either simplify the message,
or ask the JIT to scalarize (EA-away) the closure node on the hot path,
or even add another entry point (with a TriFunction, perhaps).
As I pointed out below, we can just simplify the message.
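
To make the trade-off concrete, here is a rough sketch under stated assumptions. The helper checkFromIndexSize below is hypothetical, not a JDK API, and it takes a Supplier instead of the BiFunction discussed here purely to keep the example short. The first hook refers to no local variables and can be a shared constant; the second closes over the method's arguments to build its message, so a closure object may be created on each call unless the JIT removes it.

```java
import java.util.function.Supplier;

final class ExceptionHookSketch {

    // Hypothetical helper (not a JDK API): range check with a caller-supplied exception hook.
    static int checkFromIndexSize(int fromIndex, int size, int length,
                                  Supplier<? extends RuntimeException> onFailure) {
        if ((length | fromIndex | size) < 0 || size > length - fromIndex) {
            throw onFailure.get();
        }
        return fromIndex;
    }

    // Non-capturing hook: simplified message, no reference to any local variable.
    static final Supplier<RuntimeException> SIMPLE =
            () -> new IllegalArgumentException("input range out of bounds");

    static void checkSimple(byte[] in, int inOff, int len) {
        checkFromIndexSize(inOff, len, in.length, SIMPLE);
    }

    static void checkDetailed(byte[] in, int inOff, int len) {
        // Capturing hook: the lambda closes over in, inOff and len to format the message.
        checkFromIndexSize(inOff, len, in.length,
                () -> new IllegalArgumentException("range of size " + len + " at offset "
                        + inOff + " is out of bounds for length " + in.length));
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        checkSimple(buf, 2, 4); // in range, returns normally
        try {
            checkDetailed(buf, 6, 4);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```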

- John

> Paul.
> 
> 
>> Depending on the documented API, it is usually enough that the
>> thrown exception be a RuntimeException of any sort.  By default,
>> the methods throw a generic IndexOutOfBoundsException.
>> In cases where a particular exception must be thrown, the Objects
>> methods provide an optional "hook" for building the desired exception.
>> 
>> In this case, since the code is already pushed, we should clean it
>> up as part of this bug:
>>  https://bugs.openjdk.java.net/browse/JDK-8135250
>> 
>> - John


From doug.simon at oracle.com  Mon Jan  4 22:31:14 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 4 Jan 2016 23:31:14 +0100
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
	<A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
	<8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>
Message-ID: <1297DA97-3C65-403D-AB46-16E203A74F26@oracle.com>


> On 04 Jan 2016, at 18:41, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 4, 2016, at 7:19 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>>>> 
>>>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support. 
>>>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>>>> 
>>>> This change removes the JVMCI command line option support.
>>>> 
>>>> https://bugs.openjdk.java.net/browse/JDK-8146001
>>>> http://cr.openjdk.java.net/~dnsimon/8146001/
>>> 
>>> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
>>> 
>>> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>>> 
>>> We should either use the jvmci. prefix or not.
>> 
>> Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.
> 
> I think we should prefix the property name in getBooleanProperty:
> 
> +    public static boolean getBooleanProperty(String name, boolean def) {
> +        String value = VM.getSavedProperty("jvmci." + name);

Ok, sounds reasonable.
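
A rough sketch of the prefixing idea, for illustration only. It reads
System.getProperty rather than sun.misc.VM.getSavedProperty (which is
JDK-internal), so unlike the real helper it can be overridden by
application code. The class name JvmciProperties and the handling of
unparsable values are assumptions, not taken from the webrev:

```java
// Sketch only: System.getProperty stands in for sun.misc.VM.getSavedProperty.
final class JvmciProperties {

    // Prefix added by the helper so that callers pass the bare option name.
    static final String PREFIX = "jvmci.";

    // Returns the boolean value of the prefixed property, or def if it is unset.
    static boolean getBooleanProperty(String name, boolean def) {
        String value = System.getProperty(PREFIX + name);
        if (value == null) {
            return def;
        }
        // Assumption: anything other than "true" (ignoring case) counts as false.
        return Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        // Run with -Djvmci.ImplicitStableValues=false to flip the value.
        System.out.println(getBooleanProperty("ImplicitStableValues", true));
    }
}
```

Callers then name an option once, without repeating the prefix at every
use site.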

> 
> and I put UseProfilingInformation back:
> 
> diff -r 0fcfe4b07f7e src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Tue Dec 29 18:30:51 2015 +0100
> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 04 07:40:46 2016 -1000
> @@ -24,7 +24,6 @@ package jdk.vm.ci.hotspot;
>  
>  import static jdk.vm.ci.hotspot.CompilerToVM.compilerToVM;
>  import static jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime;
> -import static jdk.vm.ci.hotspot.HotSpotResolvedJavaMethod.Options.UseProfilingInformation;
>  import static jdk.vm.ci.hotspot.HotSpotVMConfig.config;
>  import static jdk.vm.ci.hotspot.UnsafeAccess.UNSAFE;
>  
> @@ -65,6 +64,11 @@ import jdk.vm.ci.meta.TriState;
>  final class HotSpotResolvedJavaMethodImpl extends HotSpotMethod implements HotSpotResolvedJavaMethod, HotSpotProxified, MetaspaceWrapperObject {
>  
>      /**
> +     * Whether to use profiling information.
> +     */
> +    private static final boolean UseProfilingInformation = HotSpotJVMCIRuntime.getBooleanProperty("UseProfilingInformation", true);
> +
> +    /**
>       * Reference to metaspace Method object.
>       */
>      private final long metaspaceMethod;
> @@ -424,7 +428,7 @@ final class HotSpotResolvedJavaMethodImp
>      public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>          ProfilingInfo info;
>  
> -        if (UseProfilingInformation.getValue() && methodData == null) {
> +        if (UseProfilingInformation && methodData == null) {
>              long metaspaceMethodData = UNSAFE.getAddress(metaspaceMethod + config().methodDataOffset);
>              if (metaspaceMethodData != 0) {
>                  methodData = new HotSpotMethodData(metaspaceMethodData, this);

JVMCI should unconditionally return available profiling information. It's up to the compiler whether or not to use it. For example, this is now compilation local in Graal:

http://hg.openjdk.java.net/graal/graal-compiler/rev/f35e653aa876#l16.16

-Doug

From vladimir.kozlov at oracle.com  Mon Jan  4 22:46:48 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 4 Jan 2016 14:46:48 -0800
Subject: RFR(S): 8139771: Eliminating CastPP nodes at Phis when they all
	come from a unique input may cause crash
In-Reply-To: <FA498C77-467D-48C5-948B-FE6DB13F061D@oracle.com>
References: <F1E9591F-712C-4FC1-A4D7-2B6254528FA3@oracle.com>
	<56623E4A.9040504@oracle.com>
	<AC89704A-46C3-4FA3-8718-17E3F77A95D1@oracle.com>
	<56663B29.7050508@oracle.com>
	<5021FF7F-DA52-44D0-A7E5-DAEFFC5992C1@oracle.com>
	<566F45F9.5000304@oracle.com>
	<B4621978-98E6-41CB-865E-42279339F0BA@oracle.com>
	<5670BF83.7060907@oracle.com>
	<FA498C77-467D-48C5-948B-FE6DB13F061D@oracle.com>
Message-ID: <568AF658.8020607@oracle.com>

The comment is wrong. I added it for the following change:

https://bugs.openjdk.java.net/browse/JDK-7004535

But I later removed that change because of the following bug and did not
update the comment:

https://bugs.openjdk.java.net/browse/JDK-7068051

"The code added in 7004535 changes does not take into account that 
cloning/moving predicates below merge points invalidate jvm states 
recorded in corresponding uncommon traps. Phi nodes should be created 
for values referenced by predicate's uncommon traps when a predicate is 
cloned."

"Remove predicate cloning from loop peeling optimization and from split 
fall-in paths. Leave it in loop unswitching code which is safe. Don't 
allow split loop entry path in IGVN optimization for Phi nodes. And do 
not clone predicates below merge points in split-if optimization. Remove 
move_loop_predicate() and eliminate_loop_predicates() unused methods."

Thanks,
Vladimir

On 12/16/15 12:49 AM, Roland Westrelin wrote:
>>> For reference, current webrev:
>>>
>>> http://cr.openjdk.java.net/~roland/8139771/webrev.01/
>>>
>>>>> As you suggested I made CheckCastPP inherit from ConstraintCast. I also hit the following bug: one iteration of a loop is peeled which causes a CastPP to be pinned between the loop and the predicates. When a predicate that depends on the CastPP is moved out of the loop, it is moved above the CastPP. I fixed it by marking all nodes that depend on a node pinned between a loop and the predicates as non loop invariant. I don't think fixing it by moving the cast up above the predicates is a safe fix in general.
>>>>
>>>> Hmm. The test which depends on CastPP should be also peeled and it will dominate the test in main loop. If a test/predicate could be moved from main loop then it should be possible to use peeled one. What do you think?
>>>
>>> Let me take another look at this.
>>> Independently: so we never apply loop predication before peeling? Otherwise moving the peeled body before the loop predicate could be incorrect, right (predicates could have been moved out of the body before it's peeled)?
>>
>> We never peel before predicates. Peeling does not know about them. The peeled iteration is placed between predicates and peeled loop head.
>
> The comment in PhaseIdealLoop::do_peeling() implies that the peeled iteration is above the predicates. We can apply loop predication then peeling. If the peeled iteration is above the predicates, isn't there a risk the peeled iteration is executed before a predicate it depends on for correctness?
>
> Roland.
>
>>
>> Vladimir
>>
>>>
>>> Roland.
>

From christian.thalinger at oracle.com  Mon Jan  4 22:47:34 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 4 Jan 2016 12:47:34 -1000
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <1297DA97-3C65-403D-AB46-16E203A74F26@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
	<A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
	<8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>
	<1297DA97-3C65-403D-AB46-16E203A74F26@oracle.com>
Message-ID: <6C07E8DD-50D4-4B2E-BD8E-B131579A9664@oracle.com>


> On Jan 4, 2016, at 12:31 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 04 Jan 2016, at 18:41, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 4, 2016, at 7:19 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>>>>> 
>>>>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support. 
>>>>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>>>>> 
>>>>> This change removes the JVMCI command line option support.
>>>>> 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8146001
>>>>> http://cr.openjdk.java.net/~dnsimon/8146001/
>>>> 
>>>> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
>>>> 
>>>> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>>>> 
>>>> We should either use the jvmci. prefix or not.
>>> 
>>> Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.
>> 
>> I think we should prefix the property name in getBooleanProperty:
>> 
>> +    public static boolean getBooleanProperty(String name, boolean def) {
>> +        String value = VM.getSavedProperty("jvmci." + name);
> 
> Ok, sounds reasonable.
> 
>> 
>> and I put UseProfilingInformation back:
>> 
>> diff -r 0fcfe4b07f7e src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Tue Dec 29 18:30:51 2015 +0100
>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 04 07:40:46 2016 -1000
>> @@ -24,7 +24,6 @@ package jdk.vm.ci.hotspot;
>> 
>> import static jdk.vm.ci.hotspot.CompilerToVM.compilerToVM;
>> import static jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime;
>> -import static jdk.vm.ci.hotspot.HotSpotResolvedJavaMethod.Options.UseProfilingInformation;
>> import static jdk.vm.ci.hotspot.HotSpotVMConfig.config;
>> import static jdk.vm.ci.hotspot.UnsafeAccess.UNSAFE;
>> 
>> @@ -65,6 +64,11 @@ import jdk.vm.ci.meta.TriState;
>> final class HotSpotResolvedJavaMethodImpl extends HotSpotMethod implements HotSpotResolvedJavaMethod, HotSpotProxified, MetaspaceWrapperObject {
>> 
>>     /**
>> +     * Whether to use profiling information.
>> +     */
>> +    private static final boolean UseProfilingInformation = HotSpotJVMCIRuntime.getBooleanProperty("UseProfilingInformation", true);
>> +
>> +    /**
>>      * Reference to metaspace Method object.
>>      */
>>     private final long metaspaceMethod;
>> @@ -424,7 +428,7 @@ final class HotSpotResolvedJavaMethodImp
>>     public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>         ProfilingInfo info;
>> 
>> -        if (UseProfilingInformation.getValue() && methodData == null) {
>> +        if (UseProfilingInformation && methodData == null) {
>>             long metaspaceMethodData = UNSAFE.getAddress(metaspaceMethod + config().methodDataOffset);
>>             if (metaspaceMethodData != 0) {
>>                 methodData = new HotSpotMethodData(metaspaceMethodData, this);
> 
> JVMCI should unconditionally return available profiling information. It's up to the compiler whether or not to use it. For example, this is now compilation local in Graal:
> 
> http://hg.openjdk.java.net/graal/graal-compiler/rev/f35e653aa876#l16.16

Oh, I missed that.  Yes, that works for us as well.  Thanks for pointing that out.

> 
> -Doug


From vladimir.kozlov at oracle.com  Mon Jan  4 23:52:14 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 4 Jan 2016 15:52:14 -0800
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568A590F.6030104@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
	<A1E3C15B-7034-4987-9591-72C7706AD126@oracle.com>
	<C7AC53D7-1AA3-425C-9DD1-784A11DDFE8B@oracle.com>
	<56139149.5080906@oracle.com>
	<B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
	<568A590F.6030104@oracle.com>
Message-ID: <568B05AE.7020700@oracle.com>

On 1/4/16 3:35 AM, Tobias Hartmann wrote:
> Hi Roland,
>
> sorry for the delay.
>
> On 07.10.2015 11:06, Roland Westrelin wrote:
>>>> Maybe we could add an IfProjNode::Ideal method that disconnects the other branch of the If when this branch is always taken and that does so even during parsing. Given Ideal is called before Identity, that would guarantee the next call to Identity optimizes the If out.
>>>
>>> As you suggested, I added an IfProjNode::Ideal that disconnects the never taken branch from the IfNode. The subsequent call to Identity then removes the IfNode:
>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.03/
>>>
>>> However, I wondered if this is "legal" because the comment in Node::ideal says:
>>>
>>> // The Ideal call almost arbitrarily reshape the graph rooted at the 'this'
>>> // pointer.
>>>
>>> But we are changing the graph "above" the this pointer. I executed tests with -XX:+VerifyIterativeGVN and everything seems to work fine.
>>> Another solution would be to cut the *current* branch if it is never taken:
>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.02/
>>>
>>> But this solution depends on the assumption that we execute the identity() of the other ProjNode which is not guaranteed by GVN (I think).
>>>
>>> Therefore I would like to go for webrev.03. I verified that this solves the problem and tested the fix with JPRT.
>>
>> I thought about this more and I don't think either works ok.
>>
>> The problem with webrev.02 is that, depending on the order in which the projection nodes are allocated and transformed, the optimization may not happen:
>>
>> Node* never_taken = new IfTrueNode(..);
>> Node* always_taken = new IfFalseNode(..);
>> always_taken = gvn.transform(always_taken);
>> never_taken = gvn.transform(never_taken);
>>
>> The problem with webrev.03 is that we may change a node that is not yet transformed (never_taken changed by call to gvn.transform(always_taken)). Not sure if it could break existing code but it's clearly unexpected behavior.
>
> Right, that could be a problem.

I don't see a problem. But IfProjNode::Ideal() should have additional 
checks for that:

   // Check for dead control input
   if (in(0) && remove_dead_region(phase, can_reshape)) {
     return this;
   }
   // Don't bother trying to transform a dead node
   if (in(0) && in(0)->is_top()) {
     return NULL;
   }

Also instead of set_req() use:

PhaseIterGVN* igvn = phase->is_IterGVN();
igvn->replace_input_of(other, 0, phase->C->top());

This way the following gvn.transform(never_taken) call will work fine.

Thanks,
Vladimir

>
>> Another way would be to remove the in(0)->outcnt() == 1 check from IfProjNode::Identity() and in an IfProjNode::Ideal method do what you do in webrev.03 but when can_reshape is true only.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~thartmann/8136469/webrev.04/
>
> However, I'm afraid that this re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Right?
>
> Thanks,
> Tobias
>

From tobias.hartmann at oracle.com  Tue Jan  5 06:03:14 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 5 Jan 2016 07:03:14 +0100
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
In-Reply-To: <568A9438.3010400@oracle.com>
References: <568A2A20.7030601@oracle.com> <568A2EE7.4030600@oracle.com>
	<568A3BB9.1010501@oracle.com> <568A9438.3010400@oracle.com>
Message-ID: <568B5CA2.4090003@oracle.com>

Thanks, Vladimir.

Best,
Tobias

On 04.01.2016 16:48, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir
> 
> On 1/4/16 1:30 AM, Tobias Hartmann wrote:
>> Hi David,
>>
>> sure, I included the links to the code review:
>>
>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>> https://bugs.openjdk.java.net/browse/JDK-8144487
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020503.html
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>
>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>> https://bugs.openjdk.java.net/browse/JDK-8145754
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020502.html
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>
>> Thanks,
>> Tobias
>>
>> On 04.01.2016 09:35, david buck wrote:
>>> Hi Tobias!
>>>
>>> Would you please include links to the code review threads on mail.openjdk.java.net?
>>>
>>> [ JDK 8 Updates: Push Approval Request Template ]
>>> http://openjdk.java.net/projects/jdk8u/approval-template.html
>>>
>>> Cheers,
>>> -Buck
>>>
>>> On 2016/01/04 17:15, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please approve and review the following backports to 8u.
>>>>
>>>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>>>> https://bugs.openjdk.java.net/browse/JDK-8144487
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>>>
>>>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>>>> https://bugs.openjdk.java.net/browse/JDK-8145754
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>>>
>>>> Nightly testing showed no problems and the changes apply cleanly to 8u-dev.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>

From tobias.hartmann at oracle.com  Tue Jan  5 07:58:01 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 5 Jan 2016 08:58:01 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568B05AE.7020700@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
	<A1E3C15B-7034-4987-9591-72C7706AD126@oracle.com>
	<C7AC53D7-1AA3-425C-9DD1-784A11DDFE8B@oracle.com>
	<56139149.5080906@oracle.com>
	<B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
	<568A590F.6030104@oracle.com> <568B05AE.7020700@oracle.com>
Message-ID: <568B7789.5010600@oracle.com>

Hi Vladimir,

thanks for the review.

On 05.01.2016 00:52, Vladimir Kozlov wrote:
> On 1/4/16 3:35 AM, Tobias Hartmann wrote:
>> Hi Roland,
>>
>> sorry for the delay.
>>
>> On 07.10.2015 11:06, Roland Westrelin wrote:
>>>>> Maybe we could add an IfProjNode::Ideal method that disconnects the other branch of the If when this branch is always taken and that does so even during parsing. Given Ideal is called before Identity, that would guarantee the next call to Identity optimizes the If out.
>>>>
>>>> As you suggested, I added an IfProjNode::Ideal that disconnects the never taken branch from the IfNode. The subsequent call to Identity then removes the IfNode:
>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.03/
>>>>
>>>> However, I wondered if this is "legal" because the comment in Node::ideal says:
>>>>
>>>> // The Ideal call almost arbitrarily reshape the graph rooted at the 'this'
>>>> // pointer.
>>>>
>>>> But we are changing the graph "above" the this pointer. I executed tests with -XX:+VerifyIterativeGVN and everything seems to work fine.
>>>> Another solution would be to cut the *current* branch if it is never taken:
>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.02/
>>>>
>>>> But this solution depends on the assumption that we execute the identity() of the other ProjNode which is not guaranteed by GVN (I think).
>>>>
>>>> Therefore I would like to go for webrev.03. I verified that this solves the problem and tested the fix with JPRT.
>>>
>>> I thought about this more and I don't think either works ok.
>>>
>>> The problem with webrev.02 is that, depending on the order in which the projection nodes are allocated and transformed, the optimization may not happen:
>>>
>>> Node* never_taken = new IfTrueNode(..);
>>> Node* always_taken = new IfFalseNode(..);
>>> always_taken = gvn.transform(always_taken);
>>> never_taken = gvn.transform(never_taken);
>>>
>>> The problem with webrev.03 is that we may change a node that is not yet transformed (never_taken changed by call to gvn.transform(always_taken)). Not sure if it could break existing code but it's clearly unexpected behavior.
>>
>> Right, that could be a problem.
> 
> I don't see a problem. But IfProjNode::Ideal() should have additional checks for that:
> 
>   // Check for dead control input
>   if (in(0) && remove_dead_region(phase, can_reshape)) {
>     return this;
>   }
>   // Don't bother trying to transform a dead node
>   if (in(0) && in(0)->is_top()) {
>     return NULL;
>   }

Right, I'll add those.

> Also instead of set_req() use:
> 
> PhaseIterGVN* igvn = phase->is_IterGVN();
> igvn->replace_input_of(other, 0, phase->C->top());
> 
> This way the following gvn.transform(never_taken) call will work fine.

But this assumes that we are only executing the code with IGVN but we also want to cut off the dead branch with GVN. Or am I missing something?

Thanks,
Tobias

> 
> Thanks,
> Vladimir
> 
>>
>>> Another way would be to remove the in(0)->outcnt() == 1 check from IfProjNode::Identity() and in an IfProjNode::Ideal method do what you do in webrev.03 but when can_reshape is true only.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.04/
>>
>> However, I'm afraid that this re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Right?
>>
>> Thanks,
>> Tobias
>>

From aph at redhat.com  Tue Jan  5 09:48:56 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 5 Jan 2016 09:48:56 +0000
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
Message-ID: <568B9188.6000506@redhat.com>

On 04/01/16 20:12, John Rose wrote:
> Corrected, thanks.  They don't need to be intrinsics if they optimize well.
> The point is that the library functions have code shapes which work well
> with the JIT.  For example, the multi-index checks might (as in Kishor's code)
> be implemented on top of the single-index check, without themselves being
> intrinsics.

We seem to be missing the opportunity to convert

  i >= 0 && i < size

into

  (unsigned)i < (unsigned)size

and this is, as far as I can see, the only real code-quality advantage of
the checkIndex intrinsic.  Could we not do this optimization and then
drop the C2 checkIndex intrinsic?
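
The equivalence behind this rewrite can be sketched in plain Java. This
illustrates the arithmetic only, not what C2 emits, and the class and
method names are hypothetical. It relies on size being non-negative (as
an array length always is); for a negative size the two forms disagree:

```java
// Illustration of the two range-check shapes; names are made up for this sketch.
final class UnsignedRangeCheck {

    // The straightforward form: two signed comparisons.
    static boolean inBoundsTwoCompares(int i, int size) {
        return i >= 0 && i < size;
    }

    // The single-comparison form: a negative i becomes a huge unsigned value.
    static boolean inBoundsUnsigned(int i, int size) {
        return Integer.compareUnsigned(i, size) < 0;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 7, 8, Integer.MAX_VALUE};
        int size = 8;
        for (int i : samples) {
            System.out.println(i + ": " + inBoundsTwoCompares(i, size)
                    + " / " + inBoundsUnsigned(i, size));
        }
    }
}
```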

Andrew.


From paul.sandoz at oracle.com  Tue Jan  5 10:23:19 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Tue, 5 Jan 2016 11:23:19 +0100
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568B9188.6000506@redhat.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
Message-ID: <A66384B2-7C2C-40F5-A275-75936FEA8AE5@oracle.com>


> On 5 Jan 2016, at 10:48, Andrew Haley <aph at redhat.com> wrote:
> 
> On 04/01/16 20:12, John Rose wrote:
>> Corrected, thanks.  They don't need to be intrinsics if they optimize well.
>> The point is that the library functions have code shapes which work well
>> with the JIT.  For example, the multi-index checks might (as in Kishor's code)
>> be implemented on top of the single-index check, without themselves being
>> intrinsics.
> 
> We seem to be missing the opportunity to convert
> 
>  i >= 0 && i < size
> 
> into
> 
>  (unsigned)i < (unsigned)size
> 
> and this is, as far as I can see, the only real code-quality advantage of
> the checkIndex intrinsic.  Could we not do this optimization and then
> drop the C2 checkIndex intrinsic?
> 

My understanding is that the pattern matching can sometimes be fragile, hence a "belts and braces" approach.

It was motivated by the VarHandle work where it was observed that explicit bounds checks plus Unsafe array access produced more generated bounds checks [*] than direct array access (which does what you propose). The VarHandle array access implementations call this method before Unsafe access.

If the pattern matching gets (or is now) sufficiently reliable we could remove the intrinsic, but I would like to verify that carefully before doing so.

Paul.

[*] Another case was identified for viewed indexed ByteBuffer access, where use of the Objects.checkIndex method in the following method on Buffer also reduced generated checks:

final int checkIndex(int i, int nb) {               // package-private
    if ((i < 0) || (nb > limit - i))
        throw new IndexOutOfBoundsException();
    return i;
}

(Note that this code assumes that limit is always non-negative.) I need to go back and revisit this; I last checked in September.
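
As a side note on the shape of that check, here is a small sketch of why
it is written as nb > limit - i. The class, the static methods and the
explicit limit parameter are hypothetical; only the first condition
mirrors the Buffer method quoted above. With i >= 0 and limit >= 0 the
subtraction cannot overflow, whereas the more obvious i + nb > limit can
wrap around and accept a bad range:

```java
// Hypothetical helper comparing two ways of writing the same range check.
final class BufferCheckSketch {

    // Same condition as the quoted Buffer.checkIndex(int, int), with limit passed in.
    static boolean outOfBounds(int i, int nb, int limit) {
        return (i < 0) || (nb > limit - i);
    }

    // Naive variant: i + nb may overflow to a negative value.
    static boolean outOfBoundsNaive(int i, int nb, int limit) {
        return (i < 0) || (i + nb > limit);
    }

    public static void main(String[] args) {
        int limit = 16;
        System.out.println(outOfBounds(1, Integer.MAX_VALUE, limit));
        System.out.println(outOfBoundsNaive(1, Integer.MAX_VALUE, limit));
    }
}
```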


From paul.sandoz at oracle.com  Tue Jan  5 11:51:45 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Tue, 5 Jan 2016 12:51:45 +0100
Subject: Conditional moves vs. branching in unrolled loops
Message-ID: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>

Hi,

Recent investigation comparing for loops with streams exposed what appears to be an issue with Math.max and generated code in unrolled loops.

Namely this:

    @Benchmark
    public int forTest_if() {
        int[] a = ints;
        int e = ints.length;
        int m = Integer.MIN_VALUE;
        for (int i = 0; i < e; i++)
            if (a[i] >= m)
                m = a[i];
        return m;
    }

is faster than this:

    @Benchmark
    public int forTest_MathMax() {
        int[] a = ints;
        int e = ints.length;
        int m = Integer.MIN_VALUE;
        for (int i = 0; i < e; i++)
            m = Math.max(m, a[i]);
        return m;
    }

Or this:

  Arrays.stream(ints).reduce(Integer.MIN_VALUE, (a, b) -> a >= b ? a : b);

is faster than this:

  Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);

at least on an x86 i5 processor.

See the following links for more details:

  https://bugs.openjdk.java.net/browse/JDK-8146071
  https://bugs.openjdk.java.net/browse/JDK-8146071?focusedCommentId=13883495&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13883495

For generated code in the for loop cases above see:

  https://bugs.openjdk.java.net/secure/attachment/56221/mathMax.perfasm.txt

I am not familiar enough with the x86 architecture to fully explain why, but I presume branch prediction is trumping the conditional moves, which suggests that on certain processors the generated code for the Math.max intrinsic (and others) in unrolled loops should not use conditional moves.

Thanks,
Paul.

From vitalyd at gmail.com  Tue Jan  5 12:00:33 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 5 Jan 2016 07:00:33 -0500
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
Message-ID: <CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>

This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104

On Tuesday, January 5, 2016, Paul Sandoz <paul.sandoz at oracle.com> wrote:

> Hi,
>
> Recent investigation comparing for loops with streams exposed what appears
> to be an issue with Math.max and generated code in unrolled loops.
>
> Namely this:
>
>     @Benchmark
>     public int forTest_if() {
>         int[] a = ints;
>         int e = ints.length;
>         int m = Integer.MIN_VALUE;
>         for (int i = 0; i < e; i++)
>             if (a[i] >= m)
>                 m = a[i];
>         return m;
>     }
>
> is faster than this:
>
>     @Benchmark
>     public int forTest_MathMax() {
>         int[] a = ints;
>         int e = ints.length;
>         int m = Integer.MIN_VALUE;
>         for (int i = 0; i < e; i++)
>             m = Math.max(m, a[i]);
>         return m;
>     }
>
> Or this:
>
>   Arrays.stream(ints).reduce(Integer.MIN_VALUE, (a, b) -> a >= b ? a : b);
>
> is faster than this:
>
>   Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);
>
> at least on an x86 i5 processor.
>
> See the following links for more details:
>
>   https://bugs.openjdk.java.net/browse/JDK-8146071
>
> https://bugs.openjdk.java.net/browse/JDK-8146071?focusedCommentId=13883495&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13883495
>
> For generated code in the for loop cases above see:
>
>
> https://bugs.openjdk.java.net/secure/attachment/56221/mathMax.perfasm.txt
>
> I am not familiar enough with the x86 architecture to fully explain why,
> but i presume branch prediction is trumping the conditional moves, which
> suggests that on certain processors the generated code for the Math.max
> intrinsic (and others) in unrolled loops should not use conditional moves.
>
> Thanks,
> Paul.
>



From paul.sandoz at oracle.com  Tue Jan  5 12:47:20 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Tue, 5 Jan 2016 13:47:20 +0100
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
Message-ID: <775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>


> On 5 Jan 2016, at 13:00, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104
> 

Many thanks, I closed JDK-8146071 as a duplicate of JDK-8039104.

Paul.

From vladimir.kozlov at oracle.com  Tue Jan  5 17:05:23 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 5 Jan 2016 09:05:23 -0800
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568B7789.5010600@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
	<A1E3C15B-7034-4987-9591-72C7706AD126@oracle.com>
	<C7AC53D7-1AA3-425C-9DD1-784A11DDFE8B@oracle.com>
	<56139149.5080906@oracle.com>
	<B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
	<568A590F.6030104@oracle.com> <568B05AE.7020700@oracle.com>
	<568B7789.5010600@oracle.com>
Message-ID: <568BF7D3.7000403@oracle.com>


On 1/4/16 11:58 PM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> thanks for the review.
>
> On 05.01.2016 00:52, Vladimir Kozlov wrote:
>> On 1/4/16 3:35 AM, Tobias Hartmann wrote:
>>> Hi Roland,
>>>
>>> sorry for the delay.
>>>
>>> On 07.10.2015 11:06, Roland Westrelin wrote:
>>>>>> Maybe we could add an IfProjNode::Ideal method that disconnects the other branch of the If when this branch is always taken and that does so even during parsing. Given Ideal is called before Identity, that would guarantee the next call to Identity optimizes the If out.
>>>>>
>>>>> As you suggested, I added an IfProjNode::Ideal that disconnects the never taken branch from the IfNode. The subsequent call to Identity then removes the IfNode:
>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.03/
>>>>>
>>>>> However, I wondered if this is "legal" because the comment in Node::ideal says:
>>>>>
>>>>> // The Ideal call almost arbitrarily reshape the graph rooted at the 'this'
>>>>> // pointer.
>>>>>
>>>>> But we are changing the graph "above" the this pointer. I executed tests with -XX:+VerifyIterativeGVN and everything seems to work fine.
>>>>> Another solution would be to cut the *current* branch if it is never taken:
>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.02/
>>>>>
>>>>> But this solution depends on the assumption that we execute the identity() of the other ProjNode which is not guaranteed by GVN (I think).
>>>>>
>>>>> Therefore I would like to go for webrev.03. I verified that this solves the problem and tested the fix with JPRT.
>>>>
>>>> I thought about this more and I don't think either works OK.
>>>>
>>>> The problem with webrev.02 is that, depending on the order in which the projection nodes are allocated and transformed, the optimization may not happen:
>>>>
>>>> Node* never_taken = new IfTrueNode(..);
>>>> Node* always_taken = new IfFalseNode(..);
>>>> always_taken = gvn.transform(always_taken);
>>>> never_taken = gvn.transform(never_taken);
>>>>
>>>> The problem with webrev.03 is that we may change a node that is not yet transformed (never_taken changed by the call to gvn.transform(always_taken)). I'm not sure whether it could break existing code, but it's clearly unexpected behavior.
>>>
>>> Right, that could be a problem.
>>
>> I don't see a problem. But IfProjNode::Ideal() should have additional checks for that:
>>
>>    // Check for dead control input
>>    if (in(0) && remove_dead_region(phase, can_reshape)) {
>>      return this;
>>    }
>>    // Don't bother trying to transform a dead node
>>    if (in(0) && in(0)->is_top()) {
>>      return NULL;
>>    }
>
> Right, I'll add those.
>
>> Also instead of set_req() use:
>>
>> PhaseIterGVN* igvn = phase->is_IterGVN();
>> igvn->replace_input_of(other, 0, phase->C->top());
>>
>> This way following gvn.transform(never_taken); will work fine.
>
> But this assumes that we are only executing the code with IGVN but we also want to cut off the dead branch with GVN. Or am I missing something?

webrev.04 checks can_reshape which is true only with IGVN.
For GVN you can do it by hand:

         bool is_in_table = C->initial_gvn()->hash_delete(other);
         other->set_req(0, phase->C->top());
         if (is_in_table) {
           C->initial_gvn()->hash_find_insert(other);
         }
         C->record_for_igvn(other);

Note, during Parse (GVN) we don't remove dead code aggressively.
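
A minimal sketch of why the by-hand sequence above removes the node from the table before changing its input, then re-inserts it. This is not HotSpot code: ToyNode and HashTableDiscipline are hypothetical names, and a java.util.HashSet stands in for the GVN hash table. The only assumption carried over from the thread is that a node's hash depends on its inputs, so mutating an input while the node is still in the table leaves it filed under a stale hash.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a compiler node whose hash depends on its input.
final class ToyNode {
    int input; // mutable, playing the role of a node's in(0) edge

    ToyNode(int input) { this.input = input; }

    @Override public boolean equals(Object o) {
        return o instanceof ToyNode && ((ToyNode) o).input == input;
    }

    @Override public int hashCode() { return Integer.hashCode(input); }
}

final class HashTableDiscipline {
    // Unsafe: mutate the input while the node is still in the table.
    // The entry stays filed under the old hash, so a lookup misses it.
    static boolean mutateInPlace(Set<ToyNode> table, ToyNode n, int newInput) {
        n.input = newInput;
        return table.contains(n);
    }

    // Safe: delete, mutate, re-insert. This mirrors the shape of
    // hash_delete / set_req / hash_find_insert in the snippet above.
    static boolean deleteMutateReinsert(Set<ToyNode> table, ToyNode n, int newInput) {
        boolean wasInTable = table.remove(n);
        n.input = newInput;
        if (wasInTable) {
            table.add(n);
        }
        return table.contains(n);
    }

    public static void main(String[] args) {
        Set<ToyNode> t1 = new HashSet<>();
        ToyNode a = new ToyNode(1);
        t1.add(a);
        System.out.println("in-place mutation, still found: " + mutateInPlace(t1, a, 2));

        Set<ToyNode> t2 = new HashSet<>();
        ToyNode b = new ToyNode(1);
        t2.add(b);
        System.out.println("delete/mutate/re-insert, still found: " + deleteMutateReinsert(t2, b, 2));
    }
}
```

The first call reports that the node is no longer found even though it is still stored in the set; the second keeps the table consistent.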

Vladimir

>
> Thanks,
> Tobias
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>>> Another way would be to remove the in(0)->outcnt() == 1 check from IfProjNode::Identity() and in an IfProjNode::Ideal method do what you do in webrev.03 but when can_reshape is true only.
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.04/
>>>
>>> However, I'm afraid that this re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Right?
>>>
>>> Thanks,
>>> Tobias
>>>

From tobias.hartmann at oracle.com  Tue Jan  5 17:13:15 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 5 Jan 2016 18:13:15 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568BF7D3.7000403@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
	<A1E3C15B-7034-4987-9591-72C7706AD126@oracle.com>
	<C7AC53D7-1AA3-425C-9DD1-784A11DDFE8B@oracle.com>
	<56139149.5080906@oracle.com>
	<B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
	<568A590F.6030104@oracle.com> <568B05AE.7020700@oracle.com>
	<568B7789.5010600@oracle.com> <568BF7D3.7000403@oracle.com>
Message-ID: <568BF9AB.3010408@oracle.com>


On 05.01.2016 18:05, Vladimir Kozlov wrote:
> On 1/4/16 11:58 PM, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> thanks for the review.
>>
>> On 05.01.2016 00:52, Vladimir Kozlov wrote:
>>> On 1/4/16 3:35 AM, Tobias Hartmann wrote:
>>>> Hi Roland,
>>>>
>>>> sorry for the delay.
>>>>
>>>> On 07.10.2015 11:06, Roland Westrelin wrote:
>>>>>>> Maybe we could add an IfProjNode::Ideal method that disconnects the other branch of the If when this branch is always taken and that does so even during parsing. Given Ideal is called before Identity, that would guarantee the next call to Identity optimizes the If out.
>>>>>>
>>>>>> As you suggested, I added an IfProjNode::Ideal that disconnects the never taken branch from the IfNode. The subsequent call to Identity then removes the IfNode:
>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.03/
>>>>>>
>>>>>> However, I wondered if this is "legal" because the comment in Node::ideal says:
>>>>>>
>>>>>> // The Ideal call almost arbitrarily reshape the graph rooted at the 'this'
>>>>>> // pointer.
>>>>>>
>>>>>> But we are changing the graph "above" the this pointer. I executed tests with -XX:+VerifyIterativeGVN and everything seems to work fine.
>>>>>> Another solution would be to cut the *current* branch if it is never taken:
>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.02/
>>>>>>
>>>>>> But this solution depends on the assumption that we execute the identity() of the other ProjNode which is not guaranteed by GVN (I think).
>>>>>>
>>>>>> Therefore I would like to go for webrev.03. I verified that this solves the problem and tested the fix with JPRT.
>>>>>
>>>>> I thought about this more and I don't think either works OK.
>>>>>
>>>>> The problem with webrev.02 is that, depending on the order in which the projection nodes are allocated and transformed, the optimization may not happen:
>>>>>
>>>>> Node* never_taken = new IfTrueNode(..);
>>>>> Node* always_taken = new IfFalseNode(..);
>>>>> always_taken = gvn.transform(always_taken);
>>>>> never_taken = gvn.transform(never_taken);
>>>>>
>>>>> The problem with webrev.03 is that we may change a node that is not yet transformed (never_taken changed by the call to gvn.transform(always_taken)). I'm not sure whether it could break existing code, but it's clearly unexpected behavior.
>>>>
>>>> Right, that could be a problem.
>>>
>>> I don't see a problem. But IfProjNode::Ideal() should have additional checks for that:
>>>
>>>    // Check for dead control input
>>>    if (in(0) && remove_dead_region(phase, can_reshape)) {
>>>      return this;
>>>    }
>>>    // Don't bother trying to transform a dead node
>>>    if (in(0) && in(0)->is_top()) {
>>>      return NULL;
>>>    }
>>
>> Right, I'll add those.
>>
>>> Also instead of set_req() use:
>>>
>>> PhaseIterGVN* igvn = phase->is_IterGVN();
>>> igvn->replace_input_of(other, 0, phase->C->top());
>>>
>>> This way following gvn.transform(never_taken); will work fine.
>>
>> But this assumes that we are only executing the code with IGVN but we also want to cut off the dead branch with GVN. Or am I missing something?
> 
> webrev.04 checks can_reshape which is true only with IGVN.
> For GVN you can do it by hand:
> 
>         bool is_in_table = C->initial_gvn()->hash_delete(other);
>         other->set_req(0, phase->C->top());
>         if (is_in_table) {
>           C->initial_gvn()->hash_find_insert(other);
>         }
>         C->record_for_igvn(other);
> 
> Note, during Parse (GVN) we don't remove dead code aggressively.

Right, I thought you were referring to Roland's comment about webrev.03.

As I wrote in a previous email, I'm afraid that the webrev.04 solution re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Do you think that isn't a problem?

Thanks,
Tobias

> 
> Vladimir
> 
>>
>> Thanks,
>> Tobias
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>>> Another way would be to remove the in(0)->outcnt() == 1 check from IfProjNode::Identity() and in an IfProjNode::Ideal method do what you do in webrev.03 but when can_reshape is true only.
>>>>
>>>> Here is the new webrev:
>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.04/
>>>>
>>>> However, I'm afraid that this re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Right?
>>>>
>>>> Thanks,
>>>> Tobias
>>>>

From vladimir.kozlov at oracle.com  Tue Jan  5 17:17:04 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 5 Jan 2016 09:17:04 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568B9188.6000506@redhat.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
Message-ID: <568BFA90.4020807@oracle.com>

 > On 31 Dec 2015, at 22:33, John Rose <john.r.rose at oracle.com> wrote:
 >
 > When performing explicit range checks in pre-intrinsic code,
 > let's try to use the new intrinsic functions in java.util.Objects,
 > called checkIndex, checkFromToIndex, and checkFromIndexSize.

Please don't forget that checks in pre-intrinsic code should match the checks generated by javac (bytecode) for
intrinsified methods. Otherwise those checks will not be removed (as checks dominated by those in the pre-intrinsic code) when
intrinsics are not supported on a platform. That is why we currently have such duplicated pre-intrinsic code.

On the other hand, when intrinsics are supported they don't have checks, so if they are present we can intrinsify the pre-intrinsic
code as you suggested.

Thanks,
Vladimir

On 1/5/16 1:48 AM, Andrew Haley wrote:
> On 04/01/16 20:12, John Rose wrote:
>> Corrected, thanks.  They don't need to be intrinsics if they optimize well.
>> The point is that the library functions have code shapes which work well
>> with the JIT.  For example, the multi-index checks might (as in Kishor's code)
>> be implemented on top of the single-index check, without themselves being
>> intrinsics.
>
> We seem to be missing the opportunity to convert
>
>    i >= 0 && i < size
>
> into
>
>    (unsigned)i < (unsigned)size
>
> and this is, as far as I can see, the only real code-quality advantage of
> the checkIndex intrinsic.  Could we not do this optimization and then
> drop the C2 checkIndex intrinsic?
>
> Andrew.
>

From vladimir.kozlov at oracle.com  Tue Jan  5 17:20:12 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 5 Jan 2016 09:20:12 -0800
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568BF9AB.3010408@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
	<A1E3C15B-7034-4987-9591-72C7706AD126@oracle.com>
	<C7AC53D7-1AA3-425C-9DD1-784A11DDFE8B@oracle.com>
	<56139149.5080906@oracle.com>
	<B85144D9-4DEA-4FB8-BFAB-B2F16074F268@oracle.com>
	<568A590F.6030104@oracle.com> <568B05AE.7020700@oracle.com>
	<568B7789.5010600@oracle.com> <568BF7D3.7000403@oracle.com>
	<568BF9AB.3010408@oracle.com>
Message-ID: <568BFB4C.40106@oracle.com>

Yes, webrev.04 is a no-go. I was referring to webrev.03, but I missed that it does not check can_reshape.

Vladimir

On 1/5/16 9:13 AM, Tobias Hartmann wrote:
>
> On 05.01.2016 18:05, Vladimir Kozlov wrote:
>> On 1/4/16 11:58 PM, Tobias Hartmann wrote:
>>> Hi Vladimir,
>>>
>>> thanks for the review.
>>>
>>> On 05.01.2016 00:52, Vladimir Kozlov wrote:
>>>> On 1/4/16 3:35 AM, Tobias Hartmann wrote:
>>>>> Hi Roland,
>>>>>
>>>>> sorry for the delay.
>>>>>
>>>>> On 07.10.2015 11:06, Roland Westrelin wrote:
>>>>>>>> Maybe we could add an IfProjNode::Ideal method that disconnects the other branch of the If when this branch is always taken and that does so even during parsing. Given Ideal is called before Identity, that would guarantee the next call to Identity optimizes the If out.
>>>>>>>
>>>>>>> As you suggested, I added an IfProjNode::Ideal that disconnects the never taken branch from the IfNode. The subsequent call to Identity then removes the IfNode:
>>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.03/
>>>>>>>
>>>>>>> However, I wondered if this is "legal" because the comment in Node::ideal says:
>>>>>>>
>>>>>>> // The Ideal call almost arbitrarily reshape the graph rooted at the 'this'
>>>>>>> // pointer.
>>>>>>>
>>>>>>> But we are changing the graph "above" the this pointer. I executed tests with -XX:+VerifyIterativeGVN and everything seems to work fine.
>>>>>>> Another solution would be to cut the *current* branch if it is never taken:
>>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.02/
>>>>>>>
>>>>>>> But this solution depends on the assumption that we execute the identity() of the other ProjNode which is not guaranteed by GVN (I think).
>>>>>>>
>>>>>>> Therefore I would like to go for webrev.03. I verified that this solves the problem and tested the fix with JPRT.
>>>>>>
>>>>>> I thought about this more and I don't think either works OK.
>>>>>>
>>>>>> The problem with webrev.02 is that, depending on the order in which the projection nodes are allocated and transformed, the optimization may not happen:
>>>>>>
>>>>>> Node* never_taken = new IfTrueNode(..);
>>>>>> Node* always_taken = new IfFalseNode(..);
>>>>>> always_taken = gvn.transform(always_taken);
>>>>>> never_taken = gvn.transform(never_taken);
>>>>>>
>>>>>> The problem with webrev.03 is that we may change a node that is not yet transformed (never_taken changed by the call to gvn.transform(always_taken)). I'm not sure whether it could break existing code, but it's clearly unexpected behavior.
>>>>>
>>>>> Right, that could be a problem.
>>>>
>>>> I don't see a problem. But IfProjNode::Ideal() should have additional checks for that:
>>>>
>>>>     // Check for dead control input
>>>>     if (in(0) && remove_dead_region(phase, can_reshape)) {
>>>>       return this;
>>>>     }
>>>>     // Don't bother trying to transform a dead node
>>>>     if (in(0) && in(0)->is_top()) {
>>>>       return NULL;
>>>>     }
>>>
>>> Right, I'll add those.
>>>
>>>> Also instead of set_req() use:
>>>>
>>>> PhaseIterGVN* igvn = phase->is_IterGVN();
>>>> igvn->replace_input_of(other, 0, phase->C->top());
>>>>
>>>> This way following gvn.transform(never_taken); will work fine.
>>>
>>> But this assumes that we are only executing the code with IGVN but we also want to cut off the dead branch with GVN. Or am I missing something?
>>
>> webrev.04 checks can_reshape which is true only with IGVN.
>> For GVN you can do it by hand:
>>
>>          bool is_in_table = C->initial_gvn()->hash_delete(other);
>>          other->set_req(0, phase->C->top());
>>          if (is_in_table) {
>>            C->initial_gvn()->hash_find_insert(other);
>>          }
>>          C->record_for_igvn(other);
>>
>> Note, during Parse (GVN) we don't remove dead code aggressively.
>
> Right, I thought you were referring to Roland's comment about webrev.03.
>
> As I wrote in a previous email, I'm afraid that the webrev.04 solution re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Do you think that isn't a problem?
>
> Thanks,
> Tobias
>
>>
>> Vladimir
>>
>>>
>>> Thanks,
>>> Tobias
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>>> Another way would be to remove the in(0)->outcnt() == 1 check from IfProjNode::Identity() and in an IfProjNode::Ideal method do what you do in webrev.03 but when can_reshape is true only.
>>>>>
>>>>> Here is the new webrev:
>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.04/
>>>>>
>>>>> However, I'm afraid that this re-introduces JDK-8027626. If we call IfProjNode::Identity() during GVN and replace the ProjNode by If's input, we end up with a node having two control outputs until we remove the dead branch during IGVN. Right?
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>

From kishor.kharbas at intel.com  Tue Jan  5 21:39:31 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Tue, 5 Jan 2016 21:39:31 +0000
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568BFA90.4020807@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com> <568BFA90.4020807@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F389E78597@ORSMSX116.amr.corp.intel.com>

Thank you all for the detailed discussion and review.

I have patched the JDK to perform bounds checking with Objects.checkFromIndexSize() in CounterMode.crypt(), AESCrypt.encryptBlock(), and AESCrypt.decryptBlock().
Here is the link: http://cr.openjdk.java.net/~vdeshpande/8135250/webrev.00/

Let me know if it looks correct.

-Kishor

-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
Sent: Tuesday, January 05, 2016 9:17 AM
To: Andrew Haley; John Rose
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES

 > On 31 Dec 2015, at 22:33, John Rose <john.r.rose at oracle.com> wrote:
 >
 > When performing explicit range checks in pre-intrinsic code,
 > let's try to use the new intrinsic functions in java.util.Objects,
 > called checkIndex, checkFromToIndex, and checkFromIndexSize.

Please don't forget that checks in pre-intrinsic code should match the checks generated by javac (bytecode) for intrinsified methods. Otherwise those checks will not be removed (as checks dominated by those in the pre-intrinsic code) when intrinsics are not supported on a platform. That is why we currently have such duplicated pre-intrinsic code.

On the other hand, when intrinsics are supported they don't have checks, so if they are present we can intrinsify the pre-intrinsic code as you suggested.

Thanks,
Vladimir

On 1/5/16 1:48 AM, Andrew Haley wrote:
> On 04/01/16 20:12, John Rose wrote:
>> Corrected, thanks.  They don't need to be intrinsics if they optimize well.
>> The point is that the library functions have code shapes which work 
>> well with the JIT.  For example, the multi-index checks might (as in 
>> Kishor's code) be implemented on top of the single-index check, 
>> without themselves being intrinsics.
>
> We seem to be missing the opportunity to convert
>
>    i >= 0 && i < size
>
> into
>
>    (unsigned)i < (unsigned)size
>
> and this is, as far as I can see, the only real code-quality advantage 
> of the checkIndex intrinsic.  Could we not do this optimization and 
> then drop the C2 checkIndex intrinsic?
>
> Andrew.
>

From john.r.rose at oracle.com  Tue Jan  5 22:11:23 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 5 Jan 2016 14:11:23 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568B9188.6000506@redhat.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
Message-ID: <C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>

On Jan 5, 2016, at 1:48 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> On 04/01/16 20:12, John Rose wrote:
>> Corrected, thanks.  They don't need to be intrinsics if they optimize well.
>> The point is that the library functions have code shapes which work well
>> with the JIT.  For example, the multi-index checks might (as in Kishor's code)
>> be implemented on top of the single-index check, without themselves being
>> intrinsics.
> 
> We seem to be missing the opportunity to convert
> 
>  i >= 0 && i < size
> 
> into
> 
>  (unsigned)i < (unsigned)size
> 
> and this is, as far as I can see, the only real code-quality advantage of
> the checkIndex intrinsic.  Could we not do this optimization and then
> drop the C2 checkIndex intrinsic?

...What Paul already said about belts and braces.

Of course we want the JIT to be "sufficiently smart" (tm) to discover the meaning
of all such expressions.

But surely, on balance, it's a good thing to encourage programmers to say what
they mean.

Dropping the intrinsic would prevent them from expressing their intention, forcing them
to fall back on Java's expression operators.  At that point, they have a variety of ways
of indirectly spelling out their intention.  There is no direct contract that the JIT will
understand them, just a hope.  That's not good engineering.

Also, it's not just a matter of micro-optimizing a single expression to use
unsigned arithmetic (though that is surprisingly tricky).  Range checks are
interesting to block-level loop transformations (iteration range reorganization).
Do you really want your loop optimizations to be gated on "sufficient smarts"
in the JIT's expression pattern matcher?
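
A minimal Java sketch of the rewrite Andrew describes, next to the "say what you mean" form. The class and method names are made up for illustration; Integer.compareUnsigned and Objects.checkIndex are the real library methods (the latter is available in Java 9 and later). Note the assumption that makes the single unsigned comparison valid: size must be non-negative, which holds for array lengths.

```java
import java.util.Objects;

final class RangeCheckSketch {
    // Two signed comparisons, as typically written in source code.
    static boolean inRangeSigned(int i, int size) {
        return i >= 0 && i < size;
    }

    // One unsigned comparison. A negative i becomes a huge unsigned value,
    // so it fails the test. Equivalent to the signed form only when size >= 0.
    static boolean inRangeUnsigned(int i, int size) {
        return Integer.compareUnsigned(i, size) < 0;
    }

    public static void main(String[] args) {
        int size = 10;
        int[] probes = {Integer.MIN_VALUE, -1, 0, 9, 10, Integer.MAX_VALUE};
        for (int i : probes) {
            System.out.println(i + ": signed=" + inRangeSigned(i, size)
                    + " unsigned=" + inRangeUnsigned(i, size));
        }
        // Objects.checkIndex states the intent directly and returns the index
        // (it throws IndexOutOfBoundsException when the index is out of range).
        System.out.println("checkIndex(9, 10) = " + Objects.checkIndex(9, size));
    }
}
```

Both predicates agree on every probe above; the library call leaves the choice of comparison to the JIT rather than to the author of each check.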

-- John

From sangheon.kim at oracle.com  Wed Jan  6 00:31:05 2016
From: sangheon.kim at oracle.com (sangheon)
Date: Tue, 5 Jan 2016 16:31:05 -0800
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on SPARC
	for non-G1 GC mode
Message-ID: <568C6049.5020400@oracle.com>

Hi all,

Could I have reviews for the change below, which removes the size
limitation (<4096) on TLABWasteIncrement on SPARC?

The current implementation uses 'add(Register, int, Register)', which has a
13-bit limitation for the 'int' parameter.
I changed it to use 'set64' to load the value into a register and then call
'add'. 'set64' will take the cheap path because the range of TLABWasteIncrement is
(0, max_juint).
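
For readers unfamiliar with the limit: a 13-bit signed immediate covers -4096 through 4095, which is where the "<4096" bound comes from. A minimal Java sketch of that range test follows; Simm13Sketch and isSimm13 are illustrative names, not the HotSpot assembler's implementation.

```java
final class Simm13Sketch {
    // True if x fits in a 13-bit two's-complement immediate: -4096 <= x <= 4095.
    static boolean isSimm13(long x) {
        return x >= -(1L << 12) && x < (1L << 12);
    }

    public static void main(String[] args) {
        long[] probes = {-4097, -4096, 0, 4095, 4096, Integer.MAX_VALUE};
        for (long x : probes) {
            System.out.println(x + " fits in simm13: " + isSimm13(x));
        }
    }
}
```

Any value outside that window has to be materialized in a register first, which is the role 'set64' plays in the change.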

This assert fires only in non-G1 modes because, with default options, G1 is the
only GC that returns false from Universe::heap()->supports_inline_contig_alloc(),
and that result decides whether this routine is reached.

I didn't add a test as current TestOptionsWithRanges.java is enough to 
test this case with nightly option rotation.

CR: https://bugs.openjdk.java.net/browse/JDK-8144573
Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.00/
Testing: JPRT, manual test on SPARC[1]

[1]: java -XX:TLABWasteIncrement=4096(and some larger values as well) 
-XX:+UseConcMarkSweepGC(UseParallelGC and UseSerialGC) -version

Thanks,
Sangheon

From john.r.rose at oracle.com  Wed Jan  6 01:05:56 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 5 Jan 2016 17:05:56 -0800
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
Message-ID: <79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>

Darn, this works against the "say what you mean" story I told for checkIndex. 

The bug here is very very special but is hit commonly so needs fixing. The special part is that accumulating Math.max values over a long loop almost *always* creates a series of predictable branches, which means cmov will lose on many CPUs. (Exercise: Try to construct a long series of values for which each value is the largest so far, randomly, with 50% probability.  This will not be a series found often in nature.)
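
A small Java sketch of the point about predictability, under the assumption that the input is random rather than adversarial. It counts how often the "new maximum" branch of a running-max loop is actually taken. For independent random values the k-th element is a new maximum with probability about 1/k, so the expected count grows roughly like ln(n), and the branch is almost always not taken. RunningMaxSketch and countNewMax are illustrative names, and the loop uses a strict '>' so that only genuinely new maxima are counted.

```java
import java.util.Random;

final class RunningMaxSketch {
    // Counts how many times the running maximum is replaced.
    static int countNewMax(int[] a) {
        int m = Integer.MIN_VALUE;
        int taken = 0;
        for (int v : a) {
            if (v > m) {   // the branch whose predictability is in question
                m = v;
                taken++;
            }
        }
        return taken;
    }

    public static void main(String[] args) {
        int n = 1_000_000;

        // Random input: new maxima become rare very quickly.
        int[] random = new Random(42).ints(n).toArray();
        System.out.println("random input, branch taken:    " + countNewMax(random) + " of " + n);

        // Ascending input: every element is a new maximum, which is also predictable.
        int[] ascending = new int[n];
        for (int i = 0; i < n; i++) {
            ascending[i] = i;
        }
        System.out.println("ascending input, branch taken: " + countNewMax(ascending) + " of " + n);
    }
}
```

Either way the branch outcome is nearly constant over the loop, which is the situation in which a predicted branch can beat a conditional move.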

We need to explicitly detect accumulations on cmov ops in long loops, and convert them to branches. 

Also, we should continue to recommend using intrinsics instead of random logic. 

Fun fact:  Using your own branch logic makes the JVM manage a branch profile just for you, which can mean performance. Intrinsics, if they have internal branch logic, have polluted profiles. We need better call-site profiles and/or split profiles to overcome this. 

-- John

> On Jan 5, 2016, at 4:47 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> 
> 
>> On 5 Jan 2016, at 13:00, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> 
>> This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104
> 
> Many thanks, i closed JDK-8146071 as a dup of JDK-8039104.
> 
> Paul.

From forax at univ-mlv.fr  Wed Jan  6 02:02:46 2016
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 6 Jan 2016 03:02:46 +0100 (CET)
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
Message-ID: <585079517.742162.1452045766579.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Paul Sandoz" <paul.sandoz at oracle.com>
> Cc: "hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
> Sent: Wednesday, 6 January 2016 02:05:56
> Subject: Re: Conditional moves vs. branching in unrolled loops
> 
> Darn, this works against the "say what you mean" story I told for checkIndex.
> 
> The bug here is very very special but is hit commonly so needs fixing. The
> special part is that accumulating Math.max values over a long loop almost
> *always* creates a series of predictable branches, which means cmov will
> lose on many CPUs. (Exercise: Try to construct a long series of
> values for which each value is the largest so far, randomly, with 50%
> probability.  This will not be a series found often in nature.)
> 
> We need to explicitly detect accumulations on cmov ops in long loops, and
> convert them to branches.
> 
> Also, we should continue to recommend using intrinsics instead of random
> logic.
> 
> Fun fact:  Using your own branch logic makes the JVM manage a branch profile
> just for you, which can mean performance. Intrinsics, if they have internal
> branch logic, have polluted profiles. We need better call-site profiles
> and/or split profiles to overcome this.

We already have the first part of a kind of split profile in tiered mode:
if code is first inlined by C1, C2 could use these different profiles,
but currently the profiles are shared because there is one profile per bci.

So in tiered mode, we should have one profile per bci + caller path inside the same inlining blob;
the VM needs to keep the inlining tree created by C1 and send it to C2
(there may be enough info in the stack-walk info to recreate the inlining tree).
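
A toy Java sketch of the difference between a shared profile (one counter pair per bci) and a split profile (one counter pair per caller path + bci). Everything here is hypothetical and only illustrates the bookkeeping: the class names, the string key, the caller names, and the bci value 7 are invented, and nothing here reflects how HotSpot actually stores profile data.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Taken / not-taken counters for one branch.
final class BranchCounts {
    long taken;
    long notTaken;

    void record(boolean wasTaken) {
        if (wasTaken) { taken++; } else { notTaken++; }
    }

    double takenRatio() {
        long n = taken + notTaken;
        return n == 0 ? 0.0 : (double) taken / n;
    }
}

final class SplitProfileSketch {
    // Shared: every caller of the inlined method updates the same counters.
    final Map<Integer, BranchCounts> shared = new HashMap<>();
    // Split: counters are keyed by the caller path as well as the bci.
    final Map<String, BranchCounts> split = new HashMap<>();

    static String key(List<String> callerPath, int bci) {
        return String.join(">", callerPath) + "@" + bci;
    }

    void record(List<String> callerPath, int bci, boolean taken) {
        shared.computeIfAbsent(bci, k -> new BranchCounts()).record(taken);
        split.computeIfAbsent(key(callerPath, bci), k -> new BranchCounts()).record(taken);
    }

    public static void main(String[] args) {
        SplitProfileSketch p = new SplitProfileSketch();
        List<String> callerA = List.of("loopA", "max"); // hypothetical call paths
        List<String> callerB = List.of("loopB", "max");
        int bci = 7;                                    // hypothetical branch location
        for (int i = 0; i < 100; i++) {
            p.record(callerA, bci, true);   // always taken from caller A
            p.record(callerB, bci, false);  // never taken from caller B
        }
        System.out.println("shared ratio:  " + p.shared.get(bci).takenRatio());
        System.out.println("caller A only: " + p.split.get(key(callerA, bci)).takenRatio());
        System.out.println("caller B only: " + p.split.get(key(callerB, bci)).takenRatio());
    }
}
```

The shared counters make the branch look unpredictable, while each per-caller entry shows a branch that always goes one way.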

> 
> -- John

Rémi

> 
> > On Jan 5, 2016, at 4:47 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> > 
> > 
> >> On 5 Jan 2016, at 13:00, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> >> 
> >> This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104
> > 
> > Many thanks, i closed JDK-8146071 as a dup of JDK-8039104.
> > 
> > Paul.
> 

From john.r.rose at oracle.com  Wed Jan  6 02:52:52 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 5 Jan 2016 18:52:52 -0800
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <585079517.742162.1452045766579.JavaMail.zimbra@u-pem.fr>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<585079517.742162.1452045766579.JavaMail.zimbra@u-pem.fr>
Message-ID: <EF2DE7B0-FE81-4382-8F7F-35910F06D8ED@oracle.com>

Yep. It's a matter of data structure to keep track of the splits. 

-- John

> On Jan 5, 2016, at 6:02 PM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> so in tiered mode, we should have one profile per bci + caller path inside the same inlining blob,
> the VM needs to keep the inlining tree created by c1 to send it to c2
> (there is maybe enough info in the stackwalk info to recreate the inlining tree).

From igor.veresov at oracle.com  Wed Jan  6 04:29:11 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 5 Jan 2016 20:29:11 -0800
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on
	SPARC for non-G1 GC mode
In-Reply-To: <568C6049.5020400@oracle.com>
References: <568C6049.5020400@oracle.com>
Message-ID: <6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>

I'm not sure we care a lot about tiny bits of performance in this instance. But in case you wanted to keep the original code for the simm13 case, you could check the range of the constant and still emit the code that was there before. It also seems suboptimal to do set64 in MacroAssembler::tlab_refill() on all paths: the result of the original add in the delay slot doesn't seem to be used if we jump to discard_tlab, right?  So maybe you could do something like:

brx(Assembler::lessEqual, false, Assembler::pt, discard_tlab);
if (is_simm13(ThreadLocalAllocBuffer::refill_waste_limit_increment())) {
  delayed()->add(t2, ThreadLocalAllocBuffer::refill_waste_limit_increment(), t2);
} else {
  delayed()->nop();
  set64(ThreadLocalAllocBuffer::refill_waste_limit_increment(), t3, G0);
  add(t2, t3, t2);
}

Similarly, tighter code can be emitted for the interpreter in templateTable_sparc.cpp.
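
For readers unfamiliar with the simm13 constraint: a SPARC simm13 operand is a signed 13-bit immediate, so it covers -4096 through 4095, which is where the "<4096" limit in the original request comes from. The range test can be modelled in a few lines of Java. This is a sketch of the arithmetic only, with a hypothetical class name; it is not the HotSpot is_simm13 implementation.

```java
// Sketch of the signed 13-bit immediate range check (hypothetical helper, not HotSpot code).
final class Simm13Sketch {
    // A signed 13-bit field holds values from -2^12 to 2^12 - 1.
    static boolean isSimm13(long value) {
        return value >= -4096 && value <= 4095;
    }

    public static void main(String[] args) {
        System.out.println(isSimm13(4095));              // true: fits in the immediate
        System.out.println(isSimm13(4096));              // false: needs a register load first
        System.out.println(isSimm13(-4096));             // true: smallest encodable value
        System.out.println(isSimm13(Integer.MAX_VALUE)); // false: the max_jint case from the bug title
    }
}
```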

igor


> On Jan 5, 2016, at 4:31 PM, sangheon <sangheon.kim at oracle.com> wrote:
> 
> Hi all,
> 
> Could I have reviews for the below change to remove the size limitation (<4096) of TLABWasteIncrement on SPARC?
> 
> The current implementation uses 'add(Register, int, Register)', which has a 13-bit limitation for the 'int' parameter.
> I changed it to use 'set64' to load the value into a register and then call 'add'. 'set64' will take the cheap path as the range of TLABWasteIncrement is (0, max_juint).
> 
> This assert only fires in non-G1 mode because, with default options, G1 is the only GC that returns false from Universe::heap()->supports_inline_contig_alloc(), and that return value decides whether this routine is reached.
> 
> I didn't add a test as current TestOptionsWithRanges.java is enough to test this case with nightly option rotation.
> 
> CR: https://bugs.openjdk.java.net/browse/JDK-8144573
> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.00/
> Testing: JPRT, manual test on SPARC[1]
> 
> [1]: java -XX:TLABWasteIncrement=4096(and some larger values as well) -XX:+UseConcMarkSweepGC(UseParallelGC and UseSerialGC) -version
> 
> Thanks,
> Sangheon


From aph at redhat.com  Wed Jan  6 10:05:29 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 6 Jan 2016 10:05:29 +0000
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568BFA90.4020807@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com> <568BFA90.4020807@oracle.com>
Message-ID: <568CE6E9.1070904@redhat.com>

On 05/01/16 17:17, Vladimir Kozlov wrote:

> Please, don't forget that checks in pre-intrinsic code should match
> checks generated by javac (bytecode) for intrinsified
> methods. Otherwise those checks will not be removed (by dominated
> checks in pre-intrinsic code) when intrinsics are not supported on a
> platform. That is why we currently have such duplicated
> pre-intrinsic code.
> 
> On the other hand, when intrinsics are supported they don't have checks,
> so if they are present we can intrinsify pre-intrinsic code as you
> suggested.

It may be that I'm just being very dim, but I've read this ten times
and I still don't know exactly what you mean.

Can you give me a pointer to an example of such duplicated
pre-intrinsic code?

Thanks,

Andrew.

From paul.sandoz at oracle.com  Wed Jan  6 10:12:09 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Wed, 6 Jan 2016 11:12:09 +0100
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
Message-ID: <AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>


> On 6 Jan 2016, at 02:05, John Rose <john.r.rose at oracle.com> wrote:
> 
> Darn, this works against the "say what you mean" story I told for checkIndex.
> 
> The bug here is very very special but is hit commonly so needs fixing. The special part is that accumulating Math.max values over a long loop almost *always* creates a series of predictable branches, which means cmov will lose on many CPUs. (Exercise: Try to construct a long series of values for which each value is the largest so far, randomly, with 50% probability.  This will not be a series found often in nature.)
> 

Here are some results (see benchmark below, and thanks to Aleksey for hints/tips):

Benchmark          (bias)            (dg)  (size)  Mode  Cnt     Score     Error  Units
A.forTest_MathMax     0.1          RANDOM       1  avgt   10     3.698 ±   0.146  ns/op
A.forTest_MathMax     0.1          RANDOM      10  avgt   10     9.474 ±   0.234  ns/op
A.forTest_MathMax     0.1          RANDOM     100  avgt   10    84.363 ±   2.734  ns/op
A.forTest_MathMax     0.1          RANDOM    1000  avgt   10   840.102 ±  22.474  ns/op
A.forTest_MathMax     0.1          RANDOM   10000  avgt   10  8514.794 ± 202.722  ns/op
A.forTest_MathMax     0.1  RANDOM_RAMP_UP       1  avgt   10     3.764 ±   0.166  ns/op
A.forTest_MathMax     0.1  RANDOM_RAMP_UP      10  avgt   10     9.838 ±   0.428  ns/op
A.forTest_MathMax     0.1  RANDOM_RAMP_UP     100  avgt   10    84.650 ±   3.155  ns/op
A.forTest_MathMax     0.1  RANDOM_RAMP_UP    1000  avgt   10   844.412 ±  21.983  ns/op
A.forTest_MathMax     0.1  RANDOM_RAMP_UP   10000  avgt   10  8519.292 ± 295.786  ns/op
A.forTest_MathMax     0.5          RANDOM       1  avgt   10     3.667 ±   0.116  ns/op
A.forTest_MathMax     0.5          RANDOM      10  avgt   10     9.527 ±   0.235  ns/op
A.forTest_MathMax     0.5          RANDOM     100  avgt   10    83.318 ±   2.954  ns/op
A.forTest_MathMax     0.5          RANDOM    1000  avgt   10   843.540 ±  22.051  ns/op
A.forTest_MathMax     0.5          RANDOM   10000  avgt   10  8559.293 ± 333.435  ns/op
A.forTest_MathMax     0.5  RANDOM_RAMP_UP       1  avgt   10     3.712 ±   0.123  ns/op
A.forTest_MathMax     0.5  RANDOM_RAMP_UP      10  avgt   10     9.536 ±   0.195  ns/op
A.forTest_MathMax     0.5  RANDOM_RAMP_UP     100  avgt   10    82.943 ±   2.199  ns/op
A.forTest_MathMax     0.5  RANDOM_RAMP_UP    1000  avgt   10   842.282 ±  19.100  ns/op
A.forTest_MathMax     0.5  RANDOM_RAMP_UP   10000  avgt   10  8454.333 ± 293.222  ns/op
A.forTest_if          0.1          RANDOM       1  avgt   10     3.453 ±   0.106  ns/op
A.forTest_if          0.1          RANDOM      10  avgt   10     9.156 ±   0.555  ns/op
A.forTest_if          0.1          RANDOM     100  avgt   10    39.006 ±   1.575  ns/op
A.forTest_if          0.1          RANDOM    1000  avgt   10   372.999 ±  20.423  ns/op
A.forTest_if          0.1          RANDOM   10000  avgt   10  3613.243 ±  72.343  ns/op
A.forTest_if          0.1  RANDOM_RAMP_UP       1  avgt   10     3.410 ±   0.086  ns/op
A.forTest_if          0.1  RANDOM_RAMP_UP      10  avgt   10     9.236 ±   0.412  ns/op
A.forTest_if          0.1  RANDOM_RAMP_UP     100  avgt   10    49.200 ±   1.642  ns/op
A.forTest_if          0.1  RANDOM_RAMP_UP    1000  avgt   10   476.677 ±  16.041  ns/op
A.forTest_if          0.1  RANDOM_RAMP_UP   10000  avgt   10  3774.091 ± 131.946  ns/op
A.forTest_if          0.5          RANDOM       1  avgt   10     3.398 ±   0.121  ns/op
A.forTest_if          0.5          RANDOM      10  avgt   10     9.565 ±   0.614  ns/op
A.forTest_if          0.5          RANDOM     100  avgt   10    49.666 ±   2.257  ns/op
A.forTest_if          0.5          RANDOM    1000  avgt   10   383.734 ±  22.051  ns/op
A.forTest_if          0.5          RANDOM   10000  avgt   10  3624.447 ± 204.303  ns/op
A.forTest_if          0.5  RANDOM_RAMP_UP       1  avgt   10     3.446 ±   0.135  ns/op
A.forTest_if          0.5  RANDOM_RAMP_UP      10  avgt   10     9.330 ±   0.399  ns/op
A.forTest_if          0.5  RANDOM_RAMP_UP     100  avgt   10    84.596 ±   4.132  ns/op
A.forTest_if          0.5  RANDOM_RAMP_UP    1000  avgt   10   914.982 ±  30.125  ns/op
A.forTest_if          0.5  RANDOM_RAMP_UP   10000  avgt   10  8991.088 ± 315.307  ns/op

At least for this set of tests the results indicate conditional moves offer no major advantage over branching. For the worst case branching scenario (the "50 cent" case) conditional moves appear marginally better, but as you say the data pattern is likely rare.

Perhaps for conditional moves data dependency chains are more costly?
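
One way to see why the branch in a max accumulation is usually predictable is to count how often the running maximum actually changes. The following Java sketch does that for two inputs: uniformly random ints, where the expected number of updates grows only like ln(n), and an array filled the way the RANDOM_RAMP_UP generator in this message fills it with bias 0.5, where roughly every second element is a new maximum. The class name and the seeds are assumptions made for illustration; this is not part of the benchmark.

```java
import java.util.Random;

// Illustrative sketch: counts how often "a[i] >= m" is taken in a running-max loop.
final class RunningMaxUpdates {
    static int countUpdates(int[] a) {
        int m = Integer.MIN_VALUE;
        int updates = 0;
        for (int v : a) {
            if (v >= m) {   // same comparison as forTest_if in the benchmark
                m = v;
                updates++;
            }
        }
        return updates;
    }

    // Mirrors the RANDOM_RAMP_UP filler: element i becomes i with probability (1 - bias), else stays 0.
    static int[] rampUp(double bias, int size, long seed) {
        Random random = new Random(seed);
        int[] vs = new int[size];
        for (int i = 0; i < size; i++) {
            if (random.nextFloat() > bias) {
                vs[i] = i;
            }
        }
        return vs;
    }

    public static void main(String[] args) {
        int n = 10_000;
        int[] uniform = new Random(42).ints(n).toArray();
        System.out.println("uniform random: " + countUpdates(uniform) + " updates out of " + n);
        System.out.println("ramp-up, bias 0.5: " + countUpdates(rampUp(0.5, n, 42)) + " updates out of " + n);
    }
}
```

With the uniform input the branch is almost never taken after the first few elements, so a predictor handles it well; the ramp-up input with bias 0.5 is the constructed case where the branch is close to a coin flip.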

Paul.

package oracle.jmh;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;


@State(Scope.Benchmark)
@Fork(value = 1, warmups = 0)
@Warmup(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class A {

    @Param({"1", "10", "100", "1000", "10000"})
    int size;

    @Param({"0.0", "0.1", "0.2", "0.3", "0.4", "0.5"})
    private double bias;

    @Param({"RANDOM", "RANDOM_RAMP_UP"})
    DataGenerator dg;

    int ints[];

    @Setup
    public void setUp() {
        ints = dg.generate(bias, size);
    }

    public enum DataGenerator {
        RANDOM((b, vs) -> {
            Random random = new Random();
            for (int i = 0; i < vs.length; i++)
                if (random.nextFloat() > b)
                    vs[i] = random.nextInt();
        }),

        RANDOM_RAMP_UP((b, vs) -> {
            Random random = new Random();
            for (int i = 0; i < vs.length; i++) {
                if (random.nextFloat() > b)
                    vs[i] = i;
            }
        });

        final BiConsumer<Double, int[]> filler;

        DataGenerator(BiConsumer<Double, int[]> filler) {
            this.filler = filler;
        }

        int[] generate(double bias, int size) {
            int[] vs = new int[size];
            filler.accept(bias, vs);
            return vs;
        }
    }

    @Benchmark
    public int forTest_if() {
        int[] a = ints;
        int e = ints.length;
        int m = Integer.MIN_VALUE;
        for (int i = 0; i < e; i++)
            if (a[i] >= m)
                m = a[i];
        return m;
    }

    @Benchmark
    public int forTest_MathMax() {
        int[] a = ints;
        int e = ints.length;
        int m = Integer.MIN_VALUE;
        for (int i = 0; i < e; i++)
            m = Math.max(m, a[i]);
        return m;
    }

    @Benchmark
    public int streamTest_lambda() {
        return Arrays.stream(ints).reduce(Integer.MIN_VALUE, (a, b) -> a >= b ? a : b);
    }

    @Benchmark
    public int streamTest_MathMax() {
        return Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);
    }
}


> We need to explicitly detect accumulations on cmov ops in long loops, and convert them to branches.
> 
> Also, we should continue to recommend using intrinsics instead of random logic.
> 
> Fun fact:  Using your own branch logic makes the JVM manage a branch profile just for you, which can mean performance. Intrinsics, if they have internal branch logic, have polluted profiles. We need better call-site profiles and/or split profiles to overcome this.
> 
> -- John
> 
>> On Jan 5, 2016, at 4:47 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>> 
>> 
>>> On 5 Jan 2016, at 13:00, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>> 
>>> This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104
>> 
>> Many thanks, i closed JDK-8146071 as a dup of JDK-8039104.
>> 
>> Paul.


From john.r.rose at oracle.com  Wed Jan  6 10:31:09 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 6 Jan 2016 02:31:09 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568CE6E9.1070904@redhat.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com> <568BFA90.4020807@oracle.com>
	<568CE6E9.1070904@redhat.com>
Message-ID: <5C9816F5-9618-4BBF-A761-E03CCDEDB230@oracle.com>

On Jan 6, 2016, at 2:05 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> Can you give me a pointer to an example of such duplicated
> pre-intrinsic code?
> 

It's another case of belt-and-suspenders.

The internal bytecodes of a non-replaced intrinsic candidate perform
hardwired range checks as part of iaload, etc.  So the JVM defends
itself against out-of-range access, as usual.  Meanwhile, at a higher
level, the intrinsic candidate (whether replaced or not) is dominated
by a call to explicit range check logic.

  if (rangeCheckFail(array, indexes))  goto L_throw_1;
  /* non-replaced intrinsic, logic gets inlined as follows: */
  for (index in indexes...) {
     if (rangeCheckFail(array, index))  goto L_throw_2;
     tem = iaload(array, index);
     ... }

In the case of a replaced intrinsic, there is not guaranteed to
be a full range check of the array access, so:

  if (rangeCheckFail(array, indexes))  goto L_throw_1;
  /* replaced intrinsic */
  ...some vectorized assembly code works with array and indexes...

In the first case, if the first "rangeCheckFail" logic is similar enough
to the second "rangeCheckFail" logic, the JIT can elide the second one.
But they are likely *not* to match if the programmer has written something
elegant and/or clever for the first set of checks.
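
In Java source terms, the non-replaced case described above looks roughly like the sketch below. The method and class names are made up for illustration and this is not code from the CounterMode change. The explicit test at the top plays the role of the first rangeCheckFail, and each array[i] load carries the implicit per-element check that the JIT may or may not be able to prove redundant.

```java
// Illustrative sketch of an explicit up-front range check dominating implicit per-element checks.
final class RangeCheckSketch {
    static int sum(int[] array, int from, int to) {
        // Explicit check over the whole range (the "first rangeCheckFail").
        if (from < 0 || from > to || to > array.length) {
            throw new ArrayIndexOutOfBoundsException("bad range " + from + ".." + to);
        }
        int total = 0;
        for (int i = from; i < to; i++) {
            // Each load still has its own implicit bounds check (the "second rangeCheckFail").
            total += array[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sum(data, 1, 3)); // prints 5
        try {
            sum(data, 0, 5);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("rejected up front: " + e.getMessage());
        }
    }
}
```

If the up-front test were written in a more compact or clever form, the per-element checks inside the loop would still be correct, but the JIT might no longer recognize them as dominated.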

-- John

From aph at redhat.com  Wed Jan  6 10:41:31 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 6 Jan 2016 10:41:31 +0000
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
Message-ID: <568CEF5B.5060306@redhat.com>

On 05/01/16 22:11, John Rose wrote:

> Dropping the intrinsic would prevent them from expressing their
> intention, forcing them to fall back on Java's expression operators.

I don't really understand that point: Objects.checkIndex would still
exist, and hopefully people would use it, but it wouldn't need
special-case handling in C2.

> Also, it's not just a matter of micro-optimizing a single expression
> to use unsigned arithmetic (though that is surprisingly tricky).

I accept that point.

> Range checks are interesting to block-level loop transformations
> (iteration range reorganization).  Do you really want your loop
> optimizations to be gated on "sufficient smarts" in the JIT's
> expression pattern matcher?

Please forgive me for pushing this: I'm not arguing for the sake of it,
I'm trying to understand your reasoning.

As it stands we recognize a call to Objects.checkIndex and transform
it into a certain pattern.  I'm assuming that it's not impossible to
recognize the logic inside Objects.checkIndex and transform it into
the same form that the intrinsic generates.  And that would have a
payoff in all the places that the same logic is used in existing
programs, both inside and outside the JDK.

I suppose one downside of this approach is that C2 might decide
not to inline Objects.checkIndex, so it would be called instead
and the optimization would not be done.

Andrew.

From tobias.hartmann at oracle.com  Wed Jan  6 11:22:29 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 6 Jan 2016 12:22:29 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <55FBDFEC.4060405@oracle.com>
References: <55FBDFEC.4060405@oracle.com>
Message-ID: <568CF8F5.5090202@oracle.com>

Hi,

I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably forget to adapt such a fix if we ever get the ability to check for GVN/IGVN.

Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).

http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/

What do you think?

Thanks,
Tobias


On 18.09.2015 11:57, Tobias Hartmann wrote:
> Hi,
> 
> please review the following patch.
> 
> https://bugs.openjdk.java.net/browse/JDK-8136469
> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
> 
> Problem:
> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
> 
> Solution:
> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
> 
> Testing:
> - New test (TestPresizedStringBuilder)
> - JPRT
> 
> Thanks,
> Tobias
> 
> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
> 

From tobias.hartmann at oracle.com  Wed Jan  6 12:01:45 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 6 Jan 2016 13:01:45 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to compact
	strings
Message-ID: <568D0229.60908@oracle.com>

Hi,

please review the following patch.

https://bugs.openjdk.java.net/browse/JDK-8144212
http://cr.openjdk.java.net/~thartmann/8144212/webrev.00/

An Apache Lucene test fails with Compact Strings enabled because the result of String.getChars() is invalid. The problem is a missing membar after the _inflateString intrinsic, allowing a subsequent load from the destination array to flow above and return a wrong result (see [1]: 210 LoadUS should read the result of 196 StrInflatedCopy).

Tested with JPRT and failing Apache Lucene test.

During my investigation, I noticed that the StringUTF16.getChars() and StringUTF16.compress/inflate intrinsics use LibraryCallKit::tightly_coupled_allocation() to skip zeroing the array elements. However, the intrinsics do not take care of zeroing remaining array elements not affected by the intrinsic operation.
Currently, this is not a problem because all (String API internal) usages of the intrinsics that have a tightly coupled allocation make sure that the entire array is initialized. However, we should fix this to avoid potential bugs. I filed JDK-8146547 and will take care of it.

Thanks,
Tobias

[1] https://bugs.openjdk.java.net/secure/attachment/56238/Graph.png

From aph at redhat.com  Wed Jan  6 12:07:26 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 6 Jan 2016 12:07:26 +0000
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568D0229.60908@oracle.com>
References: <568D0229.60908@oracle.com>
Message-ID: <568D037E.7000105@redhat.com>

On 06/01/16 12:01, Tobias Hartmann wrote:

> An Apache Lucene test fails with Compact Strings enabled because the
> result of String.getChars() is invalid. The problem is a missing
> membar after the _inflateString intrinsic, allowing a subsequent
> load from the destination array to flow above and return a wrong
> result (see [1]: 210 LoadUS should read the result of 196
> StrInflatedCopy).
> 
> Tested with JPRT and failing Apache Lucene test.

Is a MemBarCPUOrder sufficient for machines with relaxed memory
ordering?

Andrew.

From vitalyd at gmail.com  Wed Jan  6 12:20:58 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 07:20:58 -0500
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568CEF5B.5060306@redhat.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
Message-ID: <CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>

I agree with Andrew.

We had a similarly themed discussion a few months back when someone wanted
to make Integer/Long::compareTo an intrinsic; the sentiment there was that
there's nothing "special" about compareTo, and instead the JIT can be
taught to pick up the pattern used in the bytecode for those methods.
Objects::checkIndex seems no different in that regard.

I realize that there may always be a user-specified shape that the JIT
doesn't understand, but straightforward cases should hopefully Just
Work(tm) as those patterns can be picked up elsewhere in code and
performance improves without changing a line of code.
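
As a reminder of how small the pattern in question is, here is a sketch of the nested-ternary shape commonly used for a three-way int comparison. It is written from memory as an illustration of the kind of bytecode shape a JIT could be taught to recognize, under the assumption that it matches what Integer.compare does; the class and method names are hypothetical.

```java
// Illustrative sketch of the three-way comparison pattern (hypothetical helper).
final class ComparePatternSketch {
    static int compareInts(int x, int y) {
        return (x < y) ? -1 : ((x == y) ? 0 : 1);
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, Integer.MAX_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                // Compare signs rather than exact values, since only the sign is specified.
                if (Integer.signum(compareInts(x, y)) != Integer.signum(Integer.compare(x, y))) {
                    throw new AssertionError("mismatch for " + x + ", " + y);
                }
            }
        }
        System.out.println("pattern agrees with Integer.compare on all samples");
    }
}
```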

On Wednesday, January 6, 2016, Andrew Haley <aph at redhat.com> wrote:

> On 05/01/16 22:11, John Rose wrote:
>
> > Dropping the intrinsic would prevent them from expressing their
> > intention, forcing them to fall back on Java's expression operators.
>
> I don't really understand that point: Objects.checkIndex would still
> exist, and hopefully people would use it, but it wouldn't need
> special-case handling in C2.
>
> > Also, it's not just a matter of micro-optimizing a single expression
> > to use unsigned arithmetic (though that is surprisingly tricky).
>
> I accept that point.
>
> > Range checks are interesting to block-level loop transformations
> > (iteration range reorganization).  Do you really want your loop
> > optimizations to be gated on "sufficient smarts" in the JIT's
> > expression pattern matcher?
>
> Please forgive me for pushing this: I'm not arguing for the sake of it,
> I'm trying to understand your reasoning.
>
> As it stands we recognize a call to Objects.checkIndex and transform
> it into a certain pattern.  I'm assuming that it's not impossible to
> recognize the logic inside Objects.checkIndex and transform it into
> the same form that the intrinsic generates.  And that would have a
> payoff in all the places that the same logic is used in existing
> programs, both inside and outside the JDK.
>
> I suppose one downside of this approach is that C2 might decide
> not to inline Objects.checkIndex, so it would be called instead
> and the optimization would not be done.
>
> Andrew.
>


-- 
Sent from my phone

From vitalyd at gmail.com  Wed Jan  6 12:38:20 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 07:38:20 -0500
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
Message-ID: <CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>

On Wednesday, January 6, 2016, Paul Sandoz <paul.sandoz at oracle.com> wrote:

>
> On 6 Jan 2016, at 02:05, John Rose <john.r.rose at oracle.com> wrote:
>
> Darn, this works against the "say what you mean" story I told for
> checkIndex.
>
> The bug here is very very special but is hit commonly so needs fixing. The
> special part is that accumulating Math.max values over a long loop almost
> *always* creates a series of predictable branches, which means cmov will
> lose on many CPUs. (Exercise: Try to construct a long series of
> values for which each value is the largest so far, randomly, with 50%
> probability.  This will not be a series found often in nature.)
>
>
> Here are some results (see benchmark below, and thanks to Aleksey for
> hints/tips):
>
> Benchmark          (bias)            (dg)  (size)  Mode  Cnt     Score     Error  Units
> A.forTest_MathMax     0.1          RANDOM       1  avgt   10     3.698 ±   0.146  ns/op
> A.forTest_MathMax     0.1          RANDOM      10  avgt   10     9.474 ±   0.234  ns/op
> A.forTest_MathMax     0.1          RANDOM     100  avgt   10    84.363 ±   2.734  ns/op
> A.forTest_MathMax     0.1          RANDOM    1000  avgt   10   840.102 ±  22.474  ns/op
> A.forTest_MathMax     0.1          RANDOM   10000  avgt   10  8514.794 ± 202.722  ns/op
> A.forTest_MathMax     0.1  RANDOM_RAMP_UP       1  avgt   10     3.764 ±   0.166  ns/op
> A.forTest_MathMax     0.1  RANDOM_RAMP_UP      10  avgt   10     9.838 ±   0.428  ns/op
> A.forTest_MathMax     0.1  RANDOM_RAMP_UP     100  avgt   10    84.650 ±   3.155  ns/op
> A.forTest_MathMax     0.1  RANDOM_RAMP_UP    1000  avgt   10   844.412 ±  21.983  ns/op
> A.forTest_MathMax     0.1  RANDOM_RAMP_UP   10000  avgt   10  8519.292 ± 295.786  ns/op
> A.forTest_MathMax     0.5          RANDOM       1  avgt   10     3.667 ±   0.116  ns/op
> A.forTest_MathMax     0.5          RANDOM      10  avgt   10     9.527 ±   0.235  ns/op
> A.forTest_MathMax     0.5          RANDOM     100  avgt   10    83.318 ±   2.954  ns/op
> A.forTest_MathMax     0.5          RANDOM    1000  avgt   10   843.540 ±  22.051  ns/op
> A.forTest_MathMax     0.5          RANDOM   10000  avgt   10  8559.293 ± 333.435  ns/op
> A.forTest_MathMax     0.5  RANDOM_RAMP_UP       1  avgt   10     3.712 ±   0.123  ns/op
> A.forTest_MathMax     0.5  RANDOM_RAMP_UP      10  avgt   10     9.536 ±   0.195  ns/op
> A.forTest_MathMax     0.5  RANDOM_RAMP_UP     100  avgt   10    82.943 ±   2.199  ns/op
> A.forTest_MathMax     0.5  RANDOM_RAMP_UP    1000  avgt   10   842.282 ±  19.100  ns/op
> A.forTest_MathMax     0.5  RANDOM_RAMP_UP   10000  avgt   10  8454.333 ± 293.222  ns/op
> A.forTest_if          0.1          RANDOM       1  avgt   10     3.453 ±   0.106  ns/op
> A.forTest_if          0.1          RANDOM      10  avgt   10     9.156 ±   0.555  ns/op
> A.forTest_if          0.1          RANDOM     100  avgt   10    39.006 ±   1.575  ns/op
> A.forTest_if          0.1          RANDOM    1000  avgt   10   372.999 ±  20.423  ns/op
> A.forTest_if          0.1          RANDOM   10000  avgt   10  3613.243 ±  72.343  ns/op
> A.forTest_if          0.1  RANDOM_RAMP_UP       1  avgt   10     3.410 ±   0.086  ns/op
> A.forTest_if          0.1  RANDOM_RAMP_UP      10  avgt   10     9.236 ±   0.412  ns/op
> A.forTest_if          0.1  RANDOM_RAMP_UP     100  avgt   10    49.200 ±   1.642  ns/op
> A.forTest_if          0.1  RANDOM_RAMP_UP    1000  avgt   10   476.677 ±  16.041  ns/op
> A.forTest_if          0.1  RANDOM_RAMP_UP   10000  avgt   10  3774.091 ± 131.946  ns/op
> A.forTest_if          0.5          RANDOM       1  avgt   10     3.398 ±   0.121  ns/op
> A.forTest_if          0.5          RANDOM      10  avgt   10     9.565 ±   0.614  ns/op
> A.forTest_if          0.5          RANDOM     100  avgt   10    49.666 ±   2.257  ns/op
> A.forTest_if          0.5          RANDOM    1000  avgt   10   383.734 ±  22.051  ns/op
> A.forTest_if          0.5          RANDOM   10000  avgt   10  3624.447 ± 204.303  ns/op
> A.forTest_if          0.5  RANDOM_RAMP_UP       1  avgt   10     3.446 ±   0.135  ns/op
> A.forTest_if          0.5  RANDOM_RAMP_UP      10  avgt   10     9.330 ±   0.399  ns/op
> A.forTest_if          0.5  RANDOM_RAMP_UP     100  avgt   10    84.596 ±   4.132  ns/op
> A.forTest_if          0.5  RANDOM_RAMP_UP    1000  avgt   10   914.982 ±  30.125  ns/op
> A.forTest_if          0.5  RANDOM_RAMP_UP   10000  avgt   10  8991.088 ± 315.307  ns/op
>
> At least for this set of tests the results indicate conditional moves
> offer no major advantage over branching. For the worst case branching
> scenario (the "50 cent" case) conditional moves appear marginally better,
> but as you say the data pattern is likely rare.
>
> Perhaps for conditional moves data dependency chains are more costly?
>

cmov carries a dependency on both inputs, making it more likely to stall
when at least one isn't available whereas the branch still allows cpu to
continue with speculative execution.  In a tight loop with a memory access
as one input to cmov, the memory op has to retire before cmov can proceed;
using cmov when both inputs are already ready (e.g. values in registers) is
pretty harmless though and avoids a branch entirely.  cmov also has larger
encoding than a branch.

As the original jira on this issue states, cmov should only be used when
the branch is profiled to be unpredictable.  I'm not sure why loops with a
max/min accumulator need to be called out separately in this regard -
wouldn't the branch profile dictate this anyway? This of course assumes
that profile pollution is addressed in some manner.

>
> Paul.
>
> package oracle.jmh;
>
> import org.openjdk.jmh.annotations.Benchmark;
> import org.openjdk.jmh.annotations.BenchmarkMode;
> import org.openjdk.jmh.annotations.Fork;
> import org.openjdk.jmh.annotations.Measurement;
> import org.openjdk.jmh.annotations.Mode;
> import org.openjdk.jmh.annotations.OutputTimeUnit;
> import org.openjdk.jmh.annotations.Param;
> import org.openjdk.jmh.annotations.Scope;
> import org.openjdk.jmh.annotations.Setup;
> import org.openjdk.jmh.annotations.State;
> import org.openjdk.jmh.annotations.Warmup;
>
> import java.util.Arrays;
> import java.util.Random;
> import java.util.concurrent.TimeUnit;
> import java.util.function.BiConsumer;
>
>
> @State(Scope.Benchmark)
> @Fork(value = 1, warmups = 0)
> @Warmup(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
> @Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> public class A {
>
>     @Param({"1", "10", "100", "1000", "10000"})
>     int size;
>
>     @Param({"0.0", "0.1", "0.2", "0.3", "0.4", "0.5"})
>     private double bias;
>
>     @Param({"RANDOM", "RANDOM_RAMP_UP"})
>     DataGenerator dg;
>
>     int ints[];
>
>     @Setup
>     public void setUp() {
>         ints = dg.generate(bias, size);
>     }
>
>     public enum DataGenerator {
>         RANDOM((b, vs) -> {
>             Random random = new Random();
>             for (int i = 0; i < vs.length; i++)
>                 if (random.nextFloat() > b)
>                     vs[i] = random.nextInt();
>         }),
>
>         RANDOM_RAMP_UP((b, vs) -> {
>             Random random = new Random();
>             for (int i = 0; i < vs.length; i++) {
>                 if (random.nextFloat() > b)
>                     vs[i] = i;
>             }
>         });
>
>         final BiConsumer<Double, int[]> filler;
>
>         DataGenerator(BiConsumer<Double, int[]> filler) {
>             this.filler = filler;
>         }
>
>         int[] generate(double bias, int size) {
>             int[] vs = new int[size];
>             filler.accept(bias, vs);
>             return vs;
>         }
>     }
>
>     @Benchmark
>     public int forTest_if() {
>         int[] a = ints;
>         int e = ints.length;
>         int m = Integer.MIN_VALUE;
>         for (int i = 0; i < e; i++)
>             if (a[i] >= m)
>                 m = a[i];
>         return m;
>     }
>
>     @Benchmark
>     public int forTest_MathMax() {
>         int[] a = ints;
>         int e = ints.length;
>         int m = Integer.MIN_VALUE;
>         for (int i = 0; i < e; i++)
>             m = Math.max(m, a[i]);
>         return m;
>     }
>
>     @Benchmark
>     public int streamTest_lambda() {
>         return Arrays.stream(ints).reduce(Integer.MIN_VALUE, (a, b) -> a >= b ? a : b);
>     }
>
>     @Benchmark
>     public int streamTest_MathMax() {
>         return Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);
>     }
> }
>
>
>
> We need to explicitly detect accumulations on cmov ops in long loops, and
> convert them to branches.
>
> Also, we should continue to recommend using intrinsics instead of random
> logic.
>
> Fun fact:  Using your own branch logic makes the JVM manage a branch
> profile just for you, which can mean performance. Intrinsics, if they have
> internal branch logic, have polluted profiles. We need better call-site
> profiles and/or split profiles to overcome this.
>
> -- John
>
> On Jan 5, 2016, at 4:47 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>
>
> On 5 Jan 2016, at 13:00, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
> This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104
>
>
> Many thanks, i closed JDK-8146071 as a dup of JDK-8039104.
>
> Paul.
>
>
>

-- 
Sent from my phone
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160106/a86f4443/attachment-0001.html>

From vitalyd at gmail.com  Wed Jan  6 12:43:36 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 07:43:36 -0500
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <585079517.742162.1452045766579.JavaMail.zimbra@u-pem.fr>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<585079517.742162.1452045766579.JavaMail.zimbra@u-pem.fr>
Message-ID: <CAHjP37EFNkg2O9FszRJ0WWTdbqYiLuBLrt0xsmKKV5fe=yW+iQ@mail.gmail.com>

Ideally profile pollution could be solved/improved without requiring
tiered; tiered has its own wrinkles, and many places simply use C2 alone.


On Tuesday, January 5, 2016, Remi Forax <forax at univ-mlv.fr> wrote:

> ----- Original Message -----
> > From: "John Rose" <john.r.rose at oracle.com>
> > To: "Paul Sandoz" <paul.sandoz at oracle.com>
> > Cc: "hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
> > Sent: Wednesday, 6 January 2016 02:05:56
> > Subject: Re: Conditional moves vs. branching in unrolled loops
> >
> > Darn, this works against the "say what you mean" story I told for
> > checkIndex.
> >
> > The bug here is very very special but is hit commonly so needs fixing. The
> > special part is that accumulating Math.max values over a long loop almost
> > *always* creates a series of predictable branches, which means cmov will
> > lose on many CPUs. (Exercise: Try to construct a long series of
> > values for which each value is the largest so far, randomly, with 50%
> > probability.  This will not be a series found often in nature.)
> >
> > We need to explicitly detect accumulations on cmov ops in long loops, and
> > convert them to branches.
> >
> > Also, we should continue to recommend using intrinsics instead of random
> > logic.
> >
> > Fun fact:  Using your own branch logic makes the JVM manage a branch profile
> > just for you, which can mean performance. Intrinsics, if they have internal
> > branch logic, have polluted profiles. We need better call-site profiles
> > and/or split profiles to overcome this.
>
> we already have the first part of a kind of split profiles in tiered mode,
> if code is first inlined by c1, c2 could use these different profiles,
> but currently the profiles are shared because you have one profile for one
> bci.
>
> so in tiered mode, we should have one profile per bci + caller path inside
> the same inlining blob,
> the VM needs to keep the inlining tree created by c1 to send it to c2
> (there is maybe enough info in the stackwalk info to recreate the inlining
> tree).
>
> >
> > -- John
>
> Rémi
>
> >
> > > On Jan 5, 2016, at 4:47 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:
> > >
> > >
> > >> On 5 Jan 2016, at 13:00, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> > >>
> > >> This is a known issue: https://bugs.openjdk.java.net/browse/JDK-8039104
> > >
> > > Many thanks, i closed JDK-8146071 as a dup of JDK-8039104.
> > >
> > > Paul.
> >
>


-- 
Sent from my phone

From tobias.hartmann at oracle.com  Wed Jan  6 13:06:16 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 6 Jan 2016 14:06:16 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568D037E.7000105@redhat.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
Message-ID: <568D1148.1030901@oracle.com>

Hi Andrew,

On 06.01.2016 13:07, Andrew Haley wrote:
> On 06/01/16 12:01, Tobias Hartmann wrote:
> 
>> An Apache Lucene test fails with Compact Strings enabled because the
>> result of String.getChars() is invalid. The problem is a missing
>> membar after the _inflateString intrinsic, allowing a subsequent
>> load from the destination array to flow above and return a wrong
>> result (see [1]: 210 LoadUS should read the result of 196
>> StrInflatedCopy).
>>
>> Tested with JPRT and failing Apache Lucene test.
> 
> Is a MemBarCPUOrder sufficient for machines with relaxed memory
> ordering?

The problem here is that C2 reorders memory instructions and moves an array load before an array store. The MemBarCPUOrder is now used (internally, by the compiler) to prevent this. We do the same for normal array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual code is emitted. See also the comment in memnode.hpp:

 // Ordering within the same CPU.  Used to order unsafe memory references
 // inside the compiler when we lack alias info.  Not needed "outside" the
 // compiler because the CPU does all the ordering for us.

"CPU does all the ordering for us" means that even with a relaxed memory ordering, loads are never moved before dependent stores.
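
For readers following along, the access pattern described above (a copy into a destination array followed by a load from that same array) can be sketched in plain Java. This is only an illustration of the shape of the code, not the failing Lucene test; the class and method names are made up for this example, and whether C2 actually uses the _inflateString intrinsic here depends on the JDK build and on the string being stored in the compact (Latin-1) form.

```java
public class GetCharsReadBack {

    // Copies the string into a fresh char[] and immediately reads the first
    // element back. The read must observe the result of the copy; the bug
    // discussed in this thread was a load of this kind being scheduled
    // before the inflating copy.
    static char firstCharViaGetChars(String s) {
        char[] dst = new char[s.length()];
        s.getChars(0, s.length(), dst, 0);
        return dst[0]; // requires a non-empty string
    }

    public static void main(String[] args) {
        // A hot loop so that a JIT compiler has a chance to compile the method.
        char c = 0;
        for (int i = 0; i < 100_000; i++) {
            c = firstCharViaGetChars("lucene");
        }
        System.out.println(c);
    }
}
```

On a correct VM the returned character always equals s.charAt(0).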

Or did I misunderstand your question?

Thanks,
Tobias

From aph at redhat.com  Wed Jan  6 13:34:28 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 6 Jan 2016 13:34:28 +0000
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568D1148.1030901@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com>
Message-ID: <568D17E4.90301@redhat.com>

On 01/06/2016 01:06 PM, Tobias Hartmann wrote:

> The problem here is that C2 reorders memory instructions and moves
> an array load before an array store. The MemBarCPUOrder is now used
> (compiler internally) to prevent this. We do the same for normal
> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
> code is emitted. See also the comment in memnode.hpp:
> 
>  // Ordering within the same CPU.  Used to order unsafe memory references
>  // inside the compiler when we lack alias info.  Not needed "outside" the
>  // compiler because the CPU does all the ordering for us.
> 
> "CPU does all the ordering for us" means that even with a relaxed
> memory ordering, loads are never moved before dependent stores.
> 
> Or did I misunderstand your question?

No, I don't think so.  I was just checking: I am very aware that
HotSpot has presented those of us with relaxed memory order machines
with some interesting gotchas over the years, that's all.  I'm a bit
surprised that C2 needs this barrier, given that there is a
read-after-write dependency, but never mind.

Thanks,

Andrew.

From paul.sandoz at oracle.com  Wed Jan  6 14:14:40 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Wed, 6 Jan 2016 15:14:40 +0100
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
Message-ID: <9D66933E-DF61-45F6-94C9-4A314A9AC6F3@oracle.com>


> On 6 Jan 2016, at 11:12, Paul Sandoz <Paul.Sandoz at oracle.com> wrote:

> At least for this set of tests the results indicate conditional moves offer no major advantage over branching. For the worst case branching scenario (the "50 cent" case) conditional moves appear marginally better, but as you say the data pattern is likely rare.

Scrap that, cmoves are kicking in for "A.forTest_if     0.5  RANDOM_RAMP_UP"

Disabling them with -XX:ConditionalMoveLimit=0 (thanks Roland) muddies the waters a bit:
# VM options: -XX:-TieredCompilation
Benchmark     (bias)            (dg)  (size)  Mode  Cnt     Score     Error  Units
A.forTest_if     0.1  RANDOM_RAMP_UP       1  avgt    5     3.535 ±   0.083  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP      10  avgt    5     7.478 ±   0.232  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP     100  avgt    5    42.348 ±   0.922  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP    1000  avgt    5   460.924 ±  12.692  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP   10000  avgt    5  3708.576 ± 110.138  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP       1  avgt    5     3.557 ±   0.172  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP      10  avgt    5     9.860 ±   0.135  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP     100  avgt    5    82.380 ±   1.971  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP    1000  avgt    5   832.391 ±  23.629  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP   10000  avgt    5  8325.406 ± 206.872  ns/op


# VM options: -XX:ConditionalMoveLimit=0 -XX:-TieredCompilation
Benchmark     (bias)            (dg)  (size)  Mode  Cnt      Score     Error  Units
A.forTest_if     0.1  RANDOM_RAMP_UP       1  avgt    5      3.554 ±   0.049  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP      10  avgt    5      9.382 ±   0.062  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP     100  avgt    5     37.483 ±   0.696  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP    1000  avgt    5    369.375 ±   9.780  ns/op
A.forTest_if     0.1  RANDOM_RAMP_UP   10000  avgt    5   3712.492 ± 128.310  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP       1  avgt    5      3.546 ±   0.053  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP      10  avgt    5      7.488 ±   0.118  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP     100  avgt    5     52.889 ±   5.328  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP    1000  avgt    5    447.437 ±  14.273  ns/op
A.forTest_if     0.5  RANDOM_RAMP_UP   10000  avgt    5  10040.920 ± 993.644  ns/op
Paul.

From paul.sandoz at oracle.com  Wed Jan  6 14:34:29 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Wed, 6 Jan 2016 15:34:29 +0100
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
	<CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>
Message-ID: <A991B8D1-5F9A-42B0-BD41-783E74F4C7B4@oracle.com>


> On 6 Jan 2016, at 13:38, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> Perhaps for conditional moves data dependency chains are more costly?
> 
> cmov carries a dependency on both inputs, making it more likely to stall when at least one isn't available whereas the branch still allows cpu to continue with speculative execution.  In a tight loop with a memory access as one input to cmov, the memory op has to retire before cmov can proceed; using cmov when both inputs are already ready (e.g. values in registers) is pretty harmless though and avoids a branch entirely.  cmov also has larger encoding than a branch.
> 

Ok. The generated code for an unrolled loop first loads array elements into registers before performing the cmovs.


> As the original jira on this issue states, cmov should only be used when the branch is profiled to be unpredictable.  I'm not sure why loops with a max/min accumulator need to be called out separately in this regard - wouldn't the branch profile dictate this anyway?

Yes, that was me not understanding the underlying branch profiling mechanisms.

Paul.

From david.buck at oracle.com  Mon Jan  4 08:35:51 2016
From: david.buck at oracle.com (david buck)
Date: Mon, 4 Jan 2016 17:35:51 +0900
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
In-Reply-To: <568A2A20.7030601@oracle.com>
References: <568A2A20.7030601@oracle.com>
Message-ID: <568A2EE7.4030600@oracle.com>

Hi Tobias!

Would you please include links to the code review threads on 
mail.openjdk.java.net?

[ JDK 8 Updates: Push Approval Request Template ]
http://openjdk.java.net/projects/jdk8u/approval-template.html

Cheers,
-Buck

On 2016/01/04 17:15, Tobias Hartmann wrote:
> Hi,
>
> please approve and review the following backports to 8u.
>
> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
> https://bugs.openjdk.java.net/browse/JDK-8144487
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>
> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
> https://bugs.openjdk.java.net/browse/JDK-8145754
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>
> Nightly testing showed no problems and the changes apply cleanly to 8u-dev.
>
> Thanks,
> Tobias
>

From david.buck at oracle.com  Mon Jan  4 09:56:39 2016
From: david.buck at oracle.com (david buck)
Date: Mon, 4 Jan 2016 18:56:39 +0900
Subject: [8u] Request for approval: Backport of 8144487 and 8145754
In-Reply-To: <568A3BB9.1010501@oracle.com>
References: <568A2A20.7030601@oracle.com> <568A2EE7.4030600@oracle.com>
	<568A3BB9.1010501@oracle.com>
Message-ID: <568A41D7.2030503@oracle.com>

approved for backport to 8u-dev

Thank you for adding the review links.

Cheers,
-Buck

On 2016/01/04 18:30, Tobias Hartmann wrote:
> Hi David,
>
> sure, I included the links to the code review:
>
> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
> https://bugs.openjdk.java.net/browse/JDK-8144487
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020503.html
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>
> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
> https://bugs.openjdk.java.net/browse/JDK-8145754
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020502.html
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>
> Thanks,
> Tobias
>
> On 04.01.2016 09:35, david buck wrote:
>> Hi Tobias!
>>
>> Would you please include links to the code review threads on mail.openjdk.java.net?
>>
>> [ JDK 8 Updates: Push Approval Request Template ]
>> http://openjdk.java.net/projects/jdk8u/approval-template.html
>>
>> Cheers,
>> -Buck
>>
>> On 2016/01/04 17:15, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please approve and review the following backports to 8u.
>>>
>>> 8144487: PhaseIdealLoop::build_and_optimize() must restore major_progress flag if skip_loop_opts is true
>>> https://bugs.openjdk.java.net/browse/JDK-8144487
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/21689239c407
>>>
>>> 8145754: PhaseIdealLoop::is_scaled_iv_plus_offset() does not match AddI
>>> https://bugs.openjdk.java.net/browse/JDK-8145754
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/0e9d64117522
>>>
>>> Nightly testing showed no problems and the changes apply cleanly to 8u-dev.
>>>
>>> Thanks,
>>> Tobias
>>>

From vitalyd at gmail.com  Wed Jan  6 14:45:35 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 09:45:35 -0500
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <A991B8D1-5F9A-42B0-BD41-783E74F4C7B4@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
	<CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>
	<A991B8D1-5F9A-42B0-BD41-783E74F4C7B4@oracle.com>
Message-ID: <CAHjP37E1VfF7TYnNOnTraiZJZRhLXo9OVH=jmLfNm+mwkCQjEw@mail.gmail.com>

>
> Ok. The generated code for an unrolled loop first loads array elements
> into registers before performing the cmovs.


Yes, but the cmov cannot proceed until that load retires.  If you had a
normal branch, speculation can continue past the branch and put more
instructions into the pipeline barring other hazards/dependencies.  By
"available in registers" I meant a cmov executed against 2 values in
registers that are already available (i.e. the loads which put the values
into registers have already completed, or the registers were set with
immediates, etc).

Basically, if the cost of branch misprediction is higher than waiting for
both inputs to cmov to be available, then cmov is better.  For very
predictable branches, cmov is a loss (as we've already established in this
thread) and I think always will be (i.e. cpu vendors seem to be putting
more and more smarts into branch prediction instead).

> Yes, that was me not understanding the underlying branch profiling
> mechanisms.


Actually, that question of mine was more aimed at John who said we should
do something special for loops with max/min accumulators :).


On Wed, Jan 6, 2016 at 9:34 AM, Paul Sandoz <paul.sandoz at oracle.com> wrote:

>
> > On 6 Jan 2016, at 13:38, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> >
> > Perhaps for conditional moves data dependency chains are more costly?
> >
> > cmov carries a dependency on both inputs, making it more likely to stall
> when at least one isn't available whereas the branch still allows cpu to
> continue with speculative execution.  In a tight loop with a memory access
> as one input to cmov, the memory op has to retire before cmov can proceed;
> using cmov when both inputs are already ready (e.g. values in registers) is
> pretty harmless though and avoids a branch entirely.  cmov also has larger
> encoding than a branch.
> >
>
> Ok. The generated code for an unrolled loop first loads array elements
> into registers before performing the cmovs.
>
>
> > As the original jira on this issue states, cmov should only be used when
> the branch is profiled to be unpredictable.  I'm not sure why loops with a
> max/min accumulator need to be called out separately in this regard -
> wouldn't the branch profile dictate this anyway?
>
> Yes, that was me not understanding the underlying branch profiling
> mechanisms.
>
> Paul.
>

From paul.sandoz at oracle.com  Wed Jan  6 15:01:52 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Wed, 6 Jan 2016 16:01:52 +0100
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <CAHjP37E1VfF7TYnNOnTraiZJZRhLXo9OVH=jmLfNm+mwkCQjEw@mail.gmail.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
	<CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>
	<A991B8D1-5F9A-42B0-BD41-783E74F4C7B4@oracle.com>
	<CAHjP37E1VfF7TYnNOnTraiZJZRhLXo9OVH=jmLfNm+mwkCQjEw@mail.gmail.com>
Message-ID: <60AACA78-1F15-4B35-84CB-6BCED8172324@oracle.com>


> On 6 Jan 2016, at 15:45, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> Ok. The generated code for an unrolled loop first loads array elements into registers before performing the cmovs.
> 
> Yes, but the cmov cannot proceed until that load retires.  If you had a normal branch, speculation can continue past the branch and put more instructions into the pipeline barring other hazards/dependencies.  By "available in registers" I meant a cmov executed against 2 values in registers that are already available (i.e. the loads which put the values into registers have already completed, or the registers were set with immediates, etc).
> 
> Basically, if the cost of branch misprediction is higher than waiting for both inputs to cmov to be available, then cmov is better.  For very predictable branches, cmov is a loss (as we've already established in this thread) and I think always will be (i.e. cpu vendors seem to be putting more and more smarts into branch prediction instead).
> 

Thanks for the explanations. It's helpful.


> Yes, that was me not understanding the underlying branch profiling mechanisms.
> 
> Actually, that question of mine was more aimed at John who said we should do something special for loops with max/min accumulators :).
> 

Oh, ok :-)

Paul.

From john.r.rose at oracle.com  Wed Jan  6 17:22:53 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 6 Jan 2016 09:22:53 -0800
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <CAHjP37E1VfF7TYnNOnTraiZJZRhLXo9OVH=jmLfNm+mwkCQjEw@mail.gmail.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
	<CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>
	<A991B8D1-5F9A-42B0-BD41-783E74F4C7B4@oracle.com>
	<CAHjP37E1VfF7TYnNOnTraiZJZRhLXo9OVH=jmLfNm+mwkCQjEw@mail.gmail.com>
Message-ID: <A79EB5B8-3FC7-4F7D-8B1C-3F081893F542@oracle.com>

On Jan 6, 2016, at 6:45 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> Basically, if the cost of branch misprediction is higher than waiting for both inputs to cmov to be available, then cmov is better.  For very predictable branches, cmov is a loss (as we've already established in this thread) and I think always will be (i.e. cpu vendors seem to be putting more and more smarts into branch prediction instead).
> 
> Yes, that was me not understanding the underlying branch profiling mechanisms.
> 
> Actually, that question of mine was more aimed at John who said we should do something special for loops with max/min accumulators :). 

Buried in the bug comments is the following insight:  Branch profiling by the JVM is different from branch profiling by the CPU, and the difference is significant for the specific use case of an accumulated max (or min).

The CPU's profiling has a much shorter time scale:  It collects information (many times) over the course of a single loop.  The JVM's profiling has a long time scale, usually the whole application execution.  If a loop has bursty behavior (high short-span correlation) the CPU can predict branches very well, even though the JVM sees just noise.
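
The difference in time scale can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not how HotSpot or any real CPU works: takenFraction stands in for a whole-run profile counter, and lastOutcomeAccuracy stands in for a (far too simple) hardware predictor that guesses "same as last time". All names are invented for this example.

```java
public class BurstyBranchProfile {

    // Long-run fraction of "taken" outcomes: roughly what a single
    // whole-run profile counter for the branch would record.
    static double takenFraction(boolean[] outcomes) {
        int taken = 0;
        for (boolean o : outcomes) {
            if (o) taken++;
        }
        return (double) taken / outcomes.length;
    }

    // Accuracy of a toy predictor that always guesses the previous outcome.
    static double lastOutcomeAccuracy(boolean[] outcomes) {
        int correct = 0;
        boolean prediction = false; // arbitrary initial guess
        for (boolean o : outcomes) {
            if (o == prediction) correct++;
            prediction = o;
        }
        return (double) correct / outcomes.length;
    }

    // Bursty input: alternating runs of taken / not-taken outcomes.
    static boolean[] bursty(int runs, int runLength) {
        boolean[] outcomes = new boolean[runs * runLength];
        for (int i = 0; i < outcomes.length; i++) {
            outcomes[i] = (i / runLength) % 2 == 0;
        }
        return outcomes;
    }

    public static void main(String[] args) {
        boolean[] outcomes = bursty(10, 1000);
        System.out.println("taken fraction        = " + takenFraction(outcomes));
        System.out.println("last-outcome accuracy = " + lastOutcomeAccuracy(outcomes));
    }
}
```

With ten runs of 1000 outcomes each, the long-run counter sees an even split, which looks like an unpredictable branch, while the toy predictor is wrong only at the start and at each run boundary.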

(Fun fact:  The JVM could also profile auto-correlation and other statistics, but we have avoided doing this so far.)

So, usually, the branch profiling done in software by the JVM (interpreter or profiled tier) gives enough information to predict what the CPU will experience.  In this very special case (a = max(a, x) for loop-varying x), almost all inputs "settle down" to an almost 100% branch profile, in favor of 'a'.  For random data, you expect to find your max half way through the loop.  That means that the second half of the loop can be speculated as "a = a" instead of "a = max(a, x)".
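
The "settling down" of the accumulator can be checked empirically by counting how often the update branch of a loop like forTest_if is actually taken. The following is a minimal sketch under stated assumptions: uniformly random ints from java.util.Random, an arbitrary fixed seed, and a class name invented for this example. It is not part of the benchmark posted earlier in the thread.

```java
import java.util.Random;

public class RunningMaxUpdates {

    // Same loop shape as forTest_if, but also counts how many times the
    // "found a new maximum" branch is taken.
    static int countUpdates(int[] a) {
        int m = Integer.MIN_VALUE;
        int updates = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] >= m) {
                m = a[i];
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed, arbitrary choice
        for (int size : new int[] {10, 100, 1000, 10000}) {
            int[] a = new int[size];
            for (int i = 0; i < size; i++) {
                a[i] = random.nextInt();
            }
            System.out.printf("size=%d updates=%d%n", size, countUpdates(a));
        }
    }
}
```

For random input the update count should stay very small relative to the array length, so the branch is almost always not taken; only adversarial data such as a sorted ascending array takes it on every iteration.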

This, in turn, can be detected in the JIT by pattern-matching locally on the max node, to see if it is of the form phi = max(phi, x).

Fair enough?

-- John

From john.r.rose at oracle.com  Wed Jan  6 17:32:54 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 6 Jan 2016 09:32:54 -0800
Subject: RFR(XS): 8144852: Corrupted oop in nmethod
In-Reply-To: <5671F5F6.9060605@oracle.com>
References: <566A44AA.1040101@oracle.com> <566AB84C.1000603@oracle.com>
	<566B216B.1020204@oracle.com> <5671CC94.2080205@oracle.com>
	<5671F5F6.9060605@oracle.com>
Message-ID: <457FC936-D24F-4486-8D99-E4D8B55528CC@oracle.com>

On Dec 16, 2015, at 3:38 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
> 
> Adding non_oop_word to oopDesc::print_*_on would imply that it's OK to assign this value in a more general context, which is not true.

So put in a comment.  The print_on stuff is for us to use in debuggers and tracing code, not for end users who might be confused.

> So I would suggest keeping knowledge of non_oop_word inside nmethod for now, and we can revisit this if other places start to use non_oop_word.

Either way is OK with me.  But I like my print functions to be as forgiving as possible; don't you?

-- John

From john.r.rose at oracle.com  Wed Jan  6 17:42:44 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 6 Jan 2016 09:42:44 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
Message-ID: <86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>

On Jan 6, 2016, at 4:20 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> I realize that there may always be a user-specified shape that the JIT doesn't understand, but straightforward cases should hopefully Just Work(tm) as those patterns can be picked up elsewhere in code and performance improves without changing a line of code.

Where we differ is this:  I am skeptical that there is a well-defined set of "straightforward cases", which all reasonable coders, who expect optimization, will use.

It is better to point out one case for favorable treatment, and say "if you really expect best optimization, use this name".  Followed by, "if you don't choose to use that name, we'll still try to optimize all the straightforward cases, but don't expect us to prioritize them as highly as the best practice we suggested".

-- John

From john.r.rose at oracle.com  Wed Jan  6 17:51:06 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 6 Jan 2016 09:51:06 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568CEF5B.5060306@redhat.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
Message-ID: <8AF84C4E-7ECA-4A0F-8CF1-24C38B294C72@oracle.com>

On Jan 6, 2016, at 2:41 AM, Andrew Haley <aph at redhat.com> wrote:
> 
>> 
>> Range checks are interesting to block-level loop transformations
>> (iteration range reorganization).  Do you really want your loop
>> optimizations to be gated on "sufficient smarts" in the JIT's
>> expression pattern matcher?
> 
> Please forgive me for pushing this: I'm not arguing for the sake of it,
> I'm trying to understand your reasoning.
> 
> As it stands we recognize a call to Objects.checkIndex and transform
> it into a certain pattern.  I'm assuming that it's not impossible to
> recognize the logic inside Objects.checkIndex and transform it into
> the same form that the intrinsic generates.  And that would have a
> payoff in all the places that the same logic is used in existing
> programs, both inside and outside the JDK.

Sure, and we do this as much as possible.  But there are too many
degrees of freedom in user-coded range check expressions.  So
we give the users a clearer target to aim at if they want best perf.
on range checks.

You could say (as Vitaly pointed out for Integer.compareTo), that
we don't need an intrinsic as long as the bytecoded body of
Objects.checkIndex has the Best Possible Formulation (tm)
of a range check, which naturally will always be maximally
optimized by the JIT.  The specific problem with range checking
is that (as I said before) the JVM inserts its own range checks
into bytecode semantics (iaload etc.), and we need to make
the user-written ones fold up with the JVM-inserted ones.
That is a hard coupling between the JDK and JVM, much
harder than just "yes, we are all using the same math".
An intrinsic properly expresses and enforces this coupling.
Using a similar expression does not.
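
To make the two styles concrete, here is a hedged sketch: one of the many hand-written shapes a range check can take, next to the named entry point. The class and method names are hypothetical. Objects.checkIndex(int, int) is the method under discussion and is assumed to be available (JDK 9 or later).

```java
import java.util.Objects;

public class RangeCheckShapes {
    // One of many possible hand-written shapes for the same check.
    static int getManual(int[] a, int i) {
        if (i < 0 || i >= a.length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return a[i];   // the JVM still applies its own check for this array load
    }

    // The named entry point: a single, recognizable target for the JIT.
    static int getChecked(int[] a, int i) {
        Objects.checkIndex(i, a.length);
        return a[i];
    }
}
```

Both methods reject the same indices. The difference is only in how reliably the compiler can tie the explicit check to the implicit one on the array load.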

> I suppose one downside of this approach is that C2 might decide
> not to inline Objects.checkIndex, so it would be called instead
> and the optimization would not be done.

Yes, come to think of it, one "super power" of an intrinsic is that
the inlining heuristics apply to it more favorably.

-- John

From christian.thalinger at oracle.com  Wed Jan  6 17:54:20 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 6 Jan 2016 07:54:20 -1000
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <6C07E8DD-50D4-4B2E-BD8E-B131579A9664@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
	<A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
	<8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>
	<1297DA97-3C65-403D-AB46-16E203A74F26@oracle.com>
	<6C07E8DD-50D4-4B2E-BD8E-B131579A9664@oracle.com>
Message-ID: <E07C2AD0-3B85-4315-8C49-95D771E44553@oracle.com>

I just noticed this code in HotSpotResolvedJavaMethodImpl:

    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");

The only other direct usage of System.getProperty is:

hotspot/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
167:        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {

I think both of them should be using the same mechanism as introduced by this change.
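
A rough sketch of what routing both lookups through one prefixed helper could look like. Assumptions: System.getProperty stands in for sun.misc.VM.getSavedProperty (which is JDK-internal), and the class name, getStringProperty, and the parsing rule are hypothetical rather than the code in the webrev.

```java
public class JvmciProperties {
    static final String PREFIX = "jvmci.";

    // Stand-in for sun.misc.VM.getSavedProperty, used here so the sketch runs anywhere.
    static String getSaved(String key) {
        return System.getProperty(key);
    }

    /** Reads the boolean property "jvmci." + name, falling back to def when it is unset. */
    static boolean getBooleanProperty(String name, boolean def) {
        String value = getSaved(PREFIX + name);
        return value == null ? def : Boolean.parseBoolean(value);
    }

    /** Reads the string property "jvmci." + name, falling back to def when it is unset. */
    static String getStringProperty(String name, String def) {
        String value = getSaved(PREFIX + name);
        return value == null ? def : value;
    }
}
```

With helpers of this shape, the two call sites would become getStringProperty("traceMethodDataFilter", null) and getBooleanProperty("printconfig", false).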

> On Jan 4, 2016, at 12:47 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 4, 2016, at 12:31 PM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>>> 
>>> On 04 Jan 2016, at 18:41, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 4, 2016, at 7:19 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>> 
>>>>>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>>>>>> 
>>>>>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support. 
>>>>>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>>>>>> 
>>>>>> This change removes the JVMCI command line option support.
>>>>>> 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8146001
>>>>>> http://cr.openjdk.java.net/~dnsimon/8146001/
>>>>> 
>>>>> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
>>>>> 
>>>>> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>>>>> 
>>>>> We should either use the jvmci. prefix or not.
>>>> 
>>>> Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.
>>> 
>>> I think we should prefix the property name in getBooleanProperty:
>>> 
>>> +    public static boolean getBooleanProperty(String name, boolean def) {
>>> +        String value = VM.getSavedProperty("jvmci." + name);
>> 
>> Ok, sounds reasonable.
>> 
>>> 
>>> and I put UseProfilingInformation back:
>>> 
>>> diff -r 0fcfe4b07f7e src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Tue Dec 29 18:30:51 2015 +0100
>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 04 07:40:46 2016 -1000
>>> @@ -24,7 +24,6 @@ package jdk.vm.ci.hotspot;
>>> 
>>> import static jdk.vm.ci.hotspot.CompilerToVM.compilerToVM;
>>> import static jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime;
>>> -import static jdk.vm.ci.hotspot.HotSpotResolvedJavaMethod.Options.UseProfilingInformation;
>>> import static jdk.vm.ci.hotspot.HotSpotVMConfig.config;
>>> import static jdk.vm.ci.hotspot.UnsafeAccess.UNSAFE;
>>> 
>>> @@ -65,6 +64,11 @@ import jdk.vm.ci.meta.TriState;
>>> final class HotSpotResolvedJavaMethodImpl extends HotSpotMethod implements HotSpotResolvedJavaMethod, HotSpotProxified, MetaspaceWrapperObject {
>>> 
>>>     /**
>>> +     * Whether to use profiling information.
>>> +     */
>>> +    private static final boolean UseProfilingInformation = HotSpotJVMCIRuntime.getBooleanProperty("UseProfilingInformation", true);
>>> +
>>> +    /**
>>>      * Reference to metaspace Method object.
>>>      */
>>>     private final long metaspaceMethod;
>>> @@ -424,7 +428,7 @@ final class HotSpotResolvedJavaMethodImp
>>>     public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>>         ProfilingInfo info;
>>> 
>>> -        if (UseProfilingInformation.getValue() && methodData == null) {
>>> +        if (UseProfilingInformation && methodData == null) {
>>>             long metaspaceMethodData = UNSAFE.getAddress(metaspaceMethod + config().methodDataOffset);
>>>             if (metaspaceMethodData != 0) {
>>>                 methodData = new HotSpotMethodData(metaspaceMethodData, this);
>> 
>> JVMCI should unconditionally return available profiling information. It's up to the compiler whether or not to use it. For example, this is now compilation local in Graal:
>> 
>> http://hg.openjdk.java.net/graal/graal-compiler/rev/f35e653aa876#l16.16
> 
> Oh, I missed that.  Yes, that works for us as well.  Thanks for pointing that out.
> 
>> 
>> -Doug


From vitalyd at gmail.com  Wed Jan  6 17:56:29 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 12:56:29 -0500
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
Message-ID: <CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>

>
> Where we differ is this:  I am skeptical that there is a well-defined set
> of "straightforward cases", which all reasonable coders, who expect
> optimization, will use.


I'm not sure we differ here.  A "straightforward case" of bytecode like
Objects.checkIndex is a good start.

> It is better to point out one case for favorable treatment, and say "if you
> really expect best optimization, use this name".  Followed by, "if you
> don't choose to use that name, we'll still try to optimize all the
> straightforward cases, but don't expect us to prioritize them as highly as
> the best practice we suggested".


For new code or code being modified, using a known method to guarantee
optimization is great; unfortunately that doesn't work for existing code.
And of course the existing profile pollution problem makes using common entry
points a bit unpleasant if there's a risk that the profile doesn't match your
particular call.

More generally, I'd expect you guys would also prefer to keep # of
intrinsics down and rely on better canonicalization and pattern matching?
This has, as mentioned, the added side benefit that it will match existing
code shapes without requiring any changes.  Over time, provided people
report missed optimizations, hopefully the set of patterns that get matched
increases and the # of "clever" cases that fail to optimize goes down.

On Wed, Jan 6, 2016 at 12:42 PM, John Rose <john.r.rose at oracle.com> wrote:

> On Jan 6, 2016, at 4:20 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>
> I realize that there may always be a user-specified shape that the JIT
> doesn't understand, but straightforward cases should hopefully Just
> Work(tm) as those patterns can be picked up elsewhere in code and
> performance improves without changing a line of code.
>
>
> Where we differ is this:  I am skeptical that there is a well-defined set
> of "straightforward cases", which all reasonable coders, who expect
> optimization, will use.
>
> It is better to point out one case for favorable treatment, and say "if
> you really expect best optimization, use this name".  Followed by, "if you
> don't choose to use that name, we'll still try to optimize all the
> straightforward cases, but don't expect us to prioritize them as highly as
> the best practice we suggested".
>
> -- John
>

From vitalyd at gmail.com  Wed Jan  6 18:00:22 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 13:00:22 -0500
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <A79EB5B8-3FC7-4F7D-8B1C-3F081893F542@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
	<CAHjP37FRnj7_WdGgQQt4rM7o3-fbUTkWKVJSUESam25C+sLkrA@mail.gmail.com>
	<775F44DC-A0A1-42D5-BB2E-AE861A855125@oracle.com>
	<79E3DB9F-5425-4A93-A8C8-5223337D9346@oracle.com>
	<AC1A7790-A78D-483E-B3AF-000BD5BFF109@oracle.com>
	<CAHjP37FzNkwRQTatB-zbbhZX4qLx+_ngUBvv9mYvR1EHo_SMVA@mail.gmail.com>
	<A991B8D1-5F9A-42B0-BD41-783E74F4C7B4@oracle.com>
	<CAHjP37E1VfF7TYnNOnTraiZJZRhLXo9OVH=jmLfNm+mwkCQjEw@mail.gmail.com>
	<A79EB5B8-3FC7-4F7D-8B1C-3F081893F542@oracle.com>
Message-ID: <CAHjP37Eq6i3yLS2BtbWJYw4f5XBtRdd1gp2tRH8n7rTo_Bqasg@mail.gmail.com>

>
> The CPU's profiling has a much shorter time scale:  It collects
> information (many times) over the course of a single loop.  The JVM's
> profiling has a long time scale, usually the whole application execution.
> If a loop has bursty behavior (high short-span correlation) the CPU can
> predict branches very well, even though the JVM sees just noise.


That's a good point.  This almost implies that branches within loops
shouldn't even use JVM collected profiles -- just emit a branch -- since
software doesn't model the hardware as well (and even if it attempted, it
would be a moving target with many different targets).
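
A toy model of the time-scale difference, not a description of any real predictor. It compares the aggregate taken ratio (roughly what a whole-run profile counter would record) with the hit rate of a hypothetical predictor that just guesses the previous outcome. All names and the burst pattern are illustrative assumptions.

```java
public class BurstyBranch {
    /** Fraction of outcomes that are "taken", as a long-run profile counter would see it. */
    static double takenRatio(boolean[] outcomes) {
        int taken = 0;
        for (boolean o : outcomes) {
            if (o) taken++;
        }
        return (double) taken / outcomes.length;
    }

    /** Hit rate of a toy predictor that always guesses the previous outcome. */
    static double lastOutcomeHitRate(boolean[] outcomes) {
        boolean prediction = false;
        int hits = 0;
        for (boolean o : outcomes) {
            if (o == prediction) hits++;
            prediction = o;
        }
        return (double) hits / outcomes.length;
    }

    /** Alternating runs of not-taken and taken, each burstLength long. */
    static boolean[] bursts(int n, int burstLength) {
        boolean[] out = new boolean[n];
        for (int i = 0; i < n; i++) {
            out[i] = (i / burstLength) % 2 == 1;
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] outcomes = bursts(10_000, 1_000);
        System.out.println("aggregate taken ratio: " + takenRatio(outcomes));
        System.out.println("last-outcome hit rate: " + lastOutcomeHitRate(outcomes));
    }
}
```

The aggregate ratio looks like a coin flip, while the short-memory predictor misses only at the burst boundaries.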

On Wed, Jan 6, 2016 at 12:22 PM, John Rose <john.r.rose at oracle.com> wrote:

> On Jan 6, 2016, at 6:45 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>
> Basically, if the cost of branch misprediction is higher than waiting for
> both inputs to cmov to be available, then cmov is better.  For very
> predictable branches, cmov is a loss (as we've already established in this
> thread) and I think always will be (i.e. cpu vendors seem to be putting
> more and more smarts into branch prediction instead).
>
>> Yes, that was me not understanding the underlying branch profiling
>> mechanisms.
>
>
> Actually, that question of mine was more aimed at John who said we should
> do something special for loops with max/min accumulators :).
>
>
> Buried in the bug comments is the following insight:  Branch profiling by
> the JVM is different from branch profiling by the CPU, and the difference
> is significant for the specific use case of an accumulated max (or min).
>
> The CPU's profiling has a much shorter time scale:  It collects
> information (many times) over the course of a single loop.  The JVM's
> profiling has a long time scale, usually the whole application execution.
> If a loop has bursty behavior (high short-span correlation) the CPU can
> predict branches very well, even though the JVM sees just noise.
>
> (Fun fact:  The JVM could also profile auto-correlation and other
> statistics, but we have avoided doing this so far.)
>
> So, usually, the branch profiling done in software by the JVM (interpreter
> or profiled tier) gives enough information to predict what the CPU will
> experience.  In this very special case (a = max(a, x) for loop-varying x),
> almost all inputs "settle down" to an almost 100% branch profile, in favor
> of 'a'.  For random data, you expect to find your max half way through the
> loop.  That means that the second half of the loop can be speculated as "a
> = a" instead of "a = max(a, x)".
>
> This, in turn, can be detected in the JIT by pattern-matching locally on
> the max node, to see if it is of the form phi = max(phi, x).
>
> Fair enough?
>
> -- John
>

From doug.simon at oracle.com  Wed Jan  6 18:04:19 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 6 Jan 2016 19:04:19 +0100
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <E07C2AD0-3B85-4315-8C49-95D771E44553@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
	<A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
	<8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>
	<1297DA97-3C65-403D-AB46-16E203A74F26@oracle.com>
	<6C07E8DD-50D4-4B2E-BD8E-B131579A9664@oracle.com>
	<E07C2AD0-3B85-4315-8C49-95D771E44553@oracle.com>
Message-ID: <0BB3D050-7E42-4777-BB7B-E4D7DC2A6605@oracle.com>


> On 06 Jan 2016, at 18:54, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> I just noticed this code in HotSpotResolvedJavaMethodImpl:
> 
>     private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
> 
> The only other direct usage of System.getProperty is:
> 
> hotspot/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
> 167:        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
> 
> I think both of them should be using the same mechanism as introduced by this change.

I agree (assuming you mean the HotSpotJVMCIRuntime.getBooleanProperty mechanism).

There's also:

hotspot/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
70:    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");

But we will have to leave that as is given that HotSpotJVMCIRuntime is not visible from this code. We could also remove the (legacy) "jvmci.runtime.TimeInit" alias.

-Doug

> 
>> On Jan 4, 2016, at 12:47 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 4, 2016, at 12:31 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>>> 
>>>> On 04 Jan 2016, at 18:41, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 4, 2016, at 7:19 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>> 
>>>>>>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>>>>>>> 
>>>>>>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support. 
>>>>>>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>>>>>>> 
>>>>>>> This change removes the JVMCI command line option support.
>>>>>>> 
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8146001
>>>>>>> http://cr.openjdk.java.net/~dnsimon/8146001/
>>>>>> 
>>>>>> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
>>>>>> 
>>>>>> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>>>>>> 
>>>>>> We should either use the jvmci. prefix or not.
>>>>> 
>>>>> Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.
>>>> 
>>>> I think we should prefix the property name in getBooleanProperty:
>>>> 
>>>> +    public static boolean getBooleanProperty(String name, boolean def) {
>>>> +        String value = VM.getSavedProperty("jvmci." + name);
>>> 
>>> Ok, sounds reasonable.
>>> 
>>>> 
>>>> and I put UseProfilingInformation back:
>>>> 
>>>> diff -r 0fcfe4b07f7e src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Tue Dec 29 18:30:51 2015 +0100
>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 04 07:40:46 2016 -1000
>>>> @@ -24,7 +24,6 @@ package jdk.vm.ci.hotspot;
>>>> 
>>>> import static jdk.vm.ci.hotspot.CompilerToVM.compilerToVM;
>>>> import static jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime;
>>>> -import static jdk.vm.ci.hotspot.HotSpotResolvedJavaMethod.Options.UseProfilingInformation;
>>>> import static jdk.vm.ci.hotspot.HotSpotVMConfig.config;
>>>> import static jdk.vm.ci.hotspot.UnsafeAccess.UNSAFE;
>>>> 
>>>> @@ -65,6 +64,11 @@ import jdk.vm.ci.meta.TriState;
>>>> final class HotSpotResolvedJavaMethodImpl extends HotSpotMethod implements HotSpotResolvedJavaMethod, HotSpotProxified, MetaspaceWrapperObject {
>>>> 
>>>>     /**
>>>> +     * Whether to use profiling information.
>>>> +     */
>>>> +    private static final boolean UseProfilingInformation = HotSpotJVMCIRuntime.getBooleanProperty("UseProfilingInformation", true);
>>>> +
>>>> +    /**
>>>>      * Reference to metaspace Method object.
>>>>      */
>>>>     private final long metaspaceMethod;
>>>> @@ -424,7 +428,7 @@ final class HotSpotResolvedJavaMethodImp
>>>>     public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>>>         ProfilingInfo info;
>>>> 
>>>> -        if (UseProfilingInformation.getValue() && methodData == null) {
>>>> +        if (UseProfilingInformation && methodData == null) {
>>>>             long metaspaceMethodData = UNSAFE.getAddress(metaspaceMethod + config().methodDataOffset);
>>>>             if (metaspaceMethodData != 0) {
>>>>                 methodData = new HotSpotMethodData(metaspaceMethodData, this);
>>> 
>>> JVMCI should unconditionally return available profiling information. It's up to the compiler whether or not to use it. For example, this is now compilation local in Graal:
>>> 
>>> http://hg.openjdk.java.net/graal/graal-compiler/rev/f35e653aa876#l16.16
>> 
>> Oh, I missed that.  Yes, that works for us as well.  Thanks for pointing that out.
>> 
>>> 
>>> -Doug
> 


From sergey.kuksenko at oracle.com  Wed Jan  6 18:59:33 2016
From: sergey.kuksenko at oracle.com (Sergey Kuksenko)
Date: Wed, 6 Jan 2016 10:59:33 -0800
Subject: Conditional moves vs. branching in unrolled loops
In-Reply-To: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
References: <D16EE35D-520C-4CB2-B074-FE564E0571FD@oracle.com>
Message-ID: <568D6415.2070306@oracle.com>

Hi,
A move under a branch is always faster than cmov (because of cmov's
additional data dependencies) when the branch is predicted.
So the key point here is how the HW deals with unpredicted branches.
Here (on slides 40-41)
http://www.slideshare.net/SergeyKuksenko/quantum-performance-effects-44390719 

you can find some measurements for predicted/unpredicted cases on 
different HW.
On Intel x86 the cost of an unpredicted branch is quite low starting with 
the Sandy Bridge micro-architecture, but only when the loop is small 
enough to fit into the uop cache.
On AMD x86 the cost of an unpredicted branch is higher and cmov was the 
winner, but I didn't check it on modern AMD CPUs.


On 01/05/2016 03:51 AM, Paul Sandoz wrote:
> Hi,
>
> Recent investigation comparing for loops with streams exposed what appears to be an issue with Math.max and generated code in unrolled loops.
>
> Namely this:
>
>      @Benchmark
>      public int forTest_if() {
>          int[] a = ints;
>          int e = ints.length;
>          int m = Integer.MIN_VALUE;
>          for (int i = 0; i < e; i++)
>              if (a[i] >= m)
>                  m = a[i];
>          return m;
>      }
>
> is faster than this:
>
>      @Benchmark
>      public int forTest_MathMax() {
>          int[] a = ints;
>          int e = ints.length;
>          int m = Integer.MIN_VALUE;
>          for (int i = 0; i < e; i++)
>              m = Math.max(m, a[i]);
>          return m;
>      }
>
> Or this:
>
>    Arrays.stream(ints).reduce(Integer.MIN_VALUE, (a, b) -> a >= b ? a : b);
>
> is faster than this:
>
>    Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);
>
> at least on an x86 i5 processor.
>
> See the following links for more details:
>
>    https://bugs.openjdk.java.net/browse/JDK-8146071
>    https://bugs.openjdk.java.net/browse/JDK-8146071?focusedCommentId=13883495&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13883495
>
> For generated code in the for loop cases above see:
>
>    https://bugs.openjdk.java.net/secure/attachment/56221/mathMax.perfasm.txt
>
> I am not familiar enough with the x86 architecture to fully explain why, but I presume branch prediction is trumping the conditional moves, which suggests that on certain processors the generated code for the Math.max intrinsic (and others) in unrolled loops should not use conditional moves.
>
> Thanks,
> Paul.


From christian.thalinger at oracle.com  Wed Jan  6 19:19:35 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 6 Jan 2016 09:19:35 -1000
Subject: RFR (S): 8146246: JVMCICompiler::abort_on_pending_exception:
	assert(!thread->owns_locks()) failed: must release all locks
	when leaving VM
Message-ID: <ED6A057F-0842-42FD-8F7A-8116EA9DF5D2@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8146246

The problem is that https://bugs.openjdk.java.net/browse/JDK-8145435 introduced ttyLocker to synchronize the exception output but java_lang_Throwable::print_stack_trace can call out to Java to get the cause.
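
To illustrate on the Java side why printing a stack trace can run arbitrary Java code: getCause() is an ordinary overridable method, and Throwable.printStackTrace consults it. The sketch below is a hypothetical demonstration only (class names invented here). It is not the HotSpot code path, which reaches getCause through a VM-to-Java call.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.atomic.AtomicInteger;

public class CauseCallout {
    static final AtomicInteger calls = new AtomicInteger();

    /** Hypothetical exception whose getCause() is ordinary, overridable Java code. */
    static class Chatty extends RuntimeException {
        Chatty(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable getCause() {
            calls.incrementAndGet();   // arbitrary Java code runs while the trace is printed
            return super.getCause();
        }
    }

    /** Renders the stack trace into a string instead of writing to the console. */
    static String render(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        String text = render(new Chatty("boom"));
        System.out.println("getCause() calls while printing: " + calls.get());
        System.out.print(text);
    }
}
```

Any lock held by the printer across that call is therefore held while Java code executes, which is what the assert complains about.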

There are two solutions:

1) Remove ttyLocker and deal with some possible scrambling in the rare case of an exception:

diff -r df8d635f2296 -r e87e187552fb src/share/vm/jvmci/jvmciCompiler.cpp
--- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 11:24:01 2015 -0800
+++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Thu Dec 31 09:20:16 2015 -0800
@@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
     Handle exception(THREAD, PENDING_EXCEPTION);
     CLEAR_PENDING_EXCEPTION;
 
-    {
-      ttyLocker ttyl;
-      java_lang_Throwable::print_stack_trace(exception, tty);
-    }
+    java_lang_Throwable::print_stack_trace(exception, tty);
 
     // Something went wrong so disable compilation at this level
     method->set_not_compilable(CompLevel_full_optimization);
@@ -181,11 +178,8 @@ void JVMCICompiler::abort_on_pending_exc
   Thread* THREAD = Thread::current();
   CLEAR_PENDING_EXCEPTION;
 
-  {
-    ttyLocker ttyl;
-    tty->print_raw_cr(message);
-    java_lang_Throwable::print_stack_trace(exception, tty);
-  }
+  tty->print_raw_cr(message);
+  java_lang_Throwable::print_stack_trace(exception, tty);
 
   // Give other aborting threads to also print their stack traces.
   // This can be very useful when debugging class initialization
diff -r df8d635f2296 -r e87e187552fb src/share/vm/runtime/java.cpp
--- a/src/share/vm/runtime/java.cpp	Tue Dec 29 11:24:01 2015 -0800
+++ b/src/share/vm/runtime/java.cpp	Thu Dec 31 09:20:16 2015 -0800
@@ -432,7 +432,6 @@ void before_exit(JavaThread* thread) {
   if (HAS_PENDING_EXCEPTION) {
     Handle exception(THREAD, PENDING_EXCEPTION);
     CLEAR_PENDING_EXCEPTION;
-    ttyLocker ttyl;
     java_lang_Throwable::print_stack_trace(exception, tty);
   }
 #endif

or

2) Call out to Java and let the Java code do the printing:

diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.cpp
--- a/src/share/vm/classfile/javaClasses.cpp	Tue Dec 29 18:30:51 2015 +0100
+++ b/src/share/vm/classfile/javaClasses.cpp	Wed Jan 06 09:12:00 2016 -1000
@@ -1784,6 +1784,20 @@ void java_lang_Throwable::print_stack_tr
   }
 }
 
+/**
+ * Print the throwable stack trace by calling the Java method java.lang.Throwable.printStackTrace().
+ */
+void java_lang_Throwable::java_printStackTrace(Handle throwable, TRAPS) {
+  assert(throwable->is_a(SystemDictionary::Throwable_klass()), "Throwable instance expected");
+  JavaValue result(T_VOID);
+  JavaCalls::call_virtual(&result,
+                          throwable,
+                          KlassHandle(THREAD, SystemDictionary::Throwable_klass()),
+                          vmSymbols::printStackTrace_name(),
+                          vmSymbols::void_method_signature(),
+                          THREAD);
+}
+
 void java_lang_Throwable::fill_in_stack_trace(Handle throwable, const methodHandle& method, TRAPS) {
   if (!StackTraceInThrowable) return;
   ResourceMark rm(THREAD);
diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.hpp
--- a/src/share/vm/classfile/javaClasses.hpp	Tue Dec 29 18:30:51 2015 +0100
+++ b/src/share/vm/classfile/javaClasses.hpp	Wed Jan 06 09:12:00 2016 -1000
@@ -554,6 +554,7 @@ class java_lang_Throwable: AllStatic {
   // Printing
   static void print(Handle throwable, outputStream* st);
   static void print_stack_trace(Handle throwable, outputStream* st);
+  static void java_printStackTrace(Handle throwable, TRAPS);
   // Debugging
   friend class JavaClasses;
 };
diff -r 0fcfe4b07f7e src/share/vm/jvmci/jvmciCompiler.cpp
--- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 18:30:51 2015 +0100
+++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Wed Jan 06 09:12:00 2016 -1000
@@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
     Handle exception(THREAD, PENDING_EXCEPTION);
     CLEAR_PENDING_EXCEPTION;
 
-    {
-      ttyLocker ttyl;
-      java_lang_Throwable::print_stack_trace(exception, tty);
-    }
+    java_lang_Throwable::java_printStackTrace(exception, THREAD);
 
     // Something went wrong so disable compilation at this level
     method->set_not_compilable(CompLevel_full_optimization);
@@ -181,11 +178,7 @@ void JVMCICompiler::abort_on_pending_exc
   Thread* THREAD = Thread::current();
   CLEAR_PENDING_EXCEPTION;
 
-  {
-    ttyLocker ttyl;
-    tty->print_raw_cr(message);
-    java_lang_Throwable::print_stack_trace(exception, tty);
-  }
+  java_lang_Throwable::java_printStackTrace(exception, THREAD);
 
   // Give other aborting threads to also print their stack traces.
   // This can be very useful when debugging class initialization
diff -r 0fcfe4b07f7e src/share/vm/runtime/java.cpp
--- a/src/share/vm/runtime/java.cpp	Tue Dec 29 18:30:51 2015 +0100
+++ b/src/share/vm/runtime/java.cpp	Wed Jan 06 09:12:00 2016 -1000
@@ -433,7 +433,7 @@ void before_exit(JavaThread* thread) {
     Handle exception(THREAD, PENDING_EXCEPTION);
     CLEAR_PENDING_EXCEPTION;
     ttyLocker ttyl;
-    java_lang_Throwable::print_stack_trace(exception, tty);
+    java_lang_Throwable::java_printStackTrace(exception, THREAD);
   }
 #endif


From vladimir.kozlov at oracle.com  Wed Jan  6 19:34:59 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 6 Jan 2016 11:34:59 -0800
Subject: RFR (S): 8146246: JVMCICompiler::abort_on_pending_exception:
	assert(!thread->owns_locks()) failed: must release all locks when
	leaving VM
In-Reply-To: <ED6A057F-0842-42FD-8F7A-8116EA9DF5D2@oracle.com>
References: <ED6A057F-0842-42FD-8F7A-8116EA9DF5D2@oracle.com>
Message-ID: <568D6C63.5000403@oracle.com>

I would go with "Java code do the printing".
You left ttyLocker in case 2) in src/share/vm/runtime/java.cpp

Thanks,
Vladimir

On 1/6/16 11:19 AM, Christian Thalinger wrote:
> https://bugs.openjdk.java.net/browse/JDK-8146246
>
> The problem is that https://bugs.openjdk.java.net/browse/JDK-8145435 introduced ttyLocker to synchronize the exception output but java_lang_Throwable::print_stack_trace can call out to Java to get the cause.
>
> There are two solutions:
>
> 1) Remove ttyLocker and deal with some possible scrambling in the rare case of an exception:
>
> diff -r df8d635f2296 -r e87e187552fb src/share/vm/jvmci/jvmciCompiler.cpp
> --- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 11:24:01 2015 -0800
> +++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Thu Dec 31 09:20:16 2015 -0800
> @@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
>       Handle exception(THREAD, PENDING_EXCEPTION);
>       CLEAR_PENDING_EXCEPTION;
>
> -    {
> -      ttyLocker ttyl;
> -      java_lang_Throwable::print_stack_trace(exception, tty);
> -    }
> +    java_lang_Throwable::print_stack_trace(exception, tty);
>
>       // Something went wrong so disable compilation at this level
>       method->set_not_compilable(CompLevel_full_optimization);
> @@ -181,11 +178,8 @@ void JVMCICompiler::abort_on_pending_exc
>     Thread* THREAD = Thread::current();
>     CLEAR_PENDING_EXCEPTION;
>
> -  {
> -    ttyLocker ttyl;
> -    tty->print_raw_cr(message);
> -    java_lang_Throwable::print_stack_trace(exception, tty);
> -  }
> +  tty->print_raw_cr(message);
> +  java_lang_Throwable::print_stack_trace(exception, tty);
>
>     // Give other aborting threads to also print their stack traces.
>     // This can be very useful when debugging class initialization
> diff -r df8d635f2296 -r e87e187552fb src/share/vm/runtime/java.cpp
> --- a/src/share/vm/runtime/java.cpp	Tue Dec 29 11:24:01 2015 -0800
> +++ b/src/share/vm/runtime/java.cpp	Thu Dec 31 09:20:16 2015 -0800
> @@ -432,7 +432,6 @@ void before_exit(JavaThread* thread) {
>     if (HAS_PENDING_EXCEPTION) {
>       Handle exception(THREAD, PENDING_EXCEPTION);
>       CLEAR_PENDING_EXCEPTION;
> -    ttyLocker ttyl;
>       java_lang_Throwable::print_stack_trace(exception, tty);
>     }
>   #endif
>
> or
>
> 2) Call out to Java and let the Java code do the printing:
>
> diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.cpp
> --- a/src/share/vm/classfile/javaClasses.cpp	Tue Dec 29 18:30:51 2015 +0100
> +++ b/src/share/vm/classfile/javaClasses.cpp	Wed Jan 06 09:12:00 2016 -1000
> @@ -1784,6 +1784,20 @@ void java_lang_Throwable::print_stack_tr
>     }
>   }
>
> +/**
> + * Print the throwable stack trace by calling the Java method java.lang.Throwable.printStackTrace().
> + */
> +void java_lang_Throwable::java_printStackTrace(Handle throwable, TRAPS) {
> +  assert(throwable->is_a(SystemDictionary::Throwable_klass()), "Throwable instance expected");
> +  JavaValue result(T_VOID);
> +  JavaCalls::call_virtual(&result,
> +                          throwable,
> +                          KlassHandle(THREAD, SystemDictionary::Throwable_klass()),
> +                          vmSymbols::printStackTrace_name(),
> +                          vmSymbols::void_method_signature(),
> +                          THREAD);
> +}
> +
>   void java_lang_Throwable::fill_in_stack_trace(Handle throwable, const methodHandle& method, TRAPS) {
>     if (!StackTraceInThrowable) return;
>     ResourceMark rm(THREAD);
> diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.hpp
> --- a/src/share/vm/classfile/javaClasses.hpp	Tue Dec 29 18:30:51 2015 +0100
> +++ b/src/share/vm/classfile/javaClasses.hpp	Wed Jan 06 09:12:00 2016 -1000
> @@ -554,6 +554,7 @@ class java_lang_Throwable: AllStatic {
>     // Printing
>     static void print(Handle throwable, outputStream* st);
>     static void print_stack_trace(Handle throwable, outputStream* st);
> +  static void java_printStackTrace(Handle throwable, TRAPS);
>     // Debugging
>     friend class JavaClasses;
>   };
> diff -r 0fcfe4b07f7e src/share/vm/jvmci/jvmciCompiler.cpp
> --- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 18:30:51 2015 +0100
> +++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Wed Jan 06 09:12:00 2016 -1000
> @@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
>       Handle exception(THREAD, PENDING_EXCEPTION);
>       CLEAR_PENDING_EXCEPTION;
>
> -    {
> -      ttyLocker ttyl;
> -      java_lang_Throwable::print_stack_trace(exception, tty);
> -    }
> +    java_lang_Throwable::java_printStackTrace(exception, THREAD);
>
>       // Something went wrong so disable compilation at this level
>       method->set_not_compilable(CompLevel_full_optimization);
> @@ -181,11 +178,7 @@ void JVMCICompiler::abort_on_pending_exc
>     Thread* THREAD = Thread::current();
>     CLEAR_PENDING_EXCEPTION;
>
> -  {
> -    ttyLocker ttyl;
> -    tty->print_raw_cr(message);
> -    java_lang_Throwable::print_stack_trace(exception, tty);
> -  }
> +  java_lang_Throwable::java_printStackTrace(exception, THREAD);
>
>     // Give other aborting threads to also print their stack traces.
>     // This can be very useful when debugging class initialization
> diff -r 0fcfe4b07f7e src/share/vm/runtime/java.cpp
> --- a/src/share/vm/runtime/java.cpp	Tue Dec 29 18:30:51 2015 +0100
> +++ b/src/share/vm/runtime/java.cpp	Wed Jan 06 09:12:00 2016 -1000
> @@ -433,7 +433,7 @@ void before_exit(JavaThread* thread) {
>       Handle exception(THREAD, PENDING_EXCEPTION);
>       CLEAR_PENDING_EXCEPTION;
>       ttyLocker ttyl;
> -    java_lang_Throwable::print_stack_trace(exception, tty);
> +    java_lang_Throwable::java_printStackTrace(exception, THREAD);
>     }
>   #endif
>
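
The hazard behind this bug is a general one: a lock is held while the code calls out to something that may run arbitrary code. The following is a minimal generic sketch of narrowing the lock scope, and it is neither of the two proposed patches nor HotSpot code. The names output_lock and print_synchronized are hypothetical, and std::mutex only stands in for the role ttyLocker plays. The text is gathered first, so no call-out happens while the lock is held.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical output lock; stands in for the role ttyLocker plays.
static std::mutex output_lock;

// Hypothetical helper: 'describe' may run arbitrary code (in the VM case,
// a call out to Java), so it is invoked before the lock is taken.
// The lock is held only for the final write to the stream.
inline void print_synchronized(std::ostream& out,
                               const std::function<std::string()>& describe) {
  const std::string text = describe();  // no lock held during the call-out
  std::lock_guard<std::mutex> guard(output_lock);
  out << text << '\n';
}
```

Because the lock is not held during the call-out, the callback may itself print through the same helper without deadlocking on the non-recursive mutex.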

From john.r.rose at oracle.com  Wed Jan  6 19:50:22 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 6 Jan 2016 11:50:22 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
Message-ID: <3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>


> On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> better canonicalization

That's our first and most important tactic.  (Actually inlining is.)

But the various idioms for checkIndex do not canonicalize easily. In this case the correct trade-off is not to invest more time and research and code into stronger canonicalization. 

We do have canonicalization of if-expressions. It's just that in this case strengthening it to cover range checks reliably is harder than the reasonable alternative.

-- John

PS.  I am tempted to write out a list of 20 different ways to code a range check but will leave that as an exercise.


From vitalyd at gmail.com  Wed Jan  6 20:39:47 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 15:39:47 -0500
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
Message-ID: <CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>

I don't think there's a need to write out 20 different ways to do a range
check -- I think nobody would expect all 20 to be covered by the
optimizer.  Some of those variations may not map cleanly to
Objects.checkIndex either, nor is there any guarantee that people will
know about Objects.checkIndex, let alone update all their existing range
checks to use it -- some code will be left unoptimized no matter what.

But my point is the same as Andrew's, I think: instead of making checkIndex
an intrinsic, simply add a pattern match against that exact bytecode shape
(perhaps with basic canonicalization) and then still encourage people to
use Objects.checkIndex.  This is better than an intrinsic (modulo profile
pollution) since any other code that happens to use the same pattern will
match as well, and will not require an update to use checkIndex.  Then, if
someone comes to this list with an unoptimized example with a different
bytecode shape and has a convincing argument that the code shape is
"common", you guys can consider pattern matching that as well.

On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com> wrote:

>
> > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> >
> > better canonicalization
>
> That's our first and most important tactic.  (Actually inlining is.)
>
> But the various idioms for checkIndex do not canonicalize easily. In this
> case the correct trade-off is not to invest more time and research and code
> into stronger canonicalization.
>
> We do have canonicalization of if-expressions. It's just that in this case
> strengthening it to cover range checks reliably is harder than the
> reasonable alternative.
>
> -- John
>
> PS.  I am tempted to write out a list of 20 different ways to code a range
> check but will leave that as an exercise.
>
>

From vladimir.kozlov at oracle.com  Wed Jan  6 20:57:05 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 6 Jan 2016 12:57:05 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
Message-ID: <568D7FA1.4040707@oracle.com>

Note, we already have range check pattern matching code in C2 (thanks to 
Roland):

https://bugs.openjdk.java.net/browse/JDK-8137168

Vladimir

On 1/6/16 12:39 PM, Vitaly Davidovich wrote:
> I don't think there's a need to write out 20 different ways to do a
> range check -- I think nobody would expect all 20 to be covered by the
> optimizer.  Some of those variations may not map cleanly to
> Objects.checkIndex either, nor is there any guarantee that people will
> know about Objects.checkIndex, let alone update all their existing range
> checks to use it -- some code will be left unoptimized no matter what.
>
> But my point is the same as Andrew's, I think: instead of making
> checkIndex an intrinsic, simply add a pattern match against that exact
> bytecode shape (perhaps with basic canonicalization) and then still
> encourage people to use Objects.checkIndex.  This is better than an
> intrinsic (modulo profile pollution) since any other code that happens
> to use the same pattern will match as well, and will not require an
> update to use checkIndex.  Then, if someone comes to this list with an
> unoptimized example with a different bytecode shape and has a convincing
> argument that the code shape is "common", you guys can consider pattern
> matching that as well.
>
> On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com> wrote:
>
>
>      > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>      >
>      > better canonicalization
>
>     That's our first and most important tactic.  (Actually inlining is.)
>
>     But the various idioms for checkIndex do not canonicalize easily. In
>     this case the correct trade-off is not to invest more time and
>     research and code into stronger canonicalization.
>
>     We do have canonicalization of if-expressions. It's just that in
>     this case strengthening it to cover range checks reliably is harder
>     than the reasonable alternative.
>
>     -- John
>
>     PS.  I am tempted to write out a list of 20 different ways to code a
>     range check but will leave that as an exercise.
>
>

From vladimir.kozlov at oracle.com  Wed Jan  6 22:25:40 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 6 Jan 2016 14:25:40 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F389E78597@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<565E4A28.5010008@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E68A42@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com> <568BFA90.4020807@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E78597@ORSMSX116.amr.corp.intel.com>
Message-ID: <568D9464.2090008@oracle.com>

Hi Kishor,

Please send this as a separate RFR for 8135250.
The RFR should be sent to jdk9-dev at openjdk.java.net since these are JDK changes.
Please also CC paul.sandoz at oracle.com, who is assigned to the bug.

Thanks,
Vladimir

On 1/5/16 1:39 PM, Kharbas, Kishor wrote:
> Thank you guys for the detailed discussion and review.
>
> I have patched the JDK to perform bounds checking using Objects.checkFromIndexSize() in CounterMode.crypt(), AESCrypt.encryptBlock(), and AESCrypt.decryptBlock().
> Here is the link: http://cr.openjdk.java.net/~vdeshpande/8135250/webrev.00/
>
> Let me know if it looks correct.
>
> -Kishor
>
> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
> Sent: Tuesday, January 05, 2016 9:17 AM
> To: Andrew Haley; John Rose
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
>
>   > On 31 Dec 2015, at 22:33, John Rose <john.r.rose at oracle.com> wrote:
>   >
>   > When performing explicit range checks in pre-intrinsic code,
>   > let's try to use the new intrinsic functions in java.util.Objects,
>   > called checkIndex, checkFromToIndex, and checkFromIndexSize.
>
> Please don't forget that checks in pre-intrinsic code should match the checks generated by javac (bytecode) for intrinsified methods. Otherwise those checks will not be removed (by the dominating checks in pre-intrinsic code) when intrinsics are not supported on a platform. That is why we currently have such duplicated pre-intrinsic code.
>
> On the other hand, when intrinsics are supported they don't have checks, so if they are present we can intrinsify the pre-intrinsic code as you suggested.
>
> Thanks,
> Vladimir
>
> On 1/5/16 1:48 AM, Andrew Haley wrote:
>> On 04/01/16 20:12, John Rose wrote:
>>> Corrected, thanks.  They don't need to be intrinsics if they optimize well.
>>> The point is that the library functions have code shapes which work
>>> well with the JIT.  For example, the multi-index checks might (as in
>>> Kishor's code) be implemented on top of the single-index check,
>>> without themselves being intrinsics.
>>
>> We seem to be missing the opportunity to convert
>>
>>     i >= 0 && i < size
>>
>> into
>>
>>     (unsigned)i < (unsigned)size
>>
>> and this is, as far as I can see, the only real code-quality advantage
>> of the checkIndex intrinsic.  Could we not do this optimization and
>> then drop the C2 checkIndex intrinsic?
>>
>> Andrew.
>>
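
The conversion Andrew describes can be sketched in C++ as follows. This is an illustrative sketch under stated assumptions, not C2 code: the function names are hypothetical, 32-bit integers are assumed, and the two forms are only equivalent when size is non-negative (as an array length always is).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the two-comparison form, i >= 0 && i < size.
constexpr bool in_range_two_compares(int32_t i, int32_t size) {
  return i >= 0 && i < size;
}

// Hypothetical helper: the single unsigned comparison.
// A negative i converts to a value >= 2^31, which is larger than any
// non-negative size, so the lower-bound test folds into the upper-bound one.
constexpr bool in_range_one_compare(int32_t i, int32_t size) {
  return static_cast<uint32_t>(i) < static_cast<uint32_t>(size);
}

// Checks that both forms agree on a handful of boundary indices.
// Assumes size >= 0; for a negative size the two forms can disagree.
inline void check_forms_agree(int32_t size) {
  const int32_t indices[] = {INT32_MIN, -1, 0, 1, size - 1, size, INT32_MAX};
  for (int32_t i : indices) {
    assert(in_range_two_compares(i, size) == in_range_one_compare(i, size));
  }
}
```

The non-negative-size assumption matters: with size = -1 and i = 0 the unsigned form accepts the index while the two-comparison form rejects it.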

From christian.thalinger at oracle.com  Wed Jan  6 22:57:39 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 6 Jan 2016 12:57:39 -1000
Subject: RFR (S): 8146246: JVMCICompiler::abort_on_pending_exception:
	assert(!thread->owns_locks()) failed: must release all locks
	when leaving VM
In-Reply-To: <568D6C63.5000403@oracle.com>
References: <ED6A057F-0842-42FD-8F7A-8116EA9DF5D2@oracle.com>
	<568D6C63.5000403@oracle.com>
Message-ID: <0C62FED5-F3F8-44CE-B1DB-095F9170370B@oracle.com>


> On Jan 6, 2016, at 9:34 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> I would go with "Java code do the printing".

Yeah, it might be better.

> You left ttyLocker in case 2) in src/share/vm/runtime/java.cpp

Right.  Thanks for pointing that out.

> 
> Thanks,
> Vladimir
> 
> On 1/6/16 11:19 AM, Christian Thalinger wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8146246
>> 
>> The problem is that https://bugs.openjdk.java.net/browse/JDK-8145435 introduced ttyLocker to synchronize the exception output but java_lang_Throwable::print_stack_trace can call out to Java to get the cause.
>> 
>> There are two solutions:
>> 
>> 1) Remove ttyLocker and deal with some possible scrambling in the rare case of an exception:
>> 
>> diff -r df8d635f2296 -r e87e187552fb src/share/vm/jvmci/jvmciCompiler.cpp
>> --- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 11:24:01 2015 -0800
>> +++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Thu Dec 31 09:20:16 2015 -0800
>> @@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
>>      Handle exception(THREAD, PENDING_EXCEPTION);
>>      CLEAR_PENDING_EXCEPTION;
>> 
>> -    {
>> -      ttyLocker ttyl;
>> -      java_lang_Throwable::print_stack_trace(exception, tty);
>> -    }
>> +    java_lang_Throwable::print_stack_trace(exception, tty);
>> 
>>      // Something went wrong so disable compilation at this level
>>      method->set_not_compilable(CompLevel_full_optimization);
>> @@ -181,11 +178,8 @@ void JVMCICompiler::abort_on_pending_exc
>>    Thread* THREAD = Thread::current();
>>    CLEAR_PENDING_EXCEPTION;
>> 
>> -  {
>> -    ttyLocker ttyl;
>> -    tty->print_raw_cr(message);
>> -    java_lang_Throwable::print_stack_trace(exception, tty);
>> -  }
>> +  tty->print_raw_cr(message);
>> +  java_lang_Throwable::print_stack_trace(exception, tty);
>> 
>>    // Give other aborting threads to also print their stack traces.
>>    // This can be very useful when debugging class initialization
>> diff -r df8d635f2296 -r e87e187552fb src/share/vm/runtime/java.cpp
>> --- a/src/share/vm/runtime/java.cpp	Tue Dec 29 11:24:01 2015 -0800
>> +++ b/src/share/vm/runtime/java.cpp	Thu Dec 31 09:20:16 2015 -0800
>> @@ -432,7 +432,6 @@ void before_exit(JavaThread* thread) {
>>    if (HAS_PENDING_EXCEPTION) {
>>      Handle exception(THREAD, PENDING_EXCEPTION);
>>      CLEAR_PENDING_EXCEPTION;
>> -    ttyLocker ttyl;
>>      java_lang_Throwable::print_stack_trace(exception, tty);
>>    }
>>  #endif
>> 
>> or
>> 
>> 2) Call out to Java and let the Java code do the printing:
>> 
>> diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.cpp
>> --- a/src/share/vm/classfile/javaClasses.cpp	Tue Dec 29 18:30:51 2015 +0100
>> +++ b/src/share/vm/classfile/javaClasses.cpp	Wed Jan 06 09:12:00 2016 -1000
>> @@ -1784,6 +1784,20 @@ void java_lang_Throwable::print_stack_tr
>>    }
>>  }
>> 
>> +/**
>> + * Print the throwable stack trace by calling the Java method java.lang.Throwable.printStackTrace().
>> + */
>> +void java_lang_Throwable::java_printStackTrace(Handle throwable, TRAPS) {
>> +  assert(throwable->is_a(SystemDictionary::Throwable_klass()), "Throwable instance expected");
>> +  JavaValue result(T_VOID);
>> +  JavaCalls::call_virtual(&result,
>> +                          throwable,
>> +                          KlassHandle(THREAD, SystemDictionary::Throwable_klass()),
>> +                          vmSymbols::printStackTrace_name(),
>> +                          vmSymbols::void_method_signature(),
>> +                          THREAD);
>> +}
>> +
>>  void java_lang_Throwable::fill_in_stack_trace(Handle throwable, const methodHandle& method, TRAPS) {
>>    if (!StackTraceInThrowable) return;
>>    ResourceMark rm(THREAD);
>> diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.hpp
>> --- a/src/share/vm/classfile/javaClasses.hpp	Tue Dec 29 18:30:51 2015 +0100
>> +++ b/src/share/vm/classfile/javaClasses.hpp	Wed Jan 06 09:12:00 2016 -1000
>> @@ -554,6 +554,7 @@ class java_lang_Throwable: AllStatic {
>>    // Printing
>>    static void print(Handle throwable, outputStream* st);
>>    static void print_stack_trace(Handle throwable, outputStream* st);
>> +  static void java_printStackTrace(Handle throwable, TRAPS);
>>    // Debugging
>>    friend class JavaClasses;
>>  };
>> diff -r 0fcfe4b07f7e src/share/vm/jvmci/jvmciCompiler.cpp
>> --- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 18:30:51 2015 +0100
>> +++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Wed Jan 06 09:12:00 2016 -1000
>> @@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
>>      Handle exception(THREAD, PENDING_EXCEPTION);
>>      CLEAR_PENDING_EXCEPTION;
>> 
>> -    {
>> -      ttyLocker ttyl;
>> -      java_lang_Throwable::print_stack_trace(exception, tty);
>> -    }
>> +    java_lang_Throwable::java_printStackTrace(exception, THREAD);
>> 
>>      // Something went wrong so disable compilation at this level
>>      method->set_not_compilable(CompLevel_full_optimization);
>> @@ -181,11 +178,7 @@ void JVMCICompiler::abort_on_pending_exc
>>    Thread* THREAD = Thread::current();
>>    CLEAR_PENDING_EXCEPTION;
>> 
>> -  {
>> -    ttyLocker ttyl;
>> -    tty->print_raw_cr(message);
>> -    java_lang_Throwable::print_stack_trace(exception, tty);
>> -  }
>> +  java_lang_Throwable::java_printStackTrace(exception, THREAD);
>> 
>>    // Give other aborting threads to also print their stack traces.
>>    // This can be very useful when debugging class initialization
>> diff -r 0fcfe4b07f7e src/share/vm/runtime/java.cpp
>> --- a/src/share/vm/runtime/java.cpp	Tue Dec 29 18:30:51 2015 +0100
>> +++ b/src/share/vm/runtime/java.cpp	Wed Jan 06 09:12:00 2016 -1000
>> @@ -433,7 +433,7 @@ void before_exit(JavaThread* thread) {
>>      Handle exception(THREAD, PENDING_EXCEPTION);
>>      CLEAR_PENDING_EXCEPTION;
>>      ttyLocker ttyl;
>> -    java_lang_Throwable::print_stack_trace(exception, tty);
>> +    java_lang_Throwable::java_printStackTrace(exception, THREAD);
>>    }
>>  #endif
>> 


From sangheon.kim at oracle.com  Wed Jan  6 23:50:35 2016
From: sangheon.kim at oracle.com (sangheon)
Date: Wed, 6 Jan 2016 15:50:35 -0800
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on
	SPARC for non-G1 GC mode
In-Reply-To: <6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>
References: <568C6049.5020400@oracle.com>
	<6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>
Message-ID: <568DA84B.9050309@oracle.com>

Hi Igor,

Thank you for reviewing this.

On 01/05/2016 08:29 PM, Igor Veresov wrote:
> I'm not sure we care a lot about tiny bits of performance in this instance. But in case you wanted to keep the original code for the simm13 case, you could check the range of the constant and still emit the code that was there before. It also seems suboptimal to do set64 in MacroAssembler::tlab_refill() on all paths - the result of the original add in the delay slot doesn't seem to be used if we jump to discard_tlab, right?
You are right.
If the branch is taken, the result of the original add in the delay slot is not used.

The reason for always calling 'set64' was to keep the original behavior, i.e. the
same ordering, with the add performed in the delay slot of the branch. But as you
said, the resulting code is less tight.

>    So, maybe you could do something like:
>
> brx(Assembler::lessEqual, false, Assembler::pt, discard_tlab);
> if (is_simm13(ThreadLocalAllocBuffer::refill_waste_limit_increment())) {
>    delayed()->add(t2, ThreadLocalAllocBuffer::refill_waste_limit_increment(), t2);
> } else {
>    delayed()->nop();
>    set64(ThreadLocalAllocBuffer::refill_waste_limit_increment(), t3, G0);
>    add(t2, t3, t2);
> }
Okay, checking its value first seems like a good idea.

>
> Similarly, tighter code can be emitted for the interpreter in templateTable_sparc.cpp.
Okay, done.

Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.01

Thanks,
Sangheon


>
> igor
>
>
>> On Jan 5, 2016, at 4:31 PM, sangheon <sangheon.kim at oracle.com> wrote:
>>
>> Hi all,
>>
>> Could I have reviews for the below change to remove the size limitation (<4096) of TLABWasteIncrement on SPARC?
>>
>> The current implementation uses 'add(Register, int, Register)', which has a 13-bit limitation for the 'int' parameter.
>> I changed it to use 'set64' to load the value into a register and then call 'add'. 'set64' will take the cheap path as the range of TLABWasteIncrement is (0, max_juint).
>>
>> This assert is only fired in non-G1 mode, as G1 is the only GC that returns false from Universe::heap()->supports_inline_contig_alloc() with default options, and that return value decides whether this routine is taken.
>>
>> I didn't add a test as current TestOptionsWithRanges.java is enough to test this case with nightly option rotation.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8144573
>> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.00/
>> Testing: JPRT, manual test on SPARC[1]
>>
>> [1]: java -XX:TLABWasteIncrement=4096(and some larger values as well) -XX:+UseConcMarkSweepGC(UseParallelGC and UseSerialGC) -version
>>
>> Thanks,
>> Sangheon
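
The range check Igor suggests can be sketched on its own. This is a minimal sketch, assuming a 13-bit signed immediate covers -4096..4095 (consistent with the "<4096" limit mentioned in the original request). The names fits_simm13 and choose_add_encoding are hypothetical and are not the HotSpot is_simm13() implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an is_simm13()-style check: true if the value
// fits in a 13-bit signed immediate field, i.e. -4096 <= value <= 4095.
constexpr bool fits_simm13(int64_t value) {
  return value >= -4096 && value <= 4095;
}

// The two code shapes discussed in the thread: an add with an immediate
// operand, or loading the constant into a register first and then adding.
enum class AddEncoding { Immediate, ViaRegister };

// Picks the tighter immediate form whenever the increment allows it.
constexpr AddEncoding choose_add_encoding(int64_t increment) {
  return fits_simm13(increment) ? AddEncoding::Immediate
                                : AddEncoding::ViaRegister;
}
```

Under this assumption, an increment of 4095 still uses the immediate form, while 4096 (the value used in the manual test) needs the register path.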


From vladimir.kozlov at oracle.com  Wed Jan  6 23:58:34 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 6 Jan 2016 15:58:34 -0800
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568D17E4.90301@redhat.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
Message-ID: <568DAA2A.9070704@oracle.com>

Andrew is right.
GraphKit::inflate_string() should have an SCMemProjNode, as
compress_string() does, to prevent loads from moving up.
StrInflatedCopyNode is not a memory node.

Thanks,
Vladimir

On 1/6/16 5:34 AM, Andrew Haley wrote:
> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>
>> The problem here is that C2 reorders memory instructions and moves
>> an array load before an array store. The MemBarCPUOrder is now used
>> (compiler internally) to prevent this. We do the same for normal
>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>> code is emitted. See also the comment in memnode.hpp:
>>
>>   // Ordering within the same CPU.  Used to order unsafe memory references
>>   // inside the compiler when we lack alias info.  Not needed "outside" the
>>   // compiler because the CPU does all the ordering for us.
>>
>> "CPU does all the ordering for us" means that even with a relaxed
>> memory ordering, loads are never moved before dependent stores.
>>
>> Or did I misunderstand your question?
>
> No, I don't think so.  I was just checking: I am very aware that
> HotSpot has presented those of us with relaxed memory order machines
> with some interesting gotchas over the years, that's all.  I'm a bit
> surprised that C2 needs this barrier, given that there is a
> read-after-write dependency, but never mind.
>
> Thanks,
>
> Andrew.
>

From igor.veresov at oracle.com  Thu Jan  7 00:01:48 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 6 Jan 2016 16:01:48 -0800
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on
	SPARC for non-G1 GC mode
In-Reply-To: <568DA84B.9050309@oracle.com>
References: <568C6049.5020400@oracle.com>
	<6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>
	<568DA84B.9050309@oracle.com>
Message-ID: <AB0D601C-9965-463A-9846-0094610DD503@oracle.com>

That looks good to me.

igor

> On Jan 6, 2016, at 3:50 PM, sangheon <sangheon.kim at oracle.com> wrote:
> 
> Hi Igor,
> 
> Thank you for reviewing this.
> 
> On 01/05/2016 08:29 PM, Igor Veresov wrote:
>> I'm not sure we care a lot about tiny bits of performance in this instance. But in case you wanted to keep the original code for the simm13 case, you could check the range of the constant and still emit the code that was there before. It also seems suboptimal to do set64 in MacroAssembler::tlab_refill() on all paths - the result of the original add in the delay slot doesn't seem to be used if we jump to discard_tlab, right?
> You are right.
> If the branch is taken, the original add in the delay slot is not used.
> 
> The reason for always calling 'set64' was to keep the existing behavior, i.e. the same order of doing the work before the branch, within the delay slot. But as you said, it is less tight code.
> 
>>   So, maybe you could do something like:
>> 
>> brx(Assembler::lessEqual, false, Assembler::pt, discard_tlab);
>> if (is_simm13(ThreadLocalAllocBuffer::refill_waste_limit_increment())) {
>>   delayed()->add(t2, ThreadLocalAllocBuffer::refill_waste_limit_increment(), t2);
>> } else {
>>   delayed()->nop();
>>   set64(ThreadLocalAllocBuffer::refill_waste_limit_increment(), t3, G0);
>>   add(t2, t3, t2);
>> }
> Okay, checking its value first seems like a good idea.
> 
>> 
>> Similarly, tighter code can be emitted for the interpreter in templateTable_sparc.cpp.
> Okay, done.
> 
> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.01
> 
> Thanks,
> Sangheon
> 
> 
>> 
>> igor
>> 
>> 
>>> On Jan 5, 2016, at 4:31 PM, sangheon <sangheon.kim at oracle.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> Could I have reviews for the below change to remove the size limitation (<4096) of TLABWasteIncrement on SPARC?
>>> 
>>> The current implementation uses 'add(Register, int, Register)', which has a 13-bit limitation for the 'int' parameter.
>>> I changed it to use 'set64' to load the value into a register and then call 'add'. 'set64' will take the cheap path as the range of TLABWasteIncrement is (0, max_juint).
>>> 
>>> This assert only fires in non-G1 mode, as G1 is the only GC that returns false from Universe::heap()->supports_inline_contig_alloc() with default options, and that result decides whether this routine is reached.
>>> 
>>> I didn't add a test as current TestOptionsWithRanges.java is enough to test this case with nightly option rotation.
>>> 
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8144573
>>> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.00/
>>> Testing: JPRT, manual test on SPARC[1]
>>> 
>>> [1]: java -XX:TLABWasteIncrement=4096(and some larger values as well) -XX:+UseConcMarkSweepGC(UseParallelGC and UseSerialGC) -version
>>> 
>>> Thanks,
>>> Sangheon
> 
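
The range check discussed in this thread can be illustrated outside the
assembler. This is only a sketch in Java, not HotSpot code: the class
Simm13 and its methods are made-up names, and choosePath() merely names
the two instruction sequences from Igor's suggestion. It assumes that a
13-bit signed immediate covers [-4096, 4095], which matches the "<4096"
limit mentioned in the original request.

```java
// Illustrative sketch only; Simm13 and its methods are hypothetical names
// and are not part of HotSpot.
final class Simm13 {
    // A 13-bit signed immediate covers [-2^12, 2^12 - 1] = [-4096, 4095].
    static boolean isSimm13(long value) {
        return value >= -4096 && value <= 4095;
    }

    // Names the instruction sequence the suggested code would emit.
    static String choosePath(long increment) {
        return isSimm13(increment)
            ? "add with immediate in the delay slot"
            : "nop in the delay slot, then set64 + add";
    }

    public static void main(String[] args) {
        for (long inc : new long[] {4, 4095, 4096, Integer.MAX_VALUE}) {
            System.out.println(inc + ": " + choosePath(inc));
        }
    }
}
```

Small increments keep the original one-instruction form; only values
that do not fit the immediate field pay for the extra set64.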


From vitalyd at gmail.com  Thu Jan  7 00:11:39 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 6 Jan 2016 19:11:39 -0500
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <568D7FA1.4040707@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
	<568D7FA1.4040707@oracle.com>
Message-ID: <CAHjP37Fc7wAj24b5np5Yh9m6OKDmoh3-_TvrEtja=UMOsr73bg@mail.gmail.com>

Does checkIndex match on it? If so, is there a reason to proceed with
intrinsifying checkIndex?

On Wednesday, January 6, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> Note, we already have range check pattern matching code in C2 (thanks to
> Roland):
>
> https://bugs.openjdk.java.net/browse/JDK-8137168
>
> Vladimir
>
> On 1/6/16 12:39 PM, Vitaly Davidovich wrote:
>
>> I don't think there's a need to write out 20 different ways to do a
>> range check -- I think nobody would expect all 20 to be covered by the
>> optimizer.  Some of those variations may not map cleanly to
>> Object::checkIndex either, nor is there any guarantee that people will
>> update all their existing range checks (or even know about it) to use
>> Object::checkIndex -- some code will be left unoptimized no matter what.
>>
>> But my point is the same as Andrew's, I think; instead of making
>> checkIndex an intrinsic, simply add a pattern match against that exact
>> bytecode shape (perhaps with basic canonicalization) and then still
>> encourage people to use Object::checkIndex.  This is better than
>> intrinsic (modulo profile pollution) since any other code that happens
>> to use same pattern will match as well, and not require an update to use
>> checkIndex.  Then, if someone comes to this list with an unoptimized
>> example with a different bytecode shape and has a convincing argument
>> that the code shape is "common", you guys can consider pattern matching
>> that as well.
>>
>> On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com
>> <mailto:john.r.rose at oracle.com>> wrote:
>>
>>
>>      > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com
>>     <mailto:vitalyd at gmail.com>> wrote:
>>      >
>>      > better canonicalization
>>
>>     That's our first and most important tactic.  (Actually inlining is.)
>>
>>     But the various idioms for checkIndex do not canonicalize easily. In
>>     this case the correct trade-off is not to invest more time and
>>     research and code into stronger canonicalization.
>>
>>     We do have canonicalization of if-expressions. It's just that in
>>     this case strengthening it to cover range checks reliably is harder
>>     than the reasonable alternative.
>>
>>     - John
>>
>>     PS.  I am tempted to write out a list of 20 different ways to code a
>>     range check but will leave that as an exercise.
>>
>>
>>
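
To make the "different ways to code a range check" point concrete, here
is a small sketch of three source-level idioms that compute the same
answer but compile to different bytecode shapes. It is only an
illustration: RangeChecks and its method names are made up, it says
nothing about what C2 actually matches, and the unsigned variant assumes
length >= 0.

```java
// Illustrative sketch only; RangeChecks is a hypothetical class.
final class RangeChecks {
    // Idiom 1: two signed comparisons.
    static boolean twoCompares(int index, int length) {
        return index >= 0 && index < length;
    }

    // Idiom 2: the same test written as a negated "out of range" check.
    static boolean negated(int index, int length) {
        return !(index < 0 || index >= length);
    }

    // Idiom 3: one unsigned comparison. A negative index looks like a
    // huge unsigned value, so it fails the test. Assumes length >= 0.
    static boolean unsignedCompare(int index, int length) {
        return Integer.compareUnsigned(index, length) < 0;
    }

    public static void main(String[] args) {
        int length = 10;
        for (int index : new int[] {-1, 0, 9, 10, Integer.MIN_VALUE}) {
            System.out.println(index + ": " + twoCompares(index, length)
                + " " + negated(index, length)
                + " " + unsignedCompare(index, length));
        }
    }
}
```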

-- 
Sent from my phone

From sangheon.kim at oracle.com  Thu Jan  7 00:12:12 2016
From: sangheon.kim at oracle.com (sangheon)
Date: Wed, 6 Jan 2016 16:12:12 -0800
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on
	SPARC for non-G1 GC mode
In-Reply-To: <AB0D601C-9965-463A-9846-0094610DD503@oracle.com>
References: <568C6049.5020400@oracle.com>
	<6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>
	<568DA84B.9050309@oracle.com>
	<AB0D601C-9965-463A-9846-0094610DD503@oracle.com>
Message-ID: <568DAD5C.9080409@oracle.com>

Thanks for the review.

Sangheon


On 01/06/2016 04:01 PM, Igor Veresov wrote:
> That looks good to me.
>
> igor
>
>> On Jan 6, 2016, at 3:50 PM, sangheon <sangheon.kim at oracle.com> wrote:
>>
>> Hi Igor,
>>
>> Thank you for reviewing this.
>>
>> On 01/05/2016 08:29 PM, Igor Veresov wrote:
>>> I'm not sure we care a lot about tiny bits of performance in this instance. But in case you wanted to keep the original code for the simm13 case, you could check the range of the constant and still emit the code that was there before. It also seems suboptimal to do set64 in MacroAssembler::tlab_refill() on all paths - the result of the original add in the delay slot doesn't seem to be used if we jump to discard_tlab, right?
>> You are right.
>> If the branch is taken, the original add in the delay slot is not used.
>>
>> The reason for always calling 'set64' was to keep the existing behavior, i.e. the same order of doing the work before the branch, within the delay slot. But as you said, it is less tight code.
>>
>>>    So, maybe you could do something like:
>>>
>>> brx(Assembler::lessEqual, false, Assembler::pt, discard_tlab);
>>> if (is_simm13(ThreadLocalAllocBuffer::refill_waste_limit_increment())) {
>>>    delayed()->add(t2, ThreadLocalAllocBuffer::refill_waste_limit_increment(), t2);
>>> } else {
>>>    delayed()->nop();
>>>    set64(ThreadLocalAllocBuffer::refill_waste_limit_increment(), t3, G0);
>>>    add(t2, t3, t2);
>>> }
>> Okay, checking its value first seems like a good idea.
>>
>>> Similarly, tighter code can be emitted for the interpreter in templateTable_sparc.cpp.
>> Okay, done.
>>
>> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.01
>>
>> Thanks,
>> Sangheon
>>
>>
>>> igor
>>>
>>>
>>>> On Jan 5, 2016, at 4:31 PM, sangheon <sangheon.kim at oracle.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Could I have reviews for the below change to remove the size limitation (<4096) of TLABWasteIncrement on SPARC?
>>>>
>>>> The current implementation uses 'add(Register, int, Register)', which has a 13-bit limitation for the 'int' parameter.
>>>> I changed it to use 'set64' to load the value into a register and then call 'add'. 'set64' will take the cheap path as the range of TLABWasteIncrement is (0, max_juint).
>>>>
>>>> This assert only fires in non-G1 mode, as G1 is the only GC that returns false from Universe::heap()->supports_inline_contig_alloc() with default options, and that result decides whether this routine is reached.
>>>>
>>>> I didn't add a test as current TestOptionsWithRanges.java is enough to test this case with nightly option rotation.
>>>>
>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8144573
>>>> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.00/
>>>> Testing: JPRT, manual test on SPARC[1]
>>>>
>>>> [1]: java -XX:TLABWasteIncrement=4096(and some larger values as well) -XX:+UseConcMarkSweepGC(UseParallelGC and UseSerialGC) -version
>>>>
>>>> Thanks,
>>>> Sangheon


From vivek.r.deshpande at intel.com  Thu Jan  7 00:31:46 2016
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Thu, 7 Jan 2016 00:31:46 +0000
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <5684A5B8.7070407@oracle.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569E1902@ORSMSX106.amr.corp.intel.com>
	<5684A5B8.7070407@oracle.com>
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A569F23FC@ORSMSX106.amr.corp.intel.com>

Hi Vladimir,

Yes, the macroAssembler_x86_libm.cpp file is getting large. I could look into splitting it into two files, macroAssembler_libm_x86_64.cpp and macroAssembler_libm_x86_32.cpp. Please let me know if that sounds good to you.

The 64-bit code takes advantage of additional general purpose registers and 64-bit integer arithmetic, so we have two different versions for 32 bit and 64 bit.

Regarding the FPU usage in cos/sin, we talked with the LIBM algorithm experts and they came back with the following:
"It would not be easy to remove FPU x87 instructions from libm_sincos_huge and libm_reduced_pi04l, they are designed with using extended precision from FPU in mind.  The performance for 32bit implementation for these that do not use x87 instructions may not be optimal. These two are only used for very large input arguments."

Thank you.
Regards,
Vivek

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Wednesday, December 30, 2015 7:49 PM
To: Deshpande, Vivek R; Joseph D. Darcy
Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib

Hi Vivek,

Why is the 32-bit code so different from the 64-bit code? You only use it if SSE2 is available, so XMM registers are present. Why use the FPU if you have SSE?

32-bit:

  582   movsd(Address(rsp, 8), xmm0);
  583   fld_d(Address(rsp, 8));
  584   movsd(Address(rsp, 16), xmm6);
  585   fld_d(Address(rsp, 16));
  586   fmula(1);

64-bit:

  295   mulsd(xmm0, xmm2);

This applies to all LIBM 32-bit intrinsics.

The main concern is that the macroAssembler_x86_libm.cpp file is becoming too large, and it would be nice if 32-bit and 64-bit reused the same code.

Thanks,
Vladimir

On 12/24/15 6:10 PM, Deshpande, Vivek R wrote:
> HI Vladimir
>
> I have updated the libm sin cos intrinsics for x86 for hotspot.
> The updated webrev for the same is at this location for your review.
> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/hotspot/webrev.00/
> Could you please review it.
>
> Regards,
> Vivek
>
>
> -----Original Message-----
> From: Deshpande, Vivek R
> Sent: Tuesday, December 22, 2015 5:42 PM
> To: 'Joseph D. Darcy'; Vladimir Kozlov
> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the math 
> lib
>
> HI All
>
> I have uploaded the patch for sin and cos tests with input and allowed outputs at this location for your review.
> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
> Thank you.
>
> Regards,
> Vivek
>
> -----Original Message-----
> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
> Sent: Friday, December 04, 2015 4:50 PM
> To: Deshpande, Vivek R; Vladimir Kozlov
> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math 
> lib
>
> Hi Vivek,
>
> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>> Hi
>>
>> Sure, I will add the tests. Shall I use the StrictMath result as a reference for the exact result?
>> Let me know your thoughts.
>
> As a rough test of another sin/cos implementation, StrictMath.{sin, 
> cos} can be used as a reference with the following caveat: there isn't an 
> indication of which way the error is in a StrictMath result. Let me 
> give an example: if
>
>       StrictMath.sin(x) => y
>
> then one of the following should be true
>
>       Math.sin(x) => y
>       Math.sin(x) => Math.nextUp(y)
>       Math.sin(x) => Math.nextDown(y)
>
> That is, Math.sin(x) should either be the same as StrictMath.sin(x) OR 
> equal to one of the floating-point numbers adjacent to that result. Of 
> these three options, only two are allowed by the accuracy 
> requirements of the StrictMath.sin specification. However, since 
> StrictMath.sin doesn't give an indication of which way its error went 
> (if it rounded up or down), there is no indication without additional 
> work which of
> nextUp(y) and nextDown(y) is allowable (assuming  StrictMath.sin isn't buggy).
>
> HTH,
>
> -Joe
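
The acceptance rule Joe describes above can be sketched as a small
helper, assuming StrictMath is used as the reference. The class and
method names (UlpNeighbours, isNeighbourOrEqual) are made up for
illustration and are not taken from any JDK test. Note that the helper
accepts both neighbours, so it is weaker than the real accuracy
requirement, for exactly the reason given above: the reference does not
say which way it rounded.

```java
// Illustrative sketch only; UlpNeighbours is a hypothetical class.
final class UlpNeighbours {
    // True if candidate equals reference or one of the two doubles
    // adjacent to it. NaN never compares equal, so NaN inputs yield false.
    static boolean isNeighbourOrEqual(double candidate, double reference) {
        return candidate == reference
            || candidate == Math.nextUp(reference)
            || candidate == Math.nextDown(reference);
    }

    public static void main(String[] args) {
        for (double x : new double[] {0.5, 1.0, 3.0, 100.0}) {
            double reference = StrictMath.sin(x);
            double candidate = Math.sin(x);
            System.out.println("sin(" + x + "): "
                + isNeighbourOrEqual(candidate, reference));
        }
    }
}
```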
>
>
>>
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: joe darcy [mailto:joe.darcy at oracle.com]
>> Sent: Thursday, December 03, 2015 1:29 PM
>> To: Vladimir Kozlov; Deshpande, Vivek R
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math 
>> lib
>>
>> Hello,
>>
>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>> Vivek,
>>>
>>> I think Joe is asking you to write these tests as hotspot regression 
>>> test in hotspot/test/compiler.
>> Exactly; if not generally applicable sin/cos tests that could be hosted in the jdk repo (alongside the regression and unit tests for java.lang.Math), then test of intrinsics in the HotSpot repo alongside other tests targeting intrinsics.
>>
>> Thanks,
>>
>> -Joe
>>
>>> Vladimir
>>>
>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>> Hi Joe
>>>>
>>>> It would be great if you would please share the additional tests 
>>>> with us.
>>>>
>>>> Regards,
>>>> Vivek
>>>>
>>>> -----Original Message-----
>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>> math lib
>>>>
>>>> I think it is unwise for this large of an implementation change to 
>>>> be pushed with no tests targeting the specifics of the new implementation.
>>>>
>>>> The worst-case tests in the jdk repo are the mathematical worst 
>>>> cases for floating-point approximations, in other words the cases 
>>>> where the exact mathematical answer is closest to half-way between 
>>>> two representable floating-point numbers. Passing such tests is 
>>>> a necessary but not sufficient condition for a new implementation.
>>>>
>>>> Cheers,
>>>>
>>>> -Joe
>>>>
>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>> Okay, looks reasonable to me.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>> Hi Vladimir
>>>>>>
>>>>>> This is the link for the updated webrev with latest hotspot 
>>>>>> source as base for your review.
>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>> Thank you.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Deshpande, Vivek R
>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>> math lib
>>>>>>
>>>>>> Hi Vladimir
>>>>>>
>>>>>> This is the link for the updated webrev for your review.
>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>> Thank you.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>> math lib
>>>>>>
>>>>>> Please send link to new webrev on cr server.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> Please find the webrev with your suggested updates attached with 
>>>>>>> the mail.
>>>>>>> We will update it in the jbs entry soon.
>>>>>>> Please let me know if it needs further changes.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Deshpande, Vivek R
>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>>> math lib
>>>>>>>
>>>>>>> HI Vladimir, Joe
>>>>>>>
>>>>>>> I have done the jtreg tests in hotspot and tests from jdk you 
>>>>>>> have mentioned. It passed those tests.
>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions 
>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>> The performance gain is 3.2x over base jdk, that is, over the current 
>>>>>>> fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>
>>>>>>> Could I get those tests around the boundary values? Would the 
>>>>>>> WorstCaseTests.java jtreg test in jdk test those?
>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>
>>>>>>> I would work on adding either diagnostic flag or just one flag 
>>>>>>> for libm and send out the webrev soon.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>>> math lib
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Just getting added to the thread..
>>>>>>>
>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>
>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition to 
>>>>>>>> Hotspot tests.
>>>>>>>>
>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>> Hi Vladimir
>>>>>>>>>
>>>>>>>>> The result we obtain with LIBM are within +/- 1ulp from 
>>>>>>>>> StrictMath result and not exact result. So I added the flag to 
>>>>>>>>> switch between FDLIBM and LIBM.
>>>>>>>>>
>>>>>>>>> Quick explanation:
>>>>>>>>> This is what we observed in comparison with the HPA library
>>>>>>>>> (http://www.nongnu.org/hpalib/), explained with an example.
>>>>>>>>> LIBM observed Math result = 0.19457293629570213
>>>>>>>>> (4596178249117717083L) (StrictMath - 1ulp)
>>>>>>>>> Required result should be = 0.19457293629570216
>>>>>>>>> (4596178249117717084L) (StrictMath result) or
>>>>>>>>> 0.1945729362957022
>>>>>>>>> (4596178249117717085L) (StrictMath + 1ulp).
>>>>>>>>> This means the HPA library result is between the above two
>>>>>>>>> values, and the exact result would be pretty close to it.
>>>>>>>>> So here the StrictMath result is less than the quad-precision
>>>>>>>>> result; the Math result should be StrictMath or StrictMath + 1ulp
>>>>>>>>> and not StrictMath - 1ulp, according to our test.
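
The three long values in the example above are raw IEEE 754 bit patterns
of adjacent doubles, which is why they differ by exactly one. A small
sketch that shows this, using only the numbers quoted in the example
(the class name UlpBits and its method are made up for illustration):

```java
// Illustrative sketch only; UlpBits is a hypothetical class.
final class UlpBits {
    // Raw bits of the double immediately above the one encoded by 'bits'.
    static long bitsOfNextUp(long bits) {
        return Double.doubleToLongBits(Math.nextUp(Double.longBitsToDouble(bits)));
    }

    public static void main(String[] args) {
        long strict = 4596178249117717084L; // StrictMath result quoted above
        // Print StrictMath - 1ulp, StrictMath, StrictMath + 1ulp.
        for (long bits = strict - 1; bits <= strict + 1; bits++) {
            System.out.println(bits + "L -> " + Double.longBitsToDouble(bits));
        }
        // For positive finite doubles, one ulp up is one step in the bits.
        System.out.println(bitsOfNextUp(strict - 1) == strict);
    }
}
```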
>>>>>>>> Note, java.lang.Math allows the result to be 1 ulp off (in both 
>>>>>>>> directions, I think), and it should be consistent for the 
>>>>>>>> Interpreter and code generated by JIT compilers:
>>>>>>>>
>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>
>>>>>>> That interpretation of the spec is not quite right. For the Math 
>>>>>>> methods with a 1/2 ulp error bound, the floating-point result 
>>>>>>> closest to the exact result must be returned. For the methods 
>>>>>>> with a 1 ulp error bound, either of the floating-point results 
>>>>>>> bracketing the true result can be returned, subject to the 
>>>>>>> monotonicity constraints of the specification of the particular method.
>>>>>>>
>>>>>>>>> I have done the experiments with -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>> -XX:DisableIntrinsic=_dsin and -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>> -XX:DisableIntrinsic=_dcos. With this option, the interpreter 
>>>>>>>>> would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>> If we want to disable LIBM completely, we need the flags 
>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>> I was thinking about using existing
>>>>>>>> DirectiveSet::is_intrinsic_disabled() and 
>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add 
>>>>>>>> additional versions of functions which accept intrinsic ID 
>>>>>>>> instead of methodHandle.
>>>>>>>>
>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>
>>>>>>>>> Also the performance gain ~4x is with 
>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses 
>>>>>>>> LIBM code and the compilers use FDLIBM?
>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>
>>>>>>> I'm part way through porting the FDLIBM code to Java (JDK-8134780:
>>>>>>> Port fdlibm to Java), which is providing a significant speed 
>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>
>>>>>>> I find the current patch *insufficient* as-is in terms of its 
>>>>>>> testing.
>>>>>>> For example, part of patch says
>>>>>>>
>>>>>>> # For sin
>>>>>>>
>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>
>>>>>>> # For cos
>>>>>>>
>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>
>>>>>>> If nothing else, there are no tests at around those boundary 
>>>>>>> values, which is unacceptable. There should also be some tests 
>>>>>>> of values of interest to the algorithm in question.
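
One way to get inputs around the boundaries mentioned above is to take
each boundary together with its two adjacent doubles, in both signs. The
following is only a sketch of that idea, not the test that was added for
this change; the class name SinCosBoundaryInputs is made up, and the two
boundaries (2^-252 and 90112) are taken from the patch comment quoted
above.

```java
// Illustrative sketch only; SinCosBoundaryInputs is a hypothetical class.
final class SinCosBoundaryInputs {
    // Each boundary with its two adjacent doubles, positive then negative.
    static double[] inputs() {
        double low = Math.scalb(1.0, -252); // 2^-252
        double high = 90112.0;
        double[] positive = {
            Math.nextDown(low), low, Math.nextUp(low),
            Math.nextDown(high), high, Math.nextUp(high)
        };
        double[] all = new double[positive.length * 2];
        for (int i = 0; i < positive.length; i++) {
            all[2 * i] = positive[i];
            all[2 * i + 1] = -positive[i];
        }
        return all;
    }

    public static void main(String[] args) {
        // Print Math and StrictMath side by side for manual comparison.
        for (double x : inputs()) {
            System.out.println(x
                + " sin: " + Math.sin(x) + " / " + StrictMath.sin(x)
                + " cos: " + Math.cos(x) + " / " + StrictMath.cos(x));
        }
    }
}
```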
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> -Joe
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>>> Let me know your thoughts on this. I would answer more 
>>>>>>>>> questions and give more data if needed.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>
>>>>>>>>>> It would be interesting to see what happens if you disable 
>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>
>>>>>>>>>>         -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>> Hi Vivek,
>>>>>>>>>
>>>>>>>>> I want to point out that you can do this experiment later. We can 
>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>
>>>>>>>>> For now, please answer my question about flags only. This is 
>>>>>>>>> the only thing holding it from push.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>> Hi all
>>>>>>>>>>>
>>>>>>>>>>> I would like to contribute a patch which optimizes 
>>>>>>>>>>> Math.sin() and
>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel LIBM
>>>>>>>>>>>         implementation.
>>>>>>>>>>>
>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and cos.
>>>>>>>>>>>
>>>>>>>>>>> The options to use the optimizations are 
>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>
>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>
>>>>>>>>>>> Bug-id:
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>> webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>
>>>>>>>>>>> Vivek
>>>>>>>>>>>
>

From vladimir.kozlov at oracle.com  Thu Jan  7 00:35:43 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 6 Jan 2016 16:35:43 -0800
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568CF8F5.5090202@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
Message-ID: <568DB2DF.4010305@oracle.com>

Nope. Too many unrelated changes. If you want to go down this road, file a 
separate RFE to change the phase argument type of Identity() and Value().
And why use PhaseValues and not PhaseGVN as in Ideal()?

So I agree to do your change in IfNode::Identity(), but as a separate fix 
after the general change.

Thanks,
Vladimir

On 1/6/16 3:22 AM, Tobias Hartmann wrote:
> Hi,
>
> I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably forget to adapt such a fix if we ever get the ability to check for GVN/IGVN.
>
> Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
> Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).
>
> http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/
>
> What do you think?
>
> Thanks,
> Tobias
>
>
> On 18.09.2015 11:57, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8136469
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
>>
>> Problem:
>> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
>>
>> Solution:
>> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
>>
>> Testing:
>> - New test (TestPresizedStringBuilder)
>> - JPRT
>>
>> Thanks,
>> Tobias
>>
>> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
>> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
>>

From vladimir.kozlov at oracle.com  Thu Jan  7 00:54:07 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 6 Jan 2016 16:54:07 -0800
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A569F23FC@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569E1902@ORSMSX106.amr.corp.intel.com>
	<5684A5B8.7070407@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569F23FC@ORSMSX106.amr.corp.intel.com>
Message-ID: <568DB72F.6010408@oracle.com>

On 1/6/16 4:31 PM, Deshpande, Vivek R wrote:
> Hi Vladimir,
>
> Yes, the macroAssembler_x86_libm.cpp file is getting large. I could look into splitting it into two files, macroAssembler_libm_x86_64.cpp and macroAssembler_libm_x86_32.cpp. Please let me know if that sounds good to you.

Yes, if we keep separate code we should split the file (and adjust the make 
files).

>
> The 64 bit code takes advantage of additional general purpose registers and 64 bit integer arithmetic and so we have two different versions for 32 bit and 64 bit.

Okay, this is a valid argument. Even so, we may use push/pop on 32-bit to
preserve registers.

>
> Regarding the FPU usage in cos/sin, we talked with the LIBM algorithm experts and they came back with the following:
> "It would not be easy to remove FPU x87 instructions from libm_sincos_huge and libm_reduced_pi04l, they are designed with using extended precision from FPU in mind.  The performance for 32bit implementation for these that do not use x87 instructions may not be optimal. These two are only used for very large input arguments."

I don't buy this argument. Do they mean that the 64-bit code, which does not
use the FPU, produces less precise results for very large input arguments?
Very large input arguments are a very rare case, I think. Should we care
about their performance?
Note that 32-bit performance is becoming less and less important.

Okay, for now let's split the file. Later we can try to
simplify/combine/factor out the code.

Thanks,
Vladimir


>
> Thank you.
> Regards,
> Vivek
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, December 30, 2015 7:49 PM
> To: Deshpande, Vivek R; Joseph D. Darcy
> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib
>
> Hi Vivek,
>
> Why is the 32-bit code so different from the 64-bit code? You only use it if SSE2 is available, so XMM registers are present. Why use the FPU if you have SSE?
>
> 32-bit:
>
>    582   movsd(Address(rsp, 8), xmm0);
>    583   fld_d(Address(rsp, 8));
>    584   movsd(Address(rsp, 16), xmm6);
>    585   fld_d(Address(rsp, 16));
>    586   fmula(1);
>
> 64-bit:
>
>    295   mulsd(xmm0, xmm2);
>
> This concerns all LIBM 32-bit intrinsics.
>
> The main concern is that the macroAssembler_x86_libm.cpp file is becoming too large, and it would be nice if 32-bit and 64-bit reused the same code.
>
> Thanks,
> Vladimir
>
> On 12/24/15 6:10 PM, Deshpande, Vivek R wrote:
>> Hi Vladimir
>>
>> I have updated the libm sin/cos intrinsics for x86 in HotSpot.
>> The updated webrev is at this location for your review:
>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/hotspot/webrev.00/
>> Could you please review it?
>>
>> Regards,
>> Vivek
>>
>>
>> -----Original Message-----
>> From: Deshpande, Vivek R
>> Sent: Tuesday, December 22, 2015 5:42 PM
>> To: 'Joseph D. Darcy'; Vladimir Kozlov
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the math
>> lib
>>
>> Hi All
>>
>> I have uploaded the patch for the sin and cos tests, with inputs and allowed outputs, at this location for your review:
>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>> Thank you.
>>
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>> Sent: Friday, December 04, 2015 4:50 PM
>> To: Deshpande, Vivek R; Vladimir Kozlov
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>> lib
>>
>> Hi Vivek,
>>
>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>> Hi
>>>
>>> Sure, I will add the tests. Shall I use the StrictMath result as a reference for the exact result?
>>> Let me know your thoughts.
>>
>> As a rough test of another sin/cos implementation, StrictMath.{sin,
>> cos} can be used as a reference with the following caveat: there isn't an
>> indication of which way the error is in a StrictMath result. Let me
>> give an example: if
>>
>>        StrictMath.sin(x) => y
>>
>> then one of the following should be true
>>
>>        Math.sin(x) => y
>>        Math.sin(x) => Math.nextUp(y)
>>        Math.sin(x) => Math.nextDown(y)
>>
>> That is, Math.sin(x) should either be the same as StrictMath.sin(x) OR
>> equal to one of the floating-point numbers adjacent to that result. Of
>> these three options, only two are allowed by the accuracy
>> requirements of the StrictMath.sin specification. However, since
>> StrictMath.sin doesn't give an indication of which way its error went
>> (if it rounded up or down), there is no indication without additional
>> work which of nextUp(y) and nextDown(y) is allowable (assuming
>> StrictMath.sin isn't buggy).
>>
>> HTH,
>>
>> -Joe
>>
>>
>>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Thursday, December 03, 2015 1:29 PM
>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>>> lib
>>>
>>> Hello,
>>>
>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>> Vivek,
>>>>
>>>> I think Joe is asking you to write these tests as hotspot regression
>>>> test in hotspot/test/compiler.
>>> Exactly; if not generally applicable sin/cos tests that could be hosted in the jdk repo (alongside the regression and unit tests for java.lang.Math), then tests of the intrinsics in the HotSpot repo alongside other tests targeting intrinsics.
>>>
>>> Thanks,
>>>
>>> -Joe
>>>
>>>> Vladimir
>>>>
>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>> Hi Joe
>>>>>
>>>>> It would be great if you would please share the additional tests
>>>>> with us.
>>>>>
>>>>> Regards,
>>>>> Vivek
>>>>>
>>>>> -----Original Message-----
>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>> math lib
>>>>>
>>>>> I think it is unwise for this large of an implementation change to
>>>>> be pushed with no tests targeting the specifics of the new implementation.
>>>>>
>>>>> The worst-case tests in the jdk repo are the mathematical worst
>>>>> cases for floating-point approximations, in other words the cases
>>>>> where the exact mathematical answer is closest to half-way between
>>>>> two representable floating-point numbers. Passing such tests is a
>>>>> necessary but not sufficient condition for a new implementation.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -Joe
>>>>>
>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>> Okay, looks reasonable to me.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> This is the link for the updated webrev with latest hotspot
>>>>>>> source as base for your review.
>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Deshpande, Vivek R
>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>> math lib
>>>>>>>
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> This is the link for the updated webrev for your review.
>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>> math lib
>>>>>>>
>>>>>>> Please send link to new webrev on cr server.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> Please find the webrev with your suggested updates attached with
>>>>>>>> the mail.
>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> HI Vladimir, Joe
>>>>>>>>
>>>>>>>> I have run the jtreg tests in hotspot and the tests from the jdk that you
>>>>>>>> mentioned. It passed those tests.
>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions
>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>>> The performance gain is 3.2x over the base jdk, that is, over the current
>>>>>>>> fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>
>>>>>>>> Could I get those tests around the boundary values? Would the
>>>>>>>> WorstCaseTests.java jtreg test in the jdk test those?
>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>
>>>>>>>> I would work on adding either diagnostic flag or just one flag
>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Just getting added to the thread..
>>>>>>>>
>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>
>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition to
>>>>>>>>> Hotspot tests.
>>>>>>>>>
>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>> Hi Vladimir
>>>>>>>>>>
>>>>>>>>>> The results we obtain with LIBM are within +/- 1ulp of the
>>>>>>>>>> StrictMath result and not of the exact result. So I added the flag to
>>>>>>>>>> switch between FDLIBM and LIBM.
>>>>>>>>>>
>>>>>>>>>> Quick explanation:
>>>>>>>>>> This is what we observed in comparison with the HPA Library
>>>>>>>>>> (http://www.nongnu.org/hpalib/), explained with an example.
>>>>>>>>>> LIBM observed Math result = 0.19457293629570213
>>>>>>>>>> (4596178249117717083L) (StrictMath - 1ulp).
>>>>>>>>>> Required result should be = 0.19457293629570216
>>>>>>>>>> (4596178249117717084L) (StrictMath result) or 0.1945729362957022
>>>>>>>>>> (4596178249117717085L) (StrictMath + 1ulp).
>>>>>>>>>> This means the HPA library result is between the above two values
>>>>>>>>>> and the exact result would be pretty close to it.
>>>>>>>>>> So here the StrictMath result is less than the quad-precision result;
>>>>>>>>>> the Math result should be StrictMath or StrictMath + 1ulp and not
>>>>>>>>>> StrictMath - 1ulp, according to our test.
>>>>>>>>> Note, java.lang.Math allows results to be 1ulp off (in both
>>>>>>>>> directions, I think) and it should be consistent for the Interpreter
>>>>>>>>> and code generated by JIT compilers:
>>>>>>>>>
>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>
>>>>>>>> That interpretation of the spec is not quite right. For the Math
>>>>>>>> methods with a 1/2 ulp error bound, the floating-point result
>>>>>>>> closest to the exact result must be returned. For the methods
>>>>>>>> with a
>>>>>>>> 1 ulp error bound, either of the floating-point result
>>>>>>>> bracketing the true result can be returned, subject to the
>>>>>>>> monotonicity constraints of the specification of the particular method.
>>>>>>>>
>>>>>>>>>> I have done the experiments with -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>> -XX:DisableIntrinsic=_dsin and -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>> -XX:DisableIntrinsic=_dcos. With this option, the interpreter
>>>>>>>>>> would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>> If we want to disable LIBM completely, we need the flags
>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>> I was thinking about using existing
>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and
>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add
>>>>>>>>> additional versions of functions which accept intrinsic ID
>>>>>>>>> instead of methodHandle.
>>>>>>>>>
>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>
>>>>>>>>>> Also the performance gain ~4x is with
>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses
>>>>>>>>> LIBM code and the compilers use FDLIBM?
>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>
>>>>>>>> I'm part way through porting the FDLIBM code to Java (JDK-8134780:
>>>>>>>> Port fdlibm to Java), which is providing a significant speed
>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>
>>>>>>>> I find the current patch *insufficient* as-is in terms of its
>>>>>>>> testing.
>>>>>>>> For example, part of patch says
>>>>>>>>
>>>>>>>> # For sin
>>>>>>>>
>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>
>>>>>>>> # For cos
>>>>>>>>
>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>
>>>>>>>> If nothing else, there are no tests at around those boundary
>>>>>>>> values, which is unacceptable. There should also be some tests
>>>>>>>> of values of interest to the algorithm in question.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> -Joe
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>> Let me know your thoughts on this. I would answer more
>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vivek
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>>> the math lib
>>>>>>>>>>
>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>
>>>>>>>>>>> It would be interesting to see what happens if you disable
>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>
>>>>>>>>>>>          -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>> Hi Vivek,
>>>>>>>>>>
>>>>>>>>>> I want to point out that you can do this experiment later. We can
>>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>>
>>>>>>>>>> For now, please, answer my question about flags only. This is
>>>>>>>>>> the only thing holding it from push.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>> Hi all
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to contribute a patch which optimizes
>>>>>>>>>>>> Math.sin() and
>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel LIBM
>>>>>>>>>>>>          implementation.
>>>>>>>>>>>>
>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and cos.
>>>>>>>>>>>>
>>>>>>>>>>>> The options to use the optimizations are
>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>
>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>> webrev:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Vivek
>>>>>>>>>>>>
>>
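
A minimal sketch of the rough StrictMath-based check Joe describes in the quoted thread above: a Math.sin (or Math.cos) result is accepted as a candidate only if it equals the StrictMath result or one of its two floating-point neighbours. The class and method names are hypothetical and are not taken from the 8143353 webrev; the inputs are placed on and next to the main-path bounds 2^-252 and 90112 quoted from the patch comments. As noted in the thread, this is only a coarse filter, since it cannot tell which of the two neighbours is actually allowed.

```java
// Hypothetical helper for illustration; not part of the 8143353 webrev.
final class SinCosNeighborCheck {

    // True if 'actual' equals 'reference' or one of the two doubles adjacent to it.
    static boolean isEqualOrAdjacent(double actual, double reference) {
        return actual == reference
            || actual == Math.nextUp(reference)
            || actual == Math.nextDown(reference);
    }

    // Inputs on and next to the main-path bounds 2^-252 and 90112 quoted from the patch.
    static double[] boundaryInputs() {
        double lower = Math.scalb(1.0, -252);
        double upper = 90112.0;
        return new double[] {
            Math.nextDown(lower), lower, Math.nextUp(lower),
            Math.nextDown(upper), upper, Math.nextUp(upper)
        };
    }

    public static void main(String[] args) {
        for (double x : boundaryInputs()) {
            for (double v : new double[] { x, -x }) {
                boolean sinOk = isEqualOrAdjacent(Math.sin(v), StrictMath.sin(v));
                boolean cosOk = isEqualOrAdjacent(Math.cos(v), StrictMath.cos(v));
                System.out.println(v + " sin:" + (sinOk ? "ok" : "MISMATCH")
                                     + " cos:" + (cosOk ? "ok" : "MISMATCH"));
            }
        }
    }
}
```

The three bit patterns quoted in the thread (4596178249117717083L, ...084L, ...085L) are adjacent doubles, so the helper accepts either outer value against the middle one.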

From thomas.schatzl at oracle.com  Thu Jan  7 09:11:05 2016
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 07 Jan 2016 10:11:05 +0100
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on
	SPARC for non-G1 GC mode
In-Reply-To: <568DA84B.9050309@oracle.com>
References: <568C6049.5020400@oracle.com>
	<6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>
	<568DA84B.9050309@oracle.com>
Message-ID: <1452157865.2611.2.camel@oracle.com>

Hi Sangheon,

On Wed, 2016-01-06 at 15:50 -0800, sangheon wrote:
> Hi Igor,
> 
> Thank you for reviewing this.
> 
> On 01/05/2016 08:29 PM, Igor Veresov wrote:
> > I'm not sure we care a lot about tiny bits of performance in
> > this instance? But, in case you wanted to keep the original code
> > for the simm13 case you could check the range of the constant and
> > still emit the code that was there before. It also seems suboptimal
> > to do set64 in MacroAssembler::tlab_refill() on all paths - the
> > result of the original add in the delay slot doesn't seem to be
> > used if we jump to discard_tlab, right?
> You are right.
> If the branch is taken, the original add in the delay slot is not used.
> 
> The reason for always calling 'set64' was to keep its behavior, i.e. the
> same order of doing something before the branch within the delay slot.
> But as you said, it is less tight code.
> 
> >    So, maybe you could do something like:
> > 
> > brx(Assembler::lessEqual, false, Assembler::pt, discard_tlab);
> > if (is_simm13(ThreadLocalAllocBuffer::refill_waste_limit_increment())) {
> >    delayed()->add(t2, ThreadLocalAllocBuffer::refill_waste_limit_increment(), t2);
> > } else {
> >    delayed()->nop();
> >    set64(ThreadLocalAllocBuffer::refill_waste_limit_increment(), t3, G0);
> >    add(t2, t3, t2);
> > }
> Okay, checking its value first seems a good idea.
> 
> > 
> > Similarly, tighter code can be emitted for the interpreter in
> > templateTable_sparc.cpp.
> Okay, done.
> 
> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.01

  looks good.

Could you move the "// increment waste limit to prevent getting stuck
on this slow path" above the if-clause in both cases and remove the
other mentions of that to make the comments in both macroAssembler and
templateTable uniform?

I do not need another review for the comment change.

Thanks,
  Thomas
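
To illustrate the range check suggested in the quoted text: a SPARC simm13 operand is a 13-bit signed immediate, so it can hold values from -4096 to 4095. The sketch below models that predicate in Java. The real is_simm13 is a C++ helper in HotSpot's SPARC assembler; the Java class, method names, and the returned strings here are hypothetical. It shows why an increment of max_jint cannot be encoded directly in the add and would need the set64 path.

```java
// Hypothetical Java model of the simm13 range check; not HotSpot code.
final class Simm13Sketch {

    // A 13-bit signed immediate covers -2^12 .. 2^12 - 1.
    static boolean isSimm13(long value) {
        return value >= -4096 && value <= 4095;
    }

    // Names the instruction sequence the quoted suggestion would pick.
    static String chooseSequence(long increment) {
        return isSimm13(increment)
            ? "add in delay slot"
            : "nop in delay slot; set64; add";
    }

    public static void main(String[] args) {
        long[] samples = { 4, 4095, 4096, Integer.MAX_VALUE };
        for (long increment : samples) {
            System.out.println(increment + " -> " + chooseSequence(increment));
        }
    }
}
```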


From roland.westrelin at oracle.com  Thu Jan  7 09:29:07 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 7 Jan 2016 10:29:07 +0100
Subject: Request for Reviews (S): JDK-8003585 strength reduce or eliminate
	range checks for power-of-two sized arrays
In-Reply-To: <DB5DF00F-CD36-41D2-8E1D-0958E70E6D1D@oracle.com>
References: <CA+cQ+tR5c3y3uLpdO2Gop-RA0ZPJ4Mmq9yAAK5dOpFtgv-XPcg@mail.gmail.com>
	<440F2280-4B25-4AE6-A4F6-DDD4EB529636@oracle.com>
	<52FC129D.7040409@oracle.com>
	<CA+cQ+tTzA822citeYM0zCNE3FDyBk9hTO2h2i6POiJWSar6hVA@mail.gmail.com>
	<CA+cQ+tTWkkJHwcvJ4R=f31ve9vRwG5TfizHfo56sO9R3p3sRcQ@mail.gmail.com>
	<52FE6A08.20400@oracle.com>
	<CA+cQ+tQ00O1bT0JAAX9Lk+hPrfLV0PgQoiO1wmEUbKrPrqi1kA@mail.gmail.com>
	<52FE7313.3060404@oracle.com> <530209A8.1020501@oracle.com>
	<DC638183-2429-4FD2-B455-F87C0FD7EBFE@oracle.com>
	<CA+cQ+tQ2FoMhSj_B-Q=zgv3sGtJo9HojhWDsEWkx-UThc+mqEA@mail.gmail.com>
	<38EE6922-0B9C-49A6-B54D-E78BA0EFECB1@oracle.com>
	<B42C8B5D-5C7E-4FB9-A519-0B319F89018C@oracle.com>
	<CA+cQ+tQKvH9dyymvOEO6YFrCv2A1+hTo6bwMg0km8hx4G2jf4w@mail.gmail.com>
	<8232A81B-6B78-4F61-A8EC-1A3DF3938648@oracle.com>
	<DB5DF00F-CD36-41D2-8E1D-0958E70E6D1D@oracle.com>
Message-ID: <70FBA4CF-CF05-4232-AFEC-202E93BFA930@oracle.com>

Can I get a review for this?

Roland.

> On Oct 5, 2015, at 12:51 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
> 
> Here is a new webrev:
> 
> http://cr.openjdk.java.net/~roland/8003585/webrev.01/
> 
> Roland.
> 
>> On Oct 2, 2015, at 3:30 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>> 
>> Hi Chris,
>> 
>>> Thanks for picking it up! It mostly looks good to me. (Not a Reviewer)
>> 
>> Thanks for looking at this again.
>> 
>>> What I really needed with my earlier webrev was some instructions as to what test to write -- since the Java corelibs can come across this optimization a lot (e.g. HashMap), I didn't have a good idea of what kind of test really needs to be written.
>>> 
>>> A couple of issues with this webrev:
>>> 
>>> 1. In subnode.cpp, line 1346:
>>> 
>>> 1344     } else if (_test._test == BoolTest::lt &&
>>> 1345                cmp2->Opcode() == Op_AddI &&
>>> 1346                cmp2->in(2)->find_int_con(1)) {
>>> 1347       bound = cmp2->in(1);
>>> 1348     }
>>> 
>>> I think it should be
>>> cmp2->in(2)->find_int_con(0) == 1
>>> instead, because the value passed into this function is actually for a "fallback when no int constant is found". Passing the expected value (1) to it defeats the purpose.
>> 
>> You're right. Thanks for spotting that.
>> 
>>> jint find_int_con(jint value_if_unknown) const {
>>>   const TypeInt* t = find_int_type();
>>>   return (t != NULL && t->is_con()) ? t->get_con() : value_if_unknown;
>>> }
>>> 
>>> 2. Formatting nitpick: could you please trim the spaces before the new's on lines 1368, 1369 and 1387
>> 
>> Sure.
>> 
>> I'll send an updated webrev.
>> 
>> Roland.
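
The find_int_con pitfall discussed in the quoted exchange can be modelled in a few lines. This is a sketch under assumptions: a null Integer stands in for "no int constant found", and the class and method names are hypothetical. Using the fallback value 1 and treating the result as a truth value accepts both unknown inputs and any non-zero constant, whereas comparing find_int_con(0) against 1 accepts only the constant 1.

```java
// Hypothetical Java model of the find_int_con fallback pitfall; not HotSpot code.
final class FindIntConPitfall {

    // Models find_int_con: 'constant' is null when the node is not a known int constant.
    static int findIntCon(Integer constant, int valueIfUnknown) {
        return constant != null ? constant : valueIfUnknown;
    }

    // Shape of the original condition: the result is used as a truth value.
    static boolean buggyIsConstantOne(Integer constant) {
        return findIntCon(constant, 1) != 0;
    }

    // Shape of the suggested fix: fall back to 0 and compare against 1.
    static boolean fixedIsConstantOne(Integer constant) {
        return findIntCon(constant, 0) == 1;
    }

    public static void main(String[] args) {
        Integer[] inputs = { null, 0, 1, 5 };
        for (Integer c : inputs) {
            System.out.println(c + ": buggy=" + buggyIsConstantOne(c)
                                 + " fixed=" + fixedIsConstantOne(c));
        }
    }
}
```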
>> 
>>> 
>>> Thanks,
>>> Kris (OpenJDK username: krismo)
>>> 
>>> On Wed, Sep 30, 2015 at 1:34 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>> I'm picking that one up. Here is a new webrev:
>>> 
>>> http://cr.openjdk.java.net/~roland/8003585/webrev.00/
>>> 
>>> The only change to C2 compared to the previous webrev is that ((x & m) u< m+1) is optimized the same way ((x & m) u<= m) is. Actually, I don't think that C2 currently produces the ((x & m) u<= m) shape. The IfNode::fold_compares() logic produces the ((x & m) u< m+1) variant. I also added a test case to check the validity of the transformations and ran the usual testing on the change.
>>> 
>>> Roland.
> 
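
A source-level sketch of why the two compare shapes mentioned above are always true for a non-negative mask m: (x & m) can only keep bits that are set in m, so it lies in [0, m]. For an array whose length is a power of two, m = length - 1, which is why the range check on table[hash & (length - 1)] is redundant. The class and method names are hypothetical; the actual optimization operates on C2's IR, not on Java source.

```java
// Hypothetical illustration of the masked-index range check; not HotSpot code.
final class MaskedIndexSketch {

    // The ((x & m) u<= m) shape, using an unsigned comparison.
    static boolean unsignedLessEqual(int x, int m) {
        return Integer.compareUnsigned(x & m, m) <= 0;
    }

    // The ((x & m) u< m+1) shape, using an unsigned comparison.
    static boolean unsignedLessThanPlusOne(int x, int m) {
        return Integer.compareUnsigned(x & m, m + 1) < 0;
    }

    // Typical user: indexing a power-of-two sized table with a masked hash.
    static int get(int[] table, int hash) {
        return table[hash & (table.length - 1)];
    }

    public static void main(String[] args) {
        int[] table = { 10, 11, 12, 13, 14, 15, 16, 17 };
        System.out.println(get(table, -7));
        System.out.println(unsignedLessEqual(-7, table.length - 1));
        System.out.println(unsignedLessThanPlusOne(-7, table.length - 1));
    }
}
```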


From martin.doerr at sap.com  Thu Jan  7 13:45:19 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 7 Jan 2016 13:45:19 +0000
Subject: RFR(M): 8146612: C2: Precedence edges specification violated
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228AAB8@DEWDFEMB19C.global.corp.sap>

Hi,

Some time ago we found out that C2 doesn't treat precedence edges as specified.

The description of precedence edges in node.hpp says:
"They are unordered and not duplicated; they have no embedded NULLs."

Some functions in the current implementation violate this specification.

I have fixed this in the following webrev:
http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.00/

Please review. I will need a sponsor, please.

Best regards,
Martin
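
As a rough illustration of the quoted invariant ("unordered and not duplicated; they have no embedded NULLs"), here is a small array-backed model that maintains it. This is only a sketch under assumptions: the class and its methods are hypothetical and do not mirror the Node implementation in node.hpp or the webrev. Removal moves the last entry into the freed slot, which is permitted because the edges are unordered and which keeps the used part of the array free of nulls.

```java
// Hypothetical model of a precedence-edge list; not the HotSpot Node class.
final class PrecedenceEdgeList {
    private Object[] edges = new Object[2];
    private int count;

    // Adds an edge unless it is null or already present (compared by identity).
    boolean add(Object node) {
        if (node == null) {
            return false;
        }
        for (int i = 0; i < count; i++) {
            if (edges[i] == node) {
                return false;
            }
        }
        if (count == edges.length) {
            edges = java.util.Arrays.copyOf(edges, count * 2);
        }
        edges[count++] = node;
        return true;
    }

    // Removes an edge by moving the last entry into its slot (order is irrelevant).
    boolean remove(Object node) {
        for (int i = 0; i < count; i++) {
            if (edges[i] == node) {
                edges[i] = edges[--count];
                edges[count] = null;
                return true;
            }
        }
        return false;
    }

    int size() {
        return count;
    }

    // Checks the invariant: no nulls and no duplicates among the used entries.
    boolean invariantHolds() {
        for (int i = 0; i < count; i++) {
            if (edges[i] == null) {
                return false;
            }
            for (int j = i + 1; j < count; j++) {
                if (edges[i] == edges[j]) {
                    return false;
                }
            }
        }
        return true;
    }
}
```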

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160107/f4f77b25/attachment.html>

From martin.doerr at sap.com  Thu Jan  7 13:55:10 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 7 Jan 2016 13:55:10 +0000
Subject: RFR(M): 8146613: PPC64: C2 does no longer respect int to long
	conversion for stub calls
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228AAD8@DEWDFEMB19C.global.corp.sap>

Hi,

I have created a webrev which introduces int to long conversion in PPC64 functions which are called by C2 runtime calls.
I also added assertions to arraycopy stubs which are already called correctly.

Background:
8086069 removed too much code. Only the native wrapper performed the conversion after this change. However, it is required to convert ints to longs for all C calls and some runtime calls.
8144466 reintroduced the platform variable CCallingConventionRequiresIntsAsLongs and the conversion for the runtime calls for which C2 calls shared C functions on PPC64.
Some PPC64 runtime functions which rely on proper 64 bit arguments are still called without conversion.

Webrev is here:
http://cr.openjdk.java.net/~mdoerr/8146613_ppc64_int2long/webrev.00/

It only touches PPC64 files.

Please review and sponsor.

Best regards,
Martin
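
A small sketch of why the int to long conversion matters, assuming (for illustration only) that an int argument arrives in a 64-bit register whose upper 32 bits are not guaranteed to be a sign extension of the lower 32. A callee that uses the full 64-bit value then computes with garbage unless the value is sign-extended first. The Java names and the example register contents are hypothetical; this models the idea, not the PPC64 stubs themselves.

```java
// Hypothetical model of the int-to-long argument issue; not PPC64 stub code.
final class IntToLongArgSketch {

    // Sign-extends the low 32 bits, as an explicit int-to-long conversion would.
    static long signExtend(long register) {
        return (long) (int) register;
    }

    // A callee that relies on a proper 64-bit argument, e.g. scaling a count to bytes.
    static long scaleToBytes(long count) {
        return count * 8;
    }

    public static void main(String[] args) {
        long dirty = 0xDEADBEEF00000005L; // low half holds the int 5, upper half is garbage
        System.out.println(scaleToBytes(dirty));
        System.out.println(scaleToBytes(signExtend(dirty)));
    }
}
```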

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160107/81cc37ca/attachment.html>

From tobias.hartmann at oracle.com  Thu Jan  7 14:52:27 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 7 Jan 2016 15:52:27 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568DAA2A.9070704@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com>
Message-ID: <568E7BAB.5070908@oracle.com>

Hi Vladimir,

On 07.01.2016 00:58, Vladimir Kozlov wrote:
> Andrew is right.

Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.

I fixed this for the inflate and compress intrinsics.

> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads from moving up.
> StrInflatedCopyNode is not a memory node.

Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:

 // This class defines a projection of the memory  state of a store conditional node.
 // These nodes return a value, but also update memory.

But inflate does not return any value.

Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/

Related question:
In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
 char[int:>=0]:exact+any *

which is equal to the type of the char load.

I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
 char[int:1]:NotNull:exact *

Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?

Thanks,
Tobias


> Thanks,
> Vladimir
> 
> On 1/6/16 5:34 AM, Andrew Haley wrote:
>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>
>>> The problem here is that C2 reorders memory instructions and moves
>>> an array load before an array store. The MemBarCPUOrder is now used
>>> (compiler internally) to prevent this. We do the same for normal
>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>> code is emitted. See also the comment in memnode.hpp:
>>>
>>>   // Ordering within the same CPU.  Used to order unsafe memory references
>>>   // inside the compiler when we lack alias info.  Not needed "outside" the
>>>   // compiler because the CPU does all the ordering for us.
>>>
>>> "CPU does all the ordering for us" means that even with a relaxed
>>> memory ordering, loads are never moved before dependent stores.
>>>
>>> Or did I misunderstand your question?
>>
>> No, I don't think so.  I was just checking: I am very aware that
>> HotSpot has presented those of us with relaxed memory order machines
>> with some interesting gotchas over the years, that's all.  I'm a bit
>> surprised that C2 needs this barrier, given that there is a
>> read-after-write dependency, but never mind.
>>
>> Thanks,
>>
>> Andrew.
>>
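
For readers following the thread, the dependency under discussion can be pictured at the source level: inflating a byte[] into a char[] writes the destination, and a char load that follows must observe that write. The sketch below is a plain-Java stand-in with hypothetical names, not the intrinsic itself. If a compiler modelled the inflate as writing only byte[] memory, the later dst[index] load would look independent and could be hoisted above the copy, which is the reordering described above.

```java
// Hypothetical plain-Java stand-in for the inflate operation; not the C2 intrinsic.
final class InflateSketch {

    // Widens each source byte to a char, treating bytes as unsigned values.
    static void inflate(byte[] src, char[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (char) (src[i] & 0xff);
        }
    }

    // The load of dst[index] depends on the stores performed by inflate().
    static char inflateThenLoad(byte[] src, char[] dst, int index) {
        inflate(src, dst);
        return dst[index];
    }

    public static void main(String[] args) {
        byte[] src = { 'a', (byte) 0xE9 };
        char[] dst = new char[src.length];
        System.out.println((int) inflateThenLoad(src, dst, 1));
    }
}
```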

From sangheon.kim at oracle.com  Thu Jan  7 15:36:47 2016
From: sangheon.kim at oracle.com (sangheon)
Date: Thu, 7 Jan 2016 07:36:47 -0800
Subject: RFR(s): 8144573: TLABWasteIncrement=max_jint fires an assert on
	SPARC for non-G1 GC mode
In-Reply-To: <1452157865.2611.2.camel@oracle.com>
References: <568C6049.5020400@oracle.com>
	<6D69BB31-A1F4-44A8-8CED-CF166CB2EB46@oracle.com>
	<568DA84B.9050309@oracle.com> <1452157865.2611.2.camel@oracle.com>
Message-ID: <568E860F.7010409@oracle.com>

Hi Thomas,

Thanks for looking at this.

On 01/07/2016 01:11 AM, Thomas Schatzl wrote:
> Hi Sangheon,
>
> On Wed, 2016-01-06 at 15:50 -0800, sangheon wrote:
>> Hi Igor,
>>
>> Thank you for reviewing this.
>>
>> On 01/05/2016 08:29 PM, Igor Veresov wrote:
>>> I'm not sure we care a lot about tiny bits of performance in
>>> this instance? But, in case you wanted to keep the original code
>>> for the simm13 case you could check the range of the constant and
>>> still emit the code that was there before. It also seems suboptimal
>>> to do set64 in MacroAssembler::tlab_refill() on all paths - the
>>> result of the original add in the delay slot doesn't seem to be
>>> used if we jump to discard_tlab, right?
>> You are right.
>> If the branch is taken, the original add in the delay slot is not used.
>>
>> The reason for always calling 'set64' was to keep its behavior, i.e. the
>> same order of doing something before the branch within the delay slot.
>> But as you said, it is less tight code.
>>
>>>     So, maybe you could do something like:
>>>
>>> brx(Assembler::lessEqual, false, Assembler::pt, discard_tlab);
>>> if (is_simm13(ThreadLocalAllocBuffer::refill_waste_limit_increment())) {
>>>     delayed()->add(t2, ThreadLocalAllocBuffer::refill_waste_limit_increment(), t2);
>>> } else {
>>>     delayed()->nop();
>>>     set64(ThreadLocalAllocBuffer::refill_waste_limit_increment(), t3, G0);
>>>     add(t2, t3, t2);
>>> }
>> Okay, checking its value first seems a good idea.
>>
>>> Similarly, tighter code can be emitted for the interpreter in
>>> templateTable_sparc.cpp.
>> Okay, done.
>>
>> Webrev: http://cr.openjdk.java.net/~sangheki/8144573/webrev.01
>    looks good.
>
> Could you move the "// increment waste limit to prevent getting stuck
> on this slow path" above the if-clause in both cases and remove the
> other mentions of that to make the comments in both macroAssembler and
> templateTable uniform?
Okay, I will fix them before pushing.

Thanks,
Sangheon


>
> I do not need another review for the comment change.
>
> Thanks,
>    Thomas
>


From thomas.schatzl at oracle.com  Thu Jan  7 15:47:35 2016
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 07 Jan 2016 16:47:35 +0100
Subject: RFR(s): 8144949: TestOptionsWithRanges
	-XX:NUMAInterleaveGranularity=2147483648 crashes VM
In-Reply-To: <56709DD0.80808@oracle.com>
References: <56709DD0.80808@oracle.com>
Message-ID: <1452181655.2611.39.camel@oracle.com>

Hi,

On Tue, 2015-12-15 at 15:10 -0800, sangheon wrote:
> I think the constraint function can be removed with a maximum range of
> 2G/8192G. These are the maximum available memory on Windows, and smaller
> values can be used, but I wanted to avoid adding an artificial limit. With
> this limitation, the current constraint function for the overflow check is
> not needed.
> And we need to check for allocation failure.

  looks good.

Thomas

From tobias.hartmann at oracle.com  Thu Jan  7 18:29:08 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 7 Jan 2016 19:29:08 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568DB2DF.4010305@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com>
Message-ID: <568EAE74.6020507@oracle.com>

Hi Vladimir,

On 07.01.2016 01:35, Vladimir Kozlov wrote:
> Nope. Too many unrelated changes. If you want to go down this road, file a separate RFE to change the phase argument type of Identity() and Value().

Okay, I agree. I filed JDK-8146629 [1].

> And why use PhaseValues and not PhaseGVN as in Ideal()?

Right, we can use PhaseGVN.

> So I agree to do your change in IfNode::Identity(). But as separate fix after general change.

Here is the updated webrev based on JDK-8146629:
http://cr.openjdk.java.net/~thartmann/8136469/webrev.06/

Thanks,
Tobias

[1] https://bugs.openjdk.java.net/browse/JDK-8146629

 
> Thanks,
> Vladimir
> 
> On 1/6/16 3:22 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably miss to adapt such a fix if we ever get the ability to check for GVN/IGVN.
>>
>> Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
>> Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).
>>
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/
>>
>> What do you think?
>>
>> Thanks,
>> Tobias
>>
>>
>> On 18.09.2015 11:57, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8136469
>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
>>>
>>> Problem:
>>> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
>>>
>>> Solution:
>>> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
>>>
>>> Testing:
>>> - New test (TestPresizedStringBuilder)
>>> - JPRT
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
>>> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
>>>

From tobias.hartmann at oracle.com  Thu Jan  7 18:51:12 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 7 Jan 2016 19:51:12 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
Message-ID: <568EB3A0.3040909@oracle.com>

Hi,

please review the following patch.

https://bugs.openjdk.java.net/browse/JDK-8146629
http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/

Currently, there is no way to determine in Node::Identity() and Node::Value() whether we were called from GVN or IGVN, but sometimes we would like to do optimizations based on this information (for example, see the discussion in the RFR for JDK-8136469 [1]). I changed the argument type of Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. This way, we can simply call PhaseValues::is_IterGVN() from both methods.
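
For readers without the webrev at hand, the idea can be sketched outside HotSpot as follows. The class names mirror the ones mentioned above, but the bodies are simplified stand-ins and not the real declarations; DemoNode is a hypothetical node type invented for this illustration.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the C2 phase hierarchy (not the HotSpot declarations).
class PhaseIterGVN;

class PhaseTransform {
public:
  virtual ~PhaseTransform() {}
};

class PhaseValues : public PhaseTransform {
public:
  // Returns a non-null pointer only when the dynamic type is PhaseIterGVN.
  virtual PhaseIterGVN* is_IterGVN() { return nullptr; }
};

class PhaseGVN : public PhaseValues {};

class PhaseIterGVN : public PhaseGVN {
public:
  PhaseIterGVN* is_IterGVN() override { return this; }
};

// Hypothetical node: because the argument is a PhaseGVN* rather than a
// PhaseTransform*, Identity() can ask which kind of phase called it.
struct DemoNode {
  std::string Identity(PhaseGVN* phase) {
    assert(phase != nullptr);
    return phase->is_IterGVN() != nullptr ? "IGVN" : "GVN";
  }
};
```

With a PhaseTransform* parameter the call to is_IterGVN() would not compile in this sketch, because the method is first declared one level further down the hierarchy.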

Thanks,
Tobias

[1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html

From vladimir.kozlov at oracle.com  Thu Jan  7 19:08:54 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 11:08:54 -0800
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <568EB3A0.3040909@oracle.com>
References: <568EB3A0.3040909@oracle.com>
Message-ID: <568EB7C6.5030701@oracle.com>

Perfect.

Thanks,
Vladimir

On 1/7/16 10:51 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch.
>
> https://bugs.openjdk.java.net/browse/JDK-8146629
> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>
> Currently, there is no way to determine in Node::Identity() and Node::Value() if we were called from GVN or IGVN but sometimes we would like to do optimizations based on this information (for example, see discussion in RFR for JDK-8136469 [1]). I changed the arguments of Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. Like this, we can simply call PhaseValues::is_IterGVN() from both methods.
>
> Thanks,
> Tobias
>
> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>

From vladimir.kozlov at oracle.com  Thu Jan  7 19:24:54 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 11:24:54 -0800
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568EAE74.6020507@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com> <568EAE74.6020507@oracle.com>
Message-ID: <568EBB86.1060108@oracle.com>

On 1/7/16 10:29 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 07.01.2016 01:35, Vladimir Kozlov wrote:
>> Nope. Too many unrelated changes. If you want to go this road - file a separate RFE to change the phase argument type of Identity() and Value().
>
> Okay, I agree. I filed JDK-8146629 [1].
>
>> And why use PhaseValues and not PhaseGVN as in Ideal()?
>
> Right, we can use PhaseGVN.
>
>> So I agree to do your change in IfNode::Identity(). But as separate fix after general change.
>
> Here is the updated webrev based on JDK-8146629:
> http://cr.openjdk.java.net/~thartmann/8136469/webrev.06/

So for IGVN we wait until the dead branch is removed and only one IfProj node is left before we do this Identity optimization.
And for GVN (Parse phase) we don't wait, because during this phase we don't remove nodes.
The comment should say something about the GVN/Parse phase so that the !phase->is_IterGVN() condition can be understood.
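
A minimal sketch of the condition described above, assuming the decision only depends on the calling phase and on how many IfProj users are still attached. The function name and its parameters are hypothetical and are not taken from the webrev.

```cpp
#include <cassert>

// Hypothetical helper: decides whether an If with a constant condition may be
// folded by Identity().
//   is_iter_gvn      - true when called from IGVN, false when called from GVN (parsing)
//   live_projections - number of IfProj users still attached to the If
bool may_fold_constant_if(bool is_iter_gvn, int live_projections) {
  assert(live_projections >= 0 && live_projections <= 2);
  if (!is_iter_gvn) {
    // GVN during parsing does not remove nodes, so waiting would not help.
    return true;
  }
  // IGVN: wait until the dead branch is gone and a single IfProj remains.
  return live_projections == 1;
}
```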

Thanks,
Vladimir

>
> Thanks,
> Tobias
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8146629
>
>
>> Thanks,
>> Vladimir
>>
>> On 1/6/16 3:22 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably miss to adapt such a fix if we ever get the ability to check for GVN/IGVN.
>>>
>>> Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
>>> Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).
>>>
>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/
>>>
>>> What do you think?
>>>
>>> Thanks,
>>> Tobias
>>>
>>>
>>> On 18.09.2015 11:57, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please review the following patch.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8136469
>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
>>>>
>>>> Problem:
>>>> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
>>>>
>>>> Solution:
>>>> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
>>>>
>>>> Testing:
>>>> - New test (TestPresizedStringBuilder)
>>>> - JPRT
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
>>>> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
>>>>

From vladimir.kozlov at oracle.com  Thu Jan  7 20:49:32 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 12:49:32 -0800
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568E7BAB.5070908@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
Message-ID: <568ECF5C.6090407@oracle.com>

On 1/7/16 6:52 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>> Andrew is right.
>
> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.

Right. It was the root of this bug, see below.

>
> I fixed this for the inflate and compress intrinsics.
>
>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>> StrInflatedCopyNode is not memory node.
>
> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:

I did not get the question. Is it before your webrev.01 change? Or even with the change?

>
>   // This class defines a projection of the memory  state of a store conditional node.
>   // These nodes return a value, but also update memory.
>
> But inflate does not return any value.

Hmm, according to its bottom type, inflate produces memory:

StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }

So it really does not need an SCMemProjNode. Sorry about that.
So the load was a LoadUS, which is a char load, and originally the memory slice of inflate was incorrectly BYTES.
Instead of an SCMemProjNode, we should only have to change the idx to your dst_type:

set_memory(str, dst_type);

And you should roll back part of the changes in escape.cpp and macro.cpp.

>
> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/

In general, when the src & dst arrays have different types, we may need to use TypeOopPtr::BOTTOM to prevent related stores &
loads from bypassing these copy nodes.
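
To illustrate the slice argument, here is a toy model. It is not HotSpot's alias analysis, and the enum and function names are made up for this sketch. It assumes that a load only has to stay behind a memory-producing node when that node's output is registered on the load's own slice or on the bottom (all memory) slice.

```cpp
#include <cassert>

// Toy model of memory slices; the names are illustrative only.
enum class Slice { BYTES, CHARS, BOTTOM };

// Returns true when a load reading `load_slice` must stay after a preceding
// memory-producing node whose output was registered on `producer_slice`.
bool load_depends_on(Slice load_slice, Slice producer_slice) {
  assert(load_slice != Slice::BOTTOM);  // loads in this model read one concrete slice
  return producer_slice == Slice::BOTTOM || producer_slice == load_slice;
}
```

In this model, an inflate registered on BYTES followed by a char load gives load_depends_on(Slice::CHARS, Slice::BYTES) == false, so nothing stops the load from being scheduled before the copy. Registering the copy on CHARS or on BOTTOM makes the dependency visible.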

>
> Related question:
> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>   char[int:>=0]:exact+any *
>
> which is equal to the type of the char load.

Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for 
getChars() which returns char array?

Should we also be more careful in inflate_string_slow()? Is it used?

>
> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>   char[int:1]:NotNull:exact *
>
> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?

It is indeed strange. What is the memory type of the LoadUS? It could be a bug.

Thanks,
Vladimir

>
> Thanks,
> Tobias
>
>
>> Thanks,
>> Vladimir
>>
>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>
>>>> The problem here is that C2 reorders memory instructions and moves
>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>> (compiler internally) to prevent this. We do the same for normal
>>>> array copys in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>> code is emitted. See also the comment in memnode.hpp:
>>>>
>>>>    // Ordering within the same CPU.  Used to order unsafe memory references
>>>>    // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>    // compiler because the CPU does all the ordering for us.
>>>>
>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>> memory ordering, loads are never moved before dependent stores.
>>>>
>>>> Or did I misunderstand your question?
>>>
>>> No, I don't think so.  I was just checking: I am very aware that
>>> HotSpot has presented those of us with relaxed memory order machines
>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>> surprised that C2 needs this barrier, given that there is a
>>> read-after-write dependency, but never mind.
>>>
>>> Thanks,
>>>
>>> Andrew.
>>>

From vladimir.kozlov at oracle.com  Thu Jan  7 21:21:36 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 13:21:36 -0800
Subject: RFR(M): 8146613: PPC64: C2 does no longer respect int to long
	conversion for stub calls
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228AAD8@DEWDFEMB19C.global.corp.sap>
References: <7C9B87B351A4BA4AA9EC95BB418116567228AAD8@DEWDFEMB19C.global.corp.sap>
Message-ID: <568ED6E0.7030907@oracle.com>

Looks fine to me.

Thanks,
Vladimir

On 1/7/16 5:55 AM, Doerr, Martin wrote:
> Hi,
>
> I have created a webrev which introduces int to long conversion in PPC64 functions which are called by C2 runtime calls.
>
> I also added assertions to arraycopy stubs which are already called correctly.
>
> Background:
> 8086069 removed too much code. Only the native wrapper performed the conversion after this change. However, it is
> required to convert ints to longs for all C calls and some runtime calls.
> 8144466 reintroduced the platform variable CCallingConventionRequiresIntsAsLongs and the conversion for the runtime
> calls for which C2 calls shared C functions on PPC64.
> Some PPC64 runtime functions which rely on proper 64 bit arguments are still called without conversion.
>
> Webrev is here:
>
> http://cr.openjdk.java.net/~mdoerr/8146613_ppc64_int2long/webrev.00/
>
> It only touches PPC64 files.
>
> Please review and sponsor.
>
> Best regards,
>
> Martin
>

From kishor.kharbas at intel.com  Thu Jan  7 22:05:49 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Thu, 7 Jan 2016 22:05:49 +0000
Subject: RFR (M): 8146581: Minor corrections to the patch submitted for
	earlier bug id - 8143925
Message-ID: <F89640DCD01A85489FCBA68183A6A0F389E793CC@ORSMSX116.amr.corp.intel.com>

Hi Vladimir,

For the CounterMode.crypt() patch (https://bugs.openjdk.java.net/browse/JDK-8143925) which was committed earlier, I found that a minor correction is needed in the checks performed for AES support in vm_version_x86.cpp.
Basically, a condition check was missing in an else if() block, and some old code was left in by accident.
I also took this opportunity to add some more comments to make the stub code more readable/maintainable.

Bug - https://bugs.openjdk.java.net/browse/JDK-8146581
Patch - http://cr.openjdk.java.net/~vdeshpande/8146581/webrev.00/

Regards
Kishor Kharbas

From vladimir.kozlov at oracle.com  Thu Jan  7 22:07:34 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 14:07:34 -0800
Subject: RFR(M): 8146612: C2: Precedence edges specification violated
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228AAB8@DEWDFEMB19C.global.corp.sap>
References: <7C9B87B351A4BA4AA9EC95BB418116567228AAB8@DEWDFEMB19C.global.corp.sap>
Message-ID: <568EE1A6.3050202@oracle.com>

// Avoid spec violation: multiple prec edge.

I think it should be:

// Avoid spec violation: duplicated prec edge.

Should we add an assert to rm_prec()?:
      assert(j >= _cnt, "not a precedence edge");

Also, we may need to check that the input index is < _max in set_prec() and rm_prec().

The next access will be outside the _in array if j == _max-1 (in rm_prec()):

    _in[i] = NULL;                // NULL out last element

unless we guarantee that there is always a NULL at the end, which I don't see because set_prec() may set the last prec 
edge to non-NULL.

Please factor out similar code (search for last non-NULL prec edge) in del_req(), del_req_ordered() and rm_prec() into 
separate method.
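
A rough sketch of what such a shared helper and the suggested checks could look like, written against a toy node rather than the real Node class. ToyNode, prec_end() and the std::vector storage are assumptions made for this illustration; the real code works on the raw _in array with _cnt and _max.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a C2 node's input array; not the code in node.cpp.
// Slots [0, cnt) are required inputs. Slots [cnt, in.size()) hold precedence
// edges packed at the front, followed by nullptr padding (possibly none).
struct ToyNode {
  std::vector<const void*> in;
  std::size_t cnt;

  // Hypothetical shared helper: index one past the last non-null prec edge.
  std::size_t prec_end() const {
    std::size_t i = cnt;
    while (i < in.size() && in[i] != nullptr) ++i;  // bounds check before the read
    return i;
  }

  // Remove precedence edge j by moving the last prec edge into its slot.
  void rm_prec(std::size_t j) {
    assert(j >= cnt && "not a precedence edge");
    assert(j < in.size() && "index out of range");
    if (in[j] == nullptr) return;        // nothing stored in this slot
    std::size_t last = prec_end() - 1;   // in[j] is non-null, so prec_end() > j
    in[j] = in[last];
    in[last] = nullptr;                  // null out the vacated last slot
  }
};
```

Because prec_end() checks the bound before reading, the sketch never touches a slot past the end of the array, even when the last slot holds a non-null precedence edge.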

Thanks,
Vladimir


On 1/7/16 5:45 AM, Doerr, Martin wrote:
> Hi,
>
> some time ago, we found out, that C2 doesn't treat precedence edges as specified.
>
> The description of precedence edges in node.hpp says:
>
> "They are unordered and not duplicated; they have no embedded NULLs."
>
> Some functions in the current implementation violate this specification.
>
> I have fixed this in the following webrev:
>
> http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.00/
>
> Please review. I will need a sponsor, please.
>
> Best regards,
>
> Martin
>

From vladimir.kozlov at oracle.com  Thu Jan  7 22:10:08 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 14:10:08 -0800
Subject: RFR (M): 8146581: Minor corrections to the patch submitted for
	earlier bug id - 8143925
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F389E793CC@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F389E793CC@ORSMSX116.amr.corp.intel.com>
Message-ID: <568EE240.7000607@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/7/16 2:05 PM, Kharbas, Kishor wrote:
> Hi Vladimir,
>
> For the CounterMode.crypt() patch (https://bugs.openjdk.java.net/browse/JDK-8143925) which was committed earlier, I
> found a minor correction in the checks performed for AES support in vm_version_x86.cpp.
>
> Basically, a condition check was missing in a else if() block, and some old code was left by accident.
>
> I also took this opportunity to add some more comments to make the stub code more readable/maintainable.
>
> Bug - https://bugs.openjdk.java.net/browse/JDK-8146581
>
> Patch - http://cr.openjdk.java.net/~vdeshpande/8146581/webrev.00/
>
> Regards
>
> Kishor Kharbas
>

From kishor.kharbas at intel.com  Thu Jan  7 22:12:52 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Thu, 7 Jan 2016 22:12:52 +0000
Subject: RFR (M): 8146581: Minor corrections to the patch submitted for
	earlier bug id - 8143925
In-Reply-To: <568EE240.7000607@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389E793CC@ORSMSX116.amr.corp.intel.com>
	<568EE240.7000607@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F389E793EB@ORSMSX116.amr.corp.intel.com>

Wow! That was quick.. thanks :)

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Thursday, January 07, 2016 2:10 PM
To: Kharbas, Kishor; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR (M): 8146581: Minor corrections to the patch submitted for earlier bug id - 8143925

Looks good.

Thanks,
Vladimir

On 1/7/16 2:05 PM, Kharbas, Kishor wrote:
> Hi Vladimir,
>
> For the CounterMode.crypt() patch 
> (https://bugs.openjdk.java.net/browse/JDK-8143925) which was committed earlier, I found a minor correction in the checks performed for AES support in vm_version_x86.cpp.
>
> Basically, a condition check was missing in a else if() block, and some old code was left by accident.
>
> I also took this opportunity to add some more comments to make the stub code more readable/maintainable.
>
> Bug - https://bugs.openjdk.java.net/browse/JDK-8146581
>
> Patch - http://cr.openjdk.java.net/~vdeshpande/8146581/webrev.00/
>
> Regards
>
> Kishor Kharbas
>

From rednaxelafx at gmail.com  Thu Jan  7 22:40:48 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Thu, 7 Jan 2016 14:40:48 -0800
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <568EB7C6.5030701@oracle.com>
References: <568EB3A0.3040909@oracle.com>
	<568EB7C6.5030701@oracle.com>
Message-ID: <CA+cQ+tRiJt5vRBKvCYhz7D5SNd20XYi7=gEz6e54jVky6OB36Q@mail.gmail.com>

That's interesting. Out of curiosity, would adding a "bool can_reshape"
argument to Identity() and Value() do the job, just like the way Ideal()
does it?
If so, what was the trade off that led to this change as opposed to adding
an argument?

Thanks,
Kris

On Thursday, January 7, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> Perfect.
>
> Thanks,
> Vladimir
>
> On 1/7/16 10:51 AM, Tobias Hartmann wrote:
>
>> Hi,
>>
>> please review the following patch.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8146629
>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>>
>> Currently, there is no way to determine in Node::Identity() and
>> Node::Value() if we were called from GVN or IGVN but sometimes we would
>> like to do optimizations based on this information (for example, see
>> discussion in RFR for JDK-8136469 [1]). I changed the arguments of
>> Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. Like
>> this, we can simply call PhaseValues::is_IterGVN() from both methods.
>>
>> Thanks,
>> Tobias
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>>
>>

From vladimir.kozlov at oracle.com  Thu Jan  7 22:51:44 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 7 Jan 2016 14:51:44 -0800
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <CA+cQ+tRiJt5vRBKvCYhz7D5SNd20XYi7=gEz6e54jVky6OB36Q@mail.gmail.com>
References: <568EB3A0.3040909@oracle.com> <568EB7C6.5030701@oracle.com>
	<CA+cQ+tRiJt5vRBKvCYhz7D5SNd20XYi7=gEz6e54jVky6OB36Q@mail.gmail.com>
Message-ID: <568EEC00.2070909@oracle.com>

PhaseValues has additional data which we may access.
As Tobias said, we may consider using an additional field in PhaseGVN to check for IGVN instead of using the virtual method 
is_IterGVN() and the can_reshape parameter. It would be the next step.

Thanks,
Vladimir

On 1/7/16 2:40 PM, Krystal Mok wrote:
> That's interesting. Out of curiosity, would adding a "bool can_reshape" argument to Identity() and Value() do the job,
> just like the way Ideal() does it?
> If so, what was the trade off that led to this change as opposed to adding an argument?
>
> Thanks,
> Kris
>
> On Thursday, January 7, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>
>     Perfect.
>
>     Thanks,
>     Vladimir
>
>     On 1/7/16 10:51 AM, Tobias Hartmann wrote:
>
>         Hi,
>
>         please review the following patch.
>
>         https://bugs.openjdk.java.net/browse/JDK-8146629
>         http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>
>         Currently, there is no way to determine in Node::Identity() and Node::Value() if we were called from GVN or IGVN
>         but sometimes we would like to do optimizations based on this information (for example, see discussion in RFR
>         for JDK-8136469 [1]). I changed the arguments of Node::Identity() and Node::Value() from PhaseTransform* to
>         PhaseGVN*. Like this, we can simply call PhaseValues::is_IterGVN() from both methods.
>
>         Thanks,
>         Tobias
>
>         [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>

From rednaxelafx at gmail.com  Thu Jan  7 23:20:09 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Thu, 7 Jan 2016 15:20:09 -0800
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <568EEC00.2070909@oracle.com>
References: <568EB3A0.3040909@oracle.com> <568EB7C6.5030701@oracle.com>
	<CA+cQ+tRiJt5vRBKvCYhz7D5SNd20XYi7=gEz6e54jVky6OB36Q@mail.gmail.com>
	<568EEC00.2070909@oracle.com>
Message-ID: <CA+cQ+tQZARJkWOQOF4WfN+yfP+Lsb_mSqtaJi2rQLm-amczc8w@mail.gmail.com>

I see. Thanks a lot for the explanation, Vladimir!

Best regards,
Kris

On Thu, Jan 7, 2016 at 2:51 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> PhaseValues has additional data which we may access.
> As Tobias said we may consider using additional field in PhaseGVN to check
> for IGVN instead of using virtual method is_IterGVN() and can_reshape
> parameter. It would be next step.
>
> Thanks,
> Vladimir
>
> On 1/7/16 2:40 PM, Krystal Mok wrote:
>
>> That's interesting. Out of curiosity, would adding a "bool can_reshape"
>> argument to Identity() and Value() do the job,
>> just like the way Ideal() does it?
>> If so, what was the trade off that led to this change as opposed to
>> adding an argument?
>>
>> Thanks,
>> Kris
>>
>> On Thursday, January 7, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>> wrote:
>>
>>     Perfect.
>>
>>     Thanks,
>>     Vladimir
>>
>>     On 1/7/16 10:51 AM, Tobias Hartmann wrote:
>>
>>         Hi,
>>
>>         please review the following patch.
>>
>>         https://bugs.openjdk.java.net/browse/JDK-8146629
>>         http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>>
>>         Currently, there is no way to determine in Node::Identity() and
>> Node::Value() if we were called from GVN or IGVN
>>         but sometimes we would like to do optimizations based on this
>> information (for example, see discussion in RFR
>>         for JDK-8136469 [1]). I changed the arguments of Node::Identity()
>> and Node::Value() from PhaseTransform* to
>>         PhaseGVN*. Like this, we can simply call
>> PhaseValues::is_IterGVN() from both methods.
>>
>>         Thanks,
>>         Tobias
>>
>>         [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>>
>>

From goetz.lindenmaier at sap.com  Fri Jan  8 07:35:54 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 8 Jan 2016 07:35:54 +0000
Subject: RFR(M): 8146613: PPC64: C2 does no longer respect int to long
	conversion for stub calls
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228AAD8@DEWDFEMB19C.global.corp.sap>
References: <7C9B87B351A4BA4AA9EC95BB418116567228AAD8@DEWDFEMB19C.global.corp.sap>
Message-ID: <4295855A5C1DE049A61835A1887419CC41F0F5AC@DEWDFEMB12A.global.corp.sap>

Hi Martin, 

thanks for doing these fixes, they look good.
I especially like the trick of doing an int cast that won't be
optimized by the C compiler in the montgomery intrinsics.

You should be able to push this yourself as you are now a
Committer and it only touches ppc files.

Best regards,
  Goetz.


> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
> bounces at openjdk.java.net] On Behalf Of Doerr, Martin
> Sent: Donnerstag, 7. Januar 2016 14:55
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: RFR(M): 8146613: PPC64: C2 does no longer respect int to long
> conversion for stub calls
> 
> Hi,
> 
> 
> 
> I have created a webrev which introduces int to long conversion in PPC64
> functions which are called by C2 runtime calls.
> 
> I also added assertions to arraycopy stubs which are already called correctly.
> 
> 
> 
> Background:
> 8086069 removed too much code. Only the native wrapper performed the
> conversion after this change. However, it is required to convert ints to longs
> for all C calls and some runtime calls.
> 8144466 reintroduced the platform variable
> CCallingConventionRequiresIntsAsLongs and the conversion for the runtime
> calls for which C2 calls shared C functions on PPC64.
> Some PPC64 runtime functions which rely on proper 64 bit arguments are still
> called without conversion.
> 
> Webrev is here:
> 
> http://cr.openjdk.java.net/~mdoerr/8146613_ppc64_int2long/webrev.00/
> 
> 
> 
> It only touches PPC64 files.
> 
> 
> 
> Please review and sponsor.
> 
> 
> 
> Best regards,
> 
> Martin
> 
> 


From tobias.hartmann at oracle.com  Fri Jan  8 08:03:47 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 8 Jan 2016 09:03:47 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <568EB7C6.5030701@oracle.com>
References: <568EB3A0.3040909@oracle.com> <568EB7C6.5030701@oracle.com>
Message-ID: <568F6D63.9050603@oracle.com>

Thanks for the review, Vladimir.

Best,
Tobias

On 07.01.2016 20:08, Vladimir Kozlov wrote:
> Perfect.
> 
> Thanks,
> Vladimir
> 
> On 1/7/16 10:51 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8146629
>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>>
>> Currently, there is no way to determine in Node::Identity() and Node::Value() if we were called from GVN or IGVN but sometimes we would like to do optimizations based on this information (for example, see discussion in RFR for JDK-8136469 [1]). I changed the arguments of Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. Like this, we can simply call PhaseValues::is_IterGVN() from both methods.
>>
>> Thanks,
>> Tobias
>>
>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>>

From tobias.hartmann at oracle.com  Fri Jan  8 09:05:24 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 8 Jan 2016 10:05:24 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568EBB86.1060108@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com> <568EAE74.6020507@oracle.com>
	<568EBB86.1060108@oracle.com>
Message-ID: <568F7BD4.1070000@oracle.com>

Hi Vladimir,

On 07.01.2016 20:24, Vladimir Kozlov wrote:
> On 1/7/16 10:29 AM, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> On 07.01.2016 01:35, Vladimir Kozlov wrote:
>>> Nope. Too many unrelated changes. If you want to go this road - file a separate RFE to change the phase argument type of Identity() and Value().
>>
>> Okay, I agree. I filed JDK-8146629 [1].
>>
>>> And why use PhaseValues and not PhaseGVN as in Ideal()?
>>
>> Right, we can use PhaseGVN.
>>
>>> So I agree to do your change in IfNode::Identity(). But as separate fix after general change.
>>
>> Here is the updated webrev based on JDK-8146629:
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.06/
> 
> So for IGVN we wait until dead branch is removed and only one IfProj node left before we do this Identity optimization.
> And for GVN (Parse phase) we don't wait because during this phase we don't remove nodes.
> The comment should say something about GVN/Parse phase to understand !phase->is_IterGVN() condition.

Right, I updated the comment. Does this look good to you?
http://cr.openjdk.java.net/~thartmann/8136469/webrev.07

Thanks,
Tobias

> 
> Thanks,
> Vladimir
> 
>>
>> Thanks,
>> Tobias
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8146629
>>
>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/6/16 3:22 AM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably miss to adapt such a fix if we ever get the ability to check for GVN/IGVN.
>>>>
>>>> Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
>>>> Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).
>>>>
>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/
>>>>
>>>> What do you think?
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>>
>>>> On 18.09.2015 11:57, Tobias Hartmann wrote:
>>>>> Hi,
>>>>>
>>>>> please review the following patch.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8136469
>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
>>>>>
>>>>> Problem:
>>>>> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
>>>>>
>>>>> Solution:
>>>>> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
>>>>>
>>>>> Testing:
>>>>> - New test (TestPresizedStringBuilder)
>>>>> - JPRT
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
>>>>> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
>>>>>

From roland.westrelin at oracle.com  Fri Jan  8 09:33:32 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 8 Jan 2016 10:33:32 +0100
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <CAHjP37Fc7wAj24b5np5Yh9m6OKDmoh3-_TvrEtja=UMOsr73bg@mail.gmail.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
	<568D7FA1.4040707@oracle.com>
	<CAHjP37Fc7wAj24b5np5Yh9m6OKDmoh3-_TvrEtja=UMOsr73bg@mail.gmail.com>
Message-ID: <1BC8C0B0-E8EF-4D6B-B9EE-D374E2FC3E04@oracle.com>

> Does checkIndex match on it? If so, is there a reason to proceed with intrinsifying checkIndex?

I expect it would in some cases but not all.

The pattern matching needs profiling to tell that the branches that would trigger an exception are never taken; only then can the tests be folded and made to look like a range check for the next optimization passes. Profiling can be polluted or not mature enough. The intrinsic assumes the exception paths are never taken and doesn't rely on profiling (if the check does fail, we recompile and don't use the intrinsic). We take the use of the checkIndex API as a hint that the checks are not expected to fail.

Also, for the pattern matching to work on i < 0 || i >= length, the compiler needs to know enough about the range of values taken by length to be able to fold. Again we see checkIndex as an indication that length is positive, and if we can't prove it we compile a predicate to verify that it is, so we can safely use an unsigned compare. Again we take the use of checkIndex as a hint that the length argument is positive.
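
To make the folding argument concrete, here is a minimal Java sketch of why the two signed tests can be replaced by one unsigned compare once length is known to be non-negative. The class and method names are hypothetical and purely illustrative; this is not HotSpot code and not the checkIndex intrinsic itself.

```java
// Hypothetical illustration only; not HotSpot code and not the checkIndex intrinsic.
final class RangeCheckSketch {
    // Two signed comparisons, as typically written in Java source.
    static boolean outOfBoundsSigned(int i, int length) {
        return i < 0 || i >= length;
    }

    // One unsigned comparison: a negative i is a very large unsigned value.
    static boolean outOfBoundsUnsigned(int i, int length) {
        return Integer.compareUnsigned(i, length) >= 0;
    }

    public static void main(String[] args) {
        int[] lengths = {0, 1, 10, Integer.MAX_VALUE};
        int[] indices = {Integer.MIN_VALUE, -1, 0, 1, 9, 10, Integer.MAX_VALUE};
        // For every non-negative length the two forms agree.
        for (int length : lengths) {
            for (int i : indices) {
                if (outOfBoundsSigned(i, length) != outOfBoundsUnsigned(i, length)) {
                    throw new AssertionError("mismatch for i=" + i + ", length=" + length);
                }
            }
        }
        // With a negative length the two forms disagree, which is why the
        // length has to be proven (or checked by a predicate) first.
        System.out.println(outOfBoundsSigned(0, -1));   // prints "true"
        System.out.println(outOfBoundsUnsigned(0, -1)); // prints "false"
    }
}
```

The last two lines show the case the predicate guards against: if length could be negative, the unsigned form would accept an index that the signed form rejects.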

Roland.

> 
> On Wednesday, January 6, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> Note, we already have range check pattern matching code in C2 (thanks to Roland):
> 
> https://bugs.openjdk.java.net/browse/JDK-8137168
> 
> Vladimir
> 
> On 1/6/16 12:39 PM, Vitaly Davidovich wrote:
> I don't think there's a need to write out 20 different ways to do a
> range check -- I think nobody would expect all 20 to be covered by the
> optimizer.  Some of those variations may not map cleanly to
> Object::checkIndex either, nor is there any guarantee that people will
> update all their existing range checks (or even know about) to use
> Object::checkIndex -- some code will be left unoptimized no matter what.
> 
> But my point is the same as Andrew's, I think; instead of making
> checkIndex an intrinsic, simply add a pattern match against that exact
> bytecode shape (perhaps with basic canonicalization) and then still
> encourage people to use Object::checkIndex.  This is better than
> intrinsic (modulo profile pollution) since any other code that happens
> to use same pattern will match as well, and not require an update to use
> checkIndex.  Then, if someone comes to this list with an unoptimized
> example with a different bytecode shape and has a convincing argument
> that the code shape is "common", you guys can consider pattern matching
> that as well.
> 
> On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com
> <mailto:john.r.rose at oracle.com>> wrote:
> 
> 
>      > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com
>     <mailto:vitalyd at gmail.com>> wrote:
>      >
>      > better canonicalization
> 
>     That's our first and most important tactic.  (Actually inlining is.)
> 
>     But the various idioms for checkIndex do not canonicalize easily. In
>     this case the correct trade-off is not to invest more time and
>     research and code into stronger canonicalization.
> 
>     We do have canonicalization of if-expressions. It's just that in
>     this case strengthening it to cover range checks reliably is harder
>     than the reasonable alternative.
> 
>     -- John
> 
>     PS.  I am tempted to write out a list of 20 different ways to code a
>     range check but will leave that as an exercise.
> 
> 
> 
> 
> -- 
> Sent from my phone


From roland.westrelin at oracle.com  Fri Jan  8 10:32:00 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 8 Jan 2016 11:32:00 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <568EB3A0.3040909@oracle.com>
References: <568EB3A0.3040909@oracle.com>
Message-ID: <4A5ECDA0-6F08-4AA1-AEBC-202042F6707E@oracle.com>

> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/

That looks good to me.

Roland.

From roland.westrelin at oracle.com  Fri Jan  8 10:33:56 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 8 Jan 2016 11:33:56 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568F7BD4.1070000@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com> <568EAE74.6020507@oracle.com>
	<568EBB86.1060108@oracle.com> <568F7BD4.1070000@oracle.com>
Message-ID: <63EB93BD-3E8A-4681-AF8F-0A005E61BE1C@oracle.com>

> http://cr.openjdk.java.net/~thartmann/8136469/webrev.07

That looks good to me.

Roland.

From tobias.hartmann at oracle.com  Fri Jan  8 10:35:56 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 8 Jan 2016 11:35:56 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <4A5ECDA0-6F08-4AA1-AEBC-202042F6707E@oracle.com>
References: <568EB3A0.3040909@oracle.com>
	<4A5ECDA0-6F08-4AA1-AEBC-202042F6707E@oracle.com>
Message-ID: <568F910C.6030701@oracle.com>

Thanks, Roland.

Best,
Tobias

On 08.01.2016 11:32, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
> 
> That looks good to me.
> 
> Roland.
> 

From tobias.hartmann at oracle.com  Fri Jan  8 10:36:13 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 8 Jan 2016 11:36:13 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <63EB93BD-3E8A-4681-AF8F-0A005E61BE1C@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com> <568EAE74.6020507@oracle.com>
	<568EBB86.1060108@oracle.com> <568F7BD4.1070000@oracle.com>
	<63EB93BD-3E8A-4681-AF8F-0A005E61BE1C@oracle.com>
Message-ID: <568F911D.1050203@oracle.com>

Thanks, Roland.

Best,
Tobias

On 08.01.2016 11:33, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.07
> 
> That looks good to me.
> 
> Roland.
> 

From tobias.hartmann at oracle.com  Fri Jan  8 10:37:55 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 8 Jan 2016 11:37:55 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568ECF5C.6090407@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com>
Message-ID: <568F9183.9070909@oracle.com>


On 07.01.2016 21:49, Vladimir Kozlov wrote:
> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>> Andrew is right.
>>
>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
> 
> Right. It was the root of this bug, see below.
> 
>>
>> I fixed this for the inflate and compress intrinsics.
>>
>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>> StrInflatedCopyNode is not memory node.
>>
>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
> 
> I did not get the question. Is it before your webrev.01 change? Or even with the change?

I meant with webrev.01 but you answered my question below.

>>   // This class defines a projection of the memory  state of a store conditional node.
>>   // These nodes return a value, but also update memory.
>>
>> But inflate does not return any value.
> 
> Hmm, according to bottom type inflate produce memory:
> 
> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
> 
> So it really does not need SCMemProjNode. Sorry about that.
> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.

Exactly.

> Instead of SCMemProjNode we should have to change the idx of your dst_type:
> 
> set_memory(str, dst_type);

Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.

> And you should rollback part of changes in escape.cpp and macro.cpp.

Okay, I'll do that.

>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
> 
> In general when src & dst arrays have different type we may need to use TypeOopPtr::BOTTOM to prevent related store & loads bypass these copy nodes.

Okay, should we then use BOTTOM for both the input and output type?

>> Related question:
>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>   char[int:>=0]:exact+any *
>>
>> which is equal to the type of the char load.
> 
> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?

Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:

// compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
//   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
//   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
// compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
//   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
//   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)

I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
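
As a rough model of what the two copy operations do at the Java level, here is a small sketch for the byte[] to char[] and char[] to byte[] cases. It only mirrors the parameter order from the comment above; the class name and the return convention of compress (number of chars copied before the first non-Latin-1 char) are assumptions for illustration, not the JDK implementation.

```java
// Hypothetical model of the copy semantics only; this is not the JDK's
// StringLatin1/StringUTF16 code and not what the C2 intrinsic emits.
final class CompressInflateSketch {
    // Inflate: widen each Latin-1 byte to a char (byte[] -> char[]).
    static void inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            dst[dstOff + i] = (char) (src[srcOff + i] & 0xff);
        }
    }

    // Compress: narrow chars to bytes (char[] -> byte[]).
    // Assumption: stop at the first char above 0xff and return how many
    // chars were copied up to that point.
    static int compress(char[] src, int srcOff, byte[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            char c = src[srcOff + i];
            if (c > 0xff) {
                return i;
            }
            dst[dstOff + i] = (byte) c;
        }
        return len;
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 'H', (byte) 0xE9};
        char[] chars = new char[2];
        inflate(latin1, 0, chars, 0, 2);
        // This char load must observe the write done by inflate; it is the kind
        // of load that must not be moved above the copy.
        System.out.println((int) chars[1]); // prints "233"

        byte[] bytes = new byte[3];
        System.out.println(compress(new char[] {'a', '\u0100', 'b'}, 0, bytes, 0, 3)); // prints "1"
    }
}
```

The char load after inflate() is the access pattern from this bug: the destination is a char[], so the copy has to be on the same memory slice as the following char load.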

> Should we also be more careful in inflate_string_slow()? Is it used?

No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].

>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>   char[int:1]:NotNull:exact *
>>
>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
> 
> It is indeed strange. What memory type of LoadUS? It could be bug.

LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.

I will look into this again and try to understand what happens.

Thanks,
Tobias

>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>
>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>> array copys in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>
>>>>>    // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>    // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>    // compiler because the CPU does all the ordering for us.
>>>>>
>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>
>>>>> Or did I misunderstand your question?
>>>>
>>>> No, I don't think so.  I was just checking: I am very aware that
>>>> HotSpot has presented those of us with relaxed memory order machines
>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>> surprised that C2 needs this barrier, given that there is a
>>>> read-after-write dependency, but never mind.
>>>>
>>>> Thanks,
>>>>
>>>> Andrew.
>>>>

From martin.doerr at sap.com  Fri Jan  8 11:06:42 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 8 Jan 2016 11:06:42 +0000
Subject: RFR(M): 8146612: C2: Precedence edges specification violated
In-Reply-To: <568EE1A6.3050202@oracle.com>
References: <7C9B87B351A4BA4AA9EC95BB418116567228AAB8@DEWDFEMB19C.global.corp.sap>
	<568EE1A6.3050202@oracle.com>
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228ACE8@DEWDFEMB19C.global.corp.sap>

Hi Vladimir,

thanks for the review.

I have changed the comments, added assertions and factored out the common functionality of del_req(), del_req_ordered() and rm_prec() into a new private function close_prec_gap_at(). That makes sense.

Regarding your concern about accessing outside of the _in array in rm_prec():
Please note that i is decremented before it gets used:
if j == _max-1, i will be set to _max, but it is decremented in "_in[--i]".

Anyway, I have replaced this code by close_prec_gap_at(), so it doesn't matter anymore.
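
For readers without the webrev at hand, the invariant ("unordered and not duplicated; no embedded NULLs") can be modeled with a small Java sketch. This is only an assumed model of the idea behind closing a gap after a removal; the class, its fields and its methods are hypothetical and do not reproduce node.cpp or close_prec_gap_at().

```java
import java.util.Arrays;

// Hypothetical model; names only echo the thread and do not reproduce node.cpp.
final class PrecEdgeSketch {
    private final Object[] in; // [0, cnt) required inputs, [cnt, in.length) precedence edges
    private final int cnt;

    PrecEdgeSketch(int cnt, int max) {
        this.in = new Object[max];
        this.cnt = cnt;
    }

    // Add a precedence edge unless it is null, a duplicate, or there is no room.
    boolean addPrec(Object n) {
        if (n == null) return false;
        int i = cnt;
        while (i < in.length && in[i] != null) {
            if (in[i] == n) return false; // avoid duplicated prec edge
            i++;
        }
        if (i == in.length) return false; // full; a real implementation would grow
        in[i] = n;
        return true;
    }

    // Remove the precedence edge at index j and close the gap by moving the
    // last non-null precedence edge into it, so no embedded null remains.
    void rmPrec(int j) {
        if (j < cnt || j >= in.length) {
            throw new IndexOutOfBoundsException("not a precedence edge: " + j);
        }
        if (in[j] == null) return;
        int last = j;
        while (last + 1 < in.length && in[last + 1] != null) last++;
        in[j] = in[last];
        in[last] = null; // null out the last element; last never exceeds in.length - 1
    }

    Object[] precEdges() {
        return Arrays.copyOfRange(in, cnt, in.length);
    }

    public static void main(String[] args) {
        PrecEdgeSketch n = new PrecEdgeSketch(1, 4);
        n.addPrec("a");
        n.addPrec("b");
        n.addPrec("c");
        n.rmPrec(1); // removes "a"; "c" moves into the gap
        System.out.println(Arrays.toString(n.precEdges())); // prints "[c, b, null]"
    }
}
```

Because the edges are unordered, moving the last edge into the gap is enough; no shifting of the remaining elements is needed.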

The new webrev is here:
http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.01/

Best regards,
  Martin


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Donnerstag, 7. Januar 2016 23:08
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M): 8146612: C2: Precedence edges specification violated

// Avoid spec violation: multiple prec edge.

I think it should be:

// Avoid spec violation: duplicated prec edge.

Should we add assert to rm_prec()?:
      assert(j >= _cnt, "not a precedence edge");

Also we may need to check that input index is < _max in set_prec() and rm_prec().

Next access will be outside _in array if j == _max-1 (in rm_prec()):

    _in[i] = NULL;                // NULL out last element

unless we guarantee that there is always NULL at the end. Which I don't see because set_prec() may set the last prec 
edge to not NULL.

Please factor out similar code (search for last non-NULL prec edge) in del_req(), del_req_ordered() and rm_prec() into 
separate method.

Thanks,
Vladimir


On 1/7/16 5:45 AM, Doerr, Martin wrote:
> Hi,
>
> some time ago, we found out, that C2 doesn't treat precedence edges as specified.
>
> The description of precedence edges in node.hpp says:
>
> "They are unordered and not duplicated; they have no embedded NULLs."
>
> Some functions in the current implementation violate this specification.
>
> I have fixed this in the following webrev:
>
> http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.00/
>
> Please review. I will need a sponsor, please.
>
> Best regards,
>
> Martin
>

From zoltan.majo at oracle.com  Fri Jan  8 11:06:58 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Fri, 8 Jan 2016 12:06:58 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
Message-ID: <568F9852.4090806@oracle.com>

Hi,


please review the patch for 8086053.

https://bugs.openjdk.java.net/browse/JDK-8086053

Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly 
allocated TLAB regions. With ZeroTLAB disabled, the interpreter and 
compiled code should assume the responsibility to zero-fill newly 
allocated regions.
Currently, the handling of the ZeroTLAB flag shows some inconsistencies 
between the GC and the compilers. These inconsistencies lead to newly 
allocated regions not being filled with zeros.

Solution: Address the following:
- With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without 
notifying the GC. As a result, the newly allocated TLAB is not 
initialized with zero. Add TLAB initialization code to C1.
- With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly 
allocated objects/arrays even if TLAB allocation is disabled. Add 
stricter conditions to C2 on when to skip filling objects/arrays with zero.

Webrev:
http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/

Testing:
- local testing (linux_x86_64) of failing test case with -XX:+UseG1GC 
and -XX:+UseSerialGC;
- JPRT;
- all hotspot tests on all platforms affected by the change using all 
combinations of +/-UseTLAB and +/-ZeroTLAB.

Thank you and best regards,


Zoltan


From edward.nevill at gmail.com  Fri Jan  8 11:46:57 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Fri, 08 Jan 2016 11:46:57 +0000
Subject: RFR: 8146678: aarch64: assertion failure: call instruction in an
	infinite loop
Message-ID: <1452253617.19405.10.camel@mint>

Hi,

Please review the following webrev

http://cr.openjdk.java.net/~enevill/8146678/webrev/

JIRA Issue: https://bugs.openjdk.java.net/browse/JDK-8146678

This fixes an assertion in Relocation::pd_set_call_destination

assert(addr() != x, "call instruction in an infinite loop"); 

which triggers after the following change:

8146286: aarch64: guarantee failures with large code cache sizes on jtreg test java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java 

The reason is that this change deliberately generates BL to self to avoid BL going out of range.

The fix is to remove the assertion as it is no longer valid.

Thanks,
Ed.


From vitalyd at gmail.com  Fri Jan  8 12:38:18 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 8 Jan 2016 07:38:18 -0500
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <1BC8C0B0-E8EF-4D6B-B9EE-D374E2FC3E04@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
	<568D7FA1.4040707@oracle.com>
	<CAHjP37Fc7wAj24b5np5Yh9m6OKDmoh3-_TvrEtja=UMOsr73bg@mail.gmail.com>
	<1BC8C0B0-E8EF-4D6B-B9EE-D374E2FC3E04@oracle.com>
Message-ID: <CAHjP37GKBXBk1e0NSByAxxm8A04i6=TD2uzSiy3KyTNDuCxgEA@mail.gmail.com>

Roland, thanks for elaborating; a few comments inline ...

On Friday, January 8, 2016, Roland Westrelin <roland.westrelin at oracle.com>
wrote:

> > Does checkIndex match on it? If so, is there a reason to proceed with
> intrinsifying checkIndex?
>
> I expect it would in some cases but not all.
>
> The pattern matching needs profiling to tell the branches that would
> trigger an exception are never taken, then only can the tests be folded and
> made to look like a range check for the next optimization passes. Profiling
> can be polluted or not mature enough. The intrinsic assumes the exception
> path are never taken and doesn't rely on profiling (then if the check does
> fail we recompile and don't use the intrinsic). We take the use of the
> checkIndex API as a hint that the checks are not expected to fail.


As a general comment, would it make sense to assume exceptional paths are
not taken in most Java code? That is, for code optimization purposes it's
probably a reasonable assumption.  It seems like having an exceptional path
is already a hint that it's not expected to fail; most Java devs know not
to use exceptions for expected control flow.

>
> Also, for the pattern matching to work, in i <0 || i >= length the
> compiler needs to know enough on the range of values taken by length to be
> able to fold. Again we see checkIndex as an indication that length is
> positive and if we can't prove it we compile a predicate to verify that it
> is so we can safely use an unsigned compare. Again we take the use of
> checkIndex as a hint that the length argument is positive.


Could a bytecode shape that looks just like checkIndex be treated as the same hint? Are
there cases where something looks like checkIndex but really isn't?

>
> Roland.
>
> >
> > On Wednesday, January 6, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> > Note, we already have range check pattern matching code in C2 (thanks to
> Roland):
> >
> > https://bugs.openjdk.java.net/browse/JDK-8137168
> >
> > Vladimir
> >
> > On 1/6/16 12:39 PM, Vitaly Davidovich wrote:
> > I don't think there's a need to write out 20 different ways to do a
> > range check -- I think nobody would expect all 20 to be covered by the
> > optimizer.  Some of those variations may not map cleanly to
> > Object::checkIndex either, nor is there any guarantee that people will
> > update all their existing range checks (or even know about) to use
> > Object::checkIndex -- some code will be left unoptimized no matter what.
> >
> > But my point is the same as Andrew's, I think; instead of making
> > checkIndex an intrinsic, simply add a pattern match against that exact
> > bytecode shape (perhaps with basic canonicalization) and then still
> > encourage people to use Object::checkIndex.  This is better than
> > intrinsic (modulo profile pollution) since any other code that happens
> > to use same pattern will match as well, and not require an update to use
> > checkIndex.  Then, if someone comes to this list with an unoptimized
> > example with a different bytecode shape and has a convincing argument
> > that the code shape is "common", you guys can consider pattern matching
> > that as well.
> >
> > On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com
> > <mailto:john.r.rose at oracle.com>> wrote:
> >
> >
> >      > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com
> >     <mailto:vitalyd at gmail.com>> wrote:
> >      >
> >      > better canonicalization
> >
> >     That's our first and most important tactic.  (Actually inlining is.)
> >
> >     But the various idioms for checkIndex do not canonicalize easily. In
> >     this case the correct trade-off is not to invest more time and
> >     research and code into stronger canonicalization.
> >
> >     We do have canonicalization of if-expressions. It's just that in
> >     this case strengthening it to cover range checks reliably is harder
> >     than the reasonable alternative.
> >
> >     -- John
> >
> >     PS.  I am tempted to write out a list of 20 different ways to code a
> >     range check but will leave that as an exercise.
> >
> >
> >
> >
> > --
> > Sent from my phone
>
>

-- 
Sent from my phone

From rahul.v.raghavan at oracle.com  Fri Jan  8 17:13:40 2016
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Fri, 8 Jan 2016 09:13:40 -0800 (PST)
Subject: FW: RFR(S): 6378256: Performance problem with System.identityHashCode
	in client compiler
In-Reply-To: <a145a4d4-86f2-4c54-8be6-a0ff20c40a61@default>
References: <a145a4d4-86f2-4c54-8be6-a0ff20c40a61@default>
Message-ID: <d8c539a2-ca51-43de-a2b7-61460fc4fbf7@default>

Hello,

Please review the following revised patch for JDK-6378256 -
http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/

This revised webrev has the following changes:

 1) A minor optimization: return 0 at the initial stage (instead of continuing to the 'slowCase' path) for the special/rare case of a null reference input.
   (As per the documentation and confirmed by test results, it is safe to 'return 0' for a null reference input to System.identityHashCode.)

 2) Added similar Object.hashCode and System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.

Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
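
The Java-level behavior that the fast path has to preserve can be checked with a short sketch like the one below. It only exercises the documented API behavior; the class name and the nested Fixed class are made up for illustration and are not part of the webrev or its tests.

```java
// Hypothetical check of the documented behavior; not part of the webrev.
final class IdentityHashSketch {
    // A class that overrides hashCode(); identityHashCode ignores the override.
    static final class Fixed {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    public static void main(String[] args) {
        Object obj = new Object();
        // A null reference yields 0, so returning 0 early is safe.
        System.out.println(System.identityHashCode(null)); // prints "0"
        // For a plain Object both calls return the same value.
        System.out.println(obj.hashCode() == System.identityHashCode(obj)); // prints "true"
        // The identity hash is stable across calls, whatever hashCode() returns.
        Fixed f = new Fixed();
        System.out.println(f.hashCode()); // prints "42"
        System.out.println(System.identityHashCode(f) == System.identityHashCode(f)); // prints "true"
    }
}
```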

Thanks,
Rahul


> -----Original Message-----
> From: Roland Westrelin
> Sent: Wednesday, December 09, 2015 8:03 PM
> To: Rahul Raghavan
> Cc: hotspot-compiler-dev at openjdk.java.net
> 
> > webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> 
> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be
> nice.
> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
> 
> Roland.


> -----Original Message-----
> From: Rahul Raghavan
> Sent: Wednesday, December 09, 2015 2:43 PM
> To: hotspot-compiler-dev at openjdk.java.net
> 
> Hello,
> 
> Please review the following patch for JDK-6378256.
> 
> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times
> slower).
> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
> 
> sample unit test:
>    public class Jdk6378256Test
>    {
>       public static void main(String[] args)
>       {
>          Object obj = new Object();
>          long time = System.nanoTime();
>          for(int i = 0 ; i < 1000000 ; i++)
>             System.identityHashCode(obj);  //compare to obj.hashCode();
>          System.out.println ("Result = " + (System.nanoTime() - time));
>       }
>    }
> 
> Fix: Enabled the C1 optimization which was done only for Object.hashCode, now for System.identityHashCode() also.
> (looks in the header for the hashCode before calling into the VM).
> Unlike for Object.hashCode, System.identityHashCode is static method and gets object as argument instead of the receiver.
> So also added required additional null check for System.identityHashCode case.
> 
> Testing:
>    - successful JPRT run (-testset hotspot).
>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
>    - Added 'noreg-perf' label for this performance bug.
>       Manual testing done and confirmed expected performance values for unit tests with fix.
> 
> Thanks,
> Rahul

From vladimir.kozlov at oracle.com  Fri Jan  8 19:34:09 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 8 Jan 2016 11:34:09 -0800
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <568F7BD4.1070000@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com> <568EAE74.6020507@oracle.com>
	<568EBB86.1060108@oracle.com> <568F7BD4.1070000@oracle.com>
Message-ID: <56900F31.4060409@oracle.com>

Very good.

Thanks,
Vladimir

On 1/8/16 1:05 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 07.01.2016 20:24, Vladimir Kozlov wrote:
>> On 1/7/16 10:29 AM, Tobias Hartmann wrote:
>>> Hi Vladimir,
>>>
>>> On 07.01.2016 01:35, Vladimir Kozlov wrote:
>>>> Nope. Too much unrelated changes. If you want to go this road - file separate RFE to change phase argument type of Identity() and Value().
>>>
>>> Okay, I agree. I filed JDK-8146629 [1].
>>>
>>>> And why use PhaseValue and not PhaseGVN as in Ideal()?
>>>
>>> Right, we can use PhaseGVN.
>>>
>>>> So I agree to do your change in IfNode::Identity(). But as separate fix after general change.
>>>
>>> Here is the updated webrev based on JDK-8146629:
>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.06/
>>
>> So for IGVN we wait until the dead branch is removed and only one IfProj node is left before we do this Identity optimization.
>> And for GVN (Parse phase) we don't wait because during this phase we don't remove nodes.
>> The comment should say something about the GVN/Parse phase to explain the !phase->is_IterGVN() condition.
>
> Right, I updated the comment. Does this look good to you?
> http://cr.openjdk.java.net/~thartmann/8136469/webrev.07
>
> Thanks,
> Tobias
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8146629
>>>
>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 1/6/16 3:22 AM, Tobias Hartmann wrote:
>>>>> Hi,
>>>>>
>>>>> I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably forget to adapt such a fix if we ever get the ability to check for GVN/IGVN.
>>>>>
>>>>> Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
>>>>> Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).
>>>>>
>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>>
>>>>> On 18.09.2015 11:57, Tobias Hartmann wrote:
>>>>>> Hi,
>>>>>>
>>>>>> please review the following patch.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136469
>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
>>>>>>
>>>>>> Problem:
>>>>>> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
>>>>>>
>>>>>> Solution:
>>>>>> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
>>>>>>
>>>>>> Testing:
>>>>>> - New test (TestPresizedStringBuilder)
>>>>>> - JPRT
>>>>>>
>>>>>> Thanks,
>>>>>> Tobias
>>>>>>
>>>>>> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
>>>>>> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
>>>>>>
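
The shape described in the original problem statement can be sketched in Java as follows. This is not the attached TestPresizedStringBuilder; the class name, the property name "sketch.presize" and the method are assumptions made for illustration, and the sketch is not guaranteed to reproduce the C2 bailout. It only shows a StringBuilder whose initial capacity depends on a static final boolean that is computed at class-initialization time rather than being a compile-time constant.

```java
final class PresizedStringBuilderSketch {
    // Hypothetical flag: static final, but not a compile-time constant,
    // so its value (true by default) is only known at runtime.
    static final boolean PRESIZE =
        Boolean.parseBoolean(System.getProperty("sketch.presize", "true"));

    // The capacity expression contains a test on PRESIZE, so the control
    // flow of the concatenation chain includes a branch on that flag.
    static String concat(String a, String b) {
        int capacity = PRESIZE ? a.length() + b.length() : 16;
        return new StringBuilder(capacity).append(a).append(b).toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("foo", "bar"));
    }
}
```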

From vladimir.kozlov at oracle.com  Fri Jan  8 19:41:53 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 8 Jan 2016 11:41:53 -0800
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <568F9183.9070909@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
Message-ID: <56901101.6050503@oracle.com>

On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>
> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>> Hi Vladimir,
>>>
>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>> Andrew is right.
>>>
>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>
>> Right. It was the root of this bug, see below.
>>
>>>
>>> I fixed this for the inflate and compress intrinsics.
>>>
>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>> StrInflatedCopyNode is not memory node.
>>>
>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>
>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>
> I meant with webrev.01 but you answered my question below.
>
>>>    // This class defines a projection of the memory  state of a store conditional node.
>>>    // These nodes return a value, but also update memory.
>>>
>>> But inflate does not return any value.
>>
>> Hmm, according to its bottom type, inflate produces memory:
>>
>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>
>> So it really does not need SCMemProjNode. Sorry about that.
>> So the load was a LoadUS, which is a char load, and the memory slice of inflate was originally (incorrectly) BYTES.
>
> Exactly.
>
>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>
>> set_memory(str, dst_type);
>
> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>
>> And you should rollback part of changes in escape.cpp and macro.cpp.
>
> Okay, I'll do that.
>
>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>
>> In general, when the src & dst arrays have different types we may need to use TypeOopPtr::BOTTOM to prevent related stores & loads from bypassing these copy nodes.
>
> Okay, should we then use BOTTOM for both the input and output type?

Only input. Output type corresponds to dst array type which you set correctly now.

>
>>> Related question:
>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>    char[int:>=0]:exact+any *
>>>
>>> which is equal to the type of the char load.
>>
>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>
> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>
> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>
> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>
>> Should we also be more careful in inflate_string_slow()? Is it used?
>
> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>
>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>    char[int:1]:NotNull:exact *
>>>
>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>
>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>
> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>
> I will look into this again and try to understand what happens.

It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.

Thanks,
Vladimir

>
> Thanks,
> Tobias
>
>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>
>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>
>>>>>>     // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>     // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>     // compiler because the CPU does all the ordering for us.
>>>>>>
>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>
>>>>>> Or did I misunderstand your question?
>>>>>
>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>> surprised that C2 needs this barrier, given that there is a
>>>>> read-after-write dependency, but never mind.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Andrew.
>>>>>
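
The read-after-write dependency discussed in this thread can be modelled in plain Java. This is a sketch under stated assumptions, not the JDK's StringLatin1.inflate and not the intrinsic: the class and method names are illustrative, and only the parameter list is taken from the library_call.cpp comment quoted above. The copy writes char[] memory and the following load reads it, so the load must not be scheduled before the copy; if the copy's output memory were recorded on the byte[] slice, the compiler would see no dependency between the two.

```java
final class InflateSketch {
    // Plain-Java model of inflating Latin-1 bytes into UTF-16 chars.
    static void inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            dst[dstOff + i] = (char) (src[srcOff + i] & 0xff); // zero-extend each byte
        }
    }

    // Expects a non-empty src array.
    static char inflateThenLoad(byte[] src) {
        char[] dst = new char[src.length];
        inflate(src, 0, dst, 0, src.length); // store into char[] memory
        return dst[0];                       // char load that depends on the store above
    }

    public static void main(String[] args) {
        byte[] latin1 = {0x41, (byte) 0xE9};
        System.out.println((int) inflateThenLoad(latin1)); // prints 65
    }
}
```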

From vladimir.kozlov at oracle.com  Fri Jan  8 19:46:33 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 8 Jan 2016 11:46:33 -0800
Subject: RFR(M): 8146612: C2: Precedence edges specification violated
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228ACE8@DEWDFEMB19C.global.corp.sap>
References: <7C9B87B351A4BA4AA9EC95BB418116567228AAB8@DEWDFEMB19C.global.corp.sap>
	<568EE1A6.3050202@oracle.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228ACE8@DEWDFEMB19C.global.corp.sap>
Message-ID: <56901219.8090805@oracle.com>

Very good. I will sponsor it.

Thanks,
Vladimir

On 1/8/16 3:06 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> thanks for the review.
>
> I have changed the comments, added assertions and factored out the common functionality of del_req(), del_req_ordered() and rm_prec() into a new private function close_prec_gap_at(). That makes sense.
>
> About your concern about accessing outside of _in array in rm_prec():
> Please note that i is decremented before it gets used:
> "j == _max-1", "i" will be set to "_max", but decremented in "_in[--i]"
>
> Anyway, I have replaced this code by close_prec_gap_at(), so it doesn't matter anymore.
>
> The new webrev is here:
> http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.01/
>
> Best regards,
>    Martin
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Donnerstag, 7. Januar 2016 23:08
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M): 8146612: C2: Precedence edges specification violated
>
> // Avoid spec violation: multiple prec edge.
>
> I think should be:
>
> // Avoid spec violation: duplicated prec edge.
>
> Should we add assert to rm_prec()?:
>        assert(j >= _cnt, "not a precedence edge");
>
> Also we may need to check that input index is < _max in set_prec() and rm_prec().
>
> Next access will be outside _in array if j == _max-1 (in rm_prec()):
>
>      _in[i] = NULL;                // NULL out last element
>
> unless we guarantee that there is always NULL at the end. Which I don't see because set_prec() may set the last prec
> edge to not NULL.
>
> Please factor out similar code (search for last non-NULL prec edge) in del_req(), del_req_ordered() and rm_prec() into
> separate method.
>
> Thanks,
> Vladimir
>
>
> On 1/7/16 5:45 AM, Doerr, Martin wrote:
>> Hi,
>>
>> some time ago, we found out, that C2 doesn't treat precedence edges as specified.
>>
>> The description of precedence edges in node.hpp says:
>>
>> "They are unordered and not duplicated; they have no embedded NULLs."
>>
>> Some functions in the current implementation violate this specification.
>>
>> I have fixed this in the following webrev:
>>
>> http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.00/
>>
>> Please review. I will need a sponsor, please.
>>
>> Best regards,
>>
>> Martin
>>
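
The invariant quoted from node.hpp ("unordered and not duplicated; no embedded NULLs") can be illustrated with a toy model in Java. This is only a sketch and not HotSpot's Node implementation: the class, its fields and the method names are hypothetical, the array never grows, and closePrecGapAt merely mirrors the idea of the helper mentioned in the review by moving the last non-null precedence edge into the gap.

```java
final class PrecEdgeSketch {
    // Slots [0, cnt) model required inputs; slots [cnt, in.length) model precedence edges.
    private final Object[] in;
    private final int cnt;

    PrecEdgeSketch(int cnt, int max) {
        this.cnt = cnt;
        this.in = new Object[max];
    }

    // Adds a precedence edge unless it is null, already present, or no slot is free.
    boolean addPrec(Object n) {
        if (n == null) return false;
        int i = cnt;
        while (i < in.length && in[i] != null) {
            if (in[i] == n) return false; // would be a duplicated edge
            i++;
        }
        if (i == in.length) return false; // toy model: no array growth
        in[i] = n;
        return true;
    }

    // Removes the precedence edge in slot j without leaving an embedded null.
    void rmPrec(int j) {
        if (j < cnt || j >= in.length) {
            throw new IllegalArgumentException("not a precedence edge");
        }
        if (in[j] == null) return;
        closePrecGapAt(j);
    }

    // Moves the last non-null precedence edge into the gap and nulls out its old slot.
    private void closePrecGapAt(int gap) {
        int last = gap;
        while (last + 1 < in.length && in[last + 1] != null) last++;
        in[gap] = in[last];
        in[last] = null;
    }

    Object precAt(int j) {
        return in[j];
    }

    public static void main(String[] args) {
        PrecEdgeSketch node = new PrecEdgeSketch(1, 4);
        Object a = new Object(), b = new Object(), c = new Object();
        node.addPrec(a);
        node.addPrec(b);
        node.addPrec(c);
        node.rmPrec(1); // c moves into slot 1, slot 3 becomes null
        System.out.println(node.precAt(1) == c && node.precAt(3) == null);
    }
}
```

Because precedence edges are unordered, filling the gap with the last edge is enough; no shifting of the remaining edges is needed.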

From vladimir.kozlov at oracle.com  Fri Jan  8 20:46:59 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 8 Jan 2016 12:46:59 -0800
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <568F9852.4090806@oracle.com>
References: <568F9852.4090806@oracle.com>
Message-ID: <56902043.1040409@oracle.com>

Looks good to me.

Thanks,
Vladimir

On 1/8/16 3:06 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the patch for 8086053.
>
> https://bugs.openjdk.java.net/browse/JDK-8086053
>
> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB disabled,
> the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These
> inconsistencies lead to newly allocated regions not being filled with zeros.
>
> Solution: Address the following:
> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly allocated
> TLAB is not initialized with zero. Add TLAB initialization code to C1.
> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB
> allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>
> Webrev:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>
> Testing:
> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
> - JPRT;
> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>
> Thank you and best regards,
>
>
> Zoltan
>

From vivek.r.deshpande at intel.com  Sat Jan  9 02:16:01 2016
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Sat, 9 Jan 2016 02:16:01 +0000
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <568DB72F.6010408@oracle.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569E1902@ORSMSX106.amr.corp.intel.com>
	<5684A5B8.7070407@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569F23FC@ORSMSX106.amr.corp.intel.com>
	<568DB72F.6010408@oracle.com>
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A569F50C1@ORSMSX106.amr.corp.intel.com>

Hi Vladimir,

I have updated the patch with latest base source and split the macroAssembler_x86_libm.cpp  file into two files for your review.
The patch is at this location:
http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/hotspot/webrev.01/

The 64-bit code does not produce less precise results or have lower performance as a result of not using FPU instructions.

Thank you.
Regards,
Vivek

 
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Wednesday, January 06, 2016 4:54 PM
To: Deshpande, Vivek R; Joseph D. Darcy
Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib

On 1/6/16 4:31 PM, Deshpande, Vivek R wrote:
> HI Vladimir,
>
> Yes, the macroAssembler_x86_libm.cpp file is getting large, I could look into splitting it into two files macroAssembler_libm_x86_64.cpp and macroAssembler_libm_x86_32.cpp. Please let me know if that sounds good to you.

Yes, if we keep separate code we should split the file (and adjust make files).

>
> The 64 bit code takes advantage of additional general purpose registers and 64 bit integer arithmetic and so we have two different versions for 32 bit and 64 bit.

Okay, this is a valid argument. Even so, we may use push/pop on 32-bit to preserve registers.

>
> Regarding the FPU usage in cos/sin, we talked with the LIBM algorithm experts and they came back with the following:
> "It would not be easy to remove FPU x87 instructions from libm_sincos_huge and libm_reduced_pi04l, they are designed with using extended precision from FPU in mind.  The performance for 32bit implementation for these that do not use x87 instructions may not be optimal. These two are only used for very large input arguments."

I don't buy this argument. Do they mean that the 64-bit code, which does not use the FPU, produces less precise results for very large input arguments?
Very large input arguments are a very rare case, I think. Should we care about their performance?
Note that 32-bit performance is becoming less and less important.

Okay, for now let's split the file. Later we can try to simplify/combine/factor out the code.

Thanks,
Vladimir


>
> Thank you.
> Regards,
> Vivek
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, December 30, 2015 7:49 PM
> To: Deshpande, Vivek R; Joseph D. Darcy
> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math 
> lib
>
> Hi Vivek,
>
> Why is the 32-bit code so different from the 64-bit code? You only use it if SSE2 is available, so XMM registers are present. Why use the FPU if you have SSE?
>
> 32-bit:
>
>    582   movsd(Address(rsp, 8), xmm0);
>    583   fld_d(Address(rsp, 8));
>    584   movsd(Address(rsp, 16), xmm6);
>    585   fld_d(Address(rsp, 16));
>    586   fmula(1);
>
> 64-bit:
>
>    295   mulsd(xmm0, xmm2);
>
> This concerns all LIBM 32-bit intrinsics.
>
> The main concern is that the macroAssembler_x86_libm.cpp file has become too large, and it would be nice if 32-bit and 64-bit reused the same code.
>
> Thanks,
> Vladimir
>
> On 12/24/15 6:10 PM, Deshpande, Vivek R wrote:
>> HI Vladimir
>>
>> I have updated the libm sin cos intrinsics for x86 for hotspot.
>> The updated webrev for the same is at this location for your review.
>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/hotspot/webrev.00/
>> Could you please review it.
>>
>> Regards,
>> Vivek
>>
>>
>> -----Original Message-----
>> From: Deshpande, Vivek R
>> Sent: Tuesday, December 22, 2015 5:42 PM
>> To: 'Joseph D. Darcy'; Vladimir Kozlov
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the math 
>> lib
>>
>> HI All
>>
>> I have uploaded the patch for sin and cos tests with input and allowed outputs at this location for your review.
>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>> Thank you.
>>
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>> Sent: Friday, December 04, 2015 4:50 PM
>> To: Deshpande, Vivek R; Vladimir Kozlov
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math 
>> lib
>>
>> Hi Vivek,
>>
>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>> Hi
>>>
>>> Sure, I will add the tests. Shall I use the StrictMath result as a reference for the exact result?
>>> Let me know your thoughts.
>>
>> As a rough test of another sin/cos implementation, StrictMath.{sin,
>> cos} can be used as a reference with the following caveat: there isn't
>> an indication of which way the error is in a StrictMath result. Let
>> me give an example, if
>>
>>        StrictMath.sin(x) => y
>>
>> then one of the following should be true
>>
>>        Math.sin(x) => y
>>        Math.sin(x) => Math.nextUp(y)
>>        Math.sin(x) => Math.nextDown(y)
>>
>> That is, Math.sin(x) should either be the same as StrictMath.sin(x)
>> OR equal to one of the floating-point numbers adjacent to that
>> result. Of these three options, only two are allowed by the accuracy
>> requirements of the StrictMath.sin specification. However, since
>> StrictMath.sin doesn't give an indication of which way its error went
>> (if it rounded up or down), there is no indication without additional
>> work which of nextUp(y) and nextDown(y) is allowable (assuming
>> StrictMath.sin isn't buggy).
>>
>> HTH,
>>
>> -Joe
>>
>>
>>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Thursday, December 03, 2015 1:29 PM
>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>> math lib
>>>
>>> Hello,
>>>
>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>> Vivek,
>>>>
>>>> I think Joe is asking you to write these tests as hotspot 
>>>> regression test in hotspot/test/compiler.
>>> Exactly; if not generally applicable sin/cos tests that could be hosted in the jdk repo (alongside the regression and unit tests for java.lang.Math), then test of intrinsics in the HotSpot repo alongside other tests targeting intrinsics.
>>>
>>> Thanks,
>>>
>>> -Joe
>>>
>>>> Vladimir
>>>>
>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>> Hi Joe
>>>>>
>>>>> It would be great if you would please share the additional tests 
>>>>> with us.
>>>>>
>>>>> Regards,
>>>>> Vivek
>>>>>
>>>>> -----Original Message-----
>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>> math lib
>>>>>
>>>>> I think it is unwise for this large of an implementation change to 
>>>>> be pushed with no tests targeting the specifics of the new implementation.
>>>>>
>>>>> The worst-case tests in the jdk repo are the mathematical worst
>>>>> cases for floating-point approximations, in other words the cases
>>>>> where the exact mathematical answer is closest to half-way between
>>>>> two representable floating-point numbers. Passing such tests is
>>>>> a necessary but not sufficient condition for a new implementation.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -Joe
>>>>>
>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>> Okay, looks reasonable to me.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> This is the link for the updated webrev with latest hotspot 
>>>>>>> source as base for your review.
>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Deshpande, Vivek R
>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>>> math lib
>>>>>>>
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> This is the link for the updated webrev for your review.
>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>>> math lib
>>>>>>>
>>>>>>> Please send link to new webrev on cr server.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> Please find the webrev with your suggested updates attached 
>>>>>>>> with the mail.
>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>> the math lib
>>>>>>>>
>>>>>>>> HI Vladimir, Joe
>>>>>>>>
>>>>>>>> I have run the jtreg tests in hotspot and the jdk tests you
>>>>>>>> mentioned. It passed those tests.
>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions
>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>>> The performance gain is 3.2x over base jdk, that is over the
>>>>>>>> current fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>
>>>>>>>> Could I get those tests around the boundary values? Would the
>>>>>>>> WorstCaseTests.java jtreg test in jdk test those?
>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>
>>>>>>>> I would work on adding either diagnostic flag or just one flag 
>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>> the math lib
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Just getting added to the thread..
>>>>>>>>
>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>
>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition 
>>>>>>>>> to Hotspot tests.
>>>>>>>>>
>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>> Hi Vladimir
>>>>>>>>>>
>>>>>>>>>> The results we obtain with LIBM are within +/- 1ulp of the
>>>>>>>>>> StrictMath result and not of the exact result. So I added the flag
>>>>>>>>>> to switch between FDLIBM and LIBM.
>>>>>>>>>>
>>>>>>>>>> Quick explanation:
>>>>>>>>>> This is what we observed with comparison to HPA Library
>>>>>>>>>> (http://www.nongnu.org/hpalib/) explained with an example.
>>>>>>>>>> LIBM Observed Math result = 0.19457293629570213 (4596178249117717083L) (StrictMath - 1ulp)
>>>>>>>>>> Required result should be = 0.19457293629570216 (4596178249117717084L) (StrictMath result)
>>>>>>>>>> or 0.1945729362957022 (4596178249117717085L) (StrictMath + 1ulp.)
>>>>>>>>>> This means HPA library result is between the above two values and Exact
>>>>>>>>>> result would be pretty close to it.
>>>>>>>>>> So here StrictMath result is less than quad-precision result,
>>>>>>>>>> Math result should be StrictMath or StrictMath + 1ulp and not
>>>>>>>>>> StrictMath - 1ulp, according to our test.
>>>>>>>>> Note, java.lang.Math allows the result to be 1ulp off (in both
>>>>>>>>> directions, I think) and it should be consistent for the
>>>>>>>>> Interpreter and code generated by JIT compilers:
>>>>>>>>>
>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>
>>>>>>>> That interpretation of the spec is not quite right. For the 
>>>>>>>> Math methods with a 1/2 ulp error bound, the floating-point 
>>>>>>>> result closest to the exact result must be returned. For the 
>>>>>>>> methods with a
>>>>>>>> 1 ulp error bound, either of the floating-point results
>>>>>>>> bracketing the true result can be returned, subject to the 
>>>>>>>> monotonicity constraints of the specification of the particular method.
>>>>>>>>
>>>>>>>>>> I have done the experiments with
>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin and
>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dcos.
>>>>>>>>>> With this option, the interpreter would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>> If we want to disable LIBM completely, we need the flags
>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>> I was thinking about using existing
>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and 
>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add 
>>>>>>>>> additional versions of functions which accept intrinsic ID 
>>>>>>>>> instead of methodHandle.
>>>>>>>>>
>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>
>>>>>>>>>> Also the performance gain ~4x is with
>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses
>>>>>>>>> LIBM code and the compilers use FDLIBM?
>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>
>>>>>>>> I'm part way through porting the FDLIBM code to Java (JDK-8134780:
>>>>>>>> Port fdlibm to Java), which is providing a significant speed 
>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>
>>>>>>>> I find the current patch *insufficient* as-is in terms of its 
>>>>>>>> testing.
>>>>>>>> For example, part of patch says
>>>>>>>>
>>>>>>>> # For sin
>>>>>>>>
>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>
>>>>>>>> # For cos
>>>>>>>>
>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>
>>>>>>>> If nothing else, there are no tests at around those boundary 
>>>>>>>> values, which is unacceptable. There should also be some tests 
>>>>>>>> of values of interest to the algorithm in question.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> -Joe
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>> Let me know your thoughts on this. I would answer more 
>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vivek
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>>> the math lib
>>>>>>>>>>
>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>
>>>>>>>>>>> It would be interesting to see what happens if you disable 
>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>
>>>>>>>>>>>          -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>> Hi Vivek,
>>>>>>>>>>
>>>>>>>>>> I want to point out that you can do this experiment later. We can
>>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>>
>>>>>>>>>> For now, please, answer my question about flags only. This is 
>>>>>>>>>> the only thing holding it from push.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>> Hi all
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to contribute a patch which optimizes
>>>>>>>>>>>> Math.sin() and
>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel LIBM
>>>>>>>>>>>>          implementation.
>>>>>>>>>>>>
>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and cos.
>>>>>>>>>>>>
>>>>>>>>>>>> The options to use the optimizations are
>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>
>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>> webrev:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Vivek
>>>>>>>>>>>>
>>

From vladimir.kozlov at oracle.com  Sat Jan  9 06:43:40 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 8 Jan 2016 22:43:40 -0800
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A569F50C1@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569E1902@ORSMSX106.amr.corp.intel.com>
	<5684A5B8.7070407@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569F23FC@ORSMSX106.amr.corp.intel.com>
	<568DB72F.6010408@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569F50C1@ORSMSX106.amr.corp.intel.com>
Message-ID: <5690AC1C.2070908@oracle.com>

Good. I sponsor it.

Thanks,
Vladimir

On 1/8/16 6:16 PM, Deshpande, Vivek R wrote:
> Hi Vladimir,
>
> I have updated the patch with the latest base source and split the macroAssembler_x86_libm.cpp file into two files for your review.
> The patch is at this location:
> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/hotspot/webrev.01/
>
> The 64-bit code does not produce less precise results or lower performance by not using FPU instructions.
>
> Thank you.
> Regards,
> Vivek
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, January 06, 2016 4:54 PM
> To: Deshpande, Vivek R; Joseph D. Darcy
> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib
>
> On 1/6/16 4:31 PM, Deshpande, Vivek R wrote:
>> Hi Vladimir,
>>
>> Yes, the macroAssembler_x86_libm.cpp file is getting large. I could look into splitting it into two files, macroAssembler_libm_x86_64.cpp and macroAssembler_libm_x86_32.cpp. Please let me know if that sounds good to you.
>
> Yes, if we keep separate code we should split the file (and adjust make files).
>
>>
>> The 64 bit code takes advantage of additional general purpose registers and 64 bit integer arithmetic and so we have two different versions for 32 bit and 64 bit.
>
> Okay, this is a valid argument. Even so, we may use push/pop on 32-bit to preserve registers.
>
>>
>> Regarding the FPU usage in cos/sin, we talked with the LIBM algorithm experts and they came back with the following:
>> "It would not be easy to remove FPU x87 instructions from libm_sincos_huge and libm_reduced_pi04l, they are designed with using extended precision from FPU in mind.  The performance for 32bit implementation for these that do not use x87 instructions may not be optimal. These two are only used for very large input arguments."
>
> I don't buy this argument. Do they mean that 64-bit code, which does not use the FPU, produces a less precise result for "very large input arguments"?
> Very large input arguments are a very rare case, I think. Should we care about their performance?
> Note, 32-bit performance becomes less and less important.
>
> Okay, for now let's split the file. Later we can try to simplify/combine/factor out the code.
>
> Thanks,
> Vladimir
>
>
>>
>> Thank you.
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Wednesday, December 30, 2015 7:49 PM
>> To: Deshpande, Vivek R; Joseph D. Darcy
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>> lib
>>
>> Hi Vivek,
>>
>> Why is the 32-bit code so different from the 64-bit code? You only use it if SSE2 is available, so XMM registers are present. Why use the FPU if you have SSE?
>>
>> 32-bit:
>>
>>     582   movsd(Address(rsp, 8), xmm0);
>>     583   fld_d(Address(rsp, 8));
>>     584   movsd(Address(rsp, 16), xmm6);
>>     585   fld_d(Address(rsp, 16));
>>     586   fmula(1);
>>
>> 64-bit:
>>
>>     295   mulsd(xmm0, xmm2);
>>
>> This applies to all LIBM 32-bit intrinsics.
>>
>> The main concern is that the macroAssembler_x86_libm.cpp file is becoming too large, and it would be nice if 32-bit and 64-bit reused the same code.
>>
>> Thanks,
>> Vladimir
>>
>> On 12/24/15 6:10 PM, Deshpande, Vivek R wrote:
>>> HI Vladimir
>>>
>>> I have updated the libm sin cos intrinsics for x86 for hotspot.
>>> The updated webrev for the same is at this location for your review.
>>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/hotspot/webrev.00/
>>> Could you please review it.
>>>
>>> Regards,
>>> Vivek
>>>
>>>
>>> -----Original Message-----
>>> From: Deshpande, Vivek R
>>> Sent: Tuesday, December 22, 2015 5:42 PM
>>> To: 'Joseph D. Darcy'; Vladimir Kozlov
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the math
>>> lib
>>>
>>> HI All
>>>
>>> I have uploaded the patch for sin and cos tests with input and allowed outputs at this location for your review.
>>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
>>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>>> Thank you.
>>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Friday, December 04, 2015 4:50 PM
>>> To: Deshpande, Vivek R; Vladimir Kozlov
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>>> lib
>>>
>>> Hi Vivek,
>>>
>>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>>> Hi
>>>>
>>>> Sure, I will add the tests. Shall I use the StrictMath result as a reference for the exact result?
>>>> Let me know your thoughts.
>>>
>>> As a rough test of another sin/cos implementation, StrictMath.{sin,
>>> cos} can be used as a reference with the following caveat: there isn't
>>> an indication of which way the error is in a StrictMath result. Let
>>> me give an example, if
>>>
>>>         StrictMath.sin(x) => y
>>>
>>> then one of the following should be true
>>>
>>>         Math.sin(x) => y
>>>         Math.sin(x) => Math.nextUp(y)
>>>         Math.sin(x) => Math.nextDown(y)
>>>
>>> That is, Math.sin(x) should either be the same as StrictMath.sin(x)
>>> OR equal to one of the floating-point numbers adjacent to that
>>> result. Of these three options, only two are allowed by the accuracy
>>> requirements of the StrictMath.sin specification. However, since
>>> StrictMath.sin doesn't give an indication of which way its error went
>>> (if it rounded up or down), there is no indication without additional
>>> work which of nextUp(y) and nextDown(y) is allowable (assuming
>>> StrictMath.sin isn't buggy).
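
The rough check described above could be sketched in Java along the following lines. The class and method names are hypothetical and are not taken from the webrev or the JDK test suite. The sketch accepts all three candidates (y, nextUp(y), nextDown(y)) and therefore cannot tell which of the two neighbours is actually within the accuracy bound.

```java
// Hypothetical helper, shown only to illustrate the rough StrictMath-based check.
final class SinReferenceCheck {

    /** True if actual equals the reference or one of its two floating-point neighbours. */
    static boolean withinOneNeighbour(double reference, double actual) {
        if (Double.isNaN(reference)) {
            return Double.isNaN(actual);
        }
        return actual == reference
            || actual == Math.nextUp(reference)
            || actual == Math.nextDown(reference);
    }

    /** Uses StrictMath.sin as the reference value for Math.sin. */
    static boolean sinLooksPlausible(double x) {
        return withinOneNeighbour(StrictMath.sin(x), Math.sin(x));
    }

    public static void main(String[] args) {
        double[] inputs = {0.0, 0.5, 1.0, Math.PI / 2, 100.0, 90112.0};
        for (double x : inputs) {
            System.out.println("x=" + x + " plausible=" + sinLooksPlausible(x));
        }
    }
}
```

A stricter test would need a higher-precision reference to decide which neighbour is allowable for a given input.
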
>>>
>>> HTH,
>>>
>>> -Joe
>>>
>>>
>>>>
>>>> Regards,
>>>> Vivek
>>>>
>>>> -----Original Message-----
>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>> Sent: Thursday, December 03, 2015 1:29 PM
>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>> math lib
>>>>
>>>> Hello,
>>>>
>>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>>> Vivek,
>>>>>
>>>>> I think Joe is asking you to write these tests as hotspot
>>>>> regression test in hotspot/test/compiler.
>>>> Exactly; if not generally applicable sin/cos tests that could be hosted in the jdk repo (alongside the regression and unit tests for java.lang.Math), then tests of intrinsics in the HotSpot repo alongside other tests targeting intrinsics.
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>>> Vladimir
>>>>>
>>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>>> Hi Joe
>>>>>>
>>>>>> It would be great if you would please share the additional tests
>>>>>> with us.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>> math lib
>>>>>>
>>>>>> I think it is unwise for this large of an implementation change to
>>>>>> be pushed with no tests targeting the specifics of the new implementation.
>>>>>>
>>>>>> The worst-case tests in the jdk repo are the mathematical worst
>>>>>> cases for floating-point approximations, in other words the cases
>>>>>> where the exact mathematical answer is closest to half-way between
>>>>>> two representable floating-point numbers. Passing such tests is
>>>>>> a necessary but not sufficient condition for a new implementation.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> -Joe
>>>>>>
>>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>>> Okay, looks reasonable to me.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev with latest hotspot
>>>>>>>> source as base for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Please send link to new webrev on cr server.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>>> Hi Vladimir
>>>>>>>>>
>>>>>>>>> Please find the webrev with your suggested updates attached
>>>>>>>>> with the mail.
>>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Deshpande, Vivek R
>>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> Hi Vladimir, Joe
>>>>>>>>>
>>>>>>>>> I have run the jtreg tests in hotspot and the tests from jdk that
>>>>>>>>> you mentioned. The patch passed those tests.
>>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>>>> The performance gain is 3.2x over base jdk, that is over the
>>>>>>>>> current fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>>
>>>>>>>>> Could I get those tests around the boundary values? Would the
>>>>>>>>> WorstCaseTests.java jtreg test in jdk test those?
>>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>>
>>>>>>>>> I would work on adding either diagnostic flag or just one flag
>>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Just getting added to the thread..
>>>>>>>>>
>>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>>
>>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition
>>>>>>>>>> to Hotspot tests.
>>>>>>>>>>
>>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>> Hi Vladimir
>>>>>>>>>>>
>>>>>>>>>>> The results we obtain with LIBM are within +/- 1ulp of the
>>>>>>>>>>> StrictMath result and not the exact result. So I added the flag
>>>>>>>>>>> to switch between FDLIBM and LIBM.
>>>>>>>>>>>
>>>>>>>>>>> Quick explanation:
>>>>>>>>>>> This is what we observed in comparison with the HPA Library
>>>>>>>>>>> (http://www.nongnu.org/hpalib/), explained with an example.
>>>>>>>>>>> LIBM observed Math result:
>>>>>>>>>>>   0.19457293629570213 (4596178249117717083L) (StrictMath - 1ulp)
>>>>>>>>>>> Required result should be:
>>>>>>>>>>>   0.19457293629570216 (4596178249117717084L) (StrictMath result) or
>>>>>>>>>>>   0.1945729362957022 (4596178249117717085L) (StrictMath + 1ulp)
>>>>>>>>>>> This means the HPA library result is between the above two values
>>>>>>>>>>> and the exact result would be pretty close to it.
>>>>>>>>>>> So here the StrictMath result is less than the quad-precision result;
>>>>>>>>>>> the Math result should be StrictMath or StrictMath + 1ulp and not
>>>>>>>>>>> StrictMath - 1ulp, according to our test.
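
To make the bit patterns in the example above easier to inspect, here is a small Java sketch that decodes the three long values quoted in the message and relates them through Math.nextUp and Math.nextDown. The class and method names are hypothetical; only the three long constants come from the message.

```java
// Hypothetical illustration: decode the raw bit patterns quoted in the example.
final class UlpNeighbours {
    static final long STRICT_MATH_BITS = 4596178249117717084L; // "StrictMath result" in the message

    static double fromBits(long bits) {
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double strict = fromBits(STRICT_MATH_BITS);
        double below = fromBits(STRICT_MATH_BITS - 1);  // quoted as "StrictMath - 1ulp"
        double above = fromBits(STRICT_MATH_BITS + 1);  // quoted as "StrictMath + 1ulp"

        System.out.println("below  = " + below);
        System.out.println("strict = " + strict);
        System.out.println("above  = " + above);
        // For positive finite doubles, adjacent bit patterns are adjacent values.
        System.out.println("below is nextDown(strict): " + (below == Math.nextDown(strict)));
        System.out.println("above is nextUp(strict):   " + (above == Math.nextUp(strict)));
    }
}
```
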
>>>>>>>>>> Note, java.lang.Math allows the result to be 1ulp off (in both
>>>>>>>>>> directions, I think) and it should be consistent for the
>>>>>>>>>> Interpreter and code generated by JIT compilers:
>>>>>>>>>>
>>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>>
>>>>>>>>> That interpretation of the spec is not quite right. For the
>>>>>>>>> Math methods with a 1/2 ulp error bound, the floating-point
>>>>>>>>> result closest to the exact result must be returned. For the
>>>>>>>>> methods with a 1 ulp error bound, either of the floating-point
>>>>>>>>> results bracketing the true result can be returned, subject to the
>>>>>>>>> monotonicity constraints of the specification of the particular method.
>>>>>>>>>
>>>>>>>>>>> I have done the experiments with
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin and
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dcos.
>>>>>>>>>>> With this option, the interpreter would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>>> If we want to disable LIBM completely, we need the flags
>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>> I was thinking about using existing
>>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and
>>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add
>>>>>>>>>> additional versions of functions which accept intrinsic ID
>>>>>>>>>> instead of methodHandle.
>>>>>>>>>>
>>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>>
>>>>>>>>>>> Also the performance gain ~4x is with
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses
>>>>>>>>>> LIBM code and compilers use FDLIBM?
>>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>>
>>>>>>>>> I'm part way through porting the FDLIBM code to Java (JDK-8134780:
>>>>>>>>> Port fdlibm to Java), which is providing a significant speed
>>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>>
>>>>>>>>> I find the current patch *insufficient* as-is in terms of its
>>>>>>>>> testing.
>>>>>>>>> For example, part of the patch says
>>>>>>>>>
>>>>>>>>> # For sin
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> # For cos
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> If nothing else, there are no tests at or around those boundary
>>>>>>>>> values, which is unacceptable. There should also be some tests
>>>>>>>>> of values of interest to the algorithm in question.
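
As a sketch of what inputs "at or around those boundary values" might look like, the following Java snippet builds the neighbours of 2^-252 and 90112 (the two limits quoted from the patch comments) in both signs. The class and method names are hypothetical, and the snippet only prints results. A real test would still need an acceptance criterion for each value.

```java
// Hypothetical input generator for the main-path boundaries quoted from the patch.
final class SinCosBoundaryInputs {

    /** Returns the boundary, its two floating-point neighbours, and their negations. */
    static double[] around(double boundary) {
        double below = Math.nextDown(boundary);
        double above = Math.nextUp(boundary);
        return new double[] {below, boundary, above, -below, -boundary, -above};
    }

    static double lowerBoundary() {
        return Math.scalb(1.0, -252); // 2^-252
    }

    static double upperBoundary() {
        return 90112.0;
    }

    public static void main(String[] args) {
        for (double boundary : new double[] {lowerBoundary(), upperBoundary()}) {
            for (double x : around(boundary)) {
                System.out.println("x=" + x
                    + " Math.sin=" + Math.sin(x) + " StrictMath.sin=" + StrictMath.sin(x)
                    + " Math.cos=" + Math.cos(x) + " StrictMath.cos=" + StrictMath.cos(x));
            }
        }
    }
}
```
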
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> -Joe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Let me know your thoughts on this. I would answer more
>>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Vivek
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>>>> the math lib
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be interesting to see what happens if you disable
>>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>>
>>>>>>>>>>>>           -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>
>>>>>>>>>>> I want to point out that you can do this experiment later. We can
>>>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>>>
>>>>>>>>>>> For now, please, answer my question about flags only. This is
>>>>>>>>>>> the only thing holding it from push.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to contribute a patch which optimizes
>>>>>>>>>>>>> Math.sin() and
>>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel LIBM
>>>>>>>>>>>>>           implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and cos.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The options to use the optimizations are
>>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vivek
>>>>>>>>>>>>>
>>>

From aph at redhat.com  Sat Jan  9 10:40:18 2016
From: aph at redhat.com (Andrew Haley)
Date: Sat, 9 Jan 2016 10:40:18 +0000
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <56902043.1040409@oracle.com>
References: <568F9852.4090806@oracle.com> <56902043.1040409@oracle.com>
Message-ID: <5690E392.9060704@redhat.com>

On 08/01/16 20:46, Vladimir Kozlov wrote:
> Looks good to me.

Maybe we're going to need changes for PPC and AArch64.  I'm wondering
if maybe we could have some sort of way to flag such changes for
maintainers of those ports.  Otherwise it's just luck that I notice
the bug going past.

Andrew.


From tobias.hartmann at oracle.com  Mon Jan 11 06:56:32 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 11 Jan 2016 07:56:32 +0100
Subject: [9] RFR(S): 8136469: OptimizeStringConcat fails on pre-sized
	StringBuilder shapes
In-Reply-To: <56900F31.4060409@oracle.com>
References: <55FBDFEC.4060405@oracle.com> <568CF8F5.5090202@oracle.com>
	<568DB2DF.4010305@oracle.com> <568EAE74.6020507@oracle.com>
	<568EBB86.1060108@oracle.com> <568F7BD4.1070000@oracle.com>
	<56900F31.4060409@oracle.com>
Message-ID: <56935220.6020301@oracle.com>

Thanks, Vladimir.

Best,
Tobias

On 08.01.2016 20:34, Vladimir Kozlov wrote:
> Very good.
> 
> Thanks,
> Vladimir
> 
> On 1/8/16 1:05 AM, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> On 07.01.2016 20:24, Vladimir Kozlov wrote:
>>> On 1/7/16 10:29 AM, Tobias Hartmann wrote:
>>>> Hi Vladimir,
>>>>
>>>> On 07.01.2016 01:35, Vladimir Kozlov wrote:
>>>>> Nope. Too many unrelated changes. If you want to go down this road, file a separate RFE to change the phase argument type of Identity() and Value().
>>>>
>>>> Okay, I agree. I filed JDK-8146629 [1].
>>>>
>>>>> And why use PhaseValue and not PhaseGVN as in Ideal()?
>>>>
>>>> Right, we can use PhaseGVN.
>>>>
>>>>> So I agree to do your change in IfNode::Identity(). But as separate fix after general change.
>>>>
>>>> Here is the updated webrev based on JDK-8146629:
>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.06/
>>>
>>> So for IGVN we wait until the dead branch is removed and only one IfProj node is left before we do this Identity optimization.
>>> And for GVN (Parse phase) we don't wait because during this phase we don't remove nodes.
>>> The comment should say something about the GVN/Parse phase to explain the !phase->is_IterGVN() condition.
>>
>> Right, I updated the comment. Does this look good to you?
>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.07
>>
>> Thanks,
>> Tobias
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8146629
>>>>
>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 1/6/16 3:22 AM, Tobias Hartmann wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I had an off-thread discussion with Roland and we came to the conclusion that all proposed fixes essentially work around the fact that we are unable to determine if Identity is called from GVN or IGVN. As Roland pointed out, we would probably forget to adapt such a fix if we ever get the ability to check for GVN/IGVN.
>>>>>>
>>>>>> Here is a more robust solution not depending on any worklist ordering assumptions and not causing unexpected side effects:
>>>>>> Since Node::Identity(PhaseTransform* phase) is always called with either PhaseGVN or PhaseIterGVN, we can change the argument to type PhaseValues* and can therefore simply use phase->is_IterGVN() to determine if we were called from GVN or IGVN. This could also be useful for other changes. Of course, this introduces an additional virtual call but we are already calling phase->is_IterGVN() at many other places in the code. In the future, these calls could be replaced by a field access (as Vladimir suggested in the RFR for 8139771).
>>>>>>
>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.05/
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Thanks,
>>>>>> Tobias
>>>>>>
>>>>>>
>>>>>> On 18.09.2015 11:57, Tobias Hartmann wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> please review the following patch.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136469
>>>>>>> http://cr.openjdk.java.net/~thartmann/8136469/webrev.00/
>>>>>>>
>>>>>>> Problem:
>>>>>>> When creating a pre-sized StringBuilder, C2's string concatenation optimization sometimes fails to optimize the chain (see [1]). The problem is that the initial size of the StringBuilder depends on a static final boolean that is initialized to true at runtime. Therefore the string concatenation control flow chain [2] contains an IfNode with a ConI (1) as input instead of the expected BoolNode and StringConcat::validate_control_flow() silently bails out.
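
For readers without access to the attachment, the code shape described in the problem statement might look roughly like the Java sketch below. This is not the attached TestPresizedStringBuilder. The class name, the system property, and the capacities are hypothetical, and whether a given JVM build actually hits the bailout depends on its compiler version and flags.

```java
// Hypothetical reproduction shape: a pre-sized StringBuilder whose capacity
// depends on a static final boolean that is only known at class-initialization time.
final class PresizedConcat {
    // Not a compile-time constant: the value is computed when the class is initialized.
    static final boolean LARGE =
        Boolean.parseBoolean(System.getProperty("presized.large", "true"));

    static String concat(String prefix, int value) {
        StringBuilder sb = new StringBuilder(LARGE ? 64 : 16);
        return sb.append(prefix).append(value).toString();
    }

    public static void main(String[] args) {
        long totalLength = 0;
        // Call often enough that the method is likely to be JIT-compiled.
        for (int i = 0; i < 100_000; i++) {
            totalLength += concat("value-", i).length();
        }
        System.out.println("total length: " + totalLength);
    }
}
```
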
>>>>>>>
>>>>>>> Solution:
>>>>>>> I changed the implementation to skip dead tests as they would be removed by IGVN later anyway. I added an assert to make sure we don't bail out silently if the input of the IfNode is not a bool. I also had to change validate_mem_flow() to handle dead ifs. Further, the assert in line 825 is unnecessary because we execute the same check in as_If().
>>>>>>>
>>>>>>> Testing:
>>>>>>> - New test (TestPresizedStringBuilder)
>>>>>>> - JPRT
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Tobias
>>>>>>>
>>>>>>> [1] https://bugs.openjdk.java.net/secure/attachment/53220/TestPresizedStringBuilder.java
>>>>>>> [2] https://bugs.openjdk.java.net/secure/attachment/53218/graph.png
>>>>>>>

From martin.doerr at sap.com  Mon Jan 11 08:39:50 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 11 Jan 2016 08:39:50 +0000
Subject: RFR(M): 8146612: C2: Precedence edges specification violated
In-Reply-To: <56901219.8090805@oracle.com>
References: <7C9B87B351A4BA4AA9EC95BB418116567228AAB8@DEWDFEMB19C.global.corp.sap>
	<568EE1A6.3050202@oracle.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228ACE8@DEWDFEMB19C.global.corp.sap>
	<56901219.8090805@oracle.com>
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228AF47@DEWDFEMB19C.global.corp.sap>

Hi Vladimir,

thanks for reviewing and sponsoring.

Best regards,
 Martin

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Freitag, 8. Januar 2016 20:47
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M): 8146612: C2: Precedence edges specification violated

Very good. I will sponsor it.

Thanks,
Vladimir

On 1/8/16 3:06 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> thanks for the review.
>
> I have changed the comments, added assertions and factored out the common functionality of del_req(), del_req_ordered() and rm_prec() into a new private function close_prec_gap_at(). That makes sense.
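
The invariant being restored here (precedence edges are unordered, not duplicated, and have no embedded NULLs) can be illustrated with a small Java model. This is only a sketch of the idea behind a gap-closing helper, not the HotSpot C++ code from the webrev. The class, its fixed-size array, and all method names are hypothetical.

```java
// Hypothetical model: slots [0, cnt) hold required inputs, slots [cnt, length)
// hold precedence edges packed without gaps, followed by nulls.
final class PrecEdges {
    private final Object[] in;
    private final int cnt;

    PrecEdges(int cnt, int max) {
        this.in = new Object[max];
        this.cnt = cnt;
    }

    /** Adds a precedence edge unless it is already present; returns true if added. */
    boolean addPrec(Object n) {
        if (n == null) {
            throw new IllegalArgumentException("null precedence edge");
        }
        for (int i = cnt; i < in.length; i++) {
            if (in[i] == n) {
                return false;            // avoid a duplicated prec edge
            }
            if (in[i] == null) {
                in[i] = n;               // first free slot, so no embedded null appears
                return true;
            }
        }
        throw new IllegalStateException("no free slot (a real implementation would grow the array)");
    }

    /** Removes the edge at index gap by moving the last non-null edge into its slot. */
    void removePrecAt(int gap) {
        if (gap < cnt || gap >= in.length) {
            throw new IndexOutOfBoundsException("not a precedence edge: " + gap);
        }
        int last = gap;
        while (last + 1 < in.length && in[last + 1] != null) {
            last++;
        }
        in[gap] = in[last];  // order does not matter, so a single move closes the gap
        in[last] = null;     // null out the last used slot
    }

    Object get(int i) {
        return in[i];
    }

    public static void main(String[] args) {
        PrecEdges edges = new PrecEdges(1, 4);
        edges.addPrec("a");
        edges.addPrec("b");
        edges.addPrec("c");
        edges.removePrecAt(1);
        System.out.println(edges.get(1) + " " + edges.get(2) + " " + edges.get(3));
    }
}
```
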
>
> Regarding your concern about accessing outside of the _in array in rm_prec():
> Please note that i is decremented before it gets used:
> if "j == _max-1", "i" will be set to "_max", but decremented in "_in[--i]".
>
> Anyway, I have replaced this code by close_prec_gap_at(), so it doesn't matter anymore.
>
> The new webrev is here:
> http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.01/
>
> Best regards,
>    Martin
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Donnerstag, 7. Januar 2016 23:08
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M): 8146612: C2: Precedence edges specification violated
>
> // Avoid spec violation: multiple prec edge.
>
> I think it should be:
>
> // Avoid spec violation: duplicated prec edge.
>
> Should we add an assert to rm_prec()?
>        assert(j >= _cnt, "not a precedence edge");
>
> Also we may need to check that input index is < _max in set_prec() and rm_prec().
>
> Next access will be outside _in array if j == _max-1 (in rm_prec()):
>
>      _in[i] = NULL;                // NULL out last element
>
> unless we guarantee that there is always NULL at the end. Which I don't see because set_prec() may set the last prec
> edge to not NULL.
>
> Please factor out similar code (search for last non-NULL prec edge) in del_req(), del_req_ordered() and rm_prec() into
> separate method.
>
> Thanks,
> Vladimir
>
>
> On 1/7/16 5:45 AM, Doerr, Martin wrote:
>> Hi,
>>
>> Some time ago we found out that C2 doesn't treat precedence edges as specified.
>>
>> The description of precedence edges in node.hpp says:
>>
>> "They are unordered and not duplicated; they have no embedded NULLs."
>>
>> Some functions in the current implementation violate this specification.
>>
>> I have fixed this in the following webrev:
>>
>> http://cr.openjdk.java.net/~mdoerr/8146612_C2_prec_edges/webrev.00/
>>
>> Please review. I will need a sponsor, please.
>>
>> Best regards,
>>
>> Martin
>>

From tobias.hartmann at oracle.com  Mon Jan 11 08:48:40 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 11 Jan 2016 09:48:40 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <568F9852.4090806@oracle.com>
References: <568F9852.4090806@oracle.com>
Message-ID: <56936C68.70002@oracle.com>

Hi Zoltan,

looks good to me. Do you think it would make sense to add a regression test running with flag combinations like -XX:-UseTLAB and -XX:+ZeroTLAB to catch the missing initialization?
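
Such a test could be sketched roughly as follows. This is only an outline: the class name is hypothetical, jtreg tags are omitted, and the flag combinations (for example -XX:-UseTLAB -XX:+ZeroTLAB) would be passed on the command line. The idea is to dirty each array after checking it, so that memory handed out again without zeroing would become visible.

```java
// Hypothetical regression-test outline: freshly allocated arrays must read as zero.
final class ZeroedAllocationCheck {

    static boolean allZero(long[] values) {
        for (long v : values) {
            if (v != 0L) {
                return false;
            }
        }
        return true;
    }

    static boolean freshAllocationsAreZeroed(int rounds, int length) {
        for (int i = 0; i < rounds; i++) {
            long[] fresh = new long[length];
            if (!allZero(fresh)) {
                return false;
            }
            // Dirty the memory so that reuse without zero-filling would be detected.
            java.util.Arrays.fill(fresh, -1L);
        }
        return true;
    }

    public static void main(String[] args) {
        // Enough rounds that the allocation path is likely to be compiled.
        if (!freshAllocationsAreZeroed(200_000, 512)) {
            throw new AssertionError("found non-zero data in a freshly allocated array");
        }
        System.out.println("all fresh allocations were zeroed");
    }
}
```
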

Best,
Tobias


On 08.01.2016 12:06, Zoltán Majó wrote:
> Hi,
> 
> 
> please review the patch for 8086053.
> 
> https://bugs.openjdk.java.net/browse/JDK-8086053
> 
> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These inconsistencies lead to newly allocated regions not being filled with zeros.
> 
> Solution: Address the following:
> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
> 
> Webrev:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
> 
> Testing:
> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
> - JPRT;
> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
> 
> Thank you and best regards,
> 
> 
> Zoltan
> 

From tobias.hartmann at oracle.com  Mon Jan 11 09:26:00 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 11 Jan 2016 10:26:00 +0100
Subject: FW: RFR(S): 6378256: Performance problem with
	System.identityHashCode in client compiler
In-Reply-To: <d8c539a2-ca51-43de-a2b7-61460fc4fbf7@default>
References: <a145a4d4-86f2-4c54-8be6-a0ff20c40a61@default>
	<d8c539a2-ca51-43de-a2b7-61460fc4fbf7@default>
Message-ID: <56937528.2080600@oracle.com>

Hi Rahul,

> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/

Why don't you use 'markOopDesc::hash_mask_in_place' for the 64 bit version? This should save some instructions and you also don't need the 'hash' register if you compute everything in 'result'.
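
A minimal sketch of the idea, in Java rather than the assembler used in the webrev: masking the hash field in place and shifting afterwards yields the same value as shifting first and masking afterwards, and the in-place mask alone is enough to tell whether a hash is present. The shift of 8 and the width of 31 bits are assumptions about the 64-bit mark word layout, and the class and method names are hypothetical.

```java
// Hypothetical model of extracting the identity hash from a 64-bit mark word.
// ASSUMPTION: the hash field is 31 bits wide and starts at bit 8.
final class MarkWordHashSketch {
    static final int HASH_SHIFT = 8;
    static final long HASH_MASK = (1L << 31) - 1;
    static final long HASH_MASK_IN_PLACE = HASH_MASK << HASH_SHIFT;

    // Variant A: shift the field down first, then mask it.
    static int hashShiftThenMask(long mark) {
        return (int) ((mark >>> HASH_SHIFT) & HASH_MASK);
    }

    // Variant B: mask the field where it sits, then shift it down.
    static int hashMaskInPlace(long mark) {
        return (int) ((mark & HASH_MASK_IN_PLACE) >>> HASH_SHIFT);
    }

    // A zero hash field means no hash has been installed yet,
    // so the caller would have to take the slow path.
    static boolean needsSlowPath(long mark) {
        return (mark & HASH_MASK_IN_PLACE) == 0;
    }

    public static void main(String[] args) {
        long mark = (0x12345678L << HASH_SHIFT) | 0x1L; // hash plus low status bits
        System.out.println(Integer.toHexString(hashShiftThenMask(mark)));
        System.out.println(Integer.toHexString(hashMaskInPlace(mark)));
        System.out.println(needsSlowPath(0x1L));
    }
}
```

Whether the in-place variant actually saves instructions depends on the generated code; the sketch only shows that the two forms agree.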

Best,
Tobias


On 08.01.2016 18:13, Rahul Raghavan wrote:
> Hello,
> 
> Please review the following revised patch for JDK-6378256 -
> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> 
> This revised webrev has the following changes -
> 
>  1) A minor optimization: return 0 right away (instead of continuing to the 'slowCase' path) for the rare case of a null reference input.
>    (As per the documentation, and confirmed by test results, it is safe to 'return 0' for a null reference input to System.identityHashCode.)
>  
>  2) Added similar Object.hashCode, System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.
> 
> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> 
> Thanks,
> Rahul
> 
> 
>> -----Original Message-----
>> From: Roland Westrelin
>> Sent: Wednesday, December 09, 2015 8:03 PM
>> To: Rahul Raghavan
>> Cc: hotspot-compiler-dev at openjdk.java.net
>>
>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
>>
>> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be
>> nice.
>> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
>>
>> Roland.
> 
> 
>> -----Original Message-----
>> From: Rahul Raghavan
>> Sent: Wednesday, December 09, 2015 2:43 PM
>> To: hotspot-compiler-dev at openjdk.java.net
>>
>> Hello,
>>
>> Please review the following patch for JDK-6378256.
>>
>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
>> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times
>> slower).
>> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
>>
>> sample unit test:
>>    public class Jdk6378256Test
>>    {
>>       public static void main(String[] args)
>>       {
>>          Object obj = new Object();
>>          long time = System.nanoTime();
>>          for(int i = 0 ; i < 1000000 ; i++)
>>             System.identityHashCode(obj);  //compare to obj.hashCode();
>>          System.out.println ("Result = " + (System.nanoTime() - time));
>>       }
>>    }
>>
>> Fix: Enabled the C1 optimization, which was previously done only for Object.hashCode, for System.identityHashCode() as well.
>> (looks in the header for the hashCode before calling into the VM).
>> Unlike Object.hashCode, System.identityHashCode is a static method and gets the object as an argument instead of as the receiver.
>> So also added required additional null check for System.identityHashCode case.
>>
>> Testing:
>>    - successful JPRT run (-testset hotspot).
>>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
>>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
>>    - Added 'noreg-perf' label for this performance bug.
>>       Manual testing done and confirmed expected performance values for unit tests with fix.
>>
>> Thanks,
>> Rahul

From tobias.hartmann at oracle.com  Mon Jan 11 10:08:31 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 11 Jan 2016 11:08:31 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <568EB3A0.3040909@oracle.com>
References: <568EB3A0.3040909@oracle.com>
Message-ID: <56937F1F.7010709@oracle.com>

FYI, I had to merge with JDK-8143353 [1] (CosDNode and SinDNode were removed).

This is the change I intend to push:
http://cr.openjdk.java.net/~thartmann/8146629/webrev.01/

Thanks,
Tobias

[1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/13b04370e8e9

On 07.01.2016 19:51, Tobias Hartmann wrote:
> Hi,
> 
> please review the following patch.
> 
> https://bugs.openjdk.java.net/browse/JDK-8146629
> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
> 
> Currently, there is no way to determine in Node::Identity() and Node::Value() if we were called from GVN or IGVN, but sometimes we would like to do optimizations based on this information (for example, see discussion in RFR for JDK-8136469 [1]). I changed the arguments of Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. This way, we can simply call PhaseValues::is_IterGVN() from both methods.
> 
> Thanks,
> Tobias
> 
> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
> 

From zoltan.majo at oracle.com  Mon Jan 11 13:10:42 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Mon, 11 Jan 2016 14:10:42 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <56902043.1040409@oracle.com>
References: <568F9852.4090806@oracle.com> <56902043.1040409@oracle.com>
Message-ID: <5693A9D2.3080009@oracle.com>

Thank you, Vladimir, for the review!

Best regards,


Zoltan

On 01/08/2016 09:46 PM, Vladimir Kozlov wrote:
> Looks good to me.
>
> Thanks,
> Vladimir
>
> On 1/8/16 3:06 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the patch for 8086053.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>
>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly 
>> allocated TLAB regions. With ZeroTLAB disabled,
>> the interpreter and compiled code should assume the responsibility to 
>> zero-fill newly allocated regions.
>> Currently, the handling of the ZeroTLAB flag shows some 
>> inconsistencies between the GC and the compilers. These
>> inconsistencies lead to newly allocated regions not being filled with 
>> zeros.
>>
>> Solution: Address the following:
>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without 
>> notifying the GC. As a result, the newly allocated
>> TLAB is not initialized with zero. Add TLAB initialization code to C1.
>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of 
>> newly allocated objects/arrays even if TLAB
>> allocation is disabled. Add stricter conditions to C2 on when to skip 
>> filling objects/arrays with zero.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>
>> Testing:
>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC 
>> and -XX:+UseSerialGC;
>> - JPRT;
>> - all hotspot tests on all platforms affected by the change using all 
>> combinations of +/-UseTLAB and +/-ZeroTLAB.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>


From zoltan.majo at oracle.com  Mon Jan 11 13:11:03 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Mon, 11 Jan 2016 14:11:03 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <56936C68.70002@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
Message-ID: <5693A9E7.3040700@oracle.com>

Hi Tobias,


On 01/11/2016 09:48 AM, Tobias Hartmann wrote:
> Hi Zoltan,
>
> looks good to me.

thank you for the feedback!

> Do you think it would make sense to add a regression test running with flag combinations like -XX:-UseTLAB and -XX:+ZeroTLAB to catch the missing initialization?

Yes, that is a good idea. I added a test that launches the VM with all 
flag combinations and also with different GCs. The test follows the same 
steps I used to reproduce the original failure.

Here is the updated webrev:
http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/

The newly added test passes on all supported platforms.
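
The flag matrix mentioned above could be enumerated roughly as follows. This is a hypothetical sketch, not the test from the webrev: it only builds the argument lists for every combination of +/-UseTLAB and +/-ZeroTLAB with the two collectors named in this thread, and the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the VM flag combinations a launcher-style
// regression test could iterate over. It does not start any VM itself.
final class TlabFlagMatrixSketch {
    static List<List<String>> flagCombinations() {
        String[] gcs = {"-XX:+UseSerialGC", "-XX:+UseG1GC"};
        String[] signs = {"+", "-"};
        List<List<String>> result = new ArrayList<>();
        for (String gc : gcs) {
            for (String useTlab : signs) {
                for (String zeroTlab : signs) {
                    result.add(Arrays.asList(
                            gc,
                            "-XX:" + useTlab + "UseTLAB",
                            "-XX:" + zeroTlab + "ZeroTLAB"));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print one candidate command line per combination.
        for (List<String> flags : flagCombinations()) {
            System.out.println("java " + String.join(" ", flags) + " <TestClass>");
        }
    }
}
```

Each list would be passed to a freshly launched VM that runs the allocation check; how the VM is launched is left out here on purpose.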

Thank you and best regards,


Zoltan

>
> Best,
> Tobias
>
>
> On 08.01.2016 12:06, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the patch for 8086053.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>
>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These inconsistencies lead to newly allocated regions not being filled with zeros.
>>
>> Solution: Address the following:
>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>
>> Testing:
>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>> - JPRT;
>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>


From zoltan.majo at oracle.com  Mon Jan 11 13:16:11 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Mon, 11 Jan 2016 14:16:11 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5690E392.9060704@redhat.com>
References: <568F9852.4090806@oracle.com> <56902043.1040409@oracle.com>
	<5690E392.9060704@redhat.com>
Message-ID: <5693AB1B.7090909@oracle.com>

Hi Andrew,


On 01/09/2016 11:40 AM, Andrew Haley wrote:
> On 08/01/16 20:46, Vladimir Kozlov wrote:
>> Looks good to me.
> Maybe we're going to need changes for PPC and AArch64.

Yes, I think you need this patch on those platforms as well.

> I'm wondering
> if maybe we could have some sort of way to flag such changes for
> maintainers of those ports.  Otherwise it's just luck that I notice
> the bug going past.

Maybe we could define a new JIRA label for this purpose. What do you 
think about that?

Also, we might need a way to signal the need to propagate changes into 
the opposite direction (i.e., from ppc/aarch64 to the other supported 
platforms).

Best regards,


Zoltan

>
> Andrew.
>


From doug.simon at oracle.com  Mon Jan 11 13:18:50 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 14:18:50 +0100
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
Message-ID: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>

The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application-submitted tasks often time out before the tasks even start compiling.
Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.

This review is for changes that address the above issues as follows: 

1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
2. A thread waiting for a compilation task to complete checks the state of the compiler thread periodically (500ms intervals). If 5 successive checks see a blocked thread, the compilation times out and the waiting thread is unblocked.
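
The second point can be sketched as follows. This is a hypothetical Java model of the waiting logic, not the CompileBroker code (which is C++); the two suppliers stand in for "the task is complete" and "the compiler thread is blocked", and all names are assumptions. The 500 ms interval and the limit of 5 successive checks are taken from the description above.

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of a waiter that polls the compiler thread's state
// instead of relying on a single fixed timeout.
final class BlockingCompileWaitSketch {
    static final int MAX_BLOCKED_CHECKS = 5;

    // Returns true if the task completed, false if the wait timed out
    // because the compiler thread looked blocked on 5 successive checks.
    static boolean waitForCompletion(BooleanSupplier taskDone,
                                     BooleanSupplier compilerBlocked,
                                     long pollMillis) {
        int successiveBlocked = 0;
        while (!taskDone.getAsBoolean()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // treat interruption like a timeout
            }
            if (compilerBlocked.getAsBoolean()) {
                if (++successiveBlocked >= MAX_BLOCKED_CHECKS) {
                    return false; // no progress seen: give up and unblock the waiter
                }
            } else {
                successiveBlocked = 0; // progress observed, start counting again
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A compiler that is always blocked and never finishes: the wait gives up.
        System.out.println(waitForCompletion(() -> false, () -> true, 500L));
    }
}
```

Resetting the counter whenever the thread is seen running is what distinguishes "blocked for too long" from "slow but making progress".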

https://bugs.openjdk.java.net/browse/JDK-8146705
http://cr.openjdk.java.net/~dnsimon/8146705/

-Doug

From tobias.hartmann at oracle.com  Mon Jan 11 13:20:33 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 11 Jan 2016 14:20:33 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5693A9E7.3040700@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com>
Message-ID: <5693AC21.5070304@oracle.com>

Hi Zoltan,

looks good to me!

Best,
Tobias

On 11.01.2016 14:11, Zoltán Majó wrote:
> Hi Tobias,
> 
> 
> On 01/11/2016 09:48 AM, Tobias Hartmann wrote:
>> Hi Zoltan,
>>
>> looks good to me.
> 
> thank you for the feedback!
> 
>> Do you think it would make sense to add a regression test running with flag combinations like -XX:-UseTLAB and -XX:+ZeroTLAB to catch the missing initialization?
> 
> Yes, that is a good idea. I added a test that launches the VM with all flag combinations and also with different GCs. The test follows the same steps I used to reproduce the original failure.
> 
> Here is the updated webrev:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
> 
> The newly added test passes on all supported platforms.
> 
> Thank you and best regards,
> 
> 
> Zoltan
> 
>>
>> Best,
>> Tobias
>>
>>
>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>> Hi,
>>>
>>>
>>> please review the patch for 8086053.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>
>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These inconsistencies lead to newly allocated regions not being filled with zeros.
>>>
>>> Solution: Address the following:
>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>
>>> Testing:
>>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>>> - JPRT;
>>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
> 

From aph at redhat.com  Mon Jan 11 13:31:46 2016
From: aph at redhat.com (Andrew Haley)
Date: Mon, 11 Jan 2016 13:31:46 +0000
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5693AB1B.7090909@oracle.com>
References: <568F9852.4090806@oracle.com> <56902043.1040409@oracle.com>
	<5690E392.9060704@redhat.com> <5693AB1B.7090909@oracle.com>
Message-ID: <5693AEC2.70409@redhat.com>

Hi,

On 01/11/2016 01:16 PM, Zoltán Majó wrote:
> On 01/09/2016 11:40 AM, Andrew Haley wrote:
>> On 08/01/16 20:46, Vladimir Kozlov wrote:
>>> Looks good to me.
>> Maybe we're going to need changes for PPC and AArch64.
> 
> Yes, I think you need this patch on those platforms as well.
> 
>> I'm wondering
>> if maybe we could have some sort of way to flag such changes for
>> maintainers of those ports.  Otherwise it's just luck that I notice
>> the bug going past.
> 
> Maybe we could define a new JIRA label for this purpose. What do you 
> think about that?

That sounds like it might work.

> Also, we might need a way to signal the need to propagate changes into 
> the opposite direction (i.e., from ppc/aarch64 to the other supported 
> platforms).

Maybe so.  That hasn't happened yet, though.  The symmetry appeals to me.

Andrew.


From zoltan.majo at oracle.com  Mon Jan 11 13:37:33 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Mon, 11 Jan 2016 14:37:33 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5693A9E7.3040700@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com>
Message-ID: <5693B01D.60604@oracle.com>

Hi,


On 01/11/2016 02:11 PM, Zoltán Majó wrote:
> [...]
> Yes, that is a good idea. I added a test that launches the VM with all 
> flag combinations and also with different GCs. The test follows the same 
> steps I used to reproduce the original failure.
>
> Here is the updated webrev:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/

The test contains an unnecessary @library tag and package import. The 
year in the copyright statement must be changed as well (to 2016).

Here is the webrev with those changes:
http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/

Sorry for the noise.

Thank you and best regards,


Zoltan


>
> The newly added test passes on all supported platforms.
>
> Thank you and best regards,
>
>
> Zoltan
>
>>
>> Best,
>> Tobias
>>
>>
>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>> Hi,
>>>
>>>
>>> please review the patch for 8086053.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>
>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill 
>>> newly allocated TLAB regions. With ZeroTLAB disabled, the 
>>> interpreter and compiled code should assume the responsibility to 
>>> zero-fill newly allocated regions.
>>> Currently, the handling of the ZeroTLAB flag shows some 
>>> inconsistencies between the GC and the compilers. These 
>>> inconsistencies lead to newly allocated regions not being filled 
>>> with zeros.
>>>
>>> Solution: Address the following:
>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB 
>>> without notifying the GC. As a result, the newly allocated TLAB is 
>>> not initialized with zero. Add TLAB initialization code to C1.
>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of 
>>> newly allocated objects/arrays even if TLAB allocation is disabled. 
>>> Add stricter conditions to C2 on when to skip filling objects/arrays 
>>> with zero.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>
>>> Testing:
>>> - local testing (linux_x86_64) of failing test case with 
>>> -XX:+UseG1GC and -XX:+UseSerialGC;
>>> - JPRT;
>>> - all hotspot tests on all platforms affected by the change using 
>>> all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>


From zoltan.majo at oracle.com  Mon Jan 11 13:40:18 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Mon, 11 Jan 2016 14:40:18 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5693AC21.5070304@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693AC21.5070304@oracle.com>
Message-ID: <5693B0C2.8090908@oracle.com>

Hi Tobias,


On 01/11/2016 02:20 PM, Tobias Hartmann wrote:
> Hi Zoltan,
>
> looks good to me!

thank you for the review!

Best regards,


Zoltan

>
> Best,
> Tobias
>
> On 11.01.2016 14:11, Zoltán Majó wrote:
>> Hi Tobias,
>>
>>
>> On 01/11/2016 09:48 AM, Tobias Hartmann wrote:
>>> Hi Zoltan,
>>>
>>> looks good to me.
>> thank you for the feedback!
>>
>>> Do you think it would make sense to add a regression test running with flag combinations like -XX:-UseTLAB and -XX:+ZeroTLAB to catch the missing initialization?
>> Yes, that is a good idea. I added a test that launches the VM with all flag combinations and also with different GCs. The test follows the same steps I used to reproduce the original failure.
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>>
>> The newly added test passes on all supported platforms.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>> Best,
>>> Tobias
>>>
>>>
>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>> please review the patch for 8086053.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>
>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>>>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These inconsistencies lead to newly allocated regions not being filled with zeros.
>>>>
>>>> Solution: Address the following:
>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>
>>>> Testing:
>>>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>>>> - JPRT;
>>>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>


From doug.simon at oracle.com  Mon Jan 11 14:05:03 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 15:05:03 +0100
Subject: RFR: 8146788: remove jvmci.jar from mx suite
Message-ID: <A33CE13D-EA4A-480D-82EB-533200B3E31F@oracle.com>

Please review this small change to remove generation of a jvmci.jar by the mx JVMCI build system.

https://bugs.openjdk.java.net/browse/JDK-8146788
http://cr.openjdk.java.net/~dnsimon/8146788/

-Doug

From roland.westrelin at oracle.com  Mon Jan 11 15:07:55 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 11 Jan 2016 16:07:55 +0100
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
Message-ID: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>

http://cr.openjdk.java.net/~roland/8146792/webrev.00/

- partial peeling is applied to a loop 
- the peeled section is optimized and leaves a pinned node between the loop predicates and the loop body but no control flow 
- loop predicates are applied and a predicate that depends on the pinned node is moved out of the loop, before the pinned node, leading to a broken graph 

This is the same issue that came up during review of 8139771. Vladimir suggested it gets reviewed separately. With the included test case it reproduces without the change from 8139771.

Roland.

From tobias.hartmann at oracle.com  Mon Jan 11 15:20:31 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 11 Jan 2016 16:20:31 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <56901101.6050503@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com>
Message-ID: <5693C83F.9030100@oracle.com>


On 08.01.2016 20:41, Vladimir Kozlov wrote:
> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>
>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>> Hi Vladimir,
>>>>
>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>> Andrew is right.
>>>>
>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>
>>> Right. It was the root of this bug, see below.
>>>
>>>>
>>>> I fixed this for the inflate and compress intrinsics.
>>>>
>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>>> StrInflatedCopyNode is not memory node.
>>>>
>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>
>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>
>> I meant with webrev.01 but you answered my question below.
>>
>>>>    // This class defines a projection of the memory  state of a store conditional node.
>>>>    // These nodes return a value, but also update memory.
>>>>
>>>> But inflate does not return any value.
>>>
>>> Hmm, according to its bottom type, inflate produces memory:
>>>
>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>
>>> So it really does not need SCMemProjNode. Sorry about that.
>>> So the load was a LoadUS, which is a char load, and originally the memory slice of inflate was incorrectly BYTES.
>>
>> Exactly.
>>
>>> Instead of SCMemProjNode we should change the idx of your dst_type:
>>>
>>> set_memory(str, dst_type);
>>
>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>
>>> And you should roll back part of the changes in escape.cpp and macro.cpp.
>>
>> Okay, I'll do that.
>>
>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>
>>> In general, when src & dst arrays have different types, we may need to use TypeOopPtr::BOTTOM to prevent related stores & loads from bypassing these copy nodes.
>>
>> Okay, should we then use BOTTOM for both the input and output type?
> 
> Only input. Output type corresponds to dst array type which you set correctly now.

It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
 StoreC
 inflate_string
 LoadC

The memory graph (def->use) now looks like this:
 LoadC -> inflate_string -> ByteMem
               ... StoreC-> CharMem


The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
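
At the Java level, the store / inflate / load pattern could look like the sketch below. This is a hypothetical illustration and not the code of 'TestStringIntrinsicMemoryFlow::testInflate2'; it uses only public String APIs, and whether the copy is actually compiled to the inflate intrinsic is an assumption that depends on the JDK build and on compact strings being enabled. Regardless of how it is compiled, the final load has to observe the earlier char store.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a char store, a byte[]-to-char[] copy into the
// same array, and a char load that must still see the stored value.
final class InflateOrderingSketch {
    static char storeInflateLoad(byte[] src, char[] dst) {
        dst[0] = 'x';                                 // char store
        String latin1 = new String(src, StandardCharsets.ISO_8859_1);
        latin1.getChars(0, latin1.length(), dst, 1);  // copy into dst[1..]
        return dst[0];                                // char load, must be 'x'
    }

    public static void main(String[] args) {
        byte[] src = {65, 66};
        char[] dst = new char[3];
        // Repeat the call so that a JIT compiler gets a chance to compile it.
        for (int i = 0; i < 20_000; i++) {
            if (storeInflateLoad(src, dst) != 'x') {
                throw new AssertionError("load did not observe the store");
            }
        }
        System.out.println("ok");
    }
}
```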

Setting the input to BOTTOM generates the following graph:
http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.

I adapted the fix to emit a MergeMemNode to capture the entire memory state as input to the intrinsic. The graph then looks like this:
 LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png

Here is the new webrev:
http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/

Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?

Best,
Tobias

>>>> Related question:
>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>    char[int:>=0]:exact+any *
>>>>
>>>> which is equal to the type of the char load.
>>>
>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>
>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>
>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>
>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>
>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>
>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>
>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>    char[int:1]:NotNull:exact *
>>>>
>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>
>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>
>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>
>> I will look into this again and try to understand what happens.
> 
> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
> 
> Thanks,
> Vladimir
> 
>>
>> Thanks,
>> Tobias
>>
>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>
>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>> array copys in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>
>>>>>>>     // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>     // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>     // compiler because the CPU does all the ordering for us.
>>>>>>>
>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>
>>>>>>> Or did I misunderstand your question?
>>>>>>
>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>> read-after-write dependency, but never mind.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Andrew.
>>>>>>

From roland.westrelin at oracle.com  Mon Jan 11 15:36:52 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 11 Jan 2016 16:36:52 +0100
Subject: RFR(M): 8145322: Code generated from unsafe loops can be slightly
	improved
In-Reply-To: <5670C098.1030301@oracle.com>
References: <AC20CF94-2BC1-4D73-8F23-2AB51104705C@oracle.com>
	<566F8177.8080000@oracle.com>
	<6BBA85D7-71DE-43AD-9DA9-CA97FF99F73D@oracle.com>
	<5670C098.1030301@oracle.com>
Message-ID: <13ADAC9C-2611-410B-A0F9-A8662C4F58BD@oracle.com>

Thanks for the review, Vladimir and Tobias.

Roland.

> On Dec 16, 2015, at 2:38 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Very nice!
> 
> You may need to change the code in castnode.cpp according to the new changes in 8145096 if they are pushed first (not yet).
> And also for 32-bit, as Tobias pointed out.
> 
> Thanks,
> Vladimir
> 
> On 12/15/15 12:55 AM, Roland Westrelin wrote:
>> Hi Vladimir,
>> 
>> Thanks for looking at this.
>> 
>>> The second assembler output still has intermediate increments and also new movslq instructions. Why should it be better?
>> 
>> I think there is some confusion here. There are 2 problems I'd like to fix. One is when using checkIndex. In that case, the code should be as good as regular array accesses. The first assembly dump shows it's not. The second problem is that when not using checkIndex but we know the loop bounds, we should be able to do better. That's the second assembly dump. In my email I only showed assembly without my change. With my change:
>> 
>> first test case:
>> 
>> 0c2   B11: #    B37 B12 <- B8 B10       Loop: B11-B10 inner main of N142 Freq: 975.841
>> 0c2     movq    RAX, [RSI + #16 + RDI << #3]    # long
>> 0c7     movq    RBX, [R9 + #16 + RDI << #3]     # long
>> 0cc     cmpq    RBX, RAX
>> 0cf     jne     B37  P=0.000000 C=7836.000000
>> 0cf
>> 0d5   B12: #    B38 B13 <- B11  Freq: 975.84
>> 0d5     movq    RAX, [RSI + #24 + RDI << #3]    # long
>> 0da     movq    RBX, [R9 + #24 + RDI << #3]     # long
>> 0df     cmpq    RBX, RAX
>> 0e2     jne     B38  P=0.000000 C=7836.000000
>> 0e2
>> 0e8   B13: #    B40 B14 <- B12  Freq: 975.84
>> 0e8     movq    RAX, [RSI + #32 + RDI << #3]    # long
>> 0ed     movq    RBX, [R9 + #32 + RDI << #3]     # long
>> 0f2     cmpq    RBX, RAX
>> 0f5     jne     B40  P=0.000000 C=7836.000000
>> 0f5
>> 0fb   B14: #    B42 B15 <- B13  Freq: 975.84
>> 0fb     movq    RAX, [RSI + #40 + RDI << #3]    # long
>> 100     movq    RBX, [R9 + #40 + RDI << #3]     # long
>> 105     cmpq    RBX, RAX
>> 108     jne     B42  P=0.000000 C=7836.000000
>> 108
>> 10e   B15: #    B44 B16 <- B14  Freq: 975.839
>> 10e     movq    RAX, [RSI + #48 + RDI << #3]    # long
>> 113     movq    RBX, [R9 + #48 + RDI << #3]     # long
>> 118     movl    RDX, RDI        # spill
>> 11a     addl    RDX, #4 # int
>> 11d     cmpq    RBX, RAX
>> 120     jne     B44  P=0.000000 C=7836.000000
>> 120
>> 126   B16: #    B39 B17 <- B15  Freq: 975.839
>> 126     movq    RAX, [RSI + #56 + RDI << #3]    # long
>> 12b     movq    RBX, [R9 + #56 + RDI << #3]     # long
>> 130     cmpq    RBX, RAX
>> 133     jne     B39  P=0.000000 C=7836.000000
>> 133
>> 139   B17: #    B41 B18 <- B16  Freq: 975.838
>> 139     movq    RAX, [RSI + #64 + RDI << #3]    # long
>> 13e     movq    RBX, [R9 + #64 + RDI << #3]     # long
>> 143     cmpq    RBX, RAX
>> 146     jne     B41  P=0.000000 C=7836.000000
>> 146
>> 14c   B18: #    B43 B19 <- B17  Freq: 975.838
>> 14c     movq    RAX, [RSI + #72 + RDI << #3]    # long
>> 151     movq    RBX, [R9 + #72 + RDI << #3]     # long
>> 156     cmpq    RBX, RAX
>> 159     jne     B43  P=0.000000 C=7836.000000
>> 159
>> 15f   B19: #    B10 B20 <- B18  Freq: 975.837
>> 15f     movl    RDX, RDI        # spill
>> 161     addl    RDX, #8 # int
>> 164     cmpl    RDX, RBP
>> 166     jl     B10      # loop end  P=0.998980 C=7836.000000
>> 
>> 
>> 
>> second test case:
>> 
>> 0a3   B7: #     B32 B8 <- B6 B15        Loop: B7-B15 inner main of N123 Freq: 975.843
>> 0a3     movq    RDI, [RBP + #16 + RSI << #3]    # long
>> 0a8     movq    RAX, [RDX + #16 + RSI << #3]    # long
>> 0ad     cmpq    RAX, RDI
>> 0b0     jne     B32  P=0.000000 C=7836.000000
>> 0b0
>> 0b6   B8: #     B33 B9 <- B7  Freq: 975.842
>> 0b6     movq    RDI, [RBP + #24 + RSI << #3]    # long
>> 0bb     movq    RAX, [RDX + #24 + RSI << #3]    # long
>> 0c0     cmpq    RAX, RDI
>> 0c3     jne     B33  P=0.000000 C=7836.000000
>> 0c3
>> 0c9   B9: #     B35 B10 <- B8  Freq: 975.842
>> 0c9     movq    RDI, [RBP + #32 + RSI << #3]    # long
>> 0ce     movq    RAX, [RDX + #32 + RSI << #3]    # long
>> 0d3     cmpq    RAX, RDI
>> 0d6     jne     B35  P=0.000000 C=7836.000000
>> 0d6
>> 0dc   B10: #    B39 B11 <- B9  Freq: 975.842
>> 0dc     movq    RDI, [RBP + #40 + RSI << #3]    # long
>> 0e1     movq    RAX, [RDX + #40 + RSI << #3]    # long
>> 0e6     cmpq    RAX, RDI
>> 0e9     jne     B39  P=0.000000 C=7836.000000
>> 0e9
>> 0ef   B11: #    B38 B12 <- B10  Freq: 975.841
>> 0ef     movq    RDI, [RBP + #48 + RSI << #3]    # long
>> 0f4     movq    RAX, [RDX + #48 + RSI << #3]    # long
>> 0f9     movl    R8, RSI # spill
>> 0fc     addl    R8, #4  # int
>> 100     cmpq    RAX, RDI
>> 103     jne     B38  P=0.000000 C=7836.000000
>> 103
>> 109   B12: #    B34 B13 <- B11  Freq: 975.841
>> 109     movq    RDI, [RBP + #56 + RSI << #3]    # long
>> 10e     movq    RAX, [RDX + #56 + RSI << #3]    # long
>> 113     cmpq    RAX, RDI
>> 116     jne     B34  P=0.000000 C=7836.000000
>> 116
>> 11c   B13: #    B36 B14 <- B12  Freq: 975.84
>> 11c     movq    RDI, [RBP + #64 + RSI << #3]    # long
>> 121     movq    RAX, [RDX + #64 + RSI << #3]    # long
>> 126     cmpq    RAX, RDI
>> 129     jne     B36  P=0.000000 C=7836.000000
>> 129
>> 12f   B14: #    B38 B15 <- B13  Freq: 975.84
>> 12f     movq    RDI, [RBP + #72 + RSI << #3]    # long
>> 134     movq    RAX, [RDX + #72 + RSI << #3]    # long
>> 139     movl    R8, RSI # spill
>> 13c     addl    R8, #7  # int
>> 140     cmpq    RAX, RDI
>> 143     jne     B38  P=0.000000 C=7836.000000
>> 143
>> 149   B15: #    B7 B16 <- B14  Freq: 975.839
>> 149     addl    RSI, #8 # int
>> 14c     cmpl    RSI, R11
>> 14f     jl     B7       # loop end  P=0.998980 C=7836.000000
>> 
>> Roland.
>> 
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 12/14/15 8:42 AM, Roland Westrelin wrote:
>>>> http://cr.openjdk.java.net/~roland/8145322/webrev.00/
>>>> 
>>>> Paul spotted the following small inefficiencies:
>>>> 
>>>>         for (; wi < l; wi++) {
>>>>             long bi = ((long) Objects.checkIndex(wi, l, null)) << LOG2_ARRAY_LONG_INDEX_SCALE;
>>>>             long av = U.getLongUnaligned(a, aOffset + bi);
>>>>             long bv = U.getLongUnaligned(b, bOffset + bi);
>>>>             if (av != bv) {
>>>> 
>>>> is compiled to:
>>>> 
>>>> 0b0 B9: # B28 B10 <- B8 B13 Loop: B9-B13 inner main of N130 Freq: 977.661
>>>> 0b0 movl RDX, RDI # spill
>>>> 0b2 # castII of RDX
>>>> 0b2 movq RBX, [R9 + #16 + RDX << #3] # long
>>>> 0b7 movq RAX, [RSI + #16 + RDX << #3] # long
>>>> 0bc cmpq RBX, RAX
>>>> 0bf jne B28 P=0.000000 C=7836.000000
>>>> 0bf
>>>> 0c5 B10: # B28 B11 <- B9 Freq: 977.66
>>>> 0c5 movl RDX, RDI # spill
>>>> 0c7 incl RDX # int
>>>> 0c9 # castII of RDX
>>>> 0c9 movq RBX, [R9 + #16 + RDX << #3] # long
>>>> 0ce movq RAX, [RSI + #16 + RDX << #3] # long
>>>> 0d3 cmpq RBX, RAX
>>>> 0d6 jne B28 P=0.000000 C=7836.000000
>>>> 0d6
>>>> 0dc B11: # B28 B12 <- B10 Freq: 977.66
>>>> 0dc movl RDX, RDI # spill
>>>> 0de addl RDX, #2 # int
>>>> 0e1 # castII of RDX
>>>> 0e1 movq RBX, [R9 + #16 + RDX << #3] # long
>>>> 0e6 movq RAX, [RSI + #16 + RDX << #3] # long
>>>> 0eb cmpq RBX, RAX
>>>> 0ee jne B28 P=0.000000 C=7836.000000
>>>> 0ee
>>>> 0f4 B12: # B28 B13 <- B11 Freq: 977.659
>>>> 0f4 movl RDX, RDI # spill
>>>> 0f6 addl RDX, #3 # int
>>>> 0f9 # castII of RDX
>>>> 0f9 movq RBX, [R9 + #16 + RDX << #3] # long
>>>> 0fe movq RAX, [RSI + #16 + RDX << #3] # long
>>>> 103 cmpq RBX, RAX
>>>> 106 jne B28 P=0.000000 C=7836.000000
>>>> 106
>>>> 10c B13: # B9 B14 <- B12 Freq: 977.659
>>>> 10c addl RDI, #4 # int
>>>> 10f cmpl RDI, RBP
>>>> 111 jl,s B9 # loop end P=0.998980 C=7836.000000
>>>> 
>>>> But the intermediate increment of the induction variable:
>>>> 0c7 incl RDX # int
>>>> 0de addl RDX, #2 # int
>>>> 0f6 addl RDX, #3 # int
>>>> 
>>>> should be folded into the address computation of the memory accesses: ConvI2L(AddI(x, y)) should be converted to AddL(ConvI2L(x), ConvI2L(y)) but there's a CastII from the checkIndex between the AddI and the ConvI2L so we first need to push the CastII through the AddI. That's the first CastIINode::Ideal transformation. If we apply that transformation we then have several CastII that only differ by their type so we need the second transformation of CastIINode::Ideal so all of them fold after loop opts.
>>>> 
>>>>         for (; wi < length >> valuesPerWidth; wi++) {
>>>>             long bi = ((long) wi) << LOG2_ARRAY_LONG_INDEX_SCALE;
>>>>             long av = U.getLongUnaligned(a, aOffset + bi);
>>>>             long bv = U.getLongUnaligned(b, bOffset + bi);
>>>>             if (av != bv) {
>>>> 
>>>> 0b0 B7: # B32 B8 <- B6 B15 Loop: B7-B15 inner main of N123 Freq: 975.843
>>>> 0b0 movslq R8, RSI # i2l
>>>> 0b3 movq RAX, [RDX + #16 + R8 << #3] # long
>>>> 0b8 movq RDI, [RBP + #16 + R8 << #3] # long
>>>> 0bd cmpq RAX, RDI
>>>> 0c0 jne B32 P=0.000000 C=7836.000000
>>>> 0c0
>>>> 0c6 B8: # B33 B9 <- B7 Freq: 975.842
>>>> 0c6 movl R8, RSI # spill
>>>> 0c9 incl R8 # int
>>>> 0cc movslq RDI, R8 # i2l
>>>> 0cf movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 0d4 movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 0d9 cmpq RAX, RDI
>>>> 0dc jne B33 P=0.000000 C=7836.000000
>>>> 0dc
>>>> 0e2 B9: # B33 B10 <- B8 Freq: 975.842
>>>> 0e2 movl R8, RSI # spill
>>>> 0e5 addl R8, #2 # int
>>>> 0e9 movslq RDI, R8 # i2l
>>>> 0ec movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 0f1 movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 0f6 cmpq RAX, RDI
>>>> 0f9 jne B33 P=0.000000 C=7836.000000
>>>> 0f9
>>>> 0ff B10: # B33 B11 <- B9 Freq: 975.842
>>>> 0ff movl R8, RSI # spill
>>>> 102 addl R8, #3 # int
>>>> 106 movslq RDI, R8 # i2l
>>>> 109 movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 10e movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 113 cmpq RAX, RDI
>>>> 116 jne B33 P=0.000000 C=7836.000000
>>>> 116
>>>> 11c B11: # B33 B12 <- B10 Freq: 975.841
>>>> 11c movl R8, RSI # spill
>>>> 11f addl R8, #4 # int
>>>> 123 movslq RDI, R8 # i2l
>>>> 126 movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 12b movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 130 cmpq RAX, RDI
>>>> 133 jne B33 P=0.000000 C=7836.000000
>>>> 133
>>>> 139 B12: # B33 B13 <- B11 Freq: 975.841
>>>> 139 movl R8, RSI # spill
>>>> 13c addl R8, #5 # int
>>>> 140 movslq RDI, R8 # i2l
>>>> 143 movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 148 movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 14d cmpq RAX, RDI
>>>> 150 jne B33 P=0.000000 C=7836.000000
>>>> 150
>>>> 156 B13: # B33 B14 <- B12 Freq: 975.84
>>>> 156 movl R8, RSI # spill
>>>> 159 addl R8, #6 # int
>>>> 15d movslq RDI, R8 # i2l
>>>> 160 movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 165 movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 16a cmpq RAX, RDI
>>>> 16d jne B33 P=0.000000 C=7836.000000
>>>> 16d
>>>> 173 B14: # B33 B15 <- B13 Freq: 975.84
>>>> 173 movl R8, RSI # spill
>>>> 176 addl R8, #7 # int
>>>> 17a movslq RDI, R8 # i2l
>>>> 17d movq RAX, [RDX + #16 + RDI << #3] # long
>>>> 182 movq RDI, [RBP + #16 + RDI << #3] # long
>>>> 187 cmpq RAX, RDI
>>>> 18a jne B33 P=0.000000 C=7836.000000
>>>> 18a
>>>> 190 B15: # B7 B16 <- B14 Freq: 975.839
>>>> 190 addl RSI, #8 # int
>>>> 193 cmpl RSI, R11
>>>> 196 jl B7 # loop end P=0.998980 C=7836.000000
>>>> 
>>>> Same as above, the intermediate increment of the induction variable should fold into the address computation but ConvI2L(AddI(x, y)) -> AddL(ConvI2L(x), ConvI2L(y)) is not applied because the compiler loses track of the bounds of the induction variable. The i2l conversions should also fold into the address computations but they don't for the same reason. The change in loopnode.cpp tries to work around the problem by capturing the bounds of the loop as soon as the CountedLoop is created and before other transformations applied to the loop make it much harder for the compiler to figure the bounds out. I also relaxed the Phi type computation in PhiNode::Value().
>>>> 
>>>> I hit a couple unrelated bugs during testing: the fix in x86_64.ad is obvious. The change to superword is because we sometimes end up there with an AddL while, as I understand, we only expect integer nodes. Using the AddL leads to broken graphs.
>>>> 
>>>> Roland.
>>>> 
>> 
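
A minimal Java sketch of why the loop bounds matter for the ConvI2L(AddI(x, y)) -> AddL(ConvI2L(x), ConvI2L(y)) rewrite discussed above. The class and method names are made up for illustration and are not part of the webrev; the sketch only shows the arithmetic fact that the two forms agree when the int addition cannot overflow, which is what the compiler has to prove from the induction variable's range.

```java
public class ConvI2LSketch {
    // Source-level shape: add in int, then widen (ConvI2L(AddI(i, k))).
    static long widenAfterAdd(int i, int k) {
        return (long) (i + k);
    }

    // Rewritten shape: widen each operand, then add (AddL(ConvI2L(i), ConvI2L(k))).
    static long addAfterWiden(int i, int k) {
        return (long) i + (long) k;
    }

    public static void main(String[] args) {
        // Within known bounds the int addition cannot overflow, so both forms agree.
        System.out.println(widenAfterAdd(10, 3) == addAfterWiden(10, 3)); // true

        // Without a bound on i, the int addition may wrap and the forms differ.
        System.out.println(widenAfterAdd(Integer.MAX_VALUE, 1)); // -2147483648
        System.out.println(addAfterWiden(Integer.MAX_VALUE, 1)); // 2147483648
    }
}
```

Once the rewrite is legal, the constant part of the sum can presumably be absorbed into the displacement of the addressing mode, which matches the #16, #24, #32, ... offsets in the assembly shown with the change applied.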


From roland.westrelin at oracle.com  Mon Jan 11 15:54:35 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 11 Jan 2016 16:54:35 +0100
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <CAHjP37GKBXBk1e0NSByAxxm8A04i6=TD2uzSiy3KyTNDuCxgEA@mail.gmail.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
	<568D7FA1.4040707@oracle.com>
	<CAHjP37Fc7wAj24b5np5Yh9m6OKDmoh3-_TvrEtja=UMOsr73bg@mail.gmail.com>
	<1BC8C0B0-E8EF-4D6B-B9EE-D374E2FC3E04@oracle.com>
	<CAHjP37GKBXBk1e0NSByAxxm8A04i6=TD2uzSiy3KyTNDuCxgEA@mail.gmail.com>
Message-ID: <FD41E4F1-ABC6-49D4-9CA3-B9972C56C662@oracle.com>

> As a general comment, would it make sense to assume exceptional paths are not taken in most Java code? That is, for code optimization purposes it's probably a reasonable assumption.  It seems like having an exceptional path is already a hint that it's not expected to fail; most Java devs know not to use exceptions for expected control flow.

That sounds reasonable. There's a BailoutToInterpreterForThrows command line argument that does that (off by default, not available in product builds). I don't know what the history behind it is.

> Could bytecode shape just like checkIndex be treated as same hint? Are there cases where something looks like checkIndex but really isn't? 

That sounds like a good suggestion. We would trade:
2 comparisons: i < 0 || i >= length
for
2 comparisons: length < 0 || i >=u length 

so even if it doesn't result in further improvements, we wouldn't lose anything.

Roland.
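
A small Java sketch of the trade described above, assuming the usual int semantics; the class and method names are hypothetical. It compares the conventional signed range check with the form that uses one signed test on the length and one unsigned comparison (written i >=u length above), and checks that the two agree on a handful of boundary values.

```java
public class RangeCheckSketch {
    // Conventional form: two signed comparisons.
    static boolean outOfBoundsSigned(int i, int length) {
        return i < 0 || i >= length;
    }

    // Alternative form: a signed test on length plus one unsigned comparison.
    static boolean outOfBoundsUnsigned(int i, int length) {
        return length < 0 || Integer.compareUnsigned(i, length) >= 0;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 7, 8, Integer.MAX_VALUE};
        for (int i : samples) {
            for (int length : samples) {
                if (outOfBoundsSigned(i, length) != outOfBoundsUnsigned(i, length)) {
                    throw new AssertionError("mismatch for i=" + i + ", length=" + length);
                }
            }
        }
        System.out.println("both forms agree on all samples");
    }
}
```

The second form is attractive when the length is already known to be non-negative (an array length, for example), because the length < 0 test can then be dropped and a single unsigned comparison remains.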


> 
> Roland.
> 
> >
> > On Wednesday, January 6, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> > Note, we already have range check pattern matching code in C2 (thanks to Roland):
> >
> > https://bugs.openjdk.java.net/browse/JDK-8137168
> >
> > Vladimir
> >
> > On 1/6/16 12:39 PM, Vitaly Davidovich wrote:
> > I don't think there's a need to write out 20 different ways to do a
> > range check -- I think nobody would expect all 20 to be covered by the
> > optimizer.  Some of those variations may not map cleanly to
> > Object::checkIndex either, nor is there any guarantee that people will
> > update all their existing range checks (or even know about) to use
> > Object::checkIndex -- some code will be left unoptimized no matter what.
> >
> > But my point is the same as Andrew's, I think; instead of making
> > checkIndex an intrinsic, simply add a pattern match against that exact
> > bytecode shape (perhaps with basic canonicalization) and then still
> > encourage people to use Object::checkIndex.  This is better than
> > intrinsic (modulo profile pollution) since any other code that happens
> > to use same pattern will match as well, and not require an update to use
> > checkIndex.  Then, if someone comes to this list with an unoptimized
> > example with a different bytecode shape and has a convincing argument
> > that the code shape is "common", you guys can consider pattern matching
> > that as well.
> >
> > On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com
> > <mailto:john.r.rose at oracle.com>> wrote:
> >
> >
> >      > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <vitalyd at gmail.com
> >     <mailto:vitalyd at gmail.com>> wrote:
> >      >
> >      > better canonicalization
> >
> >     That's our first and most important tactic.  (Actually inlining is.)
> >
> >     But the various idioms for checkIndex do not canonicalize easily. In
> >     this case the correct trade-off is not to invest more time and
> >     research and code into stronger canonicalization.
> >
> >     We do have canonicalization of if-expressions. It's just that in
> >     this case strengthening it to cover range checks reliably is harder
> >     than the reasonable alternative.
> >
> >     - John
> >
> >     PS.  I am tempted to write out a list of 20 different ways to code a
> >     range check but will leave that as an exercise.
> >
> >
> >
> >
> > --
> > Sent from my phone
> 
> 
> 
> -- 
> Sent from my phone


From vitalyd at gmail.com  Mon Jan 11 16:18:13 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Mon, 11 Jan 2016 11:18:13 -0500
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <FD41E4F1-ABC6-49D4-9CA3-B9972C56C662@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
	<568D7FA1.4040707@oracle.com>
	<1BC8C0B0-E8EF-4D6B-B9EE-D374E2FC3E04@oracle.com>
	<CAHjP37GKBXBk1e0NSByAxxm8A04i6=TD2uzSiy3KyTNDuCxgEA@mail.gmail.com>
	<FD41E4F1-ABC6-49D4-9CA3-B9972C56C662@oracle.com>
Message-ID: <CAHjP37FR+7Pu9=vBjoXyQdoEUNoYTXhMJV0FsKPRZqw7rAvgtA@mail.gmail.com>

>
> That sounds reasonable. There's a BailoutToInterpreterForThrows command
> line argument that does that (off by default, not available in product
> builds). I don't know what the history behind it is.


I'm surprised that's not the default behavior (i.e. statically treating
control flow ending with exception as uncommon).  Exceptional paths should
not be optimization barriers, IMHO.  It would also be good to not count
bytecodes in those paths for inlining purposes, but that's a separate topic
I suppose.

> That sounds like a good suggestion. We would trade:
> 2 comparisons: i < 0 || i >= length
> for
> 2 comparisons: length < 0 || i >=u length
> so even if it doesn't result in further improvements, we wouldn't lose
> anything.


Yes, that's my thinking as well.  You won't lose anything, but may gain
something by picking up similarly-shaped user-code checks elsewhere in
existing code.  Personally, I think intrinsics should be reserved for
constructs/intentions impossible (or very difficult) to express in plain
bytecode and for platform/CPU specific things; all else would be pattern
matched to cast a wider net.


On Mon, Jan 11, 2016 at 10:54 AM, Roland Westrelin <
roland.westrelin at oracle.com> wrote:

> > As a general comment, would it make sense to assume exceptional paths
> are not taken in most Java code? That is, for code optimization purposes
> it's probably a reasonable assumption.  It seems like having an exceptional
> path is already a hint that it's not expected to fail; most Java devs know
> not to use exceptions for expected control flow.
>
> That sounds reasonable. There's a BailoutToInterpreterForThrows command
> line argument that does that (off by default, not available in product
> builds). I don't know what the history behind it is.
>
> > Could bytecode shape just like checkIndex be treated as same hint? Are
> there cases where something looks like checkIndex but really isn't?
>
> That sounds like a good suggestion. We would trade:
> 2 comparisons: i < 0 || i >= length
> for
> 2 comparisons: length < 0 || i >=u length
>
> so even if it doesn't result in further improvements, we wouldn't lose
> anything.
>
> Roland.
>
>
> >
> > Roland.
> >
> > >
> > > On Wednesday, January 6, 2016, Vladimir Kozlov <
> vladimir.kozlov at oracle.com> wrote:
> > > Note, we already have range check pattern matching code in C2 (thanks
> to Roland):
> > >
> > > https://bugs.openjdk.java.net/browse/JDK-8137168
> > >
> > > Vladimir
> > >
> > > On 1/6/16 12:39 PM, Vitaly Davidovich wrote:
> > > I don't think there's a need to write out 20 different ways to do a
> > > range check -- I think nobody would expect all 20 to be covered by the
> > > optimizer.  Some of those variations may not map cleanly to
> > > Object::checkIndex either, nor is there any guarantee that people will
> > > update all their existing range checks (or even know about) to use
> > > Object::checkIndex -- some code will be left unoptimized no matter
> what.
> > >
> > > But my point is the same as Andrew's, I think; instead of making
> > > checkIndex an intrinsic, simply add a pattern match against that exact
> > > bytecode shape (perhaps with basic canonicalization) and then still
> > > encourage people to use Object::checkIndex.  This is better than
> > > intrinsic (modulo profile pollution) since any other code that happens
> > > to use same pattern will match as well, and not require an update to
> use
> > > checkIndex.  Then, if someone comes to this list with an unoptimized
> > > example with a different bytecode shape and has a convincing argument
> > > that the code shape is "common", you guys can consider pattern matching
> > > that as well.
> > >
> > > On Wed, Jan 6, 2016 at 2:50 PM, John Rose <john.r.rose at oracle.com
> > > <mailto:john.r.rose at oracle.com>> wrote:
> > >
> > >
> > >      > On Jan 6, 2016, at 9:56 AM, Vitaly Davidovich <
> vitalyd at gmail.com
> > >     <mailto:vitalyd at gmail.com>> wrote:
> > >      >
> > >      > better canonicalization
> > >
> > >     That's our first and most important tactic.  (Actually inlining
> is.)
> > >
> > >     But the various idioms for checkIndex do not canonicalize easily.
> In
> > >     this case the correct trade-off is not to invest more time and
> > >     research and code into stronger canonicalization.
> > >
> > >     We do have canonicalization of if-expressions. It's just that in
> > >     this case strengthening it to cover range checks reliably is harder
> > >     than the reasonable alternative.
> > >
> > >     - John
> > >
> > >     PS.  I am tempted to write out a list of 20 different ways to code
> a
> > >     range check but will leave that as an exercise.
> > >
> > >
> > >
> > >
> > > --
> > > Sent from my phone
> >
> >
> >
> > --
> > Sent from my phone
>
>

From christian.thalinger at oracle.com  Mon Jan 11 17:23:36 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 07:23:36 -1000
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
Message-ID: <C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>


> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application-submitted tasks often time out before the tasks even start compiling.
> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
> 
> This review is for changes that address the above issues as follows: 
> 
> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 

Aren't blocking tasks selected before non-blocking tasks?

> 2. A thread waiting for a compilation task to complete checks the state of the compiler thread periodically (500ms intervals). If 5 successive checks see a blocked thread, the compilation times out and the waiting thread is unblocked.
> 
> https://bugs.openjdk.java.net/browse/JDK-8146705
> http://cr.openjdk.java.net/~dnsimon/8146705/
> 
> -Doug
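
A rough Java sketch of the waiting policy in item 2 of the quoted description: poll at a fixed interval and give up once 5 successive polls see a blocked compiler thread. This is not the CompileBroker code (which is C++ inside HotSpot); the class name, the method name and the two BooleanSupplier probes are hypothetical stand-ins for "is the task finished" and "is the compiler thread blocked".

```java
import java.util.function.BooleanSupplier;

public class BlockingWaitSketch {
    static final int MAX_BLOCKED_CHECKS = 5;   // from the description: 5 successive checks
    static final long CHECK_INTERVAL_MS = 500; // from the description: 500ms intervals

    /**
     * Waits for a task to finish. Returns true if it completed, false if the
     * wait was abandoned because the compiler thread looked blocked on
     * MAX_BLOCKED_CHECKS successive polls (or the waiter was interrupted).
     */
    static boolean waitForCompilation(BooleanSupplier taskDone,
                                      BooleanSupplier compilerBlocked,
                                      long intervalMs) {
        int successiveBlocked = 0;
        while (true) {
            if (taskDone.getAsBoolean()) {
                return true;
            }
            if (compilerBlocked.getAsBoolean()) {
                if (++successiveBlocked >= MAX_BLOCKED_CHECKS) {
                    return false; // give up and unblock the waiting thread
                }
            } else {
                successiveBlocked = 0; // progress was observed, start counting again
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // A compiler that is always blocked: the wait is abandoned after 5 polls.
        System.out.println(waitForCompilation(() -> false, () -> true, 1)); // false
        // A task that is already done: the wait returns immediately.
        System.out.println(waitForCompilation(() -> true, () -> true, CHECK_INTERVAL_MS)); // true
    }
}
```

Resetting the counter whenever the compiler thread is seen running is one reading of "successive"; with that reading a long but progressing compilation never times out, which appears to be the intent of the change.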


From doug.simon at oracle.com  Mon Jan 11 17:30:02 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 18:30:02 +0100
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
	<C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
Message-ID: <3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>


> On 11 Jan 2016, at 18:23, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> 
>> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application-submitted tasks often time out before the tasks even start compiling.
>> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
>> 
>> This review is for changes that address the above issues as follows: 
>> 
>> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
> 
> Aren't blocking tasks selected before non-blocking tasks?

Yes, exactly the opposite of what I said ;-) I've fixed the bug description and thankfully got the implementation the right way round.

-Doug

From christian.thalinger at oracle.com  Mon Jan 11 18:28:57 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 08:28:57 -1000
Subject: RFR: 8146001: Remove support for command line options from JVMCI
In-Reply-To: <0BB3D050-7E42-4777-BB7B-E4D7DC2A6605@oracle.com>
References: <F61E4557-011B-4746-9223-DBF028DA539B@oracle.com>
	<2FC5EBAA-49A0-42D5-A608-665B8237B326@oracle.com>
	<A3D7E69B-28A9-49C3-95CD-00FEC4F637F1@oracle.com>
	<8DE14AF8-90A4-4DF2-9CC2-98EE2E4F8670@oracle.com>
	<1297DA97-3C65-403D-AB46-16E203A74F26@oracle.com>
	<6C07E8DD-50D4-4B2E-BD8E-B131579A9664@oracle.com>
	<E07C2AD0-3B85-4315-8C49-95D771E44553@oracle.com>
	<0BB3D050-7E42-4777-BB7B-E4D7DC2A6605@oracle.com>
Message-ID: <721F2EB5-F633-4E8E-AF23-751B169B4A86@oracle.com>


> On Jan 6, 2016, at 8:04 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 06 Jan 2016, at 18:54, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> I just noticed this code in HotSpotResolvedJavaMethodImpl:
>> 
>>    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
>> 
>> The only other direct usage of System.getProperty is:
>> 
>> hotspot/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
>> 167:        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
>> 
>> I think both of them should be using the same mechanism as introduced by this change.
> 
> I agree (assuming you mean the HotSpotJVMCIRuntime.getBooleanProperty mechanism).

Yes.

> 
> There's also:
> 
> hotspot/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
> 70:    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");

I've seen that one too.

> 
> But we will have to leave that as is given that HotSpotJVMCIRuntime is not visible from this code. We could also remove the (legacy) "jvmci.runtime.TimeInit" alias.

Yes, let's remove the legacy property.

https://bugs.openjdk.java.net/browse/JDK-8146820

> 
> -Doug
> 
>> 
>>> On Jan 4, 2016, at 12:47 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 4, 2016, at 12:31 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On 04 Jan 2016, at 18:41, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On Jan 4, 2016, at 7:19 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Jan 4, 2016, at 7:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> On Dec 22, 2015, at 4:50 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>>> 
>>>>>>>> The effort of maintaining JVMCI across different JDK versions (including a potential backport to JDK7) is reduced by making JVMCI as small as possible. The support for command line options in JVMCI (based around the @Option annotation) is a good candidate for removal: 
>>>>>>>> 
>>>>>>>> 1. It's almost entirely implemented on top of system properties and so can be made to work without VM support.
>>>>>>>> 2. JVMCI itself only currently uses 3 options which can be replaced with usage of sun.misc.VM.getSavedProperty(). The latter ensures application code can't override JVMCI properties set on the command line.
>>>>>>>> 
>>>>>>>> This change removes the JVMCI command line option support.
>>>>>>>> 
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8146001
>>>>>>>> http://cr.openjdk.java.net/~dnsimon/8146001/
>>>>>>> 
>>>>>>> +    private static final boolean TrustFinalDefaultFields = HotSpotJVMCIRuntime.getBooleanProperty(TrustFinalDefaultFieldsProperty, true);
>>>>>>> 
>>>>>>> +    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>>>>>>> 
>>>>>>> We should either use the jvmci. prefix or not.
>>>>>> 
>>>>>> Sorry, I was reading the patch wrong.  Of course both use the jvmci. prefix.
>>>>> 
>>>>> I think we should prefix the property name in getBooleanProperty:
>>>>> 
>>>>> +    public static boolean getBooleanProperty(String name, boolean def) {
>>>>> +        String value = VM.getSavedProperty("jvmci." + name);
>>>> 
>>>> Ok, sounds reasonable.
>>>> 
>>>>> 
>>>>> and I put UseProfilingInformation back:
>>>>> 
>>>>> diff -r 0fcfe4b07f7e src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Tue Dec 29 18:30:51 2015 +0100
>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 04 07:40:46 2016 -1000
>>>>> @@ -24,7 +24,6 @@ package jdk.vm.ci.hotspot;
>>>>> 
>>>>> import static jdk.vm.ci.hotspot.CompilerToVM.compilerToVM;
>>>>> import static jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime;
>>>>> -import static jdk.vm.ci.hotspot.HotSpotResolvedJavaMethod.Options.UseProfilingInformation;
>>>>> import static jdk.vm.ci.hotspot.HotSpotVMConfig.config;
>>>>> import static jdk.vm.ci.hotspot.UnsafeAccess.UNSAFE;
>>>>> 
>>>>> @@ -65,6 +64,11 @@ import jdk.vm.ci.meta.TriState;
>>>>> final class HotSpotResolvedJavaMethodImpl extends HotSpotMethod implements HotSpotResolvedJavaMethod, HotSpotProxified, MetaspaceWrapperObject {
>>>>> 
>>>>>    /**
>>>>> +     * Whether to use profiling information.
>>>>> +     */
>>>>> +    private static final boolean UseProfilingInformation = HotSpotJVMCIRuntime.getBooleanProperty("UseProfilingInformation", true);
>>>>> +
>>>>> +    /**
>>>>>     * Reference to metaspace Method object.
>>>>>     */
>>>>>    private final long metaspaceMethod;
>>>>> @@ -424,7 +428,7 @@ final class HotSpotResolvedJavaMethodImp
>>>>>    public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>>>>        ProfilingInfo info;
>>>>> 
>>>>> -        if (UseProfilingInformation.getValue() && methodData == null) {
>>>>> +        if (UseProfilingInformation && methodData == null) {
>>>>>            long metaspaceMethodData = UNSAFE.getAddress(metaspaceMethod + config().methodDataOffset);
>>>>>            if (metaspaceMethodData != 0) {
>>>>>                methodData = new HotSpotMethodData(metaspaceMethodData, this);
>>>> 
>>>> JVMCI should unconditionally return available profiling information. It's up to the compiler whether or not to use it. For example, this is now compilation local in Graal:
>>>> 
>>>> http://hg.openjdk.java.net/graal/graal-compiler/rev/f35e653aa876#l16.16
>>> 
>>> Oh, I missed that.  Yes, that works for us as well.  Thanks for pointing that out.
>>> 
>>>> 
>>>> -Doug

From christian.thalinger at oracle.com  Mon Jan 11 18:35:10 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 08:35:10 -1000
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
	<C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
	<3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>
Message-ID: <1E212BB1-49D9-4DEE-A1AA-998D96D8ABB5@oracle.com>


> On Jan 11, 2016, at 7:30 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 11 Jan 2016, at 18:23, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> 
>>> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application submitted tasks often timeout before the tasks even start compiling. 
>>> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
>>> 
>>> This review is for changes that address the above issues as follows: 
>>> 
>>> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
>> 
>> Aren't blocking tasks selected before non-blocking tasks?
> 
> Yes, exactly the opposite of what I said ;-) I've fixed the bug description and thankfully got the implementation the right way round.

Then it looks good :-)

> 
> -Doug

From igor.veresov at oracle.com  Mon Jan 11 18:46:41 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Mon, 11 Jan 2016 10:46:41 -0800
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
Message-ID: <A37D03AA-8F95-4491-A1B8-EFF292E6BA98@oracle.com>

Makes sense. Looks good to me.

igor

> On Jan 11, 2016, at 5:18 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application submitted tasks often timeout before the tasks even start compiling. 
> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
> 
> This review is for changes that address the above issues as follows: 
> 
> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
> 2. A thread waiting for a compilation task to complete checks the state of the compiler thread periodically (500ms intervals). If 5 successive checks see a blocked thread, the compilation times out and the waiting thread is unblocked.
> 
> https://bugs.openjdk.java.net/browse/JDK-8146705
> http://cr.openjdk.java.net/~dnsimon/8146705/
> 
> -Doug
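
The periodic state check in point 2 of the quoted description can be sketched in plain Java. This is only an illustration of the counting logic, not the CompileBroker code (which is C++ inside HotSpot); the class name, method name and the two BooleanSupplier probes are hypothetical stand-ins for "is the task complete" and "is the compiler thread blocked".

```java
import java.util.function.BooleanSupplier;

public class BlockingCompileWaitSketch {
    /**
     * Polls until the task completes or the compiler thread has been seen
     * blocked on maxBlockedChecks successive polls.
     *
     * @return true if the task completed, false if the wait gave up
     */
    public static boolean waitForCompilation(BooleanSupplier taskDone,
                                             BooleanSupplier compilerBlocked,
                                             long intervalMillis,
                                             int maxBlockedChecks) {
        int blockedChecks = 0;
        while (true) {
            if (taskDone.getAsBoolean()) {
                return true;
            }
            if (compilerBlocked.getAsBoolean()) {
                // Only successive blocked observations count towards the timeout.
                if (++blockedChecks >= maxBlockedChecks) {
                    return false;
                }
            } else {
                // Any sign of progress resets the counter.
                blockedChecks = 0;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Assumption for this sketch: treat interruption as giving up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the values from the description, a caller would pass an interval of 500 ms and a limit of 5 checks, so a compiler thread that stays blocked makes the waiter give up after roughly two seconds, while a thread that keeps making progress is never timed out.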


From christian.thalinger at oracle.com  Mon Jan 11 19:15:27 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 09:15:27 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
Message-ID: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8146820

I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?

diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
--- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
+++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
@@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
     }
 
     /**
+     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
+     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
+     *
+     * @param name the name of the system property
+     * @param def the value to return if there is no system property corresponding to {@code name}
+     */
+    public static String getProperty(String name, String def) {
+        String value = VM.getSavedProperty("jvmci." + name);
+        if (value == null) {
+            return def;
+        }
+        return value;
+    }
+
+    /**
      * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
      * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
      *
@@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
      * @param def the value to return if there is no system property corresponding to {@code name}
      */
     public static boolean getBooleanProperty(String name, boolean def) {
-        String value = VM.getSavedProperty("jvmci." + name);
+        String value = getProperty(name, null);
         if (value == null) {
             return def;
         }
@@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
         }
         metaAccessContext = context;
 
-        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
+        if (getBooleanProperty("printconfig", false)) {
             printConfig(config, compilerToVm);
         }
 
diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
--- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
+++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
@@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
         return false;
     }
 
-    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
+    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
 
     @Override
     public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
--- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
+++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
@@ -65,9 +65,11 @@ public final class InitTimer implements 
     }
 
     /**
-     * Specifies if initialization timing is enabled.
+     * Specifies if initialization timing is enabled. Note: this property cannot use
+     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
+     * package.
      */
-    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
+    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
 
     public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
     public static final String SPACES = "                                            ";
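
As a rough, standalone illustration of the helper pattern in the diff above: both accessors add the "jvmci." prefix in one place, and the boolean accessor delegates to the String one. The class name is hypothetical, System.getProperty stands in for VM.getSavedProperty so that the sketch runs outside the JDK sources, and the use of Boolean.parseBoolean is an assumption because the patch context does not show that part of getBooleanProperty.

```java
public class JvmciPropertySketch {
    private static final String PREFIX = "jvmci.";

    /** Returns the value of the prefixed property, or def if it is not set. */
    public static String getProperty(String name, String def) {
        // Stand-in: the patch reads VM.getSavedProperty("jvmci." + name).
        String value = System.getProperty(PREFIX + name);
        return value == null ? def : value;
    }

    /** Returns the boolean value of the prefixed property, or def if it is not set. */
    public static boolean getBooleanProperty(String name, boolean def) {
        String value = getProperty(name, null);
        if (value == null) {
            return def;
        }
        // Assumption: the real method's parsing is not visible in the diff.
        return Boolean.parseBoolean(value);
    }
}
```

Callers then pass only the unprefixed name, for example getBooleanProperty("PrintConfig", false), which keeps the prefix and the lookup mechanism consistent across all properties.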


From vladimir.kozlov at oracle.com  Mon Jan 11 19:26:22 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Jan 2016 11:26:22 -0800
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5693B01D.60604@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693B01D.60604@oracle.com>
Message-ID: <569401DE.8000105@oracle.com>

Don't use GC flags in the test. They will conflict with flags passed by the testing infrastructure and the test will fail. There was a 
bug fixed by removing GC flags from all our tests.
Note that nightly testing rotates GC flags, so you don't need to do that.
Otherwise looks good.

Thanks,
Vladimir

On 1/11/16 5:37 AM, Zoltán Majó wrote:
> Hi,
>
>
>
> On 01/11/2016 02:11 PM, Zoltán Majó wrote:
>> [...]
>> Yes, that is a good idea. I added a test that launches the VM with all flag combinations and also with different GCs.
>> I did the same as the test does to reproduce the original failure.
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>
> The test contains an unnecessary @library tag and package import. The year in the copyright statement must be changed
> as well (to 2016).
>
> Here is the webrev with those changes:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/
>
> Sorry for the noise.
>
> Thank you and best regards,
>
>
> Zoltan
>
>
>>
>> The newly added test passes on all supported platforms.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>>
>>> Best,
>>> Tobias
>>>
>>>
>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>> please review the patch for 8086053.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>
>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB
>>>> disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>>>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These
>>>> inconsistencies lead to newly allocated regions not being filled with zeros.
>>>>
>>>> Solution: Address the following:
>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly
>>>> allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB
>>>> allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>
>>>> Testing:
>>>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>>>> - JPRT;
>>>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>
>

From vladimir.kozlov at oracle.com  Mon Jan 11 19:30:17 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Jan 2016 11:30:17 -0800
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <56937F1F.7010709@oracle.com>
References: <568EB3A0.3040909@oracle.com> <56937F1F.7010709@oracle.com>
Message-ID: <569402C9.5060305@oracle.com>

Sounds good.

Thanks,
Vladimir

On 1/11/16 2:08 AM, Tobias Hartmann wrote:
> FYI, I had to merge with JDK-8143353 [1] (CosDNode and SinDNode were removed).
>
> This is the change I intend to push:
> http://cr.openjdk.java.net/~thartmann/8146629/webrev.01/
>
> Thanks,
> Tobias
>
> [1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/13b04370e8e9
>
> On 07.01.2016 19:51, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8146629
>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>>
>> Currently, there is no way to determine in Node::Identity() and Node::Value() if we were called from GVN or IGVN but sometimes we would like to do optimizations based on this information (for example, see discussion in RFR for JDK-8136469 [1]). I changed the arguments of Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. Like this, we can simply call PhaseValues::is_IterGVN() from both methods.
>>
>> Thanks,
>> Tobias
>>
>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>>

From vladimir.kozlov at oracle.com  Mon Jan 11 19:50:17 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Jan 2016 11:50:17 -0800
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
Message-ID: <56940779.8070804@oracle.com>

What is the naming convention for properties?
Do we have a list of all the JVMCI properties we accept somewhere? Maybe we should add one.
All JVMCI property names should be consistent, whichever convention you choose.

'inittimer' is also lowercase.

Thanks,
Vladimir

On 1/11/16 11:15 AM, Christian Thalinger wrote:
> https://bugs.openjdk.java.net/browse/JDK-8146820
>
> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
>
> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
> @@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
>       }
>
>       /**
> +     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
> +     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
> +     *
> +     * @param name the name of the system property
> +     * @param def the value to return if there is no system property corresponding to {@code name}
> +     */
> +    public static String getProperty(String name, String def) {
> +        String value = VM.getSavedProperty("jvmci." + name);
> +        if (value == null) {
> +            return def;
> +        }
> +        return value;
> +    }
> +
> +    /**
>        * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
>        * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
>        *
> @@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
>        * @param def the value to return if there is no system property corresponding to {@code name}
>        */
>       public static boolean getBooleanProperty(String name, boolean def) {
> -        String value = VM.getSavedProperty("jvmci." + name);
> +        String value = getProperty(name, null);
>           if (value == null) {
>               return def;
>           }
> @@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
>           }
>           metaAccessContext = context;
>
> -        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
> +        if (getBooleanProperty("printconfig", false)) {
>               printConfig(config, compilerToVm);
>           }
>
> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
> @@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
>           return false;
>       }
>
> -    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
> +    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
>
>       @Override
>       public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
> @@ -65,9 +65,11 @@ public final class InitTimer implements
>       }
>
>       /**
> -     * Specifies if initialization timing is enabled.
> +     * Specifies if initialization timing is enabled. Note: this property cannot use
> +     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
> +     * package.
>        */
> -    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
> +    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
>
>       public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
>       public static final String SPACES = "                                            ";
>

From christian.thalinger at oracle.com  Mon Jan 11 19:55:47 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 09:55:47 -1000
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <1E212BB1-49D9-4DEE-A1AA-998D96D8ABB5@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
	<C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
	<3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>
	<1E212BB1-49D9-4DEE-A1AA-998D96D8ABB5@oracle.com>
Message-ID: <705F08CD-332B-4D25-B352-4FC237D6E6BC@oracle.com>

...or not:

/scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp: In member function 'virtual CompileTask* AdvancedThresholdPolicy::select_task(CompileQueue*)':
/scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:201:9: error: 'UseJVMCICompiler' was not declared in this scope
     if (UseJVMCICompiler && task->is_blocking()) {
         ^
/scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:202:11: error: 'max_blocking_task' was not declared in this scope
       if (max_blocking_task == NULL || compare_methods(method, max_blocking_task->method())) {
           ^

> On Jan 11, 2016, at 8:35 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 11, 2016, at 7:30 AM, Doug Simon <doug.simon at oracle.com <mailto:doug.simon at oracle.com>> wrote:
>> 
>>> 
>>> On 11 Jan 2016, at 18:23, Christian Thalinger <christian.thalinger at oracle.com <mailto:christian.thalinger at oracle.com>> wrote:
>>> 
>>> 
>>>> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com <mailto:doug.simon at oracle.com>> wrote:
>>>> 
>>>> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application submitted tasks often timeout before the tasks even start compiling. 
>>>> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
>>>> 
>>>> This review is for changes that address the above issues as follows: 
>>>> 
>>>> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
>>> 
>>> Aren't blocking tasks selected before non-blocking tasks?
>> 
>> Yes, exactly the opposite of what I said ;-) I've fixed the bug description and thankfully got the implementation the right way round.
> 
> Then it looks good :-)
> 
>> 
>> -Doug

From doug.simon at oracle.com  Mon Jan 11 19:57:05 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 20:57:05 +0100
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
Message-ID: <7C710B2B-BC2A-4D86-AD0C-608F4E057051@oracle.com>


> On 11 Jan 2016, at 20:15, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> https://bugs.openjdk.java.net/browse/JDK-8146820
> 
> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?

Yes. You should also do the same for jvmci.inittimer (i.e. jvmci.InitTimer).

-Doug

From christian.thalinger at oracle.com  Mon Jan 11 19:57:39 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 09:57:39 -1000
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <705F08CD-332B-4D25-B352-4FC237D6E6BC@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
	<C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
	<3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>
	<1E212BB1-49D9-4DEE-A1AA-998D96D8ABB5@oracle.com>
	<705F08CD-332B-4D25-B352-4FC237D6E6BC@oracle.com>
Message-ID: <4E408C1B-8F4C-4763-9A18-0AE6885F8A9E@oracle.com>

It's #ifdef vs. #if:

+#ifdef INCLUDE_JVMCI

I'll fix it.

> On Jan 11, 2016, at 9:55 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> ...or not:
> 
> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp: In member function 'virtual CompileTask* AdvancedThresholdPolicy::select_task(CompileQueue*)':
> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:201:9: error: 'UseJVMCICompiler' was not declared in this scope
>      if (UseJVMCICompiler && task->is_blocking()) {
>          ^
> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:202:11: error: 'max_blocking_task' was not declared in this scope
>        if (max_blocking_task == NULL || compare_methods(method, max_blocking_task->method())) {
>            ^
> 
>> On Jan 11, 2016, at 8:35 AM, Christian Thalinger <christian.thalinger at oracle.com <mailto:christian.thalinger at oracle.com>> wrote:
>> 
>>> 
>>> On Jan 11, 2016, at 7:30 AM, Doug Simon <doug.simon at oracle.com <mailto:doug.simon at oracle.com>> wrote:
>>> 
>>>> 
>>>> On 11 Jan 2016, at 18:23, Christian Thalinger <christian.thalinger at oracle.com <mailto:christian.thalinger at oracle.com>> wrote:
>>>> 
>>>> 
>>>>> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com <mailto:doug.simon at oracle.com>> wrote:
>>>>> 
>>>>> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application submitted tasks often timeout before the tasks even start compiling. 
>>>>> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
>>>>> 
>>>>> This review is for changes that address the above issues as follows: 
>>>>> 
>>>>> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
>>>> 
>>>> Aren't blocking tasks selected before non-blocking tasks?
>>> 
>>> Yes, exactly the opposite of what I said ;-) I've fixed the bug description and thankfully got the implementation the right way round.
>> 
>> Then it looks good :-)
>> 
>>> 
>>> -Doug
> 

From vladimir.kozlov at oracle.com  Mon Jan 11 20:00:05 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Jan 2016 12:00:05 -0800
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <5693C83F.9030100@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
Message-ID: <569409C5.2040805@oracle.com>

On 1/11/16 7:20 AM, Tobias Hartmann wrote:
>
> On 08.01.2016 20:41, Vladimir Kozlov wrote:
>> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>>
>>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>>> Andrew is right.
>>>>>
>>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>>
>>>> Right. It was the root of this bug, see below.
>>>>
>>>>>
>>>>> I fixed this for the inflate and compress intrinsics.
>>>>>
>>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>>>> StrInflatedCopyNode is not memory node.
>>>>>
>>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>>
>>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>>
>>> I meant with webrev.01 but you answered my question below.
>>>
>>>>>     // This class defines a projection of the memory  state of a store conditional node.
>>>>>     // These nodes return a value, but also update memory.
>>>>>
>>>>> But inflate does not return any value.
>>>>
>>>> Hmm, according to its bottom type, inflate produces memory:
>>>>
>>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>>
>>>> So it really does not need SCMemProjNode. Sorry about that.
>>>> So the load was a LoadUS, which is a char load, and originally the memory slice of inflate was incorrectly BYTES.
>>>
>>> Exactly.
>>>
>>>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>>>
>>>> set_memory(str, dst_type);
>>>
>>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>>
>>>> And you should rollback part of changes in escape.cpp and macro.cpp.
>>>
>>> Okay, I'll do that.
>>>
>>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>>
>>>> In general when src & dst arrays have different type we may need to use TypeOopPtr::BOTTOM to prevent related store & loads bypass these copy nodes.
>>>
>>> Okay, should we then use BOTTOM for both the input and output type?
>>
>> Only input. Output type corresponds to dst array type which you set correctly now.
>
> It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
>   StoreC
>   inflate_string
>   LoadC
>
> The memory graph (def->use) now looks like this:
>   LoadC -> inflate_string -> ByteMem
>                 ... StoreC-> CharMem

I did not get this. If the StoreC node is created before inflate_string, inflate_string should point to it and be a barrier for 
LoadC. If StoreC follows inflate_string and LoadC follows StoreC, LoadC should point to StoreC. If LoadC does not 
follow StoreC, then the result is relaxed.

Thanks,
Vladimir
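
For readers following the quoted discussion below, the source-level shape of the "StoreC, inflate_string, LoadC" case can be sketched as follows. This is a hypothetical illustration, not the TestStringIntrinsicMemoryFlow test mentioned in the quote: the class and method names are made up, and the hand-written inflate loop is only a stand-in for the intrinsified byte[] to char[] copy.

```java
public class InflateOrderingSketch {
    // Stand-in for the inflate intrinsic: widens Latin-1 bytes into chars.
    public static void inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            dst[dstOff + i] = (char) (src[srcOff + i] & 0xff);
        }
    }

    // Char store, then inflate into the same char[], then a char load.
    // The load has to observe the earlier store, so the compiler must not
    // treat the copy as touching only byte memory.
    public static char storeInflateLoad(byte[] src, char[] dst) {
        dst[0] = 'X';                         // StoreC
        inflate(src, 0, dst, 1, src.length);  // writes dst[1..src.length]
        return dst[0];                        // LoadC
    }
}
```

Calling storeInflateLoad with a two-byte source and a three-element destination must return the stored 'X' whatever the compiler does with the copy; a load that misses the store is the kind of failure the thread is concerned with.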

>
>
> The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
>
> Setting the input to BOTTOM, generates the following graph:
> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
> The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.
>
> I adapted the fix to emit a MergeMemoryNode to capture the entire memory state as input to the intrinsic. The graph then looks like this:
>   LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
> http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/
>
> Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?
>
> Best,
> Tobias
>
>>>>> Related question:
>>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>>     char[int:>=0]:exact+any *
>>>>>
>>>>> which is equal to the type of the char load.
>>>>
>>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>>
>>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>>
>>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>
>>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>>
>>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>>
>>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>>
>>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>>     char[int:1]:NotNull:exact *
>>>>>
>>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>>
>>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>>
>>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>>
>>> I will look into this again and try to understand what happens.
>>
>> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Thanks,
>>> Tobias
>>>
>>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>>
>>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>>
>>>>>>>>      // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>>      // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>>      // compiler because the CPU does all the ordering for us.
>>>>>>>>
>>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>>
>>>>>>>> Or did I misunderstand your question?
>>>>>>>
>>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>>> read-after-write dependency, but never mind.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Andrew.
>>>>>>>

From doug.simon at oracle.com  Mon Jan 11 20:14:00 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 21:14:00 +0100
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <56940779.8070804@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
Message-ID: <B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>


> On 11 Jan 2016, at 20:50, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> What is the naming convention for properties?
> Do we have a list of all accepted JVMCI properties somewhere? Maybe we should add one.

Currently, there is no list of accepted JVMCI properties. Once Chris applies the changes below such that all system property access (apart from jvmci.InitTimer) goes through HotSpotJVMCIRuntime.getProperty(), then the javadoc of that method could contain the list (much like System.getProperties describes the supported standard properties).
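
As a rough illustration of that idea (a sketch only, not the actual patch): the class name JvmciProperties, the KNOWN table and its descriptions are hypothetical, PrintConfig is the camel-case name proposed in this thread, and System.getProperty stands in for VM.getSavedProperty so the snippet runs outside the JDK. How getBooleanProperty parses the value is also an assumption on my part.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a single access point for "jvmci." properties. */
public final class JvmciProperties {
    private static final String PREFIX = "jvmci.";

    /** Illustrative table of accepted names (without prefix) and descriptions. */
    private static final Map<String, String> KNOWN = new TreeMap<>();
    static {
        KNOWN.put("PrintConfig", "print the VM configuration (description is illustrative)");
        KNOWN.put("TraceMethodDataFilter", "method data trace filter (description is illustrative)");
    }

    private JvmciProperties() {
    }

    /** The list that the javadoc of getProperty could mirror. */
    public static Map<String, String> knownProperties() {
        return Collections.unmodifiableMap(KNOWN);
    }

    public static String getProperty(String name, String def) {
        if (!KNOWN.containsKey(name)) {
            // Assumption: unknown names are rejected so the list stays complete.
            throw new IllegalArgumentException("unknown JVMCI property: " + PREFIX + name);
        }
        // Stand-in for VM.getSavedProperty(PREFIX + name).
        String value = System.getProperty(PREFIX + name);
        return value == null ? def : value;
    }

    public static boolean getBooleanProperty(String name, boolean def) {
        String value = getProperty(name, null);
        return value == null ? def : Boolean.parseBoolean(value);
    }
}
```

Rejecting unknown names is just one option; it would keep the documented list and the accepted properties from drifting apart.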

> All JVMCI property names should be consistent, whatever you choose.

I agree.

-Doug

> 
> 'inittimer' is also lowercased.
> 
> Thanks,
> Vladimir
> 
> On 1/11/16 11:15 AM, Christian Thalinger wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8146820
>> 
>> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
>> 
>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
>> @@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
>>      }
>> 
>>      /**
>> +     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
>> +     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
>> +     *
>> +     * @param name the name of the system property
>> +     * @param def the value to return if there is no system property corresponding to {@code name}
>> +     */
>> +    public static String getProperty(String name, String def) {
>> +        String value = VM.getSavedProperty("jvmci." + name);
>> +        if (value == null) {
>> +            return def;
>> +        }
>> +        return value;
>> +    }
>> +
>> +    /**
>>       * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
>>       * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>       *
>> @@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
>>       * @param def the value to return if there is no system property corresponding to {@code name}
>>       */
>>      public static boolean getBooleanProperty(String name, boolean def) {
>> -        String value = VM.getSavedProperty("jvmci." + name);
>> +        String value = getProperty(name, null);
>>          if (value == null) {
>>              return def;
>>          }
>> @@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
>>          }
>>          metaAccessContext = context;
>> 
>> -        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
>> +        if (getBooleanProperty("printconfig", false)) {
>>              printConfig(config, compilerToVm);
>>          }
>> 
>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
>> @@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
>>          return false;
>>      }
>> 
>> -    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
>> +    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
>> 
>>      @Override
>>      public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
>> @@ -65,9 +65,11 @@ public final class InitTimer implements
>>      }
>> 
>>      /**
>> -     * Specifies if initialization timing is enabled.
>> +     * Specifies if initialization timing is enabled. Note: this property cannot use
>> +     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
>> +     * package.
>>       */
>> -    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
>> +    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
>> 
>>      public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
>>      public static final String SPACES = "                                            ";
>> 


From doug.simon at oracle.com  Mon Jan 11 20:15:54 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 21:15:54 +0100
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <4E408C1B-8F4C-4763-9A18-0AE6885F8A9E@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
	<C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
	<3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>
	<1E212BB1-49D9-4DEE-A1AA-998D96D8ABB5@oracle.com>
	<705F08CD-332B-4D25-B352-4FC237D6E6BC@oracle.com>
	<4E408C1B-8F4C-4763-9A18-0AE6885F8A9E@oracle.com>
Message-ID: <444F8F02-2BEB-43DA-BB40-F421010E366B@oracle.com>


> On 11 Jan 2016, at 20:57, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> It's #ifdef vs. #if:
> 
> +#ifdef INCLUDE_JVMCI

No, I think it's the other way around (i.e. #if instead of #ifdef) judging by the rest of the code guarded by this macro.

-Doug

> 
> I'll fix it.
> 
>> On Jan 11, 2016, at 9:55 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> ...or not:
>> 
>> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp: In member function 'virtual CompileTask* AdvancedThresholdPolicy::select_task(CompileQueue*)':
>> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:201:9: error: 'UseJVMCICompiler' was not declared in this scope
>>      if (UseJVMCICompiler && task->is_blocking()) {
>>          ^
>> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:202:11: error: 'max_blocking_task' was not declared in this scope
>>        if (max_blocking_task == NULL || compare_methods(method, max_blocking_task->method())) {
>>            ^
>> 
>> 
>>> On Jan 11, 2016, at 8:35 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 11, 2016, at 7:30 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On 11 Jan 2016, at 18:23, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>> 
>>>>> 
>>>>>> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>> 
>>>>>> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application-submitted tasks often time out before the tasks even start compiling.
>>>>>> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
>>>>>> 
>>>>>> This review is for changes that address the above issues as follows: 
>>>>>> 
>>>>>> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
>>>>> 
>>>>> Aren't blocking tasks selected before non-blocking tasks?
>>>> 
>>>> Yes, exactly the opposite of what I said ;-) I've fixed the bug description and thankfully got the implementation the right way round.
>>> 
>>> Then it looks good :-)
>>> 
>>>> 
>>>> -Doug
>> 
> 
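
For readers following the thread, the selection order agreed on above (blocking tasks before non-blocking ones) can be sketched roughly as follows. This is only an illustration in Java, not the HotSpot C++ code: TaskSelection, Task and the integer "rate" are hypothetical stand-ins, with "rate" playing the role of whatever compare_methods() uses to rank methods.

```java
import java.util.List;

/** Hypothetical sketch: prefer blocking tasks when picking from a compile queue. */
public final class TaskSelection {

    /** Minimal stand-in for a compile task. */
    public static final class Task {
        public final String method;
        public final boolean blocking;
        public final int rate; // assumed ranking key; higher means hotter

        public Task(String method, boolean blocking, int rate) {
            this.method = method;
            this.blocking = blocking;
            this.rate = rate;
        }
    }

    private TaskSelection() {
    }

    /** Returns the hottest blocking task if there is one, else the hottest task overall. */
    public static Task selectTask(List<Task> queue) {
        Task max = null;
        Task maxBlocking = null;
        for (Task task : queue) {
            if (max == null || task.rate > max.rate) {
                max = task;
            }
            if (task.blocking && (maxBlocking == null || task.rate > maxBlocking.rate)) {
                maxBlocking = task;
            }
        }
        // A waiting application thread is stalled until its task completes,
        // so a blocking task wins even over a hotter non-blocking one.
        return maxBlocking != null ? maxBlocking : max;
    }
}
```

With a queue holding a hot non-blocking task and a cooler blocking one, selectTask returns the blocking task.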


From christian.thalinger at oracle.com  Mon Jan 11 21:08:19 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 11:08:19 -1000
Subject: RFR: 8146705: Improve JVMCI support for blocking compilation
In-Reply-To: <444F8F02-2BEB-43DA-BB40-F421010E366B@oracle.com>
References: <41AAC503-ADDA-42DD-B338-CD29626AC132@oracle.com>
	<C83ED6F6-77E7-4E97-814D-32ED6E9BEFB7@oracle.com>
	<3EFCF17A-DCC3-4D7A-8D85-42A9C5C64A27@oracle.com>
	<1E212BB1-49D9-4DEE-A1AA-998D96D8ABB5@oracle.com>
	<705F08CD-332B-4D25-B352-4FC237D6E6BC@oracle.com>
	<4E408C1B-8F4C-4763-9A18-0AE6885F8A9E@oracle.com>
	<444F8F02-2BEB-43DA-BB40-F421010E366B@oracle.com>
Message-ID: <D05D04ED-C25B-466C-9217-78A83B59A578@oracle.com>


> On Jan 11, 2016, at 10:15 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 11 Jan 2016, at 20:57, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> It's #ifdef vs. #if:
>> 
>> +#ifdef INCLUDE_JVMCI
> 
> No, I think it's the other way around (i.e. #if instead of #ifdef) judging by the rest of the code guarded by this macro.

Sorry, that's what I meant; it should be #if.

> 
> -Doug
> 
>> 
>> I'll fix it.
>> 
>>> On Jan 11, 2016, at 9:55 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>> ...or not:
>>> 
>>> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp: In member function 'virtual CompileTask* AdvancedThresholdPolicy::select_task(CompileQueue*)':
>>> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:201:9: error: 'UseJVMCICompiler' was not declared in this scope
>>>     if (UseJVMCICompiler && task->is_blocking()) {
>>>         ^
>>> /scratch/jprt/T/P1/192954.cthaling/s/hotspot/src/share/vm/runtime/advancedThresholdPolicy.cpp:202:11: error: 'max_blocking_task' was not declared in this scope
>>>       if (max_blocking_task == NULL || compare_methods(method, max_blocking_task->method())) {
>>>           ^
>>> 
>>> 
>>>> On Jan 11, 2016, at 8:35 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 11, 2016, at 7:30 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On 11 Jan 2016, at 18:23, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>> 
>>>>>> 
>>>>>>> On Jan 11, 2016, at 3:18 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>> 
>>>>>>> The CompileBroker currently uses a simple timeout of 1 second when waiting for a blocking JVMCI compilation to complete. This approach is too simple. JVMCI compiler threads themselves flood the compilation queues with compilation requests; such compilations cannot be blocking (the JVMCI compiler can easily cause the system to deadlock). This flooding means that application-submitted tasks often time out before the tasks even start compiling.
>>>>>>> Once a JVMCI thread starts compiling a task, there is still the risk of it deadlocking. The current timeout mechanism needs to be augmented with a test of the compiler thread's state. As long as it's not blocked for too long, we know the compiler is making progress and will eventually complete.
>>>>>>> 
>>>>>>> This review is for changes that address the above issues as follows: 
>>>>>>> 
>>>>>>> 1. Non-blocking tasks are selected before blocking tasks from the compilation queue. 
>>>>>> 
>>>>>> Aren't blocking tasks selected before non-blocking tasks?
>>>>> 
>>>>> Yes, exactly the opposite of what I said ;-) I've fixed the bug description and thankfully got the implementation the right way round.
>>>> 
>>>> Then it looks good :-)
>>>> 
>>>>> 
>>>>> -Doug
>>> 
>> 
> 
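
The other half of the description quoted above (keep waiting while the compiler thread is making progress, give up only when it stays blocked too long) might look roughly like this. It is a sketch under assumptions, not the CompileBroker code: BlockingWait, waitForTask, the poll interval and the use of java.lang.Thread.State as a stand-in for the VM's thread states are all hypothetical.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch: wait for a blocking compilation with a progress check. */
public final class BlockingWait {

    private BlockingWait() {
    }

    /**
     * Polls until the task is done. Gives up (returns false) only after the
     * compiler thread was seen stalled on more than maxBlockedPolls
     * consecutive polls; any poll that finds it running resets the count.
     */
    public static boolean waitForTask(BooleanSupplier taskDone,
                                      Supplier<Thread.State> compilerState,
                                      long pollMillis,
                                      int maxBlockedPolls) {
        int blockedPolls = 0;
        while (!taskDone.getAsBoolean()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            Thread.State state = compilerState.get();
            boolean stalled = state == Thread.State.BLOCKED || state == Thread.State.WAITING;
            blockedPolls = stalled ? blockedPolls + 1 : 0;
            if (blockedPolls > maxBlockedPolls) {
                return false; // no progress for too long: stop waiting
            }
        }
        return true;
    }
}
```

Compared to a fixed 1 second timeout, a slow but active compilation is never abandoned here; only a compiler thread that stays stalled is.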


From doug.simon at oracle.com  Mon Jan 11 22:43:24 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 11 Jan 2016 23:43:24 +0100
Subject: RFR: 8146364: Remove @ServiceProvider mechanism from JVMCI
Message-ID: <FEDF1809-3009-4CD4-84B8-A87DDAC1CFA6@oracle.com>

Hi,

Please review these changes for removing the mechanism in JVMCI for automating the generation of files in META-INF/services for service providers annotated with @ServiceProvider.

https://bugs.openjdk.java.net/browse/JDK-8146364

http://cr.openjdk.java.net/~dnsimon/8146364/jdk9/ 
http://cr.openjdk.java.net/~dnsimon/8146364/hotspot/

-Doug

From vladimir.kozlov at oracle.com  Tue Jan 12 00:43:49 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Jan 2016 16:43:49 -0800
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
Message-ID: <56944C45.6060307@oracle.com>

Now I think I understand.
Note, there should NOT be any control between the loop's head and the predicate check. I assume CastPP is attached to it because
its original check was removed by dominated similar check (for example NULL check).

I think it is safe to move CastPP above the original dummy predicate checks (one or two checks if there is a loop limit
check) since CastPP should not depend on them. It will solve the problem since the moved check (new predicate) is always
inserted before the original dummy predicate (which will be removed later).

Thanks,
Vladimir

On 1/11/16 7:07 AM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8146792/webrev.00/
>
> - partial peeling is applied to a loop
> - the peeled section is optimized and leaves a pinned node between the loop predicates and the loop body but no control flow
> - loop predicates are applied and a predicate that depends on the pinned node is moved out of the loop, before the pinned node, leading to a broken graph
>
> This is the same issue that came up during review of 8139771. Vladimir suggested it gets reviewed separately. With the included test case it reproduces without the change from 8139771.
>
> Roland.
>

From christian.thalinger at oracle.com  Mon Jan 11 22:48:56 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 12:48:56 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <7C710B2B-BC2A-4D86-AD0C-608F4E057051@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<7C710B2B-BC2A-4D86-AD0C-608F4E057051@oracle.com>
Message-ID: <538AF03D-B463-4E34-BBB4-6ED53A232DBD@oracle.com>


> On Jan 11, 2016, at 9:57 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 11 Jan 2016, at 20:15, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8146820
>> 
>> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
> 
> Yes. You should also do the same for jvmci.inittimer (i.e. jvmci.InitTimer).

Ok

> 
> -Doug


From christian.thalinger at oracle.com  Mon Jan 11 22:51:04 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 12:51:04 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
Message-ID: <490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>


> On Jan 11, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 11 Jan 2016, at 20:50, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>> 
>> What is the naming convention for properties?
>> Do we have a list of all accepted JVMCI properties somewhere? Maybe we should add one.
> 
> Currently, there is no list of accepted JVMCI properties. Once Chris applies the changes below such that all system property access (apart from jvmci.InitTimer) goes through HotSpotJVMCIRuntime.getProperty(), then the javadoc of that method could contain the list (much like System.getProperties describes the supported standard properties).

Good idea.

> 
>> All JVMCI property names should be consistent, whatever you choose.
> 
> I agree.

Yes.  They should feel like our other command line options so camel-case is what I had in mind.

> 
> -Doug
> 
>> 
>> 'inittimer' is also lowercased.
>> 
>> Thanks,
>> Vladimir
>> 
>> On 1/11/16 11:15 AM, Christian Thalinger wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8146820
>>> 
>>> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
>>> 
>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
>>> @@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
>>>     }
>>> 
>>>     /**
>>> +     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
>>> +     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>> +     *
>>> +     * @param name the name of the system property
>>> +     * @param def the value to return if there is no system property corresponding to {@code name}
>>> +     */
>>> +    public static String getProperty(String name, String def) {
>>> +        String value = VM.getSavedProperty("jvmci." + name);
>>> +        if (value == null) {
>>> +            return def;
>>> +        }
>>> +        return value;
>>> +    }
>>> +
>>> +    /**
>>>      * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
>>>      * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>      *
>>> @@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
>>>      * @param def the value to return if there is no system property corresponding to {@code name}
>>>      */
>>>     public static boolean getBooleanProperty(String name, boolean def) {
>>> -        String value = VM.getSavedProperty("jvmci." + name);
>>> +        String value = getProperty(name, null);
>>>         if (value == null) {
>>>             return def;
>>>         }
>>> @@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
>>>         }
>>>         metaAccessContext = context;
>>> 
>>> -        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
>>> +        if (getBooleanProperty("printconfig", false)) {
>>>             printConfig(config, compilerToVm);
>>>         }
>>> 
>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
>>> @@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
>>>         return false;
>>>     }
>>> 
>>> -    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
>>> +    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
>>> 
>>>     @Override
>>>     public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
>>> @@ -65,9 +65,11 @@ public final class InitTimer implements
>>>     }
>>> 
>>>     /**
>>> -     * Specifies if initialization timing is enabled.
>>> +     * Specifies if initialization timing is enabled. Note: this property cannot use
>>> +     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
>>> +     * package.
>>>      */
>>> -    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
>>> +    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
>>> 
>>>     public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
>>>     public static final String SPACES = "                                            ";
>>> 
> 


From christian.thalinger at oracle.com  Tue Jan 12 01:35:28 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 11 Jan 2016 15:35:28 -1000
Subject: RFR: 8146364: Remove @ServiceProvider mechanism from JVMCI
In-Reply-To: <FEDF1809-3009-4CD4-84B8-A87DDAC1CFA6@oracle.com>
References: <FEDF1809-3009-4CD4-84B8-A87DDAC1CFA6@oracle.com>
Message-ID: <A56EBDF3-B7D9-4184-BD4A-9B9731B43385@oracle.com>


> On Jan 11, 2016, at 12:43 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Hi,
> 
> Please review these changes for removing the mechanism in JVMCI for automating the generation of files in META-INF/services for service providers annotated with @ServiceProvider.

Did you try this with a regular JDK 9 build?  I don't think it works to have the same META-INF file in different locations:

src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.aarch64/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.amd64/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.sparc/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory

They will overwrite each other when being installed into the image.
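
A small simulation of what I mean, as a sketch only: the "image" is modelled as a map from relative path to file content, and the provider names (example.Aarch64Factory and so on) are made up for illustration. The merging variant is just one conceivable way around the problem, not something the build does today as far as I know.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: installing same-named service files into one image. */
public final class ServiceFileInstall {
    public static final String SERVICE =
        "META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory";

    private ServiceFileInstall() {
    }

    /** Naive copy: each file is {relativePath, content}; same path replaces earlier content. */
    public static Map<String, String> installOverwriting(String[][] files) {
        Map<String, String> image = new LinkedHashMap<>();
        for (String[] file : files) {
            image.put(file[0], file[1]);
        }
        return image;
    }

    /** Alternative: provider lines of same-named files are concatenated. */
    public static Map<String, String> installMerging(String[][] files) {
        Map<String, String> image = new LinkedHashMap<>();
        for (String[] file : files) {
            image.merge(file[0], file[1], (oldContent, newContent) -> oldContent + "\n" + newContent);
        }
        return image;
    }
}
```

With three source files for aarch64, amd64 and sparc, installOverwriting leaves only the last provider in the image, while installMerging keeps all three lines.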

> 
> https://bugs.openjdk.java.net/browse/JDK-8146364
> 
> http://cr.openjdk.java.net/~dnsimon/8146364/jdk9/ 
> http://cr.openjdk.java.net/~dnsimon/8146364/hotspot/
> 
> -Doug


From zoltan.majo at oracle.com  Tue Jan 12 07:56:08 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Tue, 12 Jan 2016 08:56:08 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5693AEC2.70409@redhat.com>
References: <568F9852.4090806@oracle.com> <56902043.1040409@oracle.com>
	<5690E392.9060704@redhat.com> <5693AB1B.7090909@oracle.com>
	<5693AEC2.70409@redhat.com>
Message-ID: <5694B198.8060902@oracle.com>

Hi,


On 01/11/2016 02:31 PM, Andrew Haley wrote:
> [...]
>>> I'm wondering
>>> if maybe we could have some sort of way to flag such changes for
>>> maintainers of those ports.  Otherwise it's just luck that I notice
>>> the bug going past.
>> Maybe we could define a new JIRA label for this purpose. What do you
>> think about that?
> That sounds like it might work.
>
>> Also, we might need a way to signal the need to propagate changes into
>> the opposite direction (i.e., from ppc/aarch64 to the other supported
>> platforms).
> Maybe so.  That hasn't happened yet, though.  The symmetry appeals to me.

OK, I'll look into this and inform you once I've figured out how to proceed.

Best wishes,


Zoltan

>
> Andrew.
>
>


From zoltan.majo at oracle.com  Tue Jan 12 08:00:34 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Tue, 12 Jan 2016 09:00:34 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <569401DE.8000105@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693B01D.60604@oracle.com>
	<569401DE.8000105@oracle.com>
Message-ID: <5694B2A2.7010905@oracle.com>

Hi Vladimir,


On 01/11/2016 08:26 PM, Vladimir Kozlov wrote:
> Don't use GC flags in the test. They will conflict with flags passed 
> by testing infra and the test will fail. There was a bug fixed by removing
> GC flags from all our tests.
> Note, Nightly testing does GC flags rotation so you don't need to do 
> that.

OK, I removed all GC flags from the test. Here is the updated webrev:
http://cr.openjdk.java.net/~zmajo/8086053/webrev.03/

> Otherwise looks good.

Thank you for the review! I'll push webrev.03 today as it addresses all 
issues that were brought up.

Thank you and best regards,


Zoltan

>
> Thanks,
> Vladimir
>
> On 1/11/16 5:37 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>>
>> On 01/11/2016 02:11 PM, Zoltán Majó wrote:
>>> [...]
>>> Yes, that is a good idea. I added a test that launches the VM with 
>>> all flag combinations and also with different GCs.
>>> I did the same thing the test does to reproduce the original failure.
>>>
>>> Here is the updated webrev:
>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>>
>> The test contains an unnecessary @library tag and package import.
>> The year in the copyright statement must be changed
>> as well (to 2016).
>>
>> Here is the webrev with those changes:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/
>>
>> Sorry for the noise.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>
>>>
>>> The newly added test passes on all supported platforms.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>>>>
>>>> Best,
>>>> Tobias
>>>>
>>>>
>>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>> please review the patch for 8086053.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>>
>>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill 
>>>>> newly allocated TLAB regions. With ZeroTLAB
>>>>> disabled, the interpreter and compiled code should assume the 
>>>>> responsibility to zero-fill newly allocated regions.
>>>>> Currently, the handling of the ZeroTLAB flag shows some 
>>>>> inconsistencies between the GC and the compilers. These
>>>>> inconsistencies lead to newly allocated regions not being filled 
>>>>> with zeros.
>>>>>
>>>>> Solution: Address the following:
>>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB 
>>>>> without notifying the GC. As a result, the newly
>>>>> allocated TLAB is not initialized with zero. Add TLAB 
>>>>> initialization code to C1.
>>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of 
>>>>> newly allocated objects/arrays even if TLAB
>>>>> allocation is disabled. Add stricter conditions to C2 on when to 
>>>>> skip filling objects/arrays with zero.
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>>
>>>>> Testing:
>>>>> - local testing (linux_x86_64) of failing test case with 
>>>>> -XX:+UseG1GC and -XX:+UseSerialGC;
>>>>> - JPRT;
>>>>> - all hotspot tests on all platforms affected by the change using 
>>>>> all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>>
>>>>> Thank you and best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>
>>


From tobias.hartmann at oracle.com  Tue Jan 12 08:20:26 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 12 Jan 2016 09:20:26 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5694B2A2.7010905@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693B01D.60604@oracle.com>
	<569401DE.8000105@oracle.com> <5694B2A2.7010905@oracle.com>
Message-ID: <5694B74A.20100@oracle.com>

Hi Zoltan,

looks good to me.

Best,
Tobias

On 12.01.2016 09:00, Zoltán Majó wrote:
> Hi Vladimir,
> 
> 
> On 01/11/2016 08:26 PM, Vladimir Kozlov wrote:
>> Don't use GC flags in the test. They will conflict with flags passed by testing infra and the test will fail. There was a bug fixed by removing GC flags from all our tests.
>> Note, Nightly testing does GC flags rotation so you don't need to do that.
> 
> OK, I removed all GC flags from the test. Here is the updated webrev:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.03/
> 
>> Otherwise looks good.
> 
> Thank you for the review! I'll push webrev.03 today as it addresses all issues that were brought up.
> 
> Thank you and best regards,
> 
> 
> Zoltan
> 
>>
>> Thanks,
>> Vladimir
>>
>> On 1/11/16 5:37 AM, Zoltán Majó wrote:
>>> Hi,
>>>
>>>
>>>
>>> On 01/11/2016 02:11 PM, Zoltán Majó wrote:
>>>> [...]
>>>> Yes, that is a good idea. I added a test that launches the VM with all flag combinations and also with different GCs.
>>>> I did the same as what the test does to reproduce the original failure.
>>>>
>>>> Here is the updated webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>>>
>>> The test contains an unnecessary @library tag and package import. The year in the copyright statement must be changed
>>> as well (to 2016).
>>>
>>> Here is the webrev with those changes:
>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/
>>>
>>> Sorry for the noise.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>>>
>>>>
>>>> The newly added test passes on all supported platforms.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>>
>>>>> Best,
>>>>> Tobias
>>>>>
>>>>>
>>>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>> please review the patch for 8086053.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>>>
>>>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB
>>>>>> disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>>>>>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These
>>>>>> inconsistencies lead to newly allocated regions not being filled with zeros.
>>>>>>
>>>>>> Solution: Address the following:
>>>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly
>>>>>> allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>>>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB
>>>>>> allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>>>
>>>>>> Testing:
>>>>>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>>>>>> - JPRT;
>>>>>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>>>
>>>>>> Thank you and best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>
>>>
> 

From zoltan.majo at oracle.com  Tue Jan 12 08:21:32 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Tue, 12 Jan 2016 09:21:32 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5694B74A.20100@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693B01D.60604@oracle.com>
	<569401DE.8000105@oracle.com> <5694B2A2.7010905@oracle.com>
	<5694B74A.20100@oracle.com>
Message-ID: <5694B78C.2030802@oracle.com>

On 01/12/2016 09:20 AM, Tobias Hartmann wrote:
> Hi Zoltan,
>
> looks good to me.

Thank you, Tobias!

Best regards,


Zoltan


>
> Best,
> Tobias
>
> On 12.01.2016 09:00, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> On 01/11/2016 08:26 PM, Vladimir Kozlov wrote:
>>> Don't use GC flags in the test. They will conflict with flags passed by testing infra and the test will fail. There was a bug fixed by removing GC flags from all our tests.
>>> Note, Nightly testing does GC flags rotation so you don't need to do that.
>> OK, I removed all GC flags from the test. Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.03/
>>
>>> Otherwise looks good.
>> Thank you for the review! I'll push webrev.03 today as it addresses all issues that were brought up.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/11/16 5:37 AM, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>>
>>>> On 01/11/2016 02:11 PM, Zoltán Majó wrote:
>>>>> [...]
>>>>> Yes, that is a good idea. I added a test that launches the VM with all flag combinations and also with different GCs.
>>>>> I did the same as what the test does to reproduce the original failure.
>>>>>
>>>>> Here is the updated webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>>>> The test contains an unnecessary @library tag and package import. The year in the copyright statement must be changed
>>>> as well (to 2016).
>>>>
>>>> Here is the webrev with those changes:
>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/
>>>>
>>>> Sorry for the noise.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>
>>>>> The newly added test passes on all supported platforms.
>>>>>
>>>>> Thank you and best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>> Best,
>>>>>> Tobias
>>>>>>
>>>>>>
>>>>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>> please review the patch for 8086053.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>>>>
>>>>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB
>>>>>>> disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>>>>>>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These
>>>>>>> inconsistencies lead to newly allocated regions not being filled with zeros.
>>>>>>>
>>>>>>> Solution: Address the following:
>>>>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly
>>>>>>> allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>>>>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB
>>>>>>> allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>>>>
>>>>>>> Testing:
>>>>>>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>>>>>>> - JPRT;
>>>>>>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>>>>
>>>>>>> Thank you and best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>


From tobias.hartmann at oracle.com  Tue Jan 12 08:39:02 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 12 Jan 2016 09:39:02 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <569402C9.5060305@oracle.com>
References: <568EB3A0.3040909@oracle.com> <56937F1F.7010709@oracle.com>
	<569402C9.5060305@oracle.com>
Message-ID: <5694BBA6.4040301@oracle.com>

I had to merge again with JDK-8139771 (castnode.cpp/hpp):
http://cr.openjdk.java.net/~thartmann/8146629/webrev.02/

Thanks,
Tobias

On 11.01.2016 20:30, Vladimir Kozlov wrote:
> Sounds good.
> 
> Thanks,
> Vladimir
> 
> On 1/11/16 2:08 AM, Tobias Hartmann wrote:
>> FYI, I had to merge with JDK-8143353 [1] (CosDNode and SinDNode were removed).
>>
>> This is the change I intend to push:
>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.01/
>>
>> Thanks,
>> Tobias
>>
>> [1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/13b04370e8e9
>>
>> On 07.01.2016 19:51, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8146629
>>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.00/
>>>
>>> Currently, there is no way to determine in Node::Identity() and Node::Value() if we were called from GVN or IGVN but sometimes we would like to do optimizations based on this information (for example, see discussion in RFR for JDK-8136469 [1]). I changed the arguments of Node::Identity() and Node::Value() from PhaseTransform* to PhaseGVN*. Like this, we can simply call PhaseValues::is_IterGVN() from both methods.
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/020670.html
>>>

From john.r.rose at oracle.com  Tue Jan 12 08:45:16 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 12 Jan 2016 00:45:16 -0800
Subject: RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
In-Reply-To: <FD41E4F1-ABC6-49D4-9CA3-B9972C56C662@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F389DA0DB1@ORSMSX116.amr.corp.intel.com>
	<566228AD.6060704@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E70295@ORSMSX116.amr.corp.intel.com>
	<567C8F5C.204@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E7642F@ORSMSX116.amr.corp.intel.com>
	<5682486D.4030402@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F389E766CF@ORSMSX116.amr.corp.intel.com>
	<758D9731-2548-4370-A6AA-7CCA2FF671EC@oracle.com>
	<0C5AB04C-125E-41A2-8761-A5C3025783E7@oracle.com>
	<DD4705E1-1E04-413E-8A8C-C5C698807C85@oracle.com>
	<568B9188.6000506@redhat.com>
	<C1BAA75A-E4E1-4598-8403-6D4F56E6B555@oracle.com>
	<568CEF5B.5060306@redhat.com>
	<CAHjP37G1jVB1AtL5v1EHitXLEbxN+F1nMGDQosBZwS9wR8879g@mail.gmail.com>
	<86663D10-D257-44D1-AFDE-BD484AE439A8@oracle.com>
	<CAHjP37G35=UTkirFhPTG0XbdOjMRZ5fodAjPw22BW0sAupA=rw@mail.gmail.com>
	<3746840B-2F8D-42A1-B81F-02A0DF4A1D11@oracle.com>
	<CAHjP37FmTLYsaa5qhM8UOqTK8xxi6BjD4aXfLdv7jzURwTkryw@mail.gmail.com>
	<568D7FA1.4040707@oracle.com>
	<CAHjP37Fc7wAj24b5np5Yh9m6OKDmoh3-_TvrEtja=UMOsr73bg@mail.gmail.com>
	<1BC8C0B0-E8EF-4D6B-B9EE-D374E2FC3E04@oracle.com>
	<CAHjP37GKBXBk1e0NSByAxxm8A04i6=TD2uzSiy3KyTNDuCxgEA@mail.gmail.com>
	<FD41E4F1-ABC6-49D4-9CA3-B9972C56C662@oracle.com>
Message-ID: <5AE9C5CC-4ACE-4140-B044-CDDDAE2D9C9B@oracle.com>

On Jan 11, 2016, at 7:54 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
> 
>> As a general comment, would it make sense to assume exceptional paths are not taken in most Java code? That is, for code optimization purposes it's probably a reasonable assumption.  It seems like having an exceptional path is already a hint that it's not expected to fail; most Java devs know not to use exceptions for expected control flow.
> 
> That sounds reasonable. There's a BailoutToInterpreterForThrows command line argument that does that (off by default, not available in product builds). I don't know what the history behind it is.

It is reasonable in *most* code, which means you have to be ready to run into a bit of code which misbehaves, and mark the profile so you treat that bit of code specially.  Key example:  Null checks are *mostly* uncommon, so we work hard to turn them into implicit (trap-bearing) instructions.  But those which are not get rewritten differently, after the trap happens, using the profile marks.  In other words, we speculate that a null check that throws uncommonly finds a null (pending evidence to the contrary).

Hence the subtle interactions with profiles, including the profile inside a bytecoded method like checkIndex.  With an intrinsic, you can say (just for that intrinsic), "ignore special-case marks in my profile".  We don't have a good-enough heuristic (in the absence of split profiles) for detecting which normal methods can be treated this way.

>> Could bytecode shape just like checkIndex be treated as same hint? Are there cases where something looks like checkIndex but really isn't? 
> 
> That sounds like a good suggestion. We would trade:
> 2 comparisons: i < 0 || i >= length
> for
> 2 comparisons: length < 0 || i >=u length 
> 
> so even if it doesn't result in further improvements, we wouldn't lose anything.

In the small scale you don't lose anything, but (as I said earlier) in the IR graph at scale you lose the opportunity to common up certain expressions.
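The trade Roland describes above can be sketched in plain Java. This is only an illustration of why the two forms accept and reject the same inputs; the class and method names are hypothetical, and `Integer.compareUnsigned` stands in for the `>=u` comparison that C2 would emit in its IR. It says nothing about how commoning behaves in the IR graph at scale.

```java
// Hypothetical sketch: the signed and unsigned range-check forms from the thread.
class RangeCheckForms {
    // Form typically written in Java source: two signed comparisons on the index.
    static boolean outOfBoundsSigned(int i, int length) {
        return i < 0 || i >= length;
    }

    // Proposed form: one signed check on the length, one unsigned comparison.
    // A negative i becomes a very large unsigned value, so it fails "i <u length".
    static boolean outOfBoundsUnsigned(int i, int length) {
        return length < 0 || Integer.compareUnsigned(i, length) >= 0;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 9, 10, Integer.MAX_VALUE};
        for (int i : samples) {
            for (int length : samples) {
                if (outOfBoundsSigned(i, length) != outOfBoundsUnsigned(i, length)) {
                    throw new AssertionError("forms disagree for i=" + i + ", length=" + length);
                }
            }
        }
        System.out.println("forms agree on all samples");
    }
}
```

When the length is known to be non-negative (as for an array length), the `length < 0` test folds away and a single unsigned comparison remains.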

Since Java expressions don't have unsigned comparison operators (and programmers wouldn't use them consistently for index checking, even if they were available), the comparison expressions available in the IR for commoning are signed, except for those which have been converted to unsigned.  Converting speculatively from signed to unsigned (as suggested above) would seem to be harmless, but unless it is somehow limited to comparisons that *all* go unsigned, you could get a mix of signed and unsigned versions of the same logic, which would (worst case) double the number of tests in the object code.  Using an intrinsic for range checks (which is the current case with aaload and will also be the case with checkIndex) allows us to reduce the number of unsigned comparisons to those which actually .

I hope everybody understands that I am not arguing *against* strong IR normalization and automatic detection of idioms, but rather observing that, powerful as those desirable techniques are, they are not infallible, and sometimes benefit from user-driven help via explicit operators, like checkIndex.  Of course we want to fold user code which "works just like" checkIndex or the aaload check into the same good IR.  But we don't want to rely on this auto-detection always, and we want to tread carefully when balancing the various IR normalization rules, which may work either for or against the use case of checkIndex detection.

Open question:  Given a "sea of nodes" encoding various configurations of signed and unsigned comparisons, how do we normalize them so that (a) we maximize commoning (U with U and S with S), and (b) we end up with all available clever uses of U mode to fold <=0 checks?  Or, more to the point how do we arrange these choices so (c) the dynamic number of comparisons (of either mode) is minimized?  Given that (a) and (b) can sometimes work against each other, what's a good heuristic for binning comparisons into U and S categories, for subsequent CSE?  (So, checkIndex is a hint for binning.)

Personal background:  About 10 years ago I worked on opto/subnode.cpp to try to switch between S and U modes more vigorously, implementing something probably related to what Vitaly is advocating.  I ran into the (a) vs. (b) tradeoff, especially trying to preserve the aggressive matching of dominating tests in opto/ifnode.cpp.  I think it can be made better than it is today, but the details are very tricky.  Conjecture:  It might help if the TypeInt lattice could encode ranges in the uint32 space, just as (today) it encodes ranges in int32, since some of the heuristics are type-driven.

- John

From roland.westrelin at oracle.com  Tue Jan 12 08:51:37 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 12 Jan 2016 09:51:37 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <5694BBA6.4040301@oracle.com>
References: <568EB3A0.3040909@oracle.com> <56937F1F.7010709@oracle.com>
	<569402C9.5060305@oracle.com> <5694BBA6.4040301@oracle.com>
Message-ID: <A819C8B7-D024-407E-A0D3-2EF1F092A41D@oracle.com>

> I had to merge again with JDK-8139771 (castnode.cpp/hpp):
> http://cr.openjdk.java.net/~thartmann/8146629/webrev.02/

Looks good to me.

Roland.

From tobias.hartmann at oracle.com  Tue Jan 12 09:02:38 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 12 Jan 2016 10:02:38 +0100
Subject: [9] RFR(M): 8146629: Make phase->is_IterGVN() accessible from
	Node::Identity and Node::Value
In-Reply-To: <A819C8B7-D024-407E-A0D3-2EF1F092A41D@oracle.com>
References: <568EB3A0.3040909@oracle.com> <56937F1F.7010709@oracle.com>
	<569402C9.5060305@oracle.com> <5694BBA6.4040301@oracle.com>
	<A819C8B7-D024-407E-A0D3-2EF1F092A41D@oracle.com>
Message-ID: <5694C12E.6040809@oracle.com>

Thanks, Roland.

And sorry for the noise.

Best,
Tobias

On 12.01.2016 09:51, Roland Westrelin wrote:
>> I had to merge again with JDK-8139771 (castnode.cpp/hpp):
>> http://cr.openjdk.java.net/~thartmann/8146629/webrev.02/
> 
> Looks good to me.
> 
> Roland.
> 

From doug.simon at oracle.com  Tue Jan 12 09:39:55 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 12 Jan 2016 10:39:55 +0100
Subject: RFR: 8146364: Remove @ServiceProvider mechanism from JVMCI
In-Reply-To: <A56EBDF3-B7D9-4184-BD4A-9B9731B43385@oracle.com>
References: <FEDF1809-3009-4CD4-84B8-A87DDAC1CFA6@oracle.com>
	<A56EBDF3-B7D9-4184-BD4A-9B9731B43385@oracle.com>
Message-ID: <DAF7D57C-F7D5-4973-98CA-DA83F838777C@oracle.com>

Doh! This was a result of pilot error when transplanting patches from graal-jvmci-9. I left out:

http://hg.openjdk.java.net/graal/graal-jvmci-9/hotspot/rev/2390bc159b77

The behavior I saw was not that the META-INF files overwrote each other. Instead, they were ignored completely.
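For readers unfamiliar with the lookup side of this mechanism, here is a minimal sketch of how providers listed under META-INF/services are discovered with `java.util.ServiceLoader`. The `BackendFactory` interface and the class name are hypothetical stand-ins (they are not the JVMCI types), and the sketch assumes it runs from the class path with no provider-configuration file for that interface present.

```java
import java.util.ServiceLoader;

// Hypothetical sketch of service lookup; not the JVMCI code.
class ServiceLookupSketch {
    // Stand-in for a service interface such as HotSpotJVMCIBackendFactory.
    interface BackendFactory {
        String architecture();
    }

    // Counts the providers that ServiceLoader can find for BackendFactory.
    // Providers are found only if a file named after the interface's binary
    // name exists under META-INF/services and lists implementation classes.
    static int countProviders() {
        int count = 0;
        for (BackendFactory factory : ServiceLoader.load(BackendFactory.class)) {
            System.out.println("found backend for " + factory.architecture());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("providers found: " + countProviders());
    }
}
```

If the configuration files are missing from the image or not picked up, the loop body simply never runs, which matches the "ignored completely" symptom rather than an error.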

I've updated http://cr.openjdk.java.net/~dnsimon/8146364/hotspot/ now.

-Doug

> On 12 Jan 2016, at 02:35, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> 
>> On Jan 11, 2016, at 12:43 PM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> Hi,
>> 
>> Please review these changes for removing the mechanism in JVMCI for automating the generation of files in META-INF/services for service providers annotated with @ServiceProvider.
> 
> Did you try this with a regular JDK 9 build?  I don't think it works to have the same META-INF file in different locations:
> 
> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.aarch64/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.amd64/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.sparc/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
> 
> They will overwrite each other when being installed into the image.
> 
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8146364
>> 
>> http://cr.openjdk.java.net/~dnsimon/8146364/jdk9/ 
>> http://cr.openjdk.java.net/~dnsimon/8146364/hotspot/
>> 
>> -Doug
> 


From roland.westrelin at oracle.com  Tue Jan 12 10:07:50 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 12 Jan 2016 11:07:50 +0100
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <56944C45.6060307@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
	<56944C45.6060307@oracle.com>
Message-ID: <37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>

Hi Vladimir,

Thanks for looking at this.

> Now I think I understand.
> Note, there should NOT be any control between the loop's head and the predicate check. I assume CastPP is attached to it because its original check was removed by a dominating similar check (for example NULL check).
> 
> I think it is safe to move CastPP above the original dummy predicate checks (one or two checks if there are loop limit checks) since CastPP should not depend on them. It will solve the problem since the moved check (new predicate) is always inserted before the original dummy predicate (which will be removed later).

I first saw that problem with a CastPP but in the test case that I wrote for that bug, the pinned node is not a CastPP, it's a StoreF. The predicate that is moved above tests a LoadF value that is memory dependent on the StoreF. I don't see any reason the same problem couldn't be reproduced with any data node.

With the test case, it would be safe to move the StoreF above the predicates I think. But in the general case, I don't see how we can be sure that we don't have:

- null check/range check for the StoreF moved out of loops as predicates
- partial peel that causes the StoreF to be pinned below the predicates
- loop predication that moves some data node that depends on the StoreF above it

Roland.


> 
> Thanks,
> Vladimir
> 
> On 1/11/16 7:07 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8146792/webrev.00/
>> 
>> - partial peeling is applied to a loop
>> - the peeled section is optimized and leaves a pinned node between the loop predicates and the loop body but no control flow
>> - loop predicates are applied and a predicate that depends on the pinned node is moved out of the loop, before the pinned node, leading to a broken graph
>> 
>> This is the same issue that came up during review of 8139771. Vladimir suggested it gets reviewed separately. With the included test case it reproduces without the change from 8139771.
>> 
>> Roland.
>> 


From roland.westrelin at oracle.com  Tue Jan 12 10:17:16 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 12 Jan 2016 11:17:16 +0100
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
	<56944C45.6060307@oracle.com>
	<37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>
Message-ID: <78074704-CE75-4F50-9F53-22FEC75E836E@oracle.com>

> With the test case, it would be safe to move the StoreF above the predicates I think. But in the general case, I don't see how we can be sure that we don't have:
> 
> - null check/range check for the StoreF moved out of loops as predicates
> - partial peel that causes the StoreF to be pinned below the predicates
> - loop predication that moves some data node that depends on the StoreF above it

Actually, I can reproduce this scenario with the patch below: some changes to the test and making range check smearing a little bit more aggressive so a range check is replaced by a dominating predicate range check.

Roland.

diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
--- a/src/share/vm/opto/ifnode.cpp
+++ b/src/share/vm/opto/ifnode.cpp
@@ -514,7 +514,7 @@
   // along the OOB path.  Otherwise, it's possible that the user wrote
   // something which optimized to look like a range check but behaves
   // in some other way.
-  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_range_check) == NULL) {
+  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_none) == NULL) {
     return 0;
   }
 
diff --git a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
--- a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
+++ b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
@@ -30,6 +30,8 @@
  *
  */
 
+import java.util.Objects;
+
 public class BadPredicateAfterPartialPeel {
 
     static void not_inlined1() {}
@@ -45,13 +47,13 @@
     boolean flag;
     int j;
 
-    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) {
+    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) throws Exception {
         int i1 = 1;
 
         // To delay partial peeling to the loop opts pass right before CCP
-        int i2 = 0;
-        for (; i2 < 10; i2 += i1);
-        i2 = i2 / 10;
+        int i2 = 1;
+        // for (; i2 < 10; i2 += i1);
+        // i2 = i2 / 10;
         
         // Simplified during CCP:
         int i3 = 2;
@@ -63,11 +65,12 @@
 
         not_inlined1();
 
-        array[0] = -1;
         do {
             // peeled section starts here
             o.flag = false;
             o.j = 0;
+            
+            Objects.checkIndex(0, array.length, null);
 
             if (b) {
                 // The following store will be pinned between
@@ -300,7 +303,7 @@
         not_inlined4();
     }
 
-    static public void main(String[] args) {
+    static public void main(String[] args) throws Exception {
         BadPredicateAfterPartialPeel o1 = new BadPredicateAfterPartialPeel();
         BadPredicateAfterPartialPeel o2 = new BadPredicateAfterPartialPeel();
         for (int i = 0; i < 20000; i++) {


From tobias.hartmann at oracle.com  Tue Jan 12 13:59:38 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 12 Jan 2016 14:59:38 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <569409C5.2040805@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com>
Message-ID: <569506CA.8040001@oracle.com>

On 11.01.2016 21:00, Vladimir Kozlov wrote:
> On 1/11/16 7:20 AM, Tobias Hartmann wrote:
>> On 08.01.2016 20:41, Vladimir Kozlov wrote:
>>> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>>>> Andrew is right.
>>>>>>
>>>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>>>
>>>>> Right. It was the root of this bug, see below.
>>>>>
>>>>>>
>>>>>> I fixed this for the inflate and compress intrinsics.
>>>>>>
>>>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads from moving up.
>>>>>>> StrInflatedCopyNode is not a memory node.
>>>>>>
>>>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>>>
>>>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>>>
>>>> I meant with webrev.01 but you answered my question below.
>>>>
>>>>>>     // This class defines a projection of the memory  state of a store conditional node.
>>>>>>     // These nodes return a value, but also update memory.
>>>>>>
>>>>>> But inflate does not return any value.
>>>>>
>>>>> Hmm, according to its bottom type, inflate produces memory:
>>>>>
>>>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>>>
>>>>> So it really does not need SCMemProjNode. Sorry about that.
>>>>> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.
>>>>
>>>> Exactly.
>>>>
>>>>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>>>>
>>>>> set_memory(str, dst_type);
>>>>
>>>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>>>
>>>>> And you should rollback part of changes in escape.cpp and macro.cpp.
>>>>
>>>> Okay, I'll do that.
>>>>
>>>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>>>
>>>>> In general, when src & dst arrays have different types we may need to use TypeOopPtr::BOTTOM to prevent related stores & loads from bypassing these copy nodes.
>>>>
>>>> Okay, should we then use BOTTOM for both the input and output type?
>>>
>>> Only input. Output type corresponds to dst array type which you set correctly now.
>>
>> It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
>>   StoreC
>>   inflate_string
>>   LoadC
>>
>> The memory graph (def->use) now looks like this:
>>   LoadC -> inflate_string -> ByteMem
>>                 ... StoreC-> CharMem
> 
> I did not get this. If the StoreC node is created before inflate_string, inflate_string should point to it and be a barrier for LoadC.

Note that the StoreC and inflate_string are *not* writing to the same char[] array. The test looks like this:

 char c1[] = new char[1];
 char c2[] = new char[1];

 c2[0] = 42;
 // Inflate String from byte[] to char[]
 s.getChars(0, 1, c1, 0);
 // Read char[] memory written before inflation
 return c2[0];

The result should be 42. The problem is that inflate_string does not point to StoreC because inflate_string uses a byte[] as input and in this case also writes to a different char[]. Even if we set the input to BOTTOM, inflate_string points to 7 Parm (BOTTOM) but not to the char[] memory produced by 96 StoreC:
http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png

349 LoadUS then reads from the output char[] memory of inflate_string which does not include the result of StoreC. The test fails because the return value is != 42.
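The fragment above can be wrapped into a self-contained program roughly as follows. This is a sketch, not the actual regression test from the webrev: the class and method names are hypothetical, and the iteration count of 20,000 is an assumption chosen only to make JIT compilation likely. Whether C2 compiles the method, and whether the intrinsic is used, depends on the VM and its flags.

```java
// Hypothetical stand-alone variant of the store / inflate / load pattern.
class InflateMemoryFlow {
    // Stores to c2, copies a String into the unrelated array c1, then reads c2 back.
    // On a correct VM the copy must not hide the earlier store to c2.
    static char storeInflateLoad(String s) {
        char[] c1 = new char[1];
        char[] c2 = new char[1];

        c2[0] = 42;
        // With compact strings, a Latin-1 String is inflated from byte[] to char[] here
        s.getChars(0, 1, c1, 0);
        // Read char[] memory written before the inflation
        return c2[0];
    }

    public static void main(String[] args) {
        // Repeat the call so the method becomes hot enough to be compiled
        for (int i = 0; i < 20_000; i++) {
            char result = storeInflateLoad("a");
            if (result != 42) {
                throw new AssertionError("expected 42 but got " + (int) result);
            }
        }
        System.out.println("ok");
    }
}
```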

My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.

>  If StoreC followed inflate_string and LoadC followed StoreC - LoadC should point to StoreC. If LoadC does not follow StoreC then result is relaxed.

Yes, these cases work fine.

Thanks,
Tobias

>> The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
>>
>> Setting the input to BOTTOM, generates the following graph:
>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>> The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.
>>
>> I adapted the fix to emit a MergeMemoryNode to capture the entire memory state as input to the intrinsic. The graph then looks like this:
>>   LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/
>> Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?
>>
>> Best,
>> Tobias
>>
>>>>>> Related question:
>>>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>>>     char[int:>=0]:exact+any *
>>>>>>
>>>>>> which is equal to the type of the char load.
>>>>>
>>>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>>>
>>>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>>>
>>>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>>>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>
>>>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>>>
>>>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>>>
>>>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>>>
>>>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>>>     char[int:1]:NotNull:exact *
>>>>>>
>>>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>>>
>>>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>>>
>>>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>>>
>>>> I will look into this again and try to understand what happens.
>>>
>>> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>>>
>>>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>>>
>>>>>>>>>      // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>>>      // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>>>      // compiler because the CPU does all the ordering for us.
>>>>>>>>>
>>>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>>>
>>>>>>>>> Or did I misunderstand your question?
>>>>>>>>
>>>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>>>> read-after-write dependency, but never mind.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Andrew.
>>>>>>>>

From edward.nevill at gmail.com  Tue Jan 12 14:01:01 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Tue, 12 Jan 2016 14:01:01 +0000
Subject: RFR: 8146886: aarch64: fails to build following 8136525 and 8139864
Message-ID: <1452607261.30600.5.camel@mylittlepony.linaroharston>

Hi,

The following webrev fixes several build failures in aarch64 following recent merges.

Webrev: http://cr.openjdk.java.net/~enevill/8146886/webrev/
Jira: https://bugs.openjdk.java.net/browse/JDK-8146886

Testing by building release and fastdebug versions and by running jtreg hotspot and langtools.

Jtreg hotspot: Test results: passed: 1,068; failed: 16; error: 15
JTreg langtools: Test results: passed: 3,358; failed: 1; error: 4

OK to push?
Ed.


From vladimir.x.ivanov at oracle.com  Tue Jan 12 15:22:38 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 12 Jan 2016 18:22:38 +0300
Subject: [9] RFR (S): 8140001: _allocateInstance intrinsic does not throw
	InstantiationException for abstract classes and interfaces
Message-ID: <56951A3E.7070805@oracle.com>

http://cr.openjdk.java.net/~vlivanov/8140001/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8140001

EA can eliminate allocations of abstract classes or interfaces, thus 
changing observable behavior of a program as the test case demonstrates.

The fix is to always mark such allocations as escaping.

Testing: failing test, JPRT.

Thanks!

Best regards,
Vladimir Ivanov
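
For illustration, a minimal sketch of the kind of check such a test case could make. It assumes the _allocateInstance intrinsic corresponds to sun.misc.Unsafe.allocateInstance(); the names AllocateAbstractDemo, Shape and Marker are hypothetical and not taken from the webrev:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class AllocateAbstractDemo {
    // Hypothetical types that must never be instantiated.
    static abstract class Shape {}
    interface Marker {}

    // Returns true if Unsafe.allocateInstance(type) throws InstantiationException.
    static boolean throwsInstantiationException(Class<?> type) {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);
            // The result is deliberately unused, which is what makes the
            // allocation a candidate for elimination in compiled code.
            unsafe.allocateInstance(type);
            return false;
        } catch (InstantiationException e) {
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not obtain Unsafe", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("abstract class: " + throwsInstantiationException(Shape.class));
        System.out.println("interface:      " + throwsInstantiationException(Marker.class));
    }
}
```

A single call like this runs in the interpreter. Since the report concerns escape analysis, a real regression test would presumably call the method often enough to get it compiled by C2.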

From vladimir.kozlov at oracle.com  Tue Jan 12 17:00:33 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 12 Jan 2016 09:00:33 -0800
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <5694B2A2.7010905@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693B01D.60604@oracle.com>
	<569401DE.8000105@oracle.com> <5694B2A2.7010905@oracle.com>
Message-ID: <56953131.7070408@oracle.com>

Looks good. The test may still have a problem on slow platforms with -Xcomp. We may need to increase timeout later.

Thanks,
Vladimir

On 1/12/16 12:00 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> On 01/11/2016 08:26 PM, Vladimir Kozlov wrote:
>> Don't use GC flags in the test. They will conflict with flags passed by testing infra and the test will fail. There was
>> a bug fixed by removing GC flags from all our tests.
>> Note, Nightly testing does GC flags rotation so you don't need to do that.
>
> OK, I removed all GC flags from the test. Here is the updated webrev:
> http://cr.openjdk.java.net/~zmajo/8086053/webrev.03/
>
>> Otherwise looks good.
>
> Thank you for the review! I'll push webrev.03 today as it addresses all issues that were brought up.
>
> Thank you and best regards,
>
>
> Zoltan
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/11/16 5:37 AM, Zoltán Majó wrote:
>>> Hi,
>>>
>>>
>>>
>>> On 01/11/2016 02:11 PM, Zoltán Majó wrote:
>>>> [...]
>>>> Yes, that is a good idea. I added a test that launches the VM with all flag combinations and also with different GCs.
>>>> I did the same thing the test does to reproduce the original failure.
>>>>
>>>> Here is the updated webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>>>
>>> The test contains an unnecessary @library tag and package import. The year in the copyright statement must be changed
>>> as well (to 2016).
>>>
>>> Here is the webrev with those changes:
>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/
>>>
>>> Sorry for the noise.
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>>>
>>>>
>>>> The newly added test passes on all supported platforms.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>>
>>>>> Best,
>>>>> Tobias
>>>>>
>>>>>
>>>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>> please review the patch for 8086053.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>>>
>>>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill newly allocated TLAB regions. With ZeroTLAB
>>>>>> disabled, the interpreter and compiled code should assume the responsibility to zero-fill newly allocated regions.
>>>>>> Currently, the handling of the ZeroTLAB flag shows some inconsistencies between the GC and the compilers. These
>>>>>> inconsistencies lead to newly allocated regions not being filled with zeros.
>>>>>>
>>>>>> Solution: Address the following:
>>>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB without notifying the GC. As a result, the newly
>>>>>> allocated TLAB is not initialized with zero. Add TLAB initialization code to C1.
>>>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization of newly allocated objects/arrays even if TLAB
>>>>>> allocation is disabled. Add stricter conditions to C2 on when to skip filling objects/arrays with zero.
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>>>
>>>>>> Testing:
>>>>>> - local testing (linux_x86_64) of failing test case with -XX:+UseG1GC and -XX:+UseSerialGC;
>>>>>> - JPRT;
>>>>>> - all hotspot tests on all platforms affected by the change using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>>>
>>>>>> Thank you and best regards,
>>>>>>
>>>>>>
>>>>>> Zoltan
>>>>>>
>>>>
>>>
>
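
As an illustration of the "all combinations of +/-UseTLAB and +/-ZeroTLAB" testing mentioned above, here is a minimal sketch that enumerates the four flag combinations. The class and method names are hypothetical and are not taken from the test in the webrev; only the two flag names come from the message:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TlabFlagCombos {
    // Formats a boolean HotSpot flag, e.g. flag("ZeroTLAB", false) gives "-XX:-ZeroTLAB".
    static String flag(String name, boolean enabled) {
        return "-XX:" + (enabled ? "+" : "-") + name;
    }

    // Returns every combination of +/-UseTLAB and +/-ZeroTLAB.
    static List<List<String>> combinations() {
        List<List<String>> result = new ArrayList<>();
        for (boolean useTlab : new boolean[] {true, false}) {
            for (boolean zeroTlab : new boolean[] {true, false}) {
                result.add(Arrays.asList(flag("UseTLAB", useTlab),
                                         flag("ZeroTLAB", zeroTlab)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (List<String> flags : combinations()) {
            System.out.println(String.join(" ", flags));
        }
    }
}
```

Following the review comment above, no GC flags are included; each combination would simply be added to the command line of the launched VM.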

From zoltan.majo at oracle.com  Tue Jan 12 17:08:14 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Tue, 12 Jan 2016 18:08:14 +0100
Subject: [9] RFR (M): 8086053: Address inconsistencies regarding ZeroTLAB
In-Reply-To: <56953131.7070408@oracle.com>
References: <568F9852.4090806@oracle.com> <56936C68.70002@oracle.com>
	<5693A9E7.3040700@oracle.com> <5693B01D.60604@oracle.com>
	<569401DE.8000105@oracle.com> <5694B2A2.7010905@oracle.com>
	<56953131.7070408@oracle.com>
Message-ID: <569532FE.2080107@oracle.com>

Hi Vladimir,


On 01/12/2016 06:00 PM, Vladimir Kozlov wrote:
> Looks good. The test may still have a problem on slow platforms with 
> -Xcomp. We may need to increase timeout later.

thank you for pointing that out! I'll keep an eye on the test.

Best regards,


Zoltan

>
> Thanks,
> Vladimir
>
> On 1/12/16 12:00 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> On 01/11/2016 08:26 PM, Vladimir Kozlov wrote:
>>> Don't use GC flags in the test. They will conflict with flags passed 
>>> by testing infra and the test will fail. There was
>>> a bug fixed by removing GC flags from all our tests.
>>> Note, Nightly testing does GC flags rotation so you don't need to do 
>>> that.
>>
>> OK, I removed all GC flags from the test. Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.03/
>>
>>> Otherwise looks good.
>>
>> Thank you for the review! I'll push webrev.03 today as it addresses 
>> all issues that were brought up.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/11/16 5:37 AM, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>>
>>>> On 01/11/2016 02:11 PM, Zoltán Majó wrote:
>>>>> [...]
>>>>> Yes, that is a good idea. I added a test that launches the VM with 
>>>>> all flag combinations and also with different GCs.
>>>>> I did the same thing the test does to reproduce the original failure.
>>>>>
>>>>> Here is the updated webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.01/
>>>>
>>>> The test contains an unnecessary @library tag and package import. 
>>>> The year in the copyright statement must be changed
>>>> as well (to 2016).
>>>>
>>>> Here is the webrev with those changes:
>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.02/
>>>>
>>>> Sorry for the noise.
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>>
>>>>>
>>>>> The newly added test passes on all supported platforms.
>>>>>
>>>>> Thank you and best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>>
>>>>>> Best,
>>>>>> Tobias
>>>>>>
>>>>>>
>>>>>> On 08.01.2016 12:06, Zoltán Majó wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>> please review the patch for 8086053.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8086053
>>>>>>>
>>>>>>> Problem: With ZeroTLAB enabled, the GC is supposed to zero-fill 
>>>>>>> newly allocated TLAB regions. With ZeroTLAB
>>>>>>> disabled, the interpreter and compiled code should assume the 
>>>>>>> responsibility to zero-fill newly allocated regions.
>>>>>>> Currently, the handling of the ZeroTLAB flag shows some 
>>>>>>> inconsistencies between the GC and the compilers. These
>>>>>>> inconsistencies lead to newly allocated regions not being filled 
>>>>>>> with zeros.
>>>>>>>
>>>>>>> Solution: Address the following:
>>>>>>> - With -XX:+FastTLABRefill, C1-compiled code refills the TLAB 
>>>>>>> without notifying the GC. As a result, the newly
>>>>>>> allocated TLAB is not initialized with zero. Add TLAB 
>>>>>>> initialization code to C1.
>>>>>>> - With -XX:+ZeroTLAB, the C2 compiler skips zero-initialization 
>>>>>>> of newly allocated objects/arrays even if TLAB
>>>>>>> allocation is disabled. Add stricter conditions to C2 on when to 
>>>>>>> skip filling objects/arrays with zero.
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~zmajo/8086053/webrev.00/
>>>>>>>
>>>>>>> Testing:
>>>>>>> - local testing (linux_x86_64) of failing test case with 
>>>>>>> -XX:+UseG1GC and -XX:+UseSerialGC;
>>>>>>> - JPRT;
>>>>>>> - all hotspot tests on all platforms affected by the change 
>>>>>>> using all combinations of +/-UseTLAB and +/-ZeroTLAB.
>>>>>>>
>>>>>>> Thank you and best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>
>>>>>
>>>>
>>


From vladimir.x.ivanov at oracle.com  Tue Jan 12 18:41:24 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 12 Jan 2016 21:41:24 +0300
Subject: [9] RFR (XS): 6985422: flush the output streams before OnError
	commands
Message-ID: <569548D4.2070707@oracle.com>

http://cr.openjdk.java.net/~vlivanov/6985422/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-6985422

OnError commands are executed before hotspot log is finished.

The fix is to finish the log before executing OnError commands.

Also, I moved compilation replay data dumping logic before OnError 
processing, so compilation replay file is accessible from OnError 
commands as well.

I verified the fix by triggering VM crash w/ -XX:+LogCompilation 
-XX:LogFile=hotspot.log -XX:OnError='cp hotspot.log hs.log' flags and 
checking that hs.log is complete. Without the fix the log is corrupted.

Testing: manual, JPRT.

Thanks!

Best regards,
Vladimir Ivanov

From vladimir.kozlov at oracle.com  Tue Jan 12 19:13:35 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 12 Jan 2016 11:13:35 -0800
Subject: [9] RFR (XS): 6985422: flush the output streams before OnError
	commands
In-Reply-To: <569548D4.2070707@oracle.com>
References: <569548D4.2070707@oracle.com>
Message-ID: <5695505F.7050005@oracle.com>

Looks good.

Vladimir K

On 1/12/16 10:41 AM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/6985422/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-6985422
>
> OnError commands are executed before hotspot log is finished.
>
> The fix is to finish the log before executing OnError commands.
>
> Also, I moved compilation replay data dumping logic before OnError processing, so compilation replay file is accessible
> from OnError commands as well.
>
> I verified the fix by triggering VM crash w/ -XX:+LogCompilation -XX:LogFile=hotspot.log -XX:OnError='cp hotspot.log
> hs.log' flags and checking that hs.log is complete. Without the fix the log is corrupted.
>
> Testing: manual, JPRT.
>
> Thanks!
>
> Best regards,
> Vladimir Ivanov

From vladimir.kozlov at oracle.com  Tue Jan 12 19:24:30 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 12 Jan 2016 11:24:30 -0800
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <569506CA.8040001@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com> <569506CA.8040001@oracle.com>
Message-ID: <569552EE.8050809@oracle.com>

 > My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.

Yes, that is the right solution here.

Thanks,
Vladimir

On 1/12/16 5:59 AM, Tobias Hartmann wrote:
> On 11.01.2016 21:00, Vladimir Kozlov wrote:
>> On 1/11/16 7:20 AM, Tobias Hartmann wrote:
>>> On 08.01.2016 20:41, Vladimir Kozlov wrote:
>>>> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>>>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>>>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>>>>> Andrew is right.
>>>>>>>
>>>>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>>>>
>>>>>> Right. It was the root of this bug, see below.
>>>>>>
>>>>>>>
>>>>>>> I fixed this for the inflate and compress intrinsics.
>>>>>>>
>>>>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>>>>>> StrInflatedCopyNode is not memory node.
>>>>>>>
>>>>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>>>>
>>>>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>>>>
>>>>> I meant with webrev.01 but you answered my question below.
>>>>>
>>>>>>>      // This class defines a projection of the memory  state of a store conditional node.
>>>>>>>      // These nodes return a value, but also update memory.
>>>>>>>
>>>>>>> But inflate does not return any value.
>>>>>>
>>>>>> Hmm, according to bottom type inflate produce memory:
>>>>>>
>>>>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>>>>
>>>>>> So it really does not need SCMemProjNode. Sorry about that.
>>>>>> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.
>>>>>
>>>>> Exactly.
>>>>>
>>>>>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>>>>>
>>>>>> set_memory(str, dst_type);
>>>>>
>>>>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>>>>
>>>>>> And you should rollback part of changes in escape.cpp and macro.cpp.
>>>>>
>>>>> Okay, I'll do that.
>>>>>
>>>>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>>>>
>>>>>> In general, when the src & dst arrays have different types, we may need to use TypeOopPtr::BOTTOM to prevent related stores & loads from bypassing these copy nodes.
>>>>>
>>>>> Okay, should we then use BOTTOM for both the input and output type?
>>>>
>>>> Only input. Output type corresponds to dst array type which you set correctly now.
>>>
>>> It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
>>>    StoreC
>>>    inflate_string
>>>    LoadC
>>>
>>> The memory graph (def->use) now looks like this:
>>>    LoadC -> inflate_string -> ByteMem
>>>                  ... StoreC-> CharMem
>>
>> I did not get this. If the StoreC node is created before inflate_string, inflate_string should point to it and be a barrier for LoadC.
>
> Note that the StoreC and inflate_string are *not* writing to the same char[] array. The test looks like this:
>
>   char c1[] = new char[1];
>   char c2[] = new char[1];
>
>   c2[0] = 42;
>   // Inflate String from byte[] to char[]
>   s.getChars(0, 1, c1, 0);
>   // Read char[] memory written before inflation
>   return c2[0];
>
> The result should be 42. The problem is that inflate_string does not point to StoreC because inflate_string uses a byte[] as input and in this case also writes to a different char[]. Even if we set the input to BOTTOM, inflate_string points to 7 Parm (BOTTOM) but not to the char[] memory produced by 96 StoreC:
> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>
> 349 LoadUS then reads from the output char[] memory of inflate_string which does not include the result of StoreC. The test fails because the return value is != 42.
>
> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
>
>>   If StoreC followed inflate_string and LoadC followed StoreC - LoadC should point to StoreC. If LoadC does not follow StoreC then result is relaxed.
>
> Yes, these cases work fine.
>
> Thanks,
> Tobias
>
>>> The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
>>>
>>> Setting the input to BOTTOM, generates the following graph:
>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>> The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.
>>>
>>> I adapted the fix to emit a MergeMem node to capture the entire memory state as input to the intrinsic. The graph then looks like this:
>>>    LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/
>>> Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?
>>>
>>> Best,
>>> Tobias
>>>
>>>>>>> Related question:
>>>>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>>>>      char[int:>=0]:exact+any *
>>>>>>>
>>>>>>> which is equal to the type of the char load.
>>>>>>
>>>>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>>>>
>>>>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>>>>
>>>>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>>>>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>
>>>>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>>>>
>>>>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>>>>
>>>>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>>>>
>>>>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>>>>      char[int:1]:NotNull:exact *
>>>>>>>
>>>>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>>>>
>>>>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>>>>
>>>>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>>>>
>>>>> I will look into this again and try to understand what happens.
>>>>
>>>> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>>>>
>>>>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>>>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>>>>
>>>>>>>>>>       // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>>>>       // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>>>>       // compiler because the CPU does all the ordering for us.
>>>>>>>>>>
>>>>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>>>>
>>>>>>>>>> Or did I misunderstand your question?
>>>>>>>>>
>>>>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>>>>> read-after-write dependency, but never mind.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Andrew.
>>>>>>>>>
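
The scenario quoted above (a char store to one array, an inflating getChars() into a different array, then a load from the first array) can be sketched as a self-contained program. This is only an illustration, not the TestStringIntrinsicMemoryFlow test from the webrev; the class and method names are hypothetical:

```java
class InflateMemoryFlowDemo {
    // Stores to c2, inflates the (Latin-1) string into c1, then reads c2 back.
    // The store to c2 must stay visible to the load, so the result is always 42.
    static char readAfterInflate(String s) {
        char[] c1 = new char[1];
        char[] c2 = new char[1];

        c2[0] = 42;
        // Inflate String from byte[] to char[] (with compact strings enabled)
        s.getChars(0, 1, c1, 0);
        // Read char[] memory written before inflation
        return c2[0];
    }

    public static void main(String[] args) {
        // Repeat often enough that the method is likely to get compiled.
        for (int i = 0; i < 20_000; i++) {
            if (readAfterInflate("a") != 42) {
                throw new AssertionError("store to c2 was lost at iteration " + i);
            }
        }
        System.out.println("ok");
    }
}
```

Whether the loop actually triggers a C2 compilation depends on the VM flags in use, so a passing run does not by itself show that the intrinsic is exercised.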

From vladimir.kozlov at oracle.com  Tue Jan 12 19:40:06 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 12 Jan 2016 11:40:06 -0800
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <78074704-CE75-4F50-9F53-22FEC75E836E@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
	<56944C45.6060307@oracle.com>
	<37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>
	<78074704-CE75-4F50-9F53-22FEC75E836E@oracle.com>
Message-ID: <56955696.2080501@oracle.com>

On 1/12/16 2:17 AM, Roland Westrelin wrote:
>> With the test case, it would be safe to move the StoreF above the predicates I think. But in the general case, I don't see how we can be sure that we don't have:
>>
>> - null check/range check for the StoreF moved out of loops as predicates
>> - partial peel that causes the StoreF to be pinned below the predicates
>> - loop predication that moves some data node that depends on the StoreF above it

I agree that your change works as a very conservative approach. But we will not get performance from it.

I am thinking that it is "always" safe to move a pinned data node above the original/dummy predicates (the loop index variable
depends on the limit check predicate, but we will never move the index node out of the loop). We only need to be sure that we move
it (after partial peel, for example) before any dependent checks and data nodes are moved out of the loop. Those checks
and data nodes will be inserted below it.

Anyway, how rare is this case? If it is very rare, I agree with your change since performance is not important.

Thanks,
Vladimir

>
> Actually, I can reproduce this scenario with the patch below: some changes to the test and making range check smearing a little bit more aggressive so a range check is replaced by a dominating predicate range check.
>
> Roland.
>
> diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
> --- a/src/share/vm/opto/ifnode.cpp
> +++ b/src/share/vm/opto/ifnode.cpp
> @@ -514,7 +514,7 @@
>     // along the OOB path.  Otherwise, it's possible that the user wrote
>     // something which optimized to look like a range check but behaves
>     // in some other way.
> -  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_range_check) == NULL) {
> +  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_none) == NULL) {
>       return 0;
>     }
>
> diff --git a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
> --- a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
> +++ b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
> @@ -30,6 +30,8 @@
>    *
>    */
>
> +import java.util.Objects;
> +
>   public class BadPredicateAfterPartialPeel {
>
>       static void not_inlined1() {}
> @@ -45,13 +47,13 @@
>       boolean flag;
>       int j;
>
> -    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) {
> +    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) throws Exception {
>           int i1 = 1;
>
>           // To delay partial peeling to the loop opts pass right before CCP
> -        int i2 = 0;
> -        for (; i2 < 10; i2 += i1);
> -        i2 = i2 / 10;
> +        int i2 = 1;
> +        // for (; i2 < 10; i2 += i1);
> +        // i2 = i2 / 10;
>
>           // Simplified during CCP:
>           int i3 = 2;
> @@ -63,11 +65,12 @@
>
>           not_inlined1();
>
> -        array[0] = -1;
>           do {
>               // peeled section starts here
>               o.flag = false;
>               o.j = 0;
> +
> +            Objects.checkIndex(0, array.length, null);
>
>               if (b) {
>                   // The following store will be pinned between
> @@ -300,7 +303,7 @@
>           not_inlined4();
>       }
>
> -    static public void main(String[] args) {
> +    static public void main(String[] args) throws Exception {
>           BadPredicateAfterPartialPeel o1 = new BadPredicateAfterPartialPeel();
>           BadPredicateAfterPartialPeel o2 = new BadPredicateAfterPartialPeel();
>           for (int i = 0; i < 20000; i++) {
>
>

From christian.thalinger at oracle.com  Tue Jan 12 19:40:42 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 12 Jan 2016 09:40:42 -1000
Subject: RFR: 8146364: Remove @ServiceProvider mechanism from JVMCI
In-Reply-To: <DAF7D57C-F7D5-4973-98CA-DA83F838777C@oracle.com>
References: <FEDF1809-3009-4CD4-84B8-A87DDAC1CFA6@oracle.com>
	<A56EBDF3-B7D9-4184-BD4A-9B9731B43385@oracle.com>
	<DAF7D57C-F7D5-4973-98CA-DA83F838777C@oracle.com>
Message-ID: <6DC2B1CE-3AAF-45D7-99D2-24C584BDE71B@oracle.com>


> On Jan 11, 2016, at 11:39 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Doh! This was a result of pilot error when transplanting patches from graal-jvmci-9. I left out:
> 
> http://hg.openjdk.java.net/graal/graal-jvmci-9/hotspot/rev/2390bc159b77
> 
> The behavior I saw was not that the META-INF files overwrote each other. Instead, they were ignored completely.
> 
> I've updated http://cr.openjdk.java.net/~dnsimon/8146364/hotspot/ now.

Yes, that looks better.

> 
> -Doug
> 
>> On 12 Jan 2016, at 02:35, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> 
>>> On Jan 11, 2016, at 12:43 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Please review these changes for removing the mechanism in JVMCI for automating the generation of files in META-INF/services for service providers annotated with @ServiceProvider.
>> 
>> Did you try this with a regular JDK 9 build?  I don't think it works to have the same META-INF file in different locations:
>> 
>> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.aarch64/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
>> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.amd64/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
>> src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot.sparc/src/META-INF/services/jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory
>> 
>> They will overwrite each other when being installed into the image.
>> 
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8146364
>>> 
>>> http://cr.openjdk.java.net/~dnsimon/8146364/jdk9/ 
>>> http://cr.openjdk.java.net/~dnsimon/8146364/hotspot/
>>> 
>>> -Doug
>> 
> 
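
To illustrate the concern about same-named META-INF/services files overwriting each other, here is a minimal sketch of merging the provider lists of several such files instead of keeping only the last one. This is an assumption about one possible approach, not a description of what the referenced changeset does; the class name and the provider names in main() are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ServiceFileMerge {
    // Merges the lines of several provider-configuration files for one service.
    // Text after '#' is a comment, blank lines are skipped, and duplicate
    // provider names are kept only once (first occurrence wins the position).
    static List<String> merge(List<List<String>> files) {
        Set<String> providers = new LinkedHashSet<>();
        for (List<String> lines : files) {
            for (String line : lines) {
                int hash = line.indexOf('#');
                String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
                if (!name.isEmpty()) {
                    providers.add(name);
                }
            }
        }
        return new ArrayList<>(providers);
    }

    public static void main(String[] args) {
        // Hypothetical provider names, one file per platform directory.
        List<String> aarch64 = Arrays.asList("# aarch64 backend", "example.Aarch64Factory");
        List<String> amd64 = Arrays.asList("example.Amd64Factory");
        List<String> sparc = Arrays.asList("example.SparcFactory", "");
        for (String provider : merge(Arrays.asList(aarch64, amd64, sparc))) {
            System.out.println(provider);
        }
    }
}
```

With a plain copy, only one of the three per-platform files would survive in the image; a merge of this kind keeps all providers visible to java.util.ServiceLoader.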


From roland.westrelin at oracle.com  Tue Jan 12 19:56:12 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 12 Jan 2016 20:56:12 +0100
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <56955696.2080501@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
	<56944C45.6060307@oracle.com>
	<37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>
	<78074704-CE75-4F50-9F53-22FEC75E836E@oracle.com>
	<56955696.2080501@oracle.com>
Message-ID: <A3959978-A3F6-46B3-B154-7F64F50BE41A@oracle.com>

> I am thinking that it is "always" safe to move a pinned data node above the original/dummy predicates (the loop index variable depends on the limit check predicate, but we will never move the index node out of the loop). We only need to be sure that we move it (after partial peel, for example) before any dependent checks and data nodes are moved out of the loop. Those checks and data nodes will be inserted below it.

I get it now and I think you're right, but it would need to be done for all data nodes, which sounds like a mess.

> Anyway, how rare is this case? If it is very rare, I agree with your change since performance is not important.

It's very rare. I've seen it only once running the old CTW with the castPP change from 8139771.

Roland.

> 
> Thanks,
> Vladimir
> 
>> 
>> Actually, I can reproduce this scenario with the patch below: some changes to the test and making range check smearing a little bit more aggressive so a range check is replaced by a dominating predicate range check.
>> 
>> Roland.
>> 
>> diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
>> --- a/src/share/vm/opto/ifnode.cpp
>> +++ b/src/share/vm/opto/ifnode.cpp
>> @@ -514,7 +514,7 @@
>>    // along the OOB path.  Otherwise, it's possible that the user wrote
>>    // something which optimized to look like a range check but behaves
>>    // in some other way.
>> -  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_range_check) == NULL) {
>> +  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_none) == NULL) {
>>      return 0;
>>    }
>> 
>> diff --git a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>> --- a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>> +++ b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>> @@ -30,6 +30,8 @@
>>   *
>>   */
>> 
>> +import java.util.Objects;
>> +
>>  public class BadPredicateAfterPartialPeel {
>> 
>>      static void not_inlined1() {}
>> @@ -45,13 +47,13 @@
>>      boolean flag;
>>      int j;
>> 
>> -    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) {
>> +    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) throws Exception {
>>          int i1 = 1;
>> 
>>          // To delay partial peeling to the loop opts pass right before CCP
>> -        int i2 = 0;
>> -        for (; i2 < 10; i2 += i1);
>> -        i2 = i2 / 10;
>> +        int i2 = 1;
>> +        // for (; i2 < 10; i2 += i1);
>> +        // i2 = i2 / 10;
>> 
>>          // Simplified during CCP:
>>          int i3 = 2;
>> @@ -63,11 +65,12 @@
>> 
>>          not_inlined1();
>> 
>> -        array[0] = -1;
>>          do {
>>              // peeled section starts here
>>              o.flag = false;
>>              o.j = 0;
>> +
>> +            Objects.checkIndex(0, array.length, null);
>> 
>>              if (b) {
>>                  // The following store will be pinned between
>> @@ -300,7 +303,7 @@
>>          not_inlined4();
>>      }
>> 
>> -    static public void main(String[] args) {
>> +    static public void main(String[] args) throws Exception {
>>          BadPredicateAfterPartialPeel o1 = new BadPredicateAfterPartialPeel();
>>          BadPredicateAfterPartialPeel o2 = new BadPredicateAfterPartialPeel();
>>          for (int i = 0; i < 20000; i++) {
>> 
>> 


From christian.thalinger at oracle.com  Tue Jan 12 20:04:44 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 12 Jan 2016 10:04:44 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
Message-ID: <CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>


> On Jan 11, 2016, at 12:51 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> 
>> On Jan 11, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> 
>>> On 11 Jan 2016, at 20:50, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>> 
>>> What is the naming convention for properties?
>>> Do we have a list somewhere of all the JVMCI properties we accept? Maybe we should add one.
>> 
>> Currently, there is no list of accepted JVMCI properties. Once Chris applies the changes below such that all system property access (apart from jvmci.InitTimer) goes through HotSpotJVMCIRuntime.getProperty(), then the javadoc of that method could contain the list (much like System.getProperties describes the supported standard properties).
> 
> Good idea.
> 
>> 
>>> All JVMCI property names should be consistent, whatever you choose.
>> 
>> I agree.
> 
> Yes.  They should feel like our other command line options so camel-case is what I had in mind.

How about this:

http://cr.openjdk.java.net/~twisti/8146820/webrev.01/index.html

Now all options are in an enum so that we can have PrintFlags and ShowFlags options.  I did not add any documentation but we could.

> 
>> 
>> -Doug
>> 
>>> 
>>> 'inittimer' is also lowercased.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 1/11/16 11:15 AM, Christian Thalinger wrote:
>>>> https://bugs.openjdk.java.net/browse/JDK-8146820
>>>> 
>>>> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
>>>> 
>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
>>>> @@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
>>>>    }
>>>> 
>>>>    /**
>>>> +     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
>>>> +     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>> +     *
>>>> +     * @param name the name of the system property
>>>> +     * @param def the value to return if there is no system property corresponding to {@code name}
>>>> +     */
>>>> +    public static String getProperty(String name, String def) {
>>>> +        String value = VM.getSavedProperty("jvmci." + name);
>>>> +        if (value == null) {
>>>> +            return def;
>>>> +        }
>>>> +        return value;
>>>> +    }
>>>> +
>>>> +    /**
>>>>     * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
>>>>     * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>>     *
>>>> @@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
>>>>     * @param def the value to return if there is no system property corresponding to {@code name}
>>>>     */
>>>>    public static boolean getBooleanProperty(String name, boolean def) {
>>>> -        String value = VM.getSavedProperty("jvmci." + name);
>>>> +        String value = getProperty(name, null);
>>>>        if (value == null) {
>>>>            return def;
>>>>        }
>>>> @@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
>>>>        }
>>>>        metaAccessContext = context;
>>>> 
>>>> -        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
>>>> +        if (getBooleanProperty("printconfig", false)) {
>>>>            printConfig(config, compilerToVm);
>>>>        }
>>>> 
>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
>>>> @@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
>>>>        return false;
>>>>    }
>>>> 
>>>> -    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
>>>> +    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
>>>> 
>>>>    @Override
>>>>    public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
>>>> @@ -65,9 +65,11 @@ public final class InitTimer implements
>>>>    }
>>>> 
>>>>    /**
>>>> -     * Specifies if initialization timing is enabled.
>>>> +     * Specifies if initialization timing is enabled. Note: this property cannot use
>>>> +     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
>>>> +     * package.
>>>>     */
>>>> -    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
>>>> +    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
>>>> 
>>>>    public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
>>>>    public static final String SPACES = "                                            ";
>>>> 
>> 
> 


From doug.simon at oracle.com  Tue Jan 12 20:14:10 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 12 Jan 2016 21:14:10 +0100
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
Message-ID: <CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>

If we're going with an enum, you could put accessors directly in the enum:

    private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);

    private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);

You could then type the value of the options and check the right accessor is used:

     public enum Option {
         ImplicitStableValues(boolean.class),
         InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
         PrintConfig(boolean.class),
         PrintFlags(boolean.class),
         ShowFlags(boolean.class),
         TraceMethodDataFilter(String.class),
         TrustFinalDefaultFields(String.class);

Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
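
The accessor idea above can be sketched as follows. This is only an illustration under assumptions, not the code in the webrev: the class name JvmciOptionSketch, the raw() and checkType() helpers, and the use of System.getProperty in place of the JDK-internal VM.getSavedProperty are all hypothetical. TrustFinalDefaultFields is typed as boolean here because it is read through getBoolean.

```java
final class JvmciOptionSketch {

    /** Hypothetical option enum; each constant records the type of its value. */
    enum Option {
        PrintConfig(boolean.class),
        TraceMethodDataFilter(String.class),
        TrustFinalDefaultFields(boolean.class);

        private final Class<?> type;

        Option(Class<?> type) {
            this.type = type;
        }

        /** Reads the raw property; System.getProperty is a stand-in (assumption). */
        private String raw() {
            return System.getProperty("jvmci." + name());
        }

        /** Returns the boolean value of this option, or def if it is not set. */
        boolean getBoolean(boolean def) {
            checkType(boolean.class);
            String value = raw();
            return value == null ? def : Boolean.parseBoolean(value);
        }

        /** Returns the String value of this option, or def if it is not set. */
        String getString(String def) {
            checkType(String.class);
            String value = raw();
            return value == null ? def : value;
        }

        /** Fails fast when an option is read through the wrong accessor. */
        private void checkType(Class<?> expected) {
            if (type != expected) {
                throw new IllegalStateException(name() + " is of type "
                        + type.getSimpleName() + ", not " + expected.getSimpleName());
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(Option.TrustFinalDefaultFields.getBoolean(true));
        System.out.println(Option.TraceMethodDataFilter.getString(null));
    }
}
```

With this shape, calling Option.TraceMethodDataFilter.getBoolean(false) throws instead of silently parsing a filter string as a boolean, which is the kind of check meant by typing the options.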

-Doug


> On 12 Jan 2016, at 21:04, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 11, 2016, at 12:51 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> 
>>> On Jan 11, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> 
>>>> On 11 Jan 2016, at 20:50, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>> 
>>>> What is the naming convention for properties?
>>>> Do we have a list somewhere of all the JVMCI properties we accept? Maybe we should add one.
>>> 
>>> Currently, there is no list of accepted JVMCI properties. Once Chris applies the changes below such that all system property access (apart from jvmci.InitTimer) goes through HotSpotJVMCIRuntime.getProperty(), then the javadoc of that method could contain the list (much like System.getProperties describes the supported standard properties).
>> 
>> Good idea.
>> 
>>> 
>>>> All JVMCI property names should be consistent, whatever you choose.
>>> 
>>> I agree.
>> 
>> Yes.  They should feel like our other command line options so camel-case is what I had in mind.
> 
> How about this:
> 
> http://cr.openjdk.java.net/~twisti/8146820/webrev.01/index.html
> 
> Now all options are in an enum so that we can have PrintFlags and ShowFlags options.  I did not add any documentation but we could.
> 
>> 
>>> 
>>> -Doug
>>> 
>>>> 
>>>> 'inittimer' is also lowercased.
>>>> 
>>>> Thanks,
>>>> Vladimir
>>>> 
>>>> On 1/11/16 11:15 AM, Christian Thalinger wrote:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8146820
>>>>> 
>>>>> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
>>>>> 
>>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
>>>>> @@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
>>>>>   }
>>>>> 
>>>>>   /**
>>>>> +     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
>>>>> +     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>>> +     *
>>>>> +     * @param name the name of the system property
>>>>> +     * @param def the value to return if there is no system property corresponding to {@code name}
>>>>> +     */
>>>>> +    public static String getProperty(String name, String def) {
>>>>> +        String value = VM.getSavedProperty("jvmci." + name);
>>>>> +        if (value == null) {
>>>>> +            return def;
>>>>> +        }
>>>>> +        return value;
>>>>> +    }
>>>>> +
>>>>> +    /**
>>>>>    * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
>>>>>    * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>>>    *
>>>>> @@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
>>>>>    * @param def the value to return if there is no system property corresponding to {@code name}
>>>>>    */
>>>>>   public static boolean getBooleanProperty(String name, boolean def) {
>>>>> -        String value = VM.getSavedProperty("jvmci." + name);
>>>>> +        String value = getProperty(name, null);
>>>>>       if (value == null) {
>>>>>           return def;
>>>>>       }
>>>>> @@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
>>>>>       }
>>>>>       metaAccessContext = context;
>>>>> 
>>>>> -        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
>>>>> +        if (getBooleanProperty("printconfig", false)) {
>>>>>           printConfig(config, compilerToVm);
>>>>>       }
>>>>> 
>>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
>>>>> @@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
>>>>>       return false;
>>>>>   }
>>>>> 
>>>>> -    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
>>>>> +    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
>>>>> 
>>>>>   @Override
>>>>>   public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
>>>>> @@ -65,9 +65,11 @@ public final class InitTimer implements
>>>>>   }
>>>>> 
>>>>>   /**
>>>>> -     * Specifies if initialization timing is enabled.
>>>>> +     * Specifies if initialization timing is enabled. Note: this property cannot use
>>>>> +     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
>>>>> +     * package.
>>>>>    */
>>>>> -    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
>>>>> +    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
>>>>> 
>>>>>   public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
>>>>>   public static final String SPACES = "                                            ";


From vladimir.kozlov at oracle.com  Tue Jan 12 20:33:08 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 12 Jan 2016 12:33:08 -0800
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <A3959978-A3F6-46B3-B154-7F64F50BE41A@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
	<56944C45.6060307@oracle.com>
	<37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>
	<78074704-CE75-4F50-9F53-22FEC75E836E@oracle.com>
	<56955696.2080501@oracle.com>
	<A3959978-A3F6-46B3-B154-7F64F50BE41A@oracle.com>
Message-ID: <56956304.30100@oracle.com>

On 1/12/16 11:56 AM, Roland Westrelin wrote:
>> I am thinking that it is "always" safe to move a pinned data node above the original/dummy predicates (the loop index variable depends on the limit check predicate, but we will never move the index node from the loop). We only need to be sure that we move it (after partial peel, for example) before any dependent checks and data nodes are moved from the loop. Those checks and data will be inserted below it.
>
> I get it now and I think you're right, but it would need to be done for all data nodes, which sounds like a mess.
>
>> Anyway, how rare is this case? If it is very rare I agree with your change, since performance is not important.
>
> It's very rare. I've seen it only once, running the old CTW with the castPP change from 8139771.

Okay then. Go with your changes - they are good enough.

Thanks,
Vladimir

>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Actually, I can reproduce this scenario with the patch below: some changes to the test and making range check smearing a little bit more aggressive so a range check is replaced by a dominating predicate range check.
>>>
>>> Roland.
>>>
>>> diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
>>> --- a/src/share/vm/opto/ifnode.cpp
>>> +++ b/src/share/vm/opto/ifnode.cpp
>>> @@ -514,7 +514,7 @@
>>>     // along the OOB path.  Otherwise, it's possible that the user wrote
>>>     // something which optimized to look like a range check but behaves
>>>     // in some other way.
>>> -  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_range_check) == NULL) {
>>> +  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_none) == NULL) {
>>>       return 0;
>>>     }
>>>
>>> diff --git a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>>> --- a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>>> +++ b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>>> @@ -30,6 +30,8 @@
>>>    *
>>>    */
>>>
>>> +import java.util.Objects;
>>> +
>>>   public class BadPredicateAfterPartialPeel {
>>>
>>>       static void not_inlined1() {}
>>> @@ -45,13 +47,13 @@
>>>       boolean flag;
>>>       int j;
>>>
>>> -    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) {
>>> +    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) throws Exception {
>>>           int i1 = 1;
>>>
>>>           // To delay partial peeling to the loop opts pass right before CCP
>>> -        int i2 = 0;
>>> -        for (; i2 < 10; i2 += i1);
>>> -        i2 = i2 / 10;
>>> +        int i2 = 1;
>>> +        // for (; i2 < 10; i2 += i1);
>>> +        // i2 = i2 / 10;
>>>
>>>           // Simplified during CCP:
>>>           int i3 = 2;
>>> @@ -63,11 +65,12 @@
>>>
>>>           not_inlined1();
>>>
>>> -        array[0] = -1;
>>>           do {
>>>               // peeled section starts here
>>>               o.flag = false;
>>>               o.j = 0;
>>> +
>>> +            Objects.checkIndex(0, array.length, null);
>>>
>>>               if (b) {
>>>                   // The following store will be pinned between
>>> @@ -300,7 +303,7 @@
>>>           not_inlined4();
>>>       }
>>>
>>> -    static public void main(String[] args) {
>>> +    static public void main(String[] args) throws Exception {
>>>           BadPredicateAfterPartialPeel o1 = new BadPredicateAfterPartialPeel();
>>>           BadPredicateAfterPartialPeel o2 = new BadPredicateAfterPartialPeel();
>>>           for (int i = 0; i < 20000; i++) {
>>>
>>>
>

From christian.thalinger at oracle.com  Tue Jan 12 21:39:35 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 12 Jan 2016 11:39:35 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
Message-ID: <9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>


> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> If we're going with an enum, you could put accessors directly in the enum:
> 
>    private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
> 
>    private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
> 
> You could then type the value of the options and check the right accessor is used:
> 
>     public enum Option {
>         ImplicitStableValues(boolean.class),
>         InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>         PrintConfig(boolean.class),
>         PrintFlags(boolean.class),
>         ShowFlags(boolean.class),
>         TraceMethodDataFilter(String.class),
>         TrustFinalDefaultFields(String.class);
> 
> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.

Excellent idea!  I was also thinking about adding the default value to the enum.

> 
> -Doug
> 
> 
>> On 12 Jan 2016, at 21:04, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 11, 2016, at 12:51 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>> 
>>>> On Jan 11, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> 
>>>>> On 11 Jan 2016, at 20:50, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>>> 
>>>>> What is the naming convention for properties?
>>>>> Do we have a list somewhere of all the JVMCI properties we accept? Maybe we should add one.
>>>> 
>>>> Currently, there is no list of accepted JVMCI properties. Once Chris applies the changes below such that all system property access (apart from jvmci.InitTimer) goes through HotSpotJVMCIRuntime.getProperty(), then the javadoc of that method could contain the list (much like System.getProperties describes the supported standard properties).
>>> 
>>> Good idea.
>>> 
>>>> 
>>>>> All JVMCI property names should be consistent, whatever you choose.
>>>> 
>>>> I agree.
>>> 
>>> Yes.  They should feel like our other command line options so camel-case is what I had in mind.
>> 
>> How about this:
>> 
>> http://cr.openjdk.java.net/~twisti/8146820/webrev.01/index.html
>> 
>> Now all options are in an enum so that we can have PrintFlags and ShowFlags options.  I did not add any documentation but we could.
>> 
>>> 
>>>> 
>>>> -Doug
>>>> 
>>>>> 
>>>>> 'inittimer' is also lowercased.
>>>>> 
>>>>> Thanks,
>>>>> Vladimir
>>>>> 
>>>>> On 1/11/16 11:15 AM, Christian Thalinger wrote:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8146820
>>>>>> 
>>>>>> I've renamed traceMethodDataFilter to TraceMethodDataFilter.  Should we rename printconfig to PrintConfig?
>>>>>> 
>>>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java
>>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Fri Dec 18 20:23:28 2015 +0300
>>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCIRuntime.java	Mon Jan 11 09:12:48 2016 -1000
>>>>>> @@ -85,6 +85,21 @@ public final class HotSpotJVMCIRuntime i
>>>>>>  }
>>>>>> 
>>>>>>  /**
>>>>>> +     * Gets a String value based on a system property {@linkplain VM#getSavedProperty(String) saved}
>>>>>> +     * at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>>>> +     *
>>>>>> +     * @param name the name of the system property
>>>>>> +     * @param def the value to return if there is no system property corresponding to {@code name}
>>>>>> +     */
>>>>>> +    public static String getProperty(String name, String def) {
>>>>>> +        String value = VM.getSavedProperty("jvmci." + name);
>>>>>> +        if (value == null) {
>>>>>> +            return def;
>>>>>> +        }
>>>>>> +        return value;
>>>>>> +    }
>>>>>> +
>>>>>> +    /**
>>>>>>   * Gets a boolean value based on a system property {@linkplain VM#getSavedProperty(String)
>>>>>>   * saved} at system initialization time. The property name is prefixed with "{@code jvmci.}".
>>>>>>   *
>>>>>> @@ -93,7 +108,7 @@ public final class HotSpotJVMCIRuntime i
>>>>>>   * @param def the value to return if there is no system property corresponding to {@code name}
>>>>>>   */
>>>>>>  public static boolean getBooleanProperty(String name, boolean def) {
>>>>>> -        String value = VM.getSavedProperty("jvmci." + name);
>>>>>> +        String value = getProperty(name, null);
>>>>>>      if (value == null) {
>>>>>>          return def;
>>>>>>      }
>>>>>> @@ -164,7 +179,7 @@ public final class HotSpotJVMCIRuntime i
>>>>>>      }
>>>>>>      metaAccessContext = context;
>>>>>> 
>>>>>> -        if (Boolean.valueOf(System.getProperty("jvmci.printconfig"))) {
>>>>>> +        if (getBooleanProperty("printconfig", false)) {
>>>>>>          printConfig(config, compilerToVm);
>>>>>>      }
>>>>>> 
>>>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java
>>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Fri Dec 18 20:23:28 2015 +0300
>>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java	Mon Jan 11 09:12:48 2016 -1000
>>>>>> @@ -417,7 +417,7 @@ final class HotSpotResolvedJavaMethodImp
>>>>>>      return false;
>>>>>>  }
>>>>>> 
>>>>>> -    private static final String TraceMethodDataFilter = System.getProperty("jvmci.traceMethodDataFilter");
>>>>>> +    private static final String TraceMethodDataFilter = HotSpotJVMCIRuntime.getProperty("TraceMethodDataFilter", null);
>>>>>> 
>>>>>>  @Override
>>>>>>  public ProfilingInfo getProfilingInfo(boolean includeNormal, boolean includeOSR) {
>>>>>> diff -r c90679b0ea25 src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java
>>>>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Fri Dec 18 20:23:28 2015 +0300
>>>>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.inittimer/src/jdk/vm/ci/inittimer/InitTimer.java	Mon Jan 11 09:12:48 2016 -1000
>>>>>> @@ -65,9 +65,11 @@ public final class InitTimer implements
>>>>>>  }
>>>>>> 
>>>>>>  /**
>>>>>> -     * Specifies if initialization timing is enabled.
>>>>>> +     * Specifies if initialization timing is enabled. Note: this property cannot use
>>>>>> +     * {@code HotSpotJVMCIRuntime.getBooleanProperty} since that class is not visible from this
>>>>>> +     * package.
>>>>>>   */
>>>>>> -    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer") || Boolean.getBoolean("jvmci.runtime.TimeInit");
>>>>>> +    private static final boolean ENABLED = Boolean.getBoolean("jvmci.inittimer");
>>>>>> 
>>>>>>  public static final AtomicInteger nesting = ENABLED ? new AtomicInteger() : null;
>>>>>>  public static final String SPACES = "                                            ";
> 


From doug.simon at oracle.com  Tue Jan 12 22:03:04 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 12 Jan 2016 23:03:04 +0100
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
Message-ID: <B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>


> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> If we're going with an enum, you could put accessors directly in the enum:
>> 
>>   private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>> 
>>   private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>> 
>> You could then type the value of the options and check the right accessor is used:
>> 
>>    public enum Option {
>>        ImplicitStableValues(boolean.class),
>>        InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>        PrintConfig(boolean.class),
>>        PrintFlags(boolean.class),
>>        ShowFlags(boolean.class),
>>        TraceMethodDataFilter(String.class),
>>        TrustFinalDefaultFields(String.class);
>> 
>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
> 
> Excellent idea!  I was also thinking about adding the default value to the enum.

Can you do that without having to box the default value?

-Doug

From christian.thalinger at oracle.com  Tue Jan 12 22:14:29 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 12 Jan 2016 12:14:29 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
Message-ID: <41621484-0886-401C-A8AD-36D534DDE591@oracle.com>


> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> If we're going with an enum, you could put accessors directly in the enum:
>>> 
>>>  private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>> 
>>>  private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>> 
>>> You could then type the value of the options and check the right accessor is used:
>>> 
>>>   public enum Option {
>>>       ImplicitStableValues(boolean.class),
>>>       InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>       PrintConfig(boolean.class),
>>>       PrintFlags(boolean.class),
>>>       ShowFlags(boolean.class),
>>>       TraceMethodDataFilter(String.class),
>>>       TrustFinalDefaultFields(String.class);
>>> 
>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>> 
>> Excellent idea!  I was also thinking about adding the default value to the enum.
> 
> Can you do that without having to box the default value?

No, we have to box but we can initialize all flags in the constructor:

http://cr.openjdk.java.net/~twisti/8146820/webrev.02/

We will not have many flags so this should be alright.  A PrintFlags looks like this:

$ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.PrintFlags=true InitGraal
[List of JVMCI options]
  boolean ImplicitStableValues                     := true          
  boolean InitTimer                                := false         
  boolean PrintConfig                              := false         
  boolean PrintFlags                                = true          
  boolean ShowFlags                                := false         
   String TraceMethodDataFilter                    := null          
   String TrustFinalDefaultFields                  := true          
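
For readers without access to the webrev, a rough sketch of how defaults could live in the enum and be initialized in the constructor is given below. It is not the webrev code. The class name JvmciFlagsSketch, the describe() and printFlags() helpers, the column widths, and the use of System.getProperty instead of the JDK-internal saved-property lookup are assumptions made for illustration. The sketch also assumes, reading off the listing above, that ":=" marks a value left at its default and "=" marks a value set on the command line.

```java
import java.io.PrintStream;

final class JvmciFlagsSketch {

    /** Hypothetical option enum carrying a type and a boxed default value. */
    enum Option {
        ImplicitStableValues(boolean.class, true),
        PrintConfig(boolean.class, false),
        PrintFlags(boolean.class, false),
        TraceMethodDataFilter(String.class, null);

        private final Class<?> type;
        private final Object value;      // boxed, as discussed in the thread
        private final boolean isDefault; // true when no property was given

        Option(Class<?> type, Object defaultValue) {
            this.type = type;
            // Stand-in for the saved-property lookup (assumption).
            String raw = System.getProperty("jvmci." + name());
            if (raw == null) {
                this.value = defaultValue;
                this.isDefault = true;
            } else if (type == boolean.class) {
                this.value = Boolean.valueOf(raw);
                this.isDefault = false;
            } else {
                this.value = raw;
                this.isDefault = false;
            }
        }

        boolean getBoolean() {
            return (Boolean) value;
        }

        String getString() {
            return (String) value;
        }

        /** One line per option: type, name, marker, value. */
        String describe() {
            return String.format("%9s %-40s %s %s",
                    type.getSimpleName(), name(), isDefault ? ":=" : " =", value);
        }
    }

    /** Prints all options in declaration order. */
    static void printFlags(PrintStream out) {
        out.println("[List of JVMCI options]");
        for (Option option : Option.values()) {
            out.println(option.describe());
        }
    }

    public static void main(String[] args) {
        printFlags(System.out);
    }
}
```

Because every constant reads its property once in the constructor, the boxing cost is paid a fixed number of times at class initialization, which matches the remark that the small number of flags makes this acceptable.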

I'm almost tempted to move InitTimer to another package, like jdk.vm.ci.common...

> 
> -Doug
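
A minimal sketch of the typed-enum idea discussed above, not the code from webrev.02. It assumes each option is read from a system property named "jvmci." plus the enum constant name (as in -Djvmci.PrintFlags=true), that the default is stored boxed in the constructor, and that ":=" marks a default while " =" marks an explicitly set value. The enum name JvmciOptionSketch, the describe() helper, and the PrintFlags default of false are hypothetical; TrustFinalDefaultFields is typed boolean here on the assumption that the String type in the printed output is the bug mentioned in the follow-up.

```java
/** Hypothetical sketch: typed options with boxed defaults, initialized in the constructor. */
enum JvmciOptionSketch {
    ImplicitStableValues(boolean.class, true),
    InitTimer(boolean.class, false),
    PrintConfig(boolean.class, false),
    PrintFlags(boolean.class, false),
    ShowFlags(boolean.class, false),
    TraceMethodDataFilter(String.class, null),
    TrustFinalDefaultFields(boolean.class, true);

    private final Class<?> type;
    private final Object value;       // boxed; unavoidable with a single field for all types
    private final boolean isDefault;

    JvmciOptionSketch(Class<?> type, Object defaultValue) {
        this.type = type;
        // Assumed property naming scheme: jvmci.<OptionName>
        String raw = System.getProperty("jvmci." + name());
        if (raw == null) {
            this.value = defaultValue;
            this.isDefault = true;
        } else {
            this.value = (type == boolean.class) ? (Object) Boolean.valueOf(raw) : raw;
            this.isDefault = false;
        }
    }

    /** Returns the value of a boolean option; rejects options of any other type. */
    boolean getBoolean() {
        checkType(boolean.class);
        return (Boolean) value;
    }

    /** Returns the value of a String option (possibly null); rejects other types. */
    String getString() {
        checkType(String.class);
        return (String) value;
    }

    private void checkType(Class<?> expected) {
        if (type != expected) {
            throw new IllegalStateException(name() + " is of type " + type.getSimpleName()
                    + ", not " + expected.getSimpleName());
        }
    }

    /** One line per option, loosely modeled on the PrintFlags listing. */
    String describe() {
        return String.format("%9s %-40s %s %s",
                type.getSimpleName(), name(), isDefault ? ":=" : " =", value);
    }
}
```

With this shape a call such as JvmciOptionSketch.TrustFinalDefaultFields.getString() fails fast instead of silently returning a mistyped value, which is the check Doug asked for.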


From christian.thalinger at oracle.com  Tue Jan 12 22:39:32 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 12 Jan 2016 12:39:32 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
	<41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
Message-ID: <B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>


> On Jan 12, 2016, at 12:14 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>>> 
>>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> If we're going with an enum, you could put accessors directly in the enum:
>>>> 
>>>> private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>>> 
>>>> private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>>> 
>>>> You could then type the value of the options and check the right accessor is used:
>>>> 
>>>>  public enum Option {
>>>>      ImplicitStableValues(boolean.class),
>>>>      InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>>      PrintConfig(boolean.class),
>>>>      PrintFlags(boolean.class),
>>>>      ShowFlags(boolean.class),
>>>>      TraceMethodDataFilter(String.class),
>>>>      TrustFinalDefaultFields(String.class);
>>>> 
>>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>>> 
>>> Excellent idea!  I was also thinking about adding the default value to the enum.
>> 
>> Can you do that without having to box the default value?
> 
> No, we have to box but we can initialize all flags in the constructor:
> 
> http://cr.openjdk.java.net/~twisti/8146820/webrev.02/
> 
> We will not have many flags so this should be alright.  A PrintFlags looks like this:
> 
> $ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.PrintFlags=true InitGraal
> [List of JVMCI options]
>  boolean ImplicitStableValues                     := true          
>  boolean InitTimer                                := false         
>  boolean PrintConfig                              := false         
>  boolean PrintFlags                                = true          
>  boolean ShowFlags                                := false         
>   String TraceMethodDataFilter                    := null          
>   String TrustFinalDefaultFields                  := true          

...and this is a bug, of course :-)

> 
> I'm almost tempted to move InitTimer to another package, like jdk.vm.ci.common...
> 
>> 
>> -Doug


From roland.westrelin at oracle.com  Wed Jan 13 09:06:34 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 13 Jan 2016 10:06:34 +0100
Subject: RFR(S): 8146792: Predicate moved after partial peel may lead to
	broken graph
In-Reply-To: <56956304.30100@oracle.com>
References: <FAEE8FAC-DCB5-4E47-9BE8-650566BFFEB8@oracle.com>
	<56944C45.6060307@oracle.com>
	<37CD6E66-1ACA-4B40-A974-F1B6A3086C10@oracle.com>
	<78074704-CE75-4F50-9F53-22FEC75E836E@oracle.com>
	<56955696.2080501@oracle.com>
	<A3959978-A3F6-46B3-B154-7F64F50BE41A@oracle.com>
	<56956304.30100@oracle.com>
Message-ID: <EB42DA3F-D518-444C-A0E5-EF427DFE7AB8@oracle.com>

>>> I am thinking that it is "always" safe to move a pinned data node above the original/dummy predicates (the loop index variable depends on the limit check predicate, but we will never move the index node out of the loop). We only need to be sure that we move it (after partial peel, for example) before any dependent checks and data nodes are moved from the loop. Those checks and data will be inserted below it.
>> 
>> I get it now and I think you're right but it would need to be done for all data nodes which sounds like a mess.
>> 
>>> Anyway, how rare is this case? If it is very rare I agree with your change since performance is not important.
>> 
>> It's very rare. I've seen it only once running the old CTW with the castPP change from 8139771.
> 
> Okay then. Go with your changes - they are good enough.

Thanks for the review.

Roland.

> 
> Thanks,
> Vladimir
> 
>> 
>> Roland.
>> 
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>>> 
>>>> Actually, I can reproduce this scenario with the patch below: some changes to the test and making range check smearing a little bit more aggressive so a range check is replaced by a dominating predicate range check.
>>>> 
>>>> Roland.
>>>> 
>>>> diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
>>>> --- a/src/share/vm/opto/ifnode.cpp
>>>> +++ b/src/share/vm/opto/ifnode.cpp
>>>> @@ -514,7 +514,7 @@
>>>>    // along the OOB path.  Otherwise, it's possible that the user wrote
>>>>    // something which optimized to look like a range check but behaves
>>>>    // in some other way.
>>>> -  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_range_check) == NULL) {
>>>> +  if (iftrap->is_uncommon_trap_proj(Deoptimization::Reason_none) == NULL) {
>>>>      return 0;
>>>>    }
>>>> 
>>>> diff --git a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>>>> --- a/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>>>> +++ b/test/compiler/loopopts/BadPredicateAfterPartialPeel.java
>>>> @@ -30,6 +30,8 @@
>>>>   *
>>>>   */
>>>> 
>>>> +import java.util.Objects;
>>>> +
>>>>  public class BadPredicateAfterPartialPeel {
>>>> 
>>>>      static void not_inlined1() {}
>>>> @@ -45,13 +47,13 @@
>>>>      boolean flag;
>>>>      int j;
>>>> 
>>>> -    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) {
>>>> +    static void m(BadPredicateAfterPartialPeel o1, BadPredicateAfterPartialPeel o2, BadPredicateAfterPartialPeel o, int i4) throws Exception {
>>>>          int i1 = 1;
>>>> 
>>>>          // To delay partial peeling to the loop opts pass right before CCP
>>>> -        int i2 = 0;
>>>> -        for (; i2 < 10; i2 += i1);
>>>> -        i2 = i2 / 10;
>>>> +        int i2 = 1;
>>>> +        // for (; i2 < 10; i2 += i1);
>>>> +        // i2 = i2 / 10;
>>>> 
>>>>          // Simplified during CCP:
>>>>          int i3 = 2;
>>>> @@ -63,11 +65,12 @@
>>>> 
>>>>          not_inlined1();
>>>> 
>>>> -        array[0] = -1;
>>>>          do {
>>>>              // peeled section starts here
>>>>              o.flag = false;
>>>>              o.j = 0;
>>>> +
>>>> +            Objects.checkIndex(0, array.length, null);
>>>> 
>>>>              if (b) {
>>>>                  // The following store will be pinned between
>>>> @@ -300,7 +303,7 @@
>>>>          not_inlined4();
>>>>      }
>>>> 
>>>> -    static public void main(String[] args) {
>>>> +    static public void main(String[] args) throws Exception {
>>>>          BadPredicateAfterPartialPeel o1 = new BadPredicateAfterPartialPeel();
>>>>          BadPredicateAfterPartialPeel o2 = new BadPredicateAfterPartialPeel();
>>>>          for (int i = 0; i < 20000; i++) {
>>>> 
>>>> 
>> 


From martin.doerr at sap.com  Wed Jan 13 10:38:51 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 13 Jan 2016 10:38:51 +0000
Subject: RFR(S): 8146978: PPC64: Fix build after integration of C++
	interpreter removal
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228B502@DEWDFEMB19C.global.corp.sap>

Hi,

the file register_ppc.hpp didn't merge correctly.

Webrev to fix the build is here:
http://cr.openjdk.java.net/~mdoerr/8146978_PPC64_fix_build/webrev.00/

Please review.

Best regards,
Martin


From goetz.lindenmaier at sap.com  Wed Jan 13 10:45:09 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 13 Jan 2016 10:45:09 +0000
Subject: RFR(S): 8146978: PPC64: Fix build after integration of C++
	interpreter removal
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228B502@DEWDFEMB19C.global.corp.sap>
References: <7C9B87B351A4BA4AA9EC95BB418116567228B502@DEWDFEMB19C.global.corp.sap>
Message-ID: <4295855A5C1DE049A61835A1887419CC41F13C62@DEWDFEMB12A.global.corp.sap>

Hi Martin,

thanks for doing this fix.  
Please push it soon, as all other repos pulling from hs will break, too.

Best regards,
  Goetz.


> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
> bounces at openjdk.java.net] On Behalf Of Doerr, Martin
> Sent: Mittwoch, 13. Januar 2016 11:39
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: RFR(S): 8146978: PPC64: Fix build after integration of C++ interpreter
> removal
> 
> Hi,
> 
> 
> 
> the file register_ppc.hpp didn't merge correctly.
> 
> 
> 
> Webrev to fix the build is here:
> 
> http://cr.openjdk.java.net/~mdoerr/8146978_PPC64_fix_build/webrev.00/
> 
> 
> 
> Please review.
> 
> 
> 
> Best regards,
> 
> Martin
> 
> 


From edward.nevill at gmail.com  Wed Jan 13 11:40:06 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 13 Jan 2016 11:40:06 +0000
Subject: RFR: 8146843: aarch64: add scheduling support for FP and vector
	instructions
Message-ID: <1452685206.14278.16.camel@mylittlepony.linaroharston>

Hi,

Please review the following webrev

http://cr.openjdk.java.net/~enevill/8146843/webrev.1

This adds support for OptoScheduling of FP & Vector (Neon) instructions on aarch64 (aarch64 already has support for scheduling of scalar instructions).

The following table shows the performance difference of this change.

http://cr.openjdk.java.net/~enevill/8146843/vectest.html

Note that the pipeline scheduling used in this change is based on partner C hardware because that is the only hardware I have the microarchitecture details for.

Unsurprisingly, the performance difference is most noticeable for the in-order cores (B & D).

In a few cases the performance is worse. This seems to be due to mis-scheduling of data processing instructions at the cost of load/store instructions on out-of-order cores.

However I think that the overall performance improvement makes this change worthwhile.

It may be possible in a future change to predicate individual pipeline classes on the core it is being run on; however, this could rapidly lead to an explosion in the size of aarch64.ad. Alternatively, we could do some coarser predication on in-order vs. out-of-order.

I have tested the change with jtreg hotspot and langtools with the following results.

Before:

Hotspot: Test results: passed: 1,066; failed: 15; error: 18
Langtools: Test results: passed: 3,358; failed: 1; error: 4

After:

Hotspot: Test results: passed: 1,073; failed: 11; error: 15
Langtools: Test results: passed: 3,358; failed: 1; error: 4

Thanks for the review,
Ed.


From tobias.hartmann at oracle.com  Wed Jan 13 12:00:58 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 13 Jan 2016 13:00:58 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <569552EE.8050809@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com> <569506CA.8040001@oracle.com>
	<569552EE.8050809@oracle.com>
Message-ID: <56963C7A.8040203@oracle.com>

Thanks, Vladimir.

On 12.01.2016 20:24, Vladimir Kozlov wrote:
>> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
> 
> Yes, that is right solution here.

I changed the implementation to only capture the byte[] and char[] memory:
http://cr.openjdk.java.net/~thartmann/8144212/webrev.03/

The method GraphKit::capture_memory(src_type, dst_type) returns a new MergeMemNode if the src and dst types are different, merging the two.
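
For readers following along, here is a self-contained Java sketch of the store / inflate / load pattern that the regression test mentioned in the quoted thread below ('TestStringIntrinsicMemoryFlow::testInflate2') exercises. The class and method names are hypothetical and this is not the test from the webrev; it only restates the snippet quoted further down in runnable form. Whether the inflate intrinsic is actually used depends on the VM, the string contents and JIT compilation, so this is an illustration under those assumptions rather than a guaranteed reproducer.

```java
/** Hypothetical harness around the StoreC / inflate_string / LoadC pattern. */
final class InflateMemoryFlowSketch {

    /** Stores to c2, copies the string's first char into a different array c1, then loads from c2. */
    static char storeInflateLoad(String s) {
        char[] c1 = new char[1];
        char[] c2 = new char[1];

        c2[0] = 42;                  // char store to c2 (the StoreC in the thread)
        s.getChars(0, 1, c1, 0);     // may be compiled as an inflate from byte[] to char[] c1
        return c2[0];                // char load from c2; must observe the store above
    }

    /** Runs the pattern often enough that a JIT compiler is likely to compile it. */
    static int countFailures(int iterations) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            if (storeInflateLoad("a") != 42) {
                failures++;
            }
        }
        return failures;
    }
}
```

A correct VM must report zero failures, because the load from c2 depends on the earlier store to c2 no matter which memory slices the copy in between touches.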

Best,
Tobias

> On 1/12/16 5:59 AM, Tobias Hartmann wrote:
>> On 11.01.2016 21:00, Vladimir Kozlov wrote:
>>> On 1/11/16 7:20 AM, Tobias Hartmann wrote:
>>>> On 08.01.2016 20:41, Vladimir Kozlov wrote:
>>>>> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>>>>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>>>>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>>>>>> Andrew is right.
>>>>>>>>
>>>>>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>>>>>
>>>>>>> Right. It was the root of this bug, see below.
>>>>>>>
>>>>>>>>
>>>>>>>> I fixed this for the inflate and compress intrinsics.
>>>>>>>>
>>>>>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>>>>>>> StrInflatedCopyNode is not memory node.
>>>>>>>>
>>>>>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>>>>>
>>>>>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>>>>>
>>>>>> I meant with webrev.01 but you answered my question below.
>>>>>>
>>>>>>>>      // This class defines a projection of the memory  state of a store conditional node.
>>>>>>>>      // These nodes return a value, but also update memory.
>>>>>>>>
>>>>>>>> But inflate does not return any value.
>>>>>>>
>>>>>>> Hmm, according to bottom type inflate produce memory:
>>>>>>>
>>>>>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>>>>>
>>>>>>> So it really does not need SCMemProjNode. Sorry about that.
>>>>>>> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.
>>>>>>
>>>>>> Exactly.
>>>>>>
>>>>>>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>>>>>>
>>>>>>> set_memory(str, dst_type);
>>>>>>
>>>>>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>>>>>
>>>>>>> And you should rollback part of changes in escape.cpp and macro.cpp.
>>>>>>
>>>>>> Okay, I'll do that.
>>>>>>
>>>>>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>>>>>
>>>>>>> In general when src & dst arrays have different type we may need to use TypeOopPtr::BOTTOM to prevent related store & loads bypass these copy nodes.
>>>>>>
>>>>>> Okay, should we then use BOTTOM for both the input and output type?
>>>>>
>>>>> Only input. Output type corresponds to dst array type which you set correctly now.
>>>>
>>>> It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
>>>>    StoreC
>>>>    inflate_string
>>>>    LoadC
>>>>
>>>> The memory graph (def->use) now looks like this:
>>>>    LoadC -> inflate_string -> ByteMem
>>>>                  ... StoreC-> CharMem
>>>
>>> I did not get this. If the StoreC node is created before inflate_string, then inflate_string should point to it to be a barrier for LoadC.
>>
>> Note that the StoreC and inflate_string are *not* writing to the same char[] array. The test looks like this:
>>
>>   char c1[] = new char[1];
>>   char c2[] = new char[1];
>>
>>   c2[0] = 42;
>>   // Inflate String from byte[] to char[]
>>   s.getChars(0, 1, c1, 0);
>>   // Read char[] memory written before inflation
>>   return c2[0];
>>
>> The result should be 42. The problem is that inflate_string does not point to StoreC because inflate_string uses a byte[] as input and in this case also writes to a different char[]. Even if we set the input to BOTTOM, inflate_string points to 7 Parm (BOTTOM) but not to the char[] memory produced by 96 StoreC:
>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>
>> 349 LoadUS then reads from the output char[] memory of inflate_string which does not include the result of StoreC. The test fails because the return value is != 42.
>>
>> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
>>
>>>   If StoreC followed inflate_string and LoadC followed StoreC - LoadC should point to StoreC. If LoadC does not follow StoreC then result is relaxed.
>>
>> Yes, these cases work fine.
>>
>> Thanks,
>> Tobias
>>
>>>> The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
>>>>
>>>> Setting the input to BOTTOM, generates the following graph:
>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>>> The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.
>>>>
>>>> I adapted the fix to emit a MergeMemNode to capture the entire memory state as input to the intrinsic. The graph then looks like this:
>>>>    LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png
>>>>
>>>> Here is the new webrev:
>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/
>>>> Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?
>>>>
>>>> Best,
>>>> Tobias
>>>>
>>>>>>>> Related question:
>>>>>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>>>>>      char[int:>=0]:exact+any *
>>>>>>>>
>>>>>>>> which is equal to the type of the char load.
>>>>>>>
>>>>>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>>>>>
>>>>>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>>>>>
>>>>>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>>>>>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>
>>>>>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>>>>>
>>>>>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>>>>>
>>>>>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>>>>>
>>>>>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>>>>>      char[int:1]:NotNull:exact *
>>>>>>>>
>>>>>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>>>>>
>>>>>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>>>>>
>>>>>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>>>>>
>>>>>> I will look into this again and try to understand what happens.
>>>>>
>>>>> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Tobias
>>>>>>
>>>>>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>>>>>
>>>>>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>>>>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>>>>>
>>>>>>>>>>>       // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>>>>>       // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>>>>>       // compiler because the CPU does all the ordering for us.
>>>>>>>>>>>
>>>>>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>>>>>
>>>>>>>>>>> Or did I misunderstand your question?
>>>>>>>>>>
>>>>>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>>>>>> read-after-write dependency, but never mind.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Andrew.
>>>>>>>>>>

From vladimir.x.ivanov at oracle.com  Wed Jan 13 13:53:18 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 13 Jan 2016 16:53:18 +0300
Subject: [9] RFR (XS): 8146983: C1: assert(appendix.not_null()) failed for
	invokehandle bytecode
Message-ID: <569656CE.9050602@oracle.com>

http://cr.openjdk.java.net/~vlivanov/8146983/webrev.00
https://bugs.openjdk.java.net/browse/JDK-8146983

Assertion added in 8140659 is too strong: sometimes appendix patching 
for invokehandle happens when there's no appendix argument.

Appendix is present only for MH::invoke (_invokeGeneric; see 
LinkResolver::lookup_polymorphic_method). MH::invokeExact doesn't have 
an appendix. But C1 uses patching for all unresolved invokedynamic & 
invokehandle call sites (see GraphBuilder::invoke and 
Bytecodes::has_optional_appendix).
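
To make the two call shapes concrete, here is a small Java sketch of a MethodHandle.invokeExact call next to a MethodHandle.invoke call. As far as I know both are signature-polymorphic calls that HotSpot handles through its internal invokehandle bytecode, and per the description above only the generic invoke form carries an appendix. The class and method names are hypothetical; the sketch shows the Java-level call sites only and does not demonstrate C1 patching or the assertion itself.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical illustration of the two kinds of MethodHandle call sites. */
final class InvokeHandleSketch {

    private static MethodHandle maxHandle() throws ReflectiveOperationException {
        // Handle of type (int,int)int for Math.max(int, int).
        return MethodHandles.lookup().findStatic(
                Math.class, "max", MethodType.methodType(int.class, int.class, int.class));
    }

    /** MH::invokeExact: the call site type (int,int)int must match the handle type exactly. */
    static int callExact() {
        try {
            MethodHandle mh = maxHandle();
            return (int) mh.invokeExact(1, 2);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** MH::invoke: the call site type (Integer,short)Object is adapted to the handle type. */
    static int callGeneric() {
        try {
            MethodHandle mh = maxHandle();
            Object result = mh.invoke(Integer.valueOf(1), (short) 2);
            return (Integer) result;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Both methods return the same value; the difference is only in how the call site is linked, which is where the presence or absence of an appendix comes from.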

The fix is to remove the assertion.

Also, fixed a leftover from 8140659: appendix resolution for 
invokehandle should be idempotent as well.

Testing: JPRT

Best regards,
Vladimir Ivanov

From pavel.punegov at oracle.com  Wed Jan 13 14:21:16 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Wed, 13 Jan 2016 17:21:16 +0300
Subject: RFR (XXS): 8145025:
	compiler/compilercontrol/commandfile/CompileOnlyTest.java and
	compiler/compilercontrol/commands/CompileOnlyTest.java fail:
	java.lang.RuntimeException: FAILED: method ... compilable:
	false, but should: true
In-Reply-To: <037269E6-9A07-4436-86E5-3E19D260D063@oracle.com>
References: <F15ED7E4-B2EA-44EE-A560-E3B57EFBC5BF@oracle.com>
	<567288EC.3020001@oracle.com>
	<037269E6-9A07-4436-86E5-3E19D260D063@oracle.com>
Message-ID: <31DAC8F9-82C5-4C62-922B-B630B5BF6450@oracle.com>

Anyone else to review, please?

-- Thanks,
Pavel Punegov

> On 17 Dec 2015, at 16:44, Pavel Punegov <pavel.punegov at oracle.com> wrote:
> 
> Thanks for review, Nils
> 
> -- Pavel.
> 
>> On 17 Dec 2015, at 13:05, Nils Eliasson <nils.eliasson at oracle.com> wrote:
>> 
>> Hi Pavel,
>> 
>> Looks good.
>> 
>> //Nils
>> 
>> On 2015-12-16 20:56, Pavel Punegov wrote:
>>> Please review this small fix to a test bug.
>>> 
>>> Issue: when test builds a state for a method that doesn't match any compileonly command it should consider that this method wasn't set compiled/excluded with any other compileonly or exclude commands. This means that it should check that appropriate Optional is not present (isn't set).
>>> 
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8145025
>>> webrev: http://cr.openjdk.java.net/~ppunegov/8145025/webrev.00/
>>> -- Thanks,
>>> Pavel Punegov
>>> 
>> 
> 


From roland.westrelin at oracle.com  Wed Jan 13 14:35:03 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 13 Jan 2016 15:35:03 +0100
Subject: [9] RFR (XS): 8146983: C1: assert(appendix.not_null()) failed for
	invokehandle bytecode
In-Reply-To: <569656CE.9050602@oracle.com>
References: <569656CE.9050602@oracle.com>
Message-ID: <93A56FDC-A240-48BF-A8CF-CB1C0B2D5D0A@oracle.com>

> http://cr.openjdk.java.net/~vlivanov/8146983/webrev.00

Looks good to me.

Roland.


From zoltan.majo at oracle.com  Wed Jan 13 15:02:22 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Wed, 13 Jan 2016 16:02:22 +0100
Subject: [9] RFR (S): 8071864: compiler/c2/6772683/InterruptedTest.java failed
	in nightly
Message-ID: <569666FE.1010007@oracle.com>

Hi,


please review the patch for 8071864.

https://bugs.openjdk.java.net/browse/JDK-8071864

Problem: The test runs using two threads: The main thread and a worker 
thread. Before exiting, the main thread interrupts the worker thread. 
Then, the main thread waits a limited amount of time for the worker 
thread to exit.

On highly loaded systems it can happen that the OS does not provide CPU 
time to the worker thread to exit in the limited amount of time 
available. In this case the test fails.

Solution: Increase the amount of time the main thread waits for the 
worker thread.
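
The interrupt-then-wait pattern described above can be sketched as follows. This is not the code of InterruptedTest.java; the class name, the method name and the sleep-based worker are assumptions made for illustration. The only point is that the verdict depends on the join timeout, which is the value the fix increases.

```java
/** Hypothetical sketch of a main thread interrupting a worker and waiting for it to exit. */
final class InterruptJoinSketch {

    /**
     * Starts a worker, interrupts it, and waits up to joinTimeoutMillis for it to finish.
     * Pass a positive timeout: Thread.join(0) would wait without any limit.
     * Returns true if the worker exited within the timeout.
     */
    static boolean workerExitsWithin(long joinTimeoutMillis) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(10);   // stands in for the worker's real activity
                }
            } catch (InterruptedException e) {
                // Interrupted by the main thread: fall through and let the thread end.
            }
        });
        worker.start();
        worker.interrupt();

        try {
            // On a heavily loaded machine the worker may not be scheduled promptly,
            // so a short timeout here can expire even though nothing is wrong.
            worker.join(joinTimeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```

A test built this way fails spuriously whenever the scheduler starves the worker for longer than the timeout, so a larger timeout trades a slower failure report for fewer false alarms.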

Webrev:
http://cr.openjdk.java.net/~zmajo/8071864/webrev.00/

Testing:
- executed test on a highly loaded system: Without the fix, the test 
fails after 66 iterations; with the fix it was possible to execute the 
test 1000 iterations without a failure;
- JPRT.

Thank you and best regards,


Zoltan


From aph at redhat.com  Wed Jan 13 15:09:51 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 13 Jan 2016 15:09:51 +0000
Subject: RFR: 8146843: aarch64: add scheduling support for FP and vector
	instructions
In-Reply-To: <1452685206.14278.16.camel@mylittlepony.linaroharston>
References: <1452685206.14278.16.camel@mylittlepony.linaroharston>
Message-ID: <569668BF.5070000@redhat.com>

On 01/13/2016 11:40 AM, Edward Nevill wrote:
> The following table shows the performance difference of this change.
> 
> http://cr.openjdk.java.net/~enevill/8146843/vectest.html

OK; this is generally positive.

Andrew.


From roland.westrelin at oracle.com  Wed Jan 13 15:10:35 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 13 Jan 2016 16:10:35 +0100
Subject: RFR(XS): 8146999: hotspot/test/compiler/c2/8007294/Test8007294.java
	test nightly failure
Message-ID: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>

http://cr.openjdk.java.net/~roland/8146999/webrev.00/

8139771 made CheckCastPP inherit from ConstraintCast, but is_ConstraintCast() fails for CheckCastPP and, as a consequence, so does uncast().

Roland.

From tobias.hartmann at oracle.com  Wed Jan 13 15:15:05 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 13 Jan 2016 16:15:05 +0100
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
Message-ID: <569669F9.2030607@oracle.com>

Hi Roland,

looks good to me (not a reviewer).

Best,
Tobias

On 13.01.2016 16:10, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8146999/webrev.00/
> 
> 8139771 made CheckCastPP inherit from ConstraintCast but the is_ConstraintCast() fails for CheckCastPP and as a consequence so does uncast().
> 
> Roland.
> 

From vladimir.x.ivanov at oracle.com  Wed Jan 13 15:54:21 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 13 Jan 2016 18:54:21 +0300
Subject: [9] RFR (XS): 8146983: C1: assert(appendix.not_null()) failed for
	invokehandle bytecode
In-Reply-To: <93A56FDC-A240-48BF-A8CF-CB1C0B2D5D0A@oracle.com>
References: <569656CE.9050602@oracle.com>
	<93A56FDC-A240-48BF-A8CF-CB1C0B2D5D0A@oracle.com>
Message-ID: <5696732D.8070408@oracle.com>

Thanks, Roland!

Best regards,
Vladimir Ivanov

On 1/13/16 5:35 PM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~vlivanov/8146983/webrev.00
>
> Looks good to me.
>
> Roland.
>

From vladimir.kozlov at oracle.com  Wed Jan 13 18:09:21 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 13 Jan 2016 10:09:21 -0800
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
Message-ID: <569692D1.4040001@oracle.com>

Right.

Thanks,
Vladimir

On 1/13/16 7:10 AM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8146999/webrev.00/
>
> 8139771 made CheckCastPP inherit from ConstraintCast but the is_ConstraintCast() fails for CheckCastPP and as a consequence so does uncast().
>
> Roland.
>

From vladimir.kozlov at oracle.com  Wed Jan 13 18:10:52 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 13 Jan 2016 10:10:52 -0800
Subject: [9] RFR (S): 8071864: compiler/c2/6772683/InterruptedTest.java
	failed in nightly
In-Reply-To: <569666FE.1010007@oracle.com>
References: <569666FE.1010007@oracle.com>
Message-ID: <5696932C.1080602@oracle.com>

Good.

Thanks,
Vladimir

On 1/13/16 7:02 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the patch for 8071864.
>
> https://bugs.openjdk.java.net/browse/JDK-8071864
>
> Problem: The test runs using two threads: The main thread and a worker thread. Before exiting, the main thread
> interrupts the worker thread. Then, the main thread waits a limited amount of time for the worker thread to exit.
>
> On highly loaded systems it can happen that the OS does not provide CPU time to the worker thread to exit in the limited
> amount of time available. In this case the test fails.
>
> Solution: Increase the amount of time the main thread waits for the worker thread.
>
> Webrev:
> http://cr.openjdk.java.net/~zmajo/8071864/webrev.00/
>
> Testing:
> - executed test on a highly loaded system: Without the fix, the test fails after 66 iterations; with the fix it was
> possible to execute the test 1000 iterations without a failure;
> - JPRT.
>
> Thank you and best regards,
>
>
> Zoltan
>

From vladimir.kozlov at oracle.com  Wed Jan 13 18:16:48 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 13 Jan 2016 10:16:48 -0800
Subject: RFR (XXS): 8145025:
	compiler/compilercontrol/commandfile/CompileOnlyTest.java and
	compiler/compilercontrol/commands/CompileOnlyTest.java fail:
	java.lang.RuntimeException: FAILED: method ... compilable: false,
	but should: true
In-Reply-To: <31DAC8F9-82C5-4C62-922B-B630B5BF6450@oracle.com>
References: <F15ED7E4-B2EA-44EE-A560-E3B57EFBC5BF@oracle.com>
	<567288EC.3020001@oracle.com>
	<037269E6-9A07-4436-86E5-3E19D260D063@oracle.com>
	<31DAC8F9-82C5-4C62-922B-B630B5BF6450@oracle.com>
Message-ID: <56969490.7020300@oracle.com>

Good.

Thanks,
Vladimir

On 1/13/16 6:21 AM, Pavel Punegov wrote:
> Anyone else to review, please?
>
> -- Thanks,
> Pavel Punegov
>
>> On 17 Dec 2015, at 16:44, Pavel Punegov <pavel.punegov at oracle.com> wrote:
>>
>> Thanks for review, Nils
>>
>> -- Pavel.
>>
>>> On 17 Dec 2015, at 13:05, Nils Eliasson <nils.eliasson at oracle.com> wrote:
>>>
>>> Hi Pavel,
>>>
>>> Looks good.
>>>
>>> //Nils
>>>
>>> On 2015-12-16 20:56, Pavel Punegov wrote:
>>>> Please review this small fix to a test bug.
>>>>
>>>> Issue: when the test builds a state for a method that doesn't match any compileonly command, it should consider that this
>>>> method wasn't set compiled/excluded with any other compileonly or exclude commands. This means that it should check
>>>> that the appropriate Optional is not present (isn't set).
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8145025
>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8145025/webrev.00/
>>>>
>>>> -- Thanks,
>>>> Pavel Punegov
>>>>
>>>
>>
>

From roland.westrelin at oracle.com  Wed Jan 13 19:05:03 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 13 Jan 2016 20:05:03 +0100
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <569692D1.4040001@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
	<569692D1.4040001@oracle.com>
Message-ID: <C797E4B4-0D91-4345-BB75-BA24703304D3@oracle.com>

Thanks Tobias, Vladimir for the review.

Roland.

From ahmed.khawaja at oracle.com  Wed Jan 13 20:51:28 2016
From: ahmed.khawaja at oracle.com (Ahmed Khawaja)
Date: Wed, 13 Jan 2016 14:51:28 -0600
Subject: Accessing Addresses of object members from interp/C1
Message-ID: <5696B8D0.4010408@oracle.com>

I am working on adding/modifying some intrinsics that can get called 
from the interpreter and C1. I need to pass the address of a member 
variable (an array of ints) to each of the intrinsics. I know how to do 
this with C2 but not interp/C1. Can anyone point me in the right direction?

For the interpreter:
AbstractInterpreter::MethodKind kind -> need address from this

For C1:
Intrinsic* x   -> need address from this


Thank you,
    Ahmed

From christian.thalinger at oracle.com  Wed Jan 13 23:09:15 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 13 Jan 2016 13:09:15 -1000
Subject: RFR: 8146788: remove jvmci.jar from mx suite
In-Reply-To: <A33CE13D-EA4A-480D-82EB-533200B3E31F@oracle.com>
References: <A33CE13D-EA4A-480D-82EB-533200B3E31F@oracle.com>
Message-ID: <6C96000A-BEFD-4CE8-9C2E-70C155283BEE@oracle.com>

I'm not sure about this.  One reason to have the monolithic jvmci.jar is to not have to build the JVMCI on command line when you make a change in your IDE:

cthaling at macbook:~/ws/jdk9/hs-comp/hotspot$ mx -v vm -version
/Users/cthaling/ws/jdk9/hs-comp/build/macosx-x86_64-normal-server-release/jdk/bin/java -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -d64 -Xbootclasspath/p:/Users/cthaling/ws/jdk9/hs-comp/build/mx/hotspot/dists/jvmci.jar -version
java version "9-internal"
Java(TM) SE Runtime Environment (build 9-internal+0-2016-01-11-180948.cthaling.hs-comp)
Java HotSpot(TM) 64-Bit Server VM (build 9-internal+0-2016-01-11-180948.cthaling.hs-comp, mixed mode)

> On Jan 11, 2016, at 4:05 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Please review this small change to remove generation of a jvmci.jar by the mx JVMCI build system.
> 
> https://bugs.openjdk.java.net/browse/JDK-8146788
> http://cr.openjdk.java.net/~dnsimon/8146788/
> 
> -Doug


From doug.simon at oracle.com  Wed Jan 13 23:24:55 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 14 Jan 2016 00:24:55 +0100
Subject: RFR: 8146788: remove jvmci.jar from mx suite
In-Reply-To: <6C96000A-BEFD-4CE8-9C2E-70C155283BEE@oracle.com>
References: <A33CE13D-EA4A-480D-82EB-533200B3E31F@oracle.com>
	<6C96000A-BEFD-4CE8-9C2E-70C155283BEE@oracle.com>
Message-ID: <600A5EE2-9B76-4914-A747-AE0FF1CD316B@oracle.com>

Fair enough. This was mainly removed because it's not used by Graal any more. But you're right, it's still useful when hacking on JVMCI itself from within Eclipse.

How do I withdraw this JBS issue? Close it with "Won't fix"?

> On 14 Jan 2016, at 00:09, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> I'm not sure about this.  One reason to have the monolithic jvmci.jar is to not have to build the JVMCI on command line when you make a change in your IDE:
> 
> cthaling at macbook:~/ws/jdk9/hs-comp/hotspot$ mx -v vm -version
> /Users/cthaling/ws/jdk9/hs-comp/build/macosx-x86_64-normal-server-release/jdk/bin/java -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -d64 -Xbootclasspath/p:/Users/cthaling/ws/jdk9/hs-comp/build/mx/hotspot/dists/jvmci.jar -version
> java version "9-internal"
> Java(TM) SE Runtime Environment (build 9-internal+0-2016-01-11-180948.cthaling.hs-comp)
> Java HotSpot(TM) 64-Bit Server VM (build 9-internal+0-2016-01-11-180948.cthaling.hs-comp, mixed mode)
> 
>> On Jan 11, 2016, at 4:05 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> Please review this small change to remove generation of a jvmci.jar by the mx JVMCI build system.
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8146788
>> http://cr.openjdk.java.net/~dnsimon/8146788/
>> 
>> -Doug
> 


From christian.thalinger at oracle.com  Wed Jan 13 23:30:34 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 13 Jan 2016 13:30:34 -1000
Subject: RFR: 8146788: remove jvmci.jar from mx suite
In-Reply-To: <600A5EE2-9B76-4914-A747-AE0FF1CD316B@oracle.com>
References: <A33CE13D-EA4A-480D-82EB-533200B3E31F@oracle.com>
	<6C96000A-BEFD-4CE8-9C2E-70C155283BEE@oracle.com>
	<600A5EE2-9B76-4914-A747-AE0FF1CD316B@oracle.com>
Message-ID: <D1119E8D-C70A-4F66-BA78-62755D025681@oracle.com>


> On Jan 13, 2016, at 1:24 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Fair enough. This was mainly removed because it's not used by Graal any more. But you're right, it's still useful when hacking on JVMCI itself from within Eclipse.
> 
> How do I withdraw this JBS issue? Close it with "Won't fix"?

There is "Withdrawn" which is technically for JEPs.  "Won't Fix" works.

> 
>> On 14 Jan 2016, at 00:09, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> I'm not sure about this.  One reason to have the monolithic jvmci.jar is to not have to build the JVMCI on command line when you make a change in your IDE:
>> 
>> cthaling at macbook:~/ws/jdk9/hs-comp/hotspot$ mx -v vm -version
>> /Users/cthaling/ws/jdk9/hs-comp/build/macosx-x86_64-normal-server-release/jdk/bin/java -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -d64 -Xbootclasspath/p:/Users/cthaling/ws/jdk9/hs-comp/build/mx/hotspot/dists/jvmci.jar -version
>> java version "9-internal"
>> Java(TM) SE Runtime Environment (build 9-internal+0-2016-01-11-180948.cthaling.hs-comp)
>> Java HotSpot(TM) 64-Bit Server VM (build 9-internal+0-2016-01-11-180948.cthaling.hs-comp, mixed mode)
>> 
>>> On Jan 11, 2016, at 4:05 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> Please review this small change to remove generation of a jvmci.jar by the mx JVMCI build system.
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8146788
>>> http://cr.openjdk.java.net/~dnsimon/8146788/
>>> 
>>> -Doug
>> 
> 


From christian.thalinger at oracle.com  Thu Jan 14 05:58:58 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 13 Jan 2016 19:58:58 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
	<41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
	<B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>
Message-ID: <7C1CBFFE-9A7C-4195-A8EA-BD7B94092E4F@oracle.com>


> On Jan 12, 2016, at 12:39 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 12, 2016, at 12:14 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>>> 
>>>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>> If we're going with an enum, you could put accessors directly in the enum:
>>>>> 
>>>>> private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>>>> 
>>>>> private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>>>> 
>>>>> You could then type the value of the options and check the right accessor is used:
>>>>> 
>>>>>  public enum Option {
>>>>>      ImplicitStableValues(boolean.class),
>>>>>      InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>>>      PrintConfig(boolean.class),
>>>>>      PrintFlags(boolean.class),
>>>>>      ShowFlags(boolean.class),
>>>>>      TraceMethodDataFilter(String.class),
>>>>>      TrustFinalDefaultFields(String.class);
>>>>> 
>>>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>>>> 
>>>> Excellent idea!  I was also thinking about adding the default value to the enum.
>>> 
>>> Can you do that without having to box the default value?
>> 
>> No, we have to box but we can initialize all flags in the constructor:
>> 
>> http://cr.openjdk.java.net/~twisti/8146820/webrev.02/

Do we agree on the change?

>> 
>> We will not have many flags so this should be alright.  A PrintFlags looks like this:
>> 
>> $ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.PrintFlags=true InitGraal
>> [List of JVMCI options]
>>  boolean ImplicitStableValues                     := true          
>>  boolean InitTimer                                := false         
>>  boolean PrintConfig                              := false         
>>  boolean PrintFlags                                = true          
>>  boolean ShowFlags                                := false         
>>   String TraceMethodDataFilter                    := null          
>>   String TrustFinalDefaultFields                  := true          
> 
> ...and this is a bug, of course :-)
> 
>> 
>> I'm almost tempted to move InitTimer to another package, like jdk.vm.ci.common...
>> 
>>> 
>>> -Doug


From zoltan.majo at oracle.com  Thu Jan 14 08:21:47 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Thu, 14 Jan 2016 09:21:47 +0100
Subject: [9] RFR (S): 8071864: compiler/c2/6772683/InterruptedTest.java
	failed in nightly
In-Reply-To: <5696932C.1080602@oracle.com>
References: <569666FE.1010007@oracle.com> <5696932C.1080602@oracle.com>
Message-ID: <56975A9B.9010005@oracle.com>

Thank you, Vladimir, for the review!

Best regards,


Zoltan

On 01/13/2016 07:10 PM, Vladimir Kozlov wrote:
> Good.
>
> Thanks,
> Vladimir
>
> On 1/13/16 7:02 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the patch for 8071864.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8071864
>>
>> Problem: The test runs using two threads: The main thread and a 
>> worker thread. Before exiting, the main thread
>> interrupts the worker thread. Then, the main thread waits a limited 
>> amount of time for the worker thread to exit.
>>
>> On highly loaded systems it can happen that the OS does not provide 
>> CPU time to the worker thread to exit in the limited
>> amount of time available. In this case the test fails.
>>
>> Solution: Increase the amount of time the main thread waits for the 
>> worker thread.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~zmajo/8071864/webrev.00/
>>
>> Testing:
>> - executed test on a highly loaded system: Without the fix, the test 
>> fails after 66 iterations; with the fix it was
>> possible to execute the test for 1000 iterations without a failure;
>> - JPRT.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>


From doug.simon at oracle.com  Thu Jan 14 12:44:42 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 14 Jan 2016 13:44:42 +0100
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <7C1CBFFE-9A7C-4195-A8EA-BD7B94092E4F@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
	<41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
	<B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>
	<7C1CBFFE-9A7C-4195-A8EA-BD7B94092E4F@oracle.com>
Message-ID: <E6B59722-3602-4934-A0C7-BFAD94CB6699@oracle.com>


> On 14 Jan 2016, at 06:58, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 12, 2016, at 12:39 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 12, 2016, at 12:14 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>> 
>>>>>> If we're going with an enum, you could put accessors directly in the enum:
>>>>>> 
>>>>>> private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>>>>> 
>>>>>> private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>>>>> 
>>>>>> You could then type the value of the options and check the right accessor is used:
>>>>>> 
>>>>>>  public enum Option {
>>>>>>      ImplicitStableValues(boolean.class),
>>>>>>      InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>>>>      PrintConfig(boolean.class),
>>>>>>      PrintFlags(boolean.class),
>>>>>>      ShowFlags(boolean.class),
>>>>>>      TraceMethodDataFilter(String.class),
>>>>>>      TrustFinalDefaultFields(String.class);
>>>>>> 
>>>>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>>>>> 
>>>>> Excellent idea!  I was also thinking about adding the default value to the enum.
>>>> 
>>>> Can you do that without having to box the default value?
>>> 
>>> No, we have to box but we can initialize all flags in the constructor:
>>> 
>>> http://cr.openjdk.java.net/~twisti/8146820/webrev.02/
> 
> Do we agree on the change?

I would prefer it if the value was lazy initialized (for non-AOT runtimes):


    /**
     * Supported JVMCI options.
     */
    public enum Option {
        ImplicitStableValues(boolean.class, true),
        InitTimer(boolean.class, false),  // Note: Not used (see InitTimer.ENABLED).
        PrintConfig(boolean.class, false),
        PrintFlags(boolean.class, false),
        ShowFlags(boolean.class, false),
        TraceMethodDataFilter(String.class, null),
        TrustFinalDefaultFields(boolean.class, true);

        /**
         * The prefix for system properties that are JVMCI options.
         */
        private static final String JVMCI_OPTION_PROPERTY_PREFIX = "jvmci.";

        private final Class<?> type;
        private Object value;
        private final Object defaultValue;
        private boolean isDefault;

        private Option(Class<?> type, Object defaultValue) {
            assert Character.isUpperCase(name().charAt(0)) : "Option name must start with upper-case letter: " + name();
            this.type = type;
            this.value = "UNINITIALIZED";
            this.defaultValue = defaultValue;
        }

        private Object getValue() {
            if (value == "UNINITIALIZED") {
                String propertyValue = VM.getSavedProperty(JVMCI_OPTION_PROPERTY_PREFIX + name());
                if (propertyValue == null) {
                    this.value = defaultValue;
                    this.isDefault = true;
                } else {
                    if (type == boolean.class) {
                        this.value = Boolean.parseBoolean(propertyValue);
                    } else if (type == String.class) {
                        this.value = propertyValue;
                    } else {
                        throw new JVMCIError("Unexpected option type " + type);
                    }
                    this.isDefault = false;
                }
                // Saved properties should not be interned - let's be sure
                assert value != "UNINITIALIZED";
            }
            return value;
        }

        /**
         * Returns the option's value as boolean.
         *
         * @return option's value
         */
        public boolean getBoolean() {
            return (boolean) getValue();
        }

        /**
         * Returns the option's value as String.
         *
         * @return option's value
         */
        public String getString() {
            return (String) getValue();
        }

        /**
         * Prints all option flags to {@code out}.
         *
         * @param out stream to print to
         */
        public static void printFlags(PrintStream out) {
            out.println("[List of JVMCI options]");
            for (Option option : values()) {
                Object value = option.getValue();
                String assign = option.isDefault ? ":=" : " =";
                out.printf("%9s %-40s %s %-14s%n", option.type.getSimpleName(), option, assign, value);
            }
        }
    }


 Also, you can remove all the static fields that just cache a (possibly unboxed) option value and use the option directly. For example:

diff -r 1034ff44c5d0 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java
--- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Tue Jan 12 15:04:27 2016 +0100
+++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Thu Jan 14 13:40:28 2016 +0100
@@ -29,6 +29,7 @@
 import java.lang.reflect.Field;

 import jdk.vm.ci.common.JVMCIError;
+import jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.Option;
 import jdk.vm.ci.meta.JavaType;
 import jdk.vm.ci.meta.LocationIdentity;
 import jdk.vm.ci.meta.MetaAccessProvider;
@@ -41,11 +42,6 @@
  */
 class HotSpotResolvedJavaFieldImpl implements HotSpotResolvedJavaField, HotSpotProxified {

-    /**
-     * Mark well-known stable fields as such.
-     */
-    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
-
     private final HotSpotResolvedObjectTypeImpl holder;
     private final String name;
     private JavaType type;
@@ -198,7 +194,7 @@
             return true;
         }
         assert getAnnotation(Stable.class) == null;
-        if (ImplicitStableValues && isImplicitStableField()) {
+        if (Option.ImplicitStableValues.getBoolean() && isImplicitStableField()) {
             return true;
         }
         return false;

None of the current options are used in tight loops where the cost of the unboxing (if any) would matter.

Lastly, since you've added PrintFlags and ShowFlags, why not add a help message to each option? For example:

        ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),
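
To make the combined suggestion concrete (typed options, lazy initialization, a help string per option), here is a minimal standalone sketch. It is not the webrev code. It assumes plain System.getProperty in place of VM.getSavedProperty, a made-up "sketch." property prefix, an `initialized` flag in place of the "UNINITIALIZED" sentinel string, and an invented help text for the String option.

```java
import java.io.PrintStream;

class OptionSketch {
    enum Option {
        ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),
        TraceMethodDataFilter(String.class, null, "Hypothetical help text for a String option.");

        // Assumed prefix for this sketch only; the real options use "jvmci.".
        private static final String PREFIX = "sketch.";

        private final Class<?> type;
        private final Object defaultValue;
        private final String help;
        private Object value;
        private boolean initialized;
        private boolean isDefault;

        Option(Class<?> type, Object defaultValue, String help) {
            this.type = type;
            this.defaultValue = defaultValue;
            this.help = help;
        }

        // The system property is read on first use, not when the enum is loaded.
        private synchronized Object getValue() {
            if (!initialized) {
                String property = System.getProperty(PREFIX + name());
                if (property == null) {
                    value = defaultValue;
                    isDefault = true;
                } else if (type == boolean.class) {
                    value = Boolean.parseBoolean(property);
                } else if (type == String.class) {
                    value = property;
                } else {
                    throw new IllegalStateException("Unexpected option type " + type);
                }
                initialized = true;
            }
            return value;
        }

        boolean getBoolean() {
            if (type != boolean.class) {
                throw new IllegalStateException(name() + " is not a boolean option");
            }
            return (Boolean) getValue();
        }

        String getString() {
            if (type != String.class) {
                throw new IllegalStateException(name() + " is not a String option");
            }
            return (String) getValue();
        }

        static void printFlags(PrintStream out) {
            out.println("[List of options]");
            for (Option option : values()) {
                Object value = option.getValue();
                String assign = option.isDefault ? ":=" : " =";
                out.printf("%9s %-40s %s %-14s %s%n",
                        option.type.getSimpleName(), option, assign, value, option.help);
            }
        }
    }

    public static void main(String[] args) {
        Option.printFlags(System.out);
    }
}
```

Checking the declared type in each accessor turns a mismatch between declaration and use into an immediate, named error instead of a ClassCastException at some later call site.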

-Doug

> 
>>> 
>>> We will not have many flags so this should be alright.  A PrintFlags looks like this:
>>> 
>>> $ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.PrintFlags=true InitGraal
>>> [List of JVMCI options]
>>>  boolean ImplicitStableValues                     := true          
>>>  boolean InitTimer                                := false         
>>>  boolean PrintConfig                              := false         
>>>  boolean PrintFlags                                = true          
>>>  boolean ShowFlags                                := false         
>>>   String TraceMethodDataFilter                    := null          
>>>   String TrustFinalDefaultFields                  := true          
>> 
>> ...and this is a bug, of course :-)
>> 
>>> 
>>> I'm almost tempted to move InitTimer to another package, like jdk.vm.ci.common...
>>> 
>>>> 
>>>> -Doug


From nils.eliasson at oracle.com  Thu Jan 14 12:44:47 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 14 Jan 2016 13:44:47 +0100
Subject: RFR(S): 8145331: SEGV in DirectivesStack::release(DirectiveSet*)
Message-ID: <5697983F.3080509@oracle.com>

Hi,

Please review this patch:

Description:
In the fix for JDK-8144873 I updated only one of the two use cases of 
CompilerDirectives::get_for(AbstractCompiler..)

Summary:
I simplify CompilerDirectives::get_for(..) to always return the c1_store 
for all unsupported cases. Makes getMatchingDirective and 
getDefaultDirective simpler too. Moved refcount out of get_for(...) 
since it is not guaranteed to be used if updated here.

Testing:
All intrinsic tests and all compilercontrol tests in addition to testset 
hotspot.
IntrinsicAvailableTest is updated to not check JVMCI compiler for 
intrinsics.
IntrinsicDisabledTest.jtr doesn't work with JVMCI - no action taken
NullCheckDroppingsTest.jtr doesn't work - since JVMCI doesn't support 
BackgroundCompilation - no action taken

Bug: https://bugs.openjdk.java.net/browse/JDK-8145331
Webrev: http://cr.openjdk.java.net/~neliasso/8145331/webrev.01/

Regards,
Nils Eliasson

From pavel.punegov at oracle.com  Thu Jan 14 14:55:52 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Thu, 14 Jan 2016 17:55:52 +0300
Subject: RFR (XXS): 8145025:
	compiler/compilercontrol/commandfile/CompileOnlyTest.java and
	compiler/compilercontrol/commands/CompileOnlyTest.java fail:
	java.lang.RuntimeException: FAILED: method ... compilable:
	false, but should: true
In-Reply-To: <56969490.7020300@oracle.com>
References: <F15ED7E4-B2EA-44EE-A560-E3B57EFBC5BF@oracle.com>
	<567288EC.3020001@oracle.com>
	<037269E6-9A07-4436-86E5-3E19D260D063@oracle.com>
	<31DAC8F9-82C5-4C62-922B-B630B5BF6450@oracle.com>
	<56969490.7020300@oracle.com>
Message-ID: <EB566C2C-9225-4A59-AD64-644FEC775BD5@oracle.com>

Thanks for review, Vladimir.

-- Pavel.

> On 13 Jan 2016, at 21:16, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir
> 
> On 1/13/16 6:21 AM, Pavel Punegov wrote:
>> Anyone else to review, please?
>> 
>> -- Thanks,
>> Pavel Punegov
>> 
>>> On 17 Dec 2015, at 16:44, Pavel Punegov <pavel.punegov at oracle.com> wrote:
>>> 
>>> Thanks for review, Nils
>>> 
>>> -- Pavel.
>>> 
>>>> On 17 Dec 2015, at 13:05, Nils Eliasson <nils.eliasson at oracle.com> wrote:
>>>> 
>>>> Hi Pavel,
>>>> 
>>>> Looks good.
>>>> 
>>>> //Nils
>>>> 
>>>> On 2015-12-16 20:56, Pavel Punegov wrote:
>>>>> Please review this small fix to a test bug.
>>>>> 
>>>>> Issue: when the test builds a state for a method that doesn't match any compileonly command, it should consider that this
>>>>> method wasn't set compiled/excluded with any other compileonly or exclude commands. This means that it should check
>>>>> that the appropriate Optional is not present (isn't set).
>>>>> 
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8145025
>>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8145025/webrev.00/
>>>>> 
>>>>> -- Thanks,
>>>>> Pavel Punegov
>>>>> 
>>>> 
>>> 
>> 


From vladimir.x.ivanov at oracle.com  Thu Jan 14 15:05:25 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 14 Jan 2016 18:05:25 +0300
Subject: [9] RFR (S): 8140001: _allocateInstance intrinsic does not throw
	InstantiationException for abstract classes and interfaces
In-Reply-To: <56951A3E.7070805@oracle.com>
References: <56951A3E.7070805@oracle.com>
Message-ID: <5697B935.9020209@oracle.com>

Any feedback, please?

Best regards,
Vladimir Ivanov

On 1/12/16 6:22 PM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8140001/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8140001
>
> EA can eliminate allocations of abstract classes or interfaces, thus
> changing observable behavior of a program as the test case demonstrates.
>
> The fix is to always mark such allocations as escaping.
>
> Testing: failing test, JPRT.
>
> Thanks!
>
> Best regards,
> Vladimir Ivanov

From aleksey.shipilev at oracle.com  Thu Jan 14 15:15:52 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 14 Jan 2016 18:15:52 +0300
Subject: [9] RFR (S): 8140001: _allocateInstance intrinsic does not throw
	InstantiationException for abstract classes and interfaces
In-Reply-To: <5697B935.9020209@oracle.com>
References: <56951A3E.7070805@oracle.com> <5697B935.9020209@oracle.com>
Message-ID: <5697BBA8.8030901@oracle.com>

Looks okay to me, but I think the property name should reflect Java
terminology, e.g. "can_be_instantiated", "not is_allocatable"?

$ javac AbstractSample.java
[ERROR] AbstractSample.java:[36,9] AbstractSample.M is abstract; cannot
be instantiated

Thanks,
-Aleksey

On 01/14/2016 06:05 PM, Vladimir Ivanov wrote:
> Any feedback, please?
> 
> Best regards,
> Vladimir Ivanov
> 
> On 1/12/16 6:22 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8140001/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8140001
>>
>> EA can eliminate allocations of abstract classes or interfaces, thus
>> changing observable behavior of a program as the test case demonstrates.
>>
>> The fix is to always mark such allocations as escaping.
>>
>> Testing: failing test, JPRT.
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov



From tobias.hartmann at oracle.com  Thu Jan 14 16:00:36 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 14 Jan 2016 17:00:36 +0100
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained ConvI2L
	with narrowed type
Message-ID: <5697C624.7040201@oracle.com>

Hi,

please review the following patch.

https://bugs.openjdk.java.net/browse/JDK-6675699
http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/

*Problem*
The problem is that ConvI2L nodes with a narrow type (used to convert integer array indices to long values) are not dependent on the corresponding range check that proves that the input value is always in the (integer-)range. As a result, the ConvI2L node may flow above the range check during loop optimizations and end up with an input that is not in its type range. The node is then replaced by TOP causing the data path to be eliminated. However, because there is no control dependency on the corresponding range check, the control path from the peeled iteration that uses the result of the ConvI2L may not be eliminated. We crash because we are potentially using a value that is not available.

For example, TestLoopPeeling::testArrayAccess() triggers loop peeling because the loop contains an invariant check. The array store in line 66 is moved out of the loop and is reachable from the peeled and old iterations of the loop. However, the array index computation consisting of a LShiftL(ConvI2L(Phi)) remains in each loop because it has loop variant usages and is not dependent on the range check that was moved out of the loop. The peeled iteration of the loop uses storeIndex == -1 causing the ConvI2L to be replaced by TOP because -1 is not in its [0, MAX_INT] range. The TOP is propagated downwards and ends up as one of the inputs to the Phi that merges the array index from the peeled and old loop exits. The Phi is replaced by its only remaining input and the store ends up using the index from the old iteration although it's still reachable from the peeled iteration. We crash because we potentially use the index value from the old iteration while coming from the peeled iteration (of course, the range check would catch this at runtime).

This problem may show up with array accesses but also with other code for which we emit a ConvI2L node with a narrow type. For example, array allocation uses a ConvI2L to convert the integer array size to a long value (see TestLoopPeeling::testArrayAllocation). We solved several different instances of this problem in the past with "workaround-fixes" that just disabled loop optimizations in special cases (see below). Such a workaround fix is not feasible to fix all potential occurrences of this problem. TestLoopPeeling.java crashes JDK 7, 8 and 9.
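
As a rough illustration of the loop shape described for the array access case, consider the sketch below. It is a simplified stand-in written for this note, not the code of TestLoopPeeling.java; the class name, the constants and the return values are assumptions. It combines a loop-invariant check, a loop-variant use of the index, and a store whose index is -1 in the first iteration.

```java
class PeelingShapeSketch {
    static final int[] array = new int[16];

    static int run(int index, int inc) {
        int storeIndex = -1;
        for (; index < 10; index += inc) {
            // Loop-invariant check: this kind of test makes the loop a peeling candidate.
            if (inc == 42) {
                return -2;
            }
            // Loop-variant use of the array index computation.
            if (storeIndex > 0 && array[storeIndex] == 42) {
                return -3;
            }
            if (index == 7) {
                // Store with its range check. In the first (peeled) iteration
                // storeIndex is still -1, so only the range check protects it.
                array[storeIndex] = 1;
                return storeIndex;
            }
            storeIndex++;
        }
        return storeIndex;
    }

    public static void main(String[] args) {
        System.out.println("stored at index " + run(0, 1));
        try {
            run(7, 1);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("range check caught storeIndex == -1");
        }
    }
}
```

At the Java level the range check always rejects the -1 index. The issue discussed here is that the compiler's graph could lose the link between that check and the index conversion.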

*Solution*
To make the ConvI2L dependent on a range check, I added code to emit a narrow CastII node with a control dependency on the range check that is then used as input to the ConvI2L. Like this, we explicitly express the dependency and prevent loop optimizations from moving the ConvI2L above the range check. 

To make sure that the impact is as small as possible, the range check dependent CastII nodes are removed right after loop optimizations. Further, all optimizations that depend on the old shape of array address computations are adapted to be aware of the CastII node.

With the fix, we could now remove the following old "workaround-fixes":
https://bugs.openjdk.java.net/browse/JDK-4781451
https://bugs.openjdk.java.net/browse/JDK-4799512
https://bugs.openjdk.java.net/browse/JDK-6659207
https://bugs.openjdk.java.net/browse/JDK-6663854
For reference, the individual patches can be found here:
http://cr.openjdk.java.net/~thartmann/6675699/backouts/

However, performance evaluation showed that backing out the old fixes causes significant regressions. It seems that aggressive splitting of ConvI2L nodes through phis leads to less optimal code due to more register spilling. I suspect that additional changes to the loop optimizations are necessary and would therefore like to leave the workaround fixes in for now. I filed JDK-8145313 to remove them later. Like this, we also reduce the impact/risk when backporting this fix to JDK 8 and potentially JDK 7.

Roland pointed out that the changes in ConvI2LNode::Ideal() could potentially be merged into the CastIINode::Ideal() optimization introduced by his fix for JDK-8145322. After some investigation it turned out that the CastII optimization does not only affect memory addressing but also other CastII(AddI(..)) graph shapes. Making it more generic has a broader impact and therefore needs more investigation. I filed JDK-8147394 for this.

ConvI2L nodes with a narrow type are also emitted by intrinsics:
- GraphKit::array_element_address()
- PhaseMacroExpand::array_element_address()
- ArrayCopyNode::prepare_array_copy()
I was not able to reproduce the problem with intrinsics. It's also not easily possible to make the CastII node range check dependent here because the range check is not always available from within the intrinsic.

*Testing*
I did extensive testing to make sure the fix does not introduce correctness or performance issues.
- Different RBT test suites [1] with and without -Xcomp.
- Full run of multiple CTW suites.
- Verified changes in "PhaseIdealLoop::match_fill_loop" (loopTransform.cpp) by manually checking the output of [2] with -XX:+TraceOptimizeFill.
- Verified changes in "IfNode::improve_address_types" (ifnode.cpp) by manually checking the output of [3] with -XX:+PrintOptoAssembly to make sure all range checks are folded.
- Verified changes in superword.cpp by comparing output with -XX:+TraceSuperWord.
- Performance runs (Footprint, JMH-Javac, SPECjbb2005, SPECjvm2008, Startup, Volano) on x86 and SPARC showed no regression

Thanks,
Tobias

[1] RBT test suites:
- hotspot/test/:hotspot_all
- noncolo.testlist
- vm.compiler.testlist
- vm.regression.testlist
- nsk.regression.testlist
- nsk.split_verifier.testlist
- nsk.stress.testlist
- nsk.stress.jck.testlist
- jdk/test/:jdk_jfr
- jdk/test/:svc_tools
- jdk/test/:jdk_instrument
- jdk/test/:jdk_lang
- jdk/test/:jdk_svc
- nashorn/test/:tier1
- nashorn/test/:tier2
- nashorn/test/:tier3
Only without -Xcomp:
- Kitchensink
- runThese
- Weblogic12medrec
[2] test/compiler/intrinsics/6982370/Test6982370.java
[3] test/compiler/rangechecks/TestExplicitRangeChecks.java

From vladimir.x.ivanov at oracle.com  Thu Jan 14 17:59:10 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 14 Jan 2016 20:59:10 +0300
Subject: [9] RFR (XS): 6985422: flush the output streams before OnError
	commands
In-Reply-To: <5695505F.7050005@oracle.com>
References: <569548D4.2070707@oracle.com> <5695505F.7050005@oracle.com>
Message-ID: <5697E1EE.4010704@oracle.com>

Thank you, Vladimir.

Best regards,
Vladimir Ivanov

On 1/12/16 10:13 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Vladimir K
>
> On 1/12/16 10:41 AM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/6985422/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-6985422
>>
>> OnError commands are executed before hotspot log is finished.
>>
>> The fix is to finish the log before executing OnError commands.
>>
>> Also, I moved compilation replay data dumping logic before OnError
>> processing, so compilation replay file is accessible
>> from OnError commands as well.
>>
>> I verified the fix by triggering VM crash w/ -XX:+LogCompilation
>> -XX:LogFile=hotspot.log -XX:OnError='cp hotspot.log
>> hs.log' flags and checking that hs.log is complete. Without the fix
>> the log is corrupted.
>>
>> Testing: manual, JPRT.
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov

From vladimir.kozlov at oracle.com  Thu Jan 14 18:29:55 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 14 Jan 2016 10:29:55 -0800
Subject: Request for Reviews (S): JDK-8003585 strength reduce or eliminate
	range checks for power-of-two sized arrays
In-Reply-To: <70FBA4CF-CF05-4232-AFEC-202E93BFA930@oracle.com>
References: <CA+cQ+tR5c3y3uLpdO2Gop-RA0ZPJ4Mmq9yAAK5dOpFtgv-XPcg@mail.gmail.com>
	<440F2280-4B25-4AE6-A4F6-DDD4EB529636@oracle.com>
	<52FC129D.7040409@oracle.com>
	<CA+cQ+tTzA822citeYM0zCNE3FDyBk9hTO2h2i6POiJWSar6hVA@mail.gmail.com>
	<CA+cQ+tTWkkJHwcvJ4R=f31ve9vRwG5TfizHfo56sO9R3p3sRcQ@mail.gmail.com>
	<52FE6A08.20400@oracle.com>
	<CA+cQ+tQ00O1bT0JAAX9Lk+hPrfLV0PgQoiO1wmEUbKrPrqi1kA@mail.gmail.com>
	<52FE7313.3060404@oracle.com> <530209A8.1020501@oracle.com>
	<DC638183-2429-4FD2-B455-F87C0FD7EBFE@oracle.com>
	<CA+cQ+tQ2FoMhSj_B-Q=zgv3sGtJo9HojhWDsEWkx-UThc+mqEA@mail.gmail.com>
	<38EE6922-0B9C-49A6-B54D-E78BA0EFECB1@oracle.com>
	<B42C8B5D-5C7E-4FB9-A519-0B319F89018C@oracle.com>
	<CA+cQ+tQKvH9dyymvOEO6YFrCv2A1+hTo6bwMg0km8hx4G2jf4w@mail.gmail.com>
	<8232A81B-6B78-4F61-A8EC-1A3DF3938648@oracle.com>
	<DB5DF00F-CD36-41D2-8E1D-0958E70E6D1D@oracle.com>
	<70FBA4CF-CF05-4232-AFEC-202E93BFA930@oracle.com>
Message-ID: <5697E923.6000908@oracle.com>

I know it is duplication, but the CmpU creation should be under the conditions; otherwise you are creating and transforming a dead node.

+     Node* ncmp = phase->transform(new CmpUNode(cmp1, cmp2));
+     if (_test._test == BoolTest::le || _test._test == BoolTest::eq) {

The test does not cover the following conversions:

+   // Change (arraylength <= 0) or (arraylength == 0)
+   //   into (arraylength u<= 0)
+   // Also change (arraylength != 0) into (arraylength u> 0)
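
For reference, a small Java check of why these rewrites are valid for array lengths. This is only a sketch and not part of the webrev; the class and method names are made up for illustration. It relies on the fact that a Java array length is never negative, and it models "u<=" and "u>" with Integer.compareUnsigned.

```java
final class UnsignedLengthCheck {
    // Signed forms, as they appear before the rewrite.
    static boolean signedLeZero(int len) { return len <= 0; }
    static boolean signedEqZero(int len) { return len == 0; }
    static boolean signedNeZero(int len) { return len != 0; }

    // Unsigned forms after the rewrite: (len u<= 0) and (len u> 0).
    static boolean unsignedLeZero(int len) { return Integer.compareUnsigned(len, 0) <= 0; }
    static boolean unsignedGtZero(int len) { return Integer.compareUnsigned(len, 0) > 0; }

    // True if all three rewrites preserve the result for this value.
    static boolean rewritesAgree(int len) {
        return signedLeZero(len) == unsignedLeZero(len)
            && signedEqZero(len) == unsignedLeZero(len)
            && signedNeZero(len) == unsignedGtZero(len);
    }

    public static void main(String[] args) {
        for (int len : new int[] {0, 1, 2, 16, Integer.MAX_VALUE}) {
            System.out.println(len + " -> " + rewritesAgree(len));
        }
        // A negative value is not a valid array length; the (len <= 0) rewrite does not hold for it.
        System.out.println(-1 + " -> " + rewritesAgree(-1));
    }
}
```

A test for these conversions would presumably need to exercise a zero-length and a non-zero-length array on each of the three comparison shapes.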

Thanks,
Vladimir

On 1/7/16 1:29 AM, Roland Westrelin wrote:
> Can I get a review for this?
>
> Roland.
>
>> On Oct 5, 2015, at 12:51 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>
>> Here is a new webrev:
>>
>> http://cr.openjdk.java.net/~roland/8003585/webrev.01/
>>
>> Roland.
>>
>>> On Oct 2, 2015, at 3:30 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>
>>> Hi Chris,
>>>
>>>> Thanks for picking it up! It mostly looks good to me. (Not a Reviewer)
>>>
>>> Thanks for looking at this again.
>>>
>>>> What I really needed with my earlier webrev was some instructions as to what test to write -- since the Java corelibs can come across this optimization a lot (e.g. HashMap), I didn't have a good idea of what kind of test really needs to be written.
>>>>
>>>> A couple of issues with this webrev:
>>>>
>>>> 1. In subnode.cpp, line 1346:
>>>>
>>>> 1344     } else if (_test._test == BoolTest::lt &&
>>>> 1345                cmp2->Opcode() == Op_AddI &&
>>>> 1346                cmp2->in(2)->find_int_con(1)) {
>>>> 1347       bound = cmp2->in(1);
>>>> 1348     }
>>>>
>>>> I think it should be
>>>> cmp2->in(2)->find_int_con(0) == 1
>>>> instead, because the value passed into this function is actually for a "fallback when no int constant is found". Passing the expected value (1) to it defeats the purpose.
>>>
>>> You're right. Thanks for spotting that.
>>>
>>>> jint find_int_con(jint value_if_unknown) const {
>>>>    const TypeInt* t = find_int_type();
>>>>    return (t != NULL && t->is_con()) ? t->get_con() : value_if_unknown;
>>>> }
>>>>
>>>> 2. Formatting nitpick: could you please trim the spaces before the new's on lines 1368, 1369 and 1387?
>>>
>>> Sure.
>>>
>>> I'll send an updated webrev.
>>>
>>> Roland.
>>>
>>>>
>>>> Thanks,
>>>> Kris (OpenJDK username: krismo)
>>>>
>>>> On Wed, Sep 30, 2015 at 1:34 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>> I'm picking that one up. Here is a new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~roland/8003585/webrev.00/
>>>>
>>>> The only change to c2 compared to the previous webrev is that ((x & m) u< m+1) is optimized the same way ((x & m) u<= m) is. Actually, I don't think that C2 currently produces the ((x & m) u<= m) shape. The IfNode::fold_compares() logic produces the ((x & m) u< m+1) variant. I also added a test case to check the validity of the transformations and ran usual testing on the change.
>>>>
>>>> Roland.
>>
>
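
The find_int_con() point raised in the quoted review can be illustrated with a small Java model. This is a sketch under assumptions: the Input class is a hypothetical stand-in for a C2 node input (it is not HotSpot code), and the "original" condition models the C++ behavior of treating a non-zero jint as true.

```java
final class FindIntConSketch {
    /** Hypothetical stand-in for a node input: a known int constant, or null if unknown. */
    static final class Input {
        final Integer con;

        Input(Integer con) { this.con = con; }

        // Mirrors the quoted find_int_con(): the argument is only a fallback value.
        int findIntCon(int valueIfUnknown) {
            return con != null ? con : valueIfUnknown;
        }
    }

    // Original condition: any non-zero result passes, including the fallback 1.
    static boolean originalCheck(Input in) { return in.findIntCon(1) != 0; }

    // Suggested condition: passes only when the input is the constant 1.
    static boolean suggestedCheck(Input in) { return in.findIntCon(0) == 1; }

    public static void main(String[] args) {
        Input[] inputs = { new Input(1), new Input(5), new Input(0), new Input(null) };
        for (Input in : inputs) {
            System.out.println(in.con + ": original=" + originalCheck(in)
                + " suggested=" + suggestedCheck(in));
        }
    }
}
```

With the original condition, a non-constant input and any non-zero constant are both accepted as "m+1", which is why the fallback has to differ from the expected value.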

From vladimir.kozlov at oracle.com  Thu Jan 14 18:37:00 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 14 Jan 2016 10:37:00 -0800
Subject: RFR(S): 8145331: SEGV in DirectivesStack::release(DirectiveSet*)
In-Reply-To: <5697983F.3080509@oracle.com>
References: <5697983F.3080509@oracle.com>
Message-ID: <5697EACC.1020907@oracle.com>

Good.

Thanks,
Vladimir

On 1/14/16 4:44 AM, Nils Eliasson wrote:
> Hi,
>
> Please review this patch:
>
> Description:
> In the fix for JDK-8144873 I updated only one of the two use cases of CompilerDirectives::get_for(AbstractCompiler..)
>
> Summary:
> I simplify CompilerDirectives::get_for(..) to always return the c1_store for all unsupported cases. Makes
> getMatchingDirective and getDefaultDirective simpler too. Moved refcount out of get_for(...) since it is not guaranteed
> to be used if updated here.
>
> Testing:
> All intrinsic tests and all compilercontrol tests in addition to testset hotspot.
> IntrinsicAvailableTest is updated to not check JVMCI compiler for intrinsics.
> IntrinsicDisabledTest.jtr doesn't work with JVMCI - no action taken
> NullCheckDroppingsTest.jtr doesn't work - since JVMCI doesn't support BackgroundCompilation - no action taken
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8145331
> Webrev: http://cr.openjdk.java.net/~neliasso/8145331/webrev.01/
>
> Regards,
> Nils Eliasson

From vladimir.kozlov at oracle.com  Thu Jan 14 18:43:33 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 14 Jan 2016 10:43:33 -0800
Subject: [9] RFR (S): 8140001: _allocateInstance intrinsic does not throw
	InstantiationException for abstract classes and interfaces
In-Reply-To: <5697BBA8.8030901@oracle.com>
References: <56951A3E.7070805@oracle.com> <5697B935.9020209@oracle.com>
	<5697BBA8.8030901@oracle.com>
Message-ID: <5697EC55.2010507@oracle.com>

The fix in EA (mark as escaping) is good.

Thanks,
Vladimir

On 1/14/16 7:15 AM, Aleksey Shipilev wrote:
> Looks okay to me, but I think the property name should reflect Java
> terminology, e.g. "can_be_instantiated", "not is_allocatable"?
>
> $ javac AbstractSample.java
> [ERROR] AbstractSample.java:[36,9] AbstractSample.M is abstract; cannot
> be instantiated
>
> Thanks,
> -Aleksey
>
> On 01/14/2016 06:05 PM, Vladimir Ivanov wrote:
>> Any feedback, please?
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 1/12/16 6:22 PM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8140001/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8140001
>>>
>>> EA can eliminate allocations of abstract classes or interfaces, thus
>>> changing observable behavior of a program as the test case demonstrates.
>>>
>>> The fix is to always mark such allocations as escaping.
>>>
>>> Testing: failing test, JPRT.
>>>
>>> Thanks!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>
>
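
The observable behavior at stake in 8140001 can be shown with plain Java. This is a sketch under assumptions: it uses core reflection rather than the _allocateInstance intrinsic discussed in the thread, and the class names are invented for illustration. The point is that the exception must be thrown even though the allocated object would never be used.

```java
final class AbstractInstantiationSketch {
    /** An abstract class; it cannot be instantiated. */
    static abstract class M {}

    /**
     * Attempts an allocation whose result is discarded and reports whether
     * InstantiationException was thrown.
     */
    static boolean allocationThrows(Class<?> c) throws ReflectiveOperationException {
        try {
            c.getDeclaredConstructor().newInstance(); // result intentionally unused
            return false;
        } catch (InstantiationException e) {
            return true;
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println("abstract M throws: " + allocationThrows(M.class));
        System.out.println("Object throws:     " + allocationThrows(Object.class));
    }
}
```

If an optimization removed the unused allocation of M, the first call would silently return false, which is the kind of behavior change the fix is meant to prevent.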

From christian.thalinger at oracle.com  Thu Jan 14 18:55:16 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 14 Jan 2016 08:55:16 -1000
Subject: RFR(S): 8145331: SEGV in DirectivesStack::release(DirectiveSet*)
In-Reply-To: <5697983F.3080509@oracle.com>
References: <5697983F.3080509@oracle.com>
Message-ID: <CD05703A-23F7-4F91-BE8C-355D08887B62@oracle.com>

It would be nice if we somehow could detect if we are using C2 or not.  I mean this is sufficient:

+             // Dont bother check JVMCI compiler - returns false on all intrinsics.
+             if (!Boolean.valueOf(getVMOption("UseJVMCICompiler"))) {

but we are doing the reverse test:  we should be testing for isC2() not !isJVMCI().

Anyway, that is a different issue.  This change looks good.

> On Jan 14, 2016, at 2:44 AM, Nils Eliasson <nils.eliasson at oracle.com> wrote:
> 
> Hi,
> 
> Please review this patch:
> 
> Description:
> In the fix for JDK-8144873 I updated only one of the two use cases of CompilerDirectives::get_for(AbstractCompiler..)
> 
> Summary:
> I simplify CompilerDirectives::get_for(..) to always return the c1_store for all unsupported cases. Makes getMatchingDirective and getDefaultDirective simpler too. Moved refcount out of get_for(...) since it is not guaranteed to be used if updated here.
> 
> Testing:
> All intrinsic tests and all compilercontrol tests in addition to testset hotspot.
> IntrinsicAvailableTest is updated to not check JVMCI compiler for intrinsics.
> IntrinsicDisabledTest.jtr doesn't work with JVMCI - no action taken
> NullCheckDroppingsTest.jtr doesn't work - since JVMCI doesn't support BackgroundCompilation - no action taken
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8145331
> Webrev: http://cr.openjdk.java.net/~neliasso/8145331/webrev.01/
> 
> Regards,
> Nils Eliasson


From vladimir.kozlov at oracle.com  Thu Jan 14 18:59:11 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 14 Jan 2016 10:59:11 -0800
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained
	ConvI2L with narrowed type
In-Reply-To: <5697C624.7040201@oracle.com>
References: <5697C624.7040201@oracle.com>
Message-ID: <5697EFFF.90305@oracle.com>

You will have to update the code for the 8146999 changes once Roland pushes them.

The only thing I don't like about the changes is the use of #ifdef _LP64 for part of them.
I know where it is coming from (ConvI2L for loop indexing) but, as you said, ConvI2L could be generated in other cases
too. Should the test cast->has_range_check() return 'false' in 32-bit?

Thanks,
Vladimir

On 1/14/16 8:00 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch.
>
> https://bugs.openjdk.java.net/browse/JDK-6675699
> http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/
>
> *Problem*
> The problem is that ConvI2L nodes with a narrow type (used to convert integer array indices to long values) are not dependent on the corresponding range check that proves that the input value is always in the (integer-)range. As a result, the ConvI2L node may flow above the range check during loop optimizations and end up with an input that is not in its type range. The node is then replaced by TOP causing the data path to be eliminated. However, because there is no control dependency on the corresponding range check, the control path from the peeled iteration that uses the result of the ConvI2L may not be eliminated. We crash because we are potentially using a value that is not available.
>
> For example, TestLoopPeeling::testArrayAccess() triggers loop peeling because the loop contains an invariant check. The array store in line 66 is moved out of the loop and is reachable from the peeled and old iterations of the loop. However, the array index computation consisting of a LShiftL(ConvI2L(Phi)) remains in each loop because it has loop variant usages and is not dependent on the range check that was moved out of the loop. The peeled iteration of the loop uses storeIndex == -1 causing the ConvI2L to be replaced by TOP because -1 is not in its [0, MAX_INT] range. The TOP is propagated downwards and ends up as one of the inputs to the Phi that merges the array index from the peeled and old loop exits. The Phi is replaced by its only remaining input and the store ends up using the index from the old iteration although it's still reachable from the peeled iteration. We crash because we potentially use the index value from the old iteration while coming from the peeled iteration (of course, the range check would catch this at runtime).
>
> This problem may show up with array accesses but also with other code for which we emit a ConvI2L node with a narrow type. For example, array allocation uses a ConvI2L to convert the integer array size to a long value (see TestLoopPeeling::testArrayAllocation). We solved several different instances of this problem in the past with "workaround-fixes" that just disabled loop optimizations in special cases (see below). Such a workaround fix is not feasible to fix all potential occurrences of this problem. TestLoopPeeling.java crashes JDK 7, 8 and 9.
>
> *Solution*
> To make the ConvI2L dependent on a range check, I added code to emit a narrow CastII node with a control dependency on the range check that is then used as input to the ConvI2L. Like this, we explicitly express the dependency and prevent loop optimizations from moving the ConvI2L above the range check.
>
> To make sure that the impact is as small as possible, the range check dependent CastII nodes are removed right after loop optimizations. Further, all optimizations that depend on the old shape of array address computations are adapted to be aware of the CastII node.
>
> With the fix, we could now remove the following old "workaround-fixes":
> https://bugs.openjdk.java.net/browse/JDK-4781451
> https://bugs.openjdk.java.net/browse/JDK-4799512
> https://bugs.openjdk.java.net/browse/JDK-6659207
> https://bugs.openjdk.java.net/browse/JDK-6663854
> For reference, the individual patches can be found here:
> http://cr.openjdk.java.net/~thartmann/6675699/backouts/
>
> However, performance evaluation showed that backing out the old fixes causes significant regressions. It seems that aggressive splitting of ConvI2L nodes through phis leads to less optimal code due to more register spilling. I suspect that additional changes to the loop optimizations are necessary and would therefore like to leave the workaround fixes in for now. I filed JDK-8145313 to remove them later. Like this, we also reduce the impact/risk when backporting this fix to JDK 8 and potentially JDK 7.
>
> Roland pointed out that the changes in ConvI2LNode::Ideal() could potentially be merged into the CastIINode::Ideal() optimization introduced by his fix for JDK-8145322. After some investigation it turned out that the CastII optimization does not only affect memory addressing but also other CastII(AddI(..)) graph shapes. Making it more generic has a broader impact and therefore needs more investigation. I filed JDK-8147394 for this.
>
> ConvI2L nodes with a narrow type are also emitted by intrinsics:
> - GraphKit::array_element_address()
> - PhaseMacroExpand::array_element_address()
> - ArrayCopyNode::prepare_array_copy()
> I was not able to reproduce the problem with intrinsics. It's also not easily possible to make the CastII node range check dependent here because the range check is not always available from within the intrinsic.
>
> *Testing*
> I did extensive testing to make sure the fix does not introduce correctness or performance issues.
> - Different RBT test suites [1] with and without -Xcomp.
> - Full run of multiple CTW suites.
> - Verified changes in "PhaseIdealLoop::match_fill_loop" (loopTransform.cpp) by manually checking the output of [2] with -XX:+TraceOptimizeFill.
> - Verified changes in "IfNode::improve_address_types" (ifnode.cpp) by manually checking the output of [3] with -XX:+PrintOptoAssembly to make sure all range checks are folded.
> - Verified changes in superword.cpp by comparing output with -XX:+TraceSuperWord.
> - Performance runs (Footprint, JMH-Javac, SPECjbb2005, SPECjvm2008, Startup, Volano) on x86 and SPARC showed no regression
>
> Thanks,
> Tobias
>
> [1] RBT test suites:
> - hotspot/test/:hotspot_all
> - noncolo.testlist
> - vm.compiler.testlist
> - vm.regression.testlist
> - nsk.regression.testlist
> - nsk.split_verifier.testlist
> - nsk.stress.testlist
> - nsk.stress.jck.testlist
> - jdk/test/:jdk_jfr
> - jdk/test/:svc_tools
> - jdk/test/:jdk_instrument
> - jdk/test/:jdk_lang
> - jdk/test/:jdk_svc
> - nashorn/test/:tier1
> - nashorn/test/:tier2
> - nashorn/test/:tier3
> Only without -Xcomp:
> - Kitchensink
> - runThese
> - Weblogic12medrec
> [2] test/compiler/intrinsics/6982370/Test6982370.java
> [3] test/compiler/rangechecks/TestExplicitRangeChecks.java
>
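
The loop shape described in the quoted Problem section might look roughly like the following Java method. This is a hypothetical sketch and not the actual TestLoopPeeling source; the class name, method name, array size and constants are assumptions, and I have not verified that this exact method triggers loop peeling or the crash. It only shows the ingredients named in the description: a loop-invariant check, a loop-variant use of the index, and a store whose index is -1 in the first iteration.

```java
final class LoopPeelingShapeSketch {
    static final int[] array = new int[100];

    // Hypothetical method with the described shape.
    static int arrayAccess(int index, int inc) {
        int storeIndex = -1; // value seen by the first iteration of the loop
        for (; index < 10; index += inc) {
            if (inc == 42) {
                return -1; // loop-invariant check
            }
            if (storeIndex > 0 && array[storeIndex] == 42) {
                return -2; // loop-variant use of the array index
            }
            if (index == 5) {
                // Array store; its range check rejects storeIndex == -1 at runtime.
                array[storeIndex] = 1;
                return storeIndex;
            }
            storeIndex++;
        }
        return storeIndex;
    }

    public static void main(String[] args) {
        System.out.println(arrayAccess(0, 1)); // store happens with storeIndex == 4
        try {
            arrayAccess(5, 1); // first iteration stores with storeIndex == -1
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("range check caught index -1");
        }
    }
}
```

At runtime the range check always protects the store, as the description notes; the reported problem is about what the compiler does with the index computation once it no longer depends on that check.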

From nils.eliasson at oracle.com  Thu Jan 14 19:21:10 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 14 Jan 2016 20:21:10 +0100
Subject: RFR(S): 8145331: SEGV in DirectivesStack::release(DirectiveSet*)
In-Reply-To: <CD05703A-23F7-4F91-BE8C-355D08887B62@oracle.com>
References: <5697983F.3080509@oracle.com>
	<CD05703A-23F7-4F91-BE8C-355D08887B62@oracle.com>
Message-ID: <5697F526.5080804@oracle.com>


On 2016-01-14 19:55, Christian Thalinger wrote:
> It would be nice if we somehow could detect if we are using C2 or not.  I mean this is sufficient:
>
> +             // Dont bother check JVMCI compiler - returns false on all intrinsics.
> +             if (!Boolean.valueOf(getVMOption("UseJVMCICompiler"))) {
>
> but we are doing the reverse test:  we should be testing for isC2() not !isJVMCI().
Yes, we should really get away from using the messy 
complevel-to-compiler-translation in these tests and iterate over the 
available compilers instead. That would allow for having different 
compilers behind the JVMCI interface too.

Thanks for having a look,
Nils

>
> Anyway, that is a different issue.  This change looks good.
>
>> On Jan 14, 2016, at 2:44 AM, Nils Eliasson <nils.eliasson at oracle.com> wrote:
>>
>> Hi,
>>
>> Please review this patch:
>>
>> Description:
>> In the fix for JDK-8144873 I updated only one of the two use cases of CompilerDirectives::get_for(AbstractCompiler..)
>>
>> Summary:
>> I simplify CompilerDirectives::get_for(..) to always return the c1_store for all unsupported cases. Makes getMatchingDirective and getDefaultDirective simpler too. Moved refcount out of get_for(...) since it is not guaranteed to be used if updated here.
>>
>> Testing:
>> All intrinsic tests and all compilercontrol tests in addition to testset hotspot.
>> IntrinsicAvailableTest is updated to not check JVMCI compiler for intrinsics.
>> IntrinsicDisabledTest.jtr doesn't work with JVMCI - no action taken
>> NullCheckDroppingsTest.jtr doesn't work - since JVMCI doesn't support BackgroundCompilation - no action taken
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8145331
>> Webrev: http://cr.openjdk.java.net/~neliasso/8145331/webrev.01/
>>
>> Regards,
>> Nils Eliasson


From nils.eliasson at oracle.com  Thu Jan 14 19:21:33 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 14 Jan 2016 20:21:33 +0100
Subject: RFR(S): 8145331: SEGV in DirectivesStack::release(DirectiveSet*)
In-Reply-To: <5697EACC.1020907@oracle.com>
References: <5697983F.3080509@oracle.com> <5697EACC.1020907@oracle.com>
Message-ID: <5697F53D.8030205@oracle.com>

Thank you Vladimir!
//Nils

On 2016-01-14 19:37, Vladimir Kozlov wrote:
> Good.
>
> Thanks,
> Vladimir
>
> On 1/14/16 4:44 AM, Nils Eliasson wrote:
>> Hi,
>>
>> Please review this patch:
>>
>> Description:
>> In the fix for JDK-8144873 I updated only one of the two use cases of 
>> CompilerDirectives::get_for(AbstractCompiler..)
>>
>> Summary:
>> I simplify CompilerDirectives::get_for(..) to always return the 
>> c1_store for all unsupported cases. Makes
>> getMatchingDirective and getDefaultDirective simpler too. Moved 
>> refcount out of get_for(...) since it is not guaranteed
>> to be used if updated here.
>>
>> Testing:
>> All intrinsic tests and all compilercontrol tests in addition to 
>> testset hotspot.
>> IntrinsicAvailableTest is updated to not check JVMCI compiler for 
>> intrinsics.
>> IntrinsicDisabledTest.jtr doesn't work with JVMCI - no action taken
>> NullCheckDroppingsTest.jtr doesn't work - since JVMCI doesn't support 
>> BackgroundCompilation - no action taken
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8145331
>> Webrev: http://cr.openjdk.java.net/~neliasso/8145331/webrev.01/
>>
>> Regards,
>> Nils Eliasson


From christian.thalinger at oracle.com  Thu Jan 14 21:50:15 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 14 Jan 2016 11:50:15 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <E6B59722-3602-4934-A0C7-BFAD94CB6699@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
	<41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
	<B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>
	<7C1CBFFE-9A7C-4195-A8EA-BD7B94092E4F@oracle.com>
	<E6B59722-3602-4934-A0C7-BFAD94CB6699@oracle.com>
Message-ID: <C7F23075-7E5B-4DDD-A443-7540561C449C@oracle.com>


> On Jan 14, 2016, at 2:44 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 14 Jan 2016, at 06:58, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 12, 2016, at 12:39 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 12, 2016, at 12:14 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>> 
>>>>>>> If we're going with an enum, you could put accessors directly in the enum:
>>>>>>> 
>>>>>>> private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>>>>>> 
>>>>>>> private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>>>>>> 
>>>>>>> You could then type the value of the options and check the right accessor is used:
>>>>>>> 
>>>>>>> public enum Option {
>>>>>>>     ImplicitStableValues(boolean.class),
>>>>>>>     InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>>>>>     PrintConfig(boolean.class),
>>>>>>>     PrintFlags(boolean.class),
>>>>>>>     ShowFlags(boolean.class),
>>>>>>>     TraceMethodDataFilter(String.class),
>>>>>>>     TrustFinalDefaultFields(boolean.class);
>>>>>>> 
>>>>>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>>>>>> 
>>>>>> Excellent idea!  I was also thinking about adding the default value to the enum.
>>>>> 
>>>>> Can you do that without having to box the default value?
>>>> 
>>>> No, we have to box but we can initialize all flags in the constructor:
>>>> 
>>>> http://cr.openjdk.java.net/~twisti/8146820/webrev.02/
>> 
>> Do we agree on the change?
> 
> I would prefer it if the value was lazy initialized (for non-AOT runtimes):

It's not different in AOT-land because these cannot be constants.

> 
> Also, you can remove all the static fields that just cache a (possibly unboxed) option value and use the option directly. For example:
> 
> diff -r 1034ff44c5d0 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java
> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Tue Jan 12 15:04:27 2016 +0100
> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Thu Jan 14 13:40:28 2016 +0100
> @@ -29,6 +29,7 @@
> import java.lang.reflect.Field;
> 
> import jdk.vm.ci.common.JVMCIError;
> +import jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.Option;
> import jdk.vm.ci.meta.JavaType;
> import jdk.vm.ci.meta.LocationIdentity;
> import jdk.vm.ci.meta.MetaAccessProvider;
> @@ -41,11 +42,6 @@
>  */
> class HotSpotResolvedJavaFieldImpl implements HotSpotResolvedJavaField, HotSpotProxified {
> 
> -    /**
> -     * Mark well-known stable fields as such.
> -     */
> -    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
> -
>     private final HotSpotResolvedObjectTypeImpl holder;
>     private final String name;
>     private JavaType type;
> @@ -198,7 +194,7 @@
>             return true;
>         }
>         assert getAnnotation(Stable.class) == null;
> -        if (ImplicitStableValues && isImplicitStableField()) {
> +        if (Option.ImplicitStableValues.getBoolean() && isImplicitStableField()) {
>             return true;
>         }
>         return false;
> 
> None of the current options are used in tight loops where the cost of the unboxing (if any) would matter.

Right.

> 
> Lastly, since you've added PrintFlags and ShowFlags, why not add a help message to each option. For example:
> 
>        ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),

We should.

http://cr.openjdk.java.net/~twisti/8146820/webrev.03/

$ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.ShowFlags=true InitGraal
[List of JVMCI options]
  boolean ImplicitStableValues                     := true           Mark well-known stable fields as such.
  boolean InitTimer                                := false          Specifies if initialization timing is enabled.
  boolean PrintConfig                              := false          Prints all HotSpotVMConfig fields.
  boolean PrintFlags                               := false          Prints all JVMCI flags and exits.
  boolean ShowFlags                                 = true           Prints all JVMCI flags and continues.
   String TraceMethodDataFilter                    := null           
  boolean TrustFinalDefaultFields                  := true           Determines whether to treat final fields with default values as constant.
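
To summarize the ideas from this thread (typed options, default values, help text, lazy initialization), here is a minimal Java sketch. It is not the code in webrev.03; the class layout, the describe() method and the exception types are assumptions made for illustration. Only the option names, the "jvmci." property prefix and the help strings are taken from the messages above.

```java
final class JvmciOptionSketch {
    enum Option {
        ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),
        TraceMethodDataFilter(String.class, null, ""),
        TrustFinalDefaultFields(boolean.class, true,
            "Determines whether to treat final fields with default values as constant.");

        private final Class<?> type;
        private final Object defaultValue;
        private final String help;
        private Object value;        // boxed value, read lazily
        private boolean initialized;

        Option(Class<?> type, Object defaultValue, String help) {
            this.type = type;
            this.defaultValue = defaultValue;
            this.help = help;
        }

        // Reads -Djvmci.<name>=... on first use and caches the result.
        private synchronized Object getValue() {
            if (!initialized) {
                String s = System.getProperty("jvmci." + name());
                if (s == null) {
                    value = defaultValue;
                } else if (type == boolean.class) {
                    value = Boolean.parseBoolean(s);
                } else {
                    value = s;
                }
                initialized = true;
            }
            return value;
        }

        boolean getBoolean() {
            if (type != boolean.class) {
                throw new IllegalStateException(name() + " is not a boolean option");
            }
            return (Boolean) getValue();
        }

        String getString() {
            if (type != String.class) {
                throw new IllegalStateException(name() + " is not a String option");
            }
            return (String) getValue();
        }

        // One line per option, loosely modeled on the ShowFlags listing.
        String describe() {
            return String.format("%7s %-28s = %-8s %s",
                type.getSimpleName(), name(), getValue(), help);
        }
    }

    public static void main(String[] args) {
        System.out.println(Option.ImplicitStableValues.describe());
        System.out.println(Option.TraceMethodDataFilter.describe());
    }
}
```

A call site would then read Option.ImplicitStableValues.getBoolean() directly instead of caching the value in a static field, at the cost of an unboxing on each call.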


From zoltan.majo at oracle.com  Fri Jan 15 08:43:00 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Fri, 15 Jan 2016 09:43:00 +0100
Subject: [9] RFR (XS): 8147441: unchecked pending exceptions in the WhiteBox
	API's implementation
Message-ID: <5698B114.4020106@oracle.com>

Hi,


please review the patch for 8147441.

https://bugs.openjdk.java.net/browse/JDK-8147441

Problem: The method codeBlob2objectArray is used by the implementation 
of the WB API to fill in an object array with information about a code 
blob. Although the codeBlob2objectArray method can cause various JNI 
exceptions, there are two code locations where the VM does not check for 
exceptions after codeBlob2objectArray returns.

Solution: Add exception check to the above mentioned code locations.

Webrev:
http://cr.openjdk.java.net/~zmajo/8147441/webrev.00/

Testing:
- JPRT;
- all hotspot tests executed locally; all tests that pass with the 
default version pass with the fixed version as well.

Thank you and best regards,


Zoltan


From tobias.hartmann at oracle.com  Fri Jan 15 11:07:45 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 12:07:45 +0100
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
Message-ID: <5698D301.4070309@oracle.com>

Hi,

please review the following patch:

https://bugs.openjdk.java.net/browse/JDK-8147444
http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/

The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.

Thanks,
Tobias

From zoltan.majo at oracle.com  Fri Jan 15 11:14:58 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Fri, 15 Jan 2016 12:14:58 +0100
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
In-Reply-To: <5698D301.4070309@oracle.com>
References: <5698D301.4070309@oracle.com>
Message-ID: <5698D4B2.6030300@oracle.com>

Hi Tobias,


this looks good to me!

Thank you and best regards,


Zoltan

On 01/15/2016 12:07 PM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
>
> https://bugs.openjdk.java.net/browse/JDK-8147444
> http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/
>
> The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.
>
> Thanks,
> Tobias


From tobias.hartmann at oracle.com  Fri Jan 15 11:15:54 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 12:15:54 +0100
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
In-Reply-To: <5698D4B2.6030300@oracle.com>
References: <5698D301.4070309@oracle.com> <5698D4B2.6030300@oracle.com>
Message-ID: <5698D4EA.7000909@oracle.com>

Thanks, Zoltan!

Best,
Tobias

On 15.01.2016 12:14, Zoltán Majó wrote:
> Hi Tobias,
> 
> 
> this looks good to me!
> 
> Thank you and best regards,
> 
> 
> Zoltan
> 
> On 01/15/2016 12:07 PM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8147444
>> http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/
>>
>> The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.
>>
>> Thanks,
>> Tobias
> 

From vladimir.x.ivanov at oracle.com  Fri Jan 15 12:40:02 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 15 Jan 2016 15:40:02 +0300
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
In-Reply-To: <5698D301.4070309@oracle.com>
References: <5698D301.4070309@oracle.com>
Message-ID: <5698E8A2.7020205@oracle.com>

Looks good. Thanks for fixing it!

BTW can ClassFileInstaller be improved to check the input and report the 
problem in a meaningful way? NPE is useless when diagnosing such problems.

Best regards,
Vladimir Ivanov

On 1/15/16 2:07 PM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
>
> https://bugs.openjdk.java.net/browse/JDK-8147444
> http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/
>
> The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.
>
> Thanks,
> Tobias
>

From tobias.hartmann at oracle.com  Fri Jan 15 13:07:57 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 14:07:57 +0100
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
In-Reply-To: <5698E8A2.7020205@oracle.com>
References: <5698D301.4070309@oracle.com> <5698E8A2.7020205@oracle.com>
Message-ID: <5698EF2D.1070306@oracle.com>

Thanks, Vladimir.

On 15.01.2016 13:40, Vladimir Ivanov wrote:
> Looks good. Thanks for fixing it!
> 
> BTW can ClassFileInstaller be improved to check the input and report the problem in a meaningful way? NPE is useless when diagnosing such problems.

Sure, I changed the implementation to throw an exception:
 Execution failed: `main' threw exception: java.io.FileNotFoundException: jdk/test/lib/Asserts.class

http://cr.openjdk.java.net/~thartmann/8147444/webrev.01/
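
For illustration, a minimal sketch of the kind of input check discussed above, assuming the installer looks the class file up as a class-loader resource. The class name ClassFileInstallerSketch and the helpers resourcePath and install are hypothetical; this is not the code from the webrev.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for the installer; not the code from the webrev.
class ClassFileInstallerSketch {

    // Maps a binary class name to its class-file resource path.
    static String resourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Copies the class file below destDir. A missing class is reported by
    // name instead of failing later with a NullPointerException.
    static Path install(String className, Path destDir) throws IOException {
        String path = resourcePath(className);
        ClassLoader cl = ClassFileInstallerSketch.class.getClassLoader();
        try (InputStream is = cl.getResourceAsStream(path)) {
            if (is == null) {
                // getResourceAsStream returns null if the resource is absent.
                throw new FileNotFoundException(path);
            }
            Path target = destDir.resolve(path);
            Files.createDirectories(target.getParent());
            Files.copy(is, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        }
    }

    public static void main(String[] args) {
        try {
            install("no.such.pkg.Missing", Paths.get("."));
        } catch (IOException e) {
            System.out.println(e);
        }
    }
}
```

The point of the check is only that the failure names the missing resource, which is what makes the jtreg output above readable.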

Best,
Tobias

> 
> Best regards,
> Vladimir Ivanov
> 
> On 1/15/16 2:07 PM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8147444
>> http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/
>>
>> The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.
>>
>> Thanks,
>> Tobias
>>

From vladimir.x.ivanov at oracle.com  Fri Jan 15 13:19:23 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 15 Jan 2016 16:19:23 +0300
Subject: [9] RFR (S): 8140001: _allocateInstance intrinsic does not throw
	InstantiationException for abstract classes and interfaces
In-Reply-To: <5697EC55.2010507@oracle.com>
References: <56951A3E.7070805@oracle.com> <5697B935.9020209@oracle.com>
	<5697BBA8.8030901@oracle.com> <5697EC55.2010507@oracle.com>
Message-ID: <5698F1DB.8070103@oracle.com>

Vladimir, Aleksey, thanks for the review.

I don't have a strong opinion about naming.
can_be_instantiated looks fine. Will do renaming before the push.

Best regards,
Vladimir Ivanov

On 1/14/16 9:43 PM, Vladimir Kozlov wrote:
> The fix in EA (mark as escaping) is good.
>
> Thanks,
> Vladimir
>
> On 1/14/16 7:15 AM, Aleksey Shipilev wrote:
>> Looks okay to me, but I think the property name should reflect Java
>> terminology, e.g. "can_be_instantiated", "not is_allocatable"?
>>
>> $ javac AbstractSample.java
>> [ERROR] AbstractSample.java:[36,9] AbstractSample.M is abstract; cannot
>> be instantiated
>>
>> Thanks,
>> -Aleksey
>>
>> On 01/14/2016 06:05 PM, Vladimir Ivanov wrote:
>>> Any feedback, please?
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 1/12/16 6:22 PM, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8140001/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8140001
>>>>
>>>> EA can eliminate allocations of abstract classes or interfaces, thus
>>>> changing observable behavior of a program as the test case
>>>> demonstrates.
>>>>
>>>> The fix is to always mark such allocations as escaping.
>>>>
>>>> Testing: failing test, JPRT.
>>>>
>>>> Thanks!
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>
>>
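
For illustration, a small Java-level sketch of the behavior under discussion: instantiating an abstract class must fail with InstantiationException, so an allocation of such a class cannot simply be optimized away. The predicate canBeInstantiated is a hypothetical Java analogue of the property name suggested above, not the HotSpot code, and plain reflection is used here instead of the _allocateInstance intrinsic.

```java
import java.lang.reflect.Modifier;

// Hypothetical illustration; names are not taken from the webrev.
class InstantiationSketch {
    static abstract class M { }
    static class C extends M { }

    // Java-level analogue of a "can_be_instantiated" property.
    static boolean canBeInstantiated(Class<?> c) {
        return !c.isInterface() && !c.isArray() && !c.isPrimitive()
                && !Modifier.isAbstract(c.getModifiers());
    }

    // Returns true if reflective instantiation raises InstantiationException.
    static boolean throwsInstantiationException(Class<?> c)
            throws ReflectiveOperationException {
        try {
            c.getDeclaredConstructor().newInstance();
            return false;
        } catch (InstantiationException e) {
            return true;
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(canBeInstantiated(M.class));
        System.out.println(throwsInstantiationException(M.class));
        System.out.println(throwsInstantiationException(C.class));
    }
}
```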

From vladimir.x.ivanov at oracle.com  Fri Jan 15 13:20:30 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 15 Jan 2016 16:20:30 +0300
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
In-Reply-To: <5698EF2D.1070306@oracle.com>
References: <5698D301.4070309@oracle.com> <5698E8A2.7020205@oracle.com>
	<5698EF2D.1070306@oracle.com>
Message-ID: <5698F21E.2090403@oracle.com>

Reviewed!

Best regards,
Vladimir Ivanov

On 1/15/16 4:07 PM, Tobias Hartmann wrote:
> Thanks, Vladimir.
>
> On 15.01.2016 13:40, Vladimir Ivanov wrote:
>> Looks good. Thanks for fixing it!
>>
>> BTW can ClassFileInstaller be improved to check the input and report the problem in a meaningful way? NPE is useless when diagnosing such problems.
>
> Sure, I changed the implementation to throw an exception:
>   Execution failed: `main' threw exception: java.io.FileNotFoundException: jdk/test/lib/Asserts.class
>
> http://cr.openjdk.java.net/~thartmann/8147444/webrev.01/
>
> Best,
> Tobias
>
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 1/15/16 2:07 PM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8147444
>>> http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/
>>>
>>> The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.
>>>
>>> Thanks,
>>> Tobias
>>>

From tobias.hartmann at oracle.com  Fri Jan 15 13:21:12 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 14:21:12 +0100
Subject: [9] RFR(XS): 8147444:
	compiler/jsr292/NonInlinedCall/RedefineTest.java fails with
	NullPointerException in ClassFileInstaller
In-Reply-To: <5698F21E.2090403@oracle.com>
References: <5698D301.4070309@oracle.com> <5698E8A2.7020205@oracle.com>
	<5698EF2D.1070306@oracle.com> <5698F21E.2090403@oracle.com>
Message-ID: <5698F248.9050600@oracle.com>

Thanks, Vladimir.

Best,
Tobias

On 15.01.2016 14:20, Vladimir Ivanov wrote:
> Reviewed!
> 
> Best regards,
> Vladimir Ivanov
> 
> On 1/15/16 4:07 PM, Tobias Hartmann wrote:
>> Thanks, Vladimir.
>>
>> On 15.01.2016 13:40, Vladimir Ivanov wrote:
>>> Looks good. Thanks for fixing it!
>>>
>>> BTW can ClassFileInstaller be improved to check the input and report the problem in a meaningful way? NPE is useless when diagnosing such problems.
>>
>> Sure, I changed the implementation to throw an exception:
>>   Execution failed: `main' threw exception: java.io.FileNotFoundException: jdk/test/lib/Asserts.class
>>
>> http://cr.openjdk.java.net/~thartmann/8147444/webrev.01/
>>
>> Best,
>> Tobias
>>
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 1/15/16 2:07 PM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please review the following patch:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8147444
>>>> http://cr.openjdk.java.net/~thartmann/8147444/webrev.00/
>>>>
>>>> The test compiler/jsr292/NonInlinedCall/RedefineTest.java fails in the ClassFileInstaller while trying to install jdk.test.lib.Asserts because the class is not imported and therefore not compiled. Because asserts are not used in this test, I removed the directive.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>

From roland.westrelin at oracle.com  Fri Jan 15 14:01:49 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 15 Jan 2016 15:01:49 +0100
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained
	ConvI2L with narrowed type
In-Reply-To: <5697C624.7040201@oracle.com>
References: <5697C624.7040201@oracle.com>
Message-ID: <EE5A0332-12A8-413A-84BC-039B718BF202@oracle.com>

> http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/

I agree with Vladimir on the #ifdef _LP64 but otherwise it looks good to me.

Roland.

From roland.westrelin at oracle.com  Fri Jan 15 14:06:21 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 15 Jan 2016 15:06:21 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <56963C7A.8040203@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com> <569506CA.8040001@oracle.com>
	<569552EE.8050809@oracle.com> <56963C7A.8040203@oracle.com>
Message-ID: <0BEFA2BA-5115-4EE6-A9B4-CFFB8B6485DF@oracle.com>


> I changed the implementation to only capture the byte[] and char[] memory:
> http://cr.openjdk.java.net/~thartmann/8144212/webrev.03/

That looks good to me.

Roland.

From tobias.hartmann at oracle.com  Fri Jan 15 14:13:45 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 15:13:45 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <0BEFA2BA-5115-4EE6-A9B4-CFFB8B6485DF@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com> <569506CA.8040001@oracle.com>
	<569552EE.8050809@oracle.com> <56963C7A.8040203@oracle.com>
	<0BEFA2BA-5115-4EE6-A9B4-CFFB8B6485DF@oracle.com>
Message-ID: <5698FE99.6080601@oracle.com>

Thanks, Roland!

Best,
Tobias

On 15.01.2016 15:06, Roland Westrelin wrote:
> 
>> I changed the implementation to only capture the byte[] and char[] memory:
>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.03/
> 
> That looks good to me.
> 
> Roland.
> 

From tobias.hartmann at oracle.com  Fri Jan 15 14:28:47 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 15:28:47 +0100
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained
	ConvI2L with narrowed type
In-Reply-To: <5697EFFF.90305@oracle.com>
References: <5697C624.7040201@oracle.com> <5697EFFF.90305@oracle.com>
Message-ID: <5699021F.90500@oracle.com>

Thanks, Vladimir.

On 14.01.2016 19:59, Vladimir Kozlov wrote:
> You have to update the code for the 8146999 changes when Roland pushes it.

Yes, I'll do so but Roland mentioned that he still has problems with his 8146999 fix.

> The only thing I don't like about changes is using #ifdef _LP64 for part of changes.
> I know where it is coming from (ConvI2L for loop indexing) but as you said ConvI2L could be generated in other cases too. Should the test cast->has_range_check() return 'false' in 32-bit?

I added the _LP64 ifdefs because we only emit a narrowed ConvI2L on 64 bit. But I agree - it's cleaner without those. As you suggested, I removed the ifdefs and changed has_range_check() to return false on 32 bit. 

Here is the new webrev:
http://cr.openjdk.java.net/~thartmann/6675699/webrev.02/

Thanks,
Tobias

> On 1/14/16 8:00 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch.
>>
>> https://bugs.openjdk.java.net/browse/JDK-6675699
>> http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/
>>
>> *Problem*
>> The problem is that ConvI2L nodes with a narrow type (used to convert integer array indices to long values) are not dependent on the corresponding range check that proves that the input value is always in the (integer-)range. As a result, the ConvI2L node may flow above the range check during loop optimizations and end up with an input that is not in its type range. The node is then replaced by TOP causing the data path to be eliminated. However, because there is no control dependency on the corresponding range check, the control path from the peeled iteration that uses the result of the ConvI2L may not be eliminated. We crash because we are potentially using a value that is not available.
>>
>> For example, TestLoopPeeling::testArrayAccess() triggers loop peeling because the loop contains an invariant check. The array store in line 66 is moved out of the loop and is reachable from the peeled and old iterations of the loop. However, the array index computation consisting of a LShiftL(ConvI2L(Phi)) remains in each loop because it has loop variant usages and is not dependent on the range check that was moved out of the loop. The peeled iteration of the loop uses storeIndex == -1 causing the ConvI2L to be replaced by TOP because -1 is not in its [0, MAX_INT] range. The TOP is propagated downwards and ends up as one of the inputs to the Phi that merges the array index from the peeled and old loop exits. The Phi is replaced by its only remaining input and the store ends up using the index from the old iteration although it's still reachable from the peeled iteration. We crash because we potentially use the index value from the old iteration while coming from the peeled iteration (of course, the range check would catch this at runtime).
>>
>> This problem may show up with array accesses but also with other code for which we emit a ConvI2L node with a narrow type. For example, array allocation uses a ConvI2L to convert the integer array size to a long value (see TestLoopPeeling::testArrayAllocation). We solved several different instances of this problem in the past with "workaround-fixes" that just disabled loop optimizations in special cases (see below). Such a workaround fix is not feasible to fix all potential occurrences of this problem. TestLoopPeeling.java crashes JDK 7, 8 and 9.
>>
>> *Solution*
>> To make the ConvI2L dependent on a range check, I added code to emit a narrow CastII node with a control dependency on the range check that is then used as input to the ConvI2L. Like this, we explicitly express the dependency and prevent loop optimizations from moving the ConvI2L above the range check.
>>
>> To make sure that the impact is as small as possible, the range check dependent CastII nodes are removed right after loop optimizations. Further, all optimizations that depend on the old shape of array address computations are adapted to be aware of the CastII node.
>>
>> With the fix, we could now remove the following old "workaround-fixes":
>> https://bugs.openjdk.java.net/browse/JDK-4781451
>> https://bugs.openjdk.java.net/browse/JDK-4799512
>> https://bugs.openjdk.java.net/browse/JDK-6659207
>> https://bugs.openjdk.java.net/browse/JDK-6663854
>> For reference, the individual patches can be found here:
>> http://cr.openjdk.java.net/~thartmann/6675699/backouts/
>>
>> However, performance evaluation showed that backing out the old fixes causes significant regressions. It seems that aggressive splitting of ConvI2L nodes through phis leads to less optimal code due to more register spilling. I suspect that additional changes to the loop optimizations are necessary and would therefore like to leave the workaround fixes in for now. I filed JDK-8145313 to remove them later. Like this, we also reduce the impact/risk when backporting this fix to JDK 8 and potentially JDK 7.
>>
>> Roland pointed out that the changes in ConvI2LNode::Ideal() could potentially be merged into the CastIINode::Ideal() optimization introduced by his fix for JDK-8145322. After some investigation it turned out that the CastII optimization does not only affect memory addressing but also other CastII(AddI(..)) graph shapes. Making it more generic has a broader impact and therefore needs more investigation. I filed JDK-8147394 for this.
>>
>> ConvI2L nodes with a narrow type are also emitted by intrinsics:
>> - GraphKit::array_element_address()
>> - PhaseMacroExpand::array_element_address()
>> - ArrayCopyNode::prepare_array_copy()
>> I was not able to reproduce the problem with intrinsics. It's also not easily possible to make the CastII node range check dependent here because the range check is not always available from within the intrinsic.
>>
>> *Testing*
>> I did extensive testing to make sure the fix does not introduce correctness or performance issues.
>> - Different RBT test suites [1] with and without -Xcomp.
>> - Full run of multiple CTW suites.
>> - Verified changes in "PhaseIdealLoop::match_fill_loop" (loopTransform.cpp) by manually checking the output of [2] with -XX:+TraceOptimizeFill.
>> - Verified changes in "IfNode::improve_address_types" (ifnode.cpp) by manually checking the output of [3] with -XX:+PrintOptoAssembly to make sure all range checks are folded.
>> - Verified changes in superword.cpp by comparing output with -XX:+TraceSuperWord.
>> - Performance runs (Footprint, JMH-Javac, SPECjbb2005, SPECjvm2008, Startup, Volano) on x86 and SPARC showed no regression
>>
>> Thanks,
>> Tobias
>>
>> [1] RBT test suites:
>> - hotspot/test/:hotspot_all
>> - noncolo.testlist
>> - vm.compiler.testlist
>> - vm.regression.testlist
>> - nsk.regression.testlist
>> - nsk.split_verifier.testlist
>> - nsk.stress.testlist
>> - nsk.stress.jck.testlist
>> - jdk/test/:jdk_jfr
>> - jdk/test/:svc_tools
>> - jdk/test/:jdk_instrument
>> - jdk/test/:jdk_lang
>> - jdk/test/:jdk_svc
>> - nashorn/test/:tier1
>> - nashorn/test/:tier2
>> - nashorn/test/:tier3
>> Only without -Xcomp:
>> - Kitchensink
>> - runThese
>> - Weblogic12medrec
>> [2] test/compiler/intrinsics/6982370/Test6982370.java
>> [3] test/compiler/rangechecks/TestExplicitRangeChecks.java
>>
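
For illustration, a sketch of a loop with the shape described in the problem statement above: a loop-invariant check, a loop-variant use of the array index, and an array store whose index is -1 on the first iteration. This is a hypothetical reconstruction written from the description, not the source of TestLoopPeeling.java; the class and method names are assumptions, and whether a given JVM build actually peels this loop is not guaranteed.

```java
// Hypothetical loop shape modeled on the description; not the actual test.
class LoopPeelingShape {
    static final int[] array = new int[100];

    static void arrayAccess(int index, int inc) {
        int storeIndex = -1;
        for (; index < 10; index += inc) {
            // Loop-invariant check: makes the loop a candidate for peeling.
            if (inc == 42) {
                return;
            }
            // Loop-variant use of the index: keeps the address computation
            // (index converted to long and scaled) inside the loop.
            if (storeIndex > 0 && array[storeIndex] == 42) {
                return;
            }
            if (index == 42) {
                // Never reached at run time because index < 10 here, but the
                // store and its range check are still compiled. On the first
                // iteration storeIndex is -1, outside the [0, MAX_INT] range
                // assumed for the converted index.
                array[storeIndex] = 1;
                return;
            }
            storeIndex++;
        }
    }

    public static void main(String[] args) {
        // Repeat the call so that the JIT compiler gets a chance to compile it.
        for (int i = 0; i < 20_000; i++) {
            arrayAccess(0, 1);
        }
        System.out.println("done");
    }
}
```

At the Java level the store never executes, so the array must stay untouched; the bug described above is about the compiler's internal graph, not about observable program results in the non-crashing case.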

From tobias.hartmann at oracle.com  Fri Jan 15 14:33:19 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Jan 2016 15:33:19 +0100
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained
	ConvI2L with narrowed type
In-Reply-To: <EE5A0332-12A8-413A-84BC-039B718BF202@oracle.com>
References: <5697C624.7040201@oracle.com>
	<EE5A0332-12A8-413A-84BC-039B718BF202@oracle.com>
Message-ID: <5699032F.2070105@oracle.com>

Thanks, Roland!

Best,
Tobias

On 15.01.2016 15:01, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/
> 
> I agree with Vladimir on the #ifdef _LP64 but otherwise it looks good to me.
> 
> Roland.
> 

From tom.rodriguez at oracle.com  Fri Jan 15 16:43:41 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 15 Jan 2016 08:43:41 -0800
Subject: RFR(S): 8147433: PrintNMethods no longer works with JVMCI
Message-ID: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>

http://cr.openjdk.java.net/~never/8147433/webrev/index.html

https://bugs.openjdk.java.net/browse/JDK-8137167 moved the PrintNMethods related code into ciEnv but since JVMCI doesn't use ciEnv PrintNMethods no longer works for it.  This moves it into CompileBroker with the other compilation related printing code.  Tested with fastdebug -XX:+PrintNMethods running specjvm2008.

tom

From vladimir.kozlov at oracle.com  Fri Jan 15 18:10:48 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 15 Jan 2016 10:10:48 -0800
Subject: [9] RFR (XS): 8147441: unchecked pending exceptions in the
	WhiteBox API's implementation
In-Reply-To: <5698B114.4020106@oracle.com>
References: <5698B114.4020106@oracle.com>
Message-ID: <56993628.1060702@oracle.com>

Seems fine.

Thanks,
Vladimir

On 1/15/16 12:43 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the patch for 8147441.
>
> https://bugs.openjdk.java.net/browse/JDK-8147441
>
> Problem: The method codeBlob2objectArray is used by the implementation
> of the WB API to fill in an object array with information about a code
> blob. Although the codeBlob2objectArray method can cause various JNI
> exceptions, there are two code locations where the VM does not check for
> exceptions after codeBlob2objectArray returns.
>
> Solution: Add exception check to the above mentioned code locations.
>
> Webrev:
> http://cr.openjdk.java.net/~zmajo/8147441/webrev.00/
>
> Testing:
> - JPRT;
> - all hotspot tests executed locally; all tests that pass with the
> default version pass with the fixed version as well.
>
> Thank you and best regards,
>
>
> Zoltan
>

From vladimir.kozlov at oracle.com  Fri Jan 15 18:14:28 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 15 Jan 2016 10:14:28 -0800
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <56963C7A.8040203@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com> <569506CA.8040001@oracle.com>
	<569552EE.8050809@oracle.com> <56963C7A.8040203@oracle.com>
Message-ID: <56993704.7000503@oracle.com>

Very good.

Thanks,
Vladimir

On 1/13/16 4:00 AM, Tobias Hartmann wrote:
> Thanks, Vladimir.
>
> On 12.01.2016 20:24, Vladimir Kozlov wrote:
>>> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
>>
>> Yes, that is right solution here.
>
> I changed the implementation to only capture the byte[] and char[] memory:
> http://cr.openjdk.java.net/~thartmann/8144212/webrev.03/
>
> The method GraphKit::capture_memory(src_type, dst_type) returns a new MergeMemNode if the src and dst types are different, merging the two.
>
> Best,
> Tobias
>
>> On 1/12/16 5:59 AM, Tobias Hartmann wrote:
>>> On 11.01.2016 21:00, Vladimir Kozlov wrote:
>>>> On 1/11/16 7:20 AM, Tobias Hartmann wrote:
>>>>> On 08.01.2016 20:41, Vladimir Kozlov wrote:
>>>>>> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>>>>>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>>>>>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>>>>>>> Hi Vladimir,
>>>>>>>>>
>>>>>>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>>>>>>> Andrew is right.
>>>>>>>>>
>>>>>>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>>>>>>
>>>>>>>> Right. It was the root of this bug, see below.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I fixed this for the inflate and compress intrinsics.
>>>>>>>>>
>>>>>>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>>>>>>>> StrInflatedCopyNode is not memory node.
>>>>>>>>>
>>>>>>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>>>>>>
>>>>>>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>>>>>>
>>>>>>> I meant with webrev.01 but you answered my question below.
>>>>>>>
>>>>>>>>>       // This class defines a projection of the memory  state of a store conditional node.
>>>>>>>>>       // These nodes return a value, but also update memory.
>>>>>>>>>
>>>>>>>>> But inflate does not return any value.
>>>>>>>>
>>>>>>>> Hmm, according to bottom type inflate produce memory:
>>>>>>>>
>>>>>>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>>>>>>
>>>>>>>> So it really does not need SCMemProjNode. Sorry about that.
>>>>>>>> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.
>>>>>>>
>>>>>>> Exactly.
>>>>>>>
>>>>>>>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>>>>>>>
>>>>>>>> set_memory(str, dst_type);
>>>>>>>
>>>>>>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>>>>>>
>>>>>>>> And you should rollback part of changes in escape.cpp and macro.cpp.
>>>>>>>
>>>>>>> Okay, I'll do that.
>>>>>>>
>>>>>>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>>>>>>
>>>>>>>> In general when src & dst arrays have different type we may need to use TypeOopPtr::BOTTOM to prevent related store & loads bypass these copy nodes.
>>>>>>>
>>>>>>> Okay, should we then use BOTTOM for both the input and output type?
>>>>>>
>>>>>> Only input. Output type corresponds to dst array type which you set correctly now.
>>>>>
>>>>> It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
>>>>>     StoreC
>>>>>     inflate_string
>>>>>     LoadC
>>>>>
>>>>> The memory graph (def->use) now looks like this:
>>>>>     LoadC -> inflate_string -> ByteMem
>>>>>                   ... StoreC-> CharMem
>>>>
>>>> I did not get this. If the StoreC node is created before inflate_string, inflate_string should point to it and be a barrier for LoadC.
>>>
>>> Note that the StoreC and inflate_string are *not* writing to the same char[] array. The test looks like this:
>>>
>>>    char c1[] = new char[1];
>>>    char c2[] = new char[1];
>>>
>>>    c2[0] = 42;
>>>    // Inflate String from byte[] to char[]
>>>    s.getChars(0, 1, c1, 0);
>>>    // Read char[] memory written before inflation
>>>    return c2[0];
>>>
>>> The result should be 42. The problem is that inflate_string does not point to StoreC because inflate_string uses a byte[] as input and in this case also writes to a different char[]. Even if we set the input to BOTTOM, inflate_string points to 7 Parm (BOTTOM) but not to the char[] memory produced by 96 StoreC:
>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>>
>>> 349 LoadUS then reads from the output char[] memory of inflate_string which does not include the result of StoreC. The test fails because the return value is != 42.
>>>
>>> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
>>>
>>>>    If StoreC followed inflate_string and LoadC followed StoreC - LoadC should point to StoreC. If LoadC does not follow StoreC then result is relaxed.
>>>
>>> Yes, these cases work fine.
>>>
>>> Thanks,
>>> Tobias
>>>
>>>>> The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
>>>>>
>>>>> Setting the input to BOTTOM, generates the following graph:
>>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>>>> The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.
>>>>>
>>>>> I adapted the fix to emit a MergeMemoryNode to capture the entire memory state as input to the intrinsic. The graph then looks like this:
>>>>>     LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
>>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png
>>>>>
>>>>> Here is the new webrev:
>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/
>>>>> Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?
>>>>>
>>>>> Best,
>>>>> Tobias
>>>>>
>>>>>>>>> Related question:
>>>>>>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>>>>>>       char[int:>=0]:exact+any *
>>>>>>>>>
>>>>>>>>> which is equal to the type of the char load.
>>>>>>>>
>>>>>>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>>>>>>
>>>>>>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>>>>>>
>>>>>>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>>>>>>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>>>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>>>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>>
>>>>>>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>>>>>>
>>>>>>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>>>>>>
>>>>>>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>>>>>>
>>>>>>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>>>>>>       char[int:1]:NotNull:exact *
>>>>>>>>>
>>>>>>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>>>>>>
>>>>>>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>>>>>>
>>>>>>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>>>>>>
>>>>>>> I will look into this again and try to understand what happens.
>>>>>>
>>>>>> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Tobias
>>>>>>>
>>>>>>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>>>>>>> array copys in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>>>>>>
>>>>>>>>>>>>        // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>>>>>>        // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>>>>>>        // compiler because the CPU does all the ordering for us.
>>>>>>>>>>>>
>>>>>>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>>>>>>
>>>>>>>>>>>> Or did I misunderstand your question?
>>>>>>>>>>>
>>>>>>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>>>>>>> read-after-write dependency, but never mind.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Andrew.
>>>>>>>>>>>

From vladimir.kozlov at oracle.com  Fri Jan 15 18:18:40 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 15 Jan 2016 10:18:40 -0800
Subject: RFR(S): 8147433: PrintNMethods no longer works with JVMCI
In-Reply-To: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>
References: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>
Message-ID: <56993800.3080301@oracle.com>

Good. Thank you for fixing this.

Vladimir

On 1/15/16 8:43 AM, Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/8147433/webrev/index.html
>
> https://bugs.openjdk.java.net/browse/JDK-8137167 moved the PrintNMethods
> related code into ciEnv but since JVMCI doesn't use ciEnv PrintNMethods
> no longer works for it.  This moves it into CompileBroker with the other
> compilation related printing code.  Tested with fastdebug
> -XX:+PrintNMethods running specjvm2008.
>
> tom

From christian.thalinger at oracle.com  Fri Jan 15 18:20:46 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 15 Jan 2016 08:20:46 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <32315674-1303-4A27-8FFD-AE40E8868F27@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
	<41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
	<B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>
	<7C1CBFFE-9A7C-4195-A8EA-BD7B94092E4F@oracle.com>
	<E6B59722-3602-4934-A0C7-BFAD94CB6699@oracle.com>
	<C7F23075-7E5B-4DDD-A443-7540561C449C@oracle.com>
	<32315674-1303-4A27-8FFD-AE40E8868F27@oracle.com>
Message-ID: <C66A67F1-18A7-4D2F-9050-B01F44593563@oracle.com>

Thanks, Doug.

> On Jan 14, 2016, at 11:55 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Looks good.
> 
>> On 14 Jan 2016, at 22:50, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 14, 2016, at 2:44 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>>> 
>>>> On 14 Jan 2016, at 06:58, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 12, 2016, at 12:39 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On Jan 12, 2016, at 12:14 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>>>> 
>>>>>>>>> If we're going with an enum, you could put accessors directly in the enum:
>>>>>>>>> 
>>>>>>>>> private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>>>>>>>> 
>>>>>>>>> private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>>>>>>>> 
>>>>>>>>> You could then type the value of the options and check the right accessor is used:
>>>>>>>>> 
>>>>>>>>> public enum Option {
>>>>>>>>>   ImplicitStableValues(boolean.class),
>>>>>>>>>   InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>>>>>>>   PrintConfig(boolean.class),
>>>>>>>>>   PrintFlags(boolean.class),
>>>>>>>>>   ShowFlags(boolean.class),
>>>>>>>>>   TraceMethodDataFilter(String.class),
>>>>>>>>>   TrustFinalDefaultFields(boolean.class);
>>>>>>>>> 
>>>>>>>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>>>>>>>> 
>>>>>>>> Excellent idea!  I was also thinking about adding the default value to the enum.
>>>>>>> 
>>>>>>> Can you do that without having to box the default value?
>>>>>> 
>>>>>> No, we have to box but we can initialize all flags in the constructor:
>>>>>> 
>>>>>> http://cr.openjdk.java.net/~twisti/8146820/webrev.02/
>>>> 
>>>> Do we agree on the change?
>>> 
>>> I would prefer it if the value was lazy initialized (for non-AOT runtimes):
>> 
>> It's not different in AOT-land because these cannot be constants.
>> 
>>> 
>>> Also, you can remove all the static fields that just cache a (possibly unboxed) option value and use the option directly. For example:
>>> 
>>> diff -r 1034ff44c5d0 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java
>>> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Tue Jan 12 15:04:27 2016 +0100
>>> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Thu Jan 14 13:40:28 2016 +0100
>>> @@ -29,6 +29,7 @@
>>> import java.lang.reflect.Field;
>>> 
>>> import jdk.vm.ci.common.JVMCIError;
>>> +import jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.Option;
>>> import jdk.vm.ci.meta.JavaType;
>>> import jdk.vm.ci.meta.LocationIdentity;
>>> import jdk.vm.ci.meta.MetaAccessProvider;
>>> @@ -41,11 +42,6 @@
>>> */
>>> class HotSpotResolvedJavaFieldImpl implements HotSpotResolvedJavaField, HotSpotProxified {
>>> 
>>> -    /**
>>> -     * Mark well-known stable fields as such.
>>> -     */
>>> -    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
>>> -
>>>   private final HotSpotResolvedObjectTypeImpl holder;
>>>   private final String name;
>>>   private JavaType type;
>>> @@ -198,7 +194,7 @@
>>>           return true;
>>>       }
>>>       assert getAnnotation(Stable.class) == null;
>>> -        if (ImplicitStableValues && isImplicitStableField()) {
>>> +        if (Option.ImplicitStableValues.getBoolean() && isImplicitStableField()) {
>>>           return true;
>>>       }
>>>       return false;
>>> 
>>> None of the current options are used in tight loops where the cost of the unboxing (if any) would matter.
>> 
>> Right.
>> 
>>> 
>>> Lastly, since you've added PrintFlags and ShowFlags, why not add a help message to each option? For example:
>>> 
>>>      ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),
>> 
>> We should.
>> 
>> http://cr.openjdk.java.net/~twisti/8146820/webrev.03/
>> 
>> $ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.ShowFlags=true InitGraal
>> [List of JVMCI options]
>> boolean ImplicitStableValues                     := true           Mark well-known stable fields as such.
>> boolean InitTimer                                := false          Specifies if initialization timing is enabled.
>> boolean PrintConfig                              := false          Prints all HotSpotVMConfig fields.
>> boolean PrintFlags                               := false          Prints all JVMCI flags and exits.
>> boolean ShowFlags                                 = true           Prints all JVMCI flags and continues.
>>  String TraceMethodDataFilter                    := null           
>> boolean TrustFinalDefaultFields                  := true           Determines whether to treat final fields with default values as constant.
> 
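
The enum design discussed in the quoted thread (a typed option with a boxed default, a help message, and a lazily read value) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the webrev: the class name OptionSketch, the field names, and the caching strategy are hypothetical, and only the option names, the help strings, and the "jvmci." property prefix are taken from the thread.

```java
// Hypothetical sketch; option names and help texts come from the thread,
// everything else (class name, fields, caching) is assumed for illustration.
final class OptionSketch {

    enum Option {
        ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),
        TraceMethodDataFilter(String.class, null, ""),
        TrustFinalDefaultFields(boolean.class, true,
                "Determines whether to treat final fields with default values as constant.");

        // Property prefix as used in -Djvmci.ShowFlags=true in the thread.
        private static final String PREFIX = "jvmci.";

        private final Class<?> type;
        private final Object defaultValue; // boxed default value
        private final String help;
        private Object value;              // cached on first access
        private boolean initialized;

        Option(Class<?> type, Object defaultValue, String help) {
            this.type = type;
            this.defaultValue = defaultValue;
            this.help = help;
        }

        // Reads the system property lazily and caches the (boxed) result.
        private synchronized Object getValue() {
            if (!initialized) {
                String s = System.getProperty(PREFIX + name());
                if (s == null) {
                    value = defaultValue;
                } else if (type == boolean.class) {
                    value = Boolean.parseBoolean(s);
                } else {
                    value = s;
                }
                initialized = true;
            }
            return value;
        }

        // Typed accessors reject use with the wrong option type.
        boolean getBoolean() {
            if (type != boolean.class) {
                throw new IllegalStateException(name() + " is not a boolean option");
            }
            return (Boolean) getValue();
        }

        String getString() {
            if (type != String.class) {
                throw new IllegalStateException(name() + " is not a String option");
            }
            return (String) getValue();
        }

        String help() {
            return help;
        }
    }

    public static void main(String[] args) {
        // Prints a listing loosely resembling the ShowFlags output above.
        for (Option o : Option.values()) {
            Object v = (o.type == boolean.class) ? (Object) o.getBoolean() : o.getString();
            System.out.printf("%-25s = %-6s %s%n", o.name(), v, o.help());
        }
    }
}
```

Typing each option lets the accessor reject a mismatched call (for example getBoolean() on a String option) at the first use rather than silently misparsing the property.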


From vladimir.kozlov at oracle.com  Fri Jan 15 18:25:05 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 15 Jan 2016 10:25:05 -0800
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained
	ConvI2L with narrowed type
In-Reply-To: <5699021F.90500@oracle.com>
References: <5697C624.7040201@oracle.com> <5697EFFF.90305@oracle.com>
	<5699021F.90500@oracle.com>
Message-ID: <56993981.8020703@oracle.com>

This looks good.

Thanks,
Vladimir

On 1/15/16 6:28 AM, Tobias Hartmann wrote:
> Thanks, Vladimir.
>
> On 14.01.2016 19:59, Vladimir Kozlov wrote:
>> You have to update the code for the 8146999 changes when Roland pushes it.
>
> Yes, I'll do so but Roland mentioned that he still has problems with his 8146999 fix.
>
>> The only thing I don't like about changes is using #ifdef _LP64 for part of changes.
>> I know where it is coming from (ConvI2L for loop indexing) but as you said ConvI2L could be generated in other cases too. Should the test cast->has_range_check() return 'false' in 32-bit?
>
> I added the _LP64 ifdefs because we only emit a narrowed ConvI2L on 64 bit. But I agree - it's cleaner without those. As you suggested, I removed the ifdefs and changed has_range_check() to return false on 32 bit.
>
> Here is the new webrev:
> http://cr.openjdk.java.net/~thartmann/6675699/webrev.02/
>
> Thanks,
> Tobias
>
>> On 1/14/16 8:00 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6675699
>>> http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/
>>>
>>> *Problem*
>>> The problem is that ConvI2L nodes with a narrow type (used to convert integer array indices to long values) are not dependent on the corresponding range check that proves that the input value is always in the (integer-)range. As a result, the ConvI2L node may flow above the range check during loop optimizations and end up with an input that is not in its type range. The node is then replaced by TOP causing the data path to be eliminated. However, because there is no control dependency on the corresponding range check, the control path from the peeled iteration that uses the result of the ConvI2L may not be eliminated. We crash because we are potentially using a value that is not available.
>>>
>>> For example, TestLoopPeeling::testArrayAccess() triggers loop peeling because the loop contains an invariant check. The array store in line 66 is moved out of the loop and reachable from the peeled and old iterations of the loop. However, the array index computation consisting of a LShiftL(ConvI2L(Phi)) remains in each loop because it has loop variant usages and is not dependent on the range check that was moved out of the loop. The peeled iteration of the loop uses storeIndex == -1 causing the ConvI2L to be replaced by TOP because -1 is not in its [0, MAX_INT] range. The TOP is propagated downwards and ends up as one of the inputs to the Phi that merges the array index from the peeled and old loop exits. The Phi is replaced by its only remaining input and the store ends up using the index from the old iteration although it's still reachable from the peeled iteration. We crash because we potentially use the index value from the old iteration while coming from the peeled iteration (of course, the range check would catch this at runtime).
>>>
>>> This problem may show up with array accesses but also with other code for which we emit a ConvI2L node with a narrow type. For example, array allocation uses a ConvI2L to convert the integer array size to a long value (see TestLoopPeeling::testArrayAllocation). We solved several different instances of this problem in the past with "workaround-fixes" that just disabled loop optimizations in special cases (see below). Such a workaround fix is not feasible to fix all potential occurrences of this problem. TestLoopPeeling.java crashes JDK 7, 8 and 9.
>>>
>>> *Solution*
>>> To make the ConvI2L dependent on a range check, I added code to emit a narrow CastII node with a control dependency on the range check that is then used as input to the ConvI2L. Like this, we explicitly express the dependency and prevent loop optimizations from moving the ConvI2L above the range check.
>>>
>>> To make sure that the impact is as small as possible, the range check dependent CastII nodes are removed right after loop optimizations. Further, all optimizations that depend on the old shape of array address computations are adapted to be aware of the CastII node.
>>>
>>> With the fix, we could now remove the following old "workaround-fixes":
>>> https://bugs.openjdk.java.net/browse/JDK-4781451
>>> https://bugs.openjdk.java.net/browse/JDK-4799512
>>> https://bugs.openjdk.java.net/browse/JDK-6659207
>>> https://bugs.openjdk.java.net/browse/JDK-6663854
>>> For reference, the individual patches can be found here:
>>> http://cr.openjdk.java.net/~thartmann/6675699/backouts/
>>>
>>> However, performance evaluation showed that backing out the old fixes causes significant regressions. It seems that aggressive splitting of ConvI2L nodes through phis leads to less optimal code due to more register spilling. I suspect that additional changes to the loop optimizations are necessary and would therefore like to leave the workaround fixes in for now. I filed JDK-8145313 to remove them later. Like this, we also reduce the impact/risk when backporting this fix to JDK 8 and potentially JDK 7.
>>>
>>> Roland pointed out that the changes in ConvI2LNode::Ideal() could potentially be merged into the CastIINode::Ideal() optimization introduced by his fix for JDK-8145322. After some investigation it turned out that the CastII optimization does not only affect memory addressing but also other CastII(AddI(..)) graph shapes. Making it more generic has a broader impact and therefore needs more investigation. I filed JDK-8147394 for this.
>>>
>>> ConvI2L nodes with a narrow type are also emitted by intrinsics:
>>> - GraphKit::array_element_address()
>>> - PhaseMacroExpand::array_element_address()
>>> - ArrayCopyNode::prepare_array_copy()
>>> I was not able to reproduce the problem with intrinsics. It's also not easily possible to make the CastII node range check dependent here because the range check is not always available from within the intrinsic.
>>>
>>> *Testing*
>>> I did extensive testing to make sure the fix does not introduce correctness or performance issues.
>>> - Different RBT test suites [1] with and without -Xcomp.
>>> - Full run of multiple CTW suites.
>>> - Verified changes in "PhaseIdealLoop::match_fill_loop" (loopTransform.cpp) by manually checking the output of [2] with -XX:+TraceOptimizeFill.
>>> - Verified changes in "IfNode::improve_address_types" (ifnode.cpp) by manually checking the output of [3] with -XX:+PrintOptoAssembly to make sure all range checks are folded.
>>> - Verified changes in superword.cpp by comparing output with -XX:+TraceSuperWord.
>>> - Performance runs (Footprint, JMH-Javac, SPECjbb2005, SPECjvm2008, Startup, Volano) on x86 and SPARC showed no regression
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] RBT test suites:
>>> - hotspot/test/:hotspot_all
>>> - noncolo.testlist
>>> - vm.compiler.testlist
>>> - vm.regression.testlist
>>> - nsk.regression.testlist
>>> - nsk.split_verifier.testlist
>>> - nsk.stress.testlist
>>> - nsk.stress.jck.testlist
>>> - jdk/test/:jdk_jfr
>>> - jdk/test/:svc_tools
>>> - jdk/test/:jdk_instrument
>>> - jdk/test/:jdk_lang
>>> - jdk/test/:jdk_svc
>>> - nashorn/test/:tier1
>>> - nashorn/test/:tier2
>>> - nashorn/test/:tier3
>>> Only without -Xcomp:
>>> - Kitchensink
>>> - runThese
>>> - Weblogic12medrec
>>> [2] test/compiler/intrinsics/6982370/Test6982370.java
>>> [3] test/compiler/rangechecks/TestExplicitRangeChecks.java
>>>
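
The loop shape described in the quoted problem statement can be sketched at the Java level as below. This is a hypothetical reconstruction from the prose only, not the actual TestLoopPeeling source: the class name, the array size, and the constant 5 are assumptions, and whether C2 peels this exact loop is not claimed. It only shows the ingredients named in the description: a loop-invariant check, a loop-variant use of the index, and a store whose index is -1 on the first iteration.

```java
// Hypothetical sketch modeled on the description of testArrayAccess();
// names and constants are assumed, not taken from the real test.
final class LoopPeelingSketch {
    static final int[] array = new int[100];

    static void arrayAccess(int index, int inc) {
        int storeIndex = -1;
        for (; index < 10; index += inc) {
            // Loop-invariant check: the kind of test that makes peeling attractive.
            if (inc == 42) {
                return;
            }
            // Loop-variant use of the array index computation.
            if (storeIndex > 0 && array[storeIndex] == 42) {
                return;
            }
            if (index == 5) {
                // On the first (peeled) iteration storeIndex is still -1,
                // so the range check must fail here at runtime.
                array[storeIndex] = 1;
                return;
            }
            storeIndex++;
        }
    }

    public static void main(String[] args) {
        arrayAccess(0, 1);
        System.out.println("array[4] = " + array[4]);
        try {
            arrayAccess(5, 1);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("range check caught index -1");
        }
    }
}
```

Starting at index 5 reaches the store on the very first iteration, which is the case where the narrowed ConvI2L input (-1) lies outside its [0, MAX_INT] type range.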

From christian.thalinger at oracle.com  Fri Jan 15 22:55:49 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 15 Jan 2016 12:55:49 -1000
Subject: RFR(S): 8147433: PrintNMethods no longer works with JVMCI
In-Reply-To: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>
References: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>
Message-ID: <A814C4D3-17AF-475F-A4BA-156E183FFEE1@oracle.com>

Looks good.

> On Jan 15, 2016, at 6:43 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~never/8147433/webrev/index.html
> 
> https://bugs.openjdk.java.net/browse/JDK-8137167 moved the PrintNMethods related code into ciEnv but since JVMCI doesn't use ciEnv PrintNMethods no longer works for it.  This moves it into CompileBroker with the other compilation related printing code.  Tested with fastdebug -XX:+PrintNMethods running specjvm2008.
> 
> tom


From christian.thalinger at oracle.com  Fri Jan 15 23:30:06 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 15 Jan 2016 13:30:06 -1000
Subject: RFR(S): 8147433: PrintNMethods no longer works with JVMCI
In-Reply-To: <A814C4D3-17AF-475F-A4BA-156E183FFEE1@oracle.com>
References: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>
	<A814C4D3-17AF-475F-A4BA-156E183FFEE1@oracle.com>
Message-ID: <48DBC395-B4EB-42C7-8F86-969AAB5D86A7@oracle.com>

Tom, can you push this yourself?

> On Jan 15, 2016, at 12:55 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> Looks good.
> 
>> On Jan 15, 2016, at 6:43 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> http://cr.openjdk.java.net/~never/8147433/webrev/index.html
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8137167 moved the PrintNMethods related code into ciEnv but since JVMCI doesn't use ciEnv PrintNMethods no longer works for it.  This moves it into CompileBroker with the other compilation related printing code.  Tested with fastdebug -XX:+PrintNMethods running specjvm2008.
>> 
>> tom
> 


From joe.darcy at oracle.com  Sat Jan 16 01:33:56 2016
From: joe.darcy at oracle.com (Joseph D. Darcy)
Date: Fri, 15 Jan 2016 17:33:56 -0800
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A569DFF26@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569DFF26@ORSMSX106.amr.corp.intel.com>
Message-ID: <56999E04.5040207@oracle.com>

Hello,

Catching up on email, how were these test cases generated or chosen? In 
other words, in what sense are they corners?

The data would be easier to read if the numbers were aligned by column 
(they don't appear that way in the webrev at least).

What is the code coverage of the new intrinsics with this set of tests?

These tests should not be separated from the implementation for long; 
in other words, since the new implementation has already been pushed to 
a HotSpot forest, test coverage for that new implementation should not 
lag behind.

Thanks,

-Joe

On 12/22/2015 5:41 PM, Deshpande, Vivek R wrote:
> HI All
>
> I have uploaded the patch for sin and cos tests with input and allowed outputs
> at this location for your review.
> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
> Thank you.
>
> Regards,
> Vivek
>
> -----Original Message-----
> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
> Sent: Friday, December 04, 2015 4:50 PM
> To: Deshpande, Vivek R; Vladimir Kozlov
> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib
>
> Hi Vivek,
>
> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>> Hi
>>
>> Sure I will add the tests. Shall I use StrictMath result as a reference for exact result.
>> Let me know your thoughts.
> As a rough test of another sin/cos implementation, StrictMath.{sin, cos} can be used as a reference with the following caveat: there isn't an indication of which way the error is in a StrictMath result. Let me give an example: if
>
>       StrictMath.sin(x) => y
>
> then one of the following should be true
>
>       Math.sin(x) => y
>       Math.sin(x) => Math.nextUp(y)
>       Math.sin(x) => Math.nextDown(y)
>
> That is, Math.sin(x) should either be the same as StrictMath.sin(x) OR equal to one of the floating-point numbers adjacent to that result. Of these three options, only two are allowed by the accuracy requirements of the StrictMath.sin specification. However, since StrictMath.sin doesn't give an indication of which way its error went (if it rounded up or down), there is no indication without additional work which of
> nextUp(y) and nextDown(y) is allowable (assuming  StrictMath.sin isn't buggy).
>
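
The rough acceptance rule quoted above (Math.sin(x) equals StrictMath.sin(x) or one of its two floating-point neighbours) can be sketched as a small helper. This is an illustrative sketch, not a test from the webrev: the class and method names are hypothetical, and the check is deliberately the loose three-way version, so it cannot tell which of nextUp(y) and nextDown(y) is actually allowable. It also treats NaN as never matching.

```java
// Hypothetical helper illustrating the three-way comparison described above.
final class UlpCheck {

    // True if 'actual' is 'reference' or one of its adjacent floating-point values.
    static boolean withinOneUlpOf(double reference, double actual) {
        return actual == reference
                || actual == Math.nextUp(reference)
                || actual == Math.nextDown(reference);
    }

    public static void main(String[] args) {
        // Sample inputs chosen for illustration; 90112.0 is the main-path
        // boundary mentioned in the patch comments quoted in this thread.
        double[] inputs = {0.0, 0.5, 1.0, Math.PI / 2, 100.0, 90112.0, 1.0e10};
        for (double x : inputs) {
            double ref = StrictMath.sin(x);
            double got = Math.sin(x);
            System.out.printf("x=%s StrictMath=%s Math=%s ok=%b%n",
                    x, ref, got, withinOneUlpOf(ref, got));
        }
    }
}
```

Applied to the bit patterns quoted later in this thread, the helper accepts 4596178249117717083L against the StrictMath value 4596178249117717084L, which is why this check alone is necessary but not sufficient.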
> HTH,
>
> -Joe
>
>
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: joe darcy [mailto:joe.darcy at oracle.com]
>> Sent: Thursday, December 03, 2015 1:29 PM
>> To: Vladimir Kozlov; Deshpande, Vivek R
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>> lib
>>
>> Hello,
>>
>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>> Vivek,
>>>
>>> I think Joe is asking you to write these tests as hotspot regression
>>> test in hotspot/test/compiler.
>> Exactly; if not generally applicable sin/cos tests that could be hosted in the jdk repo (alongside the regression and unit tests for java.lang.Math), then tests of intrinsics in the HotSpot repo alongside other tests targeting intrinsics.
>>
>> Thanks,
>>
>> -Joe
>>
>>> Vladimir
>>>
>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>> Hi Joe
>>>>
>>>> It would be great if you would please share the additional tests
>>>> with us.
>>>>
>>>> Regards,
>>>> Vivek
>>>>
>>>> -----Original Message-----
>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>> math lib
>>>>
>>>> I think it is unwise for this large of an implementation change to
>>>> be pushed with no tests targeting the specifics of the new implementation.
>>>>
>>>> The worst-case tests in the jdk repo are the mathematical worst
>>>> cases for floating-point approximations, in other words the cases
>>>> where the exact mathematical answer is closest to half-way between two
>>>> representable floating-point numbers. Passing such tests is a
>>>> necessary but not sufficient condition for a new implementation.
>>>>
>>>> Cheers,
>>>>
>>>> -Joe
>>>>
>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>> Okay, looks reasonable to me.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>> Hi Vladimir
>>>>>>
>>>>>> This is the link for the updated webrev with latest hotspot source
>>>>>> as base for your review.
>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>> Thank you.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Deshpande, Vivek R
>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>> math lib
>>>>>>
>>>>>> Hi Vladimir
>>>>>>
>>>>>> This is the link for the updated webrev for your review.
>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>> Thank you.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>> math lib
>>>>>>
>>>>>> Please send link to new webrev on cr server.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> Please find the webrev with your suggested updates attached with
>>>>>>> the mail.
>>>>>>> We will update it in the jbs entry soon.
>>>>>>> Please let me know if it needs further changes.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Deshpande, Vivek R
>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>> math lib
>>>>>>>
>>>>>>> HI Vladimir, Joe
>>>>>>>
>>>>>>> I have done the jtreg tests in hotspot and tests from jdk you
>>>>>>> have mentioned. It passed those tests.
>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions
>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>> The performance gain is 3.2x over base jdk, that is over current
>>>>>>> fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>
>>>>>>> Could I get those tests around the boundary values? Would the
>>>>>>> WorstCaseTests.java jtreg test in jdk test those?
>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>
>>>>>>> I would work on adding either diagnostic flag or just one flag
>>>>>>> for libm and send out the webrev soon.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>> math lib
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Just getting added to the thread..
>>>>>>>
>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>
>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition to
>>>>>>>> Hotspot tests.
>>>>>>>>
>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>> Hi Vladimir
>>>>>>>>>
>>>>>>>>> The results we obtain with LIBM are within +/- 1ulp from
>>>>>>>>> StrictMath result and not exact result. So I added the flag to
>>>>>>>>> switch between FDLIBM and LIBM.
>>>>>>>>>
>>>>>>>>> Quick explanation:
>>>>>>>>> This is what we observed with comparison to HPA Library
>>>>>>>>> (http://www.nongnu.org/hpalib/) explained with an example.
>>>>>>>>>   LIBM Observed Math result = 0.19457293629570213 (4596178249117717083L) (StrictMath - 1ulp)
>>>>>>>>>   Required result should be = 0.19457293629570216 (4596178249117717084L) (StrictMath result)
>>>>>>>>>                          or = 0.1945729362957022  (4596178249117717085L) (StrictMath + 1ulp)
>>>>>>>>> This means HPA library result is between the above two values and
>>>>>>>>> Exact result would be pretty close to it.
>>>>>>>>> So here StrictMath result is less than quad-precision result,
>>>>>>>>> Math result should be StrictMath or StrictMath + 1ulp and not
>>>>>>>>> StrictMath - 1ulp, according to our test.
>>>>>>>> Note, java.lang.Math allows results to be 1ulp off (in both directions,
>>>>>>>> I think) and it should be consistent for the Interpreter and code
>>>>>>>> generated by JIT compilers:
>>>>>>>>
>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>
>>>>>>> That interpretation of the spec is not quite right. For the Math
>>>>>>> methods with a 1/2 ulp error bound, the floating-point result
>>>>>>> closest to the exact result must be returned. For the methods
>>>>>>> with a 1 ulp error bound, either of the floating-point results bracketing
>>>>>>> the true result can be returned, subject to the monotonicity
>>>>>>> constraints of the specification of the particular method.
>>>>>>>
>>>>>>>>> I have done the experiments with -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:DisableIntrinsic=_dsin and -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:DisableIntrinsic=_dcos. With this option, the interpreter
>>>>>>>>> would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>> If we want to disable LIBM completely, we need the flags
>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>> I was thinking about using existing
>>>>>>>> DirectiveSet::is_intrinsic_disabled() and
>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add additional
>>>>>>>> versions of functions which accept intrinsic ID instead of
>>>>>>>> methodHandle.
>>>>>>>>
>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>
>>>>>>>>> Also the performance gain ~4x is with
>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses
>>>>>>>> LIBM code and compilers use FDLIBM?
>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>
>>>>>>> I'm part way through porting the FDLIBM code to Java (JDK-8134780:
>>>>>>> Port fdlibm to Java), which is providing a significant speed
>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>
>>>>>>> I find the current patch *insufficient* as-is in terms of its
>>>>>>> testing.
>>>>>>> For example, part of patch says
>>>>>>>
>>>>>>> # For sin
>>>>>>>
>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>
>>>>>>> # For cos
>>>>>>>
>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>
>>>>>>> If nothing else, there are no tests at around those boundary
>>>>>>> values, which is unacceptable. There should also be some tests of
>>>>>>> values of interest to the algorithm in question.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> -Joe
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>>> Let me know your thoughts on this. I would answer more
>>>>>>>>> questions and give more data if needed.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>
>>>>>>>>>> It would be interesting to see what happens if you disable
>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>
>>>>>>>>>>         -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>> Hi Vivek,
>>>>>>>>>
>>>>>>>>> I want to point out that you can do this experiment later. We can
>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>
>>>>>>>>> For now, please, answer my question about flags only. This is
>>>>>>>>> the only thing holding it from push.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>> Hi all
>>>>>>>>>>>
>>>>>>>>>>> I would like to contribute a patch which optimizes Math.sin()
>>>>>>>>>>> and
>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel LIBM
>>>>>>>>>>>         implementation.
>>>>>>>>>>>
>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and cos.
>>>>>>>>>>>
>>>>>>>>>>> The options to enable the optimizations are
>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>
>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>
>>>>>>>>>>> Bug-id:
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>> webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>
>>>>>>>>>>> Vivek
>>>>>>>>>>>


From vladimir.kozlov at oracle.com  Sat Jan 16 01:58:46 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 15 Jan 2016 17:58:46 -0800
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <56999E04.5040207@oracle.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569DFF26@ORSMSX106.amr.corp.intel.com>
	<56999E04.5040207@oracle.com>
Message-ID: <5699A3D6.6080305@oracle.com>

Note, the test was pushed together with VM changes into hs-comp repo:

  http://hg.openjdk.java.net/jdk9/hs-comp/jdk/rev/ddd59a780769

New sin/cos code is tested in all running modes since it is used by 
Interpreter and JITed code (C1 and C2).

I will let Vivek answer questions about the test.

Regards,
Vladimir

On 1/15/16 5:33 PM, Joseph D. Darcy wrote:
> Hello,
>
> Catching up on email, how were these test cases generated or chosen? In
> other words, in what sense are they corners?
>
> The data would be easier to read if the numbers were aligned by column
> (they don't appear that way in the webrev at least).
>
> What is the code coverage of the new intrinsics with this set of tests?
>
> These tests should not be separated from the implementation for long;
> in other words, since the new implementation has already been pushed to
> a HotSpot forest, test coverage for that new implementation should not
> lag behind.
>
> Thanks,
>
> -Joe
>
> On 12/22/2015 5:41 PM, Deshpande, Vivek R wrote:
>> Hi all
>>
>> I have uploaded the patch for sin and cos tests with input and allowed
>> outputs at this location for your review.
>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>> Thank you.
>>
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>> Sent: Friday, December 04, 2015 4:50 PM
>> To: Deshpande, Vivek R; Vladimir Kozlov
>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib
>>
>> Hi Vivek,
>>
>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>> Hi
>>>
>>> Sure I will add the tests. Shall I use the StrictMath result as a
>>> reference for the exact result?
>>> Let me know your thoughts.
>> As a rough test of another sin/cos implementation, StrictMath.{sin,
>> cos} can be used as a reference with the following caveat: there isn't an
>> indication of which way the error is in a StrictMath result. Let me
>> give an example, if
>>
>>       StrictMath.sin(x) => y
>>
>> then one of the following should be true
>>
>>       Math.sin(x) => y
>>       Math.sin(x) => Math.nextUp(y)
>>       Math.sin(x) => Math.nextDown(y)
>>
>> That is, Math.sin(x) should either be the same as StrictMath.sin(x) OR
>> equal to one of the floating-point numbers adjacent to that result. Of
>> these three options, only two are allowed by the accuracy
>> requirements of the StrictMath.sin specification. However, since
>> StrictMath.sin doesn't give an indication of which way its error went
>> (if it rounded up or down), there is no indication without additional
>> work which of nextUp(y) and nextDown(y) is allowable (assuming
>> StrictMath.sin isn't buggy).
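
A minimal sketch of the rough check described above, assuming
StrictMath.sin is used as the reference: a Math.sin result is accepted
if it equals the StrictMath result or one of its two floating-point
neighbours. The class name, method name, and sample inputs are
hypothetical and chosen only for illustration. As noted above, this
check cannot tell which of the two neighbours is actually allowable.

```java
public class SinUlpCheck {
    /** True if candidate equals reference or one of its two adjacent doubles. */
    static boolean isNeighbourOrEqual(double reference, double candidate) {
        return candidate == reference
            || candidate == Math.nextUp(reference)
            || candidate == Math.nextDown(reference);
    }

    public static void main(String[] args) {
        // Illustrative inputs only; a real test would pick values of
        // interest to the algorithm under test.
        double[] inputs = {0.0, 0.5, 1.0, Math.PI / 4, 3.0, 100.0, 1.0e6};
        for (double x : inputs) {
            double strict = StrictMath.sin(x); // reference result
            double fast = Math.sin(x);         // possibly intrinsified result
            System.out.println("x=" + x + " strict=" + strict + " math=" + fast
                + " ok=" + isNeighbourOrEqual(strict, fast));
        }
    }
}
```

Deciding which neighbour is allowable would need a higher-precision
reference value, which this sketch does not attempt.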
>>
>> HTH,
>>
>> -Joe
>>
>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Thursday, December 03, 2015 1:29 PM
>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>>> lib
>>>
>>> Hello,
>>>
>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>> Vivek,
>>>>
>>>> I think Joe is asking you to write these tests as hotspot regression
>>>> test in hotspot/test/compiler.
>>> Exactly; if not generally applicable sin/cos tests that could be
>>> hosted in the jdk repo (alongside the regression and unit tests for
>>> java.lang.Math), then tests of intrinsics in the HotSpot repo
>>> alongside other tests targeting intrinsics.
>>>
>>> Thanks,
>>>
>>> -Joe
>>>
>>>> Vladimir
>>>>
>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>> Hi Joe
>>>>>
>>>>> It would be great if you would please share the additional tests
>>>>> with us.
>>>>>
>>>>> Regards,
>>>>> Vivek
>>>>>
>>>>> -----Original Message-----
>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>> math lib
>>>>>
>>>>> I think it is unwise for this large of an implementation change to
>>>>> be pushed with no tests targeting the specifics of the new
>>>>> implementation.
>>>>>
>>>>> The worst-case tests in the jdk repo are the mathematical worst
>>>>> cases for floating-point approximations, in other words the cases
>>>>> where the exact mathematical answer is closest to half-way between two
>>>>> representable floating-point numbers. Passing such tests is a
>>>>> necessary but not sufficient condition for a new implementation.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> -Joe
>>>>>
>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>> Okay, looks reasonable to me.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> This is the link for the updated webrev with latest hotspot source
>>>>>>> as base for your review.
>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Deshpande, Vivek R
>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>> math lib
>>>>>>>
>>>>>>> Hi Vladimir
>>>>>>>
>>>>>>> This is the link for the updated webrev for your review.
>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vivek
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>> math lib
>>>>>>>
>>>>>>> Please send link to new webrev on cr server.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> Please find the webrev with your suggested updates attached with
>>>>>>>> the mail.
>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Hi Vladimir, Joe
>>>>>>>>
>>>>>>>> I have run the jtreg tests in hotspot and the jdk tests you
>>>>>>>> mentioned. The patch passed those tests.
>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions
>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos compared to running without that option.
>>>>>>>> The performance gain is 3.2x over base jdk, that is, over the current
>>>>>>>> fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>
>>>>>>>> Could I get those tests around the boundary values? Would the
>>>>>>>> WorstCaseTests.java jtreg test in jdk test those?
>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>
>>>>>>>> I would work on adding either diagnostic flag or just one flag
>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Just getting added to the thread..
>>>>>>>>
>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>
>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition to
>>>>>>>>> Hotspot tests.
>>>>>>>>>
>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>> Hi Vladimir
>>>>>>>>>>
>>>>>>>>>> The results we obtain with LIBM are within +/- 1ulp from the
>>>>>>>>>> StrictMath result and not the exact result. So I added the flag to
>>>>>>>>>> switch between FDLIBM and LIBM.
>>>>>>>>>>
>>>>>>>>>> Quick explanation:
>>>>>>>>>> This is what we observed with comparison to HPA Library
>>>>>>>>>> (http://www.nongnu.org/hpalib/) explained with an example.
>>>>>>>>>> LIBM observed Math result = 0.19457293629570213
>>>>>>>>>> (4596178249117717083L) (StrictMath - 1ulp).
>>>>>>>>>> Required result should be 0.19457293629570216
>>>>>>>>>> (4596178249117717084L) (StrictMath result) or
>>>>>>>>>> 0.1945729362957022
>>>>>>>>>> (4596178249117717085L) (StrictMath + 1ulp).
>>>>>>>>>> This means the HPA library result is between the above two values
>>>>>>>>>> and the exact result would be pretty close to it.
>>>>>>>>>> So here the StrictMath result is less than the quad-precision result;
>>>>>>>>>> the Math result should be StrictMath or StrictMath + 1ulp and not
>>>>>>>>>> StrictMath - 1ulp, according to our test.
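
The three bit patterns quoted above can be inspected with a short
program. This is only an illustrative sketch (the class and field names
are hypothetical); it decodes the long values with
Double.longBitsToDouble and checks that they are consecutive doubles,
that is, one ulp apart.

```java
public class UlpNeighbours {
    // Bit pattern reported above for the StrictMath result.
    static final long STRICT_BITS = 4596178249117717084L;

    static double fromBits(long bits) {
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double below = fromBits(STRICT_BITS - 1);  // "StrictMath - 1ulp"
        double strict = fromBits(STRICT_BITS);     // "StrictMath result"
        double above = fromBits(STRICT_BITS + 1);  // "StrictMath + 1ulp"

        System.out.println(below + " " + strict + " " + above);
        // For positive finite doubles, consecutive bit patterns are
        // consecutive floating-point values.
        System.out.println("nextDown matches: " + (Math.nextDown(strict) == below));
        System.out.println("nextUp matches:   " + (Math.nextUp(strict) == above));
        System.out.println("ulp at this magnitude: " + Math.ulp(strict));
    }
}
```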
>>>>>>>>> Note, java.lang.Math allows results to be 1ulp off (in both directions,
>>>>>>>>> I think) and it should be consistent for Interpreter and code
>>>>>>>>> generated by JIT compilers:
>>>>>>>>>
>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>
>>>>>>>> That interpretation of the spec is not quite right. For the Math
>>>>>>>> methods with a 1/2 ulp error bound, the floating-point result
>>>>>>>> closest to the exact result must be returned. For the methods
>>>>>>>> with a 1 ulp error bound, either of the floating-point results
>>>>>>>> bracketing the true result can be returned, subject to the
>>>>>>>> monotonicity constraints of the specification of the particular
>>>>>>>> method.
>>>>>>>>
>>>>>>>>>> I have done the experiments with -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>> -XX:DisableIntrinsic=_dsin and -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>> -XX:DisableIntrinsic=_dcos. With this option, the interpreter
>>>>>>>>>> would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>> If we want to disable LIBM completely, we need the flags
>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>> I was thinking about using existing
>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and
>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add additional
>>>>>>>>> versions of functions which accept intrinsic ID instead of
>>>>>>>>> methodHandle.
>>>>>>>>>
>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>
>>>>>>>>>> Also the performance gain ~4x is with
>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses
>>>>>>>>> LIBM code and the compilers use FDLIBM?
>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>
>>>>>>>> I'm part way through porting the FDLIBM code to Java (JDK-8134780:
>>>>>>>> Port fdlibm to Java), which is providing a significant speed
>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>
>>>>>>>> I find the current patch *insufficient* as-is in terms of its
>>>>>>>> testing.
>>>>>>>> For example, part of the patch says
>>>>>>>>
>>>>>>>> # For sin
>>>>>>>>
>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>
>>>>>>>> # For cos
>>>>>>>>
>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>
>>>>>>>> If nothing else, there are no tests at or around those boundary
>>>>>>>> values, which is unacceptable. There should also be some tests of
>>>>>>>> values of interest to the algorithm in question.
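
One possible way to build inputs at and around the boundaries quoted
from the patch comments (2^-252 and 90112) is sketched below. The class
and helper names are hypothetical, and comparing against StrictMath is
only a rough check under the assumption that both results are within
1 ulp of the exact value. It is not the test that was eventually pushed.

```java
import java.util.ArrayList;
import java.util.List;

public class SinCosBoundaryInputs {
    /** Returns the boundary, its two neighbouring doubles, and their negations. */
    static List<Double> around(double boundary) {
        List<Double> xs = new ArrayList<>();
        double[] base = {Math.nextDown(boundary), boundary, Math.nextUp(boundary)};
        for (double x : base) {
            xs.add(x);
            xs.add(-x);
        }
        return xs;
    }

    /** Rough check: b equals a or one of a's adjacent doubles. */
    static boolean adjacentOrEqual(double a, double b) {
        return b == a || b == Math.nextUp(a) || b == Math.nextDown(a);
    }

    public static void main(String[] args) {
        // Boundaries of the "main path" as stated in the patch comments.
        double[] boundaries = {Math.scalb(1.0, -252), 90112.0};
        for (double boundary : boundaries) {
            for (double x : around(boundary)) {
                boolean sinOk = adjacentOrEqual(StrictMath.sin(x), Math.sin(x));
                boolean cosOk = adjacentOrEqual(StrictMath.cos(x), Math.cos(x));
                System.out.println("x=" + x + " sinOk=" + sinOk + " cosOk=" + cosOk);
            }
        }
    }
}
```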
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> -Joe
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>>> Let me know your thoughts on this. I would answer more
>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vivek
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>>> the math lib
>>>>>>>>>>
>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>
>>>>>>>>>>> It would be interesting to see what happens if you disable
>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>
>>>>>>>>>>>         -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>> Hi Vivek,
>>>>>>>>>>
>>>>>>>>>> I want to point out that you can do this experiment later. We can
>>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>>
>>>>>>>>>> For now, please, answer my question about flags only. This is
>>>>>>>>>> the only thing holding it from push.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>> Hi all
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to contribute a patch which optimizes Math.sin()
>>>>>>>>>>>> and
>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel LIBM
>>>>>>>>>>>>         implementation.
>>>>>>>>>>>>
>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and
>>>>>>>>>>>> cos.
>>>>>>>>>>>>
>>>>>>>>>>>> The options to enable the optimizations are
>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>
>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>> webrev:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Vivek
>>>>>>>>>>>>
>

From joe.darcy at oracle.com  Sat Jan 16 02:28:31 2016
From: joe.darcy at oracle.com (joe darcy)
Date: Fri, 15 Jan 2016 18:28:31 -0800
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <5699A3D6.6080305@oracle.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569DFF26@ORSMSX106.amr.corp.intel.com>
	<56999E04.5040207@oracle.com> <5699A3D6.6080305@oracle.com>
Message-ID: <5699AACF.6080608@oracle.com>

Ah okay; I overlooked the separate push of the tests.

Thanks,

-Joe

On 1/15/2016 5:58 PM, Vladimir Kozlov wrote:
> Note, the test was pushed together with VM changes into hs-comp repo:
>
>  http://hg.openjdk.java.net/jdk9/hs-comp/jdk/rev/ddd59a780769
>
> New sin/cos code is tested in all running modes since it is used by 
> Interpreter and JITed code (C1 and C2).
>
> I will let Vivek answer questions about the test.
>
> Regards,
> Vladimir
>
> On 1/15/16 5:33 PM, Joseph D. Darcy wrote:
>> Hello,
>>
>> Catching up on email, how were these test cases generated or chosen? In
>> other words, in what sense are they corners?
>>
>> The data would be easier to read if the numbers were aligned by column
>> (they don't appear that way in the webrev at least).
>>
>> What is the code coverage of the new intrinsics with this set of tests?
>>
>> These tests should not be separated from the implementation for long;
>> in other words, since the new implementation has already been pushed to
>> a HotSpot forest, test coverage for that new implementation should not
>> lag behind.
>>
>> Thanks,
>>
>> -Joe
>>
>> On 12/22/2015 5:41 PM, Deshpande, Vivek R wrote:
>>> Hi all
>>>
>>> I have uploaded the patch for sin and cos tests with input and allowed
>>> outputs at this location for your review.
>>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/ 
>>>
>>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>>> Thank you.
>>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Friday, December 04, 2015 4:50 PM
>>> To: Deshpande, Vivek R; Vladimir Kozlov
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>> math lib
>>>
>>> Hi Vivek,
>>>
>>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>>> Hi
>>>>
>>>> Sure I will add the tests. Shall I use the StrictMath result as a
>>>> reference for the exact result?
>>>> Let me know your thoughts.
>>> As a rough test of another sin/cos implementation, StrictMath.{sin,
>>> cos} can be used as a reference with the following caveat: there isn't an
>>> indication of which way the error is in a StrictMath result. Let me
>>> give an example, if
>>>
>>>       StrictMath.sin(x) => y
>>>
>>> then one of the following should be true
>>>
>>>       Math.sin(x) => y
>>>       Math.sin(x) => Math.nextUp(y)
>>>       Math.sin(x) => Math.nextDown(y)
>>>
>>> That is, Math.sin(x) should either be the same as StrictMath.sin(x) OR
>>> equal to one of the floating-point numbers adjacent to that result. Of
>>> these three options, only two are allowed by the accuracy
>>> requirements of the StrictMath.sin specification. However, since
>>> StrictMath.sin doesn't give an indication of which way its error went
>>> (if it rounded up or down), there is no indication without additional
>>> work which of nextUp(y) and nextDown(y) is allowable (assuming
>>> StrictMath.sin isn't buggy).
>>>
>>> HTH,
>>>
>>> -Joe
>>>
>>>
>>>> Regards,
>>>> Vivek
>>>>
>>>> -----Original Message-----
>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>> Sent: Thursday, December 03, 2015 1:29 PM
>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math
>>>> lib
>>>>
>>>> Hello,
>>>>
>>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>>> Vivek,
>>>>>
>>>>> I think Joe is asking you to write these tests as hotspot regression
>>>>> test in hotspot/test/compiler.
>>>> Exactly; if not generally applicable sin/cos tests that could be
>>>> hosted in the jdk repo (alongside the regression and unit tests for
>>>> java.lang.Math), then tests of intrinsics in the HotSpot repo
>>>> alongside other tests targeting intrinsics.
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>>> Vladimir
>>>>>
>>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>>> Hi Joe
>>>>>>
>>>>>> It would be great if you would please share the additional tests
>>>>>> with us.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>> math lib
>>>>>>
>>>>>> I think it is unwise for this large of an implementation change to
>>>>>> be pushed with no tests targeting the specifics of the new
>>>>>> implementation.
>>>>>>
>>>>>> The worst-case tests in the jdk repo are the mathematical worst
>>>>>> cases for floating-point approximations, in other words the cases
>>>>>> where the exact mathematical answer is closest to half-way between two
>>>>>> representable floating-point numbers. Passing such tests is a
>>>>>> necessary but not sufficient condition for a new implementation.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> -Joe
>>>>>>
>>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>>> Okay, looks reasonable to me.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev with latest hotspot source
>>>>>>>> as base for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>> math lib
>>>>>>>>
>>>>>>>> Please send link to new webrev on cr server.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>>> Hi Vladimir
>>>>>>>>>
>>>>>>>>> Please find the webrev with your suggested updates attached with
>>>>>>>>> the mail.
>>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Deshpande, Vivek R
>>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>>> math lib
>>>>>>>>>
>>>>>>>>> Hi Vladimir, Joe
>>>>>>>>>
>>>>>>>>> I have run the jtreg tests in hotspot and the jdk tests you
>>>>>>>>> mentioned. The patch passed those tests.
>>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions
>>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos compared to running without that option.
>>>>>>>>> The performance gain is 3.2x over base jdk, that is, over the current
>>>>>>>>> fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>>
>>>>>>>>> Could I get those tests around the boundary values? Would the
>>>>>>>>> WorstCaseTests.java jtreg test in jdk test those?
>>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>>
>>>>>>>>> I would work on adding either diagnostic flag or just one flag
>>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the
>>>>>>>>> math lib
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Just getting added to the thread..
>>>>>>>>>
>>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>>
>>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition to
>>>>>>>>>> Hotspot tests.
>>>>>>>>>>
>>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>> Hi Vladimir
>>>>>>>>>>>
>>>>>>>>>>> The results we obtain with LIBM are within +/- 1ulp from the
>>>>>>>>>>> StrictMath result and not the exact result. So I added the flag to
>>>>>>>>>>> switch between FDLIBM and LIBM.
>>>>>>>>>>>
>>>>>>>>>>> Quick explanation:
>>>>>>>>>>> This is what we observed with comparison to HPA Library
>>>>>>>>>>> (http://www.nongnu.org/hpalib/) explained with an example.
>>>>>>>>>>> LIBM observed Math result = 0.19457293629570213
>>>>>>>>>>> (4596178249117717083L) (StrictMath - 1ulp).
>>>>>>>>>>> Required result should be 0.19457293629570216
>>>>>>>>>>> (4596178249117717084L) (StrictMath result) or
>>>>>>>>>>> 0.1945729362957022
>>>>>>>>>>> (4596178249117717085L) (StrictMath + 1ulp).
>>>>>>>>>>> This means the HPA library result is between the above two values
>>>>>>>>>>> and the exact result would be pretty close to it.
>>>>>>>>>>> So here the StrictMath result is less than the quad-precision result;
>>>>>>>>>>> the Math result should be StrictMath or StrictMath + 1ulp and not
>>>>>>>>>>> StrictMath - 1ulp, according to our test.
>>>>>>>>>> Note, java.lang.Math allows results to be 1ulp off (in both directions,
>>>>>>>>>> I think) and it should be consistent for Interpreter and code
>>>>>>>>>> generated by JIT compilers:
>>>>>>>>>>
>>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>>
>>>>>>>>> That interpretation of the spec is not quite right. For the Math
>>>>>>>>> methods with a 1/2 ulp error bound, the floating-point result
>>>>>>>>> closest to the exact result must be returned. For the methods
>>>>>>>>> with a 1 ulp error bound, either of the floating-point results
>>>>>>>>> bracketing the true result can be returned, subject to the
>>>>>>>>> monotonicity constraints of the specification of the particular
>>>>>>>>> method.
>>>>>>>>>
>>>>>>>>>>> I have done the experiments with -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:DisableIntrinsic=_dsin and -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>> -XX:DisableIntrinsic=_dcos. With this option, the interpreter
>>>>>>>>>>> would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>>> If we want to disable LIBM completely, we need the flags
>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>> I was thinking about using existing
>>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and
>>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add additional
>>>>>>>>>> versions of functions which accept intrinsic ID instead of
>>>>>>>>>> methodHandle.
>>>>>>>>>>
>>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>>
>>>>>>>>>>> Also the performance gain ~4x is with
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses
>>>>>>>>>> LIBM code and the compilers use FDLIBM?
>>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>>
>>>>>>>>> I'm part way through porting the FDLIBM code to Java 
>>>>>>>>> (JDK-8134780:
>>>>>>>>> Port fdlibm to Java), which is providing a significant speed
>>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>>
>>>>>>>>> I find the current patch *insufficient* as-is in terms of its
>>>>>>>>> testing.
>>>>>>>>> For example, part of the patch says
>>>>>>>>>
>>>>>>>>> # For sin
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> # For cos
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> If nothing else, there are no tests at or around those boundary
>>>>>>>>> values, which is unacceptable. There should also be some tests of
>>>>>>>>> values of interest to the algorithm in question.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> -Joe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Let me know your thoughts on this. I would answer more
>>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Vivek
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>>> To: Deshpande, Vivek R; hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in
>>>>>>>>>>> the math lib
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be interesting to see what happens if you disable
>>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>>
>>>>>>>>>>>>         -XX:+UnlockDiagnosticVMOptions
>>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>
>>>>>>>>>>> I want to point out that you can do this experiment later. We can
>>>>>>>>>>> file bugs and fix them after FC.
>>>>>>>>>>>
>>>>>>>>>>> For now, please, answer my question about flags only. This is
>>>>>>>>>>> the only thing holding it from push.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to contribute a patch which optimizes Math.sin()
>>>>>>>>>>>>> and
>>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel 
>>>>>>>>>>>>> LIBM
>>>>>>>>>>>>>         implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin and
>>>>>>>>>>>>> cos.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The options to enable the optimizations are
>>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vivek
>>>>>>>>>>>>>
>>
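Joe's request above for tests around the boundary values could be sketched roughly as follows. This is only an illustration under my own assumptions, not part of the patch: the class and method names are made up, and the two-ulp tolerance simply reflects that Math.sin/cos and StrictMath.sin/cos are each specified to be within 1 ulp of the exact result. It probes the values immediately around 2^-252 and 90112, the limits quoted from the patch comment, with both signs.

```java
// Hypothetical probe (names are illustrative, not from the webrev).
final class SinCosBoundaryProbe {
    // Limits of the "main path" as quoted from the patch comment.
    static final double LOW = 0x1.0p-252;   // 2^-252
    static final double HIGH = 90112.0;

    // True if a and b differ by at most two ulps of the larger magnitude.
    static boolean close(double a, double b) {
        double tol = 2 * Math.max(Math.ulp(a), Math.ulp(b));
        return Math.abs(a - b) <= tol;
    }

    // Compares Math and StrictMath at x, its two neighbours, and their negations.
    static boolean agreesAround(double x) {
        double[] probes = {
            Math.nextDown(x), x, Math.nextUp(x),
            -Math.nextDown(x), -x, -Math.nextUp(x)
        };
        for (double p : probes) {
            if (!close(Math.sin(p), StrictMath.sin(p))) return false;
            if (!close(Math.cos(p), StrictMath.cos(p))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("around 2^-252: " + agreesAround(LOW));
        System.out.println("around 90112:  " + agreesAround(HIGH));
    }
}
```

A real regression test would presumably also pin down exact expected values for inputs that are interesting to the algorithm, which this sketch does not attempt.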


From tom.rodriguez at oracle.com  Sat Jan 16 02:48:36 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 15 Jan 2016 18:48:36 -0800
Subject: RFR(S): 8147433: PrintNMethods no longer works with JVMCI
In-Reply-To: <48DBC395-B4EB-42C7-8F86-969AAB5D86A7@oracle.com>
References: <E9C63117-7CBD-4A8F-8A6E-B06E40DA7C7A@oracle.com>
	<A814C4D3-17AF-475F-A4BA-156E183FFEE1@oracle.com>
	<48DBC395-B4EB-42C7-8F86-969AAB5D86A7@oracle.com>
Message-ID: <7D91D2AA-5C7F-46A1-8189-61F6CD9076BB@oracle.com>

I thought I'd give it a try.  I think I have it all set up right, so it's a good test.

tom

> On Jan 15, 2016, at 3:30 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> Tom, can you push this yourself?
> 
>> On Jan 15, 2016, at 12:55 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> Looks good.
>> 
>>> On Jan 15, 2016, at 6:43 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>> 
>>> http://cr.openjdk.java.net/~never/8147433/webrev/index.html
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8137167 moved the PrintNMethods related code into ciEnv, but since JVMCI doesn't use ciEnv, PrintNMethods no longer works for it.  This moves it into CompileBroker with the other compilation related printing code.  Tested with fastdebug -XX:+PrintNMethods running specjvm2008.
>>> 
>>> tom
>> 
> 


From tobias.hartmann at oracle.com  Mon Jan 18 07:09:51 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 18 Jan 2016 08:09:51 +0100
Subject: [9] RFR(S): 8144212: JDK 9 b93 breaks Apache Lucene due to
	compact strings
In-Reply-To: <56993704.7000503@oracle.com>
References: <568D0229.60908@oracle.com> <568D037E.7000105@redhat.com>
	<568D1148.1030901@oracle.com> <568D17E4.90301@redhat.com>
	<568DAA2A.9070704@oracle.com> <568E7BAB.5070908@oracle.com>
	<568ECF5C.6090407@oracle.com> <568F9183.9070909@oracle.com>
	<56901101.6050503@oracle.com> <5693C83F.9030100@oracle.com>
	<569409C5.2040805@oracle.com> <569506CA.8040001@oracle.com>
	<569552EE.8050809@oracle.com> <56963C7A.8040203@oracle.com>
	<56993704.7000503@oracle.com>
Message-ID: <569C8FBF.2030402@oracle.com>

Thanks, Vladimir!

Best,
Tobias

On 15.01.2016 19:14, Vladimir Kozlov wrote:
> Very good.
> 
> Thanks,
> Vladimir
> 
> On 1/13/16 4:00 AM, Tobias Hartmann wrote:
>> Thanks, Vladimir.
>>
>> On 12.01.2016 20:24, Vladimir Kozlov wrote:
>>>> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
>>>
>>> Yes, that is right solution here.
>>
>> I changed the implementation to only capture the byte[] and char[] memory:
>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.03/
>>
>> The method GraphKit::capture_memory(src_type, dst_type) returns a new MergeMemNode if the src and dst types are different, merging the two.
>>
>> Best,
>> Tobias
>>
>>> On 1/12/16 5:59 AM, Tobias Hartmann wrote:
>>>> On 11.01.2016 21:00, Vladimir Kozlov wrote:
>>>>> On 1/11/16 7:20 AM, Tobias Hartmann wrote:
>>>>>> On 08.01.2016 20:41, Vladimir Kozlov wrote:
>>>>>>> On 1/8/16 2:37 AM, Tobias Hartmann wrote:
>>>>>>>> On 07.01.2016 21:49, Vladimir Kozlov wrote:
>>>>>>>>> On 1/7/16 6:52 AM, Tobias Hartmann wrote:
>>>>>>>>>> Hi Vladimir,
>>>>>>>>>>
>>>>>>>>>> On 07.01.2016 00:58, Vladimir Kozlov wrote:
>>>>>>>>>>> Andrew is right.
>>>>>>>>>>
>>>>>>>>>> Yes, he's right that the membar is not needed in this case. I noticed that GraphKit::inflate_string() sets the output memory to TypeAryPtr::BYTES although inflate writes to a char[] array in this case. This caused the subsequent char load to be on a different slice allowing C2 to move the load to before the intrinsic.
>>>>>>>>>
>>>>>>>>> Right. It was the root of this bug, see below.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I fixed this for the inflate and compress intrinsics.
>>>>>>>>>>
>>>>>>>>>>> GraphKit::inflate_string() should have SCMemProjNode as compress_string() does to prevent loads move up.
>>>>>>>>>>> StrInflatedCopyNode is not memory node.
>>>>>>>>>>
>>>>>>>>>> Okay, why are above changes not sufficient to prevent the load from moving up? Also, the comment for SCMemProjNode says:
>>>>>>>>>
>>>>>>>>> I did not get the question. Is it before your webrev.01 change? Or even with the change?
>>>>>>>>
>>>>>>>> I meant with webrev.01 but you answered my question below.
>>>>>>>>
>>>>>>>>>>       // This class defines a projection of the memory  state of a store conditional node.
>>>>>>>>>>       // These nodes return a value, but also update memory.
>>>>>>>>>>
>>>>>>>>>> But inflate does not return any value.
>>>>>>>>>
>>>>>>>>> Hmm, according to bottom type inflate produce memory:
>>>>>>>>>
>>>>>>>>> StrInflatedCopyNode::bottom_type() const { return Type::MEMORY; }
>>>>>>>>>
>>>>>>>>> So it really does not need SCMemProjNode. Sorry about that.
>>>>>>>>> So load was LoadUS which is char load and originally memory slice of inflate was incorrect BYTES.
>>>>>>>>
>>>>>>>> Exactly.
>>>>>>>>
>>>>>>>>> Instead of SCMemProjNode we should have to change the idx of your dst_type:
>>>>>>>>>
>>>>>>>>> set_memory(str, dst_type);
>>>>>>>>
>>>>>>>> Yes, that's what I do now in webrev.01 by passing the dst_type as an argument to inflate_string.
>>>>>>>>
>>>>>>>>> And you should rollback part of changes in escape.cpp and macro.cpp.
>>>>>>>>
>>>>>>>> Okay, I'll do that.
>>>>>>>>
>>>>>>>>>> Here is the new webrev, including the SCMemProjNode and adapting escape analysis and macro expansion accordingly:
>>>>>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.01/
>>>>>>>>>
>>>>>>>>> In general when src & dst arrays have different type we may need to use TypeOopPtr::BOTTOM to prevent related store & loads bypass these copy nodes.
>>>>>>>>
>>>>>>>> Okay, should we then use BOTTOM for both the input and output type?
>>>>>>>
>>>>>>> Only input. Output type corresponds to dst array type which you set correctly now.
>>>>>>
>>>>>> It seems that this is not sufficient. As Roland pointed out (off-thread), there may still be a problem in the following case:
>>>>>>     StoreC
>>>>>>     inflate_string
>>>>>>     LoadC
>>>>>>
>>>>>> The memory graph (def->use) now looks like this:
>>>>>>     LoadC -> inflate_string -> ByteMem
>>>>>>                   ... StoreC-> CharMem
>>>>>
>>>>> I did not get this. If the StoreC node is created before inflate_string, inflate_string should point to it to be a barrier for LoadC.
>>>>
>>>> Note that the StoreC and inflate_string are *not* writing to the same char[] array. The test looks like this:
>>>>
>>>>    char c1[] = new char[1];
>>>>    char c2[] = new char[1];
>>>>
>>>>    c2[0] = 42;
>>>>    // Inflate String from byte[] to char[]
>>>>    s.getChars(0, 1, c1, 0);
>>>>    // Read char[] memory written before inflation
>>>>    return c2[0];
>>>>
>>>> The result should be 42. The problem is that inflate_string does not point to StoreC because inflate_string uses a byte[] as input and in this case also writes to a different char[]. Even if we set the input to BOTTOM, inflate_string points to 7 Parm (BOTTOM) but not to the char[] memory produced by 96 StoreC:
>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>>>
>>>> 349 LoadUS then reads from the output char[] memory of inflate_string which does not include the result of StoreC. The test fails because the return value is != 42.
>>>>
>>>> My solution is to capture both the byte[] and char[] memory by using a MergeMem node as input to inflate_string.
>>>>
>>>>>    If StoreC followed inflate_string and LoadC followed StoreC - LoadC should point to StoreC. If LoadC does not follow StoreC then result is relaxed.
>>>>
>>>> Yes, these cases work fine.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>>>> The intrinsic hides the dependency between LoadC and StoreC, causing the load to read from memory not containing the result of the StoreC. I was able to write a regression test for this (see 'TestStringIntrinsicMemoryFlow::testInflate2').
>>>>>>
>>>>>> Setting the input to BOTTOM, generates the following graph:
>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_bottom.png
>>>>>> The 349 LoadUS does not read the result of the 96 StoreC because the StrInflatedCopyNode does not capture its memory. The test fails.
>>>>>>
>>>>>> I adapted the fix to emit a MergeMemNode to capture the entire memory state as input to the intrinsic. The graph then looks like this:
>>>>>>     LoadC -> inflate_string -> MergeMem(ByteMem, StoreC(CharMem))
>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/inflate_merge.png
>>>>>>
>>>>>> Here is the new webrev:
>>>>>> http://cr.openjdk.java.net/~thartmann/8144212/webrev.02/
>>>>>> Probably, we could also only capture the byte and char slices instead of merging everything. What do you think?
>>>>>>
>>>>>> Best,
>>>>>> Tobias
>>>>>>
>>>>>>>>>> Related question:
>>>>>>>>>> In library_call.cpp, I now use TypeAryPtr::get_array_body_type(dst_elem) to get the correct TypeAryPtr for the destination (we support both BYTES and CHARS). For a char[] destination, it returns:
>>>>>>>>>>       char[int:>=0]:exact+any *
>>>>>>>>>>
>>>>>>>>>> which is equal to the type of the char load.
>>>>>>>>>
>>>>>>>>> Please, explain this. I thought string's array will always be byte[] when compressed strings are enabled. Is it used for getChars() which returns char array?
>>>>>>>>
>>>>>>>> Yes, both the compress and inflate intrinsics are used for different types of src and dst arrays. See comment in library_call.cpp:
>>>>>>>>
>>>>>>>> // compressIt == true --> generate a compressed copy operation (compress char[]/byte[] to byte[])
>>>>>>>> //   int StringUTF16.compress(char[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>>> //   int StringUTF16.compress(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>>> // compressIt == false --> generate an inflated copy operation (inflate byte[] to char[]/byte[])
>>>>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, char[] dst, int dstOff, int len)
>>>>>>>> //   void StringLatin1.inflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len)
>>>>>>>>
>>>>>>>> I.e., the inflate intrinsic is used for inflation from byte[] to byte[]/char[].
>>>>>>>>
>>>>>>>>> Should we also be more careful in inflate_string_slow()? Is it used?
>>>>>>>>
>>>>>>>> No, inflate_string_slow() is only called from PhaseStringOpts::copy_latin1_string() where it is used to inflate from byte[] to byte[].
>>>>>>>>
>>>>>>>>>> I also tried to derive the type from the array by using dst_type->isa_aryptr(). However, this returns a more specific type:
>>>>>>>>>>       char[int:1]:NotNull:exact *
>>>>>>>>>>
>>>>>>>>>> Using this results in C2 assuming that the subsequent char load is independent and again moving it to before the intrinsic. I don't understand why that is. Shouldn't the second type be a "subtype" of the first type?
>>>>>>>>>
>>>>>>>>> It is indeed strange. What is the memory type of LoadUS? It could be a bug.
>>>>>>>>
>>>>>>>> LoadUS has memory type "char[int:>=0]:exact+any *" which has alias index 4. dst_type->isa_aryptr() returns memory type "char[int:1]:NotNull:exact *" which has alias index 8.
>>>>>>>>
>>>>>>>> I will look into this again and try to understand what happens.
>>>>>>>
>>>>>>> It could be that aryptr is a pointer to the array and the load type is a pointer to the array's element.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Tobias
>>>>>>>>
>>>>>>>>>>> On 1/6/16 5:34 AM, Andrew Haley wrote:
>>>>>>>>>>>> On 01/06/2016 01:06 PM, Tobias Hartmann wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The problem here is that C2 reorders memory instructions and moves
>>>>>>>>>>>>> an array load before an array store. The MemBarCPUOrder is now used
>>>>>>>>>>>>> (compiler internally) to prevent this. We do the same for normal
>>>>>>>>>>>>> array copies in PhaseMacroExpand::expand_arraycopy_node(). No actual
>>>>>>>>>>>>> code is emitted. See also the comment in memnode.hpp:
>>>>>>>>>>>>>
>>>>>>>>>>>>>        // Ordering within the same CPU.  Used to order unsafe memory references
>>>>>>>>>>>>>        // inside the compiler when we lack alias info.  Not needed "outside" the
>>>>>>>>>>>>>        // compiler because the CPU does all the ordering for us.
>>>>>>>>>>>>>
>>>>>>>>>>>>> "CPU does all the ordering for us" means that even with a relaxed
>>>>>>>>>>>>> memory ordering, loads are never moved before dependent stores.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Or did I misunderstand your question?
>>>>>>>>>>>>
>>>>>>>>>>>> No, I don't think so.  I was just checking: I am very aware that
>>>>>>>>>>>> HotSpot has presented those of us with relaxed memory order machines
>>>>>>>>>>>> with some interesting gotchas over the years, that's all.  I'm a bit
>>>>>>>>>>>> surprised that C2 needs this barrier, given that there is a
>>>>>>>>>>>> read-after-write dependency, but never mind.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Andrew.
>>>>>>>>>>>>
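The test fragment quoted in this thread (store to c2, inflate into c1, then read c2) can be wrapped into a small standalone program along the following lines. This is a sketch only: the class and method names are mine, it is not the TestStringIntrinsicMemoryFlow source, and whether the inflate intrinsic is actually exercised depends on the JVM (compact strings enabled, method compiled by C2). On any correct JVM the read must observe the earlier store.

```java
// Hypothetical standalone variant of the scenario discussed above.
final class InflateOrderingProbe {
    // Stores to c2, copies String contents into the unrelated array c1,
    // then reads c2 back. The copy must not hide the earlier store.
    static char readAfterInflate(String s) {
        char[] c1 = new char[1];
        char[] c2 = new char[1];

        c2[0] = 42;
        // Copies (and, with compact strings, inflates) into c1.
        s.getChars(0, 1, c1, 0);
        // Reads char[] memory written before the copy.
        return c2[0];
    }

    // Repeats the call often enough that a JIT compiler is likely to kick in.
    static boolean alwaysFortyTwo(int iterations) {
        for (int i = 0; i < iterations; i++) {
            if (readAfterInflate("a") != 42) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(alwaysFortyTwo(100_000) ? "ok" : "load was reordered");
    }
}
```

Running it with -Xcomp or -XX:-TieredCompilation should make it more likely that C2 compiles readAfterInflate early, but that is a guess about a convenient setup rather than what the regression test does.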

From tobias.hartmann at oracle.com  Mon Jan 18 07:10:14 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 18 Jan 2016 08:10:14 +0100
Subject: [9] RFR(M): 6675699: need comprehensive fix for unconstrained
	ConvI2L with narrowed type
In-Reply-To: <56993981.8020703@oracle.com>
References: <5697C624.7040201@oracle.com> <5697EFFF.90305@oracle.com>
	<5699021F.90500@oracle.com> <56993981.8020703@oracle.com>
Message-ID: <569C8FD6.9090806@oracle.com>

Thanks, Vladimir!

Best,
Tobias

On 15.01.2016 19:25, Vladimir Kozlov wrote:
> This looks good.
> 
> Thanks,
> Vladimir
> 
> On 1/15/16 6:28 AM, Tobias Hartmann wrote:
>> Thanks, Vladimir.
>>
>> On 14.01.2016 19:59, Vladimir Kozlov wrote:
>>> You have to update the code for the 8146999 changes when Roland pushes it.
>>
>> Yes, I'll do so but Roland mentioned that he still has problems with his 8146999 fix.
>>
>>> The only thing I don't like about changes is using #ifdef _LP64 for part of changes.
>>> I know where it is coming from (ConvI2L for loop indexing) but as you said ConvI2L could be generated in other cases too. Should the test cast->has_range_check() return 'false' in 32-bit?
>>
>> I added the _LP64 ifdefs because we only emit a narrowed ConvI2L on 64 bit. But I agree - it's cleaner without those. As you suggested, I removed the ifdefs and changed has_range_check() to return false on 32 bit.
>>
>> Here is the new webrev:
>> http://cr.openjdk.java.net/~thartmann/6675699/webrev.02/
>>
>> Thanks,
>> Tobias
>>
>>> On 1/14/16 8:00 AM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please review the following patch.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-6675699
>>>> http://cr.openjdk.java.net/~thartmann/6675699/webrev.01/
>>>>
>>>> *Problem*
>>>> The problem is that ConvI2L nodes with a narrow type (used to convert integer array indices to long values) are not dependent on the corresponding range check that proves that the input value is always in the (integer-)range. As a result, the ConvI2L node may flow above the range check during loop optimizations and end up with an input that is not in its type range. The node is then replaced by TOP causing the data path to be eliminated. However, because there is no control dependency on the corresponding range check, the control path from the peeled iteration that uses the result of the ConvI2L may not be eliminated. We crash because we are potentially using a value that is not available.
>>>>
>>>> For example, TestLoopPeeling::testArrayAccess() triggers loop peeling because the loop contains an invariant check. The array store in line 66 is moved out of the loop and reachable from the peeled and old iterations of the loop. However, the array index computation consisting of a LShiftL(ConvI2L(Phi)) remains in each loop because it has loop variant usages and is not dependent on the range check that was moved out of the loop. The peeled iteration of the loop uses storeIndex == -1 causing the ConvI2L to be replaced by TOP because -1 is not in its [0, MAX_INT] range. The TOP is propagated downwards and ends up as one of the inputs to the Phi that merges the array index from the peeled and old loop exits. The Phi is replaced by its only remaining input and the store ends up using the index from the old iteration although it's still reachable from the peeled iteration. We crash because we potentially use the index value from the old iteration while coming from the peeled iteration (of course, the range check would catch this at runtime).
>>>>
>>>> This problem may show up with array accesses but also with other code for which we emit a ConvI2L node with a narrow type. For example, array allocation uses a ConvI2L to convert the integer array size to a long value (see TestLoopPeeling::testArrayAllocation). We solved several different instances of this problem in the past with "workaround-fixes" that just disabled loop optimizations in special cases (see below). Such a workaround fix is not feasible to fix all potential occurrences of this problem. TestLoopPeeling.java crashes JDK 7, 8 and 9.
>>>>
>>>> *Solution*
>>>> To make the ConvI2L dependent on a range check, I added code to emit a narrow CastII node with a control dependency on the range check that is then used as input to the ConvI2L. Like this, we explicitly express the dependency and prevent loop optimizations from moving the ConvI2L above the range check.
>>>>
>>>> To make sure that the impact is as small as possible, the range check dependent CastII nodes are removed right after loop optimizations. Further, all optimizations that depend on the old shape of array address computations are adapted to be aware of the CastII node.
>>>>
>>>> With the fix, we could now remove the following old "workaround-fixes":
>>>> https://bugs.openjdk.java.net/browse/JDK-4781451
>>>> https://bugs.openjdk.java.net/browse/JDK-4799512
>>>> https://bugs.openjdk.java.net/browse/JDK-6659207
>>>> https://bugs.openjdk.java.net/browse/JDK-6663854
>>>> For reference, the individual patches can be found here:
>>>> http://cr.openjdk.java.net/~thartmann/6675699/backouts/
>>>>
>>>> However, performance evaluation showed that backing out the old fixes causes significant regressions. It seems that aggressive splitting of ConvI2L nodes through phis leads to less optimal code due to more register spilling. I suspect that additional changes to the loop optimizations are necessary and would therefore like to leave the workaround fixes in for now. I filed JDK-8145313 to remove them later. Like this, we also reduce the impact/risk when backporting this fix to JDK 8 and potentially JDK 7.
>>>>
>>>> Roland pointed out that the changes in ConvI2LNode::Ideal() could potentially be merged into the CastIINode::Ideal() optimization introduced by his fix for JDK-8145322. After some investigation it turned out that the CastII optimization does not only affect memory addressing but also other CastII(AddI(..)) graph shapes. Making it more generic has a broader impact and therefore needs more investigation. I filed JDK-8147394 for this.
>>>>
>>>> ConvI2L nodes with a narrow type are also emitted by intrinsics:
>>>> - GraphKit::array_element_address()
>>>> - PhaseMacroExpand::array_element_address()
>>>> - ArrayCopyNode::prepare_array_copy()
>>>> I was not able to reproduce the problem with intrinsics. It's also not easily possible to make the CastII node range check dependent here because the range check is not always available from within the intrinsic.
>>>>
>>>> *Testing*
>>>> I did extensive testing to make sure the fix does not introduce correctness or performance issues.
>>>> - Different RBT test suites [1] with and without -Xcomp.
>>>> - Full run of multiple CTW suites.
>>>> - Verified changes in "PhaseIdealLoop::match_fill_loop" (loopTransform.cpp) by manually checking the output of [2] with -XX:+TraceOptimizeFill.
>>>> - Verified changes in "IfNode::improve_address_types" (ifnode.cpp) by manually checking the output of [3] with -XX:+PrintOptoAssembly to make sure all range checks are folded.
>>>> - Verified changes in superword.cpp by comparing output with -XX:+TraceSuperWord.
>>>> - Performance runs (Footprint, JMH-Javac, SPECjbb2005, SPECjvm2008, Startup, Volano) on x86 and SPARC showed no regression
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1] RBT test suites:
>>>> - hotspot/test/:hotspot_all
>>>> - noncolo.testlist
>>>> - vm.compiler.testlist
>>>> - vm.regression.testlist
>>>> - nsk.regression.testlist
>>>> - nsk.split_verifier.testlist
>>>> - nsk.stress.testlist
>>>> - nsk.stress.jck.testlist
>>>> - jdk/test/:jdk_jfr
>>>> - jdk/test/:svc_tools
>>>> - jdk/test/:jdk_instrument
>>>> - jdk/test/:jdk_lang
>>>> - jdk/test/:jdk_svc
>>>> - nashorn/test/:tier1
>>>> - nashorn/test/:tier2
>>>> - nashorn/test/:tier3
>>>> Only without -Xcomp:
>>>> - Kitchensink
>>>> - runThese
>>>> - Weblogic12medrec
>>>> [2] test/compiler/intrinsics/6982370/Test6982370.java
>>>> [3] test/compiler/rangechecks/TestExplicitRangeChecks.java
>>>>

From zoltan.majo at oracle.com  Mon Jan 18 07:44:42 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Mon, 18 Jan 2016 08:44:42 +0100
Subject: [9] RFR (XS): 8147441: unchecked pending exceptions in the
	WhiteBox API's implementation
In-Reply-To: <56993628.1060702@oracle.com>
References: <5698B114.4020106@oracle.com> <56993628.1060702@oracle.com>
Message-ID: <569C97EA.6020409@oracle.com>

Thank you, Vladimir, for the review!

Best regards,


Zoltan

On 01/15/2016 07:10 PM, Vladimir Kozlov wrote:
> Seems fine.
>
> Thanks,
> Vladimir
>
> On 1/15/16 12:43 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the patch for 8147441.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8147441
>>
>> Problem: The method codeBlob2objectArray is used by the implementation
>> of the WB API to fill in an object array with information about a code
>> blob. Although the codeBlob2objectArray method can cause various JNI
>> exceptions, there are two code locations where the VM does not check for
>> exceptions after codeBlob2objectArray returns.
>>
>> Solution: Add exception check to the above mentioned code locations.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~zmajo/8147441/webrev.00/
>>
>> Testing:
>> - JPRT;
>> - all hotspot tests executed locally; all tests that pass with the
>> default version pass with the fixed version as well.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>


From vladimir.x.ivanov at oracle.com  Mon Jan 18 12:54:48 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 18 Jan 2016 15:54:48 +0300
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause target
	method to always run in interpreter mode
Message-ID: <569CE098.4030807@oracle.com>

http://cr.openjdk.java.net/~vlivanov/7177745/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-7177745

JVM aggressively inlines through CallSites, even for mutable and 
volatile flavors. It's the key optimization for making invokedynamic 
performant.

When a CallSite.target is updated, the JVM invalidates all affected nmethods
and tries to recompile them later. If a call site target changes regularly,
the JVM will eventually (after PerMethodRecompilationCutoff invalidations)
mark all hot methods which have the call site bound as non-compilable.
This leads to a significant reduction in peak performance, because all
affected methods will be executed in interpreter mode from then on.

The fix is to avoid updating recompilation count when corresponding 
nmethod is invalidated due to a call site target change.

I filed a separate RFE (JDK-8147550 [1]) to consider slow non-inlined 
code shape for unstable call sites, as John suggested [2].

Testing: regression test, octane, JPRT.

Thanks!

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8147550
[2] 
https://bugs.openjdk.java.net/browse/JDK-7177745?focusedCommentId=13821545&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13821545
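The relinking pattern that triggers the problem can be illustrated with a small program like the one below. It is a sketch under my own assumptions, not the regression test from the webrev: the class and method names are invented, and it only demonstrates a MutableCallSite whose target is flipped repeatedly while a hot loop calls through its dynamic invoker. It shows nothing about recompilation counts by itself; for that one would watch compiler output such as -XX:+PrintCompilation.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical demo of a call site whose target changes regularly.
final class RelinkProbe {
    static int one() { return 1; }
    static int two() { return 2; }

    // Alternates the call site target 'relinks' times and calls through it
    // 'callsPerLink' times after each change; returns the sum of the results.
    static long run(int relinks, int callsPerLink) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class);
            MethodHandle one = lookup.findStatic(RelinkProbe.class, "one", type);
            MethodHandle two = lookup.findStatic(RelinkProbe.class, "two", type);

            MutableCallSite site = new MutableCallSite(one);
            MethodHandle invoker = site.dynamicInvoker();

            long sum = 0;
            for (int i = 0; i < relinks; i++) {
                // Each setTarget() is a relinkage of the call site.
                site.setTarget((i % 2 == 0) ? one : two);
                for (int j = 0; j < callsPerLink; j++) {
                    sum += (int) invoker.invokeExact();
                }
            }
            return sum;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(1_000, 20_000));
    }
}
```

With enough relinkages, a JVM without the fix would be expected to stop compiling the method containing the inlined call site, which should show up as a drop in throughput for the later iterations.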

From aleksey.shipilev at oracle.com  Mon Jan 18 13:15:58 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 18 Jan 2016 16:15:58 +0300
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <569CE098.4030807@oracle.com>
References: <569CE098.4030807@oracle.com>
Message-ID: <569CE58E.6030805@oracle.com>

On 18.01.2016 15:54, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/7177745/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-7177745

Finally.

Cheers,
-Aleksey


From roland.schatz at oracle.com  Mon Jan 18 16:40:52 2016
From: roland.schatz at oracle.com (Roland Schatz)
Date: Mon, 18 Jan 2016 17:40:52 +0100
Subject: RFR(XS): 8147564: [JVMCI] remove unused method
	CodeCacheProvider.needsDataPatch
Message-ID: <569D1594.6060502@oracle.com>

Hi,

Please review this small patch:
webrev: http://cr.openjdk.java.net/~rschatz/JDK-8147564/webrev.00/
jira: https://bugs.openjdk.java.net/browse/JDK-8147564

The removed method always returned false, because there is no class 
implementing both the JavaConstant and the HotSpotMetaspaceConstant 
interfaces.

Thanks,
Roland

From roland.westrelin at oracle.com  Mon Jan 18 20:39:50 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 18 Jan 2016 21:39:50 +0100
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
Message-ID: <E5EC34C9-6BF9-442A-A41F-1599D14854C8@oracle.com>

> http://cr.openjdk.java.net/~roland/8146999/webrev.00/

Further testing revealed this bug was hiding another one.

http://cr.openjdk.java.net/~roland/8146999/webrev.01/

In PhiNode::unique_input(), uncast() could step over the CheckCastPP node that follows an Allocation. If that happens, a new CheckCastPP will be created and will replace the CheckCastPP of the allocation but with a different control and it could cause a safepoint to be in between the allocation and the CheckCastPP and assert failures with "there should be a oop in OopMap instead of a live raw oop at safepoint".

Roland.

From vladimir.kozlov at oracle.com  Mon Jan 18 23:07:57 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 18 Jan 2016 15:07:57 -0800
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <E5EC34C9-6BF9-442A-A41F-1599D14854C8@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
	<E5EC34C9-6BF9-442A-A41F-1599D14854C8@oracle.com>
Message-ID: <569D704D.6080908@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/18/16 12:39 PM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8146999/webrev.00/
>
> Further testing revealed this bug was hiding another one.
>
> http://cr.openjdk.java.net/~roland/8146999/webrev.01/
>
> In PhiNode::unique_input(), uncast() could step over the CheckCastPP node that follows an Allocation. If that happens, a new CheckCastPP will be created and will replace the CheckCastPP of the allocation but with a different control and it could cause a safepoint to be in between the allocation and the CheckCastPP and assert failures with "there should be a oop in OopMap instead of a live raw oop at safepoint".
>
> Roland.
>

From tobias.hartmann at oracle.com  Tue Jan 19 08:18:04 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 19 Jan 2016 09:18:04 +0100
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <E5EC34C9-6BF9-442A-A41F-1599D14854C8@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
	<E5EC34C9-6BF9-442A-A41F-1599D14854C8@oracle.com>
Message-ID: <569DF13C.8020700@oracle.com>

Hi Roland,

this looks good to me but please make sure that you merge your node.hpp changes with my 6675699 changes:
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/diff/bfb7a8a004de/src/share/vm/opto/node.hpp

Best,
Tobias

On 18.01.2016 21:39, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8146999/webrev.00/
> 
> Further testing revealed this bug was hiding another one.
> 
> http://cr.openjdk.java.net/~roland/8146999/webrev.01/
> 
> In PhiNode::unique_input(), uncast() could step over the CheckCastPP node that follows an Allocation. If that happens, a new CheckCastPP will be created and will replace the CheckCastPP of the allocation but with a different control and it could cause a safepoint to be in between the allocation and the CheckCastPP and assert failures with "there should be a oop in OopMap instead of a live raw oop at safepoint".
> 
> Roland.
> 

From roland.westrelin at oracle.com  Tue Jan 19 10:09:57 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 19 Jan 2016 11:09:57 +0100
Subject: RFR(XS): 8146999:
	hotspot/test/compiler/c2/8007294/Test8007294.java test nightly failure
In-Reply-To: <569DF13C.8020700@oracle.com>
References: <CF202F67-5FEC-4CF4-8657-BBC5EA8E665C@oracle.com>
	<E5EC34C9-6BF9-442A-A41F-1599D14854C8@oracle.com>
	<569DF13C.8020700@oracle.com>
Message-ID: <2FCABBE1-182F-4608-A79E-7E62ADF9B7EF@oracle.com>

Thanks Vladimir & Tobias for the review.

Roland.

From andreas.eriksson at oracle.com  Tue Jan 19 12:32:30 2016
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Tue, 19 Jan 2016 13:32:30 +0100
Subject: RFR(S): 8146096: [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
Message-ID: <569E2CDE.3060805@oracle.com>

Hi,

Can I please have a review for the removal of 
hotspot/test/compiler/loopopts/UseCountedLoopSafepoints.java.

The test needs to do a loop that takes more than two seconds to execute 
fully without doing a safepointing call. For this, expensive atomic 
operations were used. The problem is that on certain embedded platforms 
they are too expensive, and the test times out.
The loop length could probably be reduced, and it should still work on 
faster machines. However, the test is not very useful, so I think it's 
better to just remove it to avoid future problems.

Bug: https://bugs.openjdk.java.net/browse/JDK-8146096
Test to be removed: 
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/d84a55e7aaf8/test/compiler/loopopts/UseCountedLoopSafepoints.java
(I can create a webrev if you think it necessary.)

Thanks,
Andreas

From vladimir.x.ivanov at oracle.com  Tue Jan 19 12:50:17 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 19 Jan 2016 15:50:17 +0300
Subject: RFR(S): 8146096: [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <569E2CDE.3060805@oracle.com>
References: <569E2CDE.3060805@oracle.com>
Message-ID: <569E3109.8090107@oracle.com>

As an idea to improve the test: spawn a thread which executes the 
counted loop and then use WhiteBox.forceSafepoint() to trigger a safepoint.

If the test times out, it means there's no safepoint in the loop.

It also simplifies the implementation - no need to spawn a child 
process, the check can be done in-process.

Best regards,
Vladimir Ivanov

On 1/19/16 3:32 PM, Andreas Eriksson wrote:
> Hi,
>
> Can I please have a review for the removal of
> hotspot/test/compiler/loopopts/UseCountedLoopSafepoints.java.
>
> The test needs to do a loop that takes more than two seconds to execute
> fully without doing a safepointing call. For this, expensive atomic
> operations were used. The problem is that on certain embedded platforms
> they are too expensive, and the test times out.
> The loop length could probably be reduced, and it should still work on
> faster machines. However, the test is not very useful, so I think it's
> better to just remove it to avoid future problems.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8146096
> Test to be removed:
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/d84a55e7aaf8/test/compiler/loopopts/UseCountedLoopSafepoints.java
>
> (I can create a webrev if you think it necessary.)
>
> Thanks,
> Andreas

From roland.westrelin at oracle.com  Tue Jan 19 15:22:35 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 19 Jan 2016 16:22:35 +0100
Subject: Request for Reviews (S): JDK-8003585 strength reduce or eliminate
	range checks for power-of-two sized arrays
In-Reply-To: <5697E923.6000908@oracle.com>
References: <CA+cQ+tR5c3y3uLpdO2Gop-RA0ZPJ4Mmq9yAAK5dOpFtgv-XPcg@mail.gmail.com>
	<440F2280-4B25-4AE6-A4F6-DDD4EB529636@oracle.com>
	<52FC129D.7040409@oracle.com>
	<CA+cQ+tTzA822citeYM0zCNE3FDyBk9hTO2h2i6POiJWSar6hVA@mail.gmail.com>
	<CA+cQ+tTWkkJHwcvJ4R=f31ve9vRwG5TfizHfo56sO9R3p3sRcQ@mail.gmail.com>
	<52FE6A08.20400@oracle.com>
	<CA+cQ+tQ00O1bT0JAAX9Lk+hPrfLV0PgQoiO1wmEUbKrPrqi1kA@mail.gmail.com>
	<52FE7313.3060404@oracle.com> <530209A8.1020501@oracle.com>
	<DC638183-2429-4FD2-B455-F87C0FD7EBFE@oracle.com>
	<CA+cQ+tQ2FoMhSj_B-Q=zgv3sGtJo9HojhWDsEWkx-UThc+mqEA@mail.gmail.com>
	<38EE6922-0B9C-49A6-B54D-E78BA0EFECB1@oracle.com>
	<B42C8B5D-5C7E-4FB9-A519-0B319F89018C@oracle.com>
	<CA+cQ+tQKvH9dyymvOEO6YFrCv2A1+hTo6bwMg0km8hx4G2jf4w@mail.gmail.com>
	<8232A81B-6B78-4F61-A8EC-1A3DF3938648@oracle.com>
	<DB5DF00F-CD36-41D2-8E1D-0958E70E6D1D@oracle.com>
	<70FBA4CF-CF05-4232-AFEC-202E93BFA930@oracle.com>
	<5697E923.6000908@oracle.com>
Message-ID: <0317CD9D-F104-4AFE-BB75-3966C8DF8421@oracle.com>

Thanks for taking another look at this, Vladimir.

> I know it is duplication, but CmpU creation should be under the conditions; otherwise you are creating and transforming a dead node.
> 
> +     Node* ncmp = phase->transform(new CmpUNode(cmp1, cmp2));
> +     if (_test._test == BoolTest::le || _test._test == BoolTest::eq) {
> 
> The test does not cover the following conversions:
> 
> +   // Change (arraylength <= 0) or (arraylength == 0)
> +   //   into (arraylength u<= 0)
> +   // Also change (arraylength != 0) into (arraylength u> 0)

Here is a new webrev:

http://cr.openjdk.java.net/~roland/8003585/webrev.02/

Roland.
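
As a hedged illustration of the conversions listed in the quoted comment, the sketch below checks that the signed and unsigned forms agree whenever the length is non-negative, which is the property the rewrite relies on. The helper names u_le, u_gt and conversions_agree are made up for this example and are not C2 code.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned comparisons on 32-bit values: reinterpret the bits as unsigned,
// which is what a CmpU-style comparison does. (Hypothetical helpers.)
constexpr bool u_le(int32_t a, int32_t b) {
    return static_cast<uint32_t>(a) <= static_cast<uint32_t>(b);
}
constexpr bool u_gt(int32_t a, int32_t b) {
    return static_cast<uint32_t>(a) > static_cast<uint32_t>(b);
}

// Returns true if, for this array length, the signed tests and their
// unsigned replacements give the same answer.
bool conversions_agree(int32_t len) {
    assert(len >= 0);  // array lengths are never negative
    bool le_ok = (len <= 0) == u_le(len, 0);  // (len <= 0) -> (len u<= 0)
    bool eq_ok = (len == 0) == u_le(len, 0);  // (len == 0) -> (len u<= 0)
    bool ne_ok = (len != 0) == u_gt(len, 0);  // (len != 0) -> (len u>  0)
    return le_ok && eq_ok && ne_ok;
}
```

The equivalence depends on the non-negative range: for a value such as -1 the signed test (-1 <= 0) holds while u_le(-1, 0) does not, so the rewrite is only safe for something like an array length.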

> 
> Thanks,
> Vladimir
> 
> On 1/7/16 1:29 AM, Roland Westrelin wrote:
>> Can I get a review for this?
>> 
>> Roland.
>> 
>>> On Oct 5, 2015, at 12:51 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>> 
>>> Here is a new webrev:
>>> 
>>> http://cr.openjdk.java.net/~roland/8003585/webrev.01/
>>> 
>>> Roland.
>>> 
>>>> On Oct 2, 2015, at 3:30 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>> 
>>>> Hi Chris,
>>>> 
>>>>> Thanks for picking it up! It mostly looks good to me. (Not a Reviewer)
>>>> 
>>>> Thanks for looking at this again.
>>>> 
>>>>> What I really needed with my earlier webrev was some instructions as to what test to write -- since the Java corelibs can come across this optimization a lot (e.g. HashMap), I didn't have a good idea of what kind of test really needs to be written.
>>>>> 
>>>>> A couple of issues with this webrev:
>>>>> 
>>>>> 1. In subnode.cpp, line 1346:
>>>>> 
>>>>> 1344     } else if (_test._test == BoolTest::lt &&
>>>>> 1345                cmp2->Opcode() == Op_AddI &&
>>>>> 1346                cmp2->in(2)->find_int_con(1)) {
>>>>> 1347       bound = cmp2->in(1);
>>>>> 1348     }
>>>>> 
>>>>> I think it should be
>>>>> cmp2->in(2)->find_int_con(0) == 1
>>>>> instead, because the value passed into this function is actually for a "fallback when no int constant is found". Passing the expected value (1) to it defeats the purpose.
>>>> 
>>>> You're right. Thanks for spotting that.
>>>> 
>>>>> jint find_int_con(jint value_if_unknown) const {
>>>>>   const TypeInt* t = find_int_type();
>>>>>   return (t != NULL && t->is_con()) ? t->get_con() : value_if_unknown;
>>>>> }
>>>>> 
>>>>> 2. Formatting nitpick: could you please trim the spaces before the new's on lines 1368, 1369 and 1387
>>>> 
>>>> Sure.
>>>> 
>>>> I'll send an updated webrev.
>>>> 
>>>> Roland.
>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Kris (OpenJDK username: krismo)
>>>>> 
>>>>> On Wed, Sep 30, 2015 at 1:34 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>>> I'm picking that one up. Here is a new webrev:
>>>>> 
>>>>> http://cr.openjdk.java.net/~roland/8003585/webrev.00/
>>>>> 
>>>>> The only change to c2 compared to the previous webrev is that ((x & m) u< m+1) is optimized the same way ((x & m) u<= m) is. Actually, I don't think that C2 currently produces the ((x & m) u<= m) shape. The IfNode::fold_compares() logic produces the ((x & m) u< m+1) variant. I also added a test case to check the validity of the transformations and ran usual testing on the change.
>>>>> 
>>>>> Roland.
>>> 
>> 
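
The find_int_con() point quoted above can be shown in isolation. This is only a sketch: FakeNode and the two predicate functions are hypothetical stand-ins, and only the fallback contract of find_int_con() is taken from the quoted source.

```cpp
#include <cassert>

// Hypothetical stand-in for a node input that may or may not be a known
// int constant.
struct FakeNode {
    bool is_con;  // true if the value below is a known constant
    int  value;

    // Same contract as the quoted find_int_con(): return the constant if it
    // is known, otherwise return the caller-supplied fallback.
    int find_int_con(int value_if_unknown) const {
        return is_con ? value : value_if_unknown;
    }
};

// Form from the first webrev: the result is used as a truth value, so an
// unknown input (fallback 1) and any non-zero constant both pass.
bool is_plus_one_buggy(const FakeNode& n) {
    return n.find_int_con(1) != 0;
}

// Suggested form: use a fallback that can never equal the expected value
// and compare explicitly against 1.
bool is_plus_one_fixed(const FakeNode& n) {
    return n.find_int_con(0) == 1;
}

void demo_find_int_con() {
    FakeNode unknown = {false, 0};
    FakeNode one     = {true, 1};
    FakeNode five    = {true, 5};
    assert(is_plus_one_buggy(unknown) && !is_plus_one_fixed(unknown));
    assert(is_plus_one_buggy(five)    && !is_plus_one_fixed(five));
    assert(is_plus_one_buggy(one)     &&  is_plus_one_fixed(one));
}
```

With the fallback set to the expected value, "no constant found" is indistinguishable from "constant 1 found", which is why the explicit comparison against a different fallback is needed.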


From roland.westrelin at oracle.com  Tue Jan 19 17:06:35 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 19 Jan 2016 18:06:35 +0100
Subject: RFR(XS): 8147386: assert(size == calc_size) failed: incorrect size
	calculattion x86_32.ad
Message-ID: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>

http://cr.openjdk.java.net/~roland/8147386/webrev.00/

src_offset/dst_offset are incremented in the size computation code and then used if cbuf is not null but now have the wrong value.

Roland.
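
A minimal sketch of the kind of bug described above, under stated assumptions: the emitter, the Move record and the 4-byte step are invented for illustration and do not reproduce the x86_32.ad code. The `out` pointer plays the role of cbuf, and the buggy variant advances the offsets while computing the size, so the emit pass starts from the wrong values.

```cpp
#include <cassert>
#include <vector>

struct Move { int src; int dst; };  // hypothetical record of one emitted move

const int kStep = 4;  // assumed bytes per move, for illustration only

// Buggy shape: the size pass mutates src_offset/dst_offset, and the emit
// pass then reuses the already advanced values.
int emit_copy_buggy(std::vector<Move>* out, int src_offset, int dst_offset, int words) {
    int size = 0;
    for (int i = 0; i < words; ++i) {
        size += kStep;
        src_offset += kStep;
        dst_offset += kStep;
    }
    if (out != nullptr) {
        for (int i = 0; i < words; ++i) {
            out->push_back(Move{src_offset, dst_offset});
            src_offset += kStep;
            dst_offset += kStep;
        }
    }
    return size;
}

// Fixed shape: the size is computed without touching the offsets that the
// emit pass depends on.
int emit_copy_fixed(std::vector<Move>* out, int src_offset, int dst_offset, int words) {
    int size = words * kStep;
    if (out != nullptr) {
        for (int i = 0; i < words; ++i) {
            out->push_back(Move{src_offset + i * kStep, dst_offset + i * kStep});
        }
    }
    return size;
}

void demo_emit_copy() {
    std::vector<Move> bad, good;
    assert(emit_copy_buggy(&bad, 0, 0, 2) == emit_copy_fixed(&good, 0, 0, 2));
    assert(bad[0].src == 8);   // starts two steps too far
    assert(good[0].src == 0);  // starts at the requested offset
}
```

Both variants report the same size, so a bug of this shape only shows up when code is actually emitted.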

From vladimir.x.ivanov at oracle.com  Tue Jan 19 17:59:01 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 19 Jan 2016 20:59:01 +0300
Subject: RFR(XS): 8147386: assert(size == calc_size) failed: incorrect
	size calculattion x86_32.ad
In-Reply-To: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>
References: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>
Message-ID: <569E7965.9080800@oracle.com>

Looks good.

Best regards,
Vladimir Ivanov

On 1/19/16 8:06 PM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8147386/webrev.00/
>
> src_offset/dst_offset are incremented in the size computation code and then used if cbuf is not null but now have the wrong value.
>
> Roland.
>

From vladimir.kozlov at oracle.com  Tue Jan 19 18:20:58 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 19 Jan 2016 10:20:58 -0800
Subject: RFR(XS): 8147386: assert(size == calc_size) failed: incorrect
	size calculattion x86_32.ad
In-Reply-To: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>
References: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>
Message-ID: <569E7E8A.5080809@oracle.com>

Good.

Thanks,
Vladimir

On 1/19/16 9:06 AM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8147386/webrev.00/
>
> src_offset/dst_offset are incremented in the size computation code and then used if cbuf is not null but now have the wrong value.
>
> Roland.
>

From vladimir.kozlov at oracle.com  Tue Jan 19 18:30:34 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 19 Jan 2016 10:30:34 -0800
Subject: [9] RFR (XS) 8146119: java/lang/Math/PowTests.java fails on
	solaris-x64 using -Xcomp
In-Reply-To: <567B0CD6.2070107@oracle.com>
References: <567B0B90.7000004@oracle.com>
	<BC90B691-CBEF-4706-BB64-4636E48A430B@oracle.com>
	<567B0CD6.2070107@oracle.com>
Message-ID: <569E80CA.6080806@oracle.com>

I am pushing this change since we decided to keep 12.4SS C++ update.

Thanks,
Vladimir

On 12/23/15 1:06 PM, Vladimir Kozlov wrote:
> Thanks!
>
> On 12/23/15 1:05 PM, Christian Thalinger wrote:
>> Unfortunate but looks good.
>>
>>> On Dec 23, 2015, at 11:01 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8146119
>>>
>>> http://cr.openjdk.java.net/~kvn/8146119/webrev/
>>>
>>> New SunStudio C++ compiler generates incorrect code in library_call.cpp. All build versions are affected.
>>> It also failed at the -xO0 level, so I removed any optimizations.
>>>
>>> Tested with failed test.
>>>
>>> Thanks,
>>> Vladimir
>>

From volker.simonis at gmail.com  Tue Jan 19 18:57:06 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 19 Jan 2016 19:57:06 +0100
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after CompactStrings
	change
Message-ID: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>

Hi,

can somebody please review and sponsor this change.

Despite the bug summary, I still had to do some small shared changes
to make this work, so unfortunately I can not push this on my own.

The change also affects aarch64 (although it is minimal and I don't
expect it to break anything) so I cc-ed aarch64-port-dev.

http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
https://bugs.openjdk.java.net/browse/JDK-8145336

As described in the bug, this change only fixes the string intrinsics
for the -XX:-UseCompactStrings mode which is still the default on
ppc64. Additionally, support for the new StrIndexOfChar intrinsic was
added because we already had a similar intrinsic for constant string
needles of length one anyway. A later change (which we're already
working on) will add the intrinsics which can handle compact strings.

The current intrinsics can handle both, the new byte-array based
string representation as well as the old char-array based string
representation because we internally still use the new hotspot with
older versions of the class libraries.

I've also ported some of our internal string tests into a new
regression test (TestStringIntrinsics2.java) because the existing
tests didn't exercise all of our intrinsics.

These are the shared changes I had to make:

Until now, UseSSE42Intrinsics was a global shared option which was
used to control the availability of the stringIndexOf intrinsics. But
UseSSE42Intrinsics is actually an x86-specific feature so it doesn't
make a lot of sense to define it for other architectures. I've
therefore moved the flag to globals_x86.hpp and changed the condition
which checks for the availability of the stringIndexOf intrinsics from:

if (!Matcher::has_match_rule(Op_StrIndexOf) || !UseSSE42Intrinsics) {

to:

if (!Matcher::match_rule_supported(Op_StrIndexOf)) {

The Matcher::match_rule_supported() method already calls
Matcher::has_match_rule() anyway. And it is implemented in the .ad
file so I've moved the check for UseSSE42Intrinsics into x86.ad. Other
platforms can now decide in their .ad file if they unconditionally
support the intrinsic or if they need a special feature check. This
change was already briefly discussed in [1].
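
The division of responsibility described above could be sketched roughly as follows. This is a simplified model, not HotSpot code: the Platform struct, its is_x86 field and can_intrinsify_index_of() are hypothetical, and only the names has_match_rule, match_rule_supported, Op_StrIndexOf and the UseSSE42Intrinsics flag come from the text.

```cpp
#include <cassert>

enum Opcode { Op_StrIndexOf, Op_StrEquals, Op_Count };

// Hypothetical model of one platform's matcher configuration.
struct Platform {
    bool has_rule[Op_Count];    // is there a match rule for the opcode?
    bool is_x86;                // assumed field, for illustration only
    bool use_sse42_intrinsics;  // models the x86-only flag

    bool has_match_rule(Opcode op) const { return has_rule[op]; }

    // Platform-specific hook: it already consults has_match_rule(), and the
    // x86 variant additionally requires the SSE4.2 flag for StrIndexOf.
    bool match_rule_supported(Opcode op) const {
        if (!has_match_rule(op)) return false;
        if (is_x86 && op == Op_StrIndexOf && !use_sse42_intrinsics) return false;
        return true;
    }
};

// Shared code asks a single question and no longer mentions the x86 flag.
bool can_intrinsify_index_of(const Platform& p) {
    return p.match_rule_supported(Op_StrIndexOf);
}

void demo_match_rule_supported() {
    Platform x86_without_sse42 = {{true, true}, true, false};
    Platform ppc64             = {{true, true}, false, false};
    assert(!can_intrinsify_index_of(x86_without_sse42));
    assert(can_intrinsify_index_of(ppc64));
}
```

The point of the design is that shared code stays free of architecture-specific flags, while each platform decides in its own hook whether extra feature checks apply.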

The other shared change I had to make was in
LibraryCallKit::make_string_method_node() for the "Op_StrEquals" case.
We have optimized intrinsics for the case that one of the strings to
compare is constant, but the  StrEqualsNode is constructed without
taking into account that one of the string length values could be a
constant. This prevented our optimized instruction from being matched
in the ad-file.

All the other changes are ppc-specific.

Thank you and best regards,
Volker


[1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/thread.html#20400

From michael.c.berg at intel.com  Tue Jan 19 19:07:35 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Tue, 19 Jan 2016 19:07:35 +0000
Subject: RFR(XS): 8147386: assert(size == calc_size) failed: incorrect
	size calculattion x86_32.ad
In-Reply-To: <569E7E8A.5080809@oracle.com>
References: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>
	<569E7E8A.5080809@oracle.com>
Message-ID: <C568518E7B433348B114B6A7122D474756DBEFAE@FMSMSX102.amr.corp.intel.com>

Looks ok.

-Michael

-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
Sent: Tuesday, January 19, 2016 10:21 AM
To: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(XS): 8147386: assert(size == calc_size) failed: incorrect size calculattion x86_32.ad

Good.

Thanks,
Vladimir

On 1/19/16 9:06 AM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8147386/webrev.00/
>
> src_offset/dst_offset are incremented in the size computation code and then used if cbuf is not null but now have the wrong value.
>
> Roland.
>

From tom.rodriguez at oracle.com  Tue Jan 19 19:32:39 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Tue, 19 Jan 2016 11:32:39 -0800
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
Message-ID: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>

http://cr.openjdk.java.net/~never/8147432/webrev/index.html
https://bugs.openjdk.java.net/browse/JDK-8147432

Currently JVMCI compiles either produce code or they don't but nothing is reported for failures.  This adds a new CompilationRequestResult object that can return a human readable message to be included in the normal "COMPILE SKIPPED" style message.  I've refactored the printing so it's shared between compiles.  The result can also include the number of inlined byte codes for use by things like CITimeEach.  Additionally I removed the CompilationToVM.notifyCompilationStatistics as this was apparently a left over.  Tested with specjvm and PrintCompilation which has a few OSR bailouts plus injecting some exceptions to make sure they were reported correctly.

tom

From christian.thalinger at oracle.com  Tue Jan 19 19:58:36 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 09:58:36 -1000
Subject: RFR(XS): 8147564: [JVMCI] remove unused method
	CodeCacheProvider.needsDataPatch
In-Reply-To: <569D1594.6060502@oracle.com>
References: <569D1594.6060502@oracle.com>
Message-ID: <61268260-8FD9-43DA-93F1-866C3D769E10@oracle.com>

Looks good.

> On Jan 18, 2016, at 6:40 AM, Roland Schatz <roland.schatz at oracle.com> wrote:
> 
> Hi,
> 
> Please review this small patch:
> webrev: http://cr.openjdk.java.net/~rschatz/JDK-8147564/webrev.00/
> jira: https://bugs.openjdk.java.net/browse/JDK-8147564
> 
> The removed method always returned false, because there is no class implementing both the JavaConstant and the HotSpotMetaspaceConstant interfaces.
> 
> Thanks,
> Roland


From vladimir.kozlov at oracle.com  Tue Jan 19 20:02:07 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 19 Jan 2016 12:02:07 -0800
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
Message-ID: <569E963F.8060901@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/19/16 11:32 AM, Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/8147432/webrev/index.html
> https://bugs.openjdk.java.net/browse/JDK-8147432
>
> Currently JVMCI compiles either produce code or they don't but nothing is reported for failures.  This adds a new
> CompilationRequestResult object that can return a human readable message to be included in the normal "COMPILE SKIPPED"
> style message.  I've refactored the printing so it's shared between compiles.  The result can also include the number of
> inlined byte codes for use by things like CITimeEach.  Additionally I removed the
> CompilationToVM.notifyCompilationStatistics as this was apparently a left over.  Tested with specjvm and
> PrintCompilation which has a few OSR bailouts plus injecting some exceptions to make sure they were reported correctly.
>
> tom

From vladimir.kozlov at oracle.com  Tue Jan 19 20:04:41 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 19 Jan 2016 12:04:41 -0800
Subject: Request for Reviews (S): JDK-8003585 strength reduce or eliminate
	range checks for power-of-two sized arrays
In-Reply-To: <0317CD9D-F104-4AFE-BB75-3966C8DF8421@oracle.com>
References: <CA+cQ+tR5c3y3uLpdO2Gop-RA0ZPJ4Mmq9yAAK5dOpFtgv-XPcg@mail.gmail.com>
	<440F2280-4B25-4AE6-A4F6-DDD4EB529636@oracle.com>
	<52FC129D.7040409@oracle.com>
	<CA+cQ+tTzA822citeYM0zCNE3FDyBk9hTO2h2i6POiJWSar6hVA@mail.gmail.com>
	<CA+cQ+tTWkkJHwcvJ4R=f31ve9vRwG5TfizHfo56sO9R3p3sRcQ@mail.gmail.com>
	<52FE6A08.20400@oracle.com>
	<CA+cQ+tQ00O1bT0JAAX9Lk+hPrfLV0PgQoiO1wmEUbKrPrqi1kA@mail.gmail.com>
	<52FE7313.3060404@oracle.com> <530209A8.1020501@oracle.com>
	<DC638183-2429-4FD2-B455-F87C0FD7EBFE@oracle.com>
	<CA+cQ+tQ2FoMhSj_B-Q=zgv3sGtJo9HojhWDsEWkx-UThc+mqEA@mail.gmail.com>
	<38EE6922-0B9C-49A6-B54D-E78BA0EFECB1@oracle.com>
	<B42C8B5D-5C7E-4FB9-A519-0B319F89018C@oracle.com>
	<CA+cQ+tQKvH9dyymvOEO6YFrCv2A1+hTo6bwMg0km8hx4G2jf4w@mail.gmail.com>
	<8232A81B-6B78-4F61-A8EC-1A3DF3938648@oracle.com>
	<DB5DF00F-CD36-41D2-8E1D-0958E70E6D1D@oracle.com>
	<70FBA4CF-CF05-4232-AFEC-202E93BFA930@oracle.com>
	<5697E923.6000908@oracle.com>
	<0317CD9D-F104-4AFE-BB75-3966C8DF8421@oracle.com>
Message-ID: <569E96D9.1070909@oracle.com>

Thanks! Looks good.

Vladimir

On 1/19/16 7:22 AM, Roland Westrelin wrote:
> Thanks for taking another look at this, Vladimir.
>
>> I know it is duplication, but CmpU creation should be under the conditions; otherwise you are creating and transforming a dead node.
>>
>> +     Node* ncmp = phase->transform(new CmpUNode(cmp1, cmp2));
>> +     if (_test._test == BoolTest::le || _test._test == BoolTest::eq) {
>>
>> The test does not cover the following conversions:
>>
>> +   // Change (arraylength <= 0) or (arraylength == 0)
>> +   //   into (arraylength u<= 0)
>> +   // Also change (arraylength != 0) into (arraylength u> 0)
>
> Here is a new webrev:
>
> http://cr.openjdk.java.net/~roland/8003585/webrev.02/
>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/7/16 1:29 AM, Roland Westrelin wrote:
>>> Can I get a review for this?
>>>
>>> Roland.
>>>
>>>> On Oct 5, 2015, at 12:51 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>>
>>>> Here is a new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~roland/8003585/webrev.01/
>>>>
>>>> Roland.
>>>>
>>>>> On Oct 2, 2015, at 3:30 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>>> Thanks for picking it up! It mostly looks good to me. (Not a Reviewer)
>>>>>
>>>>> Thanks for looking at this again.
>>>>>
>>>>>> What I really needed with my earlier webrev was some instructions as to what test to write -- since the Java corelibs can come across this optimization a lot (e.g. HashMap), I didn't have a good idea of what kind of test really needs to be written.
>>>>>>
>>>>>> A couple of issues with this webrev:
>>>>>>
>>>>>> 1. In subnode.cpp, line 1346:
>>>>>>
>>>>>> 1344     } else if (_test._test == BoolTest::lt &&
>>>>>> 1345                cmp2->Opcode() == Op_AddI &&
>>>>>> 1346                cmp2->in(2)->find_int_con(1)) {
>>>>>> 1347       bound = cmp2->in(1);
>>>>>> 1348     }
>>>>>>
>>>>>> I think it should be
>>>>>> cmp2->in(2)->find_int_con(0) == 1
>>>>>> instead, because the value passed into this function is actually for a "fallback when no int constant is found". Passing the expected value (1) to it defeats the purpose.
>>>>>
>>>>> You're right. Thanks for spotting that.
>>>>>
>>>>>> jint find_int_con(jint value_if_unknown) const {
>>>>>>    const TypeInt* t = find_int_type();
>>>>>>    return (t != NULL && t->is_con()) ? t->get_con() : value_if_unknown;
>>>>>> }
>>>>>>
>>>>>> 2. Formatting nitpick: could you please trim the spaces before the new's on lines 1368, 1369 and 1387
>>>>>
>>>>> Sure.
>>>>>
>>>>> I'll send an updated webrev.
>>>>>
>>>>> Roland.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Kris (OpenJDK username: krismo)
>>>>>>
>>>>>> On Wed, Sep 30, 2015 at 1:34 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>>>>> I'm picking that one up. Here is a new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~roland/8003585/webrev.00/
>>>>>>
>>>>>> The only change to c2 compared to the previous webrev is that ((x & m) u< m+1) is optimized the same way ((x & m) u<= m) is. Actually, I don't think that C2 currently produces the ((x & m) u<= m) shape. The IfNode::fold_compares() logic produces the ((x & m) u< m+1) variant. I also added a test case to check the validity of the transformations and ran usual testing on the change.
>>>>>>
>>>>>> Roland.
>>>>
>>>
>
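
As a worked check of the two shapes mentioned in the quoted thread, ((x & m) u<= m) and ((x & m) u< m+1), the sketch below evaluates both in unsigned 32-bit arithmetic. It assumes m is non-negative, as it would be for a mask derived from the length of a non-empty power-of-two sized array; the function names are hypothetical and this is not the C2 transformation itself.

```cpp
#include <cassert>
#include <cstdint>

// (x & m) u<= m, evaluated on the unsigned bit patterns.
constexpr bool masked_le_mask(int32_t x, int32_t m) {
    return (static_cast<uint32_t>(x) & static_cast<uint32_t>(m))
           <= static_cast<uint32_t>(m);
}

// (x & m) u< m+1, with the increment done in unsigned arithmetic so that
// it cannot overflow a signed int.
constexpr bool masked_lt_mask_plus_one(int32_t x, int32_t m) {
    return (static_cast<uint32_t>(x) & static_cast<uint32_t>(m))
           < static_cast<uint32_t>(m) + 1u;
}

// For m >= 0, masking can only clear bits of m, so 0 <= (x & m) <= m and
// both comparisons hold for every x. Returns true if both hold here.
bool range_check_is_redundant(int32_t x, int32_t m) {
    assert(m >= 0);  // assumption stated in the lead-in
    return masked_le_mask(x, m) && masked_lt_mask_plus_one(x, m);
}
```

An index computed as hash & (length - 1), the pattern the thread associates with HashMap, falls under this case whenever length is at least 1.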

From christian.thalinger at oracle.com  Tue Jan 19 20:12:31 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 10:12:31 -1000
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
Message-ID: <31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>

src/share/vm/compiler/compileBroker.cpp:

+      failure_reason = ci_env.failure_reason();
+      retry_message = ci_env.retry_message();
       ci_env.report_failure(ci_env.failure_reason());

Why not use failure_reason?

src/share/vm/jvmci/jvmciCompiler.cpp:

+      oop failure_message = CompilationRequestResult::failureMessage(result_object);
+      if (failure_message != NULL) {
+        const char* failure_reason = failure_message != NULL ? java_lang_String::as_utf8_string(failure_message) : "unknown reason";

failure_message is guaranteed to be non-null.

+    oop result_object = (oop) result.get_jobject();
+    if (result_object != NULL) {

Looks like there is nothing to handle the null case.  Should we?

> On Jan 19, 2016, at 9:32 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~never/8147432/webrev/index.html
> https://bugs.openjdk.java.net/browse/JDK-8147432
> 
> Currently JVMCI compiles either produce code or they don't but nothing is reported for failures.  This adds a new CompilationRequestResult object that can return a human readable message to be included in the normal "COMPILE SKIPPED" style message.  I've refactored the printing so it's shared between compiles.  The result can also include the number of inlined byte codes for use by things like CITimeEach.  Additionally I removed the CompilationToVM.notifyCompilationStatistics as this was apparently a left over.  Tested with specjvm and PrintCompilation which has a few OSR bailouts plus injecting some exceptions to make sure they were reported correctly.
> 
> tom


From vladimir.kozlov at oracle.com  Tue Jan 19 20:36:57 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 19 Jan 2016 12:36:57 -0800
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <569CE098.4030807@oracle.com>
References: <569CE098.4030807@oracle.com>
Message-ID: <569E9E69.4070202@oracle.com>

Looks fine, but in vmStructs.cpp you should replace the field declaration instead of just removing the old one.
Also check whether SA accesses it.

Thanks,
Vladimir

On 1/18/16 4:54 AM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/7177745/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-7177745
>
> JVM aggressively inlines through CallSites, even for mutable and volatile flavors. It's the key optimization for making
> invokedynamic performant.
>
> When a CallSite.target is updated, JVM invalidates all affected nmethods and tries to recompile them later. If a call site
> target regularly changes, JVM will eventually mark (after PerMethodRecompilationCutoff invalidations) all hot methods
> which have the call site bound as non-compilable. It leads to significant peak performance reduction, because all
> affected methods will always be executed in interpreter mode since then.
>
> The fix is to avoid updating recompilation count when corresponding nmethod is invalidated due to a call site target
> change.
>
> I filed a separate RFE (JDK-8147550 [1]) to consider slow non-inlined code shape for unstable call sites, as John
> suggested [2].
>
> Testing: regression test, octane, JPRT.
>
> Thanks!
>
> Best regards,
> Vladimir Ivanov
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8147550
> [2]
> https://bugs.openjdk.java.net/browse/JDK-7177745?focusedCommentId=13821545&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13821545
>
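
The fix described in the quoted message, not charging the recompilation counter when an nmethod is invalidated because of a call site target change, could be modelled roughly as below. All names here (MethodState, InvalidationReason, on_nmethod_invalidated) are hypothetical, the cutoff is passed in rather than taken from the real PerMethodRecompilationCutoff flag, and the exact comparison against the cutoff is an assumption.

```cpp
#include <cassert>

enum InvalidationReason { Reason_Other, Reason_CallSiteTargetChange };

// Hypothetical per-method bookkeeping.
struct MethodState {
    int  recompile_count;
    bool compilable;
};

// Called whenever compiled code for the method is thrown away.
void on_nmethod_invalidated(MethodState& m, InvalidationReason reason, int cutoff) {
    if (reason == Reason_CallSiteTargetChange) {
        return;  // an unstable call site does not count against the method
    }
    m.recompile_count++;
    if (m.recompile_count > cutoff) {
        m.compilable = false;  // from now on the method stays interpreted
    }
}

void demo_recompilation_cutoff() {
    MethodState relinked = {0, true};
    for (int i = 0; i < 1000; ++i) {
        on_nmethod_invalidated(relinked, Reason_CallSiteTargetChange, 10);
    }
    assert(relinked.compilable && relinked.recompile_count == 0);

    MethodState deoptimized = {0, true};
    for (int i = 0; i < 11; ++i) {
        on_nmethod_invalidated(deoptimized, Reason_Other, 10);
    }
    assert(!deoptimized.compilable);
}
```

In this model a frequently relinked call site can invalidate code any number of times without pushing the method over the cutoff, while other invalidations are still counted as before.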

From tom.rodriguez at oracle.com  Tue Jan 19 20:40:48 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Tue, 19 Jan 2016 12:40:48 -0800
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
Message-ID: <9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>


> On Jan 19, 2016, at 12:12 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> src/share/vm/compiler/compileBroker.cpp:
> 
> +      failure_reason = ci_env.failure_reason();
> +      retry_message = ci_env.retry_message();
>        ci_env.report_failure(ci_env.failure_reason());
> 
> Why not use failure_reason?

Fewer edits?  :)  I'll fix it.

> 
> src/share/vm/jvmci/jvmciCompiler.cpp:
> 
> +      oop failure_message = CompilationRequestResult::failureMessage(result_object);
> +      if (failure_message != NULL) {
> +        const char* failure_reason = failure_message != NULL ? java_lang_String::as_utf8_string(failure_message) : "unknown reason";
> 
> failure_message is guaranteed to be non-null.

Right.  The code evolved a few times but now that test is unnecessary.

> 
> +    oop result_object = (oop) result.get_jobject();
> +    if (result_object != NULL) {
> 
> Looks like there is nothing to handle the null case.  Should we?

I debated on that.  Maybe a Java assert in HotSpotJVMCIRuntime.compileMethod that JVMCICompiler.compileMethod always returns non-null?  I don't know that there's anything useful we can do in the C++ code if it's null.

tom

> 
>> On Jan 19, 2016, at 9:32 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> http://cr.openjdk.java.net/~never/8147432/webrev/index.html
>> https://bugs.openjdk.java.net/browse/JDK-8147432
>> 
>> Currently JVMCI compiles either produce code or they don't but nothing is reported for failures.  This adds a new CompilationRequestResult object that can return a human readable message to be included in the normal "COMPILE SKIPPED" style message.  I've refactored the printing so it's shared between compiles.  The result can also include the number of inlined byte codes for use by things like CITimeEach.  Additionally I removed the CompilationToVM.notifyCompilationStatistics as this was apparently a left over.  Tested with specjvm and PrintCompilation which has a few OSR bailouts plus injecting some exceptions to make sure they were reported correctly.
>> 
>> tom
> 


From christian.thalinger at oracle.com  Tue Jan 19 20:44:20 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 10:44:20 -1000
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
	<9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
Message-ID: <EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>


> On Jan 19, 2016, at 10:40 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> 
>> On Jan 19, 2016, at 12:12 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> src/share/vm/compiler/compileBroker.cpp:
>> 
>> +      failure_reason = ci_env.failure_reason();
>> +      retry_message = ci_env.retry_message();
>>        ci_env.report_failure(ci_env.failure_reason());
>> 
>> Why not use failure_reason?
> 
> Fewer edits?  :)  I'll fix it.

:-D

> 
>> 
>> src/share/vm/jvmci/jvmciCompiler.cpp:
>> 
>> +      oop failure_message = CompilationRequestResult::failureMessage(result_object);
>> +      if (failure_message != NULL) {
>> +        const char* failure_reason = failure_message != NULL ? java_lang_String::as_utf8_string(failure_message) : "unknown reason";
>> 
>> failure_message is guaranteed to be non-null.
> 
> Right.  The code evolved a few times but now that test is unnecessary.
> 
>> 
>> +    oop result_object = (oop) result.get_jobject();
>> +    if (result_object != NULL) {
>> 
>> Looks like there is nothing to handle the null case.  Should we?
> 
> I debated on that.  Maybe a Java assert in HotSpotJVMCIRuntime.compileMethod that JVMCICompiler.compileMethod always returns non-null?  I don't know that there's anything useful we can do in the C++ code if it's null.

Assert in Java sounds good.  I was thinking about a hard-failure in C++ since it shouldn't happen.

> 
> tom
> 
>> 
>>> On Jan 19, 2016, at 9:32 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>> 
>>> http://cr.openjdk.java.net/~never/8147432/webrev/index.html
>>> https://bugs.openjdk.java.net/browse/JDK-8147432
>>> 
>>> Currently JVMCI compiles either produce code or they don't, but nothing is reported for failures.  This adds a new CompilationRequestResult object that can return a human-readable message to be included in the normal "COMPILE SKIPPED" style message.  I've refactored the printing so it's shared between compiles.  The result can also include the number of inlined byte codes for use by things like CITimeEach.  Additionally, I removed CompilationToVM.notifyCompilationStatistics, as this was apparently a leftover.  Tested with specjvm and PrintCompilation, which has a few OSR bailouts, plus injecting some exceptions to make sure they were reported correctly.
>>> 
>>> tom
>> 
> 


From vladimir.kozlov at oracle.com  Tue Jan 19 20:46:33 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 19 Jan 2016 12:46:33 -0800
Subject: RFR(S): 8146096: [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <569E3109.8090107@oracle.com>
References: <569E2CDE.3060805@oracle.com> <569E3109.8090107@oracle.com>
Message-ID: <569EA0A9.8050406@oracle.com>

Simply using a timeout to check for a generated safepoint is a bad idea; it is very inaccurate. At the least, you need
to check the call stack to see whether it stopped in the compiled method.
I would prefer to see a new WB interface which would check that the loop SafePointNode is generated during compilation
of the method. That would be precise.

And we need such tests to make sure a feature is working - we can't remove them.

Thanks,
Vladimir

On 1/19/16 4:50 AM, Vladimir Ivanov wrote:
> As an idea to improve the test: spawn a thread which executes the counted loop and then use WhiteBox.forceSafepoint() to
> trigger a safepoint.
>
> If the test times out, it means there's no safepoint in the loop.
>
> It also simplifies the implementation - no need to spawn a child process, the check can be done in-process.
>
> Best regards,
> Vladimir Ivanov
>
> On 1/19/16 3:32 PM, Andreas Eriksson wrote:
>> Hi,
>>
>> Can I please have a review for the removal of
>> hotspot/test/compiler/loopopts/UseCountedLoopSafepoints.java.
>>
>> The test needs to do a loop that takes more than two seconds to execute
>> fully without doing a safepointing call. For this, expensive atomic
>> operations were used. The problem is that on certain embedded platforms
>> they are too expensive, and the test times out.
>> The loop length could probably be reduced, and it should still work on
>> faster machines. However, the test is not very useful, so I think it's
>> better to just remove it to avoid future problems.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8146096
>> Test to be removed:
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/d84a55e7aaf8/test/compiler/loopopts/UseCountedLoopSafepoints.java
>>
>> (I can create a webrev if you think it necessary.)
>>
>> Thanks,
>> Andreas

From tom.rodriguez at oracle.com  Tue Jan 19 20:51:34 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Tue, 19 Jan 2016 12:51:34 -0800
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
	<9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
	<EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>
Message-ID: <0965DCE8-4C48-48B7-B8C8-A406C151B588@oracle.com>

http://cr.openjdk.java.net/~never/8147432.00-01/webrev/index.html

I added a Java assert that it's non-null plus a C++ assert in the else case.  So we won't crash in product if it returns null and turning on Java assert will report something useful.
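
The behaviour described above can be sketched in plain Java. This is only an illustration under stated assumptions: CompileResultCheck, Result, Compiler and describe are hypothetical stand-ins, not the JVMCI classes from the webrev, and the message format is invented for the example. The point it shows is that a Java assert only fires when the VM runs with -ea; otherwise the null case falls through quietly.

```java
// Hypothetical stand-ins; none of these names are the real JVMCI types.
final class CompileResultCheck {

    /** Stand-in for a compilation result: a null message means success here. */
    static final class Result {
        final String failureMessage;

        Result(String failureMessage) {
            this.failureMessage = failureMessage;
        }
    }

    /** Stand-in for a compiler entry point that is expected never to return null. */
    interface Compiler {
        Result compile(String method);
    }

    static String describe(Compiler compiler, String method) {
        Result result = compiler.compile(method);
        // Checked only when assertions are enabled (java -ea); a no-op otherwise.
        assert result != null : "compile(" + method + ") returned null";
        if (result == null) {
            // Assertions disabled: nothing useful to report, so carry on quietly.
            return "no result";
        }
        return result.failureMessage == null
                ? "ok"
                : "COMPILE SKIPPED: " + result.failureMessage;
    }

    public static void main(String[] args) {
        System.out.println(describe(m -> new Result(null), "foo"));
        System.out.println(describe(m -> new Result("too big"), "bar"));
        try {
            // Outcome depends on whether the VM was started with -ea.
            System.out.println(describe(m -> null, "baz"));
        } catch (AssertionError e) {
            System.out.println("assertion fired: " + e.getMessage());
        }
    }
}
```

Running the class once with `java -ea` and once without makes the difference in the null case visible.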

tom



From christian.thalinger at oracle.com  Tue Jan 19 20:57:03 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 10:57:03 -1000
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <0965DCE8-4C48-48B7-B8C8-A406C151B588@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
	<9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
	<EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>
	<0965DCE8-4C48-48B7-B8C8-A406C151B588@oracle.com>
Message-ID: <AA293D7C-0D76-4D49-BFFB-80358B6BF3A7@oracle.com>


> On Jan 19, 2016, at 10:51 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~never/8147432.00-01/webrev/index.html
> 
> I added a Java assert that it's non-null plus a C++ assert in the else case.  So we won't crash in product if it returns null and turning on Java assert will report something useful.

I am worried about corner cases which we will never see because the VM silently ignores them.



From igor.ignatyev at oracle.com  Tue Jan 19 21:06:25 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 20 Jan 2016 00:06:25 +0300
Subject: RFR(XS) : 8141557 : TestResolvedJavaMethod.java times out after 1000
	ms
Message-ID: <52CAB89A-1AA5-4545-9C4B-DD2A6880E463@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8141557/webrev.00/
> 22 lines changed: 16 ins; 0 del; 6 mod; 

Hi all,

Could you please review the fix for 8141557?

The test uses the timeout value of org.junit.Test to test reading an annotation via JVMCI. In some cases, e.g. on embedded platforms, debug builds or w/ extra VM flags like -Xcomp, 1000ms isn't enough for the test to complete, and since jtreg doesn't apply the timeout factor to junit/testng timeouts (CODETOOLS-7901567), the test times out even when the timeout factor is increased.

The fix changes the test to use a separate annotation, which doesn't affect test execution, and removes the timeout value (which means no timeout).
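
A rough sketch of the idea, assuming nothing about the actual webrev: the names AnnotationReadSketch, TestAnnotation and annotated are made up for illustration, and the lookup uses core reflection rather than the JVMCI API the real test exercises. It shows a dedicated runtime annotation whose value can be read back without influencing how the test harness runs the method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class AnnotationReadSketch {

    /** Hypothetical annotation that carries a value but has no effect on test execution. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface TestAnnotation {
        long value();
    }

    @TestAnnotation(1000L)
    static void annotated() {
        // Intentionally empty: only the annotation matters.
    }

    /** Reads the annotation value back via core reflection (not JVMCI). */
    static long readValue() {
        try {
            Method m = AnnotationReadSketch.class.getDeclaredMethod("annotated");
            return m.getAnnotation(TestAnnotation.class).value();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readValue());
    }
}
```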

JBS: https://bugs.openjdk.java.net/browse/JDK-8141557
testing: locally

Thanks,
Igor

From tom.rodriguez at oracle.com  Tue Jan 19 21:06:54 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Tue, 19 Jan 2016 13:06:54 -0800
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <AA293D7C-0D76-4D49-BFFB-80358B6BF3A7@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
	<9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
	<EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>
	<0965DCE8-4C48-48B7-B8C8-A406C151B588@oracle.com>
	<AA293D7C-0D76-4D49-BFFB-80358B6BF3A7@oracle.com>
Message-ID: <BA717887-73A9-4504-A1B0-C6625194B4CF@oracle.com>


> On Jan 19, 2016, at 12:57 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> 
>> On Jan 19, 2016, at 10:51 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> http://cr.openjdk.java.net/~never/8147432.00-01/webrev/index.html
>> 
>> I added a Java assert that it's non-null plus a C++ assert in the else case.  So we won't crash in product if it returns null and turning on Java assert will report something useful.
> 
> I am worried about corner cases which we will never see because the VM silently ignores them.

That's why we have the Java assert.  This part is all informational anyway.  Do we really need to do more?

tom



From christian.thalinger at oracle.com  Tue Jan 19 21:11:33 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 11:11:33 -1000
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <BA717887-73A9-4504-A1B0-C6625194B4CF@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
	<9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
	<EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>
	<0965DCE8-4C48-48B7-B8C8-A406C151B588@oracle.com>
	<AA293D7C-0D76-4D49-BFFB-80358B6BF3A7@oracle.com>
	<BA717887-73A9-4504-A1B0-C6625194B4CF@oracle.com>
Message-ID: <A4C720E2-66FA-4A20-871F-AD83CD5A87F7@oracle.com>


> On Jan 19, 2016, at 11:06 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> 
>> On Jan 19, 2016, at 12:57 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> 
>>> On Jan 19, 2016, at 10:51 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>> 
>>> http://cr.openjdk.java.net/~never/8147432.00-01/webrev/index.html
>>> 
>>> I added a Java assert that it's non-null plus a C++ assert in the else case.  So we won't crash in product if it returns null and turning on Java assert will report something useful.
>> 
>> I am worried about corner cases which we will never see because the VM silently ignores them.
> 
> That's why we have the Java assert.

Sure, but customers don't run with assertions on and if the error is silently ignored we or the customer don't know to turn assertions on.

>  This part is all informational anyway.  Do we really need to do more?

No, let's push it as it is.



From tom.rodriguez at oracle.com  Tue Jan 19 21:26:28 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Tue, 19 Jan 2016 13:26:28 -0800
Subject: RFR(S): 8147432: JVMCI should report bailouts in PrintCompilation
	output
In-Reply-To: <A4C720E2-66FA-4A20-871F-AD83CD5A87F7@oracle.com>
References: <73910769-D6B7-4162-B7CB-A70F2C2380DF@oracle.com>
	<31562912-6D67-402B-A2EF-621D3A59D09A@oracle.com>
	<9111BB03-C6B0-4ECE-8131-0249B62FD94B@oracle.com>
	<EFC87BB9-B0C7-4659-B1E0-8284C2207D68@oracle.com>
	<0965DCE8-4C48-48B7-B8C8-A406C151B588@oracle.com>
	<AA293D7C-0D76-4D49-BFFB-80358B6BF3A7@oracle.com>
	<BA717887-73A9-4504-A1B0-C6625194B4CF@oracle.com>
	<A4C720E2-66FA-4A20-871F-AD83CD5A87F7@oracle.com>
Message-ID: <CBF7B255-1120-4C88-BDA9-04E872D9FF1F@oracle.com>


> On Jan 19, 2016, at 1:11 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> 
>> On Jan 19, 2016, at 11:06 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> 
>>> On Jan 19, 2016, at 12:57 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>> 
>>>> On Jan 19, 2016, at 10:51 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>>> 
>>>> http://cr.openjdk.java.net/~never/8147432.00-01/webrev/index.html
>>>> 
>>>> I added a Java assert that it's non-null plus a C++ assert in the else case.  So we won't crash in product if it returns null and turning on Java assert will report something useful.
>>> 
>>> I am worried about corner cases which we will never see because the VM silently ignores them.
>> 
>> That's why we have the Java assert.
> 
> Sure, but customers don't run with assertions on and if the error is silently ignored we or the customer don't know to turn assertions on.

But in this case the customer is a JVMCI developer.

> 
>>  This part is all informational anyway.  Do we really need to do more?
> 
> No, let's push it as it is.

Ok.  My last jprt push required a couple tries to get through.  Hopefully this one is smoother.

tom



From john.r.rose at oracle.com  Tue Jan 19 23:37:29 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 19 Jan 2016 15:37:29 -0800
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <569CE098.4030807@oracle.com>
References: <569CE098.4030807@oracle.com>
Message-ID: <894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>

On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> The fix is to avoid updating recompilation count when corresponding nmethod is invalidated due to a call site target change.

Although I'm not vetoing it (since it seems it will help customers in the short term), I'm uncomfortable with this fix because it doesn't scale to large dynamic-language applications with many unstable call sites.  Put another way, it feels like we are duct-taping down a failsafe switch (against infinite recompilation) in order to spam a micro-benchmark:  a small number of mega-mutable call sites for which we are willing to spend (potentially) all of the JIT resources, including those usually allocated to application performance in the steady state.  Put a third way:  I am not comfortable with unthrottled infinite recompilation as a performance strategy.
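
For readers unfamiliar with the shape of such a call site, here is a minimal sketch using the standard java.lang.invoke API. The class and method names (UnstableCallSiteSketch, one, two, relinkAndCall) are invented for illustration and this is not the benchmark from the bug report. It only shows a bi-stable MutableCallSite whose target is swapped before every call; whether and how often the JIT recompiles in response is not observable from this code.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class UnstableCallSiteSketch {

    static int one() {
        return 1;
    }

    static int two() {
        return 2;
    }

    /** Relinks one call site between two targets and records what each call returns. */
    static int[] relinkAndCall(int rounds) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class);
            MethodHandle one = lookup.findStatic(UnstableCallSiteSketch.class, "one", type);
            MethodHandle two = lookup.findStatic(UnstableCallSiteSketch.class, "two", type);

            MutableCallSite site = new MutableCallSite(one);
            MethodHandle invoker = site.dynamicInvoker();

            int[] results = new int[rounds];
            for (int i = 0; i < rounds; i++) {
                // Each setTarget is a call site target change of the kind discussed above.
                site.setTarget((i % 2 == 0) ? one : two);
                results[i] = (int) invoker.invokeExact();
            }
            return results;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        for (int r : relinkAndCall(4)) {
            System.out.println(r);
        }
    }
}
```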

I've commented on the new RFE (JDK-8147550) where to go next, including the following sentiments:

> There is a serious design tension here, though: Some users apparently are willing to endure an infinite series of recompilations as part of the cost of doing business; JDK-7177745 addresses this need by turning off the fail-safe against (accidental, buggy) infinite recompilation for unstable CSs. Other users might find that having a percentage of machine time devoted to recompilation is a problem. (This has been the case in the past with non-dynamic languages, at least.) The code shape proposed in this bug report would cover all simple unstable call sites (bi-stable, for example, would compile to a bi-morphic call), but, in pathological cases (infinite sequence of distinct CS targets) would "settle down" into a code shape that would be sub-optimal for any single target, but (as an indirect MH call) reasonable for all the targets together. 
> 
> In the absence of clear direction from the user or the profile, the JVM has to choose infinite recompilation or a good-enough final compilation. The latter choice is safer. And the infinite recompilation is less safe because there is no intrinsic bound on the amount of machine cycles that could be diverted to recompilation, given a dynamic language application with enough mega-mutable CSs. Settling down to a network of indirect calls has a bounded cost. 
> 
> Yes, one size-fits-all tactics never please everybody. But the JVM should not choose tactics with unlimited downsides. 

- John

From vivek.r.deshpande at intel.com  Wed Jan 20 00:44:25 2016
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Wed, 20 Jan 2016 00:44:25 +0000
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
In-Reply-To: <5699AACF.6080608@oracle.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569DFF26@ORSMSX106.amr.corp.intel.com>
	<56999E04.5040207@oracle.com> <5699A3D6.6080305@oracle.com>
	<5699AACF.6080608@oracle.com>
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A569FECA5@ORSMSX106.amr.corp.intel.com>

Hi

According LIBM experts at Intel for the test cases, 
the data sets used in regression tests for the Intel the math library (libm). 
They were collected over a long period of testing various libm implementations.
The data sets contain function specific data (special and corner cases such as +/-0, 
maximum/minimum normalized numbers, +/-infinity, QNaN/SNaN, maximum/minimum denormal numbers, 
arguments that would produce close to overflow/underflow results, known hard-to-round cases, etc),
implementation specific data (arguments close to table look-up values for different polynomial approximations,
worst cases for range reduction algorithms) and other data with interesting bit patterns.
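
As a minimal sketch of how a few such corner-case inputs might be exercised (an illustration only, not the pushed test): the class and method names below are hypothetical, StrictMath serves only as a rough reference through the neighbor check suggested earlier in this thread, and 2^-252 and 90112 are the main-path bounds quoted from the patch comments.

```java
// Hypothetical helper for illustration; not part of the JDK test suite.
final class SinCosCornerCases {

    // Rough check: accept the reference value y or either floating-point
    // neighbor of y, where y is the StrictMath result.
    static boolean isNearReference(double actual, double reference) {
        return actual == reference
            || actual == Math.nextUp(reference)
            || actual == Math.nextDown(reference);
    }

    // Special values whose results are fixed by the Math.sin/Math.cos specs.
    static boolean specialValuesOk() {
        return Double.isNaN(Math.sin(Double.NaN))
            && Double.isNaN(Math.sin(Double.POSITIVE_INFINITY))
            && Double.isNaN(Math.cos(Double.NEGATIVE_INFINITY))
            // sin(+/-0) keeps the sign of the zero; compare raw bits to see it.
            && Double.doubleToRawLongBits(Math.sin(0.0))
                == Double.doubleToRawLongBits(0.0)
            && Double.doubleToRawLongBits(Math.sin(-0.0))
                == Double.doubleToRawLongBits(-0.0);
    }

    public static void main(String[] args) {
        double[] inputs = {
            Double.MIN_VALUE,       // smallest denormal
            Double.MIN_NORMAL,      // smallest normalized number
            0x1.0p-252,             // lower main-path bound from the patch comment
            Math.nextDown(90112.0), // just inside the upper main-path bound
            90112.0,                // upper main-path bound from the patch comment
        };
        System.out.println("special values ok: " + specialValuesOk());
        for (double x : inputs) {
            System.out.printf("x=%a sin near reference: %b, cos near reference: %b%n",
                x,
                isNearReference(Math.sin(x), StrictMath.sin(x)),
                isNearReference(Math.cos(x), StrictMath.cos(x)));
        }
    }
}
```

The neighbor check is deliberately loose: it cannot tell which of the two neighbors is actually allowed, so a real test would carry precomputed allowed outputs instead.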

Regards,
Vivek

-----Original Message-----
From: joe darcy [mailto:joe.darcy at oracle.com] 
Sent: Friday, January 15, 2016 6:29 PM
To: Vladimir Kozlov; Deshpande, Vivek R
Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib

Ah okay; I overlooked the separate push of the tests.

Thanks,

-Joe

On 1/15/2016 5:58 PM, Vladimir Kozlov wrote:
> Note, the test was pushed together with VM changes into hs-comp repo:
>
>  http://hg.openjdk.java.net/jdk9/hs-comp/jdk/rev/ddd59a780769
>
> New sin/cos code is tested in all running modes since it is used by 
> Interpreter and JITed code (C1 and C2).
>
> I will let Vivek answer questions about the test.
>
> Regards,
> Vladimir
>
> On 1/15/16 5:33 PM, Joseph D. Darcy wrote:
>> Hello,
>>
>> Catching up on email, how were these test cases generated or chosen? 
>> In other words, in what sense are they corners?
>>
>> The data would be easier to read if the numbers were aligned by 
>> column (they don't appear that way in the webrev at least).
>>
>> What is the code coverage of the new intrinsics with this set of tests?
>>
>> These tests should not be separated from the implementation for 
>> long; in other words, since the new implementation has already been 
>> pushed to a HotSpot forest, test coverage for that new implementation 
>> should not lag behind.
>>
>> Thanks,
>>
>> -Joe
>>
>> On 12/22/2015 5:41 PM, Deshpande, Vivek R wrote:
>>> HI All
>>>
>>> I have uploaded the patch for sin and cos tests with input and 
>>> allowed outputs at this location for your review.
>>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
>>>
>>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>>> Thank you.
>>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Friday, December 04, 2015 4:50 PM
>>> To: Deshpande, Vivek R; Vladimir Kozlov
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>> math lib
>>>
>>> Hi Vivek,
>>>
>>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>>> Hi
>>>>
>>>> Sure I will add the tests. Shall I use StrictMath result as a 
>>>> reference for exact result.
>>>> Let me know your thoughts.
>>> As a rough test of another sin/cos implementation, StrictMath.{sin, 
>>> cos} can be used as a reference with the following caveat: there isn't 
>>> an indication of which way the error is in a StrictMath result. Let 
>>> me give an example, if
>>>
>>>       StrictMath.sin(x) => y
>>>
>>> then one of the following should be true
>>>
>>>       Math.sin(x) => y
>>>       Math.sin(x) => Math.nextUp(y)
>>>       Math.sin(x) => Math.nextDown(y)
>>>
>>> That is, Math.sin(x) should either be the same as StrictMath.sin(x) 
>>> OR equal to one of the floating-point numbers adjacent to that 
>>> result. Of these three options, only two are allowed by the 
>>> accuracy requirements of the StrictMath.sin specification. However, 
>>> since StrictMath.sin doesn't give an indication of which way its 
>>> error went (if it rounded up or down), there is no indication 
>>> without additional work which of
>>> nextUp(y) and nextDown(y) is allowable (assuming StrictMath.sin 
>>> isn't buggy).
>>>
>>> HTH,
>>>
>>> -Joe
>>>
>>>
>>>> Regards,
>>>> Vivek
>>>>
>>>> -----Original Message-----
>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>> Sent: Thursday, December 03, 2015 1:29 PM
>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>> math lib
>>>>
>>>> Hello,
>>>>
>>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>>> Vivek,
>>>>>
>>>>> I think Joe is asking you to write these tests as hotspot 
>>>>> regression test in hotspot/test/compiler.
>>>> Exactly; if not generally applicable sin/cos tests that could be 
>>>> hosted in the jdk repo (alongside the regression and unit tests for 
>>>> java.lang.Math), then test of intrinsics in the HotSpot repo 
>>>> alongside other tests targeting intrinsics.
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>>> Vladimir
>>>>>
>>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>>> Hi Joe
>>>>>>
>>>>>> It would be great if you would please share the additional tests 
>>>>>> with us.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>> math lib
>>>>>>
>>>>>> I think it is unwise for this large of an implementation change 
>>>>>> to be pushed with no tests targeting the specifics of the new 
>>>>>> implementation.
>>>>>>
>>>>>> The worst-case tests in the jdk repo are the mathematical worst 
>>>>>> cases for floating-point approximations, in other words the cases 
>>>>>> where the exact mathematical answer is closest to half-way between 
>>>>>> two representable floating-point numbers. Passing such tests is a 
>>>>>> necessary but not sufficient condition for a new implementation.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> -Joe
>>>>>>
>>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>>> Okay, looks reasonable to me.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev with latest hotspot 
>>>>>>>> source as base for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>> the math lib
>>>>>>>>
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>> the math lib
>>>>>>>>
>>>>>>>> Please send link to new webrev on cr server.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>>> Hi Vladimir
>>>>>>>>>
>>>>>>>>> Please find the webrev with your suggested updates attached 
>>>>>>>>> with the mail.
>>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Deshpande, Vivek R
>>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> HI Vladimir, Joe
>>>>>>>>>
>>>>>>>>> I have done the jtreg tests in hotspot and tests from jdk you 
>>>>>>>>> have mentioned. It passed those tests.
>>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>>>> The performance gain is 3.2x over base jdk, that is over 
>>>>>>>>> current fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>>
>>>>>>>>> Could I get those tests around the boundary values. Would 
>>>>>>>>> WorstCaseTests.java jtreg test in jdk test those ?
>>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>>
>>>>>>>>> I would work on adding either diagnostic flag or just one flag 
>>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Just getting added to the thread..
>>>>>>>>>
>>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>>
>>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition 
>>>>>>>>>> to Hotspot tests.
>>>>>>>>>>
>>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>> Hi Vladimir
>>>>>>>>>>>
>>>>>>>>>>> The result we obtain with LIBM are within +/- 1ulp from 
>>>>>>>>>>> StrictMath result and not exact result. So I added the flag 
>>>>>>>>>>> to switch between FDLIBM and LIBM.
>>>>>>>>>>>
>>>>>>>>>>> Quick explanation:
>>>>>>>>>>> This is what we observed with comparison to HPA Library
>>>>>>>>>>> (http://www.nongnu.org/hpalib/) explained with an example.
>>>>>>>>>>> LIBM Observed Math result=0.19457293629570213
>>>>>>>>>>> (4596178249117717083L) (StrictMath - 1ulp) Required result 
>>>>>>>>>>> should be = 0.19457293629570216
>>>>>>>>>>> (4596178249117717084L) (StrictMath result) or
>>>>>>>>>>> 0.1945729362957022
>>>>>>>>>>> (4596178249117717085L) (StrictMath + 1ulp.) This means HPA 
>>>>>>>>>>> library result is between the above two values and Exact 
>>>>>>>>>>> result would be pretty close to it.
>>>>>>>>>>> So here StrictMath result is less than quad-precision 
>>>>>>>>>>> result, Math result should be StrictMath or StrictMath + 
>>>>>>>>>>> 1ulp and not StrictMath
>>>>>>>>>>> - 1ulp, according to our test.
>>>>>>>>>> Note, java.lang.Math allows to have 1ulp off (in both 
>>>>>>>>>> direction, I
>>>>>>>>>> think) and it should be consistent for Interpreter and code 
>>>>>>>>>> generated by JIT compilers:
>>>>>>>>>>
>>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>>
>>>>>>>>> That interpretation of the spec is not quite right. For the 
>>>>>>>>> Math methods with a 1/2 ulp error bound, the floating-point 
>>>>>>>>> result closest to the exact result must be returned. For the 
>>>>>>>>> methods with a
>>>>>>>>> 1 ulp error bound, either of the floating-point results 
>>>>>>>>> bracketing the true result can be returned, subject to the 
>>>>>>>>> monotonicity constraints of the specification of the particular method.
>>>>>>>>>
>>>>>>>>>>> I have done the experiments with 
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin and 
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dcos. 
>>>>>>>>>>> With this option, the interpreter would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>>> If we want to disable LIBM completely, we need the flags 
>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>> I was thinking about using existing
>>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and 
>>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add 
>>>>>>>>>> additional versions of functions which accept intrinsic ID 
>>>>>>>>>> instead of methodHandle.
>>>>>>>>>>
>>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>>
>>>>>>>>>>> Also the performance gain ~4x is with 
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses 
>>>>>>>>>> LIBM code and compilers use FDLIBM?
>>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>>
>>>>>>>>> I'm part way through porting the FDLIBM code to Java
>>>>>>>>> (JDK-8134780:
>>>>>>>>> Port fdlibm to Java), which is providing a significant speed 
>>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>>
>>>>>>>>> I find the current patch *insufficient* as-is in terms of its 
>>>>>>>>> testing.
>>>>>>>>> For example, part of patch says
>>>>>>>>>
>>>>>>>>> # For sin
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> # For cos
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> If nothing else, there are no tests at around those boundary 
>>>>>>>>> values, which is unacceptable. There should also be some tests 
>>>>>>>>> of values of interest to the algorithm in question.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> -Joe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Let me know your thoughts on this. I would answer more 
>>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Vivek
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>>> To: Deshpande, Vivek R; 
>>>>>>>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>>>> the math lib
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be interesting to see what happens if you disable 
>>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>>
>>>>>>>>>>>>         -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>
>>>>>>>>>>> I want to point out that you can do this experiment later. We 
>>>>>>>>>>> can file bugs and fix them after FC.
>>>>>>>>>>>
>>>>>>>>>>> For now, please, answer my question about flags only. This 
>>>>>>>>>>> is the only thing holding it from push.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to contribute a patch which optimizes 
>>>>>>>>>>>>> Math.sin() and
>>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel 
>>>>>>>>>>>>> LIBM
>>>>>>>>>>>>>         implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin 
>>>>>>>>>>>>> and cos.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The option to use the optimizations are 
>>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vivek
>>>>>>>>>>>>>
>>


From christian.thalinger at oracle.com  Wed Jan 20 00:45:00 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 14:45:00 -1000
Subject: RFR(XS) : 8141557 : TestResolvedJavaMethod.java times out after
	1000 ms
In-Reply-To: <52CAB89A-1AA5-4545-9C4B-DD2A6880E463@oracle.com>
References: <52CAB89A-1AA5-4545-9C4B-DD2A6880E463@oracle.com>
Message-ID: <3DDA7A22-74CF-400A-A403-9CE70655ABD5@oracle.com>

I suppose TestAnnotionation is a typo?

+    @TestAnnotionation(1000L)

Could you change that to value = 1000L?  Just for extra clarity.  Then it looks good.
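
For illustration, a minimal sketch of the explicit-element form on a custom annotation. The annotation and class names here are hypothetical stand-ins, and the value is read back with plain reflection rather than through JVMCI, which is what the actual test exercises.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical demo class; not the code under review.
final class AnnotationValueDemo {

    // Stand-in for the test's own annotation. RUNTIME retention is needed
    // so the value is still visible when the class is inspected at run time.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface TestAnnotation {
        long value();
    }

    // Explicit element name, as suggested in the review comment.
    // The shorthand @TestAnnotation(1000L) means the same thing.
    @TestAnnotation(value = 1000L)
    static void annotatedMethod() { }

    // Reads the annotation value back via reflection.
    static long readValue() {
        try {
            Method m = AnnotationValueDemo.class.getDeclaredMethod("annotatedMethod");
            return m.getAnnotation(TestAnnotation.class).value();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readValue());
    }
}
```

Unlike the timeout element of org.junit.Test, a separate annotation like this carries a value without affecting how the test harness runs the method.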

> On Jan 19, 2016, at 11:06 AM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~iignatyev/8141557/webrev.00/
>> 22 lines changed: 16 ins; 0 del; 6 mod; 
> 
> Hi all,
> 
> Could you please review the fix for 8141557?
> 
> The test uses the timeout value of org.junit.Test to test reading an annotation via JVMCI. In some cases, e.g. on embedded platforms, debug builds, or with extra VM flags like -Xcomp, 1000 ms isn't enough for the test to complete, and since jtreg doesn't apply the timeout factor to junit/testng timeouts (CODETOOLS-7901567), the test times out despite the increased timeout factor.
> 
> The fix changes the test to use a separate annotation, which doesn't affect test execution, and removes the timeout value (which means no timeout).
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8141557
> testing: locally
> 
> Thanks,
> Igor


From vivek.r.deshpande at intel.com  Wed Jan 20 00:48:41 2016
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Wed, 20 Jan 2016 00:48:41 +0000
Subject: RFR (M): 8143353: Update for x86 sin and cos in the math lib
References: <53E8E64DB2403849AFD89B7D4DAC8B2A568ED1AC@ORSMSX106.amr.corp.intel.com>
	<564F80F7.5050605@oracle.com> <56535CC7.6020702@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F03BE@ORSMSX106.amr.corp.intel.com>
	<5653B9AF.7060306@oracle.com> <5653CB17.2020308@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A568F26AD@ORSMSX106.amr.corp.intel.com>
	<565E520B.8060801@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CE99C@ORSMSX106.amr.corp.intel.com>
	<5660AEB6.8060007@oracle.com> <5660B13B.1020907@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CECB1@ORSMSX106.amr.corp.intel.com>
	<5660B345.8010905@oracle.com> <5660B40D.4050800@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569CED5A@ORSMSX106.amr.corp.intel.com>
	<566234C6.8010806@oracle.com>
	<53E8E64DB2403849AFD89B7D4DAC8B2A569DFF26@ORSMSX106.amr.corp.intel.com>
	<56999E04.5040207@oracle.com> <5699A3D6.6080305@oracle.com>
	<5699AACF.6080608@oracle.com> 
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A569FECB4@ORSMSX106.amr.corp.intel.com>

Hi All

I forgot to add in the earlier reply: 
the reference values were computed with Maple and converted into hexadecimal format.
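
As a small sketch of why a hexadecimal format suits reference values: Java can read and write doubles as hexadecimal floating-point strings without any decimal rounding, and adjacent doubles have adjacent bit patterns. The class and method names below are hypothetical, and the long constant is the bit pattern quoted in this thread for the StrictMath result; the exact file format of the test data is not shown here.

```java
// Hypothetical demo class; names are for illustration only.
final class HexDoubleDemo {

    // Parses a hexadecimal floating-point string such as "0x1.0p-1".
    static double fromHex(String hex) {
        return Double.parseDouble(hex);
    }

    // Returns the bit pattern of the next double above the given one.
    static long nextUpBits(long bits) {
        return Double.doubleToLongBits(Math.nextUp(Double.longBitsToDouble(bits)));
    }

    public static void main(String[] args) {
        System.out.println(fromHex("0x1.0p-1"));     // 0.5
        System.out.println(Double.toHexString(0.5)); // 0x1.0p-1

        // Bit pattern quoted in this thread for the StrictMath result.
        long bits = 4596178249117717084L;
        double y = Double.longBitsToDouble(bits);

        // The hex string round-trips exactly, unlike a shortened decimal.
        System.out.println(fromHex(Double.toHexString(y)) == y);

        // For a positive finite double, one ulp up is the next bit pattern.
        System.out.println(nextUpBits(bits) == bits + 1);
    }
}
```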

Regards,
Vivek

-----Original Message-----
From: Deshpande, Vivek R 
Sent: Tuesday, January 19, 2016 4:44 PM
To: 'joe darcy'; Vladimir Kozlov
Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in the math lib

Hi

According to the LIBM experts at Intel, the test cases come from the data sets used in regression tests for the Intel math library (libm). 
They were collected over a long period of testing various libm implementations.
The data sets contain function specific data (special and corner cases such as +/-0, maximum/minimum normalized numbers, +/-infinity, QNaN/SNaN, maximum/minimum denormal numbers, arguments that would produce close to overflow/underflow results, known hard-to-round cases, etc), implementation specific data (arguments close to table look-up values for different polynomial approximations, worst cases for range reduction algorithms) and other data with interesting bit patterns.

Regards,
Vivek

-----Original Message-----
From: joe darcy [mailto:joe.darcy at oracle.com]
Sent: Friday, January 15, 2016 6:29 PM
To: Vladimir Kozlov; Deshpande, Vivek R
Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the math lib

Ah okay; I overlooked the separate push of the tests.

Thanks,

-Joe

On 1/15/2016 5:58 PM, Vladimir Kozlov wrote:
> Note, the test was pushed together with VM changes into hs-comp repo:
>
>  http://hg.openjdk.java.net/jdk9/hs-comp/jdk/rev/ddd59a780769
>
> New sin/cos code is tested in all running modes since it is used by 
> Interpreter and JITed code (C1 and C2).
>
> I will let Vivek answer questions about the test.
>
> Regards,
> Vladimir
>
> On 1/15/16 5:33 PM, Joseph D. Darcy wrote:
>> Hello,
>>
>> Catching up on email, how were these test cases generated or chosen? 
>> In other words, in what sense are they corners?
>>
>> The data would be easier to read if the numbers were aligned by 
>> column (they don't appear that way in the webrev at least).
>>
>> What is the code coverage of the new intrinsics with this set of tests?
>>
>> These tests should not be separated from the implementation for 
>> long; in other words, since the new implementation has already been 
>> pushed to a HotSpot forest, test coverage for that new implementation 
>> should not lag behind.
>>
>> Thanks,
>>
>> -Joe
>>
>> On 12/22/2015 5:41 PM, Deshpande, Vivek R wrote:
>>> HI All
>>>
>>> I have uploaded the patch for sin and cos tests with input and 
>>> allowed outputs at this location for your review.
>>> http://cr.openjdk.java.net/~vdeshpande/libm_sincos/8143353/jdk/webrev.00/
>>>
>>> Bug ID: https://bugs.openjdk.java.net/browse/JDK-8143353
>>> Thank you.
>>>
>>> Regards,
>>> Vivek
>>>
>>> -----Original Message-----
>>> From: Joseph D. Darcy [mailto:joe.darcy at oracle.com]
>>> Sent: Friday, December 04, 2015 4:50 PM
>>> To: Deshpande, Vivek R; Vladimir Kozlov
>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>> math lib
>>>
>>> Hi Vivek,
>>>
>>> On 12/3/2015 2:01 PM, Deshpande, Vivek R wrote:
>>>> Hi
>>>>
>>>> Sure I will add the tests. Shall I use StrictMath result as a 
>>>> reference for exact result.
>>>> Let me know your thoughts.
>>> As a rough test of another sin/cos implementation, StrictMath.{sin, 
>>> cos} can be used as a reference with the following caveat: there isn't 
>>> an indication of which way the error is in a StrictMath result. Let 
>>> me give an example, if
>>>
>>>       StrictMath.sin(x) => y
>>>
>>> then one of the following should be true
>>>
>>>       Math.sin(x) => y
>>>       Math.sin(x) => Math.nextUp(y)
>>>       Math.sin(x) => Math.nextDown(y)
>>>
>>> That is, Math.sin(x) should either be the same as StrictMath.sin(x) 
>>> OR equal to one of the floating-point numbers adjacent to that 
>>> result. Of these three options, only two are allowed by the 
>>> accuracy requirements of the StrictMath.sin specification. However, 
>>> since StrictMath.sin doesn't give an indication of which way its 
>>> error went (if it rounded up or down), there is no indication 
>>> without additional work which of
>>> nextUp(y) and nextDown(y) is allowable (assuming StrictMath.sin 
>>> isn't buggy).
>>>
>>> HTH,
>>>
>>> -Joe
>>>
>>>
>>>> Regards,
>>>> Vivek
>>>>
>>>> -----Original Message-----
>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>> Sent: Thursday, December 03, 2015 1:29 PM
>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>> math lib
>>>>
>>>> Hello,
>>>>
>>>> On 12/3/2015 1:25 PM, Vladimir Kozlov wrote:
>>>>> Vivek,
>>>>>
>>>>> I think Joe is asking you to write these tests as hotspot 
>>>>> regression test in hotspot/test/compiler.
>>>> Exactly; if not generally applicable sin/cos tests that could be 
>>>> hosted in the jdk repo (alongside the regression and unit tests for 
>>>> java.lang.Math), then test of intrinsics in the HotSpot repo 
>>>> alongside other tests targeting intrinsics.
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>>> Vladimir
>>>>>
>>>>> On 12/3/15 1:22 PM, Deshpande, Vivek R wrote:
>>>>>> Hi Joe
>>>>>>
>>>>>> It would be great if you would please share the additional tests 
>>>>>> with us.
>>>>>>
>>>>>> Regards,
>>>>>> Vivek
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>> Sent: Thursday, December 03, 2015 1:17 PM
>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in the 
>>>>>> math lib
>>>>>>
>>>>>> I think it is unwise for this large of an implementation change 
>>>>>> to be pushed with no tests targeting the specifics of the new 
>>>>>> implementation.
>>>>>>
>>>>>> The worst-case tests in the jdk repo are the mathematical worst 
>>>>>> cases for floating-point approximations, in other words the cases 
>>>>>> where the exact mathematical answer is closest to half-way between 
>>>>>> two representable floating-point numbers. Passing such tests is a 
>>>>>> necessary but not sufficient condition for a new implementation.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> -Joe
>>>>>>
>>>>>> On 12/3/2015 1:05 PM, Vladimir Kozlov wrote:
>>>>>>> Okay, looks reasonable to me.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 12/3/15 11:06 AM, Deshpande, Vivek R wrote:
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev with latest hotspot 
>>>>>>>> source as base for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.03/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Deshpande, Vivek R
>>>>>>>> Sent: Wednesday, December 02, 2015 10:33 PM
>>>>>>>> To: 'Vladimir Kozlov'; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>> the math lib
>>>>>>>>
>>>>>>>> Hi Vladimir
>>>>>>>>
>>>>>>>> This is the link for the updated webrev for your review.
>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.02/
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Vivek
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>> Sent: Tuesday, December 01, 2015 6:06 PM
>>>>>>>> To: Deshpande, Vivek R; joe darcy
>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>> the math lib
>>>>>>>>
>>>>>>>> Please send link to new webrev on cr server.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 11/25/15 5:16 PM, Deshpande, Vivek R wrote:
>>>>>>>>> Hi Vladimir
>>>>>>>>>
>>>>>>>>> Please find the webrev with your suggested updates attached 
>>>>>>>>> with the mail.
>>>>>>>>> We will update it in the jbs entry soon.
>>>>>>>>> Please let me know if it needs further changes.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Deshpande, Vivek R
>>>>>>>>> Sent: Tuesday, November 24, 2015 10:22 AM
>>>>>>>>> To: 'joe darcy'; Vladimir Kozlov
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: RE: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> HI Vladimir, Joe
>>>>>>>>>
>>>>>>>>> I have done the jtreg tests in hotspot and tests from jdk you 
>>>>>>>>> have mentioned. It passed those tests.
>>>>>>>>> The ~4x gain is with -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>> -XX:DisableIntrinsic=_dsin/_dcos over without that option.
>>>>>>>>> The performance gain is 3.2x over base jdk, that is over 
>>>>>>>>> current fsin/fcos intrinsic. This gain is more realistic.
>>>>>>>>>
>>>>>>>>> Could I get those tests around the boundary values. Would 
>>>>>>>>> WorstCaseTests.java jtreg test in jdk test those ?
>>>>>>>>> If yes, then it has passed those boundary cases.
>>>>>>>>>
>>>>>>>>> I would work on adding either diagnostic flag or just one flag 
>>>>>>>>> for libm and send out the webrev soon.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vivek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: joe darcy [mailto:joe.darcy at oracle.com]
>>>>>>>>> Sent: Monday, November 23, 2015 6:28 PM
>>>>>>>>> To: Vladimir Kozlov; Deshpande, Vivek R
>>>>>>>>> Cc: Viswanathan, Sandhya; Berg, Michael C; hotspot compiler
>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>> the math lib
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Just getting added to the thread..
>>>>>>>>>
>>>>>>>>> On 11/23/2015 5:13 PM, Vladimir Kozlov wrote:
>>>>>>>>>> Thank you, for explanation, Vivek.
>>>>>>>>>>
>>>>>>>>>> Please, run jdk/test/java/lang/Math/ jtreg tests in addition 
>>>>>>>>>> to Hotspot tests.
>>>>>>>>>>
>>>>>>>>>> On 11/23/15 12:24 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>> Hi Vladimir
>>>>>>>>>>>
>>>>>>>>>>> The result we obtain with LIBM are within +/- 1ulp from 
>>>>>>>>>>> StrictMath result and not exact result. So I added the flag 
>>>>>>>>>>> to switch between FDLIBM and LIBM.
>>>>>>>>>>>
>>>>>>>>>>> Quick explanation:
>>>>>>>>>>> This is what we observed with comparison to HPA Library
>>>>>>>>>>> (http://www.nongnu.org/hpalib/) explained with an example.
>>>>>>>>>>> LIBM Observed Math result=0.19457293629570213
>>>>>>>>>>> (4596178249117717083L) (StrictMath - 1ulp) Required result 
>>>>>>>>>>> should be = 0.19457293629570216
>>>>>>>>>>> (4596178249117717084L) (StrictMath result) or
>>>>>>>>>>> 0.1945729362957022
>>>>>>>>>>> (4596178249117717085L) (StrictMath + 1ulp.) This means HPA 
>>>>>>>>>>> library result is between the above two values and Exact 
>>>>>>>>>>> result would be pretty close to it.
>>>>>>>>>>> So here StrictMath result is less than quad-precision 
>>>>>>>>>>> result, Math result should be StrictMath or StrictMath + 
>>>>>>>>>>> 1ulp and not StrictMath
>>>>>>>>>>> - 1ulp, according to our test.
>>>>>>>>>> Note, java.lang.Math allows to have 1ulp off (in both 
>>>>>>>>>> direction, I
>>>>>>>>>> think) and it should be consistent for Interpreter and code 
>>>>>>>>>> generated by JIT compilers:
>>>>>>>>>>
>>>>>>>>>> http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#sin%28double%29
>>>>>>>>>>
>>>>>>>>> That interpretation of the spec is not quite right. For the 
>>>>>>>>> Math methods with a 1/2 ulp error bound, the floating-point 
>>>>>>>>> result closest to the exact result must be returned. For the 
>>>>>>>>> methods with a
>>>>>>>>> 1 ulp error bound, either of the floating-point results 
>>>>>>>>> bracketing the true result can be returned, subject to the 
>>>>>>>>> monotonicity constraints of the specification of the particular method.
>>>>>>>>>
>>>>>>>>>>> I have done the experiments with 
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin and 
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dcos.
>>>>>>>>>>> With this option, the interpreter would go through LIBM and C1 and C2 through FDLIBM.
>>>>>>>>>>> If we want to disable LIBM completely, we need the flags 
>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>> I was thinking about using existing
>>>>>>>>>> DirectiveSet::is_intrinsic_disabled() and 
>>>>>>>>>> vmIntrinsics::is_disabled_by_flags(). You need to add 
>>>>>>>>>> additional versions of functions which accept intrinsic ID 
>>>>>>>>>> instead of methodHandle.
>>>>>>>>>>
>>>>>>>>>> If you still want to use flags make them diagnostic.
>>>>>>>>>> Or have one flag for all LIBM intrinsics -XX:+UseLibmIntrinsic.
>>>>>>>>>>
>>>>>>>>>>> Also the performance gain ~4x is with 
>>>>>>>>>>> -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_dsin/_dcos.
>>>>>>>>>> You confused me here. So you get 4x when only the Interpreter uses 
>>>>>>>>>> LIBM code and compilers use FDLIBM?
>>>>>>>>> Just to be clear, are you comparing the new code to FDLIBM
>>>>>>>>> (StrictMath) or to the existing fsin/fcos intrinsics (Math)?
>>>>>>>>>
>>>>>>>>> I'm part way through porting the FDLIBM code to Java
>>>>>>>>> (JDK-8134780:
>>>>>>>>> Port fdlibm to Java), which is providing a significant speed 
>>>>>>>>> boost to the StrictMath methods that have been ported.
>>>>>>>>>
>>>>>>>>> I find the current patch *insufficient* as-is in terms of its 
>>>>>>>>> testing.
>>>>>>>>> For example, part of patch says
>>>>>>>>>
>>>>>>>>> # For sin
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> # For cos
>>>>>>>>>
>>>>>>>>> +//     This means that the main path is actually only taken for
>>>>>>>>> +//     2^-252 <= |X| < 90112.
>>>>>>>>>
>>>>>>>>> If nothing else, there are no tests at around those boundary 
>>>>>>>>> values, which is unacceptable. There should also be some tests 
>>>>>>>>> of values of interest to the algorithm in question.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> -Joe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> Let me know your thoughts on this. I would answer more 
>>>>>>>>>>> questions and give more data if needed.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Vivek
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>>>>>>>> Sent: Monday, November 23, 2015 10:37 AM
>>>>>>>>>>> To: Deshpande, Vivek R;
>>>>>>>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>>> Cc: Viswanathan, Sandhya
>>>>>>>>>>> Subject: Re: RFR (M): 8143353: Update for x86 sin and cos in 
>>>>>>>>>>> the math lib
>>>>>>>>>>>
>>>>>>>>>>> On 11/20/15 12:22 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>> What is the reason you decided to add new flags? exp() and
>>>>>>>>>>>> log() changes did not have flags.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be interesting to see what happens if you disable 
>>>>>>>>>>>> intrinsics using existing flag, for example:
>>>>>>>>>>>>
>>>>>>>>>>>>         -XX:+UnlockDiagnosticVMOptions 
>>>>>>>>>>>> -XX:DisableIntrinsic=_dexp
>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>
>>>>>>>>>>> I want to point out that you can do this experiment later. We 
>>>>>>>>>>> can file bugs and fix them after FC.
>>>>>>>>>>>
>>>>>>>>>>> For now, please, answer my question about flags only. This 
>>>>>>>>>>> is the only thing holding it from push.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/20/15 12:03 PM, Deshpande, Vivek R wrote:
>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to contribute a patch which optimizes
>>>>>>>>>>>>> Math.sin() and
>>>>>>>>>>>>> Math.cos() for 64 and 32 bit X86 architecture using Intel 
>>>>>>>>>>>>> LIBM
>>>>>>>>>>>>>         implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The improvement gives ~4.25x gain over base for both sin 
>>>>>>>>>>>>> and cos.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The options to enable the optimizations are 
>>>>>>>>>>>>> -XX:+UseLibmSinIntrinsic and -XX:+UseLibmCosIntrinsic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you please review and sponsor this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bug-id:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8143353
>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~mcberg/8143353/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vivek
>>>>>>>>>>>>>
>>


From christian.thalinger at oracle.com  Wed Jan 20 00:49:00 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 19 Jan 2016 14:49:00 -1000
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
References: <569CE098.4030807@oracle.com>
	<894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
Message-ID: <63C0FC07-E4C8-4967-A3D3-7083D1B3B7E3@oracle.com>


> On Jan 19, 2016, at 1:37 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>> 
>> The fix is to avoid updating recompilation count when corresponding nmethod is invalidated due to a call site target change.
> 
> Although I'm not vetoing it (since it seems it will help customers in the short term), I'm uncomfortable with this fix because it doesn't scale to large dyn. lang. applications with many unstable call sites.  Put another way, it feels like we are duct-taping down a failsafe switch (against infinite recompilation) in order to spam a micro-benchmark:  a small number of mega-mutable call sites for which we are willing to spend (potentially) all of the JIT resources, including those usually allocated to application performance in the steady state.  Put a third way:  I am not comfortable with unthrottled infinite recompilation as a performance strategy.

Having a deja-vu...

https://bugs.openjdk.java.net/browse/JDK-7087838

> 
> I've commented on the new RFE (JDK-8147550) where to go next, including the following sentiments:
> 
>> There is a serious design tension here, though: Some users apparently are willing to endure an infinite series of recompilations as part of the cost of doing business; JDK-7177745 addresses this need by turning off the fail-safe against (accidental, buggy) infinite recompilation for unstable CSs. Other users might find that having a percentage of machine time devoted to recompilation is a problem. (This has been the case in the past with non-dynamic languages, at least.) The code shape proposed in this bug report would cover all simple unstable call sites (bi-stable, for example, would compile to a bi-morphic call), but, in pathological cases (infinite sequence of distinct CS targets) would "settle down" into a code shape that would be sub-optimal for any single target, but (as an indirect MH call) reasonable for all the targets together. 
>> 
>> In the absence of clear direction from the user or the profile, the JVM has to choose infinite recompilation or a good-enough final compilation. The latter choice is safer. And the infinite recompilation is less safe because there is no intrinsic bound on the amount of machine cycles that could be diverted to recompilation, given a dynamic language application with enough mega-mutable CSs. Settling down to a network of indirect calls has a bounded cost. 
>> 
>> Yes, one size-fits-all tactics never please everybody. But the JVM should not choose tactics with unlimited downsides. 
> 
> -- John


From roland.westrelin at oracle.com  Wed Jan 20 07:35:27 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 20 Jan 2016 08:35:27 +0100
Subject: RFR(XS): 8147386: assert(size == calc_size) failed: incorrect
	size calculattion x86_32.ad
In-Reply-To: <C568518E7B433348B114B6A7122D474756DBEFAE@FMSMSX102.amr.corp.intel.com>
References: <9CE48190-9B0F-4571-937D-5F4162EA5296@oracle.com>
	<569E7E8A.5080809@oracle.com>
	<C568518E7B433348B114B6A7122D474756DBEFAE@FMSMSX102.amr.corp.intel.com>
Message-ID: <840AB1B7-C249-44BB-BC1D-1540A8EEA24C@oracle.com>

Thanks for the reviews Vladimir, Vladimir and Michael.

Roland.

From andreas.eriksson at oracle.com  Wed Jan 20 09:26:16 2016
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Wed, 20 Jan 2016 10:26:16 +0100
Subject: RFR(S): 8146096: [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <569EA0A9.8050406@oracle.com>
References: <569E2CDE.3060805@oracle.com> <569E3109.8090107@oracle.com>
	<569EA0A9.8050406@oracle.com>
Message-ID: <569F52B8.7020802@oracle.com>

Vladimir Kozlov and Vladimir Ivanov,

Ok, I'll look into using the whitebox api to fix the test.
Thanks for looking at this.

- Andreas

On 2016-01-19 21:46, Vladimir Kozlov wrote:
> Simply using a timeout to check for a generated safepoint is a bad idea. It is 
> very inaccurate. At the least you need to check the call stack to see if it 
> stopped in a compiled method.
> I would prefer to see a new WB interface which would check that the loop 
> SafePointNode is generated during compilation of the method. It will be 
> precise.
>
> And we need such tests to make sure a feature is working - we can't 
> remove them.
>
> Thanks,
> Vladimir
>
> On 1/19/16 4:50 AM, Vladimir Ivanov wrote:
>> As an idea to improve the test: spawn a thread which executes the 
>> counted loop and then use WhiteBox.forceSafepoint() to
>> trigger a safepoint.
>>
>> If the test times out, it means there's no safepoint in the loop.
>>
>> It also simplifies the implementation - no need to spawn a 
>> child process, the check can be done in-process.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 1/19/16 3:32 PM, Andreas Eriksson wrote:
>>> Hi,
>>>
>>> Can I please have a review for the removal of
>>> hotspot/test/compiler/loopopts/UseCountedLoopSafepoints.java.
>>>
>>> The test needs to do a loop that takes more than two seconds to execute
>>> fully without doing a safepointing call. For this expensive atomic
>>> operations were used. The problem is that on certain embedded platforms
>>> they are too expensive, and the test times out.
>>> The loop length could probably be reduced, and it should still work on
>>> faster machines. However, the test is not very useful, so I think it's
>>> better to just remove it to avoid future problems.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8146096
>>> Test to be removed:
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/d84a55e7aaf8/test/compiler/loopopts/UseCountedLoopSafepoints.java 
>>>
>>>
>>> (I can create a webrev if you think it necessary.)
>>>
>>> Thanks,
>>> Andreas


From aph at redhat.com  Wed Jan 20 09:47:11 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 20 Jan 2016 09:47:11 +0000
Subject: [aarch64-port-dev ] RFR(M): 8145336: PPC64: fix string intrinsics
	after CompactStrings change
In-Reply-To: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
Message-ID: <569F579F.4060906@redhat.com>

On 19/01/16 18:57, Volker Simonis wrote:
> The change also affects aarch64 (although it is minimal and I don't
> expect it to break anything) so I cc-ed aarch64-port-dev.
> 
> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
> https://bugs.openjdk.java.net/browse/JDK-8145336

That's fine by us.  We only defined UseSSE42Intrinsics in order to get
the String.indexOf intrinsic.  Of course, we should really have done it
some other way but we were working on our own outside the main HotSpot
tree.

Andrew.


From volker.simonis at gmail.com  Wed Jan 20 10:20:27 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 20 Jan 2016 11:20:27 +0100
Subject: [aarch64-port-dev ] RFR(M): 8145336: PPC64: fix string intrinsics
	after CompactStrings change
In-Reply-To: <569F579F.4060906@redhat.com>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
	<569F579F.4060906@redhat.com>
Message-ID: <CA+3eh13jLgY6hOW0CvwU0GcGUks0nHE41r=fJqvOpXSD98nDOg@mail.gmail.com>

Hi Andrew,

thanks for looking at it.

Regards,
Volker


On Wed, Jan 20, 2016 at 10:47 AM, Andrew Haley <aph at redhat.com> wrote:
> On 19/01/16 18:57, Volker Simonis wrote:
>> The change also affects aarch64 (although it is minimal and I don't
>> expect it to break anything) so I cc-ed aarch64-port-dev.
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
>> https://bugs.openjdk.java.net/browse/JDK-8145336
>
> That's fine by us.  We only defined UseSSE42Intrinsics in order to get
> the String.indexOf intrinsic.  Of course, we should really have done it
> some other way but we were working on our own outside the main HotSpot
> tree.
>
> Andrew.
>

From vladimir.x.ivanov at oracle.com  Wed Jan 20 11:23:47 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 20 Jan 2016 14:23:47 +0300
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <569E9E69.4070202@oracle.com>
References: <569CE098.4030807@oracle.com> <569E9E69.4070202@oracle.com>
Message-ID: <569F6E43.4080801@oracle.com>

Thanks for review, Vladimir.

I decided to remove the field declaration because SA doesn't access it.

Best regards,
Vladimir Ivanov

On 1/19/16 11:36 PM, Vladimir Kozlov wrote:
> Looks fine but in vmStructs.cpp you should replace the field declaration
> instead of just removing old one.
> Also look if SA access it.
>
> Thanks,
> Vladimir
>
> On 1/18/16 4:54 AM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/7177745/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-7177745
>>
>> JVM aggressively inlines through CallSites, even for mutable and
>> volatile flavors. It's the key optimization for making
>> invokedynamic performant.
>>
>> When a CallSite.target is updated, JVM invalidates all affected
>> nmethods and tries to recompile them later. If a call site
>> target regularly changes, JVM will eventually mark (after
>> PerMethodRecompilationCutoff invalidations) all hot methods
>> which have the call site bound as non-compilable. It leads to
>> significant peak performance reduction, because all
>> affected methods will always be executed in interpreter mode since then.
>>
>> The fix is to avoid updating recompilation count when corresponding
>> nmethod is invalidated due to a call site target
>> change.
>>
>> I filed a separate RFE (JDK-8147550 [1]) to consider slow non-inlined
>> code shape for unstable call sites, as John
>> suggested [2].
>>
>> Testing: regression test, octane, JPRT.
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8147550
>> [2]
>> https://bugs.openjdk.java.net/browse/JDK-7177745?focusedCommentId=13821545&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13821545
>>
>>

From martin.doerr at sap.com  Wed Jan 20 11:43:40 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 20 Jan 2016 11:43:40 +0000
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after
	CompactStrings	change
In-Reply-To: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228C120@DEWDFEMB19C.global.corp.sap>

Hi Volker,

thank you very much for adapting the non-CompactStrings version of the intrinsics. I especially like that you changed shared code to improve matching of special cases.

Here are some minor change requests:
- I guess you will have to adapt Copyright messages.
- There's a typo in the new comment in library_call: "optimzed".
- The comment for the instruction count (used for loop alignment) is wrong in MacroAssembler::string_indexof_1 (should start with 3 instead of 2).

I have more change requests regarding ppc.ad:

The computation of chr is incorrect for little endian in string_indexOf_imm1_char and string_indexOf_imm1.

Some numbers for compute_padding should be adapted:
int string_indexOf_imm1_charNode::compute_padding(int current_offset) const { return (3*4-current_offset)&31; } 
int string_indexOfCharNode::compute_padding(int current_offset) const { return (3*4-current_offset)&31; } 
int string_compareNode::compute_padding(int current_offset) const { return (2*4-current_offset)&31; }
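
For readers unfamiliar with the idiom, here is a minimal Java model of the arithmetic in these compute_padding expressions. It only illustrates what "(k*4 - current_offset) & 31" evaluates to: the smallest non-negative padding that makes the padded offset congruent to k*4 modulo 32. The class and method names are hypothetical, and why a given k is the right choice for each node is not modeled here.

```java
// Illustrative model only; PaddingSketch and its methods are hypothetical
// names, not HotSpot code.
final class PaddingSketch {
    // The "& 31" mask in the quoted expressions implies a 32-byte window.
    static final int WINDOW_MASK = 31;

    /**
     * Mirrors "(targetResidue - currentOffset) & 31": the smallest padding p
     * in [0, 31] such that (currentOffset + p) % 32 == targetResidue.
     * The mask also handles the case where currentOffset > targetResidue,
     * because Java's & on a negative int works on the two's-complement bits.
     */
    static int computePadding(int targetResidue, int currentOffset) {
        return (targetResidue - currentOffset) & WINDOW_MASK;
    }

    public static void main(String[] args) {
        // 3*4 corresponds to the suggested value for string_indexOf_imm1_char.
        for (int offset = 0; offset < 64; offset += 4) {
            int padding = computePadding(3 * 4, offset);
            System.out.println("offset=" + offset + " padding=" + padding
                    + " padded=" + (offset + padding));
        }
    }
}
```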

Some kill effects are missing:
- ctr in all string_indexOf nodes
- cr0, cr1 in string_indexOf_imm1, string_indexOfChar

The new comment for string_indexOfChar claims "// Kill ... needle" which is not true.

Thanks,
 Martin


-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Volker Simonis
Sent: Dienstag, 19. Januar 2016 19:57
To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Cc: ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after CompactStrings change

Hi,

can somebody please review and sponsor this change.

Despite the bug summary, I still had to do some small shared changes
to make this work, so unfortunately I can not push this on my own.

The change also affects aarch64 (although it is minimal and I don't
expect it to break anything) so I cc-ed aarch64-port-dev.

http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
https://bugs.openjdk.java.net/browse/JDK-8145336

As described in the bug, this change only fixes the string intrinsics
for the -XX:-UseCompactStrings mode which is still the default on
ppc64. Additionally, support for the new StrIndexOfChar intrinsic was
added because we already had a similar intrinsic for constant string
needles of length one anyway. A later change (which we're already
working on) will add the intrinsics which can handle compact strings.

The current intrinsics can handle both, the new byte-array based
string representation as well as the old char-array based string
representation because we internally still use the new hotspot with
older versions of the class libraries.

I've also ported some of our internal string tests into a new
regression test (TestStringIntrinsics2.java) because the existing
tests didn't exercise all of our intrinsics.

Following the shared changes I had to do:

Until now, UseSSE42Intrinsics was a global shared option which was
used to control the availability of the stringIndexOf intrinsics. But
UseSSE42Intrinsics is actually a x86-specific feature so it doesn't
make a lot of sense to define it for other architectures. I've
therefore moved the flag to globals_x86.hpp and changed the condition
which checks for the availability of the stringIndexOf intrinsics from:

if (!Matcher::has_match_rule(Op_StrIndexOf) || !UseSSE42Intrinsics) {

to:

if (!Matcher::match_rule_supported(Op_StrIndexOf)) {

The Matcher::match_rule_supported() method already calls
Matcher::has_match_rule() anyway. And it is implemented in the .ad
file so I've moved the check for UseSSE42Intrinsics into x86.ad. Other
platforms can now decide in their .ad file if they unconditionally
support the intrinsic or if they need a special feature check. This
change was already briefly discussed in [1].

The other shared change I had to make was in
LibraryCallKit::make_string_method_node() for the "Op_StrEquals" case.
We have optimized intrinsics for the case that one of the strings to
compare is constant, but the  StrEqualsNode is constructed without
taking into account that one of the string length values could be a
constant. This prevented our optimized instruction from being matched
in the ad-file.

All the other changes are ppc-specific.

Thank you and best regards,
Volker


[1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/thread.html#20400

From vladimir.x.ivanov at oracle.com  Wed Jan 20 11:54:00 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 20 Jan 2016 14:54:00 +0300
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
References: <569CE098.4030807@oracle.com>
	<894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
Message-ID: <569F7558.1030800@oracle.com>

John, Chris, thanks for the feedback.

I don't think it is only about microbenchmarks. Long-running large 
applications with lots of mutable call sites should also benefit from 
this change. Current JVM behavior counts invalidations on root method, 
so nmethods with multiple mutable call sites (from root & all inlined 
callees) are more likely to hit the limit, even if there's no 
mega-mutable sites. It just sums up and PerMethodRecompilationCutoff (= 
400, by default) doesn't look like a huge number.
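
To make the scenario concrete, here is a minimal sketch of a call site whose target is changed regularly. It only shows the relinking pattern with the public java.lang.invoke API; whether a given run actually reaches PerMethodRecompilationCutoff depends on JIT heuristics and is not demonstrated. The class name and the two target methods are made up for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical example class; not taken from the webrev or its regression test.
final class RelinkSketch {
    static int addOne(int x) { return x + 1; }
    static int addTwo(int x) { return x + 2; }

    /** Relinks one call site 'relinks' times and calls through it after each change. */
    static long run(int relinks) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class, int.class);
            MethodHandle one = lookup.findStatic(RelinkSketch.class, "addOne", type);
            MethodHandle two = lookup.findStatic(RelinkSketch.class, "addTwo", type);

            MutableCallSite site = new MutableCallSite(one);
            // The dynamic invoker always dispatches to the site's current target.
            MethodHandle invoker = site.dynamicInvoker();

            long sum = 0;
            for (int i = 0; i < relinks; i++) {
                // Each target change is the kind of event that invalidates
                // compiled code which has inlined through this call site.
                site.setTarget((i % 2 == 0) ? one : two);
                sum += (int) invoker.invokeExact(0);
            }
            return sum;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(1000));
    }
}
```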

Also, LambdaForm sharing somewhat worsens the situation. When LambdaForms 
were mostly customized, different method handle chains were compiled 
into a single nmethod. Right now, it means that not only the root method 
is always interpreted, but all bound method handle chains are broken 
into numerous per-LF nmethods (see JDK-8069591 for some details).

MLVM folks, I'd like to hear your opinion about what kind of behavior do 
you expect from JVM w.r.t. mutable call sites.

There are valid use-cases when JVM shouldn't throttle the recompilation 
(e.g., long-running application with indy-based dynamic tracing). Maybe 
there's a place for a new CallSite flavor to clearly communicate 
application expectations to the JVM? Either always recompile (thus 
eventually reaching peak performance) or give up and generate less 
efficient machine code, but save on possible recompilations.

Best regards,
Vladimir Ivanov

On 1/20/16 2:37 AM, John Rose wrote:
> On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov
> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>>
>> The fix is to avoid updating recompilation count when corresponding
>> nmethod is invalidated due to a call site target change.
>
> Although I'm not vetoing it (since it seems it will help customers in
> the short term), I'm uncomfortable with this fix because it doesn't
> scale to large dyn. lang. applications with many unstable call sites.
>   Put another way, it feels like we are duct-taping down a failsafe
> switch (against infinite recompilation) in order to spam a
> micro-benchmark:  a small number of mega-mutable call sites for which we
> are willing to spend (potentially) all of the JIT resources, including
> those usually allocated to application performance in the steady state.
>   Put a third way:  I am not comfortable with unthrottled infinite
> recompilation as a performance strategy.
>
> I've commented on the new RFE (JDK-8147550) where to go next, including
> the following sentiments:
>
>> There is a serious design tension here, though: Some users apparently
>> are willing to endure an infinite series of recompilations as part of
>> the cost of doing business; JDK-7177745 addresses this need by turning
>> off the fail-safe against (accidental, buggy) infinite recompilation
>> for unstable CSs. Other users might find that having a percentage of
>> machine time devoted to recompilation is a problem. (This has been the
>> case in the past with non-dynamic languages, at least.) The code shape
>> proposed in this bug report would cover all simple unstable call
>> sites (bi-stable, for example, would compile to a bi-morphic call),
>> but, in pathological cases (infinite sequence of distinct CS targets)
>> would "settle down" into a code shape that would be sub-optimal for
>> any single target, but (as an indirect MH call) reasonable for all the
>> targets together.
>>
>> In the absence of clear direction from the user or the profile, the
>> JVM has to choose infinite recompilation or a good-enough final
>> compilation. The latter choice is safer. And the
>> infinite recompilation is less safe because there is no intrinsic
>> bound on the amount of machine cycles that could be diverted to
>> recompilation, given a dynamic language application with
>> enough mega-mutable CSs. Settling down to a network of indirect calls
>> has a bounded cost.
>>
>> Yes, one size-fits-all tactics never please everybody. But the JVM
>> should not choose tactics with unlimited downsides.
>
> -- John

From forax at univ-mlv.fr  Wed Jan 20 12:13:29 2016
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 20 Jan 2016 13:13:29 +0100 (CET)
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
References: <569CE098.4030807@oracle.com>
	<894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
Message-ID: <2036838501.1079316.1453292009390.JavaMail.zimbra@u-pem.fr>

Hi John, 
I understand that a VM that may recompile forever can be seen as a bug, 
but a VM that bails out and stops recompiling, or more generally changes its compilation strategy, is a bug too. 

The problem here is that a dyn lang runtime has no way to know how the VM will treat a callsite: whether the VM decides to stop recompiling, decides not to inline, decides to inline only some part of the tree, etc. 
Said differently, invokedynamic lets you create code shapes that change dynamically; if the VM behavior also changes dynamically, it's like building a wall on moving parts, and the result is strange dynamic behavior that is hard to diagnose and reproduce. 

The recompilation behavior of the VM should be kept simple and predictable: basically, the VM should always recompile the CS with no failsafe switch. 
If dyn lang runtime devs have trouble with that, they can already use an exactInvoker to simulate an indirect mh call and we can even provide new method handle combiners to gracefully handle multi-stable CS. 
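
As an illustration of the exactInvoker idea, the sketch below routes every call through MethodHandles.exactInvoker, passing the current target as an ordinary argument instead of binding it into a call site. This is only one possible reading of "simulate an indirect mh call"; the class, the 'current' field and the two targets are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example class, written for illustration only.
final class IndirectCallSketch {
    static int twice(int x) { return 2 * x; }
    static int negate(int x) { return -x; }

    // Hypothetical mutable slot a runtime might use instead of relinking a call site.
    static volatile MethodHandle current;

    /** Alternates the target each round and calls it indirectly; returns the sum. */
    static int callThroughInvoker(int rounds) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(int.class, int.class);
            MethodHandle twiceMH = lookup.findStatic(IndirectCallSketch.class, "twice", type);
            MethodHandle negateMH = lookup.findStatic(IndirectCallSketch.class, "negate", type);

            // Type (MethodHandle, int)int: invokes its first argument on the remaining ones.
            MethodHandle invoker = MethodHandles.exactInvoker(type);

            int acc = 0;
            for (int i = 0; i < rounds; i++) {
                current = (i % 2 == 0) ? twiceMH : negateMH;
                // The target is plain data here, so changing it relinks nothing.
                acc += (int) invoker.invokeExact(current, 3);
            }
            return acc;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughInvoker(4));
    }
}
```

The trade-off is the one discussed in this thread: the call is not tied to a single target, so a target change does not require recompilation, but the JIT also has less opportunity to inline through it.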

regards, 
Rémi 

----- Original Message -----

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Vladimir Ivanov" <vladimir.x.ivanov at oracle.com>
> Cc: "hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
> Sent: Wednesday, January 20, 2016 00:37:29
> Subject: Re: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
> target method to always run in interpreter mode

> On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov < vladimir.x.ivanov at oracle.com >
> wrote:

> > The fix is to avoid updating recompilation count when corresponding nmethod
> > is invalidated due to a call site target change.
> 

> Although I'm not vetoing it (since it seems it will help customers in the
> short term), I'm uncomfortable with this fix because it doesn't scale to
> large dyn. lang. applications with many unstable call sites. Put another
> way, it feels like we are duct-taping down a failsafe switch (against
> infinite recompilation) in order to spam a micro-benchmark: a small number of
> mega-mutable call sites for which we are willing to spend (potentially) all
> of the JIT resources, including those usually allocated to application
> performance in the steady state. Put a third way: I am not comfortable with
> unthrottled infinite recompilation as a performance strategy.

> I've commented on the new RFE (JDK-8147550) where to go next, including the
> following sentiments:

> > There is a serious design tension here, though: Some users apparently are
> > willing to endure an infinite series of recompilations as part of the cost
> > of doing business; JDK-7177745 addresses this need by turning off the
> > fail-safe against (accidental, buggy) infinite recompilation for unstable
> > CSs. Other users might find that having a percentage of machine time
> > devoted
> > to recompilation is a problem. (This has been the case in the past with
> > non-dynamic languages, at least.) The code shape proposed in this bug
> > report
> > would cover all simple unstable call sites (bi-stable, for example, would
> > compile to a bi-morphic call), but, in pathological cases (infinite
> > sequence
> > of distinct CS targets) would "settle down" into a code shape that would be
> > sub-optimal for any single target, but (as an indirect MH call) reasonable
> > for all the targets together.
> 

> > In the absence of clear direction from the user or the profile, the JVM has
> > to choose infinite recompilation or a good-enough final compilation. The
> > latter choice is safer. And the infinite recompilation is less safe because
> > there is no intrinsic bound on the amount of machine cycles that could be
> > diverted to recompilation, given a dynamic language application with enough
> > mega-mutable CSs. Settling down to a network of indirect calls has a
> > bounded
> > cost.
> 

> > Yes, one size-fits-all tactics never please everybody. But the JVM should
> > not
> > choose tactics with unlimited downsides.
> 

> -- John

From hui.shi at linaro.org  Wed Jan 20 13:30:31 2016
From: hui.shi at linaro.org (Hui Shi)
Date: Wed, 20 Jan 2016 21:30:31 +0800
Subject: RFR(s): AARCH64: 8147805: C1 segmentation fault due to inline
	Unsafe::getAndSetObject
Message-ID: <CAF1YaiBtfQtH11y0Bq6-cjuLjGGDKidrneL+r4W-4FHcbcdgmA@mail.gmail.com>

Hi All,

Could someone help review this AArch64 C1 issue? The issue happens when C1 inlines
unsafe.getAndSet(data) and the UseCompressedOops flag is true: the register
is compressed for the store, but it is not restored to its decompressed form.
Later the compressed value is used as a reference address and things go wrong.

Bug:  https://bugs.openjdk.java.net/browse/JDK-8147805
webrev: http://cr.openjdk.java.net/~hshi/8147805/webrev/
Small test case in http://cr.openjdk.java.net/~hshi/8147805/TestUnsafe.java
Crash can be reproduced by java -XX:TieredStopAtLevel=3
-XX:+TieredCompilation -Xms4G -Xmx4G TestUnsafe

In the following method, n is stored twice: first in unsafe.getAndSet,
and again when storing old.next.
  public Node foo(Node n) {
    Node old;
    // inline sun.misc.Unsafe::getAndSetObject here, shift first time for store
    old = this.getAndSet(n);
    // n is used again and stored into old.next, shift again for store
    old.next = n;
    return old;
  }

In the generated assembly, we can see that "x2" is shifted but not restored:
  0x0000007f943af3dc: lsr       x2, x2, #3    // x2 is shifted but not restored
  0x0000007f943af3e0: add       x4, x1, #0xc
  0x0000007f943af3e4: ldaxr     w3, [x4]
  0x0000007f943af3e8: stlxr     w9, w2, [x4]
  0x0000007f943af3ec: cbnz      w9, 0x0000007f943af3e4
  0x0000007f943af3f0: lsl       x3, x3, #3
  0x0000007f943af3f4: dmb       ish

  0x0000007f943af504: lsr       x8, x2, #3  // x2 is shifted again and wrong
  0x0000007f943af508: str       w8, [x0,#16]
  0x0000007f943af50c: lsr       x2, x0, #9
  0x0000007f943af510: strb      wzr, [x2,x1,lsl #0]  ;*putfield next
                                                ; - TestUnsafe::foo at 11 (line 25)

The patch uses rscratch1 to hold the heap_oop address for the store when
UseCompressedOops is true,
so later uses still get the correct object address.
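
The effect can be modeled outside the VM with plain longs standing in for registers. The following Java sketch is an illustration under stated assumptions (zero-based compressed oops with a shift of 3, as the "lsr ..., #3" instructions above suggest); the class and method names are hypothetical and none of this is HotSpot code.

```java
// Hypothetical model of the register contents; not HotSpot code.
final class CompressedOopSketch {
    // Assumption: zero heap base and 8-byte object alignment, i.e. shift by 3.
    static final int SHIFT = 3;

    static long encode(long address) { return address >>> SHIFT; }
    static long decode(long narrow)  { return narrow << SHIFT; }

    /** Encodes in place for the first store; returns {firstStore, secondStore}. */
    static long[] storesEncodingInPlace(long n) {
        long x2 = n;
        x2 = encode(x2);            // getAndSet: x2 now holds the narrow oop
        long first = x2;
        long second = encode(x2);   // putfield: encodes an already-encoded value
        return new long[] { first, second };
    }

    /** Encodes into a scratch register for the first store; x2 stays intact. */
    static long[] storesEncodingIntoScratch(long n) {
        long x2 = n;
        long scratch = encode(x2);  // getAndSet: only the scratch copy is narrow
        long first = scratch;
        long second = encode(x2);   // putfield: encodes the original address
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        long address = 0x100000040L; // arbitrary 8-byte-aligned example address
        long bad = decode(storesEncodingInPlace(address)[1]);
        long good = decode(storesEncodingIntoScratch(address)[1]);
        System.out.println(Long.toHexString(bad) + " vs " + Long.toHexString(good));
    }
}
```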

Regards
Hui

From roland.schatz at oracle.com  Wed Jan 20 13:35:21 2016
From: roland.schatz at oracle.com (Roland Schatz)
Date: Wed, 20 Jan 2016 14:35:21 +0100
Subject: RFR: 8147599: [JVMCI] simplify code installation interface
Message-ID: <569F8D19.4090305@oracle.com>

Hi,

Please review this change to the JVMCI code installation interface:

webrev: http://cr.openjdk.java.net/~rschatz/JDK-8147599/webrev.00/
jira: https://bugs.openjdk.java.net/browse/JDK-8147599

The new classes in the jdk.vm.ci.code.site package used to be inner 
classes in the removed CompilationResult class, no actual code changes 
there.

Thanks,
Roland

From aph at redhat.com  Wed Jan 20 14:12:57 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 20 Jan 2016 14:12:57 +0000
Subject: RFR(s): AARCH64: 8147805: C1 segmentation fault due to inline
	Unsafe::getAndSetObject
In-Reply-To: <CAF1YaiBtfQtH11y0Bq6-cjuLjGGDKidrneL+r4W-4FHcbcdgmA@mail.gmail.com>
References: <CAF1YaiBtfQtH11y0Bq6-cjuLjGGDKidrneL+r4W-4FHcbcdgmA@mail.gmail.com>
Message-ID: <569F95E9.1070202@redhat.com>

On 01/20/2016 01:30 PM, Hui Shi wrote:
> Could someone help review this AArch64 C1 issue?

OK, thanks, I'm looking at this to make sure this problem does not exist
elsewhere.

Andrew.


From edward.nevill at gmail.com  Wed Jan 20 14:21:25 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Wed, 20 Jan 2016 14:21:25 +0000
Subject: [aarch64-port-dev ] RFR(s): AARCH64: 8147805: C1 segmentation
	fault due to inline Unsafe::getAndSetObject
In-Reply-To: <CAF1YaiBtfQtH11y0Bq6-cjuLjGGDKidrneL+r4W-4FHcbcdgmA@mail.gmail.com>
References: <CAF1YaiBtfQtH11y0Bq6-cjuLjGGDKidrneL+r4W-4FHcbcdgmA@mail.gmail.com>
Message-ID: <1453299685.3772.2.camel@mint>

On Wed, 2016-01-20 at 21:30 +0800, Hui Shi wrote:
> Hi All,
> 
> Could someone help review this AArch64 C1 issue? It happens when C1 inlines
> unsafe.getAndSet(data) with UseCompressedOops enabled: the register is
> compressed for the store but is not restored to its decompressed form.
> The compressed value is later used as a reference address, which goes wrong.
> 
> Bug:  https://bugs.openjdk.java.net/browse/JDK-8147805
> webrev: http://cr.openjdk.java.net/~hshi/8147805/webrev/
> Small test case in http://cr.openjdk.java.net/~hshi/8147805/TestUnsafe.java
> Crash can be reproduced by java -XX:TieredStopAtLevel=3
> -XX:+TieredCompilation -Xms4G -Xmx4G TestUnsafe

Hi Hui Shi,

Thanks for finding this. Your change looks correct, but may I suggest the following smaller change, which achieves the same result?

diff -r 46c1abd5c34d src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp    Tue Jan 12 14:55:15 2016 +0000
+++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp    Wed Jan 20 14:16:56 2016 +0000
@@ -3169,7 +3169,8 @@
       Register obj = as_reg(data);
       Register dst = as_reg(dest);
       if (is_oop && UseCompressedOops) {
-        __ encode_heap_oop(obj);
+        __ encode_heap_oop(rscratch1, obj);
+        obj = rscratch1;
       }
       assert_different_registers(obj, addr.base(), tmp, rscratch2, dst);
       Label again;

Regards,
Ed.


From aph at redhat.com  Wed Jan 20 14:33:35 2016
From: aph at redhat.com (Andrew Haley)
Date: Wed, 20 Jan 2016 14:33:35 +0000
Subject: [aarch64-port-dev ] RFR(s): AARCH64: 8147805: C1 segmentation
	fault due to inline Unsafe::getAndSetObject
In-Reply-To: <1453299685.3772.2.camel@mint>
References: <CAF1YaiBtfQtH11y0Bq6-cjuLjGGDKidrneL+r4W-4FHcbcdgmA@mail.gmail.com>
	<1453299685.3772.2.camel@mint>
Message-ID: <569F9ABF.5070501@redhat.com>

On 01/20/2016 02:21 PM, Edward Nevill wrote:
> On Wed, 2016-01-20 at 21:30 +0800, Hui Shi wrote:
>> Hi All,
>>
>> Could someone help review this AArch64 C1 issue? It happens when C1 inlines
>> unsafe.getAndSet(data) with UseCompressedOops enabled: the register is
>> compressed for the store but is not restored to its decompressed form.
>> The compressed value is later used as a reference address, which goes wrong.
>>
>> Bug:  https://bugs.openjdk.java.net/browse/JDK-8147805
>> webrev: http://cr.openjdk.java.net/~hshi/8147805/webrev/
>> Small test case in http://cr.openjdk.java.net/~hshi/8147805/TestUnsafe.java
>> Crash can be reproduced by java -XX:TieredStopAtLevel=3
>> -XX:+TieredCompilation -Xms4G -Xmx4G TestUnsafe
> 
> Hi Hui Shi,
> 
> Thanks for finding this. Your change looks correct, but may I suggest the following smaller change, which achieves the same result?
> 
> diff -r 46c1abd5c34d src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp
> --- a/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp    Tue Jan 12 14:55:15 2016 +0000
> +++ b/src/cpu/aarch64/vm/c1_LIRAssembler_aarch64.cpp    Wed Jan 20 14:16:56 2016 +0000
> @@ -3169,7 +3169,8 @@
>        Register obj = as_reg(data);
>        Register dst = as_reg(dest);
>        if (is_oop && UseCompressedOops) {
> -        __ encode_heap_oop(obj);
> +        __ encode_heap_oop(rscratch1, obj);
> +        obj = rscratch1;
>        }
>        assert_different_registers(obj, addr.base(), tmp, rscratch2, dst);
>        Label again;

I agree.  I have tried this and it works well.  The patch is OK with
this change.

Andrew.


From duncan.macgregor at ge.com  Wed Jan 20 14:53:16 2016
From: duncan.macgregor at ge.com (MacGregor, Duncan (GE Energy Management))
Date: Wed, 20 Jan 2016 14:53:16 +0000
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <569F7558.1030800@oracle.com>
References: <569CE098.4030807@oracle.com>
	<894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
	<569F7558.1030800@oracle.com>
Message-ID: <D2C5346D.1174C1%duncan.macgregor@ge.com>

I was going to say it is unlikely to matter in production cases but might
well hit test code which does extensive meta-programming, but actually,
since it's a question of invalidations across _all_ sites, rather than any
single one, I think it might make a difference. I'll need to take a look at
what our compilation counts eventually come to and experiment with
changing the limits. We did work quite early on to limit the extent of
call site invalidations.

One thing that might affect this is how megamorphic call sites are
handled. At the moment we keep a cache of classes, method handles, and
switch points, and we check the switch point before calling the method
handle. I had considered a change to bind the switch points to the method
handles and thus allow those checks to be optimised out for methods called
extensively from megamorphic call sites. Would that also fall foul of the
compilation count being increased?

I think there is definitely room for communicating more about the nature of a
call site to the JIT. Whether this should be around recompilation or
perhaps more focused on inlining and type specialisation to avoid
invalidations and recompilation would be my question. For example, method
invocation sites may go megamorphic, and this currently forms a barrier to
the JIT seeing the types in a way that doesn't really exist with standard
invokeVirtual sites. If there was some feedback loop allowing sites to be
cloned as methods are inlined, and a way to indicate this was allowed or
desired, then that might allow significantly more optimisations to happen
in invokeDynamic based languages. It would also probably be a horror to
implement in the current model, but I'm sure you guys can fix all that. :-)

Duncan.

On 20/01/2016, 11:54, "mlvm-dev on behalf of Vladimir Ivanov"
<mlvm-dev-bounces at openjdk.java.net on behalf of
vladimir.x.ivanov at oracle.com> wrote:
>MLVM folks, I'd like to hear your opinion about what kind of behavior do
>you expect from JVM w.r.t. mutable call sites.
>
>There are valid use-cases when JVM shouldn't throttle the recompilation
>(e.g., long-running application with indy-based dynamic tracing). Maybe
>there's a place for a new CallSite flavor to clearly communicate
>application expectations to the JVM? Either always recompile (thus
>eventually reaching peak performance) or give up and generate less
>efficient machine code, but save on possible recompilations.
>
>Best regards,
>Vladimir Ivanov
>
>On 1/20/16 2:37 AM, John Rose wrote:
>> On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov
>> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>>
>>wrote:
>>>
>>> The fix is to avoid updating recompilation count when corresponding
>>> nmethod is invalidated due to a call site target change.
>>
>> Although I'm not vetoing it (since it seems it will help customers in
>> the short term), I'm uncomfortable with this fix because it doesn't
>> scale to large dyn. lang. applications with many unstable call sites.
>>   Put another way, it feels like we are duct-taping down a failsafe
>> switch (against infinite recompilation) in order to spam a
>> micro-benchmark:  a small number of mega-mutable call sites for which we
>> are willing to spend (potentially) all of the JIT resources, including
>> those usually allocated to application performance in the steady state.
>>   Put a third way:  I am not comfortable with unthrottled infinite
>> recompilation as a performance strategy.
>>
>> I've commented on the new RFE (JDK-8147550) where to go next, including
>> the following sentiments:
>>
>>> There is a serious design tension here, though: Some users apparently
>>> are willing to endure an infinite series of recompilations as part of
>>> the cost of doing business; JDK-7177745 addresses this need by turning
>>> off the fail-safe against (accidental, buggy) infinite recompilation
>>> for unstable CSs. Other users might find that having a percentage of
>>> machine time devoted to recompilation is a problem. (This has been the
>>> case in the past with non-dynamic languages, at least.) The code shape
>>> proposed in this bug report would cover all simple unstable call
>>> sites (bi-stable, for example, would compile to a bi-morphic call),
>>> but, in pathological cases (infinite sequence of distinct CS targets)
>>> would "settle down" into a code shape that would be sub-optimal for
>>> any single target, but (as an indirect MH call) reasonable for all the
>>> targets together.
>>>
>>> In the absence of clear direction from the user or the profile, the
>>> JVM has to choose infinite recompilation or a good-enough final
>>> compilation. The latter choice is safer. And the
>>> infinite recompilation is less safe because there is no intrinsic
>>> bound on the amount of machine cycles that could be diverted to
>>> recompilation, given a dynamic language application with
>>> enough mega-mutable CSs. Settling down to a network of indirect calls
>>> has a bounded cost.
>>>
>>> Yes, one size-fits-all tactics never please everybody. But the JVM
>>> should not choose tactics with unlimited downsides.
>>
>> -- John
>_______________________________________________
>mlvm-dev mailing list
>mlvm-dev at openjdk.java.net
>http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


From roland.schatz at oracle.com  Wed Jan 20 15:28:33 2016
From: roland.schatz at oracle.com (Roland Schatz)
Date: Wed, 20 Jan 2016 16:28:33 +0100
Subject: RFR and workflow question
Message-ID: <569FA7A1.1080200@oracle.com>

Hi!

Please review this small bugfix:
http://cr.openjdk.java.net/~rschatz/JDK-8147475/webrev.00/

This is on top of another webrev[1]. It has no semantic dependency on 
it, just a random source-level conflict. I can rebase them if we want to 
integrate this one first.


This is missing a commit message, and I'm not sure what to use here.

There are two issues that I *think* this will solve, but I can't be sure 
since I haven't managed to reproduce them:
https://bugs.openjdk.java.net/browse/JDK-8147475
https://bugs.openjdk.java.net/browse/JDK-8146608

Do I just use one of the above issues randomly for the commit message, 
and close the other as duplicate?


Thanks,
Roland

[1] http://cr.openjdk.java.net/~rschatz/JDK-8147599/webrev.00/

From igor.ignatyev at oracle.com  Wed Jan 20 16:05:09 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 20 Jan 2016 19:05:09 +0300
Subject: RFR(XS) : 8141557 : TestResolvedJavaMethod.java times out after
	1000 ms
In-Reply-To: <3DDA7A22-74CF-400A-A403-9CE70655ABD5@oracle.com>
References: <52CAB89A-1AA5-4545-9C4B-DD2A6880E463@oracle.com>
	<3DDA7A22-74CF-400A-A403-9CE70655ABD5@oracle.com>
Message-ID: <613621B5-925D-4E72-A115-1480F47BA43C@oracle.com>

Hi Chris,

thank you for review.

Yes, it's a typo. I've fixed that (s/TestAnnotionation/TestAnnotation/g) and added an explicit 'value =':
http://cr.openjdk.java.net/~iignatyev/8141557/webrev.01/

Thanks,
Igor


> On Jan 20, 2016, at 3:45 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> I suppose TestAnnotionation is a typo?
> 
> +    @TestAnnotionation(1000L)
> 
> Could you change that to value = 1000L?  Just for extra clarity.  Then it looks good.
> 
>> On Jan 19, 2016, at 11:06 AM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
>> 
>> http://cr.openjdk.java.net/~iignatyev/8141557/webrev.00/
>>> 22 lines changed: 16 ins; 0 del; 6 mod; 
>> 
>> Hi all,
>> 
>> Could you please review the fix for 8141557?
>> 
>> The test uses the timeout value of org.junit.Test to test reading annotations via JVMCI. In some cases, e.g. on embedded platforms, debug builds or with extra VM flags like -Xcomp, 1000 ms isn't enough for the test to complete, and since jtreg doesn't apply the timeout factor to junit/testng timeouts (CODETOOLS-7901567), the test times out even with an increased timeout factor.
>> 
>> The fix changes the test to use a separate annotation, which doesn't affect test execution, and removes the timeout value (which means no timeout).
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8141557
>> testing: locally
>> 
>> Thanks,
>> Igor
> 


From volker.simonis at gmail.com  Wed Jan 20 16:23:27 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 20 Jan 2016 17:23:27 +0100
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after
	CompactStrings change
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228C120@DEWDFEMB19C.global.corp.sap>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228C120@DEWDFEMB19C.global.corp.sap>
Message-ID: <CA+3eh10peJ+Xy7NxPKQb5iM8jrbZm9aJyvj1JmbCpFVvuUXYsQ@mail.gmail.com>

Hi Martin,

thanks for your thorough review. I've uploaded a new webrev to:

http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336.v1/

Please find my comments inline.

Regards,
Volker


On Wed, Jan 20, 2016 at 12:43 PM, Doerr, Martin <martin.doerr at sap.com> wrote:
> Hi Volker,
>
> thank you very much for adapting the non-CompactStrings version of the intrinsics. I especially like that you changed shared code to improve matching of special cases.
>
> Here are some minor change requests:
> - I guess you will have to adapt Copyright messages.

Done.

> - There's a typo in the new comment in library_call: "optimzed".

Fixed.

> - The comment for the instruction count (used for loop alignment) is wrong in MacroAssembler::string_indexof_1 (should start with 3 instead of 2).
>

Right, fixed.

> I have more change requests regarding ppc.ad:
>
> The computation of chr is incorrect for little endian in string_indexOf_imm1_char and string_indexOf_imm1.
>

Good catch. Fixed.

> Some numbers for compute_padding should be adapted:
> int string_indexOf_imm1_charNode::compute_padding(int current_offset) const { return (3*4-current_offset)&31; }
> int string_indexOfCharNode::compute_padding(int current_offset) const { return (3*4-current_offset)&31; }
> int string_compareNode::compute_padding(int current_offset) const { return (2*4-current_offset)&31; }
>

Right, and also:
int string_indexOf_imm1Node::compute_padding(int current_offset) const
{ return (3*4-current_offset)&31; }

I have now put a comment in each method which points to the  macro
assembler method it depends on to make this dependency explicit.
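
As a side note on the arithmetic in these compute_padding bodies, here is a minimal standalone sketch in plain C++ (not the ppc.ad code itself). The expression (N*4 - current_offset) & 31 yields a padding between 0 and 31 bytes such that the padded offset is congruent to N*4 modulo 32. Which instruction that is meant to place on a 32-byte boundary depends on the macro assembler method and is not shown in this thread, so the sketch only checks the congruence. The extra insns_before parameter and the helper lands_on_residue are hypothetical.

```cpp
#include <cassert>

// Bytes of padding to insert at current_offset so that the padded position
// is congruent to insns_before * 4 modulo 32 (4-byte instructions assumed).
// Relies on two's-complement behaviour of & for negative operands.
inline int compute_padding(int insns_before, int current_offset) {
  return (insns_before * 4 - current_offset) & 31;
}

// Helper used only to check the arithmetic above.
inline bool lands_on_residue(int insns_before, int current_offset) {
  int padded = current_offset + compute_padding(insns_before, current_offset);
  return padded % 32 == (insns_before * 4) % 32;
}
```

For example, with three instructions and current_offset 40 the padding is 4 bytes, giving a padded offset of 44, which is 12 modulo 32.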

> Some kill effects are missing:
> - ctr in all string_indexOf nodes

Added kill effect for ctr register to all str_indexof intrinsics.

> - cr0, cr1 in string_indexOf_imm1, string_indexOfChar
>

Fixed.

> The new comment for string_indexOfChar claims "// Kill ... needle" which is not true.
>

Right, fixed.

> Thanks,
>  Martin
>
>
> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Volker Simonis
> Sent: Dienstag, 19. Januar 2016 19:57
> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> Cc: ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
> Subject: RFR(M): 8145336: PPC64: fix string intrinsics after CompactStrings change
>
> Hi,
>
> can somebody please review and sponsor this change.
>
> Despite the bug summary, I still had to do some small shared changes
> to make this work, so unfortunately I can not push this on my own.
>
> The change also affects aarch64 (although it is minimal and I don't
> expect it to break anything) so I cc-ed aarch64-port-dev.
>
> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
> https://bugs.openjdk.java.net/browse/JDK-8145336
>
> As described in the bug, this change only fixes the string intrinsics
> for the -XX:-UseCompactStrings mode which is still the default on
> ppc64. Additionally, support for the new StrIndexOfChar intrinsic was
> added because we already had a similar intrinsic for constant string
> needles of length one anyway. A later change (which we're already
> working on) will add the intrinsics which can handle compact strings.
>
> The current intrinsics can handle both, the new byte-array based
> string representation as well as the old char-array based string
> representation because we internally still use the new hotspot with
> older versions of the class libraries.
>
> I've also ported some of our internal string tests into a new
> regression test (TestStringIntrinsics2.java) because the existing
> tests didn't exercise all of our intrinsics.
>
> Following the shared changes I had to do:
>
> Until now, UseSSE42Intrinsics was a global shared option which was
> used to control the availability of the stringIndexOf intrinsics. But
> UseSSE42Intrinsics is actually a x86-specific feature so it doesn't
> make a lot of sense to define it for other architectures. I've
> therefore moved the flag to globals_x86.hpp and changed the condition
> which checks for the ability of the stringIndexOf intrinsics from:
>
> if (!Matcher::has_match_rule(Op_StrIndexOf) || !UseSSE42Intrinsics) {
>
> to:
>
> if (!Matcher::match_rule_supported(Op_StrIndexOf)) {
>
> The Matcher::match_rule_supported() method already calls
> Matcher::has_match_rule() anyway. And it is implemented in the .ad
> file so I've moved the check for UseSSE42Intrinsics into x86.ad. Other
> platforms can now decide in their .ad file if they unconditionally
> support the intrinsic or if they need a special feature check. This
> change was already briefly discussed in [1].
>
> The other shared change I had to make was in
> LibraryCallKit::make_string_method_node() for the "Op_StrEquals" case.
> We have optimized intrinsics for the case that one of the strings to
> compare is constant, but the  StrEqualsNode is constructed without
> taking into account that one of the string length values could be a
> constant. This prevented our optimized instruction from being matched
> in the ad-file.
>
> All the other changes are ppc-specific.
>
> Thank you and best regards,
> Volker
>
>
> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/thread.html#20400

From pavel.punegov at oracle.com  Wed Jan 20 16:36:13 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Wed, 20 Jan 2016 19:36:13 +0300
Subject: RFR (S): 8145800: [Testbug] CompilerControl: inline message differs
	for not inlined methods
Message-ID: <0B7801F7-DEFA-482F-BD24-C06BC3037E0B@oracle.com>

Hi,

please review the following fix for the test bug.

Issue: the tests set the wrong inlining state for methods that belong to any Internal subclass of pool.sub.Klass or pool.subpack.KlassDup.
This happens because the tests assume that method callers match only the *.* directive pattern. But they can also match patterns like "*Internal*", because
a typical method caller in this case is pool/sub/Klass$Internal::lambda$getAllMethods$0.

Fix: make sure the method callers (lambdas) do not contain any names used in the test, such as Internal or Klass. To that end, all executable and callable creation was moved to a new SubMethodHolder class.

bug id: https://bugs.openjdk.java.net/browse/JDK-8145800
webrev: http://cr.openjdk.java.net/~ppunegov/8145800/webrev.00/

-- Thanks,
Pavel Punegov


From martin.doerr at sap.com  Wed Jan 20 16:45:44 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 20 Jan 2016 16:45:44 +0000
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after
	CompactStrings change
In-Reply-To: <CA+3eh10peJ+Xy7NxPKQb5iM8jrbZm9aJyvj1JmbCpFVvuUXYsQ@mail.gmail.com>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228C120@DEWDFEMB19C.global.corp.sap>
	<CA+3eh10peJ+Xy7NxPKQb5iM8jrbZm9aJyvj1JmbCpFVvuUXYsQ@mail.gmail.com>
Message-ID: <7C9B87B351A4BA4AA9EC95BB418116567228C273@DEWDFEMB19C.global.corp.sap>

Hi Volker,

thanks for the update. Looks good.

Best regards,
  Martin

From vladimir.kozlov at oracle.com  Wed Jan 20 16:46:48 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 20 Jan 2016 08:46:48 -0800
Subject: RFR and workflow question
In-Reply-To: <569FA7A1.1080200@oracle.com>
References: <569FA7A1.1080200@oracle.com>
Message-ID: <569FB9F8.1070802@oracle.com>

Use 8147475 for the commit message - your change may be related (relocation info could be affected by padding).
But I don't see how your change can fix 8146608 - that one is patching the return PC, which is SP-relative and has nothing to do with
padding in the prolog.

The changes look fine. Please integrate this one first since it affects all hotspot repos.

Thanks,
Vladimir

On 1/20/16 7:28 AM, Roland Schatz wrote:
> Hi!
>
> Please review this small bugfix:
> http://cr.openjdk.java.net/~rschatz/JDK-8147475/webrev.00/
>
> This is on top of another webrev[1]. It has no semantic dependency on it, just a random source-level conflict. I can
> rebase them if we want to integrate this one first.
>
>
> This is missing a commit message, and I'm not sure what to use here.
>
> There are two issues that I *think* this will solve, but I can't be sure since I haven't managed to reproduce them:
> https://bugs.openjdk.java.net/browse/JDK-8147475
> https://bugs.openjdk.java.net/browse/JDK-8146608
>
> Do I just use one of the above issues randomly for the commit message, and close the other as duplicate?
>
>
> Thanks,
> Roland
>
> [1] http://cr.openjdk.java.net/~rschatz/JDK-8147599/webrev.00/

From vladimir.kozlov at oracle.com  Wed Jan 20 16:49:11 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 20 Jan 2016 08:49:11 -0800
Subject: RFR (S): 8145800: [Testbug] CompilerControl: inline message
	differs for not inlined methods
In-Reply-To: <0B7801F7-DEFA-482F-BD24-C06BC3037E0B@oracle.com>
References: <0B7801F7-DEFA-482F-BD24-C06BC3037E0B@oracle.com>
Message-ID: <569FBA87.4@oracle.com>

Good.
Thanks,
Vladimir

On 1/20/16 8:36 AM, Pavel Punegov wrote:
> Hi,
>
> please review the following fix for the test bug.
>
> Issue: the tests set the wrong inlining state for methods that belong to any Internal subclass of pool.sub.Klass
> or pool.subpack.KlassDup.
> This happens because the tests assume that method callers match only the *.* directive pattern. But they
> can also match patterns like "*Internal*", because
> a typical method caller in this case is pool/sub/Klass$Internal::lambda$getAllMethods$0.
>
> Fix: make sure the method callers (lambdas) do not contain any names used in the test, such as Internal or Klass. To that end, all
> executable and callable creation was moved to a new SubMethodHolder class.
>
> bug id: https://bugs.openjdk.java.net/browse/JDK-8145800
> webrev: http://cr.openjdk.java.net/~ppunegov/8145800/webrev.00/
>
> -- Thanks,
> Pavel Punegov
>

From vladimir.kozlov at oracle.com  Wed Jan 20 17:08:18 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 20 Jan 2016 09:08:18 -0800
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after
	CompactStrings change
In-Reply-To: <7C9B87B351A4BA4AA9EC95BB418116567228C273@DEWDFEMB19C.global.corp.sap>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228C120@DEWDFEMB19C.global.corp.sap>
	<CA+3eh10peJ+Xy7NxPKQb5iM8jrbZm9aJyvj1JmbCpFVvuUXYsQ@mail.gmail.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228C273@DEWDFEMB19C.global.corp.sap>
Message-ID: <569FBF02.6080707@oracle.com>

+1. Finally UseSSE42Intrinsics was moved!
I will sponsor it.
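
For readers skimming the thread, the flag move Volker describes can be sketched roughly as follows. This is a toy model in plain C++, not the HotSpot sources: the Opcode enum, the PlatformModel struct, its fields, and can_intrinsify_indexof are all hypothetical stand-ins for the per-platform .ad logic and the shared caller.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are the real HotSpot declarations.
enum Opcode { Op_StrIndexOf, Op_StrEquals, Op_Other };

struct PlatformModel {
  bool has_rule_for_indexof;  // does the platform define a match rule at all?
  bool needs_feature_flag;    // x86-like: rule is usable only with a CPU feature
  bool feature_flag_enabled;  // models a flag such as UseSSE42Intrinsics

  bool has_match_rule(Opcode op) const {
    switch (op) {
      case Op_StrIndexOf: return has_rule_for_indexof;
      case Op_StrEquals:  return true;
      default:            return false;
    }
  }

  // Platform-specific answer: the feature check lives here, not in shared code.
  bool match_rule_supported(Opcode op) const {
    if (!has_match_rule(op)) return false;
    if (op == Op_StrIndexOf && needs_feature_flag) return feature_flag_enabled;
    return true;
  }
};

// Shared code asks a single platform-neutral question and never mentions
// an x86-only flag.
inline bool can_intrinsify_indexof(const PlatformModel& p) {
  return p.match_rule_supported(Op_StrIndexOf);
}
```

With this shape, a platform without such a flag simply answers true whenever it has a rule, which is the point of moving the check out of shared code.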

Thanks,
Vladimir


On 1/20/16 8:45 AM, Doerr, Martin wrote:
> Hi Volker,
>
> thanks for the update. Looks good.
>
> Best regards,
>    Martin
>
> -----Original Message-----
> From: Volker Simonis [mailto:volker.simonis at gmail.com]
> Sent: Mittwoch, 20. Januar 2016 17:23
> To: Doerr, Martin <martin.doerr at sap.com>
> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>; ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
> Subject: Re: RFR(M): 8145336: PPC64: fix string intrinsics after CompactStrings change
>
> Hi Martin,
>
> thanks for your thorough review. I've uploaded a new webrev to:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336.v1/
>
> Please find my comments inline.
>
> Regards,
> Volker
>
>
> On Wed, Jan 20, 2016 at 12:43 PM, Doerr, Martin <martin.doerr at sap.com> wrote:
>> Hi Volker,
>>
>> thank you very much for adapting the non-CompactStrings version of the intrinsics. I especially like that you changed shared code to improve matching of special cases.
>>
>> Here are some minor change requests:
>> - I guess you will have to adapt Copyright messages.
>
> Done.
>
>> - There's a typo in the new comment in library_call: "optimzed".
>
> Fixed.
>
>> - The comment for the instruction count (used for loop alignment) is wrong in MacroAssembler::string_indexof_1 (should start with 3 instead of 2).
>>
>
> Right, fixed.
>
>> I have more change requests regarding ppc.ad:
>>
>> The computation of chr is incorrect for little endian in string_indexOf_imm1_char and string_indexOf_imm1.
>>
>
> Good catch. Fixed.
>
>> Some numbers for compute_padding should be adapted:
>> int string_indexOf_imm1_charNode::compute_padding(int current_offset) const { return (3*4-current_offset)&31; }
>> int string_indexOfCharNode::compute_padding(int current_offset) const { return (3*4-current_offset)&31; }
>> int string_compareNode::compute_padding(int current_offset) const { return (2*4-current_offset)&31; }
>>
>
> Right, and also:
> int string_indexOf_imm1Node::compute_padding(int current_offset) const
> { return (3*4-current_offset)&31; }
>
> I have now put a comment in each method which points to the  macro
> assembler method it depends on to make this dependency explicit.
>
>> Some kill effects are missing:
>> - ctr in all string_indexOf nodes
>
> Added kill effect for ctr register to all str_indexof intrinsics.
>
>> - cr0, cr1 in string_indexOf_imm1, string_indexOfChar
>>
>
> Fixed.
>
>> The new comment for string_indexOfChar claims "// Kill ... needle" which is not true.
>>
>
> Right, fixed.
>
>> Thanks,
>>   Martin
>>
>>
>> -----Original Message-----
>> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Volker Simonis
>> Sent: Tuesday, 19 January 2016 19:57
>> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>> Cc: ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
>> Subject: RFR(M): 8145336: PPC64: fix string intrinsics after CompactStrings change
>>
>> Hi,
>>
>> can somebody please review and sponsor this change.
>>
>> Despite the bug summary, I still had to do some small shared changes
>> to make this work, so unfortunately I can not push this on my own.
>>
>> The change also affects aarch64 (although it is minimal and I don't
>> expect it to break anything) so I cc-ed aarch64-port-dev.
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
>> https://bugs.openjdk.java.net/browse/JDK-8145336
>>
>> As described in the bug, this change only fixes the string intrinsics
>> for the -XX:-UseCompactStrings mode which is still the default on
>> ppc64. Additionally, support for the new StrIndexOfChar intrinsic was
>> added because we already had a similar intrinsic for constant string
>> needles of length one anyway. A later change (which we're already
>> working on) will add the intrinsics which can handle compact strings.
>>
>> The current intrinsics can handle both, the new byte-array based
>> string representation as well as the old char-array based string
>> representation because we internally still use the new hotspot with
>> older versions of the class libraries.
>>
>> I've also ported some of our internal string tests into a new
>> regression test (TestStringIntrinsics2.java) because the existing
>> tests didn't exercise all of our intrinsics.
>>
>> These are the shared changes I had to make:
>>
>> Until now, UseSSE42Intrinsics was a global shared option which was
>> used to control the availability of the stringIndexOf intrinsics. But
>> UseSSE42Intrinsics is actually a x86-specific feature so it doesn't
>> make a lot of sense to define it for other architectures. I've
>> therefore moved the flag to globals_x86.hpp and changed the condition
>> which checks for the availability of the stringIndexOf intrinsics from:
>>
>> if (!Matcher::has_match_rule(Op_StrIndexOf) || !UseSSE42Intrinsics) {
>>
>> to:
>>
>> if (!Matcher::match_rule_supported(Op_StrIndexOf)) {
>>
>> The Matcher::match_rule_supported() method already calls
>> Matcher::has_match_rule() anyway. And it is implemented in the .ad
>> file so I've moved the check for UseSSE42Intrinsics into x86.ad. Other
>> platforms can now decide in their .ad file if they unconditionally
>> support the intrinsic or if they need a special feature check. This
>> change was already briefly discussed in [1].
>>
>> The other shared change I had to make was in
>> LibraryCallKit::make_string_method_node() for the "Op_StrEquals" case.
>> We have optimized intrinsics for the case that one of the strings to
>> compare is constant, but the  StrEqualsNode is constructed without
>> taking into account that one of the string length values could be a
>> constant. This prevented our optimized instruction from being matched
>> in the ad-file.
>>
>> All the other changes are ppc-specific.
>>
>> Thank you and best regards,
>> Volker
>>
>>
>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/thread.html#20400
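The compute_padding expressions quoted in this thread can be tried out with a small standalone sketch. It only mirrors the arithmetic of the quoted expression; padding_for is a hypothetical helper name, not HotSpot code, and what the padded position aligns in the generated code is not modelled here.

```cpp
// Mirrors the expression quoted in the thread:
//   (n_insts * 4 - current_offset) & 31
// 'padding_for' is a hypothetical name used for illustration only.
// The result is always in [0, 31]; adding it to current_offset yields a
// value congruent to n_insts * 4 modulo 32 (assuming two's complement
// integers, which is what mainstream compilers provide).
static int padding_for(int n_insts, int current_offset) {
  return (n_insts * 4 - current_offset) & 31;
}
```

For instance, with the constant 3*4 and a current_offset of 16 the result is 28, and 16 + 28 = 44 is congruent to 12 modulo 32.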

From volker.simonis at gmail.com  Wed Jan 20 17:11:35 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 20 Jan 2016 18:11:35 +0100
Subject: RFR(M): 8145336: PPC64: fix string intrinsics after
	CompactStrings change
In-Reply-To: <569FBF02.6080707@oracle.com>
References: <CA+3eh12PC5cUdEdXgG=Pf6Aj5afjwaHcxryovJobzxsoeGpBQQ@mail.gmail.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228C120@DEWDFEMB19C.global.corp.sap>
	<CA+3eh10peJ+Xy7NxPKQb5iM8jrbZm9aJyvj1JmbCpFVvuUXYsQ@mail.gmail.com>
	<7C9B87B351A4BA4AA9EC95BB418116567228C273@DEWDFEMB19C.global.corp.sap>
	<569FBF02.6080707@oracle.com>
Message-ID: <CA+3eh13OX5hGfkfTGBDD_-9renn-y4BOJmPLcwxkQ-5z0t57EQ@mail.gmail.com>

Great!
Thanks a lot Vladimir,
Volker


On Wed, Jan 20, 2016 at 6:08 PM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:
> +1. Finally UseSSE42Intrinsics was moved!
> I will sponsor it.
>
> Thanks,
> Vladimir
>
>
>
> On 1/20/16 8:45 AM, Doerr, Martin wrote:
>>
>> Hi Volker,
>>
>> thanks for the update. Looks good.
>>
>> Best regards,
>>    Martin
>>
>> -----Original Message-----
>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>> Sent: Wednesday, 20 January 2016 17:23
>> To: Doerr, Martin <martin.doerr at sap.com>
>> Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>;
>> ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
>> Subject: Re: RFR(M): 8145336: PPC64: fix string intrinsics after
>> CompactStrings change
>>
>> Hi Martin,
>>
>> thanks for your thorough review. I've uploaded a new webrev to:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336.v1/
>>
>> Please find my comments inline.
>>
>> Regards,
>> Volker
>>
>>
>> On Wed, Jan 20, 2016 at 12:43 PM, Doerr, Martin <martin.doerr at sap.com>
>> wrote:
>>>
>>> Hi Volker,
>>>
>>> thank you very much for adapting the non-CompactStrings version of the
>>> intrinsics. I especially like that you changed shared code to improve
>>> matching of special cases.
>>>
>>> Here are some minor change requests:
>>> - I guess you will have to adapt Copyright messages.
>>
>>
>> Done.
>>
>>> - There's a typo in the new comment in library_call: "optimzed".
>>
>>
>> Fixed.
>>
>>> - The comment for the instruction count (used for loop alignment) is
>>> wrong in MacroAssembler::string_indexof_1 (should start with 3 instead of
>>> 2).
>>>
>>
>> Right, fixed.
>>
>>> I have more change requests regarding ppc.ad:
>>>
>>> The computation of chr is incorrect for little endian in
>>> string_indexOf_imm1_char and string_indexOf_imm1.
>>>
>>
>> Good catch. Fixed.
>>
>>> Some numbers for compute_padding should be adapted:
>>> int string_indexOf_imm1_charNode::compute_padding(int current_offset)
>>> const { return (3*4-current_offset)&31; }
>>> int string_indexOfCharNode::compute_padding(int current_offset) const {
>>> return (3*4-current_offset)&31; }
>>> int string_compareNode::compute_padding(int current_offset) const {
>>> return (2*4-current_offset)&31; }
>>>
>>
>> Right, and also:
>> int string_indexOf_imm1Node::compute_padding(int current_offset) const
>> { return (3*4-current_offset)&31; }
>>
>> I have now put a comment in each method which points to the  macro
>> assembler method it depends on to make this dependency explicit.
>>
>>> Some kill effects are missing:
>>> - ctr in all string_indexOf nodes
>>
>>
>> Added kill effect for ctr register to all str_indexof intrinsics.
>>
>>> - cr0, cr1 in string_indexOf_imm1, string_indexOfChar
>>>
>>
>> Fixed.
>>
>>> The new comment for string_indexOfChar claims "// Kill ... needle" which
>>> is not true.
>>>
>>
>> Right, fixed.
>>
>>> Thanks,
>>>   Martin
>>>
>>>
>>> -----Original Message-----
>>> From: hotspot-compiler-dev
>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Volker
>>> Simonis
>>> Sent: Tuesday, 19 January 2016 19:57
>>> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>>> Cc: ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
>>> Subject: RFR(M): 8145336: PPC64: fix string intrinsics after
>>> CompactStrings change
>>>
>>> Hi,
>>>
>>> can somebody please review and sponsor this change.
>>>
>>> Despite the bug summary, I still had to do some small shared changes
>>> to make this work, so unfortunately I can not push this on my own.
>>>
>>> The change also affects aarch64 (although it is minimal and I don't
>>> expect it to break anything) so I cc-ed aarch64-port-dev.
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8145336/
>>> https://bugs.openjdk.java.net/browse/JDK-8145336
>>>
>>> As described in the bug, this change only fixes the string intrinsics
>>> for the -XX:-UseCompactStrings mode which is still the default on
>>> ppc64. Additionally, support for the new StrIndexOfChar intrinsic was
>>> added because we already had a similar intrinsic for constant string
>>> needles of length one anyway. A later change (which we're already
>>> working on) will add the intrinsics which can handle compact strings.
>>>
>>> The current intrinsics can handle both, the new byte-array based
>>> string representation as well as the old char-array based string
>>> representation because we internally still use the new hotspot with
>>> older versions of the class libraries.
>>>
>>> I've also ported some of our internal string tests into a new
>>> regression test (TestStringIntrinsics2.java) because the existing
>>> tests didn't exercise all of our intrinsics.
>>>
>>> These are the shared changes I had to make:
>>>
>>> Until now, UseSSE42Intrinsics was a global shared option which was
>>> used to control the availability of the stringIndexOf intrinsics. But
>>> UseSSE42Intrinsics is actually a x86-specific feature so it doesn't
>>> make a lot of sense to define it for other architectures. I've
>>> therefore moved the flag to globals_x86.hpp and changed the condition
>>> which checks for the availability of the stringIndexOf intrinsics from:
>>>
>>> if (!Matcher::has_match_rule(Op_StrIndexOf) || !UseSSE42Intrinsics) {
>>>
>>> to:
>>>
>>> if (!Matcher::match_rule_supported(Op_StrIndexOf)) {
>>>
>>> The Matcher::match_rule_supported() method already calls
>>> Matcher::has_match_rule() anyway. And it is implemented in the .ad
>>> file so I've moved the check for UseSSE42Intrinsics into x86.ad. Other
>>> platforms can now decide in their .ad file if they unconditionally
>>> support the intrinsic or if they need a special feature check. This
>>> change was already briefly discussed in [1].
>>>
>>> The other shared change I had to make was in
>>> LibraryCallKit::make_string_method_node() for the "Op_StrEquals" case.
>>> We have optimized intrinsics for the case that one of the strings to
>>> compare is constant, but the  StrEqualsNode is constructed without
>>> taking into account that one of the string length values could be a
>>> constant. This prevented our optimized instruction from being matched
>>> in the ad-file.
>>>
>>> All the other changes are ppc-specific.
>>>
>>> Thank you and best regards,
>>> Volker
>>>
>>>
>>> [1]
>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/thread.html#20400
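The restructured check described in the quoted RFR (shared code asks only match_rule_supported(), and the platform decides whether a feature flag is needed) can be sketched roughly as follows. This is a toy model under stated assumptions: ToyMatcher, its fields and the Opcode enum are hypothetical stand-ins, not the real HotSpot declarations or the contents of any .ad file.

```cpp
#include <set>

// Toy stand-ins for illustration; none of this is real HotSpot code.
enum Opcode { Op_StrIndexOf, Op_StrEquals };

struct ToyMatcher {
  std::set<Opcode> rules;        // opcodes that have a match rule
  bool use_sse42_intrinsics;     // models an x86-only feature flag

  bool has_match_rule(Opcode op) const { return rules.count(op) != 0; }

  // Platform hook: the generic rule check comes first, then any
  // platform-specific feature check. Shared code calls only this.
  bool match_rule_supported(Opcode op) const {
    if (!has_match_rule(op)) return false;
    switch (op) {
      case Op_StrIndexOf:
        return use_sse42_intrinsics;  // x86-style feature check
      default:
        return true;                  // supported unconditionally
    }
  }
};
```

A platform that supports the intrinsic unconditionally would, in this model, simply omit the Op_StrIndexOf case.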

From pavel.punegov at oracle.com  Wed Jan 20 17:23:33 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Wed, 20 Jan 2016 20:23:33 +0300
Subject: RFR (S): 8145800: [Testbug] CompilerControl: inline message
	differs for not inlined methods
In-Reply-To: <569FBA87.4@oracle.com>
References: <0B7801F7-DEFA-482F-BD24-C06BC3037E0B@oracle.com>
	<569FBA87.4@oracle.com>
Message-ID: <9E055252-9C22-4438-A210-33A6E7B5360E@oracle.com>

Thanks for the review, Vladimir.

> On 20 Jan 2016, at 19:49, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> Thanks,
> Vladimir
> 
> On 1/20/16 8:36 AM, Pavel Punegov wrote:
>> Hi,
>> 
>> please review the following fix for the test bug.
>> 
>> Issue: the tests incorrectly set the inlining state for methods that belong to any Internal subclass of
>> pool.sub.Klass or pool.subpack.KlassDup.
>> This happens because the tests assume that any method caller will match only the *.* directive pattern. But callers
>> could also match patterns like "*Internal*", because
>> a typical method caller in this case could be pool/sub/Klass$Internal::lambda$getAllMethods$0.
>> 
>> Fix: make sure the method callers (lambdas) do not contain any names used in the test, such as Internal or Klass. That's why all
>> executable and callable creation was moved to a new SubMethodHolder class.
>> 
>> bug id: https://bugs.openjdk.java.net/browse/JDK-8145800
>> webrev: http://cr.openjdk.java.net/~ppunegov/8145800/webrev.00/
>> 
>> Thanks,
>> Pavel Punegov
>> 


From christian.thalinger at oracle.com  Thu Jan 21 02:49:21 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 20 Jan 2016 16:49:21 -1000
Subject: RFR(XS) : 8141557 : TestResolvedJavaMethod.java times out after
	1000 ms
In-Reply-To: <613621B5-925D-4E72-A115-1480F47BA43C@oracle.com>
References: <52CAB89A-1AA5-4545-9C4B-DD2A6880E463@oracle.com>
	<3DDA7A22-74CF-400A-A403-9CE70655ABD5@oracle.com>
	<613621B5-925D-4E72-A115-1480F47BA43C@oracle.com>
Message-ID: <72DE4A03-5A01-439A-8D01-F6C5A8BDFB86@oracle.com>

Looks good.

> On Jan 20, 2016, at 6:05 AM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
> 
> Hi Chris,
> 
> thank you for the review.
> 
> Yes, it's a typo. I've fixed that (s/TestAnnotionation/TestAnnotation/g) and added an explicit 'value =':
> http://cr.openjdk.java.net/~iignatyev/8141557/webrev.01/
> 
> Thanks,
> Igor
> 
> 
>> On Jan 20, 2016, at 3:45 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> I suppose TestAnnotionation is a typo?
>> 
>> +    @TestAnnotionation(1000L)
>> 
>> Could you change that to value = 1000L?  Just for extra clarity.  Then it looks good.
>> 
>>> On Jan 19, 2016, at 11:06 AM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
>>> 
>>> http://cr.openjdk.java.net/~iignatyev/8141557/webrev.00/
>>>> 22 lines changed: 16 ins; 0 del; 6 mod; 
>>> 
>>> Hi all,
>>> 
>>> Could you please review the fix for 8141557?
>>> 
>>> The test uses the timeout value of org.junit.Test to test reading an annotation via JVMCI. In some cases, e.g. on embedded platforms, in debug builds or with extra VM flags like -Xcomp, 1000 ms isn't enough for the test to complete, and since jtreg doesn't apply the timeout factor to junit/testng timeouts (CODETOOLS-7901567), the test times out even when the timeout factor is increased.
>>> 
>>> The fix changes the test to use a separate annotation, which doesn't affect test execution, and removes the timeout value (which means no timeout).
>>> 
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8141557
>>> testing: locally
>>> 
>>> Thanks,
>>> Igor
>> 
> 


From christian.thalinger at oracle.com  Thu Jan 21 02:56:58 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Wed, 20 Jan 2016 16:56:58 -1000
Subject: RFR and workflow question
In-Reply-To: <569FB9F8.1070802@oracle.com>
References: <569FA7A1.1080200@oracle.com> <569FB9F8.1070802@oracle.com>
Message-ID: <8A7390F7-4DC5-4033-B12F-F1202EDA51B8@oracle.com>


> On Jan 20, 2016, at 6:46 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Use 8147475 for the commit message - your change could be related (relocation info could be affected by padding).
> But I don't see how your change can fix 8146608 - it is patching the return PC, which is SP-relative and has nothing to do with padding in the prolog.
> 
> Changes look fine. Please integrate it first since it affects all hotspot repos.

Roland is not a Committer yet; just Author.  I'll take care of it.

> 
> Thanks,
> Vladimir
> 
> On 1/20/16 7:28 AM, Roland Schatz wrote:
>> Hi!
>> 
>> Please review this small bugfix:
>> http://cr.openjdk.java.net/~rschatz/JDK-8147475/webrev.00/
>> 
>> This is on top of another webrev[1]. It has no semantic dependency on it, just a random source-level conflict. I can
>> rebase them if we want to integrate this one first.
>> 
>> 
>> This is missing a commit message, and I'm not sure what to use here.
>> 
>> There are two issues that I *think* this will solve, but I can't be sure since I haven't managed to reproduce them:
>> https://bugs.openjdk.java.net/browse/JDK-8147475
>> https://bugs.openjdk.java.net/browse/JDK-8146608
>> 
>> Do I just use one of the above issues randomly for the commit message, and close the other as duplicate?
>> 
>> 
>> Thanks,
>> Roland
>> 
>> [1] http://cr.openjdk.java.net/~rschatz/JDK-8147599/webrev.00/


From aph at redhat.com  Thu Jan 21 10:22:13 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2016 10:22:13 +0000
Subject: Baffling USE in x86_64.ad
Message-ID: <56A0B155.3090802@redhat.com>

In this pattern:

instruct compI_rReg(rFlagsReg cr, rRegI op1, rRegI op2)
%{
  match(Set cr (CmpI op1 op2));
  effect(DEF cr, USE op1, USE op2);

  format %{ "cmpl    $op1, $op2" %}
  opcode(0x3B);  /* Opcode 3B /r */
  ins_encode(REX_reg_reg(op1, op2), OpcP, reg_reg(op1, op2));
  ins_pipe(ialu_cr_reg_reg);
%}

why does the USE appear in the effect?  And the DEF?  The operands
appear in the match expression in the normal way, so I would have
thought the effect expression unnecessary.  It's this pattern: others
don't have the effect:

instruct compL_rReg(rFlagsReg cr, rRegL op1, rRegL op2)
%{
  match(Set cr (CmpL op1 op2));

  format %{ "cmpq    $op1, $op2" %}
  opcode(0x3B);  /* Opcode 3B /r */
  ins_encode(REX_reg_reg_wide(op1, op2), OpcP, reg_reg(op1, op2));
  ins_pipe(ialu_cr_reg_reg);
%}

Thanks,

Andrew.

From nils.eliasson at oracle.com  Thu Jan 21 10:25:59 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 21 Jan 2016 11:25:59 +0100
Subject: RFR(S): 8138756: Compiler Control: Print directives in hs_err
Message-ID: <56A0B237.9090008@oracle.com>

Hi,

Please review this small change. The diff looks big but most of the 
change is just changing how the directives are passed to the compilers.
Directives are set in the ciEnv and then passed to the compilers. The 
compilers can then choose to add it to any internal compilation object 
for convenience. The hs_err printing routine in vmError.cpp loads the 
directive from the ciEnv.

Bug: https://bugs.openjdk.java.net/browse/JDK-8138756
Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.01/

Regards,
Nils

From adinn at redhat.com  Thu Jan 21 10:29:32 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 21 Jan 2016 10:29:32 +0000
Subject: Baffling USE in x86_64.ad
In-Reply-To: <56A0B155.3090802@redhat.com>
References: <56A0B155.3090802@redhat.com>
Message-ID: <56A0B30C.6080205@redhat.com>

On 21/01/16 10:22, Andrew Haley wrote:
> In this pattern:
> 
> instruct compI_rReg(rFlagsReg cr, rRegI op1, rRegI op2)
> %{
>   match(Set cr (CmpI op1 op2));
>   effect(DEF cr, USE op1, USE op2);
> 
>   format %{ "cmpl    $op1, $op2" %}
>   opcode(0x3B);  /* Opcode 3B /r */
>   ins_encode(REX_reg_reg(op1, op2), OpcP, reg_reg(op1, op2));
>   ins_pipe(ialu_cr_reg_reg);
> %}
> 
> why does the USE appear in the effect?  And the DEF?  The operands
> appear in the match expression in the normal way, so I would have
> thought the effect expression unnecessary.  It's this pattern: others
> don't have the effect:

I am not certain of this but I note that the above rule is used for
direct matching and also used for expansion. I believe the expansion
process requires an effect declaration even though that effect is also
implied by the match rule. There are many instruction definitions which
are only used for expansion and they have an effect declaration but no
match declaration.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From nils.eliasson at oracle.com  Thu Jan 21 10:31:52 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 21 Jan 2016 11:31:52 +0100
Subject: RFR(S): 8138756: Compiler Control: Print directives in hs_err
In-Reply-To: <56A0B237.9090008@oracle.com>
References: <56A0B237.9090008@oracle.com>
Message-ID: <56A0B398.4000408@oracle.com>

This is how it looks:

[...]

---------------  T H R E A D  ---------------

Current thread (0x00007f071046a000):  JavaThread "C1 CompilerThread10" daemon [_thread_in_native, id=20033, stack(0x00007f05d7afb000,0x00007f05d7bfc000)]

Current CompileTask:
C1:    225    1       3       java.lang.String::isLatin1 (19 bytes)

Current compiler directive:
   inline: -
   Enable:true Exclude:false BreakAtExecute:false BreakAtCompile:false Log:false PrintAssembly:false PrintInlining:false PrintNMethods:false ReplayInline:false DumpReplay:false DumpInline:false CompilerDirectivesIgnoreCompileCommands:false DisableIntrinsic: BlockLayoutByFrequency:true PrintOptoAssembly:false PrintIntrinsics:false TraceOptoPipelining:false TraceOptoOutput:false TraceSpilling:false Vectorize:false VectorizeDebug:false CloneMapDebug:false DoReserveCopyInSuperWordDebug:false IGVPrintLevel:0 MaxNodeLimit:80000

Stack: [0x00007f05d7afb000,0x00007f05d7bfc000],  sp=0x00007f05d7bfa5d0,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x12e7532]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x182
V  [libjvm.so+0x12e829a]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
V  [libjvm.so+0x908cca]  report_vm_error(char const*, int, char const*, char const*, ...)+0xea
V  [libjvm.so+0x88df81]  CompileBroker::post_compile(CompilerThread*, CompileTask*, EventCompilation&, bool, ciEnv*)+0x1b1
V  [libjvm.so+0x88ec5a]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x90a
V  [libjvm.so+0x88f960]  CompileBroker::compiler_thread_loop()+0x540
V  [libjvm.so+0x1264789]  JavaThread::thread_main_inner()+0x1c9
V  [libjvm.so+0x1264ac6]  JavaThread::run()+0x2a6
V  [libjvm.so+0x10189aa]  java_start(Thread*)+0xca
C  [libpthread.so.0+0x8182]  start_thread+0xc2

[...]

http://cr.openjdk.java.net/~neliasso/8138756/hserr.txt

Regards,
Nils

On 2016-01-21 11:25, Nils Eliasson wrote:
> Hi,
>
> Please review this small change. The diff looks big but most of the 
> change is just changing how the directives are passed to the compilers.
> Directives are set in the ciEnv and then passed to the compilers. The 
> compilers can then choose to add it to any internal compilation object 
> for convenience. The hs_err printing routine in vmError.cpp loads the 
> directive from the ciEnv.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8138756
> Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.01/
>
> Regards,
> Nils


From tobias.hartmann at oracle.com  Thu Jan 21 10:46:16 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 21 Jan 2016 11:46:16 +0100
Subject: [9] RFR(S): 8065334: CodeHeap expansion fails although there is
	uncommitted memory
Message-ID: <56A0B6F8.5040809@oracle.com>

Hi,

please review the following patch:

https://bugs.openjdk.java.net/browse/JDK-8065334
http://cr.openjdk.java.net/~thartmann/8065334/webrev.00/

If ReservedCodeCacheSize (or the size of a single code heap) is not a multiple of CodeCacheExpansionSize, the last code heap expansion fails, leaving unused uncommitted memory. For example, see [1]. Both the profiled and the non-profiled segments are full but there is still 32Kb of uncommitted space that is never used (CodeCacheExpansionSize is 64Kb).

CodeHeap::expand_by() should check for this condition and commit all the remaining space even if the requested expansion size is larger. That way, we use all the available space [2].
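A minimal sketch of the clamping idea, assuming a toy heap that tracks only reserved and committed sizes. ToyHeap and its members are hypothetical and are not the real CodeHeap class; whether the actual patch is structured this way is not shown in this message.

```cpp
#include <algorithm>
#include <cstddef>

// Toy model for illustration; not the real CodeHeap.
struct ToyHeap {
  std::size_t reserved;   // total reserved size (any unit, e.g. Kb)
  std::size_t committed;  // size committed so far (same unit)

  // Try to commit 'request' more. If less than 'request' is still
  // uncommitted, commit whatever is left instead of failing.
  // Returns the amount actually committed (0 if nothing is left).
  std::size_t expand_by(std::size_t request) {
    std::size_t remaining = reserved - committed;
    std::size_t grant = std::min(request, remaining);
    committed += grant;
    return grant;
  }
};
```

With a reserved size of 120032 and an expansion step of 64 (the sizes in Kb from the output below), 1875 full steps commit 120000, and the final call commits the remaining 32 instead of leaving it unused.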

Thanks,
Tobias


[1] Baseline:
CodeHeap 'non-nmethods': size=5696Kb used=2403Kb max_used=2433Kb free=3292Kb
 bounds [0x00007f4d83b27000, 0x00007f4d83d97000, 0x00007f4d840b7000]
CodeHeap 'profiled nmethods': size=120032Kb used=119999Kb max_used=119999Kb free=32Kb
 bounds [0x00007f4d840b7000, 0x00007f4d8b5e7000, 0x00007f4d8b5ef000]
CodeHeap 'non-profiled nmethods': size=120032Kb used=119999Kb max_used=119999Kb free=32Kb
 bounds [0x00007f4d8b5ef000, 0x00007f4d92b1f000, 0x00007f4d92b27000]
 total_blobs=248449 nmethods=52 adapters=570
 compilation: disabled (not enough contiguous free space left)

[2] Fixed:
CodeHeap 'non-nmethods': size=5696Kb used=2404Kb max_used=2436Kb free=3291Kb
 bounds [0x00007fe8cd000000, 0x00007fe8cd270000, 0x00007fe8cd590000]
CodeHeap 'profiled nmethods': size=120032Kb used=120031Kb max_used=120031Kb free=0Kb
 bounds [0x00007fe8cd590000, 0x00007fe8d4ac8000, 0x00007fe8d4ac8000]
CodeHeap 'non-profiled nmethods': size=120032Kb used=120032Kb max_used=120032Kb free=0Kb
 bounds [0x00007fe8d4ac8000, 0x00007fe8dc000000, 0x00007fe8dc000000]
 total_blobs=10665 nmethods=127 adapters=570
 compilation: disabled (not enough contiguous free space left)

From aph at redhat.com  Thu Jan 21 11:21:11 2016
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2016 11:21:11 +0000
Subject: Baffling USE in x86_64.ad
In-Reply-To: <56A0B30C.6080205@redhat.com>
References: <56A0B155.3090802@redhat.com> <56A0B30C.6080205@redhat.com>
Message-ID: <56A0BF27.4070200@redhat.com>

On 01/21/2016 10:29 AM, Andrew Dinn wrote:
> I am not certain of this but I note that the above rule is used for
> direct matching and also used for expansion.

Aha!  So it is.  Thanks, I didn't look for that but I see you're right.

Andrew.


From goetz.lindenmaier at sap.com  Thu Jan 21 11:47:19 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 21 Jan 2016 11:47:19 +0000
Subject: Baffling USE in x86_64.ad
In-Reply-To: <56A0B30C.6080205@redhat.com>
References: <56A0B155.3090802@redhat.com> <56A0B30C.6080205@redhat.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC41F1651F@DEWDFEMB12A.global.corp.sap>

Hi,

adlc scans the match rule and derives USE/DEF effects from that.
Then it adds in the effects from the effect() declaration, which are 
the same here.

So that line is superfluous.  

(I once cleaned up the effects in the ppc 
port and removed all these.  And we are expand-power-users :) )

Best regards,
  Goetz.


> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
> bounces at openjdk.java.net] On Behalf Of Andrew Dinn
> Sent: Thursday, 21 January 2016 11:30
> To: Andrew Haley <aph at redhat.com>; hotspot compiler <hotspot-compiler-
> dev at openjdk.java.net>
> Subject: Re: Baffling USE in x86_64.ad
> 
> On 21/01/16 10:22, Andrew Haley wrote:
> > In this pattern:
> >
> > instruct compI_rReg(rFlagsReg cr, rRegI op1, rRegI op2)
> > %{
> >   match(Set cr (CmpI op1 op2));
> >   effect(DEF cr, USE op1, USE op2);
> >
> >   format %{ "cmpl    $op1, $op2" %}
> >   opcode(0x3B);  /* Opcode 3B /r */
> >   ins_encode(REX_reg_reg(op1, op2), OpcP, reg_reg(op1, op2));
> >   ins_pipe(ialu_cr_reg_reg);
> > %}
> >
> > why does the USE appear in the effect?  And the DEF?  The operands
> > appear in the match expression in the normal way, so I would have
> > thought the effect expression unnecessary.  It's this pattern: others
> > don't have the effect:
> 
> I am not certain of this but I note that the above rule is used for
> direct matching and also used for expansion. I believe the expansion
> process requires an effect declaration even though that effect is also
> implied by the match rule. There are many instruction definitions which
> are only used for expansion and they have an effect declaration but no
> match declaration.
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
> Argiry (US)

From andreas.eriksson at oracle.com  Thu Jan 21 12:51:44 2016
From: andreas.eriksson at oracle.com (Andreas Eriksson)
Date: Thu, 21 Jan 2016 13:51:44 +0100
Subject: RFR(S): 8138756: Compiler Control: Print directives in hs_err
In-Reply-To: <56A0B398.4000408@oracle.com>
References: <56A0B237.9090008@oracle.com> <56A0B398.4000408@oracle.com>
Message-ID: <56A0D460.2070006@oracle.com>

Looks good to me (not Reviewer).

- Andreas

On 2016-01-21 11:31, Nils Eliasson wrote:
> This is how it looks:
>
> [...]
> ---------------  T H R E A D  ---------------
>
> Current thread (0x00007f071046a000):  JavaThread "C1 CompilerThread10" daemon [_thread_in_native, id=20033, stack(0x00007f05d7afb000,0x00007f05d7bfc000)]
>
> Current CompileTask:
> C1:    225    1       3       java.lang.String::isLatin1 (19 bytes)
>
> Current compiler directive:
>    inline: -
>    Enable:true Exclude:false BreakAtExecute:false BreakAtCompile:false Log:false PrintAssembly:false PrintInlining:false PrintNMethods:false ReplayInline:false DumpReplay:false DumpInline:false CompilerDirectivesIgnoreCompileCommands:false DisableIntrinsic: BlockLayoutByFrequency:true PrintOptoAssembly:false PrintIntrinsics:false TraceOptoPipelining:false TraceOptoOutput:false TraceSpilling:false Vectorize:false VectorizeDebug:false CloneMapDebug:false DoReserveCopyInSuperWordDebug:false IGVPrintLevel:0 MaxNodeLimit:80000
>
> Stack: [0x00007f05d7afb000,0x00007f05d7bfc000],  sp=0x00007f05d7bfa5d0,  free space=1021k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x12e7532]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x182
> V  [libjvm.so+0x12e829a]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
> V  [libjvm.so+0x908cca]  report_vm_error(char const*, int, char const*, char const*, ...)+0xea
> V  [libjvm.so+0x88df81]  CompileBroker::post_compile(CompilerThread*, CompileTask*, EventCompilation&, bool, ciEnv*)+0x1b1
> V  [libjvm.so+0x88ec5a]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x90a
> V  [libjvm.so+0x88f960]  CompileBroker::compiler_thread_loop()+0x540
> V  [libjvm.so+0x1264789]  JavaThread::thread_main_inner()+0x1c9
> V  [libjvm.so+0x1264ac6]  JavaThread::run()+0x2a6
> V  [libjvm.so+0x10189aa]  java_start(Thread*)+0xca
> C  [libpthread.so.0+0x8182]  start_thread+0xc2
> [...]
>
> http://cr.openjdk.java.net/~neliasso/8138756/hserr.txt
>
> Regards,
> Nils
>
> On 2016-01-21 11:25, Nils Eliasson wrote:
>> Hi,
>>
>> Please review this small change. The diff looks big but most of the 
>> change is just changing how the directives are passed to the 
>> compilers. Directives are set in the ciEnv and then passed to the 
>> compilers. The compilers can then choose to add it to any internal 
>> compilation object for convenience. The hs_err printing routine in 
>> vmError.cpp loads the directive from the ciEnv.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8138756
>> Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.01/
>>
>> Regards,
>> Nils
>


From goetz.lindenmaier at sap.com  Thu Jan 21 14:53:23 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 21 Jan 2016 14:53:23 +0000
Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
Message-ID: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>

Hi,

SAP changed its name from SAP AG to SAP SE. We were asked to
adapt our copyright messages accordingly.

This change fixes all SAP copyrights in hotspot to follow the patterns
"Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
"Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All rights reserved."
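The two patterns above could be checked mechanically along these lines. This is only a sketch: is_sap_copyright is a hypothetical helper, not part of the change under review. It assumes the patterns are meant as regular expressions over four-digit years, so the literal parentheses and dots are escaped, and the commas inside the character classes are dropped because [1,2] would also match a ',' character.

```cpp
#include <regex>
#include <string>

// Hypothetical checker for illustration; not part of the webrev.
// Accepts either a single year or a "first, last" year pair.
static bool is_sap_copyright(const std::string& line) {
  static const std::regex re(
      R"(Copyright \(c\) [12][90][0-9][0-9](, [12][90][0-9][0-9])? SAP SE\. All rights reserved\.)");
  return std::regex_search(line, re);
}
```

A line such as " * Copyright (c) 2012, 2016 SAP SE. All rights reserved." is accepted, while the same line with the old "SAP AG" name is rejected.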

Please review this change. I also need a sponsor, please.
http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
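
For illustration only, the two patterns above can be checked mechanically. The sketch below is not part of the webrev; the function name is hypothetical, and the year class is copied verbatim from the patterns (so the commas inside the brackets are literal members of the class).

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper, not part of the webrev: returns true if `line`
// contains an SAP copyright notice in one of the two stated forms.
bool has_sap_se_copyright(const std::string& line) {
    // Year class copied verbatim from the mail: [1,2][9,0][0-9][0-9].
    static const std::string year = "[1,2][9,0][0-9][0-9]";
    // One year, or two years separated by ", ", followed by the new name.
    static const std::regex pattern(
        "Copyright \\(c\\) " + year + "(, " + year + ")? SAP SE\\. "
        "All rights reserved\\.");
    return std::regex_search(line, pattern);
}

// Usage sketch: the new company name matches, the old one does not.
void copyright_example() {
    assert(has_sap_se_copyright(" * Copyright (c) 2012, 2016 SAP SE. All rights reserved."));
    assert(!has_sap_se_copyright(" * Copyright (c) 2012, 2016 SAP AG. All rights reserved."));
}
```

The search form (rather than a full match) is used because copyright lines normally carry a comment prefix such as " * ".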

Best regards,
  Goetz.


From roland.schatz at oracle.com  Thu Jan 21 15:41:29 2016
From: roland.schatz at oracle.com (Roland Schatz)
Date: Thu, 21 Jan 2016 16:41:29 +0100
Subject: RFR(S): 8146244: compiler/jvmci/code/DataPatchTest.java crashes:
	SIGSEGV in (getConstClass)getConstClass
Message-ID: <56A0FC29.7040506@oracle.com>

Hi,

Please review this small bugfix:
webrev: http://cr.openjdk.java.net/~rschatz/JDK-8146244/webrev.00/
issue: https://bugs.openjdk.java.net/browse/JDK-8146244

Thanks,
Roland

From thomas.stuefe at gmail.com  Thu Jan 21 15:48:14 2016
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 21 Jan 2016 16:48:14 +0100
Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
In-Reply-To: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>
Message-ID: <CAA-vtUzWnn9kLj24PAurWsTrp1PpwKmYgcs7SR9rEtdg=MTMvw@mail.gmail.com>

Hi Goetz,

http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/src/os/aix/vm/libodm_aix.cpp.frames.html

Please remove Oracle copyright, this is SAP only.

http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/src/os/aix/vm/libodm_aix.hpp.frames.html

ditto.

Otherwise looks fine.

...
Thomas

On Thu, Jan 21, 2016 at 3:53 PM, Lindenmaier, Goetz <
goetz.lindenmaier at sap.com> wrote:

> Hi,
>
>
>
> SAP changed its name from SAP AG to SAP SE. We were asked to
>
> adapt our copyright messages accordingly.
>
>
>
> This change fixes all SAP copyrights in hotspot to follow the patterns
>
> "Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
>
> "Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All
> rights reserved."
>
>
>
> Please review this change. I also need a sponsor.
>
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
>
>
>
> Best regards,
>
>   Goetz.
>
>
>

From volker.simonis at gmail.com  Thu Jan 21 17:29:29 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 21 Jan 2016 18:29:29 +0100
Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
In-Reply-To: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>
Message-ID: <CA+3eh13Q2aTYnxE1K6UmVw=KnxM4A7LLc+gyiQzWNyRELqddZg@mail.gmail.com>

Looks good!

Thanks for doing this cleanup,
Volker


On Thu, Jan 21, 2016 at 3:53 PM, Lindenmaier, Goetz
<goetz.lindenmaier at sap.com> wrote:
> Hi,
>
>
>
> SAP changed its name from SAP AG to SAP SE. We were asked to
>
> adapt our copyright messages accordingly.
>
>
>
> This change fixes all SAP copyrights in hotspot to follow the patterns
>
> "Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
>
> "Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All rights
> reserved."
>
>
>
> Please review this change. I also need a sponsor.
>
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
>
>
>
> Best regards,
>
>   Goetz.
>
>

From vladimir.kozlov at oracle.com  Thu Jan 21 18:55:07 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 21 Jan 2016 10:55:07 -0800
Subject: Baffling USE in x86_64.ad
In-Reply-To: <4295855A5C1DE049A61835A1887419CC41F1651F@DEWDFEMB12A.global.corp.sap>
References: <56A0B155.3090802@redhat.com> <56A0B30C.6080205@redhat.com>
	<4295855A5C1DE049A61835A1887419CC41F1651F@DEWDFEMB12A.global.corp.sap>
Message-ID: <56A1298B.8000804@oracle.com>

I agree that it is not needed. We never cleaned this up before. I looked at the history and it is from 2000, when C2 was still in
development.

Regards,
Vladimir


On 1/21/16 3:47 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> adlc scans the match rule and derives USE/DEF effects from that.
> Then it adds in the effects from the effect() declaration, which are
> the same here.
>
> So that line is superfluous.
>
> (I once cleaned up the effects in the ppc
> port and removed all these.  And we are expand-power-users :) )
>
> Best regards,
>    Goetz.
>
>
>> -----Original Message-----
>> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
>> bounces at openjdk.java.net] On Behalf Of Andrew Dinn
>> Sent: Donnerstag, 21. Januar 2016 11:30
>> To: Andrew Haley <aph at redhat.com>; hotspot compiler <hotspot-compiler-
>> dev at openjdk.java.net>
>> Subject: Re: Baffling USE in x86_64.ad
>>
>> On 21/01/16 10:22, Andrew Haley wrote:
>>> In this pattern:
>>>
>>> instruct compI_rReg(rFlagsReg cr, rRegI op1, rRegI op2)
>>> %{
>>>    match(Set cr (CmpI op1 op2));
>>>    effect(DEF cr, USE op1, USE op2);
>>>
>>>    format %{ "cmpl    $op1, $op2" %}
>>>    opcode(0x3B);  /* Opcode 3B /r */
>>>    ins_encode(REX_reg_reg(op1, op2), OpcP, reg_reg(op1, op2));
>>>    ins_pipe(ialu_cr_reg_reg);
>>> %}
>>>
>>> why does the USE appear in the effect?  And the DEF?  The operands
>>> appear in the match expression in the normal way, so I would have
>>> thought the effect expression unnecessary.  It's this pattern: others
>>> don't have the effect:
>>
>> I am not certain of this but I note that the above rule is used for
>> direct matching and also used for expansion. I believe the expansion
>> process requires an effect declaration even though that effect is also
>> implied by the match rule. There are many instruction definitions which
>> are only used for expansion and they have an effect declaration but no
>> match declaration.
>>
>> regards,
>>
>>
>> Andrew Dinn
>> -----------
>> Senior Principal Software Engineer
>> Red Hat UK Ltd
>> Registered in UK and Wales under Company Registration No. 3798903
>> Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
>> Argiry (US)

From vladimir.kozlov at oracle.com  Thu Jan 21 19:18:28 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 21 Jan 2016 11:18:28 -0800
Subject: [9] RFR(S): 8065334: CodeHeap expansion fails although there is
	uncommitted memory
In-Reply-To: <56A0B6F8.5040809@oracle.com>
References: <56A0B6F8.5040809@oracle.com>
Message-ID: <56A12F04.6080305@oracle.com>

Good.

Thanks,
Vladimir

On 1/21/16 2:46 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
>
> https://bugs.openjdk.java.net/browse/JDK-8065334
> http://cr.openjdk.java.net/~thartmann/8065334/webrev.00/
>
> If ReservedCodeCacheSize (or the size of a single code heap) is not a multiple of CodeCacheExpansionSize, the last code heap expansion fails, leaving unused uncommitted memory. For example, see [1]. Both the profiled and the non-profiled segments are full but there is still 32Kb of uncommitted space that is never used (CodeCacheExpansionSize is 64Kb).
>
> CodeHeap::expand_by() should check for this condition and commit all the remaining space even if the requested expansion size is larger. Like this, we use all the available space [2].
>
> Thanks,
> Tobias
>
>
> [1] Baseline:
> CodeHeap 'non-nmethods': size=5696Kb used=2403Kb max_used=2433Kb free=3292Kb
>   bounds [0x00007f4d83b27000, 0x00007f4d83d97000, 0x00007f4d840b7000]
> CodeHeap 'profiled nmethods': size=120032Kb used=119999Kb max_used=119999Kb free=32Kb
>   bounds [0x00007f4d840b7000, 0x00007f4d8b5e7000, 0x00007f4d8b5ef000]
> CodeHeap 'non-profiled nmethods': size=120032Kb used=119999Kb max_used=119999Kb free=32Kb
>   bounds [0x00007f4d8b5ef000, 0x00007f4d92b1f000, 0x00007f4d92b27000]
>   total_blobs=248449 nmethods=52 adapters=570
>   compilation: disabled (not enough contiguous free space left)
>
> [2] Fixed:
> CodeHeap 'non-nmethods': size=5696Kb used=2404Kb max_used=2436Kb free=3291Kb
>   bounds [0x00007fe8cd000000, 0x00007fe8cd270000, 0x00007fe8cd590000]
> CodeHeap 'profiled nmethods': size=120032Kb used=120031Kb max_used=120031Kb free=0Kb
>   bounds [0x00007fe8cd590000, 0x00007fe8d4ac8000, 0x00007fe8d4ac8000]
> CodeHeap 'non-profiled nmethods': size=120032Kb used=120032Kb max_used=120032Kb free=0Kb
>   bounds [0x00007fe8d4ac8000, 0x00007fe8dc000000, 0x00007fe8dc000000]
>   total_blobs=10665 nmethods=127 adapters=570
>   compilation: disabled (not enough contiguous free space left)
>
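
As a rough illustration of the behaviour described in the quoted patch (commit whatever is left when the request is larger than the remaining reservation), here is a minimal sketch. It is not the actual CodeHeap::expand_by() code; the function name and parameters are hypothetical and all sizes are in bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical model, not the real CodeHeap::expand_by().
// Returns how many bytes to commit for an expansion request.
std::size_t clamped_expansion(std::size_t requested,
                              std::size_t committed,
                              std::size_t reserved) {
    assert(committed <= reserved);
    const std::size_t remaining = reserved - committed;
    // Commit the tail of the reservation even if it is smaller than requested,
    // instead of failing the whole expansion.
    return std::min(requested, remaining);
}
```

With the numbers from the baseline output (120032Kb reserved, 64Kb expansion size, 32Kb left uncommitted), a 64Kb request would commit the remaining 32Kb rather than fail.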

From vladimir.kozlov at oracle.com  Thu Jan 21 19:28:54 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 21 Jan 2016 11:28:54 -0800
Subject: RFR(S): 8138756: Compiler Control: Print directives in hs_err
In-Reply-To: <56A0B398.4000408@oracle.com>
References: <56A0B237.9090008@oracle.com> <56A0B398.4000408@oracle.com>
Message-ID: <56A13176.804@oracle.com>

Passing directives through ciEnv is fine.
My question is about the output in the hs_err file. How were those directives selected in your example?
I found it strange to see mixed flag values and oracle commands. "Enable:true Exclude:false" - what do these correspond
to, for example?

Should we not print directives/flags which are not set explicitly?

Thanks,
Vladimir

On 1/21/16 2:31 AM, Nils Eliasson wrote:
> This is how it looks:
>
> [...]
>
> ---------------  T H R E A D  ---------------
>
> Current thread (0x00007f071046a000):  JavaThread "C1 CompilerThread10" daemon [_thread_in_native, id=20033, stack(0x00007f05d7afb000,0x00007f05d7bfc000)]
>
> Current CompileTask:
> C1:    225    1       3       java.lang.String::isLatin1 (19 bytes)
>
> Current compiler directive:
>    inline: -
>    Enable:true Exclude:false BreakAtExecute:false BreakAtCompile:false Log:false PrintAssembly:false PrintInlining:false PrintNMethods:false ReplayInline:false DumpReplay:false DumpInline:false CompilerDirectivesIgnoreCompileCommands:false DisableIntrinsic: BlockLayoutByFrequency:true PrintOptoAssembly:false PrintIntrinsics:false TraceOptoPipelining:false TraceOptoOutput:false TraceSpilling:false Vectorize:false VectorizeDebug:false CloneMapDebug:false DoReserveCopyInSuperWordDebug:false IGVPrintLevel:0 MaxNodeLimit:80000
>
> Stack: [0x00007f05d7afb000,0x00007f05d7bfc000],  sp=0x00007f05d7bfa5d0,  free space=1021k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x12e7532]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x182
> V  [libjvm.so+0x12e829a]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
> V  [libjvm.so+0x908cca]  report_vm_error(char const*, int, char const*, char const*, ...)+0xea
> V  [libjvm.so+0x88df81]  CompileBroker::post_compile(CompilerThread*, CompileTask*, EventCompilation&, bool, ciEnv*)+0x1b1
> V  [libjvm.so+0x88ec5a]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x90a
> V  [libjvm.so+0x88f960]  CompileBroker::compiler_thread_loop()+0x540
> V  [libjvm.so+0x1264789]  JavaThread::thread_main_inner()+0x1c9
> V  [libjvm.so+0x1264ac6]  JavaThread::run()+0x2a6
> V  [libjvm.so+0x10189aa]  java_start(Thread*)+0xca
> C  [libpthread.so.0+0x8182]  start_thread+0xc2
>
> [...]
>
> http://cr.openjdk.java.net/~neliasso/8138756/hserr.txt
>
> Regards,
> Nils
>
> On 2016-01-21 11:25, Nils Eliasson wrote:
>> Hi,
>>
>> Please review this small change. The diff looks big but most of the change is just changing how the directives are
>> passed to the compilers. Directives are set in the ciEnv and then passed to the compilers. The compilers can then
>> choose to add it to any internal compilation object for convenience. The hs_err printing routine in vmError.cpp loads
>> the directive from the ciEnv.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8138756
>> Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.01/
>>
>> Regards,
>> Nils
>

From christian.thalinger at oracle.com  Thu Jan 21 23:03:47 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 21 Jan 2016 13:03:47 -1000
Subject: RFR: 8147599: [JVMCI] simplify code installation interface
In-Reply-To: <569F8D19.4090305@oracle.com>
References: <569F8D19.4090305@oracle.com>
Message-ID: <B414FDB3-014B-42BF-A55F-EAA8D2BA93F3@oracle.com>

Looks good.

> On Jan 20, 2016, at 3:35 AM, Roland Schatz <roland.schatz at oracle.com> wrote:
> 
> Hi,
> 
> Please review this change to the JVMCI code installation interface:
> 
> webrev: http://cr.openjdk.java.net/~rschatz/JDK-8147599/webrev.00/
> jira: https://bugs.openjdk.java.net/browse/JDK-8147599
> 
> The new classes in the jdk.vm.ci.code.site package used to be inner classes in the removed CompilationResult class; there are no actual code changes there.
> 
> Thanks,
> Roland


From goetz.lindenmaier at sap.com  Fri Jan 22 07:41:42 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 22 Jan 2016 07:41:42 +0000
Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
In-Reply-To: <CAA-vtUzWnn9kLj24PAurWsTrp1PpwKmYgcs7SR9rEtdg=MTMvw@mail.gmail.com>
References: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>
	<CAA-vtUzWnn9kLj24PAurWsTrp1PpwKmYgcs7SR9rEtdg=MTMvw@mail.gmail.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC41F16A11@DEWDFEMB12A.global.corp.sap>

Hi Thomas,

I only want to make syntactic changes to our copyright messages. I don't
want to change any of their content.

So please let's leave this to another change.

Thanks,
  Goetz.

From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
Sent: Thursday, January 21, 2016 4:48 PM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR(M): 8147937: Adapt SAP copyrights to new company name.

Hi Goetz,

http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/src/os/aix/vm/libodm_aix.cpp.frames.html

Please remove Oracle copyright, this is SAP only.

http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/src/os/aix/vm/libodm_aix.hpp.frames.html

ditto.

Otherwise looks fine.

...
Thomas

On Thu, Jan 21, 2016 at 3:53 PM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:
Hi,

SAP changed its name from SAP AG to SAP SE. We were asked to
adapt our copyright messages accordingly.

This change fixes all SAP copyrights in hotspot to follow the patterns
"Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
"Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All rights reserved."

Please review this change. I also need a sponsor.
http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01

Best regards,
  Goetz.



From tobias.hartmann at oracle.com  Fri Jan 22 08:48:59 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 22 Jan 2016 09:48:59 +0100
Subject: [9] RFR(S): 8065334: CodeHeap expansion fails although there is
	uncommitted memory
In-Reply-To: <56A12F04.6080305@oracle.com>
References: <56A0B6F8.5040809@oracle.com> <56A12F04.6080305@oracle.com>
Message-ID: <56A1ECFB.9000205@oracle.com>

Thanks, Vladimir.

Best,
Tobias

On 21.01.2016 20:18, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir
> 
> On 1/21/16 2:46 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8065334
>> http://cr.openjdk.java.net/~thartmann/8065334/webrev.00/
>>
>> If ReservedCodeCacheSize (or the size of a single code heap) is not a multiple of CodeCacheExpansionSize, the last code heap expansion fails, leaving unused uncommitted memory. For example, see [1]. Both the profiled and the non-profiled segments are full but there is still 32Kb of uncommitted space that is never used (CodeCacheExpansionSize is 64Kb).
>>
>> CodeHeap::expand_by() should check for this condition and commit all the remaining space even if the requested expansion size is larger. Like this, we use all the available space [2].
>>
>> Thanks,
>> Tobias
>>
>>
>> [1] Baseline:
>> CodeHeap 'non-nmethods': size=5696Kb used=2403Kb max_used=2433Kb free=3292Kb
>>   bounds [0x00007f4d83b27000, 0x00007f4d83d97000, 0x00007f4d840b7000]
>> CodeHeap 'profiled nmethods': size=120032Kb used=119999Kb max_used=119999Kb free=32Kb
>>   bounds [0x00007f4d840b7000, 0x00007f4d8b5e7000, 0x00007f4d8b5ef000]
>> CodeHeap 'non-profiled nmethods': size=120032Kb used=119999Kb max_used=119999Kb free=32Kb
>>   bounds [0x00007f4d8b5ef000, 0x00007f4d92b1f000, 0x00007f4d92b27000]
>>   total_blobs=248449 nmethods=52 adapters=570
>>   compilation: disabled (not enough contiguous free space left)
>>
>> [2] Fixed:
>> CodeHeap 'non-nmethods': size=5696Kb used=2404Kb max_used=2436Kb free=3291Kb
>>   bounds [0x00007fe8cd000000, 0x00007fe8cd270000, 0x00007fe8cd590000]
>> CodeHeap 'profiled nmethods': size=120032Kb used=120031Kb max_used=120031Kb free=0Kb
>>   bounds [0x00007fe8cd590000, 0x00007fe8d4ac8000, 0x00007fe8d4ac8000]
>> CodeHeap 'non-profiled nmethods': size=120032Kb used=120032Kb max_used=120032Kb free=0Kb
>>   bounds [0x00007fe8d4ac8000, 0x00007fe8dc000000, 0x00007fe8dc000000]
>>   total_blobs=10665 nmethods=127 adapters=570
>>   compilation: disabled (not enough contiguous free space left)
>>

From aph at redhat.com  Fri Jan 22 09:43:38 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 22 Jan 2016 09:43:38 +0000
Subject: Baffling USE in x86_64.ad
In-Reply-To: <56A1298B.8000804@oracle.com>
References: <56A0B155.3090802@redhat.com> <56A0B30C.6080205@redhat.com>
	<4295855A5C1DE049A61835A1887419CC41F1651F@DEWDFEMB12A.global.corp.sap>
	<56A1298B.8000804@oracle.com>
Message-ID: <56A1F9CA.8030609@redhat.com>

On 21/01/16 18:55, Vladimir Kozlov wrote:

> I agree that it is not needed. We never cleaned this up before. I
> looked at the history and it is from 2000, when C2 was still in development.

Thanks for looking at that.  One of the most difficult problems a
programmer ever encounters is the mysterious line of code: "I don't
know what this is for, but I'm scared to remove it in case it's being
used for some odd side-effect.  I think I'll leave it in."

Andrew.

From mikael.gerdin at oracle.com  Fri Jan 22 09:46:32 2016
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Fri, 22 Jan 2016 10:46:32 +0100
Subject: RFR(M) 8147461: Use byte offsets for vtable start and vtable
	length offsets
In-Reply-To: <56A04DCF.9090204@oracle.com>
References: <569926B9.4070806@oracle.com> <569F7E22.3090905@oracle.com>
	<56A04DCF.9090204@oracle.com>
Message-ID: <56A1FA78.3090608@oracle.com>

Hi Chris,

On 2016-01-21 04:17, Chris Plummer wrote:
> Hi Mikael,
>
> The changes look good except I think you should get someone from the
> compiler team to make sure the change in
> HotSpotResolvedJavaMethodImpl.java and HotSpotVMConfig.java are ok. I'm
> not sure why you chose to remove instanceKlassVtableStartOffset() rather
> than just fix it.

I'm cc:ing hotspot-compiler-dev and graal-dev to see if I can get 
someone to ok the JVMCI parts.

The reason for removing the method is that the only reason for it being 
a method was to apply the wordSize scaling on the value and since I 
changed the offset to be a byte offset it does not need scaling and can 
be treated similar to the other constants in HotSpotVMConfig which are 
accessed without any accessor method.
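
To illustrate the point about scaling, here is a minimal sketch of a byte-offset type and a vtable entry offset computed directly in bytes. The enum is only a stand-in for HotSpot's ByteSize, and the helper names and example numbers are assumptions for illustration, not the code in the webrev.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's ByteSize: a distinct type so that a
// byte offset cannot be silently mixed with a word offset.
enum class ByteSize : int {};

constexpr ByteSize in_ByteSize(int bytes) { return static_cast<ByteSize>(bytes); }
constexpr int in_bytes(ByteSize size) { return static_cast<int>(size); }

// Byte offset of vtable entry `index`, given the byte offset of the vtable
// start and the size of one entry in bytes. No wordSize scaling is needed
// because every input is already expressed in bytes.
inline int vtable_entry_offset(ByteSize vtable_start, int index,
                               int entry_size_in_bytes) {
    assert(index >= 0);
    return in_bytes(vtable_start) + index * entry_size_in_bytes;
}
```

For example, with an assumed vtable start at byte 440 and 8-byte entries, entry 3 is at byte 464.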

>
> I think some of your changes may conflict with my changes for
> JDK-8143608. Coleen is pushing JDK-8143608 for me once hs-rt opens up.
> I'd appreciate it if you could wait until after then before doing your
> push.

Will do, would you mind pinging me when you've integrated 8143608?

/Mikael

>
> thanks,
>
> Chris
>
> On 1/20/16 4:31 AM, Mikael Gerdin wrote:
>> Hi again,
>>
>> I've rebased the patch on hs-rt and had to include some additional changes
>> for JVMCI.
>> I've also updated the copyright years.
>> Unfortunately I can't generate an incremental webrev since I rebased
>> the patch and there's no good way that I know of to make that work
>> with webrev.
>>
>> New webrev at: http://cr.openjdk.java.net/~mgerdin/8147461/webrev.1/
>>
>> Testing: JPRT again (which includes the JVMCI jtreg tests)
>>
>> /Mikael
>>
>> On 2016-01-15 18:04, Mikael Gerdin wrote:
>>> Hi all,
>>>
>>> As per the previous discussion in mid-December[0] about moving the
>>> _vtable_length field to class Klass, here's the first RFR and webrev,
>>> according to my suggested plan[1]:
>>>
>>>> My current plan is to first modify the vtable_length_offset accessor to
>>>> return a byte offset (which is what it's translated to by all callers).
>>>>
>>>> Then I'll tackle moving the _vtable_len field to Klass.
>>>>
>>>> Finally I'll try to consolidate the vtable related methods to Klass,
>>>> where they belong.
>>>
>>> This change actually consists of three changes:
>>> * modifying InstanceKlass::vtable_length_offset to become a byte offset
>>> and use the ByteSize type to communicate the scaling.
>>> * modifying InstanceKlass::vtable_start_offset to become a byte offset
>>> and use the ByteSize type, for symmetry reasons mainly.
>>> * adding a vtableEntry::size_in_bytes() since in many places the vtable
>>> entry size is used in combination with the vtable start to compute a
>>> byte offset for vtable lookups.
>>>
>>> I don't foresee any issues with the fact that the byte offset is
>>> represented as an int, for two reasons:
>>> 1) If the offset of any of these grows to over 2 gigabytes then we have
>>> a huge footprint problem with InstanceKlass
>>> 2) The offsets are converted to byte offsets and stored in ints already
>>> in the cpu specific code I've modified.
>>>
>>> Bug link: https://bugs.openjdk.java.net/browse/JDK-8147461
>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8147461/webrev.0/
>>>
>>> Testing: JPRT on Oracle supported platforms, testing on AARCH64 and
>>> PPC64 would be much appreciated, appropriate mailing lists have been
>>> CC:ed to notify them of the request.
>>>
>>>
>>> [0]
>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021152.html
>>>
>>>
>>> [1]
>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021224.html
>>>
>>>
>>>
>>> Thanks!
>>> /Mikael
>>
>


From nils.eliasson at oracle.com  Fri Jan 22 13:38:31 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 22 Jan 2016 14:38:31 +0100
Subject: RFR(S): 8138756: Compiler Control: Print directives in hs_err
In-Reply-To: <56A13176.804@oracle.com>
References: <56A0B237.9090008@oracle.com> <56A0B398.4000408@oracle.com>
	<56A13176.804@oracle.com>
Message-ID: <56A230D7.9060606@oracle.com>

Hi, Vladimir

On 2016-01-21 20:28, Vladimir Kozlov wrote:
> Passing directives through ciEnv is fine.
> My question is about the output in the hs_err file. How were those directives
> selected in your example?

It only prints the directive that is used for the current compile task 
(that caused the crash). (That's why I put them together in the hs_err file.)

> I found it strange to see mixed flag values and oracle commands.
> "Enable:true Exclude:false" - what do these correspond to, for example?

These are all options from the directive, and they are set with
directives (highest priority), compilecommand or vmflags (lowest
priority).

>
> Should we not print directives/flags which are not set explicitly?

I updated the print output to mark all options in the directive that are
not default with a '*'. That makes it quicker to see if any special
options were applied. It will also print whether the directive is the
unmodified default directive.

Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.03/
Example output: 
http://cr.openjdk.java.net/~neliasso/8138756/webrev.03/hserr.txt
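
For readers unfamiliar with the priority order, here is a minimal sketch of how one boolean option could be resolved and marked. The struct, the function names and the position of the '*' are assumptions for illustration only; they are not the DirectiveSet code or the exact output format of the webrev.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of one boolean option and where its value may come from.
struct OptionSources {
    std::optional<bool> directive;        // highest priority
    std::optional<bool> compile_command;  // middle priority
    std::optional<bool> vm_flag;          // lowest priority
    bool default_value;
};

// The first source that is set wins; otherwise the default applies.
bool resolve(const OptionSources& s) {
    if (s.directive) return *s.directive;
    if (s.compile_command) return *s.compile_command;
    if (s.vm_flag) return *s.vm_flag;
    return s.default_value;
}

// Renders "Name:value", with a trailing '*' when the value is not the default.
std::string render(const std::string& name, const OptionSources& s) {
    const bool value = resolve(s);
    std::string out = name + ":" + (value ? "true" : "false");
    if (value != s.default_value) out += "*";
    return out;
}

// Usage sketch: a compile command overrides a VM flag and the default.
void directive_example() {
    OptionSources s{std::nullopt, true, false, false};
    assert(render("PrintAssembly", s) == "PrintAssembly:true*");
}
```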

Regards,
Nils

>
> Thanks,
> Vladimir
>
> On 1/21/16 2:31 AM, Nils Eliasson wrote:
>> This is how it looks:
>>
>> [...]
>>
>> ---------------  T H R E A D  ---------------
>>
>> Current thread (0x00007f071046a000):  JavaThread "C1 
>> CompilerThread10" daemon [_thread_in_native, id=20033, 
>> stack(0x00007f05d7afb000,0x00007f05d7bfc000)]
>>
>> Current CompileTask:
>> C1:    225    1       3       java.lang.String::isLatin1 (19 bytes)
>>
>> Current compiler directive:
>>    inline: -
>>    Enable:true Exclude:false BreakAtExecute:false 
>> BreakAtCompile:false Log:false PrintAssembly:false 
>> PrintInlining:false PrintNMethods:false ReplayInline:false 
>> DumpReplay:false DumpInline:false 
>> CompilerDirectivesIgnoreCompileCommands:false DisableIntrinsic: 
>> BlockLayoutByFrequency:true PrintOptoAssembly:false 
>> PrintIntrinsics:false TraceOptoPipelining:false TraceOptoOutput:false 
>> TraceSpilling:false Vectorize:false VectorizeDebug:false 
>> CloneMapDebug:false DoReserveCopyInSuperWordDebug:false 
>> IGVPrintLevel:0 MaxNodeLimit:80000
>>
>> Stack: [0x00007f05d7afb000,0x00007f05d7bfc000], 
>> sp=0x00007f05d7bfa5d0,  free space=1021k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, 
>> C=native code)
>> V  [libjvm.so+0x12e7532]  VMError::report_and_die(int, char const*, 
>> char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, 
>> char const*, int, unsigned long)+0x182
>> V  [libjvm.so+0x12e829a]  VMError::report_and_die(Thread*, char 
>> const*, int, char const*, char const*, __va_list_tag*)+0x4a
>> V  [libjvm.so+0x908cca]  report_vm_error(char const*, int, char 
>> const*, char const*, ...)+0xea
>> V  [libjvm.so+0x88df81] CompileBroker::post_compile(CompilerThread*, 
>> CompileTask*, EventCompilation&, bool, ciEnv*)+0x1b1
>> V  [libjvm.so+0x88ec5a] 
>> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x90a
>> V  [libjvm.so+0x88f960] CompileBroker::compiler_thread_loop()+0x540
>> V  [libjvm.so+0x1264789]  JavaThread::thread_main_inner()+0x1c9
>> V  [libjvm.so+0x1264ac6]  JavaThread::run()+0x2a6
>> V  [libjvm.so+0x10189aa]  java_start(Thread*)+0xca
>> C  [libpthread.so.0+0x8182]  start_thread+0xc2
>>
>> [...]
>>
>> http://cr.openjdk.java.net/~neliasso/8138756/hserr.txt
>>
>> Regards,
>> Nils
>>
>> On 2016-01-21 11:25, Nils Eliasson wrote:
>>> Hi,
>>>
>>> Please review this small change. The diff looks big but most of the 
>>> change is just changing how the directives are
>>> passed to the compilers. Directives are set in the ciEnv and then 
>>> passed to the compilers. The compilers can then
>>> choose to add it to any internal compilation object for convenience. 
>>> The hs_err printing routine in vmError.cpp loads
>>> the directive from the ciEnv.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8138756
>>> Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.01/
>>>
>>> Regards,
>>> Nils
>>


From nils.eliasson at oracle.com  Fri Jan 22 14:40:33 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 22 Jan 2016 15:40:33 +0100
Subject: RFR(S): 8063112: Compiler diagnostic commands should have locking
	instead of safepoint
Message-ID: <56A23F61.9000201@oracle.com>

Hi,

Please review.

Summary:
Firstly, this change removes the unnecessary vm-ops from three compiler
diagnostic commands and adds locking instead.
Secondly, the Compiler.queue diagnostic command now also prints
any active compilations. I found this useful when diagnosing a rogue VM.
Thirdly, as a bonus, I also add printing of active compilations in the 
thread section of the hs_err file. Very useful when investigating VMs 
terminated by a timeout.

Testing:
This does not pass all tests yet. A few tests depend on the output
from the diagnostic command, and I want to be sure the reviewers are
happy with the output format first.

Bug: https://bugs.openjdk.java.net/browse/JDK-8063112
Webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.02/

Regards,
Nils


From rahul.v.raghavan at oracle.com  Fri Jan 22 16:11:51 2016
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Fri, 22 Jan 2016 08:11:51 -0800 (PST)
Subject: FW: RFR(S): 6378256: Performance problem with
	System.identityHashCode in client compiler
Message-ID: <d66ae4c8-530a-407a-ac89-6484150c4277@default>


> -----Original Message-----
> From: Tobias Hartmann
> Sent: Monday, January 11, 2016 2:56 PM
> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
> 
> Hi Rahul,
> 
> > http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> 
> Why don't you use 'markOopDesc::hash_mask_in_place' for the 64 bit version? This should save some instructions and you also don't
> need the 'hash' register if you compute everything in 'result'.

Thank you for your comments Tobias.

I could not get the implementation to work with 'markOopDesc::hash_mask_in_place' on x86_64 (as is done for x86_32).
Using
    __ andptr(result, markOopDesc::hash_mask_in_place);
results in the build error 'overflow in implicit constant conversion'.

I then understood from 'sharedRuntime_sparc.cpp' and 'markOop.hpp' that 'hash_mask_in_place' should be avoided on 64-bit because the value is too big.
There are similar comments in LibraryCallKit::inline_native_hashcode [hotspot/src/share/vm/opto/library_call.cpp].
I could not find another way to use hash_mask_in_place here for x86_64.

So the code depends on markOopDesc::hash_mask and markOopDesc::hash_shift instead (similar to what is done in sharedRuntime_sparc).
I added the missing comment about this in the revised webrev.

Also, yes, I missed the optimized codegen.
I tried a revised patch that removes the extra 'hash' and 'mask' registers and computes everything in 'result' itself.

[sharedRuntime_x86_64.cpp]
     ....................
+    Register obj_reg = j_rarg0;
+    Register result = rax;
     ........
+    // get hash
+    // Read the header and build a mask to get its hash field.
+    // Depend on hash_mask being at most 32 bits and avoid the use of hash_mask_in_place
+    // because it could be larger than 32 bits in a 64-bit vm. See markOop.hpp.
+    __ shrptr(result, markOopDesc::hash_shift);
+    __ andptr(result, markOopDesc::hash_mask);
+    // test if hashCode exists
+    __ jcc  (Assembler::zero, slowCase);
+    __ ret(0);
+    __ bind (slowCase);
      ........

Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.

Please send your comments. I can submit revised webrev if all okay.
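
To show why shifting first avoids the wide constant, here is a minimal sketch in plain C++. The constants are assumed for illustration (a 31-bit hash field starting at bit 8 of a 64-bit mark word); markOop.hpp remains the authoritative source, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed 64-bit mark word layout for illustration only.
constexpr int           kHashShift       = 8;
constexpr std::uint64_t kHashMask        = (std::uint64_t{1} << 31) - 1;
constexpr std::uint64_t kHashMaskInPlace = kHashMask << kHashShift;

static_assert(kHashMask <= UINT32_MAX, "plain mask fits in 32 bits");
static_assert(kHashMaskInPlace > UINT32_MAX, "in-place mask needs more than 32 bits");

// Shift first, then mask: mirrors the shrptr/andptr sequence in the patch.
constexpr std::uint32_t hash_shift_then_mask(std::uint64_t mark) {
    return static_cast<std::uint32_t>((mark >> kHashShift) & kHashMask);
}

// Mask in place, then shift: same result, but needs the wide constant.
constexpr std::uint32_t hash_mask_then_shift(std::uint64_t mark) {
    return static_cast<std::uint32_t>((mark & kHashMaskInPlace) >> kHashShift);
}

// Usage sketch: both orders extract the same hash from a sample mark word.
void hash_example() {
    const std::uint64_t mark = (std::uint64_t{0x12345678} << kHashShift) | 0x05;
    assert(hash_shift_then_mask(mark) == 0x12345678u);
    assert(hash_mask_then_shift(mark) == hash_shift_then_mask(mark));
}
```

A result of zero corresponds to the "no hash yet" case that the patch sends to slowCase.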

> 
> Best,
> Tobias
> 
> 
> On 08.01.2016 18:13, Rahul Raghavan wrote:
> > Hello,
> >
> > Please review the following revised patch for JDK-6378256 -
> > http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> >
> > This revised webrev got following changes -
> >
> >  1) A minor, better optimized code with return 0 at initial stage (instead of continuing to 'slowCase' path), for special/rare null
> >    reference input!
> >    (as per documentation, test results confirmed it is safe to 'return 0' for null reference input, for System.identityHashCode)
> >
> >  2) Added similar Object.hashCode, System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.
> >
> > Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> >
> > Thanks,
> > Rahul
> >
> >
> >> -----Original Message-----
> >> From: Roland Westrelin
> >> Sent: Wednesday, December 09, 2015 8:03 PM
> >> To: Rahul Raghavan
> >> Cc: hotspot-compiler-dev at openjdk.java.net
> >>
> >>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> >>
> >> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be
> >> nice.
> >> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
> >>
> >> Roland.
> >
> >
> >> -----Original Message-----
> >> From: Rahul Raghavan
> >> Sent: Wednesday, December 09, 2015 2:43 PM
> >> To: hotspot-compiler-dev at openjdk.java.net
> >>
> >> Hello,
> >>
> >> Please review the following patch for JDK-6378256.
> >>
> >> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> >>
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
> >> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times
> >> slower).
> >> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
> >>
> >> sample unit test:
> >>    public class Jdk6378256Test
> >>    {
> >>       public static void main(String[] args)
> >>       {
> >>          Object obj = new Object();
> >>          long time = System.nanoTime();
> >>          for(int i = 0 ; i < 1000000 ; i++)
> >>             System.identityHashCode(obj);  //compare to obj.hashCode();
> >>          System.out.println ("Result = " + (System.nanoTime() - time));
> >>       }
> >>    }
> >>
> >> Fix: Enabled the C1 optimization which was done only for Object.hashCode, now for System.identityHashCode() also.
> >> (looks in the header for the hashCode before calling into the VM).
> >> Unlike for Object.hashCode, System.identityHashCode is static method and gets object as argument instead of the receiver.
> >> So also added required additional null check for System.identityHashCode case.
> >>
> >> Testing:
> >>    - successful JPRT run (-testset hotspot).
> >>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
> >>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
> >>    - Added 'noreg-perf' label for this performance bug.
> >>       Manual testing done and confirmed expected performance values for unit tests with fix.
> >>
> >> Thanks,
> >> Rahul
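
For reference, the two properties this change depends on can be checked from plain Java. This is a small sketch, separate from the patch; IdentityHashCheck and sameHash are hypothetical names. It relies only on documented behaviour: System.identityHashCode(null) is zero, and for an object whose class does not override hashCode() both calls return the same value.

```java
final class IdentityHashCheck {
    // True when both entry points agree. This is only guaranteed for objects
    // whose class does not override hashCode(), such as a plain Object.
    static boolean sameHash(Object obj) {
        return System.identityHashCode(obj) == obj.hashCode();
    }

    public static void main(String[] args) {
        Object obj = new Object();
        System.out.println("same hash: " + sameHash(obj));
        // A null reference is a legal argument and yields zero.
        System.out.println("null hash: " + System.identityHashCode(null));
    }
}
```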

From roland.westrelin at oracle.com  Fri Jan 22 16:38:45 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 22 Jan 2016 17:38:45 +0100
Subject: RFR(XS): 8147853: "assert(t->meet(t0) == t) failed: Not monotonic"
	with sun/util/calendar/zi/TestZoneInfo310.java
Message-ID: <DAC01147-E291-4310-AFDD-188087C7DF75@oracle.com>

During CCP, a Phi for the induction variable of a CountedLoop is processed repeatedly while the type of the backedge control is top so only the loop entry input is considered for computing the Phi's type.

The loop entry first has type int:1..3 so the Phi's type is int:1..3
then it has type int:1..4 so the Phi's type is int:1..4
then it has type int:1..5:www so the Phi's type is int:1..5:www
then it has type int:1..6:www so the Phi's type is saturated to int:1..max-1:www

The backedge control's type is changed to non-top and the type of the Phi is recomputed. This time the special code for counted loop in PhiNode::Value():

  CountedLoopNode* l = r->is_CountedLoop() ? r->as_CountedLoop() : NULL;
  if (l && l->can_be_counted_loop(phase) &&
      ((const Node*)l->phi() == this)) { // Trip counted loop!
    // protect against init_trip() or limit() returning NULL
    const Node *init   = l->init_trip();
    const Node *limit  = l->limit();
    const Node* stride = l->stride();
    if (init != NULL && limit != NULL && stride != NULL) {
      const TypeInt* lo = phase->type(init)->isa_int();
      const TypeInt* hi = phase->type(limit)->isa_int();
      const TypeInt* stride_t = phase->type(stride)->isa_int();
      if (lo != NULL && hi != NULL && stride_t != NULL) { // Dying loops might have TOP here
        assert(stride_t->_hi >= stride_t->_lo, "bad stride type");
        const Type* res = NULL;
        if (stride_t->_hi < 0) {          // Down-counter loop
          swap(lo, hi);
          return TypeInt::make(MIN2(lo->_lo, hi->_lo) , hi->_hi, 3);
        } else if (stride_t->_lo >= 0) {
          return TypeInt::make(lo->_lo, MAX2(lo->_hi, hi->_hi), 3);
        }
      }
    }
  }


kicks in and it computes a type of: int:1..8:www. The type of the Phi was narrowed and the assert fires.

I suggest we fix this by saturating the type of the Phi only once the type of the loop's backedge is non top. This way, the special code for counted loop above has a chance to run and that should be enough to keep the types during CCP monotonic.

http://cr.openjdk.java.net/~roland/8147853/webrev.00/

Roland.
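
The failing check can be illustrated outside the VM. The following is a minimal Java sketch, not HotSpot code: Range, meet and isMonotonic are hypothetical stand-ins for TypeInt, Type::meet() and the assert, the widen bits are ignored, and meet is modelled as the smallest range covering both inputs. It replays the sequence of types described above; the last step, from the saturated type to int:1..8, is the one that fails the test.

```java
// Hypothetical model of integer range types; not the HotSpot TypeInt class.
final class CcpMonotonicSketch {
    static final class Range {
        final int lo;
        final int hi;

        Range(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        // Lattice meet modelled as the smallest range covering both inputs.
        Range meet(Range other) {
            return new Range(Math.min(lo, other.lo), Math.max(hi, other.hi));
        }

        boolean sameAs(Range other) {
            return lo == other.lo && hi == other.hi;
        }

        @Override
        public String toString() {
            return "int:" + lo + ".." + hi;
        }
    }

    // Same shape as assert(t->meet(t0) == t): the new type must still cover
    // the old type, so a type may only widen from one visit to the next.
    static boolean isMonotonic(Range oldType, Range newType) {
        return newType.meet(oldType).sameAs(newType);
    }

    public static void main(String[] args) {
        Range[] seen = {
            new Range(1, 3), new Range(1, 4), new Range(1, 5), new Range(1, 6),
            new Range(1, Integer.MAX_VALUE - 1), // saturated while backedge is top
            new Range(1, 8)                      // counted-loop special case
        };
        for (int i = 1; i < seen.length; i++) {
            System.out.println(seen[i - 1] + " -> " + seen[i]
                    + " monotonic: " + isMonotonic(seen[i - 1], seen[i]));
        }
    }
}
```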

From christian.thalinger at oracle.com  Fri Jan 22 17:18:04 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 22 Jan 2016 07:18:04 -1000
Subject: RFR(S): 8146244: compiler/jvmci/code/DataPatchTest.java crashes:
	SIGSEGV in (getConstClass)getConstClass
In-Reply-To: <56A0FC29.7040506@oracle.com>
References: <56A0FC29.7040506@oracle.com>
Message-ID: <0C33C507-194F-4867-B282-0DBB16FE4A0A@oracle.com>

Looks good.

> On Jan 21, 2016, at 5:41 AM, Roland Schatz <roland.schatz at oracle.com> wrote:
> 
> Hi,
> 
> Please review this small bugfix:
> webrev: http://cr.openjdk.java.net/~rschatz/JDK-8146244/webrev.00/
> issue: https://bugs.openjdk.java.net/browse/JDK-8146244
> 
> Thanks,
> Roland


From vladimir.kozlov at oracle.com  Fri Jan 22 18:36:30 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 22 Jan 2016 10:36:30 -0800
Subject: RFR(XS): 8147853: "assert(t->meet(t0) == t) failed: Not
	monotonic" with sun/util/calendar/zi/TestZoneInfo310.java
In-Reply-To: <DAC01147-E291-4310-AFDD-188087C7DF75@oracle.com>
References: <DAC01147-E291-4310-AFDD-188087C7DF75@oracle.com>
Message-ID: <56A276AE.8060408@oracle.com>

Good fix.

Can we simply return the type of the phi's in(EntryControl) input in this 
case (backedge is top), without filter_speculative() and the verification 
code under assert, which I think is useless here? We already have a 
check for the counted loop; we only need to separate out the 
can_be_counted_loop() condition.

Thanks,
Vladimir

On 1/22/16 8:38 AM, Roland Westrelin wrote:
> During CCP, a Phi for the induction variable of a CountedLoop is processed repeatedly while the type of the backedge control is top so only the loop entry input is considered for computing the Phi's type.
>
> The loop entry first has type int:1..3 so the Phi's type is int:1..3
> then it has type int:1..4 so the Phi's type is int:1..4
> then it has type int:1..5:www so the Phi's type is int:1..5:www
> then it has type int:1..6:www so the Phi's type is saturated to int:1..max-1:www
>
> The backedge control's type is changed to non-top and the type of the Phi is recomputed. This time the special code for counted loop in PhiNode::Value():
>
>    CountedLoopNode* l = r->is_CountedLoop() ? r->as_CountedLoop() : NULL;
>    if (l && l->can_be_counted_loop(phase) &&
>        ((const Node*)l->phi() == this)) { // Trip counted loop!
>      // protect against init_trip() or limit() returning NULL
>      const Node *init   = l->init_trip();
>      const Node *limit  = l->limit();
>      const Node* stride = l->stride();
>      if (init != NULL && limit != NULL && stride != NULL) {
>        const TypeInt* lo = phase->type(init)->isa_int();
>        const TypeInt* hi = phase->type(limit)->isa_int();
>        const TypeInt* stride_t = phase->type(stride)->isa_int();
>        if (lo != NULL && hi != NULL && stride_t != NULL) { // Dying loops might have TOP here
>          assert(stride_t->_hi >= stride_t->_lo, "bad stride type");
>          const Type* res = NULL;
>          if (stride_t->_hi < 0) {          // Down-counter loop
>            swap(lo, hi);
>            return TypeInt::make(MIN2(lo->_lo, hi->_lo) , hi->_hi, 3);
>          } else if (stride_t->_lo >= 0) {
>            return TypeInt::make(lo->_lo, MAX2(lo->_hi, hi->_hi), 3);
>          }
>        }
>      }
>    }
>
>
> kicks in and it computes a type of: int:1..8:www. The type of the Phi was narrowed and the assert fires.
>
> I suggest we fix this by saturating the type of the Phi only once the type of the loop's backedge is non top. This way, the special code for counted loop above has a chance to run and that should be enough to keep the types during CCP monotonic.
>
> http://cr.openjdk.java.net/~roland/8147853/webrev.00/
>
> Roland.
>

From vladimir.kozlov at oracle.com  Fri Jan 22 18:56:21 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 22 Jan 2016 10:56:21 -0800
Subject: RFR(S): 8138756: Compiler Control: Print directives in hs_err
In-Reply-To: <56A230D7.9060606@oracle.com>
References: <56A0B237.9090008@oracle.com> <56A0B398.4000408@oracle.com>
	<56A13176.804@oracle.com> <56A230D7.9060606@oracle.com>
Message-ID: <56A27B55.6050502@oracle.com>

"no inline - compile commands may apply" is confusing to me (and for 
others who are not familiar with directives). What does it mean? :)
Does it mean that no 'inline' directives were used, or the opposite: that 
the -XX:-Inline flag (or a corresponding directive) was specified?

If it means inlining is switched off, then I think it should be "don't inline".
And what does "compile commands may apply" mean?

 > I updated the print output to mark all options in the directive that are
 > not default with a '*'. That makes it quicker to see if any special

Yes, it is better, but I still do not get this. I see that the command line 
has the PrintInlining command and it is in the list: *PrintInlining:true.
But I don't see PrintCompilation in the list, although it is specified on 
the command line. On the other hand, PrintIntrinsics:false is there.

 > It only prints the directive that is used for the current compile task
 > (that caused the crash). (Thats why I put them together in the hs_err file)

What do you mean "is used"?

"Print *which* directive (and options) were in use if compiler crash.
  Print *if* directives were used at some point if other crash?"

Should we replace "in use"/"were used" with "were set"?

Thanks,
Vladimir
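
To make the '*' convention under discussion concrete, here is a minimal Java sketch. It is an illustration only and not the printing code in the webrev; DirectivePrintSketch and format are hypothetical names, and the two option maps are made-up inputs. Each option is printed as Name:value, with a leading '*' when the value differs from its default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DirectivePrintSketch {
    // Formats options as "Name:value" separated by spaces, prefixing every
    // option whose value differs from its default with '*'.
    static String format(Map<String, String> defaults, Map<String, String> current) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> option : current.entrySet()) {
            if (out.length() > 0) {
                out.append(' ');
            }
            boolean changed = !option.getValue().equals(defaults.get(option.getKey()));
            if (changed) {
                out.append('*');
            }
            out.append(option.getKey()).append(':').append(option.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is stable.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("PrintInlining", "false");
        defaults.put("PrintIntrinsics", "false");

        Map<String, String> current = new LinkedHashMap<>(defaults);
        current.put("PrintInlining", "true");

        System.out.println(format(defaults, current));
    }
}
```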

On 1/22/16 5:38 AM, Nils Eliasson wrote:
> Hi, Vladimir
>
> On 2016-01-21 20:28, Vladimir Kozlov wrote:
>> Passing directives through ciEnv is fine.
>> My question is about output in hs_err file. How those directives were
>> selected in your example?
>
> It only prints the directive that is used for the current compile task
> (that caused the crash). (Thats why I put them together in the hs_err file)
>
>> I found it strange to see mixed flags values and oracle commands.
>> "Enable:true Exclude:false" - which these correspond to, for example?
>
> These are all options from the directive - and they are set with
> directives (highest priority), compilecommand or vmflags (lowest
> priority).
>
>>
>> Should we not print directives/flags which are not set explicitly?
>
> I updated the print output to mark all options in the directive that are
> not default with a '*'. That makes it quicker to see if any special
> options were applied. It will also print if the directive is the
> unmodified default directive.
>
> Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.03/
> Example output:
> http://cr.openjdk.java.net/~neliasso/8138756/webrev.03/hserr.txt
>
> Regards,
> Nils
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/21/16 2:31 AM, Nils Eliasson wrote:
>>> This is how it looks:
>>>
>>> [...]
>>>
>>> ---------------  T H R E A D  ---------------
>>>
>>> Current thread (0x00007f071046a000):  JavaThread "C1
>>> CompilerThread10" daemon [_thread_in_native, id=20033,
>>> stack(0x00007f05d7afb000,0x00007f05d7bfc000)]
>>>
>>> Current CompileTask:
>>> C1:    225    1       3       java.lang.String::isLatin1 (19 bytes)
>>>
>>> Current compiler directive:
>>>    inline: -
>>>    Enable:true Exclude:false BreakAtExecute:false
>>> BreakAtCompile:false Log:false PrintAssembly:false
>>> PrintInlining:false PrintNMethods:false ReplayInline:false
>>> DumpReplay:false DumpInline:false
>>> CompilerDirectivesIgnoreCompileCommands:false DisableIntrinsic:
>>> BlockLayoutByFrequency:true PrintOptoAssembly:false
>>> PrintIntrinsics:false TraceOptoPipelining:false TraceOptoOutput:false
>>> TraceSpilling:false Vectorize:false VectorizeDebug:false
>>> CloneMapDebug:false DoReserveCopyInSuperWordDebug:false
>>> IGVPrintLevel:0 MaxNodeLimit:80000
>>>
>>> Stack: [0x00007f05d7afb000,0x00007f05d7bfc000],
>>> sp=0x00007f05d7bfa5d0,  free space=1021k
>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>> C=native code)
>>> V  [libjvm.so+0x12e7532]  VMError::report_and_die(int, char const*,
>>> char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*,
>>> char const*, int, unsigned long)+0x182
>>> V  [libjvm.so+0x12e829a]  VMError::report_and_die(Thread*, char
>>> const*, int, char const*, char const*, __va_list_tag*)+0x4a
>>> V  [libjvm.so+0x908cca]  report_vm_error(char const*, int, char
>>> const*, char const*, ...)+0xea
>>> V  [libjvm.so+0x88df81] CompileBroker::post_compile(CompilerThread*,
>>> CompileTask*, EventCompilation&, bool, ciEnv*)+0x1b1
>>> V  [libjvm.so+0x88ec5a]
>>> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x90a
>>> V  [libjvm.so+0x88f960] CompileBroker::compiler_thread_loop()+0x540
>>> V  [libjvm.so+0x1264789]  JavaThread::thread_main_inner()+0x1c9
>>> V  [libjvm.so+0x1264ac6]  JavaThread::run()+0x2a6
>>> V  [libjvm.so+0x10189aa]  java_start(Thread*)+0xca
>>> C  [libpthread.so.0+0x8182]  start_thread+0xc2
>>>
>>> [...]
>>>
>>> http://cr.openjdk.java.net/~neliasso/8138756/hserr.txt
>>>
>>> Regards,
>>> Nils
>>>
>>> On 2016-01-21 11:25, Nils Eliasson wrote:
>>>> Hi,
>>>>
>>>> Please review this small change. The diff looks big but most of the
>>>> change is just changing how the directive are
>>>> passed to the compilers. Directives are set in the ciEnv and then
>>>> passed to the compilers. The compilers can then
>>>> choose to add it to any internal compilation object for convenience.
>>>> The hs_err printing routine in vmError.cpp loads
>>>> the directive from the ciEnv.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8138756
>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8138756/webrev.01/
>>>>
>>>> Regards,
>>>> Nils
>>>
>

From tom.rodriguez at oracle.com  Fri Jan 22 19:17:25 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 22 Jan 2016 11:17:25 -0800
Subject: RFR(S): 8146424: runtime/ReservedStack/ReservedStackTest.java
	triggers: assert(thread->deopt_mark() == __null) failed: no
	stack overflow from deopt blob/uncommon trap
Message-ID: <593E3AC7-E839-4969-ACDB-74B934DC3F14@oracle.com>

http://cr.openjdk.java.net/~never/8146424/webrev/index.html

JVMCI needs to provide access to Interpreter::size_activation so that JVMCI compilers can properly bang stacks based on their deoptimization requirements.  This simply adds a new entry point the compiler can use to compute the required size.  It also exposes HotSpotVMConfig.vm_page_size instead of requiring the compiler to rely on Unsafe.pageSize, which has an unspecified relationship to that value.

Tested with Graal and the jtreg stack banging tests.  I was unable to reproduce the exact reported failure locally though I confirmed that more stack banging was being done in the required places.

tom

From vladimir.kozlov at oracle.com  Fri Jan 22 19:23:47 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 22 Jan 2016 11:23:47 -0800
Subject: RFR(S): 8063112: Compiler diagnostic commands should have locking
	instead of safepoint
In-Reply-To: <56A23F61.9000201@oracle.com>
References: <56A23F61.9000201@oracle.com>
Message-ID: <56A281C3.6010408@oracle.com>

Why do you need a new print method? Why can't you use the existing print()?
Also, I would prefer to have the current compilation tasks printed on 
separate lines, not in the list of threads. Then you would not need the 
new print method, would you?

I am worried about using locks for printing because the print code also 
takes locks. Do we really have to have locks here? The output for these 
directives is a local bufferedStream. As I understand it, it is separate 
for each directive. So why do you need a lock? Or a VM operation, as before?

Thanks,
Vladimir

On 1/22/16 6:40 AM, Nils Eliasson wrote:
> Hi,
>
> Please review.
>
> Summary:
> Firstly this change removes the unnecessary vm-ops from three compiler
> diagnostic commands and adds locking instead.
> Secondly the Compiler.queue diagnostic command is improved with printing
> of any active compilations. I found this useful when diagnosing a rogue VM.
> Thirdly, as a bonus, I also add printing of active compilations in the
> thread section of the hs_err file. Very useful when investigating VMs
> terminated by a timeout.
>
> Testing:
> This does not pass all tests yet. A few tests are dependent on the output
> from the diagnostic command, and I want to be sure the reviewers are
> happy with the output format first.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8063112
> Webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.02/
>
> Regards,
> Nils
>

From tom.rodriguez at oracle.com  Fri Jan 22 22:23:05 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 22 Jan 2016 14:23:05 -0800
Subject: RFR(S): 8146424: runtime/ReservedStack/ReservedStackTest.java
	triggers: assert(thread->deopt_mark() == __null) failed: no
	stack overflow from deopt blob/uncommon trap
In-Reply-To: <593E3AC7-E839-4969-ACDB-74B934DC3F14@oracle.com>
References: <593E3AC7-E839-4969-ACDB-74B934DC3F14@oracle.com>
Message-ID: <08AB95F3-5D22-4053-A870-5A6A9E59D5BD@oracle.com>

I added a regression test and generated a new webrev http://cr.openjdk.java.net/~never/8146424.01/webrev/index.html

tom

> On Jan 22, 2016, at 11:17 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~never/8146424/webrev/index.html
> 
> JVMCI needs to provide access to Interpreter::size_activation so that JVMCI compilers can properly bang stacks based on their deoptimization requirements.  This simply adds a new entry point the compiler can use to compute the required size.  It also exposes HotSpotVMConfig.vm_page_size instead of requiring the compiler to rely on Unsafe.pageSize, which has an unspecified relationship to that value.
> 
> Tested with Graal and the jtreg stack banging tests.  I was unable to reproduce the exact reported failure locally though I confirmed that more stack banging was being done in the required places.
> 
> tom


From tom.rodriguez at oracle.com  Fri Jan 22 22:25:28 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 22 Jan 2016 14:25:28 -0800
Subject: RFR(S): 8148101: [JVMCI] Make CallingConvention.Type extensible
Message-ID: <165DD75E-8A1E-4C0D-991E-302964E2C0EF@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8148101


CallingConvention.Type currently fixes the set of types for all possible backends. It should be abstracted so that it can be more easily extended.  The unused stackOnly parameter was removed at the same time.

tom
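
One common way to make such a set of types extensible is to replace a closed enum with an interface that each backend can implement. The sketch below illustrates that general idea only; it is not the code in the webrev, and CallingConventionType, CommonType, BackendType and describe are hypothetical names.

```java
final class ExtensibleTypeSketch {
    // Stand-in for an extensible calling-convention type.
    interface CallingConventionType {
        String name();
    }

    // Types shared by every backend. An enum already provides name(), so it
    // satisfies the interface without extra code.
    enum CommonType implements CallingConventionType {
        JAVA_CALL, NATIVE_CALL
    }

    // A backend adds its own types without touching the shared enum.
    enum BackendType implements CallingConventionType {
        BACKEND_STUB_CALL
    }

    // Code written against the interface accepts types from any backend.
    static String describe(CallingConventionType type) {
        return "calling convention: " + type.name();
    }

    public static void main(String[] args) {
        System.out.println(describe(CommonType.JAVA_CALL));
        System.out.println(describe(BackendType.BACKEND_STUB_CALL));
    }
}
```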

From christian.thalinger at oracle.com  Fri Jan 22 22:41:06 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 22 Jan 2016 12:41:06 -1000
Subject: RFR(S): 8146424: runtime/ReservedStack/ReservedStackTest.java
	triggers: assert(thread->deopt_mark() == __null) failed: no
	stack overflow from deopt blob/uncommon trap
In-Reply-To: <08AB95F3-5D22-4053-A870-5A6A9E59D5BD@oracle.com>
References: <593E3AC7-E839-4969-ACDB-74B934DC3F14@oracle.com>
	<08AB95F3-5D22-4053-A870-5A6A9E59D5BD@oracle.com>
Message-ID: <E30DFC14-2D3A-459C-A476-A614E3290580@oracle.com>

Looks good.

> On Jan 22, 2016, at 12:23 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> I added a regression test and generated a new webrev http://cr.openjdk.java.net/~never/8146424.01/webrev/index.html
> 
> tom
> 
>> On Jan 22, 2016, at 11:17 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> http://cr.openjdk.java.net/~never/8146424/webrev/index.html
>> 
>> JVMCI needs to provide access to Interpreter::size_activation so that JVMCI compilers can properly bang stacks based on their deoptimization requirements.  This simply adds a new entry point the compiler can use to compute the required size.  It also exposes HotSpotVMConfig.vm_page_size instead of requiring the compiler to rely on Unsafe.pageSize, which has an unspecified relationship to that value.
>> 
>> Tested with Graal and the jtreg stack banging tests.  I was unable to reproduce the exact reported failure locally though I confirmed that more stack banging was being done in the required places.
>> 
>> tom
> 


From christian.thalinger at oracle.com  Fri Jan 22 23:00:52 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Fri, 22 Jan 2016 13:00:52 -1000
Subject: RFR(S): 8148101: [JVMCI] Make CallingConvention.Type extensible
In-Reply-To: <165DD75E-8A1E-4C0D-991E-302964E2C0EF@oracle.com>
References: <165DD75E-8A1E-4C0D-991E-302964E2C0EF@oracle.com>
Message-ID: <31EB408D-FAF7-44CF-BBF3-45CA09960FED@oracle.com>

http://cr.openjdk.java.net/~never/8148101/webrev/

Looks good.

> On Jan 22, 2016, at 12:25 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> https://bugs.openjdk.java.net/browse/JDK-8148101
> 
> 
> CallingConvention.Type currently fixes the set of types for all possible backends. It should be abstracted so that it can be more easily extended.  The unused stackOnly parameter was removed at the same time.
> 
> tom


From tobias.hartmann at oracle.com  Mon Jan 25 07:09:48 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 25 Jan 2016 08:09:48 +0100
Subject: FW: RFR(S): 6378256: Performance problem with
	System.identityHashCode in client compiler
In-Reply-To: <d66ae4c8-530a-407a-ac89-6484150c4277@default>
References: <d66ae4c8-530a-407a-ac89-6484150c4277@default>
Message-ID: <56A5CA3C.6070401@oracle.com>

Hi Rahul,

On 22.01.2016 17:11, Rahul Raghavan wrote:
> 
>> -----Original Message-----
>> From: Tobias Hartmann > Sent: Monday, January 11, 2016 2:56 PM > To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
>>
>> Hi Rahul,
>>
>>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
>>
>> Why don't you use 'markOopDesc::hash_mask_in_place' for the 64 bit version? This should save some instructions and you also don't
>> need the 'hash' register if you compute everything in 'result'.
> 
> Thank you for your comments Tobias.
> 
> I could not get the implementation to work with 'markOopDesc::hash_mask_in_place' on x86_64 (as is done for x86_32).
> Using -    __ andptr(result, markOopDesc::hash_mask_in_place);
> results in the build error 'overflow in implicit constant conversion'.
> 
> I then understood from 'sharedRuntime_sparc.cpp' and 'markOop.hpp' that 'hash_mask_in_place' should be avoided on 64-bit because the values are too big!
> There are similar comments in LibraryCallKit::inline_native_hashcode [hotspot/src/share/vm/opto/library_call.cpp].
> I could not find another way to use hash_mask_in_place here for x86_64.

You are right, I missed that.
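
The overflow can be reproduced with plain arithmetic. The sketch below is Java, not HotSpot code; the constants HASH_SHIFT = 8 and HASH_BITS = 31 are assumptions based on the 64-bit mark word layout described in markOop.hpp and should be checked against the actual header. It shows that the in-place mask does not fit a sign-extended 32-bit immediate, while shifting first and then applying the plain mask stays within 32 bits and yields the same hash.

```java
final class MarkWordHashSketch {
    // Assumed 64-bit layout values; verify against markOop.hpp.
    static final int HASH_SHIFT = 8;
    static final int HASH_BITS = 31;
    static final long HASH_MASK = (1L << HASH_BITS) - 1;
    static final long HASH_MASK_IN_PLACE = HASH_MASK << HASH_SHIFT;

    // Shift first, then mask: the mask fits in a 32-bit immediate.
    static int hashShiftThenMask(long mark) {
        return (int) ((mark >>> HASH_SHIFT) & HASH_MASK);
    }

    // Mask in place, then shift: same result, but the mask is wider than 32 bits.
    static int hashMaskInPlace(long mark) {
        return (int) ((mark & HASH_MASK_IN_PLACE) >>> HASH_SHIFT);
    }

    // True when the value survives a round trip through a signed 32-bit int.
    static boolean fitsIn32Bits(long value) {
        return value == (int) value;
    }

    public static void main(String[] args) {
        long mark = (0x12345678L << HASH_SHIFT) | 0x01; // made-up header value
        System.out.println("hash (shift, mask): " + Integer.toHexString(hashShiftThenMask(mark)));
        System.out.println("hash (mask in place): " + Integer.toHexString(hashMaskInPlace(mark)));
        System.out.println("hash_mask fits: " + fitsIn32Bits(HASH_MASK));
        System.out.println("hash_mask_in_place fits: " + fitsIn32Bits(HASH_MASK_IN_PLACE));
    }
}
```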

> So the code depends on markOopDesc::hash_mask and markOopDesc::hash_shift instead (as is done in sharedRuntime_sparc).
> Added the missing comment about this in the revised webrev.
> 
> Also, yes, I missed the optimized codegen.
> The revised patch removes the extra 'hash' and 'mask' registers and computes everything in 'result' itself.
> 
> [sharedRuntime_x86_64.cpp]
>      ....................
> +    Register obj_reg = j_rarg0;
> +    Register result = rax;
>      ........
> +    // get hash
> +    // Read the header and build a mask to get its hash field.
> +    // Depend on hash_mask being at most 32 bits and avoid the use of hash_mask_in_place
> +    // because it could be larger than 32 bits in a 64-bit vm. See markOop.hpp.
> +    __ shrptr(result, markOopDesc::hash_shift);
> +    __ andptr(result, markOopDesc::hash_mask);
> +    // test if hashCode exists
> +    __ jcc  (Assembler::zero, slowCase);
> +    __ ret(0);
> +    __ bind (slowCase);
>       ........
> 
> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> 
> Please send your comments. I can submit revised webrev if all okay.

Looks good. Please send a new webrev.

Best,
Tobias

> 
>>
>> Best,
>> Tobias
>>
>>
>> On 08.01.2016 18:13, Rahul Raghavan wrote:
>>> Hello,
>>>
>>> Please review the following revised patch for JDK-6378256 -
>>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
>>>
>>> This revised webrev got following changes -
>>>
>>>  1) A minor, better optimized code with return 0 at initial stage (instead of continuing to 'slowCase' path), for special/rare null reference input!
>>>    (as per documentation, test results confirmed it is safe to 'return 0' for null reference input, for System.identityHashCode)
>>>
>>>  2) Added similar Object.hashCode, System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.
>>>
>>> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>>>> -----Original Message-----
>>>> From: Roland Westrelin > Sent: Wednesday, December 09, 2015 8:03 PM > To: Rahul Raghavan> Cc: hotspot-compiler-dev at openjdk.java.net
>>>>
>>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
>>>>
>>>> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be nice.
>>>> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
>>>>
>>>> Roland.
>>>
>>>
>>>> -----Original Message-----
>>>> From: Rahul Raghavan > Sent: Wednesday, December 09, 2015 2:43 PM > To: hotspot-compiler-dev at openjdk.java.net
>>>>
>>>> Hello,
>>>>
>>>> Please review the following patch for JDK-6378256.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
>>>> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times
>>>> slower).
>>>> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
>>>>
>>>> sample unit test:
>>>>    public class Jdk6378256Test
>>>>    {
>>>>       public static void main(String[] args)
>>>>       {
>>>>          Object obj = new Object();
>>>>          long time = System.nanoTime();
>>>>          for(int i = 0 ; i < 1000000 ; i++)
>>>>             System.identityHashCode(obj);  //compare to obj.hashCode();
>>>>          System.out.println ("Result = " + (System.nanoTime() - time));
>>>>       }
>>>>    }
>>>>
>>>> Fix: Enabled the C1 optimization which was done only for Object.hashCode, now for System.identityHashCode() also.
>>>> (looks in the header for the hashCode before calling into the VM).
>>>> Unlike for Object.hashCode, System.identityHashCode is static method and gets object as argument instead of the receiver.
>>>> So also added required additional null check for System.identityHashCode case.
>>>>
>>>> Testing:
>>>>    - successful JPRT run (-testset hotspot).
>>>>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
>>>>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
>>>>    - Added 'noreg-perf' label for this performance bug.
>>>>       Manual testing done and confirmed expected performance values for unit tests with fix.
>>>>
>>>> Thanks,
>>>> Rahul

From tobias.hartmann at oracle.com  Mon Jan 25 09:48:28 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 25 Jan 2016 10:48:28 +0100
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes outside
	dominated array
Message-ID: <56A5EF6C.4090603@oracle.com>

Hi,

please review the following patch:

https://bugs.openjdk.java.net/browse/JDK-8147876
http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/

ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().

Thanks,
Tobias

From thomas.stuefe at gmail.com  Mon Jan 25 11:02:53 2016
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 25 Jan 2016 12:02:53 +0100
Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
In-Reply-To: <4295855A5C1DE049A61835A1887419CC41F16A11@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC41F1663A@DEWDFEMB12A.global.corp.sap>
	<CAA-vtUzWnn9kLj24PAurWsTrp1PpwKmYgcs7SR9rEtdg=MTMvw@mail.gmail.com>
	<4295855A5C1DE049A61835A1887419CC41F16A11@DEWDFEMB12A.global.corp.sap>
Message-ID: <CAA-vtUxw8DO0qHZiR3_uF4dD1TJVRGxKScUjhdtq0JH_9f47dg@mail.gmail.com>

Ok Goetz! Looks all fine to me.

On Fri, Jan 22, 2016 at 8:41 AM, Lindenmaier, Goetz <
goetz.lindenmaier at sap.com> wrote:

> Hi Thomas,
>
>
>
> I only want to do syntactic changes to our copyright message. I don't
>
> want to change any content of them.
>
>
>
> So please let's leave this to another change.
>
>
>
> Thanks,
>
>   Goetz.
>
>
>
> *From:* Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> *Sent:* Thursday, January 21, 2016 4:48 PM
> *To:* Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
> *Cc:* hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> *Subject:* Re: RFR(M): 8147937: Adapt SAP copyrights to new company name.
>
>
>
> Hi Goetz,
>
>
>
>
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/src/os/aix/vm/libodm_aix.cpp.frames.html
>
>
>
> Please remove Oracle copyright, this is SAP only.
>
>
>
>
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/src/os/aix/vm/libodm_aix.hpp.frames.html
>
>
>
> ditto.
>
>
>
> Otherwise looks fine.
>
>
>
> ...
>
> Thomas
>
>
>
> On Thu, Jan 21, 2016 at 3:53 PM, Lindenmaier, Goetz <
> goetz.lindenmaier at sap.com> wrote:
>
> Hi,
>
>
>
> SAP changed its name from SAP AG to SAP SE. We were asked to
>
> adapt our copyright messages accordingly.
>
>
>
> This change fixes all SAP copyrights in hotspot to follow the patterns
>
> "Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
>
> "Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All
> rights reserved."
>
>
>
> Please review this change. I also need a sponsor, please.
>
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
>
>
>
> Best regards,
>
>   Goetz.
>
>
>
>
>

From goetz.lindenmaier at sap.com  Mon Jan 25 11:26:34 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 25 Jan 2016 11:26:34 +0000
Subject: sponsor?: RFR(M): 8147937: Adapt SAP copyrights to new company name.
Message-ID: <4295855A5C1DE049A61835A1887419CC41F17026@DEWDFEMB12A.global.corp.sap>

Hi,

could somebody please sponsor this change?
I updated the webrev to apply to the latest hs-comp repo.
http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/

Thanks!
  Goetz.

> -----Original Message-----
> From: Lindenmaier, Goetz
> Sent: Donnerstag, 21. Januar 2016 15:53
> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
> 
> Hi,
>
> SAP changed its name from SAP AG to SAP SE. We were asked to
> adapt our copyright messages accordingly.
>
> This change fixes all SAP copyrights in hotspot to follow the patterns
> "Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
> "Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All rights
> reserved."
>
> Please review this change. I also need a sponsor, please.
>
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
>
> Best regards,
>   Goetz.
> 
> 
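A minimal sketch of how the copyright pattern quoted above could be checked mechanically. The class and method names are made up for illustration, and the year part is deliberately tighter than the quoted character classes (it only accepts 19xx and 20xx); this is an assumption, not part of the webrev.

```java
import java.util.regex.Pattern;

public class SapCopyrightCheck {
    // One year, or a "first, last" year pair, followed by the new company name.
    // Assumption: years are limited to 19xx/20xx, which is stricter than the
    // "[1,2][9,0][0-9][0-9]" classes quoted in the mail.
    private static final Pattern SAP_SE = Pattern.compile(
        "Copyright \\(c\\) (?:19|20)[0-9]{2}(?:, (?:19|20)[0-9]{2})? SAP SE\\. All rights reserved\\.");

    /** Returns true if the line contains a copyright notice in the new "SAP SE" format. */
    public static boolean isNewStyle(String line) {
        return SAP_SE.matcher(line).find();
    }

    public static void main(String[] args) {
        // New-style notice with a year pair.
        System.out.println(isNewStyle(" * Copyright (c) 2012, 2016 SAP SE. All rights reserved."));
        // Old company name, so it does not match.
        System.out.println(isNewStyle(" * Copyright (c) 2012, 2014 SAP AG. All rights reserved."));
    }
}
```

A check like this could be run over the changed files to spot notices that still use the old name.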


From zoltan.majo at oracle.com  Mon Jan 25 11:40:16 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Mon, 25 Jan 2016 12:40:16 +0100
Subject: sponsor?: RFR(M): 8147937: Adapt SAP copyrights to new company
	name.
In-Reply-To: <4295855A5C1DE049A61835A1887419CC41F17026@DEWDFEMB12A.global.corp.sap>
References: <4295855A5C1DE049A61835A1887419CC41F17026@DEWDFEMB12A.global.corp.sap>
Message-ID: <56A609A0.9060109@oracle.com>

Hi Goetz,


On 01/25/2016 12:26 PM, Lindenmaier, Goetz wrote:
> Hi,
>
> could somebody please sponsor this change?

I'll sponsor the change.

Thank you and best regards,


Zoltan

> I updated the webrev to apply to the latest hs-comp repo.
> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/
>
> Thanks!
>    Goetz.
>
>> -----Original Message-----
>> From: Lindenmaier, Goetz
>> Sent: Donnerstag, 21. Januar 2016 15:53
>> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
>> Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
>>
>> Hi,
>>
>> SAP changed its name from SAP AG to SAP SE. We were asked to
>> adapt our copyright messages accordingly.
>>
>> This change fixes all SAP copyrights in hotspot to follow the patterns
>> "Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
>> "Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All rights
>> reserved."
>>
>> Please review this change. I also need a sponsor, please.
>>
>> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
>>
>> Best regards,
>>    Goetz.
>>
>>


From goetz.lindenmaier at sap.com  Mon Jan 25 11:41:27 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 25 Jan 2016 11:41:27 +0000
Subject: sponsor?: RFR(M): 8147937: Adapt SAP copyrights to new company
	name.
In-Reply-To: <56A609A0.9060109@oracle.com>
References: <4295855A5C1DE049A61835A1887419CC41F17026@DEWDFEMB12A.global.corp.sap>
	<56A609A0.9060109@oracle.com>
Message-ID: <4295855A5C1DE049A61835A1887419CC41F1705E@DEWDFEMB12A.global.corp.sap>

That's great! Thanks!

Best regards,
  Goetz.

> -----Original Message-----
> From: Zoltán Majó [mailto:zoltan.majo at oracle.com]
> Sent: Montag, 25. Januar 2016 12:40
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot compiler
> <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: sponsor?: RFR(M): 8147937: Adapt SAP copyrights to new
> company name.
> 
> Hi Goetz,
> 
> 
> On 01/25/2016 12:26 PM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > could somebody please sponsor this change?
> 
> I'll sponsor the change.
> 
> Thank you and best regards,
> 
> 
> Zoltan
> 
> > I updated the webrev to apply to the latest hs-comp repo.
> > http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01/
> >
> > Thanks!
> >    Goetz.
> >
> >> -----Original Message-----
> >> From: Lindenmaier, Goetz
> >> Sent: Donnerstag, 21. Januar 2016 15:53
> >> To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
> >> Subject: RFR(M): 8147937: Adapt SAP copyrights to new company name.
> >>
> >> Hi,
> >>
> >> SAP changed its name from SAP AG to SAP SE. We were asked to
> >> adapt our copyright messages accordingly.
> >>
> >> This change fixes all SAP copyrights in hotspot to follow the patterns
> >> "Copyright (c) [1,2][9,0][0-9][0-9] SAP SE. All rights reserved." or
> >> "Copyright (c) [1,2][9,0][0-9][0-9], [1,2][9,0][0-9][0-9] SAP SE. All rights
> >> reserved."
> >>
> >> Please review this change. I also need a sponsor, please.
> >>
> >> http://cr.openjdk.java.net/~goetz/wr16/8147937-copyright/webrev.01
> >>
> >> Best regards,
> >>    Goetz.
> >>
> >>


From nils.eliasson at oracle.com  Mon Jan 25 14:12:11 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Mon, 25 Jan 2016 15:12:11 +0100
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes
	outside dominated array
In-Reply-To: <56A5EF6C.4090603@oracle.com>
References: <56A5EF6C.4090603@oracle.com>
Message-ID: <56A62D3B.6070805@oracle.com>

Looks good.

Best regards,
Nils
  (Not a reviewer)

On 2016-01-25 10:48, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
>
> https://bugs.openjdk.java.net/browse/JDK-8147876
> http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/
>
> ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().
>
> Thanks,
> Tobias
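
The sizing problem described above can be sketched as follows. This is an illustration only, with hypothetical names and counts: it assumes an array indexed by flow-block id but sized by the smaller method-block count. In C++ such a write silently corrupts memory, whereas Java throws, which is what the sketch uses to make the overflow visible.

```java
public class DominatedArraySizing {
    /**
     * Marks one entry per flow-block id. The array has to be sized by the
     * number of flow blocks; sizing it by the method-block count is too small
     * when one method block maps to several flow blocks (1:n).
     */
    static boolean[] markAll(int arraySize, int[] flowBlockIds) {
        boolean[] dominated = new boolean[arraySize];
        for (int id : flowBlockIds) {
            dominated[id] = true; // out of bounds if arraySize is too small
        }
        return dominated;
    }

    /** Returns true if marking the given ids would write past the end of the array. */
    public static boolean overflows(int arraySize, int[] flowBlockIds) {
        try {
            markAll(arraySize, flowBlockIds);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int methodBlockCount = 3;              // hypothetical number of method blocks
        int[] flowBlockIds = {0, 1, 2, 3, 4};  // hypothetical: more flow blocks than method blocks
        System.out.println(overflows(methodBlockCount, flowBlockIds));    // sized by the wrong count
        System.out.println(overflows(flowBlockIds.length, flowBlockIds)); // sized by the flow-block count
    }
}
```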


From tobias.hartmann at oracle.com  Mon Jan 25 14:27:40 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 25 Jan 2016 15:27:40 +0100
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes
	outside dominated array
In-Reply-To: <56A62D3B.6070805@oracle.com>
References: <56A5EF6C.4090603@oracle.com> <56A62D3B.6070805@oracle.com>
Message-ID: <56A630DC.9020406@oracle.com>

Thanks, Nils!

Best,
Tobias

On 25.01.2016 15:12, Nils Eliasson wrote:
> Looks good.
> 
> Best regards,
> Nils
>  (Not a reviewer)
> 
> On 2016-01-25 10:48, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8147876
>> http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/
>>
>> ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().
>>
>> Thanks,
>> Tobias
> 

From doug.simon at oracle.com  Mon Jan 25 16:14:20 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 25 Jan 2016 17:14:20 +0100
Subject: RFR: 8147470: update JVMCI mx extensions
Message-ID: <DB5579B7-C486-4896-95B9-6BD31CB04F55@oracle.com>

Please review these changes to the mx extensions for JVMCI to account for recent HotSpot and Graal changes.

https://bugs.openjdk.java.net/browse/JDK-8147470
http://cr.openjdk.java.net/~dnsimon/8147470/

-Doug

From aleksey.shipilev at oracle.com  Mon Jan 25 16:32:52 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Mon, 25 Jan 2016 19:32:52 +0300
Subject: RFR (M) 8148146: Integrate new internal Unsafe entry points, and
	basic intrinsic support for VarHandles
Message-ID: <56A64E34.8020305@oracle.com>

Hi,

I would like to solicit reviews for the slab of VM changes to support
JEP 193 (VarHandles). This portion covers new Unsafe methods.

Webrev:
 http://cr.openjdk.java.net/~shade/8148146/webrev.jdk.00/
 http://cr.openjdk.java.net/~shade/8148146/webrev.hs.00/

The patches "almost" pass JPRT, with some failures in closed code,
triggered by adding a large number of new intrinsics. Those failures are
to be addressed separately -- and because of that, this change is not
yet pushable. A preliminary review would be appreciated meanwhile.

A brief summary of changes:

 a) jdk.internal.misc.Unsafe has new methods. Since we now have split
s.m.Unsafe and j.i.m.Unsafe, this change "safely" extends the private
Unsafe, leaving the other one untouched.

 b) hotspot/test/compiler/unsafe tests are extended for newly added methods.

 c) unsafe.cpp gets the basic native method implementations. Most new
operations are folded to their volatile (the strongest) counterparts,
hoping that compilers would intrinsify them into more performant versions.

 d) C2 intrinsics for x86:

   * Most intrinsics code is covered by platform-independent
LibraryCallKit changes, which means non-x86 architectures are also
partially covered.

   * There are two classes of ops left for platform-dependent code:
WeakCAS and CompareAndExchange nodes. Both seem simple enough to do, but
there are details to be sorted out on each platform -- let's do those
separately.

   * Both LibraryCallKit::inline_unsafe_access and
LCK::inline_unsafe_load_store were modified to accept new access modes,
and generally brushed up to accept the changes.

   * putOrdered intrinsic methods are purged in favor of put*Release
operations. We still keep Unsafe.putOrdered for testability and
compatibility reasons.

Eyeballing the generated code on x86 yields no obvious problems. Sanity
microbenchmark runs do not show performance regressions on old methods,
and show the expected performance on new methods:
  http://cr.openjdk.java.net/~shade/8148146/notes.txt

Cheers,
-Aleksey
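
To illustrate the flavours of operation named in the summary (release stores, weak CAS, compare-and-exchange), here is a small sketch using the public java.util.concurrent.atomic.AtomicInteger API as it exists in JDK 9 and later. It does not use jdk.internal.misc.Unsafe, and the mail does not say how these public methods map onto the new entry points, so treat the correspondence as an assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AccessModesSketch {
    /** Increment via weak CAS; a weak CAS may fail spuriously, so it sits in a retry loop. */
    public static int weakIncrement(AtomicInteger counter) {
        int seen;
        do {
            seen = counter.getAcquire(); // acquire load
        } while (!counter.weakCompareAndSetVolatile(seen, seen + 1));
        return seen + 1;
    }

    /** compareAndExchange returns the value it witnessed rather than a boolean. */
    public static int exchange(AtomicInteger cell, int expected, int update) {
        return cell.compareAndExchange(expected, update);
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger();
        cell.setRelease(41);                        // release store
        System.out.println(exchange(cell, 41, 42)); // witnesses 41, so the swap happens
        System.out.println(exchange(cell, 41, 43)); // witnesses 42, so nothing changes
        System.out.println(weakIncrement(cell));    // single-threaded here, so 42 becomes 43
    }
}
```

The retry loop is the reason a weak CAS can be cheaper on some platforms: the caller already tolerates failure, so the operation need not loop internally.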

From igor.ignatyev at oracle.com  Mon Jan 25 16:47:14 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 25 Jan 2016 19:47:14 +0300
Subject: RFR(XXS) : 8148161 : quarantine
	compiler/loopopts/UseCountedLoopSafepoints.java
Message-ID: <EA18C91C-E45E-4213-AA98-9F0CEC6489FE@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8148161/webrev.00/
> 1 line changed: 1 ins; 0 del; 0 mod;

Hi all,

could you please review this tiny fix which quarantines the 'compiler/loopopts/UseCountedLoopSafepoints.java' test until 8146096 is fixed?

Thanks,
Igor

From vladimir.kozlov at oracle.com  Mon Jan 25 16:54:43 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 25 Jan 2016 08:54:43 -0800
Subject: RFR(XXS) : 8148161 : quarantine
	compiler/loopopts/UseCountedLoopSafepoints.java
In-Reply-To: <EA18C91C-E45E-4213-AA98-9F0CEC6489FE@oracle.com>
References: <EA18C91C-E45E-4213-AA98-9F0CEC6489FE@oracle.com>
Message-ID: <56A65353.3010200@oracle.com>

Good.

Thanks,
Vladimir

On 1/25/16 8:47 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8148161/webrev.00/
>> 1 line changed: 1 ins; 0 del; 0 mod;
>
> Hi all,
>
> could you please review this tiny fix which quarantines the 'compiler/loopopts/UseCountedLoopSafepoints.java' test until 8146096 is fixed?
>
> Thanks,
> Igor
>

From rahul.v.raghavan at oracle.com  Mon Jan 25 17:02:00 2016
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Mon, 25 Jan 2016 09:02:00 -0800 (PST)
Subject: RFR(S): 6378256: Performance problem with System.identityHashCode
	in client compiler
Message-ID: <e979523e-07f1-4ab6-b7fe-0c54268f55fc@default>

Hello,

With reference to below email thread, please send review comments for the revised patch for JDK-6378256.
http://cr.openjdk.java.net/~thartmann/6378256/webrev.02/

Thanks,
Rahul

> -----Original Message-----
> From: Tobias Hartmann
> Sent: Monday, January 25, 2016 12:40 PM
> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
> 
> Hi Rahul,
> 
> On 22.01.2016 17:11, Rahul Raghavan wrote:
> >
> >> -----Original Message-----
> >> From: Tobias Hartmann
> >> Sent: Monday, January 11, 2016 2:56 PM
> >> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
> >>
> >> Hi Rahul,
> >>
> >>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> >>
> >> Why don't you use 'markOopDesc::hash_mask_in_place' for the 64 bit version? This should save some instructions and you also don't
> >> need the 'hash' register if you compute everything in 'result'.
> >
> > Thank you for your comments Tobias.
> >
> > I could not get the implementation to work using 'markOopDesc::hash_mask_in_place' on x86_64 (as is done for x86_32).
> > Using
> >     __ andptr(result, markOopDesc::hash_mask_in_place);
> > results in the build error 'overflow in implicit constant conversion'.
> >
> > I then understood from 'sharedRuntime_sparc.cpp' and 'markOop.hpp' that 'hash_mask_in_place' should be avoided on 64-bit because the values are too big.
> > There are similar comments in LibraryCallKit::inline_native_hashcode [hotspot/src/share/vm/opto/library_call.cpp].
> > I could not find another way to use hash_mask_in_place here for x86_64.
> 
> You are right, I missed that.
> 
> > So the code now depends on markOopDesc::hash_mask and markOopDesc::hash_shift instead (as is done in sharedRuntime_sparc).
> > I added the missing comment about this in the revised webrev.
> >
> > Also, yes, I missed the optimized codegen.
> > I revised the patch to remove the extra 'hash' and 'mask' registers and compute everything in 'result' itself.
> >
> > [sharedRuntime_x86_64.cpp]
> >      ....................
> > +    Register obj_reg = j_rarg0;
> > +    Register result = rax;
> >      ........
> > +    // get hash
> > +    // Read the header and build a mask to get its hash field.
> > +    // Depend on hash_mask being at most 32 bits and avoid the use of hash_mask_in_place
> > +    // because it could be larger than 32 bits in a 64-bit vm. See markOop.hpp.
> > +    __ shrptr(result, markOopDesc::hash_shift);
> > +    __ andptr(result, markOopDesc::hash_mask);
> > +    // test if hashCode exists
> > +    __ jcc  (Assembler::zero, slowCase);
> > +    __ ret(0);
> > +    __ bind (slowCase);
> >       ........
> >
> > Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> >
> > Please send your comments. I can submit revised webrev if all okay.
> 
> Looks good. Please send a new webrev.
> 
> Best,
> Tobias
> 
> >
> >>
> >> Best,
> >> Tobias
> >>
> >>
> >> On 08.01.2016 18:13, Rahul Raghavan wrote:
> >>> Hello,
> >>>
> >>> Please review the following revised patch for JDK-6378256 -
> >>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> >>>
> >>> This revised webrev has the following changes:
> >>>
> >>>  1) Slightly better optimized code that returns 0 early (instead of continuing to the 'slowCase' path) for the rare case of a null reference input.
> >>>    (As per the documentation, and confirmed by test results, it is safe to return 0 for a null reference input to System.identityHashCode.)
> >>>
> >>>  2) Added similar Object.hashCode, System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.
> >>>
> >>> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> >>>
> >>> Thanks,
> >>> Rahul
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Roland Westrelin
> >>>> Sent: Wednesday, December 09, 2015 8:03 PM
> >>>> To: Rahul Raghavan
> >>>> Cc: hotspot-compiler-dev at openjdk.java.net
> >>>>
> >>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> >>>>
> >>>> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be nice.
> >>>> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
> >>>>
> >>>> Roland.
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Rahul Raghavan
> >>>> Sent: Wednesday, December 09, 2015 2:43 PM
> >>>> To: hotspot-compiler-dev at openjdk.java.net
> >>>>
> >>>> Hello,
> >>>>
> >>>> Please review the following patch for JDK-6378256.
> >>>>
> >>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> >>>>
> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
> >>>> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times slower).
> >>>> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
> >>>>
> >>>> sample unit test:
> >>>>    public class Jdk6378256Test
> >>>>    {
> >>>>       public static void main(String[] args)
> >>>>       {
> >>>>          Object obj = new Object();
> >>>>          long time = System.nanoTime();
> >>>>          for(int i = 0 ; i < 1000000 ; i++)
> >>>>             System.identityHashCode(obj);  //compare to obj.hashCode();
> >>>>          System.out.println ("Result = " + (System.nanoTime() - time));
> >>>>       }
> >>>>    }
> >>>>
> >>>> Fix: Enabled the C1 optimization which was done only for Object.hashCode, now for System.identityHashCode() also.
> >>>> (looks in the header for the hashCode before calling into the VM).
> >>>> Unlike for Object.hashCode, System.identityHashCode is static method and gets object as argument instead of the receiver.
> >>>> So also added required additional null check for System.identityHashCode case.
> >>>>
> >>>> Testing:
> >>>>    - successful JPRT run (-testset hotspot).
> >>>>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
> >>>>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
> >>>>    - Added 'noreg-perf' label for this performance bug.
> >>>>       Manual testing done and confirmed expected performance values for unit tests with fix.
> >>>>
> >>>> Thanks,
> >>>> Rahul
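
The shift-then-mask approach discussed in this thread can be sketched in plain Java. The constants below (a 31-bit hash field starting at bit 8) are assumed values for a 64-bit VM chosen for illustration; the authoritative layout is in markOop.hpp. The sketch shows why the unshifted mask fits a 32-bit immediate while the in-place mask does not.

```java
public class MarkWordHashSketch {
    // Assumed layout for a 64-bit VM (illustrative only; see markOop.hpp).
    static final int HASH_SHIFT = 8;
    static final long HASH_MASK = (1L << 31) - 1;
    static final long HASH_MASK_IN_PLACE = HASH_MASK << HASH_SHIFT;

    /** Shift first, then mask, so the mask constant stays within 32 bits. */
    public static long hashShiftThenMask(long mark) {
        return (mark >>> HASH_SHIFT) & HASH_MASK;
    }

    /** True if the value can be encoded as a sign-extended 32-bit immediate. */
    public static boolean fitsInImm32(long value) {
        return value == (int) value;
    }

    public static void main(String[] args) {
        long mark = (0x12345678L << HASH_SHIFT) | 0x1L; // hypothetical header with a hash installed
        System.out.println(Long.toHexString(hashShiftThenMask(mark)));
        System.out.println(fitsInImm32(HASH_MASK));          // usable as an immediate
        System.out.println(fitsInImm32(HASH_MASK_IN_PLACE)); // too wide for an immediate
        // A result of 0 means no hash has been installed yet, which is the slow path.
        System.out.println(hashShiftThenMask(0x1L) == 0);
    }
}
```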

From vladimir.x.ivanov at oracle.com  Mon Jan 25 17:02:56 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 25 Jan 2016 20:02:56 +0300
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes
	outside dominated array
In-Reply-To: <56A5EF6C.4090603@oracle.com>
References: <56A5EF6C.4090603@oracle.com>
Message-ID: <56A65540.6010105@oracle.com>

Looks good.

Best regards,
Vladimir Ivanov

On 1/25/16 12:48 PM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
>
> https://bugs.openjdk.java.net/browse/JDK-8147876
> http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/
>
> ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().
>
> Thanks,
> Tobias
>

From roland.westrelin at oracle.com  Mon Jan 25 17:05:54 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Mon, 25 Jan 2016 18:05:54 +0100
Subject: RFR(XS): 8147853: "assert(t->meet(t0) == t) failed: Not
	monotonic" with sun/util/calendar/zi/TestZoneInfo310.java
In-Reply-To: <56A276AE.8060408@oracle.com>
References: <DAC01147-E291-4310-AFDD-188087C7DF75@oracle.com>
	<56A276AE.8060408@oracle.com>
Message-ID: <8686820A-60A2-4CEC-AEAC-DD3A74369323@oracle.com>

Hi Vladimir,

Thanks for looking at this.

> Can we simply return the type of the phi's in(EntryControl) input in such a case (backedge is top), without filter_speculative() and the verification code under the assert, which I think is useless in this case? We already have a check for the counted loop; we only need to separate out the can_be_counted_loop() condition.

Does this look better?

http://cr.openjdk.java.net/~roland/8147853/webrev.01/

Roland.

> 
> Thanks,
> Vladimir
> 
> On 1/22/16 8:38 AM, Roland Westrelin wrote:
>> During CCP, a Phi for the induction variable of a CountedLoop is processed repeatedly while the type of the backedge control is top, so only the loop entry input is considered for computing the Phi's type.
>> 
>> The loop entry first has type int:1..3 so the Phi's type is int:1..3
>> then it has type int:1..4 so the Phi's type is int:1..4
>> then it has type int:1..5:www so the Phi's type is int:1..5:www
>> then it has type int:1..6:www so the Phi's type is saturated to int:1..max-1:www
>> 
>> The backedge control's type is changed to non-top and the type of the Phi is recomputed. This time the special code for counted loop in PhiNode::Value():
>> 
>>   CountedLoopNode* l = r->is_CountedLoop() ? r->as_CountedLoop() : NULL;
>>   if (l && l->can_be_counted_loop(phase) &&
>>       ((const Node*)l->phi() == this)) { // Trip counted loop!
>>     // protect against init_trip() or limit() returning NULL
>>     const Node *init   = l->init_trip();
>>     const Node *limit  = l->limit();
>>     const Node* stride = l->stride();
>>     if (init != NULL && limit != NULL && stride != NULL) {
>>       const TypeInt* lo = phase->type(init)->isa_int();
>>       const TypeInt* hi = phase->type(limit)->isa_int();
>>       const TypeInt* stride_t = phase->type(stride)->isa_int();
>>       if (lo != NULL && hi != NULL && stride_t != NULL) { // Dying loops might have TOP here
>>         assert(stride_t->_hi >= stride_t->_lo, "bad stride type");
>>         const Type* res = NULL;
>>         if (stride_t->_hi < 0) {          // Down-counter loop
>>           swap(lo, hi);
>>           return TypeInt::make(MIN2(lo->_lo, hi->_lo) , hi->_hi, 3);
>>         } else if (stride_t->_lo >= 0) {
>>           return TypeInt::make(lo->_lo, MAX2(lo->_hi, hi->_hi), 3);
>>         }
>>       }
>>     }
>>   }
>> 
>> 
>> kicks in and it computes a type of: int:1..8:www. The type of the Phi was narrowed and the assert fires.
>> 
>> I suggest we fix this by saturating the type of the Phi only once the type of the loop's backedge is non top. This way, the special code for counted loop above has a chance to run and that should be enough to keep the types during CCP monotonic.
>> 
>> http://cr.openjdk.java.net/~roland/8147853/webrev.00/
>> 
>> Roland.
>> 
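The monotonicity condition behind the failing assert can be sketched with a much simplified model. Here a type is just an int range {lo, hi} and meet is the smallest range covering both operands; the widening bits (":www") and everything else in the real Type lattice are ignored, so this is an illustration under those assumptions rather than the C2 implementation.

```java
public class CcpMonotonicitySketch {
    /** Meet of two int ranges modelled as {lo, hi}: the smallest range covering both. */
    public static int[] meet(int[] a, int[] b) {
        return new int[] { Math.min(a[0], b[0]), Math.max(a[1], b[1]) };
    }

    /** CCP expects types only to widen: meeting the new type with the old one must give the new type. */
    public static boolean isMonotonic(int[] oldType, int[] newType) {
        int[] m = meet(newType, oldType);
        return m[0] == newType[0] && m[1] == newType[1];
    }

    public static void main(String[] args) {
        int[] saturated = {1, Integer.MAX_VALUE - 1}; // int:1..max-1 after saturation
        int[] counted = {1, 8};                       // int:1..8 from the counted-loop code

        // Widening steps from the mail are fine.
        System.out.println(isMonotonic(new int[] {1, 3}, new int[] {1, 4}));
        System.out.println(isMonotonic(new int[] {1, 5}, saturated));
        // Going from the saturated range back to 1..8 narrows the type.
        System.out.println(isMonotonic(saturated, counted));
    }
}
```

In this model the last step is the one that corresponds to the "Not monotonic" assert: the new range no longer covers the old one.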


From vladimir.kozlov at oracle.com  Mon Jan 25 17:12:06 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 25 Jan 2016 09:12:06 -0800
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes
	outside dominated array
In-Reply-To: <56A5EF6C.4090603@oracle.com>
References: <56A5EF6C.4090603@oracle.com>
Message-ID: <56A65766.6030307@oracle.com>

Looks good.

Note to all: when you find a problem in recent changes, please add a link
in the bug report to the original changes (JDK-8140574 in this case). In
the future it will help, for example, if we want to backport the original changes.

Thanks,
Vladimir

On 1/25/16 1:48 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
>
> https://bugs.openjdk.java.net/browse/JDK-8147876
> http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/
>
> ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().
>
> Thanks,
> Tobias
>

From igor.ignatyev at oracle.com  Mon Jan 25 17:17:29 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 25 Jan 2016 20:17:29 +0300
Subject: RFR(S) : 8148136 : compile control tests have incorrect @build
	directives
Message-ID: <AB249193-D840-40C1-B994-D7836EF292BF@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8148136/webrev.00/
> 49 lines changed: 24 ins; 0 del; 25 mod

Hi all,

could you please review the patch which fixes build directives for compile control tests?

Compile control tests intermittently fail in concurrent jtreg execution w/ NoClassDefFoundError or ClassNotFoundException for different classes. The tests have @build directives which refer to class names w/o a package, but they should refer to the FQN. As a result, there are no @build/compile actions which guarantee that all needed classes are compiled. The patch replaces each class name w/ its fully qualified name.

testing: run all hotspot/test/compiler
jbs: https://bugs.openjdk.java.net/browse/JDK-8148136

Thanks,
-- Igor

From vladimir.kozlov at oracle.com  Mon Jan 25 17:52:33 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 25 Jan 2016 09:52:33 -0800
Subject: RFR(XS): 8147853: "assert(t->meet(t0) == t) failed: Not
	monotonic" with sun/util/calendar/zi/TestZoneInfo310.java
In-Reply-To: <8686820A-60A2-4CEC-AEAC-DD3A74369323@oracle.com>
References: <DAC01147-E291-4310-AFDD-188087C7DF75@oracle.com>
	<56A276AE.8060408@oracle.com>
	<8686820A-60A2-4CEC-AEAC-DD3A74369323@oracle.com>
Message-ID: <56A660E1.5020300@oracle.com>

Yes! Thank you for making changes.

Vladimir

On 1/25/16 9:05 AM, Roland Westrelin wrote:
> Hi Vladimir,
>
> Thanks for looking at this.
>
>> Can we simply return the type of the phi's in(EntryControl) input in such a case (backedge is top), without filter_speculative() and the verification code under the assert, which I think is useless in this case? We already have a check for the counted loop; we only need to separate out the can_be_counted_loop() condition.
>
> Does this look better?
>
> http://cr.openjdk.java.net/~roland/8147853/webrev.01/
>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/22/16 8:38 AM, Roland Westrelin wrote:
>>> During CCP, a Phi for the induction variable of a CountedLoop is processed repeatedly while the type of the backedge control is top, so only the loop entry input is considered for computing the Phi's type.
>>>
>>> The loop entry first has type int:1..3 so the Phi's type is int:1..3
>>> then it has type int:1..4 so the Phi's type is int:1..4
>>> then it has type int:1..5:www so the Phi's type is int:1..5:www
>>> then it has type int:1..6:www so the Phi's type is saturated to int:1..max-1:www
>>>
>>> The backedge control's type is changed to non-top and the type of the Phi is recomputed. This time the special code for counted loop in PhiNode::Value():
>>>
>>>    CountedLoopNode* l = r->is_CountedLoop() ? r->as_CountedLoop() : NULL;
>>>    if (l && l->can_be_counted_loop(phase) &&
>>>        ((const Node*)l->phi() == this)) { // Trip counted loop!
>>>      // protect against init_trip() or limit() returning NULL
>>>      const Node *init   = l->init_trip();
>>>      const Node *limit  = l->limit();
>>>      const Node* stride = l->stride();
>>>      if (init != NULL && limit != NULL && stride != NULL) {
>>>        const TypeInt* lo = phase->type(init)->isa_int();
>>>        const TypeInt* hi = phase->type(limit)->isa_int();
>>>        const TypeInt* stride_t = phase->type(stride)->isa_int();
>>>        if (lo != NULL && hi != NULL && stride_t != NULL) { // Dying loops might have TOP here
>>>          assert(stride_t->_hi >= stride_t->_lo, "bad stride type");
>>>          const Type* res = NULL;
>>>          if (stride_t->_hi < 0) {          // Down-counter loop
>>>            swap(lo, hi);
>>>            return TypeInt::make(MIN2(lo->_lo, hi->_lo) , hi->_hi, 3);
>>>          } else if (stride_t->_lo >= 0) {
>>>            return TypeInt::make(lo->_lo, MAX2(lo->_hi, hi->_hi), 3);
>>>          }
>>>        }
>>>      }
>>>    }
>>>
>>>
>>> kicks in and it computes a type of: int:1..8:www. The type of the Phi was narrowed and the assert fires.
>>>
>>> I suggest we fix this by saturating the type of the Phi only once the type of the loop's backedge is non top. This way, the special code for counted loop above has a chance to run and that should be enough to keep the types during CCP monotonic.
>>>
>>> http://cr.openjdk.java.net/~roland/8147853/webrev.00/
>>>
>>> Roland.
>>>
>

From vladimir.kozlov at oracle.com  Mon Jan 25 17:53:49 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 25 Jan 2016 09:53:49 -0800
Subject: RFR(S) : 8148136 : compile control tests have incorrect @build
	directives
In-Reply-To: <AB249193-D840-40C1-B994-D7836EF292BF@oracle.com>
References: <AB249193-D840-40C1-B994-D7836EF292BF@oracle.com>
Message-ID: <56A6612D.8010706@oracle.com>

Good.

Thanks,
Vladimir

On 1/25/16 9:17 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8148136/webrev.00/
>> 49 lines changed: 24 ins; 0 del; 25 mod
>
> Hi all,
>
> could you please review the patch which fixes build directives for compile control tests?
>
> Compile control tests intermittently fail in concurrent jtreg execution w/ NoClassDefFoundError or ClassNotFoundException for different classes. The tests have @build directives which refer to class names w/o a package, but they should refer to the FQN. As a result, there are no @build/compile actions which guarantee that all needed classes are compiled. The patch replaces each class name w/ its fully qualified name.
>
> testing: run all hotspot/test/compiler
> jbs: https://bugs.openjdk.java.net/browse/JDK-8148136
>
> Thanks,
> -- Igor
>

From dmitry.fazunenko at oracle.com  Mon Jan 25 20:07:54 2016
From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko)
Date: Mon, 25 Jan 2016 23:07:54 +0300
Subject: RFR(M) 8147461: Use byte offsets for vtable start and vtable
	length offsets
In-Reply-To: <56A1FA78.3090608@oracle.com>
References: <569926B9.4070806@oracle.com> <569F7E22.3090905@oracle.com>
	<56A04DCF.9090204@oracle.com> <56A1FA78.3090608@oracle.com>
Message-ID: <56A6809A.40104@oracle.com>

Hi Igor,

The GC part of change looks good.

Thanks,
Dima

On 22.01.2016 12:46, Mikael Gerdin wrote:
> Hi Chris,
>
> On 2016-01-21 04:17, Chris Plummer wrote:
>> Hi Mikael,
>>
>> The changes look good except I think you should get someone from the
>> compiler team to make sure the change in
>> HotSpotResolvedJavaMethodImpl.java and HotSpotVMConfig.java are ok. I'm
>> not sure why you chose to remove instanceKlassVtableStartOffset() rather
>> than just fix it.
>
> I'm cc:ing hotspot-compiler-dev and graal-dev to see if I can get 
> someone to ok the JVMCI parts.
>
> The reason for removing the method is that the only reason for it 
> being a method was to apply the wordSize scaling on the value and 
> since I changed the offset to be a byte offset it does not need 
> scaling and can be treated similar to the other constants in 
> HotSpotVMConfig which are accessed without any accessor method.
>
>>
>> I think some of your changes may conflict with my changes for
>> JDK-8143608. Coleen is pushing JDK-8143608 for me once hs-rt opens up.
>> I'd appreciate it if you could wait until after then before doing your
>> push.
>
> Will do, would you mind pinging me when you've integrated 8143608?
>
> /Mikael
>
>>
>> thanks,
>>
>> Chris
>>
>> On 1/20/16 4:31 AM, Mikael Gerdin wrote:
>>> Hi again,
>>>
>>> I've rebased the patch on hs-rt and had to include some additional changes
>>> for JVMCI.
>>> I've also updated the copyright years.
>>> Unfortunately I can't generate an incremental webrev since I rebased
>>> the patch and there's no good way that I know of to make that work
>>> with webrev.
>>>
>>> New webrev at: http://cr.openjdk.java.net/~mgerdin/8147461/webrev.1/
>>>
>>> Testing: JPRT again (which includes the JVMCI jtreg tests)
>>>
>>> /Mikael
>>>
>>> On 2016-01-15 18:04, Mikael Gerdin wrote:
>>>> Hi all,
>>>>
>>>> As per the previous discussion in mid-December[0] about moving the
>>>> _vtable_length field to class Klass, here's the first RFR and webrev,
>>>> according to my suggested plan[1]:
>>>>
>>>>> My current plan is to first modify the vtable_length_offset accessor to
>>>>> return a byte offset (which is what it's translated to by all callers).
>>>>>
>>>>> Then I'll tackle moving the _vtable_len field to Klass.
>>>>>
>>>>> Finally I'll try to consolidate the vtable related methods to Klass,
>>>>> where they belong.
>>>>
>>>> This change actually consists of three changes:
>>>> * modifying InstanceKlass::vtable_length_offset to become a byte offset
>>>> and use the ByteSize type to communicate the scaling.
>>>> * modifying InstanceKlass::vtable_start_offset to become a byte offset
>>>> and use the ByteSize type, for symmetry reasons mainly.
>>>> * adding a vtableEntry::size_in_bytes() since in many places the vtable
>>>> entry size is used in combination with the vtable start to compute a
>>>> byte offset for vtable lookups.
>>>>
>>>> I don't foresee any issues with the fact that the byte offset is
>>>> represented as an int, for two reasons:
>>>> 1) If the offset of any of these grows to over 2 gigabytes then we have
>>>> a huge footprint problem with InstanceKlass
>>>> 2) The offsets are converted to byte offsets and stored in ints already
>>>> in the cpu specific code I've modified.
>>>>
>>>> Bug link: https://bugs.openjdk.java.net/browse/JDK-8147461
>>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8147461/webrev.0/
>>>>
>>>> Testing: JPRT on Oracle supported platforms, testing on AARCH64 and
>>>> PPC64 would be much appreciated, appropriate mailing lists have been
>>>> CC:ed to notify them of the request.
>>>>
>>>>
>>>> [0]
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021152.html 
>>>>
>>>>
>>>>
>>>> [1]
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021224.html 
>>>>
>>>>
>>>>
>>>>
>>>> Thanks!
>>>> /Mikael
>>>
>>
>
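
The byte-offset arithmetic described in the quoted RFR can be sketched roughly as follows. This is only an illustration in Java, not HotSpot code: the ByteSize and VtableOffsets classes and the numeric values in the usage note are assumptions made up for the example. The point is that a dedicated type makes the unit (bytes, not words) visible at every call site.

```java
// Hypothetical illustration only; these are not HotSpot classes.
final class ByteSize {
    private final int bytes; // offsets are kept in an int, as in the RFR

    ByteSize(int bytes) {
        this.bytes = bytes;
    }

    int value() {
        return bytes;
    }

    ByteSize plus(ByteSize other) {
        return new ByteSize(bytes + other.bytes);
    }

    ByteSize times(int count) {
        return new ByteSize(bytes * count);
    }
}

final class VtableOffsets {
    // Byte offset of vtable slot 'index', given the byte offset of the
    // vtable start and the size of one entry in bytes.
    static ByteSize entryOffset(ByteSize vtableStart, ByteSize entrySize, int index) {
        return vtableStart.plus(entrySize.times(index));
    }
}
```

With an assumed vtable start of 448 bytes and 8-byte entries, VtableOffsets.entryOffset(new ByteSize(448), new ByteSize(8), 3).value() yields 472, with no separate word-to-byte scaling step at the call site.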


From christian.thalinger at oracle.com  Thu Jan  7 16:45:46 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 07 Jan 2016 16:45:46 -0000
Subject: RFR (S): 8146246: JVMCICompiler::abort_on_pending_exception:
	assert(!thread->owns_locks()) failed: must release all locks
	when leaving VM
In-Reply-To: <0C62FED5-F3F8-44CE-B1DB-095F9170370B@oracle.com>
References: <ED6A057F-0842-42FD-8F7A-8116EA9DF5D2@oracle.com>
	<568D6C63.5000403@oracle.com>
	<0C62FED5-F3F8-44CE-B1DB-095F9170370B@oracle.com>
Message-ID: <80FAECCD-94BA-479C-B042-0A50D9121C8F@oracle.com>

[Changing lists because it should have been on hotspot-dev.]

Coleen, in case 2) below I could replace java_lang_Throwable::print_stack_trace with java_lang_Throwable::java_printStackTrace.

> On Jan 6, 2016, at 12:57 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> 
>> On Jan 6, 2016, at 9:34 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>> 
>> I would go with "Java code do the printing".
> 
> Yeah, it might be better.
> 
>> You left ttyLocker in case 2) in src/share/vm/runtime/java.cpp
> 
> Right.  Thanks for pointing that out.
> 
>> 
>> Thanks,
>> Vladimir
>> 
>> On 1/6/16 11:19 AM, Christian Thalinger wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8146246
>>> 
>>> The problem is that https://bugs.openjdk.java.net/browse/JDK-8145435 introduced ttyLocker to synchronize the exception output but java_lang_Throwable::print_stack_trace can call out to Java to get the cause.
>>> 
>>> There are two solutions:
>>> 
>>> 1) Remove ttyLocker and deal with some possible scrambling in the rare case of an exception:
>>> 
>>> diff -r df8d635f2296 -r e87e187552fb src/share/vm/jvmci/jvmciCompiler.cpp
>>> --- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 11:24:01 2015 -0800
>>> +++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Thu Dec 31 09:20:16 2015 -0800
>>> @@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
>>>     Handle exception(THREAD, PENDING_EXCEPTION);
>>>     CLEAR_PENDING_EXCEPTION;
>>> 
>>> -    {
>>> -      ttyLocker ttyl;
>>> -      java_lang_Throwable::print_stack_trace(exception, tty);
>>> -    }
>>> +    java_lang_Throwable::print_stack_trace(exception, tty);
>>> 
>>>     // Something went wrong so disable compilation at this level
>>>     method->set_not_compilable(CompLevel_full_optimization);
>>> @@ -181,11 +178,8 @@ void JVMCICompiler::abort_on_pending_exc
>>>   Thread* THREAD = Thread::current();
>>>   CLEAR_PENDING_EXCEPTION;
>>> 
>>> -  {
>>> -    ttyLocker ttyl;
>>> -    tty->print_raw_cr(message);
>>> -    java_lang_Throwable::print_stack_trace(exception, tty);
>>> -  }
>>> +  tty->print_raw_cr(message);
>>> +  java_lang_Throwable::print_stack_trace(exception, tty);
>>> 
>>>   // Give other aborting threads to also print their stack traces.
>>>   // This can be very useful when debugging class initialization
>>> diff -r df8d635f2296 -r e87e187552fb src/share/vm/runtime/java.cpp
>>> --- a/src/share/vm/runtime/java.cpp	Tue Dec 29 11:24:01 2015 -0800
>>> +++ b/src/share/vm/runtime/java.cpp	Thu Dec 31 09:20:16 2015 -0800
>>> @@ -432,7 +432,6 @@ void before_exit(JavaThread* thread) {
>>>   if (HAS_PENDING_EXCEPTION) {
>>>     Handle exception(THREAD, PENDING_EXCEPTION);
>>>     CLEAR_PENDING_EXCEPTION;
>>> -    ttyLocker ttyl;
>>>     java_lang_Throwable::print_stack_trace(exception, tty);
>>>   }
>>> #endif
>>> 
>>> or
>>> 
>>> 2) Call out to Java and let the Java code do the printing:
>>> 
>>> diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.cpp
>>> --- a/src/share/vm/classfile/javaClasses.cpp	Tue Dec 29 18:30:51 2015 +0100
>>> +++ b/src/share/vm/classfile/javaClasses.cpp	Wed Jan 06 09:12:00 2016 -1000
>>> @@ -1784,6 +1784,20 @@ void java_lang_Throwable::print_stack_tr
>>>   }
>>> }
>>> 
>>> +/**
>>> + * Print the throwable stack trace by calling the Java method java.lang.Throwable.printStackTrace().
>>> + */
>>> +void java_lang_Throwable::java_printStackTrace(Handle throwable, TRAPS) {
>>> +  assert(throwable->is_a(SystemDictionary::Throwable_klass()), "Throwable instance expected");
>>> +  JavaValue result(T_VOID);
>>> +  JavaCalls::call_virtual(&result,
>>> +                          throwable,
>>> +                          KlassHandle(THREAD, SystemDictionary::Throwable_klass()),
>>> +                          vmSymbols::printStackTrace_name(),
>>> +                          vmSymbols::void_method_signature(),
>>> +                          THREAD);
>>> +}
>>> +
>>> void java_lang_Throwable::fill_in_stack_trace(Handle throwable, const methodHandle& method, TRAPS) {
>>>   if (!StackTraceInThrowable) return;
>>>   ResourceMark rm(THREAD);
>>> diff -r 0fcfe4b07f7e src/share/vm/classfile/javaClasses.hpp
>>> --- a/src/share/vm/classfile/javaClasses.hpp	Tue Dec 29 18:30:51 2015 +0100
>>> +++ b/src/share/vm/classfile/javaClasses.hpp	Wed Jan 06 09:12:00 2016 -1000
>>> @@ -554,6 +554,7 @@ class java_lang_Throwable: AllStatic {
>>>   // Printing
>>>   static void print(Handle throwable, outputStream* st);
>>>   static void print_stack_trace(Handle throwable, outputStream* st);
>>> +  static void java_printStackTrace(Handle throwable, TRAPS);
>>>   // Debugging
>>>   friend class JavaClasses;
>>> };
>>> diff -r 0fcfe4b07f7e src/share/vm/jvmci/jvmciCompiler.cpp
>>> --- a/src/share/vm/jvmci/jvmciCompiler.cpp	Tue Dec 29 18:30:51 2015 +0100
>>> +++ b/src/share/vm/jvmci/jvmciCompiler.cpp	Wed Jan 06 09:12:00 2016 -1000
>>> @@ -162,10 +162,7 @@ void JVMCICompiler::compile_method(const
>>>     Handle exception(THREAD, PENDING_EXCEPTION);
>>>     CLEAR_PENDING_EXCEPTION;
>>> 
>>> -    {
>>> -      ttyLocker ttyl;
>>> -      java_lang_Throwable::print_stack_trace(exception, tty);
>>> -    }
>>> +    java_lang_Throwable::java_printStackTrace(exception, THREAD);
>>> 
>>>     // Something went wrong so disable compilation at this level
>>>     method->set_not_compilable(CompLevel_full_optimization);
>>> @@ -181,11 +178,7 @@ void JVMCICompiler::abort_on_pending_exc
>>>   Thread* THREAD = Thread::current();
>>>   CLEAR_PENDING_EXCEPTION;
>>> 
>>> -  {
>>> -    ttyLocker ttyl;
>>> -    tty->print_raw_cr(message);
>>> -    java_lang_Throwable::print_stack_trace(exception, tty);
>>> -  }
>>> +  java_lang_Throwable::java_printStackTrace(exception, THREAD);
>>> 
>>>   // Give other aborting threads to also print their stack traces.
>>>   // This can be very useful when debugging class initialization
>>> diff -r 0fcfe4b07f7e src/share/vm/runtime/java.cpp
>>> --- a/src/share/vm/runtime/java.cpp	Tue Dec 29 18:30:51 2015 +0100
>>> +++ b/src/share/vm/runtime/java.cpp	Wed Jan 06 09:12:00 2016 -1000
>>> @@ -433,7 +433,7 @@ void before_exit(JavaThread* thread) {
>>>     Handle exception(THREAD, PENDING_EXCEPTION);
>>>     CLEAR_PENDING_EXCEPTION;
>>>     ttyLocker ttyl;
>>> -    java_lang_Throwable::print_stack_trace(exception, tty);
>>> +    java_lang_Throwable::java_printStackTrace(exception, THREAD);
>>>   }
>>> #endif
>>> 
> 
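
For context on option 2 in the quoted mail: java.lang.Throwable.printStackTrace already walks the cause chain on the Java side, so the VM would not need to hold a lock while fetching the cause. The sketch below only shows what the Java method prints; the StackTraceDemo class and its render helper are made up for this illustration and are not part of the patch.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical demo class; not part of the proposed change.
final class StackTraceDemo {
    // Renders the stack trace of 't' (including any causes) into a String
    // using the standard Throwable.printStackTrace(PrintWriter) method.
    static String render(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        t.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        Throwable cause = new IllegalStateException("inner");
        Throwable top = new RuntimeException("outer", cause);
        // The output contains a "Caused by:" section for the cause.
        System.out.print(render(top));
    }
}
```

The trade-off discussed in the thread still applies: output produced this way goes wherever the Java side sends it, and interleaving with other VM output is not prevented by ttyLocker.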


From christian.thalinger at oracle.com  Thu Jan 14 21:48:32 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 14 Jan 2016 11:48:32 -1000
Subject: RFR (S): 8146820: JVMCI properties should use
	HotSpotJVMCIRuntime.getBooleanProperty mechanism
In-Reply-To: <E6B59722-3602-4934-A0C7-BFAD94CB6699@oracle.com>
References: <83D3AB99-8164-4326-B847-06BFF27280C7@oracle.com>
	<56940779.8070804@oracle.com>
	<B625DC33-05DF-401B-AE5E-088CD3BBB2D5@oracle.com>
	<490C48FD-48A2-459F-BF0A-56D33966CC60@oracle.com>
	<CA9DB527-C900-4883-ACA9-33770DEE41DC@oracle.com>
	<CE1979F3-2547-49D3-A542-1A34A7256112@oracle.com>
	<9EC9F964-26EE-43B6-BF7E-43F40D192C1E@oracle.com>
	<B21498A3-F218-4BC0-9F3F-B1DD858D11DC@oracle.com>
	<41621484-0886-401C-A8AD-36D534DDE591@oracle.com>
	<B6B8EA46-28F4-4D7A-94D6-7DD1EA6DC983@oracle.com>
	<7C1CBFFE-9A7C-4195-A8EA-BD7B94092E4F@oracle.com>
	<E6B59722-3602-4934-A0C7-BFAD94CB6699@oracle.com>
Message-ID: <97A6E072-7B52-4083-86F1-0DFC8AD287C7@oracle.com>


> On Jan 14, 2016, at 2:44 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 14 Jan 2016, at 06:58, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>>> 
>>> On Jan 12, 2016, at 12:39 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>>> 
>>>> On Jan 12, 2016, at 12:14 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On Jan 12, 2016, at 12:03 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On 12 Jan 2016, at 22:39, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Jan 12, 2016, at 10:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>>> 
>>>>>>> If we're going with an enum, you could put accessors directly in the enum:
>>>>>>> 
>>>>>>> private static final boolean TrustFinalDefaultFields = Option.TrustFinalDefaultFields.getBoolean(true);
>>>>>>> 
>>>>>>> private static final String TraceMethodDataFilter = Option.TraceMethodDataFilter.getString(null);
>>>>>>> 
>>>>>>> You could then type the value of the options and check the right accessor is used:
>>>>>>> 
>>>>>>> public enum Option {
>>>>>>>     ImplicitStableValues(boolean.class),
>>>>>>>     InitTimer,  // Note: Not used because of visibility issues (see InitTimer.ENABLED).
>>>>>>>     PrintConfig(boolean.class),
>>>>>>>     PrintFlags(boolean.class),
>>>>>>>     ShowFlags(boolean.class),
>>>>>>>     TraceMethodDataFilter(String.class),
>>>>>>>     TrustFinalDefaultFields(boolean.class);
>>>>>>> 
>>>>>>> Even ignoring these suggestions, the discipline imposed by the enum is a good idea.
>>>>>> 
>>>>>> Excellent idea!  I was also thinking about adding the default value to the enum.
>>>>> 
>>>>> Can you do that without having to box the default value?
>>>> 
>>>> No, we have to box but we can initialize all flags in the constructor:
>>>> 
>>>> http://cr.openjdk.java.net/~twisti/8146820/webrev.02/
>> 
>> Do we agree on the change?
> 
> I would prefer it if the value was lazy initialized (for non-AOT runtimes):

It's not different in AOT-land because these cannot be constants.

> 
> 
>    /**
>     * Supported JVMCI options.
>     */
>    public enum Option {
>        ImplicitStableValues(boolean.class, true),
>        InitTimer(boolean.class, false),  // Note: Not used (see InitTimer.ENABLED).
>        PrintConfig(boolean.class, false),
>        PrintFlags(boolean.class, false),
>        ShowFlags(boolean.class, false),
>        TraceMethodDataFilter(String.class, null),
>        TrustFinalDefaultFields(boolean.class, true);
> 
>        /**
>         * The prefix for system properties that are JVMCI options.
>         */
>        private static final String JVMCI_OPTION_PROPERTY_PREFIX = "jvmci.";
> 
>        private final Class<?> type;
>        private Object value;
>        private final Object defaultValue;
>        private boolean isDefault;
> 
>        private Option(Class<?> type, Object defaultValue) {
>            assert Character.isUpperCase(name().charAt(0)) : "Option name must start with upper-case letter: " + name();
>            this.type = type;
>            this.value = "UNINITIALIZED";
>            this.defaultValue = defaultValue;
>        }
> 
>        private Object getValue() {
>            if (value == "UNINITIALIZED") {
>                String propertyValue = VM.getSavedProperty(JVMCI_OPTION_PROPERTY_PREFIX + name());
>                if (propertyValue == null) {
>                    this.value = defaultValue;
>                    this.isDefault = true;
>                } else {
>                    if (type == boolean.class) {
>                        this.value = Boolean.parseBoolean(propertyValue);
>                    } else if (type == String.class) {
>                        this.value = propertyValue;
>                    } else {
>                        throw new JVMCIError("Unexpected option type " + type);
>                    }
>                    this.isDefault = false;
>                }
>                // Saved properties should not be interned - let's be sure
>                assert value != "UNINITIALIZED";
>            }
>            return value;
>        }
> 
>        /**
>         * Returns the option's value as boolean.
>         *
>         * @return option's value
>         */
>        public boolean getBoolean() {
>            return (boolean) getValue();
>        }
> 
>        /**
>         * Returns the option's value as String.
>         *
>         * @return option's value
>         */
>        public String getString() {
>            return (String) getValue();
>        }
> 
>        /**
>         * Prints all option flags to {@code out}.
>         *
>         * @param out stream to print to
>         */
>        public static void printFlags(PrintStream out) {
>            out.println("[List of JVMCI options]");
>            for (Option option : values()) {
>                Object value = option.getValue();
>                String assign = option.isDefault ? ":=" : " =";
>                out.printf("%9s %-40s %s %-14s%n", option.type.getSimpleName(), option, assign, value);
>            }
>        }
>    }
> 
> 
> Also, you can remove all the static fields that just cache a (possibly unboxed) option value and use the option directly. For example:
> 
> diff -r 1034ff44c5d0 src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java
> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Tue Jan 12 15:04:27 2016 +0100
> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaFieldImpl.java	Thu Jan 14 13:40:28 2016 +0100
> @@ -29,6 +29,7 @@
> import java.lang.reflect.Field;
> 
> import jdk.vm.ci.common.JVMCIError;
> +import jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.Option;
> import jdk.vm.ci.meta.JavaType;
> import jdk.vm.ci.meta.LocationIdentity;
> import jdk.vm.ci.meta.MetaAccessProvider;
> @@ -41,11 +42,6 @@
>  */
> class HotSpotResolvedJavaFieldImpl implements HotSpotResolvedJavaField, HotSpotProxified {
> 
> -    /**
> -     * Mark well-known stable fields as such.
> -     */
> -    private static final boolean ImplicitStableValues = HotSpotJVMCIRuntime.getBooleanProperty("jvmci.ImplicitStableValues", true);
> -
>     private final HotSpotResolvedObjectTypeImpl holder;
>     private final String name;
>     private JavaType type;
> @@ -198,7 +194,7 @@
>             return true;
>         }
>         assert getAnnotation(Stable.class) == null;
> -        if (ImplicitStableValues && isImplicitStableField()) {
> +        if (Option.ImplicitStableValues.getBoolean() && isImplicitStableField()) {
>             return true;
>         }
>         return false;
> 
> None of the current options are used in tight loops where the cost of the unboxing (if any) would matter.

Right.

> 
> Lastly, since you've added PrintFlags and ShowFlags, why not add a help message to each option? For example:
> 
>        ImplicitStableValues(boolean.class, true, "Mark well-known stable fields as such."),

We should.

http://cr.openjdk.java.net/~twisti/8146820/webrev.03/

$ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.ShowFlags=true InitGraal
[List of JVMCI options]
  boolean ImplicitStableValues                     := true           Mark well-known stable fields as such.
  boolean InitTimer                                := false          Specifies if initialization timing is enabled.
  boolean PrintConfig                              := false          Prints all HotSpotVMConfig fields.
  boolean PrintFlags                               := false          Prints all JVMCI flags and exits.
  boolean ShowFlags                                 = true           Prints all JVMCI flags and continues.
   String TraceMethodDataFilter                    := null           
  boolean TrustFinalDefaultFields                  := true           Determines whether to treat final fields with default values as constant.

> 
> -Doug
> 
>> 
>>>> 
>>>> We will not have many flags so this should be alright.  A PrintFlags looks like this:
>>>> 
>>>> $ ./build/macosx-x86_64-normal-server-release/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Djvmci.PrintFlags=true InitGraal
>>>> [List of JVMCI options]
>>>> boolean ImplicitStableValues                     := true          
>>>> boolean InitTimer                                := false         
>>>> boolean PrintConfig                              := false         
>>>> boolean PrintFlags                                = true          
>>>> boolean ShowFlags                                := false         
>>>>  String TraceMethodDataFilter                    := null          
>>>>  String TrustFinalDefaultFields                  := true          
>>> 
>>> ...and this is a bug, of course :-)
>>> 
>>>> 
>>>> I'm almost tempted to move InitTimer to another package, like jdk.vm.ci.common ...
>>>> 
>>>>> 
>>>>> -Doug
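
The lazily initialized, typed option enum discussed in this thread can be sketched in a self-contained form. This is a hedged illustration, not the code in the webrev: DemoOption, the "demo." prefix and the isDefault() accessor are invented for the example, and it reads ordinary system properties via System.getProperty rather than VM.getSavedProperty. It also uses an explicit initialized flag instead of the "UNINITIALIZED" sentinel string, so it does not rely on string identity.

```java
// Hypothetical stand-in for the Option enum discussed above.
enum DemoOption {
    ImplicitStableValues(Boolean.class, true),
    TraceMethodDataFilter(String.class, null);

    // Assumed prefix, used only in this sketch.
    private static final String PROPERTY_PREFIX = "demo.";

    private final Class<?> type;
    private final Object defaultValue;
    private Object value;
    private boolean initialized; // explicit flag instead of a sentinel value
    private boolean isDefault;

    DemoOption(Class<?> type, Object defaultValue) {
        this.type = type;
        this.defaultValue = defaultValue;
    }

    // Reads the system property on first use and caches the parsed value.
    private synchronized Object getValue() {
        if (!initialized) {
            String property = System.getProperty(PROPERTY_PREFIX + name());
            if (property == null) {
                value = defaultValue;
                isDefault = true;
            } else if (type == Boolean.class) {
                value = Boolean.parseBoolean(property);
            } else if (type == String.class) {
                value = property;
            } else {
                throw new IllegalStateException("Unexpected option type " + type);
            }
            initialized = true;
        }
        return value;
    }

    // Typed accessors check that the right accessor is used for the option.
    boolean getBoolean() {
        if (type != Boolean.class) {
            throw new IllegalStateException(name() + " is not a boolean option");
        }
        return (Boolean) getValue();
    }

    String getString() {
        if (type != String.class) {
            throw new IllegalStateException(name() + " is not a String option");
        }
        return (String) getValue();
    }

    synchronized boolean isDefault() {
        getValue(); // make sure the option has been read
        return isDefault;
    }
}
```

Because the property is read on first access, a value set with System.setProperty("demo.TraceMethodDataFilter", "foo") before the first call to DemoOption.TraceMethodDataFilter.getString() is picked up, while an option whose property is unset reports its default.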


From jaroslav.bachorik at oracle.com  Fri Jan  8 13:12:02 2016
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 08 Jan 2016 13:12:02 -0000
Subject: RFR 8146620: CodelistTest.java fails with "Test failed on:
	jdk.internal.misc.Unsafe.getUnsafe()Ljdk/internal/misc/Unsafe;"
In-Reply-To: <8AA1795C-7E67-4CE6-8E07-490564C91D2A@oracle.com>
References: <568F7A0F.50006@oracle.com>
	<8AA1795C-7E67-4CE6-8E07-490564C91D2A@oracle.com>
Message-ID: <568FB59D.4050902@oracle.com>

On 8.1.2016 10:06, Staffan Larsen wrote:
> Looks good!

Thanks!

>
> Thanks,
> /Staffan
>
>> On 8 jan. 2016, at 09:57, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>
>> Please, review the following simple test fix
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-8146620
>> Webrev: http://cr.openjdk.java.net/~jbachorik/8146620/webrev.00
>>
>> The test treats the 'sun.misc.Unsafe.getUnsafe()' entry from the code list specially. Since the 'jdk.internal.misc.Unsafe.getUnsafe()' entry can now appear as well, the test must be modified to expect it too.
>>
>> Thanks,
>>
>> -JB-
>


From christian.thalinger at oracle.com  Mon Jan 25 23:11:41 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Mon, 25 Jan 2016 13:11:41 -1000
Subject: RFR: 8147470: update JVMCI mx extensions
In-Reply-To: <DB5579B7-C486-4896-95B9-6BD31CB04F55@oracle.com>
References: <DB5579B7-C486-4896-95B9-6BD31CB04F55@oracle.com>
Message-ID: <420CEE06-87D7-4A6C-BB2A-FF76DD303756@oracle.com>

This all looks good.

> On Jan 25, 2016, at 6:14 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Please review these changes to the mx extensions for JVMCI to account for recent HotSpot and Graal changes.
> 
> https://bugs.openjdk.java.net/browse/JDK-8147470
> http://cr.openjdk.java.net/~dnsimon/8147470/
> 
> -Doug


From tobias.hartmann at oracle.com  Tue Jan 26 08:13:10 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 26 Jan 2016 09:13:10 +0100
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes
	outside dominated array
In-Reply-To: <56A65540.6010105@oracle.com>
References: <56A5EF6C.4090603@oracle.com> <56A65540.6010105@oracle.com>
Message-ID: <56A72A96.8080306@oracle.com>

Thanks, Vladimir.

Best,
Tobias

On 25.01.2016 18:02, Vladimir Ivanov wrote:
> Looks good.
> 
> Best regards,
> Vladimir Ivanov
> 
> On 1/25/16 12:48 PM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8147876
>> http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/
>>
>> ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().
>>
>> Thanks,
>> Tobias
>>

From tobias.hartmann at oracle.com  Tue Jan 26 08:17:41 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 26 Jan 2016 09:17:41 +0100
Subject: [9] RFR(XS): 8147876: ciTypeFlow::is_dominated_by() writes
	outside dominated array
In-Reply-To: <56A65766.6030307@oracle.com>
References: <56A5EF6C.4090603@oracle.com> <56A65766.6030307@oracle.com>
Message-ID: <56A72BA5.1020700@oracle.com>

Thanks, Vladimir.

On 25.01.2016 18:12, Vladimir Kozlov wrote:
> Looks good.
> 
> A note to all: when you find a problem in recent changes, please add a link in the bug report to the original change (JDK-8140574 in this case). In the future it will help if, for example, we want to backport the original change.

Right, I'll keep that in mind.

Best,
Tobias

> 
> Thanks,
> Vladimir
> 
> On 1/25/16 1:48 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8147876
>> http://cr.openjdk.java.net/~thartmann/8147876/webrev.00/
>>
>> ciTypeFlow::is_dominated_by() writes outside the 'dominated' array because its size is too small. The problem is that the number of ciBlocks is not equal to the number of Blocks used by ciTypeFlow (there is a 1:n relation). Therefore, we should use block_count() instead of _methodBlocks->num_blocks().
>>
>> Thanks,
>> Tobias
>>

From tobias.hartmann at oracle.com  Tue Jan 26 09:10:14 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 26 Jan 2016 10:10:14 +0100
Subject: RFR(S): 6378256: Performance problem with System.identityHashCode
	in client compiler
In-Reply-To: <e979523e-07f1-4ab6-b7fe-0c54268f55fc@default>
References: <e979523e-07f1-4ab6-b7fe-0c54268f55fc@default>
Message-ID: <56A737F6.6030909@oracle.com>

Hi Rahul,

looks good to me (not a Reviewer). The code in sharedRuntime_x86_64.cpp is much better now!

Best,
Tobias

On 25.01.2016 18:02, Rahul Raghavan wrote:
> Hello,
> 
> With reference to below email thread, please send review comments for the revised patch for JDK-6378256.
> http://cr.openjdk.java.net/~thartmann/6378256/webrev.02/
> 
> Thanks,
> Rahul
> 
>> -----Original Message-----
>> From: Tobias Hartmann
>> Sent: Monday, January 25, 2016 12:40 PM
>> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
>>
>> Hi Rahul,
>>
>> On 22.01.2016 17:11, Rahul Raghavan wrote:
>>>
>>>> -----Original Message-----
>>>> From: Tobias Hartmann
>>>> Sent: Monday, January 11, 2016 2:56 PM
>>>> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
>>>>
>>>> Hi Rahul,
>>>>
>>>>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
>>>>
>>>> Why don't you use 'markOopDesc::hash_mask_in_place' for the 64 bit version? This should save some instructions and you also don't
>>>> need the 'hash' register if you compute everything in 'result'.
>>>
>>> Thank you for your comments Tobias.
>>>
>>> I could not get the implementation to work using 'markOopDesc::hash_mask_in_place' on x86_64 (as is done for x86_32).
>>> Using -    __ andptr(result, markOopDesc::hash_mask_in_place);
>>> results in the build error 'overflow in implicit constant conversion'.
>>>
>>> I then understood from 'sharedRuntime_sparc.cpp' and 'markOop.hpp' that 'hash_mask_in_place' should be avoided on 64-bit because the values are too big!
>>> There are similar comments in LibraryCallKit::inline_native_hashcode [hotspot/src/share/vm/opto/library_call.cpp].
>>> I could not find another way to use hash_mask_in_place here for x86_64.
>>
>> You are right, I missed that.
>>
>>> So the code now depends on the markOopDesc::hash_mask and markOopDesc::hash_shift values instead (as done in sharedRuntime_sparc).
>>> I added the missing comment about this in the revised webrev.
>>>
>>> Also, yes, I missed the optimized codegen.
>>> I tried a revised patch that removes the extra 'hash' and 'mask' registers and computes everything in 'result' itself.
>>>
>>> [sharedRuntime_x86_64.cpp]
>>>      ....................
>>> +    Register obj_reg = j_rarg0;
>>> +    Register result = rax;
>>>      ........
>>> +    // get hash
>>> +    // Read the header and build a mask to get its hash field.
>>> +    // Depend on hash_mask being at most 32 bits and avoid the use of hash_mask_in_place
>>> +    // because it could be larger than 32 bits in a 64-bit vm. See markOop.hpp.
>>> +    __ shrptr(result, markOopDesc::hash_shift);
>>> +    __ andptr(result, markOopDesc::hash_mask);
>>> +    // test if hashCode exists
>>> +    __ jcc  (Assembler::zero, slowCase);
>>> +    __ ret(0);
>>> +    __ bind (slowCase);
>>>       ........
>>>
>>> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
>>>
>>> Please send your comments. I can submit revised webrev if all okay.
>>
>> Looks good. Please send a new webrev.
>>
>> Best,
>> Tobias
>>
>>>
>>>>
>>>> Best,
>>>> Tobias
>>>>
>>>>
>>>> On 08.01.2016 18:13, Rahul Raghavan wrote:
>>>>> Hello,
>>>>>
>>>>> Please review the following revised patch for JDK-6378256 -
>>>>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
>>>>>
>>>>> This revised webrev got following changes -
>>>>>
>>>>>  1) A minor optimization: return 0 at the initial stage (instead of continuing to the 'slowCase' path) for the special/rare null reference input!
>>>>>    (as per the documentation, and as test results confirmed, it is safe to 'return 0' for a null reference input to System.identityHashCode)
>>>>>
>>>>>  2) Added similar Object.hashCode, System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.
>>>>>
>>>>> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
>>>>>
>>>>> Thanks,
>>>>> Rahul
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Roland Westrelin
>>>>>> Sent: Wednesday, December 09, 2015 8:03 PM
>>>>>> To: Rahul Raghavan
>>>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
>>>>>>
>>>>>> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be nice.
>>>>>> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
>>>>>>
>>>>>> Roland.
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Rahul Raghavan
>>>>>> Sent: Wednesday, December 09, 2015 2:43 PM
>>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Please review the following patch for JDK-6378256.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
>>>>>> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times
>>>>>> slower).
>>>>>> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
>>>>>>
>>>>>> sample unit test:
>>>>>>    public class Jdk6378256Test
>>>>>>    {
>>>>>>       public static void main(String[] args)
>>>>>>       {
>>>>>>          Object obj = new Object();
>>>>>>          long time = System.nanoTime();
>>>>>>          for(int i = 0 ; i < 1000000 ; i++)
>>>>>>             System.identityHashCode(obj);  //compare to obj.hashCode();
>>>>>>          System.out.println ("Result = " + (System.nanoTime() - time));
>>>>>>       }
>>>>>>    }
>>>>>>
>>>>>> Fix: Enabled the C1 optimization which was done only for Object.hashCode, now for System.identityHashCode() also.
>>>>>> (looks in the header for the hashCode before calling into the VM).
>>>>>> Unlike for Object.hashCode, System.identityHashCode is static method and gets object as argument instead of the receiver.
>>>>>> So also added required additional null check for System.identityHashCode case.
>>>>>>
>>>>>> Testing:
>>>>>>    - successful JPRT run (-testset hotspot).
>>>>>>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
>>>>>>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
>>>>>>    - Added 'noreg-perf' label for this performance bug.
>>>>>>       Manual testing done and confirmed expected performance values for unit tests with fix.
>>>>>>
>>>>>> Thanks,
>>>>>> Rahul

From roland.westrelin at oracle.com  Tue Jan 26 09:18:22 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Tue, 26 Jan 2016 10:18:22 +0100
Subject: RFR(XS): 8147853: "assert(t->meet(t0) == t) failed: Not
	monotonic" with sun/util/calendar/zi/TestZoneInfo310.java
In-Reply-To: <56A660E1.5020300@oracle.com>
References: <DAC01147-E291-4310-AFDD-188087C7DF75@oracle.com>
	<56A276AE.8060408@oracle.com>
	<8686820A-60A2-4CEC-AEAC-DD3A74369323@oracle.com>
	<56A660E1.5020300@oracle.com>
Message-ID: <D04D80D6-91EC-4CEF-B739-A53A8B38005A@oracle.com>

Thanks for the review, Vladimir.

Roland.


From doug.simon at oracle.com  Tue Jan 26 09:28:16 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 26 Jan 2016 10:28:16 +0100
Subject: RFR: 8148202: move lookup of Java class and hub from ResolvedJavaType
	to ConstantReflectionProvider
Message-ID: <D6D9620C-A850-4BD5-BD25-7226958037A7@oracle.com>

Most access to VM constants in JVMCI goes through the ConstantReflectionProvider interface meaning the VM implementation for constant handling is in one place. For historic reasons, some small amount of reflection on VM constants was located in ResolvedJavaType. This issue consolidates these methods to ConstantReflectionProvider.

https://bugs.openjdk.java.net/browse/JDK-8148202
http://cr.openjdk.java.net/~dnsimon/8148202

-Doug

From nils.eliasson at oracle.com  Tue Jan 26 10:40:06 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Tue, 26 Jan 2016 11:40:06 +0100
Subject: RFR(S): 8063112: Compiler diagnostic commands should have locking
	instead of safepoint
In-Reply-To: <56A281C3.6010408@oracle.com>
References: <56A23F61.9000201@oracle.com> <56A281C3.6010408@oracle.com>
Message-ID: <56A74D06.7030408@oracle.com>

Hi Vladimir,

On 2016-01-22 20:23, Vladimir Kozlov wrote:
> Why do you need a new print method? Why can't you use the existing print()?
> Also I would prefer the current compilation tasks to be printed on separate lines 
> - not in the list of threads. Then you wouldn't need a new print method.

Works for me. I moved it directly after the existing thread printing:

---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )
   0x00007f4cfc485000 JavaThread "Service Thread" daemon 
[_thread_blocked, id=22409, stack(0x00007f4bf1c5e000,0x00007f4bf1d5f000)]
   0x00007f4cfc476000 JavaThread "Sweeper thread" daemon 
[_thread_blocked, id=22408, stack(0x00007f4bf1d5f000,0x00007f4bf1e60000)]
...
stack(0x00007f4bf35db000,0x00007f4bf36dc000)]
   0x00007f4cfc018800 JavaThread "main" [_thread_in_vm, id=22332, 
stack(0x00007f4d05c78000,0x00007f4d05d79000)]

Other Threads:
   0x00007f4cfc3ea000 VMThread [stack: 
0x00007f4bf36dc000,0x00007f4bf37dd000] [id=22388]
   0x00007f4cfc486800 WatcherThread [stack: 
0x00007f4bf1b5d000,0x00007f4bf1c5e000] [id=22410]

Threads with active compile tasks:
   0x00007f4cfc46a800 id=22403 Compiling:     244    1       3 
java.lang.String::isLatin1 (19 bytes)


>
> I am worried about using locks for printing because the print code also
> has locks. Do we really have to have locks here? The output for these
> directives is a local bufferedStream. As I understand it, it is separate
> for each directive. So why do you need a lock? Or a VM operation as before?

I think you are mixing my two RFRs together - this change doesn't print 
directives.

I am removing vm_ops from three diagnostic commands that use code which
expects either a safepoint or a lock. Some of the commands are really
quick, and requesting a safepoint is overkill when the work can be done
concurrently. The only new lock taken is the thread lock, when iterating
over the compiler threads from the Compiler.queue jcmd. The thread lock
is ranked, so it cannot be reordered with the compile.queue lock.

I cleaned it up a bit further and removed the unused 
print_compiler_threads_on(...) from compileBroker. It is printed in 
JavaThread::print_on(..) where all the other thread info is located.

Hs_err-file looks like the example above.

jcmd Thread.print looks like this for compiling threads:

"C1 CompilerThread13" #19 daemon prio=9 os_prio=0 tid=0x00007f8748471800 
nid=0x7732 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
    JavaThread state: _thread_in_native
Thread: 0x00007f8748471800  [0x7732] State: _at_safepoint 
_has_called_back 0 _at_poll_safepoint 0
    JavaThread state: _thread_in_native
    Compiling:  716    b  2       java.util.regex.Pattern::compile (406 
bytes)

And Compiler.queue looks like this:

"Current compiles:
C1 CompilerThread14 435    b  2 java.net.URLStreamHandler::parseURL 
(1166 bytes)

C1 compile queue:
Empty

C2 compile queue:
Empty"


New webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.04/

Regards,
Nils

>
> Thanks,
> Vladimir
>
> On 1/22/16 6:40 AM, Nils Eliasson wrote:
>> Hi,
>>
>> Please review.
>>
>> Summary:
>> Firstly this change removes the unnecessary vm-ops from three compiler
>> diagnostic commands and adds locking instead.
>> Secondly the Compiler.queue diagnostic command is improved with printing
>> of any active compilations. I found this useful when diagnosing a
>> rogue VM.
>> Thirdly, as a bonus, I also add printing of active compilations in the
>> thread section of the hs_err file. Very useful when investigating VMs
>> terminated by a timeout.
>>
>> Testing:
>> This does not pass all tests yet. A few tests are dependent on the output
>> from the diagnostic command, and I want to be sure the reviewers are
>> happy with the output format first.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8063112
>> Webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.02/
>>
>> Regards,
>> Nils
>>


From ivan at azulsystems.com  Tue Jan 26 10:59:58 2016
From: ivan at azulsystems.com (Ivan Krylov)
Date: Tue, 26 Jan 2016 13:59:58 +0300
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
Message-ID: <56A751AE.9090203@azulsystems.com>

Hello,

Some of you may have seen a few e-mails on the core-libs alias about a 
proposed "spin wait hint". The JEP is forming up nicely at 
https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a 
consensus on the API side. It is now in a draft state and I hope this 
JEP will get targeted for Java 9 shortly.  The upcoming API changes can 
be seen at the webrev:
http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/

At this time I would like to ask for a review of the hs-comp changes. 
The plan is to push the changes into the class libraries and hotspot 
synchronously, but that may happen only after the JEP gets targeted.

Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/

The idea of the fix is pretty simple: hotspot replaces a call to 
java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 
'pause' instruction on x86.  This intrinsic is guarded by the 
-XX:±UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is 
verification code that makes sure the flag is off, so the VM will just 
execute the empty method java.lang.Runtime.onSpinWait(), effectively a 
no-op. According to [1] the 'pause' instruction is functional since 
SSE2, but even on CPUs prior to SSE2 the 'pause' instruction is a no-op 
and hence harmless, so there seems to be no need to add guarding code 
for older generations of Intel CPUs.
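
For illustration, here is a minimal sketch of the kind of busy-wait loop 
the hint is meant for. It is not taken from the webrev. It assumes a JDK 
in which the hint is available as Thread.onSpinWait() (the name it 
carries in released JDK 9 and later, whereas this mail discusses it as 
java.lang.Runtime.onSpinWait()); the class, field and method names are 
made up for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example class: a consumer spins until a producer raises a flag.
final class SpinWaitSketch {
    private static final AtomicBoolean ready = new AtomicBoolean(false);
    // Plain field; it is published by the volatile write in ready.set(true).
    private static int payload;

    // Starts a producer thread, then busy-waits for its result.
    static int awaitPayload() {
        Thread producer = new Thread(() -> {
            payload = 42;      // write the data first
            ready.set(true);   // then raise the flag (volatile write)
        });
        producer.start();
        while (!ready.get()) {
            // Spin-wait hint. Assumption: on x86 with the intrinsic enabled
            // this becomes a 'pause'; elsewhere it is an empty method call.
            Thread.onSpinWait();
        }
        return payload;
    }

    public static void main(String[] args) {
        System.out.println(awaitPayload());
    }
}
```

The loop is correct with or without the hint; the hint only tells the 
CPU that the thread is spinning.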

The proposed patch includes a simple regression test that simply makes 
sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  
There are several other producer-consumer-like performance tests ready 
that the authors of this JEP would be happy to make available under 
JEP-230 but I am uncertain about the process.

Thanks,

Ivan

[1]  - 
https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops 


From dmitry.fazunenko at oracle.com  Tue Jan 26 12:02:29 2016
From: dmitry.fazunenko at oracle.com (Dmitry Fazunenko)
Date: Tue, 26 Jan 2016 15:02:29 +0300
Subject: RFR(S) : 8148012 : get rid of slash-dot-dot in @library directives
In-Reply-To: <BC1FB2A4-FAD3-4AEA-873E-D6027D2CA646@oracle.com>
References: <BC1FB2A4-FAD3-4AEA-873E-D6027D2CA646@oracle.com>
Message-ID: <56A76055.5070709@oracle.com>

Hi Igor,

the GC part of the fix looks good!
Thank you for taking care of the GC tests.

-- Dima

On 25.01.2016 22:36, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8148012/webrev.00/
>> 31 lines changed: 0 ins; 0 del; 31 mod;
> Having 'external.lib.roots=/../../' in TEST.ROOT made it possible not to use '/..' in tests. The latest change in JTreg (https://bugs.openjdk.java.net/browse/CODETOOLS-7901585) made it illegal to have references to directories outside a test suite, so it's required to fix all such entries before switching to the next jtreg version. The fix basically replaces all '/../../test/lib' with '/test/lib' in @library directives.
>
> testing: run the affected tests
> jbs: https://bugs.openjdk.java.net/browse/JDK-8148012
>
> Thanks,
> Igor


From edward.nevill at gmail.com  Tue Jan 26 14:25:08 2016
From: edward.nevill at gmail.com (Edward Nevill)
Date: Tue, 26 Jan 2016 14:25:08 +0000
Subject: RFR: 8148240: random infrequent null pointer exceptions in javac
Message-ID: <1453818308.16279.8.camel@mylittlepony.linaroharston>

Hi,

Please review the following webrev

http://cr.openjdk.java.net/~enevill/8148240/webrev/

Jira issue: https://bugs.openjdk.java.net/browse/JDK-8148240

The patch simply works around the issue by disabling FP as an allocatable register.

Thanks,
Ed.


From aph at redhat.com  Tue Jan 26 14:34:58 2016
From: aph at redhat.com (Andrew Haley)
Date: Tue, 26 Jan 2016 14:34:58 +0000
Subject: [aarch64-port-dev ] RFR: 8148240: random infrequent null pointer
	exceptions in javac
In-Reply-To: <1453818308.16279.8.camel@mylittlepony.linaroharston>
References: <1453818308.16279.8.camel@mylittlepony.linaroharston>
Message-ID: <56A78412.9060203@redhat.com>

On 01/26/2016 02:25 PM, Edward Nevill wrote:
> Hi,
> 
> Please review the following webrev
> 
> http://cr.openjdk.java.net/~enevill/8148240/webrev/
> 
> Jira issue: https://bugs.openjdk.java.net/browse/JDK-8148240
> 
> The patch simply works around the issue by disabling FP as an allocatable register.

Yes, thanks.  OK for JDK9 and backports to AArch64 JDK8 and 7.

Andrew.


From zoltan.majo at oracle.com  Tue Jan 26 16:43:15 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Tue, 26 Jan 2016 17:43:15 +0100
Subject: [9] RFR (S): 8146478: Node limit exceeded with
	-XX:AllocateInstancePrefetchLines=1073741823
Message-ID: <56A7A223.9050403@oracle.com>

Hi,


please review the patch for 8146478.

https://bugs.openjdk.java.net/browse/JDK-8146478

Problem: Setting a high value for AllocateInstancePrefetchLines can 
trigger an assert in the C2 compiler. The reason is that the number of 
live nodes exceeds the maximum node limit. The same problem can happen 
if AllocateInstanceLines is given a high value.

Solution:
Limit the range for AllocateInstancePrefetchLines/AllocateInstanceLines 
to 8. I picked the value 8 because
- (1) the maximum possible value for these flags is 4/2, so having a 
slightly higher value than 4/2 still allows for some experiments;
- (2) the node_check() in PhaseMacroExpand::expand_macro_nodes() assumes 
that each macro node expansion will generate <75 new nodes. The number 
of nodes generated by expand_allocate_array()/expand_allocate() for 8 
prefetched lines closely fits into that margin (experimentally verified).

In addition, I removed some code that is now unnecessary because 
of the range checks we have in place.


Webrev:
http://cr.openjdk.java.net/~zmajo/8146478/webrev.00/

Testing:
- JPRT: All JTREG hotspot tests, incl. TestOptionsWithRanges.java

Thank you and best regards,


Zoltan


From vladimir.kozlov at oracle.com  Tue Jan 26 19:01:15 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 26 Jan 2016 11:01:15 -0800
Subject: [9] RFR (S): 8146478: Node limit exceeded with
	-XX:AllocateInstancePrefetchLines=1073741823
In-Reply-To: <56A7A223.9050403@oracle.com>
References: <56A7A223.9050403@oracle.com>
Message-ID: <56A7C27B.8050004@oracle.com>

Where does the 4/2 number come from? Some SPEC runs used a higher number:

-XX:AllocatePrefetchLines=16

http://spec.org/jbb2005/results/res2009q1/jbb2005-20081203-00563.html

I would suggest something like 64 - I have never seen a number that high used.

Also, please limit the AllocatePrefetchStepSize range. It corresponds to 
the cache line size. I would say 512, to be future-proof, with an assert 
that checks that its setting in vm_Version_<arch>.cpp is in this range.

For the case AllocatePrefetchStyle == 2 number of lines is calculated as:

uint lines = AllocatePrefetchDistance / AllocatePrefetchStepSize;

Since the AllocatePrefetchDistance limit is big, you can get a lot of 
nodes again. Maybe also set its limit to 
AllocatePrefetchLines*AllocatePrefetchStepSize, e.g. 64*32 = 2048.
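
To make the arithmetic concrete, here is a small sketch of how the line 
count for AllocatePrefetchStyle == 2 follows from the two flags. The 
class and method names are hypothetical; the formula is the one quoted 
above, and the inputs are the limits floated in this mail plus one 
arbitrary large distance (the value from the subject line) for contrast.

```java
// Illustrative arithmetic only; not HotSpot code.
final class PrefetchLinesSketch {
    // Mirrors: uint lines = AllocatePrefetchDistance / AllocatePrefetchStepSize;
    static int lines(int allocatePrefetchDistance, int allocatePrefetchStepSize) {
        if (allocatePrefetchStepSize <= 0) {
            throw new IllegalArgumentException("step size must be positive");
        }
        return allocatePrefetchDistance / allocatePrefetchStepSize;
    }

    public static void main(String[] args) {
        // With the distance capped at 64*32 = 2048 and a 32-byte step,
        // the number of prefetched lines is bounded.
        System.out.println(lines(2048, 32));
        // With an effectively uncapped distance the line count, and with it
        // the number of generated nodes, grows accordingly.
        System.out.println(lines(1073741823, 32));
    }
}
```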

Thanks,
Vladimir

On 1/26/16 8:43 AM, Zoltán Majó wrote:
> Hi,
>
>
> please review the patch for 8146478.
>
> https://bugs.openjdk.java.net/browse/JDK-8146478
>
> Problem: Setting a high value for AllocateInstancePrefetchLines can
> trigger an assert in the C2 compiler. The reason is that the number of
> live nodes exceeds the maximum node limit. The same problem can happen
> if AllocateInstanceLines is given a high value.
>
> Solution:
> Limit the range for AllocateInstancePrefetchLines/AllocateInstanceLines
> to 8. I picked the value 8 because
> - (1) the maximum possible value for these flags is 4/2, so having a
> slightly higher value than 4/2 still allows for some experiments;
> - (2) the node_check() in PhaseMacroExpand::expand_macro_nodes() assumes
> that each macro node expansion will generate <75 new nodes. The number
> of nodes generated by expand_allocate_array()/expand_allocate() for 8
> prefetched lines closely fits into that margin (experimentally verified).
>
> In addition, I removed some code that is now unnecessary because
> of the range checks we have in place.
>
>
> Webrev:
> http://cr.openjdk.java.net/~zmajo/8146478/webrev.00/
>
> Testing:
> - JPRT: All JTREG hotspot tests, incl. TestOptionsWithRanges.java
>
> Thank you and best regards,
>
>
> Zoltan
>

From christian.thalinger at oracle.com  Tue Jan 26 20:16:05 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 26 Jan 2016 10:16:05 -1000
Subject: RFR: 8148202: move lookup of Java class and hub from
	ResolvedJavaType to ConstantReflectionProvider
In-Reply-To: <D6D9620C-A850-4BD5-BD25-7226958037A7@oracle.com>
References: <D6D9620C-A850-4BD5-BD25-7226958037A7@oracle.com>
Message-ID: <B495046A-9717-4390-AFB0-033DD42ACA75@oracle.com>

Looks good.  For the record, this was contributed by Christian Wimmer.

> On Jan 25, 2016, at 11:28 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Most access to VM constants in JVMCI goes through the ConstantReflectionProvider interface meaning the VM implementation for constant handling is in one place. For historic reasons, some small amount of reflection on VM constants was located in ResolvedJavaType. This issue consolidates these methods to ConstantReflectionProvider.
> 
> https://bugs.openjdk.java.net/browse/JDK-8148202
> http://cr.openjdk.java.net/~dnsimon/8148202
> 
> -Doug


From christian.thalinger at oracle.com  Tue Jan 26 21:05:01 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 26 Jan 2016 11:05:01 -1000
Subject: RFR: 8148202: move lookup of Java class and hub from
	ResolvedJavaType to ConstantReflectionProvider
In-Reply-To: <B495046A-9717-4390-AFB0-033DD42ACA75@oracle.com>
References: <D6D9620C-A850-4BD5-BD25-7226958037A7@oracle.com>
	<B495046A-9717-4390-AFB0-033DD42ACA75@oracle.com>
Message-ID: <68B93E91-BBFA-4C14-A427-18FB7A6AC842@oracle.com>

Correction, two tests fail:

FAILED: compiler/jvmci/code/DataPatchTest.java
FAILED: compiler/jvmci/code/SimpleDebugInfoTest.java

> On Jan 26, 2016, at 10:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> Looks good.  For the record, this was contributed by Christian Wimmer.
> 
>> On Jan 25, 2016, at 11:28 PM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> Most access to VM constants in JVMCI goes through the ConstantReflectionProvider interface meaning the VM implementation for constant handling is in one place. For historic reasons, some small amount of reflection on VM constants was located in ResolvedJavaType. This issue consolidates these methods to ConstantReflectionProvider.
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8148202
>> http://cr.openjdk.java.net/~dnsimon/8148202
>> 
>> -Doug
> 


From doug.simon at oracle.com  Tue Jan 26 21:23:19 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 26 Jan 2016 22:23:19 +0100
Subject: RFR: 8148202: move lookup of Java class and hub from
	ResolvedJavaType to ConstantReflectionProvider
In-Reply-To: <68B93E91-BBFA-4C14-A427-18FB7A6AC842@oracle.com>
References: <D6D9620C-A850-4BD5-BD25-7226958037A7@oracle.com>
	<B495046A-9717-4390-AFB0-033DD42ACA75@oracle.com>
	<68B93E91-BBFA-4C14-A427-18FB7A6AC842@oracle.com>
Message-ID: <986242A4-21E8-4EB7-9486-EB1A49D0E365@oracle.com>

I fixed these tests and updated the webrev in situ.

> On 26 Jan 2016, at 22:05, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> Correction, two tests fail:
> 
> FAILED: compiler/jvmci/code/DataPatchTest.java
> FAILED: compiler/jvmci/code/SimpleDebugInfoTest.java
> 
>> On Jan 26, 2016, at 10:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> Looks good.  For the record, this was contributed by Christian Wimmer.
>> 
>>> On Jan 25, 2016, at 11:28 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> Most access to VM constants in JVMCI goes through the ConstantReflectionProvider interface meaning the VM implementation for constant handling is in one place. For historic reasons, some small amount of reflection on VM constants was located in ResolvedJavaType. This issue consolidates these methods to ConstantReflectionProvider.
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8148202
>>> http://cr.openjdk.java.net/~dnsimon/8148202
>>> 
>>> -Doug
>> 
> 


From christian.thalinger at oracle.com  Tue Jan 26 21:28:41 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Tue, 26 Jan 2016 11:28:41 -1000
Subject: RFR: 8148202: move lookup of Java class and hub from
	ResolvedJavaType to ConstantReflectionProvider
In-Reply-To: <986242A4-21E8-4EB7-9486-EB1A49D0E365@oracle.com>
References: <D6D9620C-A850-4BD5-BD25-7226958037A7@oracle.com>
	<B495046A-9717-4390-AFB0-033DD42ACA75@oracle.com>
	<68B93E91-BBFA-4C14-A427-18FB7A6AC842@oracle.com>
	<986242A4-21E8-4EB7-9486-EB1A49D0E365@oracle.com>
Message-ID: <4CA58378-4063-466A-832F-F225EA39EA71@oracle.com>

Now we are good:

Test results: passed: 61; error: 4

> On Jan 26, 2016, at 11:23 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> I fixed these tests and updated the webrev in situ.
> 
>> On 26 Jan 2016, at 22:05, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>> 
>> Correction, two tests fail:
>> 
>> FAILED: compiler/jvmci/code/DataPatchTest.java
>> FAILED: compiler/jvmci/code/SimpleDebugInfoTest.java
>> 
>>> On Jan 26, 2016, at 10:16 AM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
>>> 
>>> Looks good.  For the record, this was contributed by Christian Wimmer.
>>> 
>>>> On Jan 25, 2016, at 11:28 PM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> Most access to VM constants in JVMCI goes through the ConstantReflectionProvider interface meaning the VM implementation for constant handling is in one place. For historic reasons, some small amount of reflection on VM constants was located in ResolvedJavaType. This issue consolidates these methods to ConstantReflectionProvider.
>>>> 
>>>> https://bugs.openjdk.java.net/browse/JDK-8148202
>>>> http://cr.openjdk.java.net/~dnsimon/8148202
>>>> 
>>>> -Doug
>>> 
>> 
> 


From john.r.rose at oracle.com  Tue Jan 26 22:48:22 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 26 Jan 2016 14:48:22 -0800
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <569F7558.1030800@oracle.com>
References: <569CE098.4030807@oracle.com>
	<894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
	<569F7558.1030800@oracle.com>
Message-ID: <CE0BD2EB-718D-4F38-9C1C-C679F42DC904@oracle.com>

What I would like to see is for users to feel free to use CallSites
with any amount of mutability, and have the JVM pick a good
strategy for speculating and optimizing through CS target bindings.

By "good" I mean that, if the CS is not megamutable, you get
the performance comparable to an "invokestatic".  But if the
CS *is* megamutable (unstable), it is not "good" (IMO) to issue
a storm of recompilations, especially if (as is usually the case)
the megamutable CS is one of 1000s of other call sites in the
same code blob, all of which must be recompiled because one
CS had a problem.

Instead, the megamutable CS should be downgraded to an
indirect call through a normal (or volatile) variable.

So does this leave some performance on the floor?  Of course;
perhaps the CS finally settles down long enough for the JVM
to venture a profitable recompilation, and for the cost of recompilation
to be paid off by further stability and efficient execution of the CS.

My main point here is that reoptimization of megamutables is
a misuse of speculation.  I'm not saying that the JIT should have
a tantrum and refuse to compile the call site (which is a bug),
but it should stop speculating that it is stable when in fact it is not.

There are lots of ways to improve the performance of megamutables,
but unconditional recompilation is not one of those ways.  It uses
a wrecking ball to swat a fly.

Handling megamutables is very much like handling megamorphics.
You want to hang on to the hope that there are really just a few
branches (common case) and optimize those, and call out-of-line
for the rest.  If that hope fails, you call out-of-line always.  And
you want to detect if the statistics change, where the entropy of
the CS target goes down to a small number, so you can venture
another recompile with up-to-date speculation.  We should apply
these techniques to both megamorphics and megamutables.

So there's an ambiguity in the contract:  Is CS speculation just
a best-efforts kind of thing, or is the JVM contracted to mechanically
recompile on every CS change?  I think the reasonable reading
of the javadoc (etc.) is the first, not the second.

How would a user communicate that his CS is a special one,
whose invalidation should *always* trigger reoptimization?
I don't know, maybe an integer-valued callback that is triggered
during setTarget calls, and returns the amount of (virtual)
time before the next reoptimization should be attempted.
The callback would be passed the number of previous
reoptimizations (at this site or in the whole method or
both), as a warning of how resource-intensive this CS
is becoming.  Returning constant zero means the
current behavior.  I think you can see lots of problems
with such an API.

And, I think that sort of thing isn't notably better than simple
JVM heuristics.  Here's how I think we should fix the
megamutable problem:

1. Speculate at first that a CS is immutable.

2. If that fails, speculate that it is stable, as:
   if (cs.t == expected) inline expected(); else outline cs.t();
Collect a profile count along the outline path.

3. Every once in a while, if a code blob is accumulating
outline counts, queue it for reoptimization.
Crucially, do this in such a way that the JIT does
not become a foreground consumer of CPU cycles.

4. When recompiling a stable call site, always
inline the current target ("this time fer sure!").
Maybe if this is a *really* bad actor (but how
can you tell?) forget the speculation part.

5. Maybe, speculate on the LF of the target,
not the target itself, to allow some degree of
harmless variation by targets.  (For some
codes that will help, although it interacts
with MH customization in tricky ways.)

6. Maybe fiddle with collecting previous hot targets,
or (better) empower the JDK code to manage that stuff.
PIC logic should be handled at the JDK level,
not in the JIT.
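
Steps 1 to 4 above can be sketched as a toy model in plain Java. This is
only an illustration under stated assumptions, not JVM code: the class
name, the threshold and the counters are hypothetical, an
IntUnaryOperator stands in for the call site target, and the
"recompile" happens synchronously on the calling thread, whereas the
proposal queues it in the background.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical model of a call site that speculates on a stable target.
final class StableCallSiteSketch {
    private volatile IntUnaryOperator target;  // plays the role of cs.t
    private IntUnaryOperator expected;         // target the "compiled code" speculated on
    private int outlineCount;                  // profile count along the outline path
    private int recompiles;                    // how often we re-speculated
    private final int threshold;               // misses tolerated before re-speculating

    StableCallSiteSketch(IntUnaryOperator initial, int threshold) {
        this.target = initial;
        this.expected = initial;               // step 1: assume the first target sticks
        this.threshold = threshold;
    }

    void setTarget(IntUnaryOperator newTarget) { target = newTarget; }

    int invoke(int arg) {
        IntUnaryOperator t = target;
        if (t == expected) {
            return t.applyAsInt(arg);          // step 2, fast path: "inline expected()"
        }
        outlineCount++;                        // step 2, slow path: "outline cs.t()"
        if (outlineCount >= threshold) {       // step 3: enough misses accumulated
            expected = t;                      // step 4: speculate on the current target
            outlineCount = 0;
            recompiles++;
        }
        return t.applyAsInt(arg);              // the call still completes on the slow path
    }

    int recompiles() { return recompiles; }
    int outlineCount() { return outlineCount; }

    public static void main(String[] args) {
        StableCallSiteSketch cs = new StableCallSiteSketch(x -> x + 1, 3);
        System.out.println(cs.invoke(1));      // fast path
        cs.setTarget(x -> x * 2);              // the target changes once
        for (int i = 0; i < 5; i++) {
            System.out.println(cs.invoke(5));  // three misses, then fast path again
        }
        System.out.println("recompiles: " + cs.recompiles());
    }
}
```

The point of the model is that a target change costs a bounded number of
slow calls and at most one re-speculation per threshold misses, rather
than one recompilation per change.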

Anyway, if the above gets addressed eventually,
or if the rest of the MLVM crew proves that I don't
know what I'm talking about, I'm OK with this fix.

"Reviewed", assuming future improvements.

-- John

On Jan 20, 2016, at 3:54 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> John, Chris, thanks for the feedback.
> 
> I don't think it is only about microbenchmarks. Long-running large applications with lots of mutable call sites should also benefit from this change. Current JVM behavior counts invalidations on the root method, so nmethods with multiple mutable call sites (from root & all inlined callees) are more likely to hit the limit, even if there are no mega-mutable sites. It just sums up, and PerMethodRecompilationCutoff (= 400, by default) doesn't look like a huge number.
> 
> Also, LambdaForm sharing somewhat worsens the situation. When LambdaForms were mostly customized, different method handle chains were compiled into a single nmethod. Right now, it means that not only is the root method always interpreted, but all bound method handle chains are broken into numerous per-LF nmethods (see JDK-8069591 for some details).
> 
> MLVM folks, I'd like to hear your opinion about what kind of behavior you expect from the JVM w.r.t. mutable call sites.
> 
> There are valid use-cases when JVM shouldn't throttle the recompilation (e.g., long-running application with indy-based dynamic tracing). Maybe there's a place for a new CallSite flavor to clearly communicate application expectations to the JVM? Either always recompile (thus eventually reaching peak performance) or give up and generate less efficient machine code, but save on possible recompilations.
> 
> Best regards,
> Vladimir Ivanov
> 
> On 1/20/16 2:37 AM, John Rose wrote:
>> On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov
>> <vladimir.x.ivanov at oracle.com> wrote:
>>> 
>>> The fix is to avoid updating recompilation count when corresponding
>>> nmethod is invalidated due to a call site target change.
>> 
>> Although I'm not vetoing it (since it seems it will help customers in
>> the short term), I'm uncomfortable with this fix because it doesn't
>> scale to large dyn. lang. applications with many unstable call sites.
>>  Put another way, it feels like we are duct-taping down a failsafe
>> switch (against infinite recompilation) in order to spam a
>> micro-benchmark:  a small number of mega-mutable call sites for which we
>> are willing to spend (potentially) all of the JIT resources, including
>> those usually allocated to application performance in the steady state.
>>  Put a third way:  I am not comfortable with unthrottled infinite
>> recompilation as a performance strategy.
>> 
>> I've commented on the new RFE (JDK-8147550) where to go next, including
>> the following sentiments:
>> 
>>> There is a serious design tension here, though: Some users apparently
>>> are willing to endure an infinite series of recompilations as part of
>>> the cost of doing business; JDK-7177745 addresses this need by turning
>>> off the fail-safe against (accidental, buggy) infinite recompilation
>>> for unstable CSs. Other users might find that having a percentage of
>>> machine time devoted to recompilation is a problem. (This has been the
>>> case in the past with non-dynamic languages, at least.) The code shape
>>> proposed in this bug report would cover all simple unstable call
>>> sites (bi-stable, for example, would compile to a bi-morphic call),
>>> but, in pathological cases (infinite sequence of distinct CS targets)
>>> would "settle down" into a code shape that would be sub-optimal for
>>> any single target, but (as an indirect MH call) reasonable for all the
>>> targets together.
>>> 
>>> In the absence of clear direction from the user or the profile, the
>>> JVM has to choose infinite recompilation or a good-enough final
>>> compilation. The latter choice is safer. And the
>>> infinite recompilation is less safe because there is no intrinsic
>>> bound on the amount of machine cycles that could be diverted to
>>> recompilation, given a dynamic language application with
>>> enough mega-mutable CSs. Settling down to a network of indirect calls
>>> has a bounded cost.
>>> 
>>> Yes, one size-fits-all tactics never please everybody. But the JVM
>>> should not choose tactics with unlimited downsides.
>> 
>> -- John


From john.r.rose at oracle.com  Tue Jan 26 23:18:53 2016
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 26 Jan 2016 15:18:53 -0800
Subject: [9] RFR (S): 7177745: JSR292: Many Callsite relinkages cause
	target method to always run in interpreter mode
In-Reply-To: <2036838501.1079316.1453292009390.JavaMail.zimbra@u-pem.fr>
References: <569CE098.4030807@oracle.com>
	<894B7E15-D940-4EC5-8E4B-CF48B557A86D@oracle.com>
	<2036838501.1079316.1453292009390.JavaMail.zimbra@u-pem.fr>
Message-ID: <525DDA2D-9676-43CF-8D75-C5ED52031E73@oracle.com>

On Jan 20, 2016, at 4:13 AM, Remi Forax <forax at univ-mlv.fr> wrote:
> 
> I understand that a VM that may always recompile may be seen as a bug,
> but a VM that bails out and stops recompiling, or more generally changes the compilation strategy, is a bug too.

As you can guess from my previous message, I agree with this, except for
"change the compilation strategy".  The JVM earns its way in the world by
routinely changing compilation strategy.  The reason most people don't
notice is the strategy changes are profile-driven and self-correcting.

Nothing in the 292 world promises a particular strategy, just a best effort
to create and execute great code, assuming stable application behavior.

When an optimization breaks, the JVM's strategy may also fail to adjust
correctly.  One symptom of that is infinite recompilation, usually because
one line of code is being handled badly, but which creates huge bloat
in the code cache for thousands of lines of code that happen to be
inlined nearby.  We try hard to avoid this.

We also try hard to detect this problem.  That is the true meaning of those
strange cutoffs.  Nobody thinks falling into the interpreter is a good idea,
except that it, on balance, is a better idea than (a) throwing an assertion
error, or (b) filling the CPU with JIT jobs and the code cache with discards.
The third choice, (c) running the offending method in the interpreter, at least
preserves a degree of forward progress, while allowing the outraged
user to report a bug.

The correct fix to the bug, IMO, is never to jump from (c) to (a) or (b).
It is to find and fix the problem with the compilation strategy, and the
profile-driven gating logic for it.

If your car's transmission gets a bug (now that they are computers,
they can), what would you prefer?
(a) stop the car immediately,
(b) run the car in first gear at full speed, or
(c) slow the car to a defined speed limit (25mph).
Detroit prefers, and the JVM implements, option (c).

> The problem here is that there is no way, from the point of view of a dyn lang runtime, to know what the behavior of the VM will be for a callsite if the VM decides to stop recompiling, decides not to inline, decides to inline some part of the tree, etc.

Yes.  And it usually doesn't matter; the issue doesn't come up until something breaks,
or we find a performance pothole.  The current problem is (in my mind) a break, not
a performance pothole that needs tuning.  If we fix the break, people shouldn't need
to worry about this stuff, usually.

> Said differently, using an invokedynamic allows to create code shapes that will change dynamically, if the VM behavior also changes dynamically, it's like building a wall on moving parts, the result is strange dynamic behaviors that are hard to diagnose and reproduce.

JVMs have always been like that, because of dynamic class loading, but with indy
it is more so, since it's much easier to "override" some previously fixed behavior.

> The recompilation behavior of the VM should be kept simple and predictable; basically, the VM should always recompile the CS with no failsafe switch.

We agree that the failsafe should not trip.  Just like we agree that
the circuit breakers in our building should not trip.  We disagree,
perhaps, about what to do when they trip.  I don't want to duct-tape
them back into the "on" position; do you?

> If dyn lang runtime devs have trouble with that, they can already use an exactInvoker to simulate an indirect mh call and we can even provide new method handle combiners to gracefully handle multi-stable CS.

That's all true.  The new combiners might have some sort
of handshake with the JVM to self-adjust their code shape.
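
As a rough illustration of the exactInvoker approach mentioned above,
here is a minimal sketch of an indirect method handle call through a
volatile variable. MethodHandles.exactInvoker and invokeExact are the
standard java.lang.invoke API; the class, the two sample targets and the
helper methods are hypothetical names chosen for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example: the target lives in a volatile field and is
// re-read on every call, so changing it never invalidates compiled code.
final class IndirectCallSketch {
    private static final MethodType INT_TO_INT =
            MethodType.methodType(int.class, int.class);
    // Invoker of type (MethodHandle, int)int: calls its first argument.
    private static final MethodHandle INVOKER =
            MethodHandles.exactInvoker(INT_TO_INT);
    private static volatile MethodHandle target = find("twice");

    static int twice(int x) { return x * 2; }
    static int negate(int x) { return -x; }

    // Looks up one of the sample targets by name.
    static MethodHandle find(String name) {
        try {
            return MethodHandles.lookup()
                    .findStatic(IndirectCallSketch.class, name, INT_TO_INT);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static void retarget(String name) { target = find(name); }

    // The indirect call: no speculation on which target is current.
    static int call(int arg) {
        try {
            return (int) INVOKER.invokeExact(target, arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(21));
        retarget("negate");
        System.out.println(call(21));
    }
}
```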

But I claim the baseline behavior that I have called for is
the most generally useful, since it is able to amortize
recompilation resources over multiple CS misses,
put global limits on total recompilation effort, and
preserve reasonable forward progress executing
good-enough code.

(Having a CS change force a reoptimization is tantamount to
adding a JIT control API such as Compiler.recompile(cs), analogous
to System.gc(), but just for CS-bearing methods.  We are
a long way from understanding how to work such an API.)

Idea:  Perhaps CS's should have a callback which says,
"Hey, CS, the JIT has mispredicted you a bunch of times;
would you like to nominate an alternative representation?"
The call would be made asynchronously, outside the JIT.
The default behavior would be to say "nope" with the results
given above, but the CS could also return a MH (perhaps
the CS.dynamicInvoker, or perhaps some more elaborate logic),
which the JVM would slide into place over the top of the CS.
Despite the fact that CS bindings are final, the new binding
would take its place.  And it would be the user's choice
whether that binding pointed to the old CS or a new CS or
some combination of both.

-- John

From igor.veresov at oracle.com  Wed Jan 27 00:44:53 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 26 Jan 2016 16:44:53 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <56A751AE.9090203@azulsystems.com>
References: <56A751AE.9090203@azulsystems.com>
Message-ID: <A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>

So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that's intentional, I wonder why.

igor

> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
> 
> Hello,
> 
> Some of you may have seen a few e-mails on the core-libs alias about a proposed "spin wait hint". The JEP is forming up nicely at https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for Java 9 shortly.  The upcoming API changes can be seen at the webrev:
> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
> 
> At this time I would like to ask for a review of the hs-comp changes. The plan is to push changes into class libraries and hotspot synchronously, but that may happen after the JEP gets targeted.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
> 
> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is verification code that makes sure the flag is off; the VM will just execute the empty method java.lang.Runtime.onSpinWait(), effectively a no-op. According to [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence harmless, so there seems to be no need to add guarding code for older generations of Intel CPUs.
> 
> The proposed patch includes a simple regression test that makes sure that the method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
> 
> Thanks,
> 
> Ivan
> 
> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops

From vitalyd at gmail.com  Wed Jan 27 02:15:25 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 26 Jan 2016 21:15:25 -0500
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
Message-ID: <CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>

Subsequent loads at this point will likely be polls of the same memory
location that just failed a test, and the author inserted a pause.  It's unlikely
that the memory changed that quickly and scheduling the next load before
the pause is equivalent to two loads back to back essentially, which
wouldn't make sense given the intended usage.  There's also the risk that
the compiler would move enough of those load+test pairs before the pause
and fill up the speculative pipeline with them; that pipeline will need to
be flushed once the spin exits since those load instructions likely
speculated incorrectly.  And here we're basically describing the reason for
putting pause there in the first place :).
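
For reference, the kind of loop under discussion looks roughly like the
following Java sketch. It is a minimal illustration, not code from the webrev.
It assumes the hint is reachable as Thread.onSpinWait(), the form the API took
in Java 9 and later, rather than the Runtime.onSpinWait() spelling used in this
thread; the class, field and method names are made up for the example. The
polled location is a volatile field, and the hint sits between one failed poll
and the next.

```java
public class SpinWaitSketch {
    private static volatile boolean ready;   // the location being polled
    private static int payload;              // published by the volatile store to ready

    public static int awaitPayload() {
        ready = false;
        Thread producer = new Thread(() -> {
            payload = 42;    // plain store, ordered before the volatile store below
            ready = true;    // volatile store that ends the consumer's spin
        });
        producer.start();

        // Each iteration: volatile load + test, then the spin-wait hint.
        while (!ready) {
            Thread.onSpinWait();
        }
        return payload;      // visible because the volatile read saw ready == true
    }

    public static void main(String[] args) {
        System.out.println(awaitPayload());
    }
}
```

Whether the hint actually becomes a 'pause' instruction depends on the JVM and
platform; the Java code behaves the same either way.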


From igor.veresov at oracle.com  Wed Jan 27 03:56:52 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 26 Jan 2016 19:56:52 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
Message-ID: <0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>

Wouldn't you use a volatile load for the memory location you're polling?

igor


From vitalyd at gmail.com  Wed Jan 27 04:08:00 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 26 Jan 2016 23:08:00 -0500
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
Message-ID: <CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>

You would, but a subsequent volatile load could move before the pause.  If you
unroll the loop, you could (theoretically) end up with all loads moved
before the pause but all appearing ordered with respect to each other, e.g.:

cmp addr, 0 // from iteration 1
je label
cmp addr, 0 // from iteration 2
je label
...
pause

What prevents that if pause is not a compiler member?


From igor.veresov at oracle.com  Wed Jan 27 04:47:43 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 26 Jan 2016 20:47:43 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
	<CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
Message-ID: <D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>


> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> You would but subsequent volatile load could move before the pause.  If you unroll the loop, you could (theoretically) end up with all loads moved before the pause but all appearing ordered with respect to each other, eg:
> 
> cmp addr, 0 // from iteration 1
> je label
> cmp addr, 0 // from iteration 2
> je label
> ...
> pause
> 
> What prevents that if pause is not a compiler member?
> 

I think volatile loads explicitly depend on control. If the pause node consumes and produces control, it all should be in a rigid control chain. Other regular loads (that don't have control dependencies) would still be free to move around.

igor


From vitalyd at gmail.com  Wed Jan 27 05:35:37 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 27 Jan 2016 00:35:37 -0500
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
	<CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
	<D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>
Message-ID: <CAHjP37GreGM6-k0q3HsapynaBykjhxrFY7qJzdMf=iWFo__vSw@mail.gmail.com>

On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com> wrote:

>
> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
> You would but subsequent volatile load could move before the pause.  If
> you unroll the loop, you could (theoretically) end up with all loads moved
> before the pause but all appearing ordered with respect to each other, eg:
>
> cmp addr, 0 // from iteration 1
> je label
> cmp addr, 0 // from iteration 2
> je label
> ...
> pause
>
> What prevents that if pause is not a compiler member?
>
>
> I think volatile loads explicitly depend on control. If the pause node
> consumes and produces control it all should be in a rigid control chain.
> Other regular loads (that don't have control dependencies) would still be
> free to move around.
>

Is this to avoid out-of-thin-air values? That is, suppose you have:

if (some condition)
    read volatile (or regular)

A regular load can be scheduled before the if, with the result used only if
control reaches there.  A volatile load cannot be scheduled above the if,
since the value can be bogus at that point?

Is it safe for compiler to assume that something else anchors loads around
the pause?

That aside, given the intended usage, I'm not sure what other regular loads
would be there.  The usage is a tight spin loop waiting for exit condition
to be met.  Although I suppose if compiler sees regular loads after the
loop exits successfully, perhaps scheduling them before the loop can be
beneficial.  Is that what you have in mind?



From igor.veresov at oracle.com  Wed Jan 27 06:03:13 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 26 Jan 2016 22:03:13 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <CAHjP37GreGM6-k0q3HsapynaBykjhxrFY7qJzdMf=iWFo__vSw@mail.gmail.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
	<CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
	<D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>
	<CAHjP37GreGM6-k0q3HsapynaBykjhxrFY7qJzdMf=iWFo__vSw@mail.gmail.com>
Message-ID: <C69F620F-75B7-4D53-8F67-C9018FAFF8F9@oracle.com>


> On Jan 26, 2016, at 9:35 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com> wrote:
> 
>> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> 
>> You would but subsequent volatile load could move before the pause.  If you unroll the loop, you could (theoretically) end up with all loads moved before the pause but all appearing ordered with respect to each other, eg:
>> 
>> cmp addr, 0 // from iteration 1
>> je label
>> cmp addr, 0 // from iteration 2
>> je label
>> ...
>> pause
>> 
>> What prevents that if pause is not a compiler member?
>> 
> 
> I think volatile loads explicitly depend on control. If the pause node consumes and produces control it all should be in a rigid control chain.
> Other regular loads (that don't have control dependencies) would still be free to move around.
> 
> Is this to avoid out of thin air values? That is, suppose you have:
> 
> if (some condition)
>     read volatile (or regular)

> 
> Regular load can be scheduled before the if and result used if control reaches there.  For volatile, load cannot be scheduled above the if since value can be bogus at that point?

Right. Regular reads can move up as far as the preceding memory effect that modified that alias index.

> 
> Is it safe for compiler to assume that something else anchors loads around the pause?
> 
> That aside, given the intended usage, I'm not sure what other regular loads would be there.  The usage is a tight spin loop waiting for exit condition to be met.  Although I suppose if compiler sees regular loads after the loop exits successfully, perhaps scheduling them before the loop can be beneficial.  Is that what you have in mind?


No just simple stuff like:

while (...) {
  a = x.f;
  pause();
  b = x.f;
}

If pause() is a wide memory kill, regular field loads around it obviously won't fold. So in the example above those field loads are both going to be there. I realize it's probably not a big deal in reality for the wait loops, but I was just wondering why make it a wide mem kill if membar nodes for volatiles (that will have to be in the loop) already have wide kill semantics.
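
A runnable version of that loop shape, as a minimal Java sketch: it assumes the
hint is spelled Thread.onSpinWait() (the Java 9+ form, not the
Runtime.onSpinWait() named in this thread), and the class and field names are
invented for the example. The field f is a plain (non-volatile) field, so the
two loads are the kind that could fold into one if the hint only carried
control, and would both stay if the hint is a wide memory kill. The program
cannot observe which happens; only the generated code would show it.

```java
public class PlainLoadsAroundPause {
    static final class Box { int f; }

    // Kept in a static field so the object is not trivially local to the loop.
    static final Box shared = new Box();

    public static long run(int iterations) {
        Box x = shared;
        x.f = 7;
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            int a = x.f;             // plain load before the hint
            Thread.onSpinWait();     // the "pause()" of the example above
            int b = x.f;             // plain load after the hint
            sum += a + b;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(1000));
    }
}
```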

igor 


> 
> 
> igor
> 
>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com <>> wrote:
>> Wouldn?t you use a volatile load for the memory location you?re polling?
>> 
>> igor
>> 
>>> On Jan 26, 2016, at 6:15 PM, Vitaly Davidovich <vitalyd at gmail.com <>> wrote:
>>> 
>>> Subsequent loads at this point will likely be polls of same memory location that just failed a test, and the author inserted a pause.  It's unlikely that the memory changed that quickly and scheduling the next load before the pause is equivalent to two loads back to back essentially, which wouldn't make sense given the intended usage.  There's also the risk that the compiler would move enough of those load+test pairs before the pause and fill up the speculative pipeline with them; that pipeline will need to be flushed once the spin exits since those load instructions likely speculated incorrectly.  And here we're basically describing the reason for putting pause there in the first place :).
>>> 
>>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com <>> wrote:
>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that?s intentional I wonder why is that?
>>> 
>>> igor
>>> 
>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com <>> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> Some of you may have a seen a few e-mails on the core-libs alias about a proposed ?spin wait hint?. The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832 <https://bugs.openjdk.java.net/browse/JDK-8147832>. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/>
>>>> 
>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>> 
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844 <https://bugs.openjdk.java.net/browse/JDK-8147844>
>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/>
>>>> 
>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:?UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is a verification code that makes sure the flag is off, VM will just execute at empty method java.lang.Runtime.onSpinWait() ? effectively a no-op. According the [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence harmless, there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>> 
>>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>>> 
>>>> Thanks,
>>>> 
>>>> Ivan
>>>> 
>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops <https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops>


From igor.veresov at oracle.com  Wed Jan 27 06:12:06 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 26 Jan 2016 22:12:06 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
Message-ID: <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>

I realize it's not a big deal. I was just wondering whether there was any specific reason control alone is not enough.
Anyway, looks OK for a first cut.

igor

> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com> wrote:
> 
> Since a sensible loop that calls onSpinWait() would include at least a volatile load on every iteration (and possibly a volatile store), the new node does not create significant extra move restrictions that are not already there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop. There are probably other ways to achieve this, but this one doesn't really have a performance downside.
> 
> -- Gil.
> 
>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com <mailto:igor.veresov at oracle.com>> wrote:
>> 
>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that's intentional I wonder why is that?
>> 
>> igor
>> 
>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com <mailto:ivan at azulsystems.com>> wrote:
>>> 
>>> Hello,
>>> 
>>> Some of you may have seen a few e-mails on the core-libs alias about a proposed "spin wait hint". The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832 <https://bugs.openjdk.java.net/browse/JDK-8147832>. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for Java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/>
>>> 
>>> At this time I would like to ask for a review of the hs-comp changes. The plan is to push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>> 
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844 <https://bugs.openjdk.java.net/browse/JDK-8147844>
>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/>
>>> 
>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is verification code that makes sure the flag is off; the VM will just execute the empty method java.lang.Runtime.onSpinWait(), effectively a no-op. According to [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence harmless, so there seems to be no need to add guarding code for older generations of Intel CPUs.
>>> 
>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>> 
>>> Thanks,
>>> 
>>> Ivan
>>> 
>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops <https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops>
> 


From igor.veresov at oracle.com  Wed Jan 27 06:30:39 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 26 Jan 2016 22:30:39 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <C69F620F-75B7-4D53-8F67-C9018FAFF8F9@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
	<CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
	<D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>
	<CAHjP37GreGM6-k0q3HsapynaBykjhxrFY7qJzdMf=iWFo__vSw@mail.gmail.com>
	<C69F620F-75B7-4D53-8F67-C9018FAFF8F9@oracle.com>
Message-ID: <412CE10D-193F-48D7-9E94-7D0C4DD9D6FD@oracle.com>

Or, to put it another way: the memory effect of the pause prevents ordinary loads from floating up. However, its control effect alone should be enough to prevent the _volatile_ loads from floating up, since they are control-dependent. Hence the original thought that the memory effect of the pause might be unnecessarily restrictive if it's used with volatile loads. But maybe I'm missing something.

igor
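
As a minimal sketch of the loop shape under discussion (not code from the webrev): a consumer polls a volatile flag and issues the spin-wait hint after each failed poll, with two plain loads of the same field around the hint, as in the quoted "a = x.f; pause(); b = x.f;" example. The sketch uses Thread.onSpinWait(), the form the hint has in released JDK 9 and later, rather than the Runtime.onSpinWait() proposed in this thread. The class, field and method names are made up for illustration.

```java
final class SpinWaitSketch {
    // Hypothetical holder: 'f' is a plain field, 'done' is the volatile exit condition.
    static final class Box {
        int f;
        volatile boolean done;
    }

    // Spins until box.done is set.
    // Returns { failed polls, iterations where the two plain loads differed }.
    static long[] spinUntilDone(Box box) {
        long spins = 0;
        long differing = 0;
        while (!box.done) {        // volatile load on every iteration
            int a = box.f;         // plain load before the hint
            Thread.onSpinWait();   // spin-wait hint
            int b = box.f;         // plain load after the hint
            if (a != b) {
                differing++;
            }
            spins++;
        }
        return new long[] { spins, differing };
    }

    public static void main(String[] args) throws InterruptedException {
        Box box = new Box();
        Thread setter = new Thread(() -> { box.f = 1; box.done = true; });
        setter.start();
        long[] r = spinUntilDone(box);
        setter.join();
        System.out.println("failed polls: " + r[0] + ", differing plain loads: " + r[1]);
    }
}
```

Whether the JIT folds the two plain loads of box.f into one depends on how the hint node is modeled in the compiler; the Java source looks the same either way, which is why the question only matters at the IR level.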

> On Jan 26, 2016, at 10:03 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
> 
>> 
>> On Jan 26, 2016, at 9:35 PM, Vitaly Davidovich <vitalyd at gmail.com <mailto:vitalyd at gmail.com>> wrote:
>> 
>> 
>> 
>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com <mailto:igor.veresov at oracle.com>> wrote:
>> 
>>> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com <>> wrote:
>>> 
>>> You would but subsequent volatile load could move before the pause.  If you unroll the loop, you could (theoretically) end up with all loads moved before the pause but all appearing ordered with respect to each other, eg:
>>> 
>>> cmp addr, 0 // from iteration 1
>>> je label
>>> cmp addr, 0 // from iteration 2
>>> je label
>>> ...
>>> pause
>>> 
>>> What prevents that if pause is not a compiler member?
>>> 
>> 
>> I think volatile loads explicitly depend on control. If the pause node consumes and produces control it all should be in a rigid control chain.  
>> Other regular loads (that don't have control dependencies) would still be free to move around.
>> 
>> Is this to avoid out of thin air values? That is, suppose you have:
>> 
>> if (some condition)
>>     read volatile (or regular)
> 
>> 
>> Regular load can be scheduled before the if and result used if control reaches there.  For volatile, load cannot be scheduled above the if since value can be bogus at that point?
> 
> Right. Regular reads can move up anywhere to the preceding memory effect, that modified that alias index.
> 
>> 
>> Is it safe for compiler to assume that something else anchors loads around the pause?
>> 
>> That aside, given the intended usage, I'm not sure what other regular loads would be there.  The usage is a tight spin loop waiting for exit condition to be met.  Although I suppose if compiler sees regular loads after the loop exits successfully, perhaps scheduling them before the loop can be beneficial.  Is that what you have in mind?
> 
> 
> No just simple stuff like:
> 
> while (...) {
>   a = x.f;
>   pause();
>   b = x.f;
> }
> 
> If pause() is a wide memory kill, regular field loads around it obviously won't fold. So in the example above those field loads are both going to be there. I realize it's probably not a big deal in reality for the wait loops, but I was just wondering why make it a wide mem kill if membar nodes for volatiles (that will have to be in the loop) already have wide kill semantics.
> 
> igor 
> 
> 
>> 
>> 
>> igor
>> 
>>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com <>> wrote:
>>> Wouldn't you use a volatile load for the memory location you're polling?
>>> 
>>> igor
>>> 
>>>> On Jan 26, 2016, at 6:15 PM, Vitaly Davidovich <vitalyd at gmail.com <>> wrote:
>>>> 
>>>> Subsequent loads at this point will likely be polls of same memory location that just failed a test, and the author inserted a pause.  It's unlikely that the memory changed that quickly and scheduling the next load before the pause is equivalent to two loads back to back essentially, which wouldn't make sense given the intended usage.  There's also the risk that the compiler would move enough of those load+test pairs before the pause and fill up the speculative pipeline with them; that pipeline will need to be flushed once the spin exits since those load instructions likely speculated incorrectly.  And here we're basically describing the reason for putting pause there in the first place :).
>>>> 
>>>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com <>> wrote:
>>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that's intentional I wonder why is that?
>>>> 
>>>> igor
>>>> 
>>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com <>> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> Some of you may have seen a few e-mails on the core-libs alias about a proposed "spin wait hint". The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832 <https://bugs.openjdk.java.net/browse/JDK-8147832>. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for Java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/>
>>>>> 
>>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is to push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>>> 
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844 <https://bugs.openjdk.java.net/browse/JDK-8147844>
>>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/>
>>>>> 
>>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is verification code that makes sure the flag is off; the VM will just execute the empty method java.lang.Runtime.onSpinWait(), effectively a no-op. According to [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence harmless, so there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>>> 
>>>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Ivan
>>>>> 
>>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops <https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops>


From rahul.v.raghavan at oracle.com  Wed Jan 27 11:14:03 2016
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Wed, 27 Jan 2016 03:14:03 -0800 (PST)
Subject: RFR(S): 6378256: Performance problem with System.identityHashCode
	in client compiler
In-Reply-To: <56A737F6.6030909@oracle.com>
References: <e979523e-07f1-4ab6-b7fe-0c54268f55fc@default>
	<56A737F6.6030909@oracle.com>
Message-ID: <00a3e7ca-2212-4699-b591-52ceaa9c909b@default>


> -----Original Message-----
> From: Tobias Hartmann
> Sent: Tuesday, January 26, 2016 2:40 PM
> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
> 
> Hi Rahul,
> 
> looks good to me (not a Reviewer). The code in sharedRuntime_x86_64.cpp is much better now!

Thank you Tobias.

> 
> Best,
> Tobias
> 
> On 25.01.2016 18:02, Rahul Raghavan wrote:
> > Hello,
> >
> > With reference to below email thread, please send review comments for the revised patch for JDK-6378256.
> > http://cr.openjdk.java.net/~thartmann/6378256/webrev.02/
> >
> > Thanks,
> > Rahul
> >
> >> -----Original Message-----
> >> From: Tobias Hartmann
> >> Sent: Monday, January 25, 2016 12:40 PM
> >> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
> >>
> >> Hi Rahul,
> >>
> >> On 22.01.2016 17:11, Rahul Raghavan wrote:
> >>>
> >>>> -----Original Message-----
> >>>> From: Tobias Hartmann
> >>>> Sent: Monday, January 11, 2016 2:56 PM
> >>>> To: Rahul Raghavan; hotspot-compiler-dev at openjdk.java.net
> >>>>
> >>>> Hi Rahul,
> >>>>
> >>>>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> >>>>
> >>>> Why don't you use 'markOopDesc::hash_mask_in_place' for the 64 bit version? This should save some instructions and you also don't
> >>>> need the 'hash' register if you compute everything in 'result'.
> >>>
> >>> Thank you for your comments Tobias.
> >>>
> >>> I could not get the implementation to work with 'markOopDesc::hash_mask_in_place' on x86_64 (similar to the support in x86_32).
> >>> Using -    __ andptr(result, markOopDesc::hash_mask_in_place);
> >>> results in the build error 'overflow in implicit constant conversion'.
> >>>
> >>> I then understood from 'sharedRuntime_sparc.cpp' and 'markOop.hpp' that the usage of 'hash_mask_in_place' should be avoided for 64-bit because the values are too big!
> >>> There are similar comments in LibraryCallKit::inline_native_hashcode [hotspot/src/share/vm/opto/library_call.cpp] as well.
> >>> I could not find another way to use hash_mask_in_place here for x86_64.
> >>
> >> You are right, I missed that.
> >>
> >>> So the code depends on the markOopDesc::hash_mask and markOopDesc::hash_shift values instead (similar to what is done in sharedRuntime_sparc).
> >>> Added the missing comment regarding the above in the revised webrev.
> >>>
> >>> Also, yes, I missed the optimized codegen.
> >>> I tried a revised patch that removes the extra 'hash' and 'mask' registers and computes everything in 'result' itself.
> >>>
> >>> [sharedRuntime_x86_64.cpp]
> >>>      ....................
> >>> +    Register obj_reg = j_rarg0;
> >>> +    Register result = rax;
> >>>      ........
> >>> +    // get hash
> >>> +    // Read the header and build a mask to get its hash field.
> >>> +    // Depend on hash_mask being at most 32 bits and avoid the use of hash_mask_in_place
> >>> +    // because it could be larger than 32 bits in a 64-bit vm. See markOop.hpp.
> >>> +    __ shrptr(result, markOopDesc::hash_shift);
> >>> +    __ andptr(result, markOopDesc::hash_mask);
> >>> +    // test if hashCode exists
> >>> +    __ jcc  (Assembler::zero, slowCase);
> >>> +    __ ret(0);
> >>> +    __ bind (slowCase);
> >>>       ........
> >>>
> >>> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> >>>
> >>> Please send your comments. I can submit revised webrev if all okay.
> >>
> >> Looks good. Please send a new webrev.
> >>
> >> Best,
> >> Tobias
> >>
> >>>
> >>>>
> >>>> Best,
> >>>> Tobias
> >>>>
> >>>>
> >>>> On 08.01.2016 18:13, Rahul Raghavan wrote:
> >>>>> Hello,
> >>>>>
> >>>>> Please review the following revised patch for JDK-6378256 -
> >>>>> http://cr.openjdk.java.net/~thartmann/6378256/webrev.01/
> >>>>>
> >>>>> This revised webrev has the following changes -
> >>>>>
> >>>>>  1) A minor optimization: return 0 at the initial stage (instead of continuing to the 'slowCase' path) for the special/rare case of a null reference input.
> >>>>>    (as per the documentation, and as test results confirmed, it is safe to 'return 0' for a null reference input to System.identityHashCode)
> >>>>>
> >>>>>  2) Added similar Object.hashCode, System.identityHashCode optimization support in sharedRuntime_x86_64.cpp.
> >>>>>
> >>>>> Confirmed no issues with jprt testing (-testset hotspot) and expected results for unit tests.
> >>>>>
> >>>>> Thanks,
> >>>>> Rahul
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Roland Westrelin
> >>>>>> Sent: Wednesday, December 09, 2015 8:03 PM
> >>>>>> To: Rahul Raghavan
> >>>>>> Cc: hotspot-compiler-dev at openjdk.java.net
> >>>>>>
> >>>>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> >>>>>>
> >>>>>> Justifying the comment lines 2019-2022 in sharedRuntime_sparc.cpp (lines 1743-1746 in sharedRuntime_x86_32.cpp) again would be nice.
> >>>>>> Shouldn't we use this as an opportunity to add the same optimization to sharedRuntime_x86_64.cpp?
> >>>>>>
> >>>>>> Roland.
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Rahul Raghavan
> >>>>>> Sent: Wednesday, December 09, 2015 2:43 PM
> >>>>>> To: hotspot-compiler-dev at openjdk.java.net
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> Please review the following patch for JDK-6378256.
> >>>>>>
> >>>>>> webrev: http://cr.openjdk.java.net/~thartmann/6378256/webrev.00/ .
> >>>>>>
> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-6378256  .
> >>>>>> Performance problem with System.identityHashCode, compared to Object.hashCode, with client compiler (at least seven times slower).
> >>>>>> Issue reproducible for x86_32, SPARC (with -client / -XX:TieredStopAtLevel=1 , 2, 3 options).
> >>>>>>
> >>>>>> sample unit test:
> >>>>>>    public class Jdk6378256Test
> >>>>>>    {
> >>>>>>       public static void main(String[] args)
> >>>>>>       {
> >>>>>>          Object obj = new Object();
> >>>>>>          long time = System.nanoTime();
> >>>>>>          for(int i = 0 ; i < 1000000 ; i++)
> >>>>>>             System.identityHashCode(obj);  //compare to obj.hashCode();
> >>>>>>          System.out.println ("Result = " + (System.nanoTime() - time));
> >>>>>>       }
> >>>>>>    }
> >>>>>>
> >>>>>> Fix: Enabled the C1 optimization which was done only for Object.hashCode, now for System.identityHashCode() also.
> >>>>>> (looks in the header for the hashCode before calling into the VM).
> >>>>>> Unlike for Object.hashCode, System.identityHashCode is static method and gets object as argument instead of the receiver.
> >>>>>> So also added required additional null check for System.identityHashCode case.
> >>>>>>
> >>>>>> Testing:
> >>>>>>    - successful JPRT run (-testset hotspot).
> >>>>>>    - JTREG testing (hotspot/test, jdk/test - java/util, java/io, java/lang/System).
> >>>>>>        (with -client / -XX:TieredStopAtLevel=1 etc. options).
> >>>>>>    - Added 'noreg-perf' label for this performance bug.
> >>>>>>       Manual testing done and confirmed expected performance values for unit tests with fix.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Rahul

From vitalyd at gmail.com  Wed Jan 27 11:28:22 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 27 Jan 2016 06:28:22 -0500
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <412CE10D-193F-48D7-9E94-7D0C4DD9D6FD@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
	<CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
	<D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>
	<CAHjP37GreGM6-k0q3HsapynaBykjhxrFY7qJzdMf=iWFo__vSw@mail.gmail.com>
	<C69F620F-75B7-4D53-8F67-C9018FAFF8F9@oracle.com>
	<412CE10D-193F-48D7-9E94-7D0C4DD9D6FD@oracle.com>
Message-ID: <CAHjP37GMMXTP8Ei_0dxSLjyM4J=WSZDUpa7hzZVLgyLjkFQZCQ@mail.gmail.com>

Although those same volatile loads will prevent much of regular load
movement on their own; my statement earlier about scheduling a regular load
that is after the loop to be before the loop won't work anyway due to the
volatile loads in the loop.  So all in all, it seems it wouldn't matter in
practice.
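
A small sketch of the situation described here, under the assumption of a typical producer/consumer hand-off (the names are hypothetical, and Thread.onSpinWait() is the JDK 9+ form of the hint, not the Runtime.onSpinWait() proposed in this thread): the plain load of payload after the loop cannot be scheduled before the loop, because it is ordered after the volatile read of ready that terminates the loop.

```java
final class PublishSketch {
    static int payload;              // plain field, written before the flag
    static volatile boolean ready;   // volatile exit condition

    // Spins on the volatile flag, then performs a plain load after the loop.
    static int awaitPayload() {
        while (!ready) {
            Thread.onSpinWait();
        }
        return payload;  // ordered after the volatile read that observed ready == true
    }

    // Runs one producer/consumer hand-off and returns what the consumer saw.
    static int runOnce() {
        ready = false;
        payload = 0;
        Thread producer = new Thread(() -> {
            payload = 42;   // plain store
            ready = true;   // volatile store publishes the payload
        });
        producer.start();
        int seen = awaitPayload();
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runOnce());  // prints 42
    }
}
```

The value 42 is guaranteed by the happens-before edge from the volatile store to the volatile load, independent of whether the hint itself carries a memory effect.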

On Wednesday, January 27, 2016, Igor Veresov <igor.veresov at oracle.com>
wrote:

> Or, to put it another way: the memory effect of the pause prevents ordinary
> loads from floating up. However, its control effect alone should be enough
> to prevent the _volatile_ loads from floating up, since they are
> control-dependent. Hence the original thought that the memory effect of the
> pause might be unnecessarily restrictive if it's used with volatile loads.
> But maybe I'm missing something.
>
> igor
>
> On Jan 26, 2016, at 10:03 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
>
>
> On Jan 26, 2016, at 9:35 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>
>
> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com> wrote:
>
>>
>> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>> You would but subsequent volatile load could move before the pause.  If
>> you unroll the loop, you could (theoretically) end up with all loads moved
>> before the pause but all appearing ordered with respect to each other, eg:
>>
>> cmp addr, 0 // from iteration 1
>> je label
>> cmp addr, 0 // from iteration 2
>> je label
>> ...
>> pause
>>
>> What prevents that if pause is not a compiler member?
>>
>>
>> I think volatile loads explicitly depend on control. If the pause node
>> consumes and produces control it all should be in a rigid control chain.
>>
> Other regular loads (that don't have control dependencies) would still be
>> free to move around.
>>
>
> Is this to avoid out of thin air values? That is, suppose you have:
>
> if (some condition)
>     read volatile (or regular)
>
>
> Regular load can be scheduled before the if and result used if control
> reaches there.  For volatile, load cannot be scheduled above the if since
> value can be bogus at that point?
>
>
> Right. Regular reads can move up anywhere to the preceding memory effect,
> that modified that alias index.
>
>
>
> Is it safe for compiler to assume that something else anchors loads around
> the pause?
>
> That aside, given the intended usage, I'm not sure what other regular
> loads would be there.  The usage is a tight spin loop waiting for exit
> condition to be met.  Although I suppose if compiler sees regular loads
> after the loop exits successfully, perhaps scheduling them before the loop
> can be beneficial.  Is that what you have in mind?
>
>
>
> No just simple stuff like:
>
> while (...) {
>   a = x.f;
>   pause();
>   b = x.f;
> }
>
> If pause() is a wide memory kill, regular field loads around it obviously
> won't fold. So in the example above those field loads are both going to be
> there. I realize it's probably not a big deal in reality for the wait
> loops, but I was just wondering why make it a wide mem kill if membar nodes
> for volatiles (that will have to be in the loop) already have wide kill
> semantics.
>
> igor
>
>
>
>
>> igor
>>
>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com>
>> wrote:
>>
>>> Wouldn't you use a volatile load for the memory location you're polling?
>>>
>>> igor
>>>
>>> On Jan 26, 2016, at 6:15 PM, Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>
>>> Subsequent loads at this point will likely be polls of same memory
>>> location that just failed a test, and the author inserted a pause.  It's
>>> unlikely that the memory changed that quickly and scheduling the next load
>>> before the pause is equivalent to two loads back to back essentially, which
>>> wouldn't make sense given the intended usage.  There's also the risk that
>>> the compiler would move enough of those load+test pairs before the pause
>>> and fill up the speculative pipeline with them; that pipeline will need to
>>> be flushed once the spin exits since those load instructions likely
>>> speculated incorrectly.  And here we're basically describing the reason for
>>> putting pause there in the first place :).
>>>
>>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com>
>>> wrote:
>>>
>>>> So, why does the new node have a memory effect? That would seem to
>>>> prevent any movement of the subsequent loads in your loop, right? If that's
>>>> intentional I wonder why is that?
>>>>
>>>> igor
>>>>
>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Some of you may have seen a few e-mails on the core-libs alias about
>>>> a proposed "spin wait hint". The JEP is forming up nicely at
>>>> https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a
>>>> consensus on the API side. It is now in a draft state and I hope this JEP
>>>> will get targeted for Java 9 shortly.  The upcoming API changes can be seen
>>>> at the webrev:
>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>
>>>> At this time I would like to ask for a review of the hs-comp changes.
>>>> The plan is to push changes into class libraries and hotspot synchronously
>>>> but that may happen after the JEP gets targeted.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>
>>>> The idea of the fix is pretty simple: hotspot replaces a call to
>>>> java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a
>>>> 'pause' instruction on x86.  This intrinsic is guarded by the
>>>> -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is
>>>> verification code that makes sure the flag is off; the VM will just execute
>>>> the empty method java.lang.Runtime.onSpinWait(), effectively a no-op.
>>>> According to [1] the 'pause' instruction is functional since SSE2, but
>>>> even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence
>>>> harmless, so there seems to be no need to add guarding code for older
>>>> generations of Intel CPUs.
>>>>
>>>> The proposed patch includes a simple regression test that simply makes
>>>> sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There
>>>> are several other producer-consumer-like performance tests ready that the
>>>> authors of this JEP would be happy to make available under JEP-230 but I am
>>>> uncertain about the process.
>>>>
>>>> Thanks,
>>>>
>>>> Ivan
>>>>
>>>> [1]  -
>>>> https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops

-- 
Sent from my phone

From vitalyd at gmail.com  Wed Jan 27 12:22:15 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 27 Jan 2016 07:22:15 -0500
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <C69F620F-75B7-4D53-8F67-C9018FAFF8F9@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<CAHjP37FD46FtWrk93f21sdomdJuXdMwYe6+dsugj-uZyugM9VA@mail.gmail.com>
	<0DCFF214-7A0D-48CF-A9CD-6DD32922701D@oracle.com>
	<CAHjP37G9ME-SppYK1-c3qAjumT6kho7ZvQEG=Pt1LwhArNQnCA@mail.gmail.com>
	<D8BA6D2D-6961-4C88-85B2-04CCADD49DBD@oracle.com>
	<CAHjP37GreGM6-k0q3HsapynaBykjhxrFY7qJzdMf=iWFo__vSw@mail.gmail.com>
	<C69F620F-75B7-4D53-8F67-C9018FAFF8F9@oracle.com>
Message-ID: <CAHjP37FyyJRnvjVts84hvx80y=nCru8DPS6RKy+en9QcRiAA7Q@mail.gmail.com>

On Wednesday, January 27, 2016, Igor Veresov <igor.veresov at oracle.com>
wrote:

>
> On Jan 26, 2016, at 9:35 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>
>
> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com> wrote:
>
>>
>> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>> You would but subsequent volatile load could move before the pause.  If
>> you unroll the loop, you could (theoretically) end up with all loads moved
>> before the pause but all appearing ordered with respect to each other, eg:
>>
>> cmp addr, 0 // from iteration 1
>> je label
>> cmp addr, 0 // from iteration 2
>> je label
>> ...
>> pause
>>
>> What prevents that if pause is not a compiler member?
>>
>>
>> I think volatile loads explicitly depend on control. If the pause node
>> consumes and produces control it all should be in a rigid control chain.
>>
> Other regular loads (that don't have control dependencies) would still be
>> free to move around.
>>
>
> Is this to avoid out of thin air values? That is, suppose you have:
>
> if (some condition)
>     read volatile (or regular)
>
>
> Regular load can be scheduled before the if and result used if control
> reaches there.  For volatile, load cannot be scheduled above the if since
> value can be bogus at that point?
>
>
> Right. Regular reads can move up anywhere to the preceding memory effect,
> that modified that alias index.
>

I wonder if that's required by JMM though.  In my example above, if the
condition being read doesn't have volatile load semantics then it seems
there's no happens-before between the condition and the volatile load.

Your sentence above regarding modifying the alias index sort of makes it
sound like store-load forwarding by the compiler, allowing the read to be
skipped entirely (for regular loads), is that right or did I read too much
into it? If that's right, volatile loads cannot be eliminated so not quite
sure where that nets out.

I can see how volatile loads having control is a safe/conservative
implementation approach but I can also see how scheduling them
aggressively, when not prevented by other memory ordering, could be
beneficial.

>
>
> Is it safe for compiler to assume that something else anchors loads around
> the pause?
>
> That aside, given the intended usage, I'm not sure what other regular
> loads would be there.  The usage is a tight spin loop waiting for exit
> condition to be met.  Although I suppose if compiler sees regular loads
> after the loop exits successfully, perhaps scheduling them before the loop
> can be beneficial.  Is that what you have in mind?
>
>
>
> No just simple stuff like:
>
> while (...) {
>   a = x.f;
>   pause();
>   b = x.f;
> }
>
> If pause() is a wide memory kill, regular field loads around it obviously
> won't fold. So in the example above those field loads are both going to be
> there. I realize it's probably not a big deal in reality for the wait
> loops, but I was just wondering why make it a wide mem kill if membar nodes
> for volatiles (that will have to be in the loop) already have wide kill
> semantics.
>
> igor
>
>
>
>
>> igor
>>
>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com>
>> wrote:
>>
>>> Wouldn't you use a volatile load for the memory location you're polling?
>>>
>>> igor
>>>
>>> On Jan 26, 2016, at 6:15 PM, Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>
>>> Subsequent loads at this point will likely be polls of same memory
>>> location that just failed a test, and the author inserted a pause.  It's
>>> unlikely that the memory changed that quickly and scheduling the next load
>>> before the pause is equivalent to two loads back to back essentially, which
>>> wouldn't make sense given the intended usage.  There's also the risk that
>>> the compiler would move enough of those load+test pairs before the pause
>>> and fill up the speculative pipeline with them; that pipeline will need to
>>> be flushed once the spin exits since those load instructions likely
>>> speculated incorrectly.  And here we're basically describing the reason for
>>> putting pause there in the first place :).
>>>
>>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com>
>>> wrote:
>>>
>>>> So, why does the new node have a memory effect? That would seem to
>>>> prevent any movement of the subsequent loads in your loop, right? If that's
>>>> intentional I wonder why is that?
>>>>
>>>> igor
>>>>
>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Some of you may have seen a few e-mails on the core-libs alias about
>>>> a proposed "spin wait hint". The JEP is forming up nicely at
>>>> https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a
>>>> consensus on the API side. It is now in a draft state and I hope this JEP
>>>> will get targeted for java 9 shortly.  The upcoming API changes can be seen
>>>> at the webrev:
>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>
>>>> At this time I would like to ask for a review of the hs-comp changes.
>>>> The plan is to push changes into class libraries and hotspot synchronously but
>>>> that may happen after the JEP gets targeted.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>
>>>> The idea of the fix is pretty simple: hotspot replaces a call to
>>>> java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a
>>>> 'pause' instruction on x86.  This intrinsic is guarded by the
>>>> -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is
>>>> verification code that makes sure the flag is off; the VM will just execute an
>>>> empty method java.lang.Runtime.onSpinWait(), effectively a no-op.
>>>> According to [1] the 'pause' instruction is functional since SSE2, but
>>>> even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence
>>>> harmless, there seems to be no need to add guarding code for older
>>>> generations of Intel CPUs.
>>>>
>>>> The proposed patch includes a simple regression test that simply makes
>>>> sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There
>>>> are several other producer-consumer-like performance tests ready that the
>>>> authors of this JEP would be happy to make available under JEP-230 but I am
>>>> uncertain about the process.
>>>>
>>>> Thanks,
>>>>
>>>> Ivan
>>>>
>>>> [1]  -
>>>> https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>>>>
>>>>
>>>>
>>>
>>> --
>>> Sent from my phone
>>>
>>>
>>>
>>
>> --
>> Sent from my phone
>>
>>
>>
>
> --
> Sent from my phone
>
>
>

-- 
Sent from my phone

From ivan at azulsystems.com  Wed Jan 27 12:48:29 2016
From: ivan at azulsystems.com (Ivan Krylov)
Date: Wed, 27 Jan 2016 15:48:29 +0300
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
	<45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
Message-ID: <56A8BC9D.8060004@azulsystems.com>

Looks like there was some good discussion while I was peacefully sleeping.
I don't have much to add. This patch was somewhat inspired by JEP-171 
changes.
Perhaps there are other ways to achieve the same semantics.

So, if we can consider this reviewed - I will wait for the actual JEP to 
become targeted to 9 and then seek a sponsor to do the push.

Thanks,

Ivan

On 27/01/2016 09:12, Igor Veresov wrote:
> I realize it's not a big deal. I was just wondering if there was any 
> specific reason control alone is not enough.
> Anyways, looks ok for the first cut.
>
> igor
>
>> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com 
>> <mailto:gil at azul.com>> wrote:
>>
>> Since a sensical loop that calls onSpinWait() would include at least 
>> a volatile load on every iteration (and possibly a volatile store), 
>> the new node does not create significant extra move restrictions that 
>> are not already there. Modeling this with a memory effect is one 
>> simple way to prevent it from being re-ordered out of the loop. There 
>> are probably other ways to achieve this, but this one doesn't really 
>> have a performance downside?
>>
>> -- Gil.
>>
>>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com 
>>> <mailto:igor.veresov at oracle.com>> wrote:
>>>
>>> So, why does the new node have a memory effect? That would seem to 
>>> prevent any movement of the subsequent loads in your loop, right? If 
>>> that's intentional I wonder why is that?
>>>
>>> igor
>>>
>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com 
>>>> <mailto:ivan at azulsystems.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Some of you may have seen a few e-mails on the core-libs alias 
>>>> about a proposed "spin wait hint". The JEP is forming up nicely at 
>>>> https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be 
>>>> a consensus on the API side. It is now in a draft state and I hope 
>>>> this JEP will get targeted for java 9 shortly.  The upcoming API 
>>>> changes can be seen at the webrev:
>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>
>>>> At this time I would like to ask for a review of the hs-comp 
>>>> changes. The plan is to push changes into class libraries and hotspot 
>>>> synchronously but that may happen after the JEP gets targeted.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>
>>>> The idea of the fix is pretty simple: hotspot replaces a call to 
>>>> java.lang.Runtime.onSpinWait() with an intrinsic that is 
>>>> effectively a 'pause' instruction on x86.  This intrinsic is 
>>>> guarded by the -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 
>>>> platforms there is verification code that makes sure the flag is 
>>>> off; the VM will just execute an empty method 
>>>> java.lang.Runtime.onSpinWait(), effectively a no-op. According to 
>>>> [1] the 'pause' instruction is functional since SSE2, but even on 
>>>> CPUs prior to SSE2 the  'pause' instruction is a no-op and hence 
>>>> harmless, there seems to be no need to add guarding code for older 
>>>> generations of Intel CPUs.
>>>>
>>>> The proposed patch includes a simple regression test that simply 
>>>> makes sure that method java.lang.Runtime.onSpinWait() gets 
>>>> intrinsified.  There are several other producer-consumer-like 
>>>> performance tests ready that the authors of this JEP would be happy 
>>>> to make available under JEP-230 but I am uncertain about the process.
>>>>
>>>> Thanks,
>>>>
>>>> Ivan
>>>>
>>>> [1] - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>>>
>>
>


From pavel.punegov at oracle.com  Wed Jan 27 14:51:19 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Wed, 27 Jan 2016 17:51:19 +0300
Subject: RFR (XXS): [TESTBUG] InlineCommandTest.java: unknown compiler level 0
	for commpile ID: 651
Message-ID: <1EC15E02-BB6F-480C-8FB4-40F8DB9A7C39@oracle.com>

Please review the following small patch for inlining tests.

Issue: tests are unable to find the JFR compilation event for the appropriate inline event. This happens because the recording stops before the compilation finishes. Invocation of the test method is not synchronised with compilation.

Fix: add -Xbatch so that compilation blocks the test thread.

bug: https://bugs.openjdk.java.net/browse/JDK-8144239
webrev: http://cr.openjdk.java.net/~ppunegov/8144239/webrev.00/

Thanks,
Pavel Punegov


From roland.westrelin at oracle.com  Wed Jan 27 15:39:11 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Wed, 27 Jan 2016 16:39:11 +0100
Subject: RFR(S): 8147645: get_ctrl_no_update() code is wrong
Message-ID: <77A6696F-8F4B-4023-AE58-61E2D01A8035@oracle.com>

The intrinsify_fill() code doesn't mark a replaced control as dead. As suggested in the bug, I added an assert to 
get_ctrl_no_update() so we don't use a loop as a control by accident. I also dropped 
lazy_replace_proj() which is obsolete AFAICT.

http://cr.openjdk.java.net/~roland/8147645/webrev.00/

Roland.

From zoltan.majo at oracle.com  Wed Jan 27 15:56:28 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Wed, 27 Jan 2016 16:56:28 +0100
Subject: [9] RFR (S): 8146478: Node limit exceeded with
	-XX:AllocateInstancePrefetchLines=1073741823
In-Reply-To: <56A7C27B.8050004@oracle.com>
References: <56A7A223.9050403@oracle.com> <56A7C27B.8050004@oracle.com>
Message-ID: <56A8E8AC.7010007@oracle.com>

Hi Vladimir,


thank you for the feedback!

On 01/26/2016 08:01 PM, Vladimir Kozlov wrote:
> Where 4/2 number comes from? Some spec runs used higher number:

Those are the highest values set by the VM. I was not aware that SPEC 
runs use values higher than those.

>
> -XX:AllocatePrefetchLines=16
>
> http://spec.org/jbb2005/results/res2009q1/jbb2005-20081203-00563.html
>
> I would suggest something like 64 - I never see such number is used.

OK, I set the maximum value for both AllocatePrefetchLines and 
AllocateInstancePrefetchLines to 64.

>
> Also, please, limit AllocatePrefetchStepSize range. It corresponds to 
> cache line size. 512 I would say for future proof - 

OK, done.

> with assert that check that its setting in vm_Version_<arch>.cpp is in 
> these 

OK, I modified the range check in the 
AllocatePrefetchStepSizeConstraintFunc() constraint function 
accordingly. I hope that is fine.

>
> For the case AllocatePrefetchStyle == 2 number of lines is calculated as:
>
> uint lines = AllocatePrefetchDistance / AllocatePrefetchStepSize;
>
> Since AllocatePrefetchDistance limit is big you can get a lot of nodes 
> again. May be also set the limit - 
> AllocatePrefetchLines*AllocatePrefetchStepSize 64*32 = 2048.

Thank you for catching that. I extended the constraint function 
AllocatePrefetchStepSizeConstraintFunc() to check that

AllocatePrefetchDistance / AllocatePrefetchStepSize <= 64

(64 is the maximum value that we expect for 'lines' in 
PhaseMacroExpand::prefetch_allocation() when AllocatePrefetchStyle == 2.) I 
hope this is fine.
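
The limits described above can be summarised in a small sketch. This is a Java model for illustration only, not the HotSpot C++ constraint function from the webrev: the class and method names are hypothetical, the upper bounds 64 and 512 are the values mentioned in this thread, and the check that the step size is positive is an added assumption to avoid dividing by zero.

```java
class PrefetchFlagCheck {
    // Upper bounds taken from the review discussion.
    static final int MAX_PREFETCH_LINES = 64;
    static final int MAX_STEP_SIZE = 512;

    // Returns true if the given flag values stay within the discussed limits.
    static boolean withinLimits(int prefetchLines, int instancePrefetchLines,
                                int prefetchDistance, int prefetchStepSize) {
        if (prefetchLines > MAX_PREFETCH_LINES) return false;
        if (instancePrefetchLines > MAX_PREFETCH_LINES) return false;
        // Assumption: a non-positive step size is rejected up front.
        if (prefetchStepSize <= 0 || prefetchStepSize > MAX_STEP_SIZE) return false;
        // For AllocatePrefetchStyle == 2 the number of lines is distance / step size.
        return prefetchDistance / prefetchStepSize <= MAX_PREFETCH_LINES;
    }

    public static void main(String[] args) {
        System.out.println(withinLimits(3, 1, 192, 64));          // true
        System.out.println(withinLimits(1, 1073741823, 192, 64)); // false: lines too high
        System.out.println(withinLimits(3, 1, 4160, 64));         // false: 4160 / 64 = 65
    }
}
```

The second call mirrors the value from the bug title; the third shows how a large distance alone can push the derived line count over the limit.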

I also modified the expected node count increase after expansion in 
PhaseMacroExpand::expand_macro_nodes() to account for the increased 
thresholds.


Here is the updated webrev:
http://cr.openjdk.java.net/~zmajo/8146478/webrev.01/

I re-tested with JPRT (incl. TestOptionsWithRanges.java), all tests pass.

Thank you and best regards,


Zoltan


>
> Thanks,
> Vladimir
>
> On 1/26/16 8:43 AM, Zoltán Majó wrote:
>> Hi,
>>
>>
>> please review the patch for 8146478.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8146478
>>
>> Problem: Setting a high value for AllocateInstancePrefetchLines can
>> trigger an assert in the C2 compiler. The reason is that the number of
>> live nodes exceeds the maximum node limit. The same problem can happen
>> if AllocateInstanceLines is given a high value.
>>
>> Solution:
>> Limit the range for AllocateInstancePrefetchLines/AllocateInstanceLines
>> to 8. I picked the value 8 because
>> - (1) the maximum possible value for these flags is 4/2, so having a
>> slightly higher value than 4/2 still allows for some experiments;
>> - (2) the node_check() in PhaseMacroExpand::expand_macro_nodes() assumes
>> that each macro node expansion will generate <75 new nodes. The number
>> of nodes generated by expand_allocate_array()/expand_allocate() for 8
>> prefetched lines closely fits into that margin (experimentally 
>> verified).
>>
>> In addition, I removed some code that is now unnecessary because
>> of the range checks we have in place.
>>
>>
>> Webrev:
>> http://cr.openjdk.java.net/~zmajo/8146478/webrev.00/
>>
>> Testing:
>> - JPRT: All JTREG hotspot tests, incl. TestOptionsWithRanges.java
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>


From tatiana.pivovarova at oracle.com  Wed Jan 27 17:15:12 2016
From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova)
Date: Wed, 27 Jan 2016 20:15:12 +0300
Subject: RFR(M): 8148375: [jittester] Bug with generation function with void
	parameter and non empty arguments
Message-ID: <56A8FB20.4000709@oracle.com>

Hello!

Please review the following patch for jit-tester.

When jit-tester generates a Function node with a void return type and 
some arguments, the generated function in the .java file returns arg_0.
While fixing this bug I accumulated some small fixes:
  - small performance improvement in the SymbolTable::merge function
  - renamed 'klass' to 'owner' where it makes sense
  - added more stream-style code
  - moved TypeUtil to the utils package
  - changed the Makefile to allow a different 'seed' and 
    'number-of-tests' (not only from the property file)

bug-id: https://bugs.openjdk.java.net/browse/JDK-8148375
webrev: http://cr.openjdk.java.net/~tpivovarova/8148375/webrev.00/

Thanks,
Tatiana

From igor.veresov at oracle.com  Wed Jan 27 19:03:35 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 27 Jan 2016 11:03:35 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <56A8BC9D.8060004@azulsystems.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
	<45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
	<56A8BC9D.8060004@azulsystems.com>
Message-ID: <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com>

Actually, I'd rather use Matcher::match_rule_supported() to test if it's supported on the platform, rather than fixing all vm_version_*.* to check for the flag validity; that's tedious (you forgot x86-32 and there are going to be more platforms to fix for your sponsor). Something like UseOnSpinWaitIntrinsic && Matcher::match_rule_supported(Op_OnSpinWait) to decide whether or not to inline the intrinsic. Also, why are you not turning it on by default?
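
To make the suggestion concrete, here is a rough model of the decision. It is written in Java purely as an illustration and is not HotSpot code: the enum names and the per-platform rule sets are hypothetical stand-ins for what Matcher::match_rule_supported() would answer on a given platform.

```java
import java.util.EnumSet;
import java.util.Set;

class IntrinsicGate {
    // Hypothetical stand-in for ideal opcodes such as Op_OnSpinWait.
    enum Op { ON_SPIN_WAIT, SOME_OTHER_OP }

    // Hypothetical platforms, each listing the rules its matcher supports.
    enum Platform {
        X86(EnumSet.of(Op.ON_SPIN_WAIT)),
        NO_PAUSE_ARCH(EnumSet.noneOf(Op.class));

        private final Set<Op> supportedRules;

        Platform(Set<Op> supportedRules) {
            this.supportedRules = supportedRules;
        }

        // Models a per-platform capability query.
        boolean matchRuleSupported(Op op) {
            return supportedRules.contains(op);
        }
    }

    // Inline only if the flag is on AND the platform has a match rule.
    static boolean shouldInline(boolean useOnSpinWaitIntrinsic, Platform platform) {
        return useOnSpinWaitIntrinsic && platform.matchRuleSupported(Op.ON_SPIN_WAIT);
    }

    public static void main(String[] args) {
        System.out.println(shouldInline(true, Platform.X86));           // true
        System.out.println(shouldInline(true, Platform.NO_PAUSE_ARCH)); // false
        System.out.println(shouldInline(false, Platform.X86));          // false
    }
}
```

With this shape a platform without a match rule simply never inlines the intrinsic, so no per-platform flag validation is needed.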

igor

> On Jan 27, 2016, at 4:48 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
> 
> Looks like there was some good discussion while I was peacefully sleeping.
> I don't have much to add. This patch was somewhat inspired by JEP-171 changes. 
> Perhaps,there are other ways to achieve the same semantics.
> 
> So, if we can consider this reviewed - I will wait for the actual JEP to become targeted to 9 and then seek a sponsor to do the push.
> 
> Thanks,
> 
> Ivan
> 
> On 27/01/2016 09:12, Igor Veresov wrote:
>> I realize it's not a big deal. I was just wondering if there was any specific reason control alone is not enough.
>> Anyways, looks ok for the first cut.
>> 
>> igor
>> 
>>> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com> wrote:
>>> 
>>> Since a sensical loop that calls onSpinWait() would include at least a volatile load on every iteration (and possibly a volatile store), the new node does not create significant extra move restrictions that are not already there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop. There are probably other ways to achieve this, but this one doesn't really have a performance downside?
>>> 
>>> -- Gil.
>>> 
>>>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
>>>> 
>>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that's intentional I wonder why is that?
>>>> 
>>>> igor
>>>> 
>>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> Some of you may have seen a few e-mails on the core-libs alias about a proposed "spin wait hint". The JEP is forming up nicely at https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>> 
>>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is to push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>>> 
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>> 
>>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is verification code that makes sure the flag is off; the VM will just execute an empty method java.lang.Runtime.onSpinWait(), effectively a no-op. According to [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence harmless, there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>>> 
>>>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Ivan
>>>>> 
>>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>>>> 
>>> 
>> 
> 


From vladimir.kozlov at oracle.com  Wed Jan 27 19:10:31 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 27 Jan 2016 11:10:31 -0800
Subject: RFR (XXS): [TESTBUG] InlineCommandTest.java: unknown compiler
	level 0 for commpile ID: 651
In-Reply-To: <1EC15E02-BB6F-480C-8FB4-40F8DB9A7C39@oracle.com>
References: <1EC15E02-BB6F-480C-8FB4-40F8DB9A7C39@oracle.com>
Message-ID: <56A91627.4040500@oracle.com>

Looks fine.

Vladimir

On 1/27/16 6:51 AM, Pavel Punegov wrote:
> Please review the following small patch for inlining tests.
>
> Issue: tests are unable to find JFR compilation event for appropriate
> inline event. This happens because the recording stops before the
> compilation finished. Invocation of the test method is not synchronised
> with compilation.
>
> Fix: add Xbatch to make compilation block test thread.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8144239
> webrev: http://cr.openjdk.java.net/~ppunegov/8144239/webrev.00/
>
> Thanks,
> Pavel Punegov
>

From vladimir.kozlov at oracle.com  Wed Jan 27 19:13:16 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 27 Jan 2016 11:13:16 -0800
Subject: [9] RFR (S): 8146478: Node limit exceeded with
	-XX:AllocateInstancePrefetchLines=1073741823
In-Reply-To: <56A8E8AC.7010007@oracle.com>
References: <56A7A223.9050403@oracle.com> <56A7C27B.8050004@oracle.com>
	<56A8E8AC.7010007@oracle.com>
Message-ID: <56A916CC.6070103@oracle.com>

Looks good.

Thanks,
Vladimir

On 1/27/16 7:56 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> thank you for the feedback!
>
> On 01/26/2016 08:01 PM, Vladimir Kozlov wrote:
>> Where 4/2 number comes from? Some spec runs used higher number:
>
> Those are the highest values set by the VM. I was not aware that SPEC
> runs use values higher than those.
>
>>
>> -XX:AllocatePrefetchLines=16
>>
>> http://spec.org/jbb2005/results/res2009q1/jbb2005-20081203-00563.html
>>
>> I would suggest something like 64 - I never see such number is used.
>
> OK, I set the maximum value for both AllocatePrefetchLines and
> AllocateInstancePrefetchLines to 64.
>
>>
>> Also, please, limit AllocatePrefetchStepSize range. It corresponds to
>> cache line size. 512 I would say for future proof -
>
> OK, done.
>
>> with assert that check that its setting in vm_Version_<arch>.cpp is in
>> these
>
> OK, I modified the range check in the
> AllocatePrefetchStepSizeConstraintFunc() constraint function
> accordingly. I hope that is fine.
>
>>
>> For the case AllocatePrefetchStyle == 2 number of lines is calculated as:
>>
>> uint lines = AllocatePrefetchDistance / AllocatePrefetchStepSize;
>>
>> Since AllocatePrefetchDistance limit is big you can get a lot of nodes
>> again. May be also set the limit -
>> AllocatePrefetchLines*AllocatePrefetchStepSize 64*32 = 2048.
>
> Thank you for catching that. I extended the constraint function
> AllocatePrefetchStepSizeConstraintFunc() to check that
>
> AllocatePrefetchDistance / AllocatePrefetchStepSize <= 64
>
> (64 is the maximum value that we expect for 'lines' in
> PhaseMacroExpand::prefetch_allocation() when AllocatePrefetchStyle == 2.) I
> hope this is fine.
>
> I also modified the expected node count increase after expansion in
> PhaseMacroExpand::expand_macro_nodes() to account for the increased
> thresholds.
>
>
> Here is the updated webrev:
> http://cr.openjdk.java.net/~zmajo/8146478/webrev.01/
>
> I re-tested with JPRT (incl. TestOptionsWithRanges.java), all tests pass.
>
> Thank you and best regards,
>
>
> Zoltan
>
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/26/16 8:43 AM, Zoltán Majó wrote:
>>> Hi,
>>>
>>>
>>> please review the patch for 8146478.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8146478
>>>
>>> Problem: Setting a high value for AllocateInstancePrefetchLines can
>>> trigger an assert in the C2 compiler. The reason is that the number of
>>> live nodes exceeds the maximum node limit. The same problem can happen
>>> if AllocateInstanceLines is given a high value.
>>>
>>> Solution:
>>> Limit the range for AllocateInstancePrefetchLines/AllocateInstanceLines
>>> to 8. I picked the value 8 because
>>> - (1) the maximum possible value for these flags is 4/2, so having a
>>> slightly higher value than 4/2 still allows for some experiments;
>>> - (2) the node_check() in PhaseMacroExpand::expand_macro_nodes() assumes
>>> that each macro node expansion will generate <75 new nodes. The number
>>> of nodes generated by expand_allocate_array()/expand_allocate() for 8
>>> prefetched lines closely fits into that margin (experimentally
>>> verified).
>>>
>>> In addition, I removed some code that is now unnecessary because
>>> of the range checks we have in place.
>>>
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~zmajo/8146478/webrev.00/
>>>
>>> Testing:
>>> - JPRT: All JTREG hotspot tests, incl. TestOptionsWithRanges.java
>>>
>>> Thank you and best regards,
>>>
>>>
>>> Zoltan
>>>
>

From vladimir.kozlov at oracle.com  Wed Jan 27 19:53:11 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 27 Jan 2016 11:53:11 -0800
Subject: RFR(S): 8147645: get_ctrl_no_update() code is wrong
In-Reply-To: <77A6696F-8F4B-4023-AE58-61E2D01A8035@oracle.com>
References: <77A6696F-8F4B-4023-AE58-61E2D01A8035@oracle.com>
Message-ID: <56A92027.2070602@oracle.com>

Yes, old_node->add_req(NULL) was very odd. It was from day one and I 
don't get why it was needed. I would understand if it was set_req(0, 
NULL) but that is done by remove_globally_dead_node() later.

Your changes are good. I agree.

Thanks,
Vladimir

On 1/27/16 7:39 AM, Roland Westrelin wrote:
> The intrinsify_fill() code doesn't mark a replaced control as dead. As suggested in the bug, I added an assert to
> get_ctrl_no_update() so we don't use a loop as a control by accident. I also dropped
> lazy_replace_proj() which is obsolete AFAICT.
>
> http://cr.openjdk.java.net/~roland/8147645/webrev.00/
>
> Roland.
>

From vladimir.kozlov at oracle.com  Wed Jan 27 20:57:19 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 27 Jan 2016 12:57:19 -0800
Subject: RFR(S): 8063112: Compiler diagnostic commands should have locking
	instead of safepoint
In-Reply-To: <56A74D06.7030408@oracle.com>
References: <56A23F61.9000201@oracle.com> <56A281C3.6010408@oracle.com>
	<56A74D06.7030408@oracle.com>
Message-ID: <56A92F2F.9070507@oracle.com>

Yes, this looks much better. Reviewed.

thanks,
Vladimir

On 1/26/16 2:40 AM, Nils Eliasson wrote:
> Hi Vladimir,
>
> On 2016-01-22 20:23, Vladimir Kozlov wrote:
>> Why you need new print method? Why you can't use existing print()?
>> Also I prefer to get current compilation tasks print in separate lines
>> - not in the list of threads. Then you don't need to use new print?
>
> Works for me. I moved it directly after the existing thread printing:
>
> ---------------  P R O C E S S  ---------------
>
> Java Threads: ( => current thread )
>    0x00007f4cfc485000 JavaThread "Service Thread" daemon
> [_thread_blocked, id=22409, stack(0x00007f4bf1c5e000,0x00007f4bf1d5f000)]
>    0x00007f4cfc476000 JavaThread "Sweeper thread" daemon
> [_thread_blocked, id=22408, stack(0x00007f4bf1d5f000,0x00007f4bf1e60000)]
> ...
> stack(0x00007f4bf35db000,0x00007f4bf36dc000)]
>    0x00007f4cfc018800 JavaThread "main" [_thread_in_vm, id=22332,
> stack(0x00007f4d05c78000,0x00007f4d05d79000)]
>
> Other Threads:
>    0x00007f4cfc3ea000 VMThread [stack:
> 0x00007f4bf36dc000,0x00007f4bf37dd000] [id=22388]
>    0x00007f4cfc486800 WatcherThread [stack:
> 0x00007f4bf1b5d000,0x00007f4bf1c5e000] [id=22410]
>
> Threads with active compile tasks:
>    0x00007f4cfc46a800 id=22403 Compiling:     244    1       3
> java.lang.String::isLatin1 (19 bytes)
>
>
>>
>> I am worried about using locks for printing because print code also has
>> locks. Do we really have to have locks here? The output for these
>> directives is local bufferedStream. As I understand it is separate for
>> each directive. So why you need lock? Or VM operation as before?
>
> I think you are mixing my two RFRs together - this change doesn't print
> directives.
>
> I am removing vm_ops from three diagnostic commands that uses code that
> expects safepoint or lock. Some of the commands are really quick, and
> requesting a safepoint is overkill when it can be done concurrently.
> Only new lock taken is the thread lock when iterating the compiler
> threads from the Compiler.queue jcmd. The thread lock is ranked so it
> can not be reordered with the compile.queue lock.
>
> I cleaned it up a bit further and removed the unused
> print_compiler_threads_on(...) from compileBroker. It is printed in
> JavaThread::print_on(..) where all the other thread info is located.
>
> Hs_err-file looks like the example above.
>
> jcmd Thread.print looks like this for compiling threads:
>
> C1 CompilerThread13" #19 daemon prio=9 os_prio=0 tid=0x00007f8748471800
> nid=0x7732 runnable [0x0000000000000000]
>     java.lang.Thread.State: RUNNABLE
>     JavaThread state: _thread_in_native
> Thread: 0x00007f8748471800  [0x7732] State: _at_safepoint
> _has_called_back 0 _at_poll_safepoint 0
>     JavaThread state: _thread_in_native
>     Compiling:  716    b  2       java.util.regex.Pattern::compile (406
> bytes)
>
> And Compiler.queue looks like this:
>
> "Current compiles:
> C1 CompilerThread14 435    b  2 java.net.URLStreamHandler::parseURL
> (1166 bytes)
>
> C1 compile queue:
> Empty
>
> C2 compile queue:
> Empty"
>
>
> New webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.04/
>
> Regards,
> Nils
>
>>
>> Thanks,
>> Vladimir
>>
>> On 1/22/16 6:40 AM, Nils Eliasson wrote:
>>> Hi,
>>>
>>> Please review.
>>>
>>> Summary:
>>> Firstly this change removes the unnecessary vm-ops from three compiler
>>> diagnostic commands and adds locking instead.
>>> Secondly the Compiler.queue diagnostic command is improved with printing
>>> of any active compilations. I found this useful when diagnosing a
>>> rogue VM.
>>> Thirdly, as a bonus, I also add printing of active compilations in the
>>> thread section of the hs_err file. Very useful when investigating VMs
>>> terminated by a timeout.
>>>
>>> Testing:
>>> This does not pass all tests yet. A few tests are dependent on the output
>>> from the diagnostic command, and I want to be sure the reviewers are
>>> happy with the output format first.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8063112
>>> Webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.02/
>>>
>>> Regards,
>>> Nils
>>>
>

From vladimir.x.ivanov at oracle.com  Wed Jan 27 22:05:47 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 28 Jan 2016 01:05:47 +0300
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <56A751AE.9090203@azulsystems.com>
References: <56A751AE.9090203@azulsystems.com>
Message-ID: <56A93F3B.7070301@oracle.com>

Ivan,

There's no need for yet another flag (-XX:+/-UseOnSpinWaitIntrinsic). 
-XX:DisableIntrinsic=_onSpinWait should do the same.

Best regards,
Vladimir Ivanov

On 1/26/16 1:59 PM, Ivan Krylov wrote:
> Hello,
>
> Some of you may have seen a few e-mails on the core-libs alias about a
> proposed "spin wait hint". The JEP is forming up nicely at
> https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a
> consensus on the API side. It is now in a draft state and I hope this
> JEP will get targeted for java 9 shortly.  The upcoming API changes can
> be seen at the webrev:
> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>
> At this time I would like to ask for a review of the hs-comp changes.
> The plan is to push changes into class libraries and hotspot synchronously
> but that may happen after the JEP gets targeted.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>
> The idea of the fix is pretty simple: hotspot replaces a call to
> java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a
> 'pause' instruction on x86.  This intrinsic is guarded by the
> -XX:+/-UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is
> verification code that makes sure the flag is off; the VM will just execute
> an empty method java.lang.Runtime.onSpinWait(), effectively a no-op.
> According to [1] the 'pause' instruction is functional since SSE2, but
> even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence
> harmless, there seems to be no need to add guarding code for older
> generations of Intel CPUs.
>
> The proposed patch includes a simple regression test that simply makes
> sure that method java.lang.Runtime.onSpinWait() gets intrinsified.
> There are several other producer-consumer-like performance tests ready
> that the authors of this JEP would be happy to make available under
> JEP-230 but I am uncertain about the process.
>
> Thanks,
>
> Ivan
>
> [1]  -
> https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>

From ivan at azulsystems.com  Wed Jan 27 22:37:55 2016
From: ivan at azulsystems.com (Ivan Krylov)
Date: Thu, 28 Jan 2016 01:37:55 +0300
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <56A93F3B.7070301@oracle.com>
References: <56A751AE.9090203@azulsystems.com> <56A93F3B.7070301@oracle.com>
Message-ID: <56A946C3.3060104@azulsystems.com>

On 28/01/2016 01:05, Vladimir Ivanov wrote:
> Ivan,
>
> There's no need for yet another flag (-XX:±UseOnSpinWaitIntrinsic).
> -XX:DisableIntrinsic=_onSpinWait should do the same.
Good suggestion. I will accommodate that.

Thanks,

Ivan


>
> Best regards,
> Vladimir Ivanov
>
> On 1/26/16 1:59 PM, Ivan Krylov wrote:
>> Hello,
>>
>> Some of you may have seen a few e-mails on the core-libs alias about a
>> proposed "spin wait hint". The JEP is forming up nicely at
>> https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a
>> consensus on the API side. It is now in a draft state and I hope this
>> JEP will get targeted for java 9 shortly.  The upcoming API changes can
>> be seen at the webrev:
>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>
>> At this time I would like to ask for a review of the hs-comp changes.
>> The plan is push changes into class libraries and hotspot synchronously
>> but that may happen after the JEP gets targeted.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>
>> The idea of the fix is pretty simple: hotspot replaces a call to
>> java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a
>> 'pause' instruction on x86.  This intrinsic is guarded by the
>> -XX:±UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is
>> verification code that makes sure the flag is off; the VM will just execute
>> the empty method java.lang.Runtime.onSpinWait(), effectively a no-op.
>> According to [1] the 'pause' instruction is functional since SSE2, but
>> even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence
>> harmless, so there seems to be no need to add guarding code for older
>> generations of Intel CPUs.
>>
>> The proposed patch includes a simple regression test that simply makes
>> sure that method java.lang.Runtime.onSpinWait() gets intrinsified.
>> There are several other producer-consumer-like performance tests ready
>> that the authors of this JEP would be happy to make available under
>> JEP-230 but I am uncertain about the process.
>>
>> Thanks,
>>
>> Ivan
>>
>> [1]  -
>> https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops 
>>
>>


From zoltan.majo at oracle.com  Thu Jan 28 07:22:36 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Thu, 28 Jan 2016 08:22:36 +0100
Subject: [9] RFR (S): 8146478: Node limit exceeded with
	-XX:AllocateInstancePrefetchLines=1073741823
In-Reply-To: <56A916CC.6070103@oracle.com>
References: <56A7A223.9050403@oracle.com> <56A7C27B.8050004@oracle.com>
	<56A8E8AC.7010007@oracle.com> <56A916CC.6070103@oracle.com>
Message-ID: <56A9C1BC.7000202@oracle.com>

Hi Vladimir,


thank you for the review!

Best regards,


Zoltan

On 01/27/2016 08:13 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 1/27/16 7:56 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> thank you for the feedback!
>>
>> On 01/26/2016 08:01 PM, Vladimir Kozlov wrote:
>>> Where 4/2 number comes from? Some spec runs used higher number:
>>
>> Those are the highest values set by the VM. I was not aware that SPEC
>> runs use values higher than those.
>>
>>>
>>> -XX:AllocatePrefetchLines=16
>>>
>>> http://spec.org/jbb2005/results/res2009q1/jbb2005-20081203-00563.html
>>>
>>> I would suggest something like 64 - I have never seen such a number used.
>>
>> OK, I set the maximum value for both AllocatePrefetchLines and
>> AllocateInstancePrefetchLines to 64.
>>
>>>
>>> Also, please, limit AllocatePrefetchStepSize range. It corresponds to
>>> cache line size. 512 I would say for future proof -
>>
>> OK, done.
>>
>>> with assert that check that its setting in vm_Version_<arch>.cpp is in
>>> these
>>
>> OK, I modified the range check in the
>> AllocatePrefetchStepSizeConstraintFunc() constraint function
>> accordingly. I hope that is fine.
>>
>>>
>>> For the case AllocatePrefetchStyle == 2 number of lines is 
>>> calculated as:
>>>
>>> uint lines = AllocatePrefetchDistance / AllocatePrefetchStepSize;
>>>
>>> Since AllocatePrefetchDistance limit is big you can get a lot of nodes
>>> again. May be also set the limit -
>>> AllocatePrefetchLines*AllocatePrefetchStepSize 64*32 = 2048.
>>
>> Thank you for catching that. I extended the constraint function
>> AllocatePrefetchStepSizeConstraintFunc() to check that
>>
>> AllocatePrefetchDistance / AllocatePrefetchStepSize <= 64
>>
>> (64 is the maximum value that we expect for 'lines' in
>> PhaseMacroExpand::prefetch_allocation() when AllocatePrefetchStyle == 2.)
>> I hope this is fine.
>>
>> I also modified the expected node count increase after expansion in
>> PhaseMacroExpand::expand_macro_nodes() to account for the increased
>> thresholds.
>>
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8146478/webrev.01/
>>
>> I re-tested with JPRT (incl. TestOptionsWithRanges.java), all tests 
>> pass.
>>
>> Thank you and best regards,
>>
>>
>> Zoltan
>>
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/26/16 8:43 AM, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>> please review the patch for 8146478.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8146478
>>>>
>>>> Problem: Setting a high value for AllocateInstancePrefetchLines can
>>>> trigger an assert in the C2 compiler. The reason is that the number of
>>>> live nodes exceeds the maximum node limit. The same problem can happen
>>>> if AllocatePrefetchLines is given a high value.
>>>>
>>>> Solution:
>>>> Limit the range for AllocateInstancePrefetchLines/AllocatePrefetchLines
>>>> to 8. I picked the value 8 because
>>>> - (1) the maximum possible value for these flags is 4/2, so having a
>>>> slightly higher value than 4/2 still allows for some experiments;
>>>> - (2) the node_check() in PhaseMacroExpand::expand_macro_nodes() assumes
>>>> that each macro node expansion will generate <75 new nodes. The number
>>>> of nodes generated by expand_allocate_array()/expand_allocate() for 8
>>>> prefetched lines closely fits into that margin (experimentally
>>>> verified).
>>>>
>>>> In addition, I removed some code that is now unnecessary because
>>>> of the range checks we have in place.
>>>>
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8146478/webrev.00/
>>>>
>>>> Testing:
>>>> - JPRT: All JTREG hotspot tests, incl. TestOptionsWithRanges.java
>>>>
>>>> Thank you and best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>


From nils.eliasson at oracle.com  Thu Jan 28 08:21:12 2016
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 28 Jan 2016 09:21:12 +0100
Subject: RFR(S): 8063112: Compiler diagnostic commands should have locking
	instead of safepoint
In-Reply-To: <56A92F2F.9070507@oracle.com>
References: <56A23F61.9000201@oracle.com> <56A281C3.6010408@oracle.com>
	<56A74D06.7030408@oracle.com> <56A92F2F.9070507@oracle.com>
Message-ID: <56A9CF78.8000106@oracle.com>

Thanks Vladimir!

Regards,
Nils

On 2016-01-27 21:57, Vladimir Kozlov wrote:
> Yes, this looks much better. Reviewed.
>
> thanks,
> Vladimir
>
> On 1/26/16 2:40 AM, Nils Eliasson wrote:
>> Hi Vladimir,
>>
>> On 2016-01-22 20:23, Vladimir Kozlov wrote:
>>> Why do you need a new print method? Why can't you use the existing print()?
>>> Also I prefer to get current compilation tasks printed on separate lines
>>> - not in the list of threads. Then you don't need to use the new print?
>>
>> Works for me. I moved it directly after the existing thread printing:
>>
>> ---------------  P R O C E S S  ---------------
>>
>> Java Threads: ( => current thread )
>>    0x00007f4cfc485000 JavaThread "Service Thread" daemon
>> [_thread_blocked, id=22409, 
>> stack(0x00007f4bf1c5e000,0x00007f4bf1d5f000)]
>>    0x00007f4cfc476000 JavaThread "Sweeper thread" daemon
>> [_thread_blocked, id=22408, 
>> stack(0x00007f4bf1d5f000,0x00007f4bf1e60000)]
>> ...
>> stack(0x00007f4bf35db000,0x00007f4bf36dc000)]
>>    0x00007f4cfc018800 JavaThread "main" [_thread_in_vm, id=22332,
>> stack(0x00007f4d05c78000,0x00007f4d05d79000)]
>>
>> Other Threads:
>>    0x00007f4cfc3ea000 VMThread [stack:
>> 0x00007f4bf36dc000,0x00007f4bf37dd000] [id=22388]
>>    0x00007f4cfc486800 WatcherThread [stack:
>> 0x00007f4bf1b5d000,0x00007f4bf1c5e000] [id=22410]
>>
>> Threads with active compile tasks:
>>    0x00007f4cfc46a800 id=22403 Compiling:     244    1       3
>> java.lang.String::isLatin1 (19 bytes)
>>
>>
>>>
>>> I am worried about using locks for printing because print code also has
>>> locks. Do we really have to have locks here? The output for these
>>> directives is a local bufferedStream. As I understand it is separate for
>>> each directive. So why do you need a lock? Or a VM operation as before?
>>
>> I think you are mixing my two RFRs together - this change doesn't print
>> directives.
>>
>> I am removing vm_ops from three diagnostic commands that use code that
>> expects a safepoint or lock. Some of the commands are really quick, and
>> requesting a safepoint is overkill when it can be done concurrently.
>> The only new lock taken is the thread lock when iterating the compiler
>> threads from the Compiler.queue jcmd. The thread lock is ranked so it
>> can not be reordered with the compile.queue lock.
>>
>> I cleaned it up a bit further and removed the unused
>> print_compiler_threads_on(...) from compileBroker. It is printed in
>> JavaThread::print_on(..) where all the other thread info is located.
>>
>> Hs_err-file looks like the example above.
>>
>> jcmd Thread.print looks like this for compiling threads:
>>
>> "C1 CompilerThread13" #19 daemon prio=9 os_prio=0 tid=0x00007f8748471800
>> nid=0x7732 runnable [0x0000000000000000]
>>     java.lang.Thread.State: RUNNABLE
>>     JavaThread state: _thread_in_native
>> Thread: 0x00007f8748471800  [0x7732] State: _at_safepoint
>> _has_called_back 0 _at_poll_safepoint 0
>>     JavaThread state: _thread_in_native
>>     Compiling:  716    b  2 java.util.regex.Pattern::compile (406
>> bytes)
>>
>> And Compiler.queue looks like this:
>>
>> "Current compiles:
>> C1 CompilerThread14 435    b  2 java.net.URLStreamHandler::parseURL
>> (1166 bytes)
>>
>> C1 compile queue:
>> Empty
>>
>> C2 compile queue:
>> Empty"
>>
>>
>> New webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.04/
>>
>> Regards,
>> Nils
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 1/22/16 6:40 AM, Nils Eliasson wrote:
>>>> Hi,
>>>>
>>>> Please review.
>>>>
>>>> Summary:
>>>> Firstly this change removes the unnecessary vm-ops from three compiler
>>>> diagnostic commands and adds locking instead.
>>>> Secondly the Compiler.queue diagnostic command is improved with printing
>>>> of any active compilations. I found this useful when diagnosing a
>>>> rogue VM.
>>>> Thirdly, as a bonus, I also add printing of active compilations in the
>>>> thread section of the hs_err file. Very useful when investigating VMs
>>>> terminated by a timeout.
>>>>
>>>> Testing:
>>>> This does not pass all tests yet. A few tests are dependent on the output
>>>> from the diagnostic command, and I want to be sure the reviewers are
>>>> happy with the output format first.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8063112
>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8063112/webrev.02/
>>>>
>>>> Regards,
>>>> Nils
>>>>
>>


From christian.thalinger at oracle.com  Thu Jan 28 09:41:58 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 28 Jan 2016 10:41:58 +0100
Subject: RFR(M) 8147461: Use byte offsets for vtable start and vtable
	length offsets
In-Reply-To: <56A1FA78.3090608@oracle.com>
References: <569926B9.4070806@oracle.com> <569F7E22.3090905@oracle.com>
	<56A04DCF.9090204@oracle.com> <56A1FA78.3090608@oracle.com>
Message-ID: <FCE6B7E1-9965-40F1-83EA-AD99B5D46746@oracle.com>


> On Jan 22, 2016, at 10:46 AM, Mikael Gerdin <mikael.gerdin at oracle.com> wrote:
> 
> Hi Chris,
> 
> On 2016-01-21 04:17, Chris Plummer wrote:
>> Hi Mikael,
>> 
>> The changes look good except I think you should get someone from the
>> compiler team to make sure the change in
>> HotSpotResolvedJavaMethodImpl.java and HotSpotVMConfig.java are ok. I'm
>> not sure why you chose to remove instanceKlassVtableStartOffset() rather
>> than just fix it.
> 
> I'm cc:ing hotspot-compiler-dev and graal-dev to see if I can get someone to ok the JVMCI parts.
> 
> The reason for removing the method is that the only reason for it being a method was to apply the wordSize scaling on the value and since I changed the offset to be a byte offset it does not need scaling and can be treated similar to the other constants in HotSpotVMConfig which are accessed without any accessor method.

For the record, the JVMCI changes look good.

> 
>> 
>> I think some of your changes may conflict with my changes for
>> JDK-8143608. Coleen is pushing JDK-8143608 for me once hs-rt opens up.
>> I'd appreciate it if you could wait until after then before doing your
>> push.
> 
> Will do, would you mind pinging me when you've integrated 8143608?
> 
> /Mikael
> 
>> 
>> thanks,
>> 
>> Chris
>> 
>> On 1/20/16 4:31 AM, Mikael Gerdin wrote:
>>> Hi again,
>>> 
>>> I've rebased the patch on hs-rt and had to include some additional changes
>>> for JVMCI.
>>> I've also updated the copyright years.
>>> Unfortunately I can't generate an incremental webrev since I rebased
>>> the patch and there's no good way that I know of to make that work
>>> with webrev.
>>> 
>>> New webrev at: http://cr.openjdk.java.net/~mgerdin/8147461/webrev.1/
>>> 
>>> Testing: JPRT again (which includes the JVMCI jtreg tests)
>>> 
>>> /Mikael
>>> 
>>> On 2016-01-15 18:04, Mikael Gerdin wrote:
>>>> Hi all,
>>>> 
>>>> As per the previous discussion in mid-December[0] about moving the
>>>> _vtable_length field to class Klass, here's the first RFR and webrev,
>>>> according to my suggested plan[1]:
>>>> 
>>>>> My current plan is to first modify the vtable_length_offset accessor to
>>>>> return a byte offset (which is what it's translated to by all callers).
>>>>> 
>>>>> Then I'll tackle moving the _vtable_len field to Klass.
>>>>> 
>>>>> Finally I'll try to consolidate the vtable related methods to Klass,
>>>>> where they belong.
>>>> 
>>>> This change actually consists of three changes:
>>>> * modifying InstanceKlass::vtable_length_offset to become a byte offset
>>>> and use the ByteSize type to communicate the scaling.
>>>> * modifying InstanceKlass::vtable_start_offset to become a byte offset
>>>> and use the ByteSize type, for symmetry reasons mainly.
>>>> * adding a vtableEntry::size_in_bytes() since in many places the vtable
>>>> entry size is used in combination with the vtable start to compute a
>>>> byte offset for vtable lookups.
>>>> 
>>>> I don't foresee any issues with the fact that the byte offset is
>>>> represented as an int, for two reasons:
>>>> 1) If the offset of any of these grows to over 2 gigabytes then we have
>>>> a huge footprint problem with InstanceKlass
>>>> 2) The offsets are converted to byte offsets and stored in ints already
>>>> in the cpu specific code I've modified.
>>>> 
>>>> Bug link: https://bugs.openjdk.java.net/browse/JDK-8147461
>>>> Webrev: http://cr.openjdk.java.net/~mgerdin/8147461/webrev.0/
>>>> 
>>>> Testing: JPRT on Oracle supported platforms, testing on AARCH64 and
>>>> PPC64 would be much appreciated, appropriate mailing lists have been
>>>> CC:ed to notify them of the request.
>>>> 
>>>> 
>>>> [0]
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021152.html
>>>> 
>>>> 
>>>> [1]
>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-December/021224.html
>>>> 
>>>> 
>>>> 
>>>> Thanks!
>>>> /Mikael
>>> 
>> 
> 


From tobias.hartmann at oracle.com  Thu Jan 28 11:16:24 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 28 Jan 2016 12:16:24 +0100
Subject: [9] RFR(XS): 8148460: TestUnsafeUnalignedMismatchedAccesses.java
	fails: error: package jdk.internal.misc does not exist
Message-ID: <56A9F888.4050609@oracle.com>

Hi,

please review the following patch:

https://bugs.openjdk.java.net/browse/JDK-8148460
http://cr.openjdk.java.net/~thartmann/8148460/webrev.00/

The test fails because it's missing the jtreg tag to load the 'jdk.internal.misc' module.

Thanks,
Tobias

From roland.westrelin at oracle.com  Thu Jan 28 12:49:38 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 28 Jan 2016 13:49:38 +0100
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations with
	G1
Message-ID: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>

http://cr.openjdk.java.net/~roland/8087341/webrev.00/

C2 currently doesn't optimize the field load in the following code:

    static Object field;

    static Object m(Object o) {
        field = o;
        return field;
    }

It should return o but instead loads the value back from memory. The reason it misses such a simple optimization is that the G1 post barrier has a memory barrier with a wide effect on the memory state. C2 doesn't optimize this either:

        object.field = other_object;
        object.field = other_object; 

Same applies to -XX:+UseConcMarkSweepGC -XX:+UseCondCardMark

That memory barrier was added to have a memory barrier instruction and doesn't have to have a wide memory effect.
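
For anyone who wants to reproduce this, the two patterns above can be put into one small harness. This is only a sketch: the class, method and Holder names are made up for illustration, and the loop count is a guess at what is enough to get the methods compiled by C2. Each method is written so that the optimized form is obvious: the first should behave like "field = o; return o;" and the second like a single store.

```java
// Hypothetical harness; all names here are illustrative, not from the webrev.
final class RedundantMemOps {
    static Object field;

    // Store followed by a load of the same field. The load is redundant:
    // the method should be equivalent to "field = o; return o;".
    static Object storeThenLoad(Object o) {
        field = o;
        return field;
    }

    static final class Holder {
        Object field;
    }

    // Two identical back-to-back stores. The first store is redundant.
    static void storeTwice(Holder object, Object otherObject) {
        object.field = otherObject;
        object.field = otherObject;
    }

    public static void main(String[] args) {
        Object o = new Object();
        Holder h = new Holder();
        // Assumption: this many iterations is enough to trigger compilation.
        for (int i = 0; i < 200_000; i++) {
            if (storeThenLoad(o) != o) {
                throw new AssertionError("load did not observe the store");
            }
            storeTwice(h, o);
        }
        System.out.println("done");
    }
}
```

Running it with -XX:+UseG1GC and looking at the generated code for the two methods should show whether the load and the first store are still there.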

Roland.

From pavel.punegov at oracle.com  Thu Jan 28 14:07:56 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Thu, 28 Jan 2016 17:07:56 +0300
Subject: RFR (XXS): [TESTBUG] InlineCommandTest.java: unknown compiler
	level 0 for commpile ID: 651
In-Reply-To: <56A91627.4040500@oracle.com>
References: <1EC15E02-BB6F-480C-8FB4-40F8DB9A7C39@oracle.com>
	<56A91627.4040500@oracle.com>
Message-ID: <7B1F2311-AFFD-4C4A-BEDF-418662A16564@oracle.com>

Thanks Vladimir

-- Pavel.

> On 27 Jan 2016, at 22:10, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Looks fine.
> 
> Vladimir
> 
> On 1/27/16 6:51 AM, Pavel Punegov wrote:
>> Please review the following small patch for inlining tests.
>> 
>> Issue: tests are unable to find JFR compilation event for appropriate
>> inline event. This happens because the recording stops before the
>> compilation finished. Invocation of the test method is not synchronised
>> with compilation.
>> 
>> Fix: add Xbatch to make compilation block test thread.
>> 
>> bug: https://bugs.openjdk.java.net/browse/JDK-8144239
>> webrev: http://cr.openjdk.java.net/~ppunegov/8144239/webrev.00/
>> 
>> Thanks,
>> Pavel Punegov
>> 

From roland.westrelin at oracle.com  Thu Jan 28 14:11:37 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 28 Jan 2016 15:11:37 +0100
Subject: [9] RFR(XS): 8148460: TestUnsafeUnalignedMismatchedAccesses.java
	fails: error: package jdk.internal.misc does not exist
In-Reply-To: <56A9F888.4050609@oracle.com>
References: <56A9F888.4050609@oracle.com>
Message-ID: <148AD43A-07E3-4E55-B03A-A6915306DF57@oracle.com>

> http://cr.openjdk.java.net/~thartmann/8148460/webrev.00/

That looks good to me.

Roland.

From roland.westrelin at oracle.com  Thu Jan 28 14:22:19 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Thu, 28 Jan 2016 15:22:19 +0100
Subject: RFR(S): 6378256: Performance problem with System.identityHashCode
	in client compiler
In-Reply-To: <e979523e-07f1-4ab6-b7fe-0c54268f55fc@default>
References: <e979523e-07f1-4ab6-b7fe-0c54268f55fc@default>
Message-ID: <D0439CBA-7C8E-4FDD-BF1D-D3D359FAD328@oracle.com>

> With reference to below email thread, please send review comments for the revised patch for JDK-6378256.
> http://cr.openjdk.java.net/~thartmann/6378256/webrev.02/

That looks good. Can you justify the comments again?

Also the x86_64 and x86_32 are (mostly?) identical. Do we want to create a sharedRuntime_x86.cpp, move the InlineObjectHash code in its own function there to avoid duplication?

Roland.

From tobias.hartmann at oracle.com  Thu Jan 28 14:25:52 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 28 Jan 2016 15:25:52 +0100
Subject: [9] RFR(XS): 8148460: TestUnsafeUnalignedMismatchedAccesses.java
	fails: error: package jdk.internal.misc does not exist
In-Reply-To: <148AD43A-07E3-4E55-B03A-A6915306DF57@oracle.com>
References: <56A9F888.4050609@oracle.com>
	<148AD43A-07E3-4E55-B03A-A6915306DF57@oracle.com>
Message-ID: <56AA24F0.5010709@oracle.com>

Thanks, Roland.

Best,
Tobias

On 28.01.2016 15:11, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~thartmann/8148460/webrev.00/
> 
> That looks good to me.
> 
> Roland.
> 

From tobias.hartmann at oracle.com  Thu Jan 28 14:27:25 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 28 Jan 2016 15:27:25 +0100
Subject: [9] RFR(XS): 8148460: TestUnsafeUnalignedMismatchedAccesses.java
	fails: error: package jdk.internal.misc does not exist
In-Reply-To: <56A9F888.4050609@oracle.com>
References: <56A9F888.4050609@oracle.com>
Message-ID: <56AA254D.7020303@oracle.com>

Hi,

I was asked to push this directly to the Jake repo. Christian T. (CC'ed) will sponsor the change.

Thanks,
Tobias

On 28.01.2016 12:16, Tobias Hartmann wrote:
> Hi,
> 
> please review the following patch:
> 
> https://bugs.openjdk.java.net/browse/JDK-8148460
> http://cr.openjdk.java.net/~thartmann/8148460/webrev.00/
> 
> The test fails because it's missing the jtreg tag to load the 'jdk.internal.misc' module.
> 
> Thanks,
> Tobias
> 

From adinn at redhat.com  Thu Jan 28 14:30:35 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 28 Jan 2016 14:30:35 +0000
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
Message-ID: <56AA260B.8080101@redhat.com>

Hi Roland,

On 28/01/16 12:49, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8087341/webrev.00/
> 
> C2 currently doesn't optimize the field load in the following code:
> 
> static Object field;
> 
> static Object m(Object o) { field = o; return field; }
> 
> It should return o but instead loads the value back from memory. The
> reason it misses such a simple optimization is that the G1 post barrier
> has a memory barrier with a wide effect on the memory state. C2
> doesn't optimize this either:
> 
>     object.field = other_object;
>     object.field = other_object;
> 
> Same applies to -XX:+UseConcMarkSweepGC -XX:+UseCondCardMark
> 
> That memory barrier was added to have a memory barrier instruction
> and doesn't have to have a wide memory effect.

I think this looks ok -- not sure until I try it out. However, I /am/
fairly sure it is going to cause a problem for the AArch64 code which
optimizes volatile loads and stores. That's because it changes the
characteristic shape of the subgraph searched for by the predicates
which decide whether to i) generate loads + membars or ii) plant stlr or
ldar instructions.

I'll look into this asap.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Paul
Argiry (US)

From ivan at azulsystems.com  Thu Jan 28 14:51:16 2016
From: ivan at azulsystems.com (Ivan Krylov)
Date: Thu, 28 Jan 2016 17:51:16 +0300
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
	<45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
	<56A8BC9D.8060004@azulsystems.com>
	<6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com>
Message-ID: <56AA2AE4.2090803@azulsystems.com>

Hi Igor,

Following Vladimir's suggestion I eliminated the UseOnSpinWaitIntrinsic 
flag altogether. I have adopted the Matcher::match_rule_supported() 
logic - seems to work on intel, but I don't have any non-intel box to test.

Anyway, the new webrev:
http://cr.openjdk.java.net/~ikrylov/8147844.hs.01/
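
For context, here is a minimal sketch of the kind of spin loop the hint is meant for. It assumes the API in the form it later shipped in JDK 9, Thread.onSpinWait(); the webrev under review here still proposes java.lang.Runtime.onSpinWait(). The class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example class; not part of the webrev.
final class SpinWaitSketch {
    private final AtomicBoolean ready = new AtomicBoolean(false);

    // Busy-wait until another thread sets the flag. The volatile read on
    // every iteration is what makes the loop observe the update; the hint
    // only tells the runtime/CPU that this is a spin loop.
    void awaitReady() {
        while (!ready.get()) {
            Thread.onSpinWait(); // assumption: JDK 9+ form of the hint
        }
    }

    void signalReady() {
        ready.set(true);
    }

    // Runs one waiter thread and reports whether it finished in time.
    static boolean demo() {
        SpinWaitSketch s = new SpinWaitSketch();
        Thread waiter = new Thread(s::awaitReady);
        waiter.start();
        s.signalReady();
        try {
            waiter.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter finished: " + demo());
    }
}
```

On a platform without the intrinsic the call is just an empty method, so the loop behaves the same, only without the 'pause'.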

Igor, Vladimir, thanks,

Ivan

On 27/01/2016 22:03, Igor Veresov wrote:
> Actually, I'd rather use Matcher::match_rule_supported() to test if it's supported on the platform, rather than fixing all vm_version_*.* to check for the flag validity; that's tedious (you forgot x86-32 and there are going to be more platforms to fix for your sponsor). Something like UseOnSpinWaitIntrinsic && Matcher::match_rule_supported(Op_OnSpinWait) to decide whether or not to inline the intrinsic. Also, why are you not turning it on by default?
>
> igor
>
>> On Jan 27, 2016, at 4:48 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>
>> Looks like there was some good discussion while I was peacefully sleeping.
>> I don't have much to add. This patch was somewhat inspired by JEP-171 changes.
>> Perhaps, there are other ways to achieve the same semantics.
>>
>> So, if we can consider this reviewed - I will wait for the actual JEP to become targeted to 9 and then seek a sponsor to do the push.
>>
>> Thanks,
>>
>> Ivan
>>
>> On 27/01/2016 09:12, Igor Veresov wrote:
>>> I realize it's not a big deal. I was just wondering if there was any specific reason control alone is not enough.
>>> Anyways, looks ok for the first cut.
>>>
>>> igor
>>>
>>>> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com> wrote:
>>>>
>>>> Since a sensical loop that calls onSpinWait() would include at least a volatile load on every iteration (and possibly a volatile store), the new node does not create significant extra move restrictions that are not already there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop. There are probably other ways to achieve this, but this one doesn't really have a performance downside?
>>>>
>>>> -- Gil.
>>>>
>>>>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
>>>>>
>>>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that's intentional, I wonder why that is?
>>>>>
>>>>> igor
>>>>>
>>>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Some of you may have seen a few e-mails on the core-libs alias about a proposed "spin wait hint". The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>>>
>>>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>>>
>>>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:±UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is verification code that makes sure the flag is off; the VM will just execute the empty method java.lang.Runtime.onSpinWait(), effectively a no-op. According to [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence harmless, so there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>>>>
>>>>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Ivan
>>>>>>
>>>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops


From aleksey.shipilev at oracle.com  Thu Jan 28 15:04:05 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Thu, 28 Jan 2016 18:04:05 +0300
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
Message-ID: <56AA2DE5.5050008@oracle.com>

On 01/28/2016 03:49 PM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8087341/webrev.00/

This looks good, for most tests here:
  http://cr.openjdk.java.net/~shade/8087341/G1BackToBackStores.java

The generated code indeed shows commoned loads/stores with this patch,
and some other things that can be improved in the codegen -- I'll file
the separate issue(s) for that. I think this one is better to be renamed
to something more specific, e.g. "Overly wide StoreLoad barrier in G1
breaks load/store coalescing"?

On i7-4790K @ 4.0 GHz, Linux x86_64:

== Baseline:

Benchmark                    Mode  Cnt  Score   Error  Units
G1BackToBackStores.test_1    avgt   15  2.193 ± 0.037  ns/op
G1BackToBackStores.test_11   avgt   15  2.984 ± 0.076  ns/op
G1BackToBackStores.test_111  avgt   15  3.706 ± 0.017  ns/op
G1BackToBackStores.test_112  avgt   15  3.978 ± 0.078  ns/op
G1BackToBackStores.test_121  avgt   15  4.107 ± 0.028  ns/op
G1BackToBackStores.test_211  avgt   15  3.824 ± 0.186  ns/op

== Patched:

Benchmark                    Mode  Cnt  Score   Error  Units
G1BackToBackStores.test_1    avgt   15  2.184 ± 0.020  ns/op
G1BackToBackStores.test_11   avgt   15  2.790 ± 0.065  ns/op  // !
G1BackToBackStores.test_111  avgt   15  3.264 ± 0.008  ns/op  // !!!
G1BackToBackStores.test_112  avgt   15  3.640 ± 0.011  ns/op  // !
G1BackToBackStores.test_121  avgt   15  4.194 ± 0.033  ns/op
G1BackToBackStores.test_211  avgt   15  3.665 ± 0.415  ns/op  // !

Cheers,
-Aleksey

From jamsheed.c.m at oracle.com  Thu Jan 28 16:16:23 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Thu, 28 Jan 2016 21:46:23 +0530
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
Message-ID: <56AA3ED7.4030407@oracle.com>

Hi,

Please review the fix made for issue

bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/

Unit tests: none, as it's hard to write one

Other tests: jprt.

Description of the issue:
A valid pc match in the exception cache can return an invalid handler,
which causes the assert to fail.
This happens because ExceptionCache reads are lock-free.

As a fix, I have put a storestore memory barrier before the count
is updated.
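
The publication pattern described above can be sketched in Java rather than in HotSpot's C++. This is only an analogy under stated assumptions: PublishedCache and its fields are hypothetical names, the monitor stands in for the mutex that serializes writers, and VarHandle.storeStoreFence() plays the role of the storestore barrier placed before the count update.

```java
import java.lang.invoke.VarHandle;

// Hypothetical stand-in for a small append-only cache with lock-free readers.
final class PublishedCache {
    private static final int CACHE_SIZE = 4;
    private final long[] pcs = new long[CACHE_SIZE];
    private final long[] handlers = new long[CACHE_SIZE];
    private int count; // number of fully written entries

    // Writers are serialized by the monitor, like the mutex-protected cache update.
    synchronized boolean add(long pc, long handler) {
        int n = count;                 // read the count once
        if (n >= CACHE_SIZE) {
            return false;
        }
        pcs[n] = pc;
        handlers[n] = handler;
        VarHandle.storeStoreFence();   // entry stores must not be reordered after the count store
        count = n + 1;                 // publish the new entry
        return true;
    }

    // Lock-free reader: only looks at entries below the count it observed.
    long lookup(long pc) {
        int n = count;
        VarHandle.loadLoadFence();     // entry loads must not be reordered before the count load
        for (int i = 0; i < n; i++) {
            if (pcs[i] == pc) {
                return handlers[i];
            }
        }
        return 0L;                     // no match
    }
}
```

Without the fence, a reader on a weakly ordered machine could observe the incremented count before the handler store and return a stale handler for a matching pc. In ordinary Java code a volatile count would be the more conventional way to get the same ordering.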

Best Regards,
Jamsheed

From christian.thalinger at oracle.com  Thu Jan 28 18:45:37 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Thu, 28 Jan 2016 19:45:37 +0100
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AA3ED7.4030407@oracle.com>
References: <56AA3ED7.4030407@oracle.com>
Message-ID: <844981E3-4C91-4D52-BBDB-450D055C2599@oracle.com>

   if (count() < cache_size) {
     set_pc_at(count(),addr);
     set_handler_at(count(), handler);

Shouldn't we read count() only once into a local variable to rule out any odd race bugs down the road?

> On Jan 28, 2016, at 5:16 PM, Jamsheed C m <jamsheed.c.m at oracle.com> wrote:
> 
> Hi,
> 
> Please review the fix made for issue
> 
> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
> 
> Unit tests: As its hard, none
> 
> Other tests: jprt.
> 
> Description of the issue:
> A valid pc match in exception cache returning an invalid handler makes assert to fail.
> This happens as  ExceptionCache reads are lock free access.
> 
> As a fix for this i have put a storestore mem barrier before the count is updated.
> 
> Best Regards,
> Jamsheed


From jamsheed.c.m at oracle.com  Thu Jan 28 20:29:51 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Fri, 29 Jan 2016 01:59:51 +0530
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <844981E3-4C91-4D52-BBDB-450D055C2599@oracle.com>
References: <56AA3ED7.4030407@oracle.com>
	<844981E3-4C91-4D52-BBDB-450D055C2599@oracle.com>
Message-ID: <56AA7A3F.6040800@oracle.com>


On 1/29/2016 12:15 AM, Christian Thalinger wrote:
>     if (count() < cache_size) {
>       set_pc_at(count(),addr);
>       set_handler_at(count(), handler);
>
> Shouldn't we read count() only once into a local variable to rule out any odd race bugs down the road?

Writes to the cache are protected by a mutex lock, so this code is safe.

The issue is seen on machines with weak memory ordering: the lockless read
of exception cache values fails because the writes to the cache get reordered.

Best Regards,
Jamsheed
>
>> On Jan 28, 2016, at 5:16 PM, Jamsheed C m <jamsheed.c.m at oracle.com> wrote:
>>
>> Hi,
>>
>> Please review the fix made for issue
>>
>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>
>> Unit tests: As its hard, none
>>
>> Other tests: jprt.
>>
>> Description of the issue:
>> A valid pc match in exception cache returning an invalid handler makes assert to fail.
>> This happens as  ExceptionCache reads are lock free access.
>>
>> As a fix for this i have put a storestore mem barrier before the count is updated.
>>
>> Best Regards,
>> Jamsheed


From igor.veresov at oracle.com  Thu Jan 28 20:41:42 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 28 Jan 2016 12:41:42 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <56AA2AE4.2090803@azulsystems.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
	<45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
	<56A8BC9D.8060004@azulsystems.com>
	<6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com>
	<56AA2AE4.2090803@azulsystems.com>
Message-ID: <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com>

x86.ad:

It seems that the comment here is off:
1714     case Op_OnSpinWait:
1715       if (UseSSE < 2) // requires at least SSE4
1716         ret_value = false;
1717       break;

Also, we don't support CPUs with SSE < 2, so you don't have to make these changes to x86.ad. It's enough that has_match_rule(), which is called by Matcher::match_rule_supported(), will return true for Op_OnSpinWait.


x86_64.ad:
+instruct onspinwait()
+%{
+  match(OnSpinWait);
+  ins_cost(200);
...

Is there any reason this can't be moved to the generic x86.ad? It can be easily supported on 32bit as well, right (we do still support 32bit mode on linux)? The encoding is the same for both 32 and 64 bit modes, so that should be trivial.

library_call.cpp:

I think you forgot to actually call Matcher::match_rule_supported(). I think it should be something like:

bool LibraryCallKit::inline_onspinwait() {
  if (Matcher::match_rule_supported(Op_OnSpinWait)) {
     insert_mem_bar(Op_OnSpinWait);
     return true;
  }
  return false;
}


igor

> On Jan 28, 2016, at 6:51 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
> 
> Hi Igor,
> 
> Following Vladimir's suggestion I eliminated the UseOnSpinWaitIntrinsic flag altogether. I have adopted the Matcher::match_rule_supported() logic - seems to work on intel, but I don't have any non-intel box to test.
> 
> Anyway, the new webrev:
> http://cr.openjdk.java.net/~ikrylov/8147844.hs.01/
> 
> Igor, Vladimir, thanks,
> 
> Ivan
> 
> On 27/01/2016 22:03, Igor Veresov wrote:
>> Actually, I?d rather use Matcher::match_rule_supported() to test if it?s supported on the platform, rather than fixing all vm_version_*.* to check for the flag validity, that?s tedious (you forgot x86-32 and there?s going to be more platforms to fix for you sponsor). Something like UseOnSpinWaitIntrinsic && Matcher::match_rule_supported(Op_OnSpinWait) to decide whether or not to inline the intrinsic. Also, why are you not turning it on by default?
>> 
>> igor
>> 
>>> On Jan 27, 2016, at 4:48 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>> 
>>> Looks like there was some good discussion while I was peacefully sleeping.
>>> I don't have much to add. This patch was somewhat inspired by JEP-171 changes.
>>> Perhaps,there are other ways to achieve the same semantics.
>>> 
>>> So, if we can consider this reviewed - I will wait for the actual JEP to become targeted to 9 and then seek a sponsor to do the push.
>>> 
>>> Thanks,
>>> 
>>> Ivan
>>> 
>>> On 27/01/2016 09:12, Igor Veresov wrote:
>>>> I realize it?s not a big deal. I was just wondering if there was any specific reason control alone is not enough.
>>>> Anyways, looks ok for the first cut.
>>>> 
>>>> igor
>>>> 
>>>>> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com> wrote:
>>>>> 
>>>>> Since a sensical loop that calls onSpinWait() would include at least a volatile load on every iteration (and possibly a volatile store), the new node does not create significant extra move restrictions that are not already there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop. There are probably other ways to achieve this, but this one doesn't really have a performance downside?
>>>>> 
>>>>> ? Gil.
>>>>> 
>>>>>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
>>>>>> 
>>>>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that?s intentional I wonder why is that?
>>>>>> 
>>>>>> igor
>>>>>> 
>>>>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> Some of you may have a seen a few e-mails on the core-libs alias about a proposed ?spin wait hint?. The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>>>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>>>> 
>>>>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>>>>> 
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>>>> 
>>>>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:?UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is a verification code that makes sure the flag is off, VM will just execute at empty method java.lang.Runtime.onSpinWait() ? effectively a no-op. According the [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence harmless, there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>>>>> 
>>>>>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Ivan
>>>>>>> 
>>>>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
> 


From gil at azul.com  Wed Jan 27 05:24:42 2016
From: gil at azul.com (Gil Tene)
Date: Wed, 27 Jan 2016 05:24:42 +0000
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
Message-ID: <DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>

Since a sensical loop that calls onSpinWait() would include at least a volatile load on every iteration (and possibly a volatile store), the new node does not create significant extra move restrictions that are not already there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop. There are probably other ways to achieve this, but this one doesn't really have a performance downside?

-- Gil.
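
For reference, the kind of loop described above might look like the sketch below: a volatile load on every iteration, with the hint inside the loop body. It assumes the hint as it was eventually exposed in JDK 9, Thread.onSpinWait(); this thread still discusses it as java.lang.Runtime.onSpinWait(). SpinWaitDemo and its members are hypothetical names.

```java
// Hypothetical demo: one thread spins on a volatile flag that another thread sets.
final class SpinWaitDemo {
    private static volatile boolean ready;

    static boolean awaitFlag() {
        ready = false;
        Thread producer = new Thread(() -> {
            ready = true;            // volatile store observed by the spinning thread
        });
        producer.start();
        while (!ready) {             // volatile load on every iteration
            Thread.onSpinWait();     // hint only; has no effect on program semantics
        }
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ready;
    }
}
```

Because the loop already performs a volatile load each time around, the memory effect attached to the new node adds little extra restriction on how the compiler may move the surrounding code.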

> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
> 
> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that?s intentional I wonder why is that?
> 
> igor
> 
>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com <mailto:ivan at azulsystems.com>> wrote:
>> 
>> Hello,
>> 
>> Some of you may have a seen a few e-mails on the core-libs alias about a proposed ?spin wait hint?. The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832 <https://bugs.openjdk.java.net/browse/JDK-8147832>. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API changes can be seen at the webrev:
>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/>
>> 
>> At this time I would like to ask for a review of the hs-comp changes. The plan is push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>> 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844 <https://bugs.openjdk.java.net/browse/JDK-8147844>
>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/ <http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/>
>> 
>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:?UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is a verification code that makes sure the flag is off, VM will just execute at empty method java.lang.Runtime.onSpinWait() ? effectively a no-op. According the [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence harmless, there seems to be no need to add guarding code for older generations of Intel CPUs.
>> 
>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>> 
>> Thanks,
>> 
>> Ivan
>> 
>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops <https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops>


From vladimir.kozlov at oracle.com  Thu Jan 28 23:45:12 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 28 Jan 2016 15:45:12 -0800
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
Message-ID: <56AAA808.6090604@oracle.com>

The G1 barrier was added by Mikael Gerdin from the GC team. He should also look at this change.

https://bugs.openjdk.java.net/browse/JDK-8014555

Also, we have the specialized insert_mem_bar_volatile() for when we don't want a wide memory effect. Why not use it?
And we need to keep the precedence edge link to the oop store in case EA eliminates the related allocation.

Thanks,
Vladimir

On 1/28/16 4:49 AM, Roland Westrelin wrote:
> http://cr.openjdk.java.net/~roland/8087341/webrev.00/
>
> C2 currently doesn't optimize the field load in the following code:
>
>      static Object field;
>
>      static Object m(Object o) {
>          field = o;
>          return field;
>      }
>
> It should return o but instead loads the value back from memory. The reason it misses such a simple optimization is that the G1 post barrier has a memory barrier with a wide effect on the memory state. C2 doesn't optimize this either:
>
>          object.field = other_object;
>          object.field = other_object;
>
> Same applies to -XX:+UseConcMarkSweepGC -XX:+UseCondCardMark
>
> That memory barrier was added to have a memory barrier instruction and doesn't have to have a wide memory effect.
>
> Roland.
>

From vladimir.kozlov at oracle.com  Fri Jan 29 00:34:28 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 28 Jan 2016 16:34:28 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <56AA2AE4.2090803@azulsystems.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
	<45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
	<56A8BC9D.8060004@azulsystems.com>
	<6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com>
	<56AA2AE4.2090803@azulsystems.com>
Message-ID: <56AAB394.2080404@oracle.com>

First, it was Igor's suggestion about match_rule_supported().

Why do you check the intrinsic in inline_native_Class_query()?

I don't see the match_rule_supported() check for this intrinsic (c2compiler.cpp).

I think you should consider implementing this for C1 and the interpreter, since Tiered Compilation is on by default,
so that the Client VM can benefit too.

Change test:

1. Don't use /othervm since you fork a separate process and don't use flags.

2. Don't use the "-server" flag - the test could be run on a Client VM, which does not have the server compiler.

3. Don't use -Xcomp - it will time out on slow machines. Create a separate test() method to be compiled:

          public static void main(final String[] args) throws Exception {
              int end = 20_000;

              for (int i=0; i < end; i++) {
                  test();
              }
          }

          static void test() {
              java.lang.Runtime.onSpinWait();
          }

Thanks,
Vladimir

On 1/28/16 6:51 AM, Ivan Krylov wrote:
> Hi Igor,
>
> Following Vladimir's suggestion I eliminated the UseOnSpinWaitIntrinsic flag altogether. I have adopted the
> Matcher::match_rule_supported() logic - seems to work on intel, but I don't have any non-intel box to test.
>
> Anyway, the new webrev:
> http://cr.openjdk.java.net/~ikrylov/8147844.hs.01/
>
> Igor, Vladimir, thanks,
>
> Ivan
>
> On 27/01/2016 22:03, Igor Veresov wrote:
>> Actually, I?d rather use Matcher::match_rule_supported() to test if it?s supported on the platform, rather than fixing
>> all vm_version_*.* to check for the flag validity, that?s tedious (you forgot x86-32 and there?s going to be more
>> platforms to fix for you sponsor). Something like UseOnSpinWaitIntrinsic &&
>> Matcher::match_rule_supported(Op_OnSpinWait) to decide whether or not to inline the intrinsic. Also, why are you not
>> turning it on by default?
>>
>> igor
>>
>>> On Jan 27, 2016, at 4:48 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>
>>> Looks like there was some good discussion while I was peacefully sleeping.
>>> I don't have much to add. This patch was somewhat inspired by JEP-171 changes.
>>> Perhaps,there are other ways to achieve the same semantics.
>>>
>>> So, if we can consider this reviewed - I will wait for the actual JEP to become targeted to 9 and then seek a sponsor
>>> to do the push.
>>>
>>> Thanks,
>>>
>>> Ivan
>>>
>>> On 27/01/2016 09:12, Igor Veresov wrote:
>>>> I realize it?s not a big deal. I was just wondering if there was any specific reason control alone is not enough.
>>>> Anyways, looks ok for the first cut.
>>>>
>>>> igor
>>>>
>>>>> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com> wrote:
>>>>>
>>>>> Since a sensical loop that calls onSpinWait() would include at least a volatile load on every iteration (and
>>>>> possibly a volatile store), the new node does not create significant extra move restrictions that are not already
>>>>> there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop.
>>>>> There are probably other ways to achieve this, but this one doesn't really have a performance downside?
>>>>>
>>>>> ? Gil.
>>>>>
>>>>>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
>>>>>>
>>>>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in
>>>>>> your loop, right? If that?s intentional I wonder why is that?
>>>>>>
>>>>>> igor
>>>>>>
>>>>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Some of you may have a seen a few e-mails on the core-libs alias about a proposed ?spin wait hint?. The JEP is
>>>>>>> forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a consensus on the API
>>>>>>> side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API
>>>>>>> changes can be seen at the webrev:
>>>>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>>>>
>>>>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is push changes into class
>>>>>>> libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>>>>>
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>>>>
>>>>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic
>>>>>>> that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:?UseOnSpinWaitIntrinsic
>>>>>>> flag. For non-x86 platforms there is a verification code that makes sure the flag is off, VM will just execute at
>>>>>>> empty method java.lang.Runtime.onSpinWait() ? effectively a no-op. According the [1] the 'pause' instruction is
>>>>>>> functional since SSE2, but even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence harmless,
>>>>>>> there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>>>>>
>>>>>>> The proposed patch includes a simple regression test that simply makes sure that method
>>>>>>> java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance
>>>>>>> tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about
>>>>>>> the process.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Ivan
>>>>>>>
>>>>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>

From vladimir.kozlov at oracle.com  Fri Jan 29 00:52:26 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 28 Jan 2016 16:52:26 -0800
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AA7A3F.6040800@oracle.com>
References: <56AA3ED7.4030407@oracle.com>
	<844981E3-4C91-4D52-BBDB-450D055C2599@oracle.com>
	<56AA7A3F.6040800@oracle.com>
Message-ID: <56AAB7CA.4000604@oracle.com>

On 1/28/16 12:29 PM, Jamsheed C m wrote:
>
>
> On 1/29/2016 12:15 AM, Christian Thalinger wrote:
>>     if (count() < cache_size) {
>>       set_pc_at(count(),addr);
>>       set_handler_at(count(), handler);
>>
>> Shouldn't we read count() only once into a local variable to rule out any odd race bugs down the road?

+1. As I understand it, Chris is suggesting doing this in addition to the storestore barrier.

Do we have other similar code?

Thanks,
Vladimir

>
> write to cache is mutex lock protected. so this code is safe.
>
> Issue is seen in weak memory order machines.  lockless read of exception cache values fails as writes in cache get
> reordered.
>
> Best Regards,
> Jamsheed
>>
>>> On Jan 28, 2016, at 5:16 PM, Jamsheed C m <jamsheed.c.m at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> Please review the fix made for issue
>>>
>>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>>
>>> Unit tests: As its hard, none
>>>
>>> Other tests: jprt.
>>>
>>> Description of the issue:
>>> A valid pc match in exception cache returning an invalid handler makes assert to fail.
>>> This happens as  ExceptionCache reads are lock free access.
>>>
>>> As a fix for this i have put a storestore mem barrier before the count is updated.
>>>
>>> Best Regards,
>>> Jamsheed
>

From igor.veresov at oracle.com  Fri Jan 29 01:48:48 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 28 Jan 2016 17:48:48 -0800
Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the
	corresponding x86 hotspot instrinsic
In-Reply-To: <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com>
References: <56A751AE.9090203@azulsystems.com>
	<A3061E35-6AAA-4D5D-BE88-5C1E33B1E439@oracle.com>
	<DB160C5D-C421-4B63-89A7-286EFFF5D751@azul.com>
	<45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com>
	<56A8BC9D.8060004@azulsystems.com>
	<6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com>
	<56AA2AE4.2090803@azulsystems.com>
	<2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com>
Message-ID: <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com>


> On Jan 28, 2016, at 12:41 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
> 
> x86.ad:
> 
> It seems that the comment here is off:
> 1714     case Op_OnSpinWait:
> 1715       if (UseSSE < 2) // requires at least SSE4
> 1716         ret_value = false;
> 1717       break;
> 
> Also we don?t support CPUs with SSE < 2, so you don?t have to make these changes to x86.ad. It?s enough that has_match_rule(), that is called by  Matcher::match_rule_supported(), will return true for Op_OnSpinWait.
> 
> 
> x86_64.ad:
> +instruct onspinwait()
> +%{
> +  match(OnSpinWait);
> +  ins_cost(200);
> ...
> 
> Is there any reason this can?t be moved to generic x86.ad ? It can be easily supported on 32bit as well, right (we do still support 32bit mode on linux)? The encoding is the same for both 32 and 64 bit modes, so that should be trivial.
> 
> library_call.cpp:
> 
> I think you forgot to actually call Matcher::match_rule_supported(). I think it should be something like:
> 
> bool LibraryCallKit::inline_onspinwait() {
>  if (Matcher::match_rule_supported(Op_OnSpinWait) {
>     insert_mem_bar(Op_OnSpinWait);
>     return true;
>  }
>  return false;
> }
> 

As Vladimir suggested, it's better to check Matcher::match_rule_supported() in c2compiler.cpp in is_intrinsic_supported(). Sorry about the confusion. I stand by the other comments though.

igor

> 
> igor
> 
>> On Jan 28, 2016, at 6:51 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>> 
>> Hi Igor,
>> 
>> Following Vladimir's suggestion I eliminated the UseOnSpinWaitIntrinsic flag altogether. I have adopted the Matcher::match_rule_supported() logic - seems to work on intel, but I don't have any non-intel box to test.
>> 
>> Anyway, the new webrev:
>> http://cr.openjdk.java.net/~ikrylov/8147844.hs.01/
>> 
>> Igor, Vladimir, thanks,
>> 
>> Ivan
>> 
>> On 27/01/2016 22:03, Igor Veresov wrote:
>>> Actually, I?d rather use Matcher::match_rule_supported() to test if it?s supported on the platform, rather than fixing all vm_version_*.* to check for the flag validity, that?s tedious (you forgot x86-32 and there?s going to be more platforms to fix for you sponsor). Something like UseOnSpinWaitIntrinsic && Matcher::match_rule_supported(Op_OnSpinWait) to decide whether or not to inline the intrinsic. Also, why are you not turning it on by default?
>>> 
>>> igor
>>> 
>>>> On Jan 27, 2016, at 4:48 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>> 
>>>> Looks like there was some good discussion while I was peacefully sleeping.
>>>> I don't have much to add. This patch was somewhat inspired by JEP-171 changes.
>>>> Perhaps,there are other ways to achieve the same semantics.
>>>> 
>>>> So, if we can consider this reviewed - I will wait for the actual JEP to become targeted to 9 and then seek a sponsor to do the push.
>>>> 
>>>> Thanks,
>>>> 
>>>> Ivan
>>>> 
>>>> On 27/01/2016 09:12, Igor Veresov wrote:
>>>>> I realize it?s not a big deal. I was just wondering if there was any specific reason control alone is not enough.
>>>>> Anyways, looks ok for the first cut.
>>>>> 
>>>>> igor
>>>>> 
>>>>>> On Jan 26, 2016, at 9:24 PM, Gil Tene <gil at azul.com> wrote:
>>>>>> 
>>>>>> Since a sensical loop that calls onSpinWait() would include at least a volatile load on every iteration (and possibly a volatile store), the new node does not create significant extra move restrictions that are not already there. Modeling this with a memory effect is one simple way to prevent it from being re-ordered out of the loop. There are probably other ways to achieve this, but this one doesn't really have a performance downside?
>>>>>> 
>>>>>> ? Gil.
>>>>>> 
>>>>>>> On Jan 26, 2016, at 4:44 PM, Igor Veresov <igor.veresov at oracle.com> wrote:
>>>>>>> 
>>>>>>> So, why does the new node have a memory effect? That would seem to prevent any movement of the subsequent loads in your loop, right? If that?s intentional I wonder why is that?
>>>>>>> 
>>>>>>> igor
>>>>>>> 
>>>>>>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>>>>>>> 
>>>>>>>> Hello,
>>>>>>>> 
>>>>>>>> Some of you may have a seen a few e-mails on the core-libs alias about a proposed ?spin wait hint?. The JEP is forming up nicely at  https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a consensus on the API side. It is now in a draft state and I hope this JEP will get targeted for java 9 shortly.  The upcoming API changes can be seen at the webrev:
>>>>>>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>>>>>>> 
>>>>>>>> At this time I would like to ask for a review of the hs-comp changes. The plan is push changes into class libraries and hotspot synchronously but that may happen after the JEP gets targeted.
>>>>>>>> 
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>>>>>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>>>>>>> 
>>>>>>>> The idea of the fix is pretty simple: hotspot replaces a call to java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a 'pause' instruction on x86.  This intrinsic is guarded by the -XX:?UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is a verification code that makes sure the flag is off, VM will just execute at empty method java.lang.Runtime.onSpinWait() ? effectively a no-op. According the [1] the 'pause' instruction is functional since SSE2, but even on CPUs prior to SSE2 the  'pause' instruction is a no-op and hence harmless, there seems to be no need to add guarding code for older generations of Intel CPUs.
>>>>>>>> 
>>>>>>>> The proposed patch includes a simple regression test that simply makes sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There are several other producer-consumer-like performance tests ready that the authors of this JEP would be happy to make available under JEP-230 but I am uncertain about the process.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Ivan
>>>>>>>> 
>>>>>>>> [1]  - https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>> 
> 


From igor.ignatyev at oracle.com  Fri Jan 29 01:49:28 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 29 Jan 2016 04:49:28 +0300
Subject: RFR(M) : 8134102 : [TESTBUG]
	compiler/unsafe/UnsafeGetConstantField.java test fails in Jake
Message-ID: <B6D066BC-1EBD-48A6-8F21-72CFB7D7ADD2@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8134102/webrev.00/
> 134 lines changed: 84 ins; 15 del; 35 mod;
Hi all,

could you please review the patch for compiler/unsafe/UnsafeGetConstantField.java test?

The test fails in jake because the test class is in the java.lang.invoke package, which is already defined in the java.base module. Instead of using the "patch" mechanism, which allows adding classes to existing modules, I decided to remove direct usage of package-private members from j.l.i:
 - @DontInline was replaced by a corresponding -XX:CompileCommand
 - direct usage of Stable.class was replaced w/ Class.forName
 - UnsafeGetConstantField was moved from the java.lang.invoke package to compile.unsafe, thus all the nested classes used from generated tests were made public
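
The Class.forName replacement mentioned in the list above can be illustrated with a small sketch. It is not the code from the webrev: AnnotationByName, Holder and the helper methods are hypothetical, and java.lang.Deprecated stands in for the annotation that the test can no longer reference at compile time.

```java
import java.lang.annotation.Annotation;
import java.lang.reflect.Field;

// Hypothetical sketch: look up an annotation type by name instead of via a class literal.
final class AnnotationByName {
    static class Holder {
        @Deprecated
        static int marked;
        static int unmarked;
    }

    // Returns true if the field carries the annotation with the given class name.
    static boolean hasAnnotation(Field field, String annotationClassName) {
        try {
            Class<? extends Annotation> type =
                    Class.forName(annotationClassName).asSubclass(Annotation.class);
            return field.isAnnotationPresent(type);
        } catch (ClassNotFoundException e) {
            return false;            // annotation type is not available in this runtime
        }
    }

    // Convenience lookup that wraps the checked reflection exception.
    static Field field(String name) {
        try {
            return Holder.class.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Resolving the type by name means the calling class has no compile-time dependency on a package-private type, which is the property the test needs once it lives outside java.lang.invoke.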

Besides the changes for jake, I also slightly modified the test (originally to be sure that the test still checks what it is supposed to):
 - for getObject* tests, a String constant is used as the field value instead of 'new Object()'
 - added checks that Test::testDirect/testUnsafe return the prev. value even after the field's value was changed. This check fails for Unsafe::getCharUnaligned if the JVM is started w/ -XX:-UseUnalignedAccesses. I've filed a bug for that (JDK-8148518) and temporarily disabled the failing check
 - in case of failure, the generated class is dumped into workdir

testing: run the test against 2016-01-26 jake nightly build

JDK-8134102 : http://bugs.openjdk.java.net/browse/JDK-8134102
JDK-8148518 : http://bugs.openjdk.java.net/browse/JDK-8148518

PS the patch will be integrated thru jigsaw/jake repo

Thanks,
-- Igor

From dean.long at oracle.com  Fri Jan 29 04:10:34 2016
From: dean.long at oracle.com (Dean Long)
Date: Thu, 28 Jan 2016 20:10:34 -0800
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AA3ED7.4030407@oracle.com>
References: <56AA3ED7.4030407@oracle.com>
Message-ID: <56AAE63A.4060905@oracle.com>

As you noticed, for this kind of bug the memory is going to be consistent
by the time the core file is written.
So to help debug this assert if it happens again, could you change it to
something like:

#ifdef ASSERT
     address computed_address =
         SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
                                                     force_unwind, true);
     vmassert(handler_address == computed_address, PTR_FORMAT " != " PTR_FORMAT,
              p2i(handler_address), p2i(computed_address));
#endif
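
The same idea can be sketched outside HotSpot: compute the expected value once into a local and put both addresses into the failure message, so they are still visible in the log after memory has become consistent. This is only an illustration; describe_mismatch is a hypothetical helper, and snprintf with "%p" stands in for PTR_FORMAT and p2i.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of HotSpot): returns an empty string when the
// two addresses agree, otherwise a message that records both values.
std::string describe_mismatch(const void* handler_address,
                              const void* computed_address) {
  if (handler_address == computed_address) {
    return std::string();
  }
  char buf[96];
  // "%p" expects a void*, so the const qualifier is cast away for printing.
  std::snprintf(buf, sizeof(buf), "%p != %p",
                const_cast<void*>(handler_address),
                const_cast<void*>(computed_address));
  return std::string(buf);
}
```

A caller would log or assert on the returned text only when it is non-empty.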

dl

On 1/28/2016 8:16 AM, Jamsheed C m wrote:
> Hi,
>
> Please review the fix made for issue
>
> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>
> Unit tests: none, as one is hard to write
>
> Other tests: jprt.
>
> Description of the issue:
> A valid pc match in the exception cache returning an invalid handler
> makes the assert fail.
> This happens because ExceptionCache reads are lock-free.
>
> As a fix, I have put a storestore memory barrier before the count
> is updated.
>
> Best Regards,
> Jamsheed


From jamsheed.c.m at oracle.com  Fri Jan 29 06:36:24 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Fri, 29 Jan 2016 12:06:24 +0530
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AAE63A.4060905@oracle.com>
References: <56AA3ED7.4030407@oracle.com> <56AAE63A.4060905@oracle.com>
Message-ID: <56AB0868.2080307@oracle.com>

Hi Dean,

On 1/29/2016 9:40 AM, Dean Long wrote:
> As you noticed, for this kind of bug the memory is going to be consistent
> by the time the core file is written.
> So to help debug this assert if it happens again, could you change it
> to something like:
>
> #ifdef ASSERT
>     address computed_address = 
> SharedRuntime::compute_compiled_exc_handler(nm, pc, exception, 
> force_unwind, true);
>     vmassert(handler_address == computed_address, PTR_FORMAT " != " 
> PTR_FORMAT, p2i(handler_address), p2i(computed_address));
> #endif
I got the handler_address value in this case. This value was inconsistent
with the value in ExceptionCache.
It still had its initial value, which was helpful in figuring out what
went wrong.

I will make this change.

Best Regards,
Jamsheed
>
> dl
>
> On 1/28/2016 8:16 AM, Jamsheed C m wrote:
>> Hi,
>>
>> Please review the fix made for issue
>>
>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>
>> Unit tests: none, as one is hard to write
>>
>> Other tests: jprt.
>>
>> Description of the issue:
>> A valid pc match in the exception cache returning an invalid handler
>> makes the assert fail.
>> This happens because ExceptionCache reads are lock-free.
>>
>> As a fix, I have put a storestore memory barrier before the
>> count is updated.
>>
>> Best Regards,
>> Jamsheed
>


From jamsheed.c.m at oracle.com  Fri Jan 29 07:09:27 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Fri, 29 Jan 2016 12:39:27 +0530
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AAB7CA.4000604@oracle.com>
References: <56AA3ED7.4030407@oracle.com>
	<844981E3-4C91-4D52-BBDB-450D055C2599@oracle.com>
	<56AA7A3F.6040800@oracle.com> <56AAB7CA.4000604@oracle.com>
Message-ID: <56AB1027.4050402@oracle.com>

Hi Vladimir,

On 1/29/2016 6:22 AM, Vladimir Kozlov wrote:
> On 1/28/16 12:29 PM, Jamsheed C m wrote:
>>
>>
>> On 1/29/2016 12:15 AM, Christian Thalinger wrote:
>>>     if (count() < cache_size) {
>>>       set_pc_at(count(),addr);
>>>       set_handler_at(count(), handler);
>>>
>>> Shouldn't we read count() only once into a local variable to rule
>>> out any odd race bugs down the road?
>
> +1. As I understand it, Chris is suggesting doing that in addition to
> the storestore barrier.
Ok.
>
>
> Do we have other similar code?

I am not sure, let me check.

Best Regards,
Jamsheed
>
> Thanks,
> Vladimir
>
>>
>> Writes to the cache are protected by a mutex lock, so this code is safe.
>>
>> The issue is seen on machines with weak memory ordering. The lockless read of
>> exception cache values fails because the writes to the cache get
>> reordered.
>>
>> Best Regards,
>> Jamsheed
>>>
>>>> On Jan 28, 2016, at 5:16 PM, Jamsheed C m <jamsheed.c.m at oracle.com> 
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Please review the fix made for issue
>>>>
>>>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>>>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>>>
>>>> Unit tests: none, as one is hard to write
>>>>
>>>> Other tests: jprt.
>>>>
>>>> Description of the issue:
>>>> A valid pc match in the exception cache returning an invalid handler
>>>> makes the assert fail.
>>>> This happens because ExceptionCache reads are lock-free.
>>>>
>>>> As a fix, I have put a storestore memory barrier before the
>>>> count is updated.
>>>>
>>>> Best Regards,
>>>> Jamsheed
>>
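
The publication pattern discussed in this thread can be sketched in portable C++ as follows. This is a simplified stand-in and not the HotSpot ExceptionCache: the class name, the cache size and the use of a C++11 release store and acquire load (instead of HotSpot's storestore barrier) are assumptions made for the illustration. It reads the count once into a local, fills the entry, and only then publishes the new count.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Simplified stand-in for an exception cache; names and sizes are hypothetical.
class HandlerCache {
 public:
  static constexpr int kCacheSize = 4;  // arbitrary size for the sketch

  // Writer side. As in the thread, writers are assumed to be serialized by a
  // lock held by the caller, so only one add() runs at a time.
  bool add(uintptr_t pc, uintptr_t handler) {
    const int index = count_.load(std::memory_order_relaxed);  // read once
    if (index >= kCacheSize) {
      return false;
    }
    pc_[index] = pc;
    handler_[index] = handler;
    // Release store: the entry contents are visible before the new count is.
    count_.store(index + 1, std::memory_order_release);
    return true;
  }

  // Lock-free reader. The acquire load pairs with the release store above, so
  // every entry below the observed count is fully written. Returns 0 on a miss.
  uintptr_t lookup(uintptr_t pc) const {
    const int n = count_.load(std::memory_order_acquire);
    for (int i = 0; i < n; i++) {
      if (pc_[i] == pc) {
        return handler_[i];
      }
    }
    return 0;
  }

 private:
  uintptr_t pc_[kCacheSize] = {};
  uintptr_t handler_[kCacheSize] = {};
  std::atomic<int> count_{0};
};
```

Without the ordering on the count update, a reader on a weakly ordered machine could observe the incremented count and a matching pc while still seeing the initial handler value, which is the failure described above.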


From mikael.gerdin at oracle.com  Fri Jan 29 09:17:49 2016
From: mikael.gerdin at oracle.com (Mikael Gerdin)
Date: Fri, 29 Jan 2016 10:17:49 +0100
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <56AAA808.6090604@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AAA808.6090604@oracle.com>
Message-ID: <56AB2E3D.9030501@oracle.com>

Hi,

On 2016-01-29 00:45, Vladimir Kozlov wrote:
> G1 barrier was added by Mikael Gerdin from GC. He should also look at
> this change.

I don't have enough C2 knowledge to decode exactly what Roland's changes 
achieve, but I can attempt to describe what I needed to achieve with the 
Op_MemBarVolatile:

In the assignment

o.f = a;

G1 needs a post-barrier of the form:

o.f = a;
if (card_for(&o.f) != 32) {
   #StoreLoad
   if (card_for(&o.f) != 0) {
     card_for(&o.f) = 0
   }
}

The #StoreLoad is needed to force the second card table load to not get 
reordered with the store of the field.

The first load from the card table and the check for 32 is an 
optimization, where we know that the value 32 is idempotent, it will not 
change outside of safepoints.

The second load from the card table must not be allowed to occur after 
we know that other threads see the value "a" in o.f, otherwise a 
concurrent refinement thread can see the old value of o.f and we will 
crash in interesting ways later on...
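
As a rough, standalone illustration of the barrier shape described above (not the code C2 emits), the sequence can be modelled with C++11 atomics. The card is passed in directly instead of being computed from the field address, the constant names are hypothetical (the message only gives the values 32 and 0), and a sequentially consistent fence is assumed to play the role of #StoreLoad.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr uint8_t kFilteredCard = 32;  // value from the message; name is made up
constexpr uint8_t kDirtyCard = 0;      // value from the message; name is made up
constexpr uint8_t kCleanCard = 0xff;   // assumption for the sketch only

// Stores 'value' into 'field' and runs the post-barrier on 'card'.
// Returns true if this call dirtied the card.
template <typename T>
bool store_with_post_barrier(std::atomic<T*>& field, T* value,
                             std::atomic<uint8_t>& card) {
  field.store(value, std::memory_order_relaxed);  // o.f = a

  // First card load: cheap filter for cards that need no further work.
  if (card.load(std::memory_order_relaxed) == kFilteredCard) {
    return false;
  }

  // StoreLoad: the second card load must not be reordered before the store
  // to the field.
  std::atomic_thread_fence(std::memory_order_seq_cst);

  if (card.load(std::memory_order_relaxed) != kDirtyCard) {
    card.store(kDirtyCard, std::memory_order_relaxed);
    return true;
  }
  return false;
}
```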

/Mikael

>
> https://bugs.openjdk.java.net/browse/JDK-8014555
>
> Also we have specialized insert_mem_bar_volatile() if we don't want wide
> memory effect. Why not use it?
> And we need to keep precedent edge link to oop store in case EA
> eliminates related allocation.
>
> Thanks,
> Vladimir
>
> On 1/28/16 4:49 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8087341/webrev.00/
>>
>> C2 currently doesn't optimize the field load in the following code:
>>
>>      static Object field;
>>
>>      static Object m(Object o) {
>>          field = o;
>>          return field;
>>      }
>>
>> It should return o but instead loads the value back from memory. The
>> reason it misses such simple optimization is that the G1 post barrier
>> has a memory barrier with a wide effect on the memory state. C2
>> doesn't optimize this either:
>>
>>          object.field = other_object;
>>          object.field = other_object;
>>
>> Same applies to -XX:+UseConcMarkSweepGC -XX:+UseCondCardMark
>>
>> That memory barrier was added to have a memory barrier instruction and
>> doesn't have to have a wide memory effect.
>>
>> Roland.
>>


From vladimir.x.ivanov at oracle.com  Fri Jan 29 12:24:09 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 29 Jan 2016 15:24:09 +0300
Subject: RFR(M) : 8134102 : [TESTBUG]
	compiler/unsafe/UnsafeGetConstantField.java test fails in Jake
In-Reply-To: <B6D066BC-1EBD-48A6-8F21-72CFB7D7ADD2@oracle.com>
References: <B6D066BC-1EBD-48A6-8F21-72CFB7D7ADD2@oracle.com>
Message-ID: <56AB59E9.9000606@oracle.com>

Overall, looks good.

One request: JDK-8148518 is caused by field and getter type mismatch [1] 
(char vs short). Such behavior is expected, since char & short loads 
aren't interchangeable [2].
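
To illustrate why the two loads are not interchangeable: a Java short is sign-extended when widened, while a char is zero-extended, so folding a constant through the wrong type changes the result for values with the top bit set. The sketch below models this in C++ with int16_t and uint16_t; the function names are hypothetical and this is not the JIT's folding logic.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads 16 bits and widens them the way a Java 'short' load would
// (sign extension).
int32_t load_as_short(const unsigned char* p) {
  int16_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Reads the same 16 bits and widens them the way a Java 'char' load would
// (zero extension).
int32_t load_as_char(const unsigned char* p) {
  uint16_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}
```

For the bit pattern 0xFFFF the first function yields -1 and the second 65535, while small positive values such as 0x0101 agree.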

There are different ways to fix that particular case (add new intrinsic 
or enhance constant folding logic to take the cast into account), but 
for now, please, change the filter to ignore CharUnaligned and add a 
comment with bug id (JDK-8148518):
+            if (!hasDefaultValue && (stable || g.isFinal())) {

Best regards,
Vladimir Ivanov

[1] jdk/src/java.base/share/classes/jdk/internal/misc/Unsafe.java:
     @HotSpotIntrinsicCandidate
     public final char getCharUnaligned(Object o, long offset) {
         return (char)getShortUnaligned(o, offset);
     }

[2] 
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-June/018322.html
"I spotted a bug when field and accessor types mismatch, but the JIT 
still constant-folds the load. The fix made expected result detection 
even more complex, so I decided to get rid of it & WhiteBox hooks 
altogether. The test exercises different code paths and compares 
returned values now."

On 1/29/16 4:49 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8134102/webrev.00/
>> 134 lines changed: 84 ins; 15 del; 35 mod;
> Hi all,
>
> could you please review the patch for compiler/unsafe/UnsafeGetConstantField.java test?
>
> The test fails in jake because the test class is in the java.lang.invoke package, which is already defined in the java.base module. Instead of using the "patch" mechanism, which allows adding classes to existing modules, I decided to remove direct usage of package-private members from j.l.i:
>   - @DontInline was replaced by a corresponding -XX:CompileCommand
>   - direct usage of Stable.class was replaced w/ Class.forName
>   - UnsafeGetConstantField is moved from java.lang.invoke package to compile.unsafe, thus all the nested classes used from generated tests are made public
>
> Besides the changes for jake, I also slightly modified the test (originally to be sure that the test still checks what it is supposed to):
>   - for getObject* tests, a String constant is used as the field value instead of 'new Object()'
>   - added checks that Test::testDirect/testUnsafe return prev. value even after field's value was changed. This check fails for Unsafe::getCharUnaligned if the JVM is started w/ -XX:-UseUnalignedAccesses. I've filed a bug for that (JDK-8148518) and temporarily disabled the failing check
>   - in case of failure, the generated class is dumped into workdir
>
> testing: run the test against 2016-01-26 jake nightly build
>
> JDK-8134102 : http://bugs.openjdk.java.net/browse/JDK-8134102
> JDK-8148518 : http://bugs.openjdk.java.net/browse/JDK-8148518
>
> PS the patch will be integrated thru jigsaw/jake repo
>
> Thanks,
> -- Igor
>

From roland.westrelin at oracle.com  Fri Jan 29 13:27:23 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 29 Jan 2016 14:27:23 +0100
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <56AA260B.8080101@redhat.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA260B.8080101@redhat.com>
Message-ID: <434839E5-8AB1-4FEC-BDD7-AD30ABBD6C76@oracle.com>

Hi Andrew,

> I think this looks ok -- not sure until I try it out. However, I /am/
> fairly sure it is going to cause a problem for the AArch64 code which
> optimizes volatile loads and stores. That's because it changes the
> characteristic shape of the subgraph searched for by the predicates
> which decide whether to i) generate loads + membars or ii) plant stlr or
> ldar instructions.
> 
> I'll look into this asap.

Thanks for looking at this. I'll wait to hear back from you until I move forward with this change.

Roland.


From aleksey.shipilev at oracle.com  Fri Jan 29 13:28:12 2016
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Fri, 29 Jan 2016 16:28:12 +0300
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <56AA2DE5.5050008@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AA2DE5.5050008@oracle.com>
Message-ID: <56AB68EC.6030801@oracle.com>

On 01/28/2016 06:04 PM, Aleksey Shipilev wrote:
> On 01/28/2016 03:49 PM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8087341/webrev.00/
> 
> This looks good, for most tests here:
>   http://cr.openjdk.java.net/~shade/8087341/G1BackToBackStores.java
> 
> The generated code indeed shows commoned loads/stores with this patch,
> and some other things that can be improved in the codegen -- I'll file
> the separate issue(s) for that. I think this one is better to be renamed
> to something more specific, e.g. "Overly wide StoreLoad barrier in G1
> breaks load/store coalescing"?

Found an even more convincing example:
 http://cr.openjdk.java.net/~shade/8087341/G1LoopStores.java

Happy to re-run once Roland has the patch with Vladimir's comments.

Cheers,
-Aleksey



From roland.westrelin at oracle.com  Fri Jan 29 14:04:39 2016
From: roland.westrelin at oracle.com (Roland Westrelin)
Date: Fri, 29 Jan 2016 15:04:39 +0100
Subject: RFR(S): 8087341: C2 doesn't optimize redundant memory operations
	with G1
In-Reply-To: <56AAA808.6090604@oracle.com>
References: <F1EBA70C-852F-4EEA-9C53-58FC4849FD2D@oracle.com>
	<56AAA808.6090604@oracle.com>
Message-ID: <550334AB-0A58-41A7-B00A-974EBA3F8B0B@oracle.com>

Hi Vladimir,

Thanks for looking at this.

> G1 barrier was added by Mikael Gerdin from GC. He should also look at this change.
> 
> https://bugs.openjdk.java.net/browse/JDK-8014555
> 
> Also we have specialized insert_mem_bar_volatile() if we don't want wide memory effect. Why not use it?

The membar in the change takes the entire memory state as input but only changes raw memory. I don't think that can be achieved with insert_mem_bar_volatile(). As explained by Mikael, the membar is here to force ordering between the oop store and the card table load. That's why I think the membar's inputs and outputs should be set up that way.

> And we need to keep precedent edge link to oop store in case EA eliminates related allocation.

I missed that, indeed. Mikael, can you confirm if this is ok (eliminating the barrier if the object being stored to doesn't escape)?

Roland.

> 
> Thanks,
> Vladimir
> 
> On 1/28/16 4:49 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8087341/webrev.00/
>> 
>> C2 currently doesn't optimize the field load in the following code:
>> 
>>     static Object field;
>> 
>>     static Object m(Object o) {
>>         field = o;
>>         return field;
>>     }
>> 
>> It should return o but instead loads the value back from memory. The reason it misses such a simple optimization is that the G1 post barrier has a memory barrier with a wide effect on the memory state. C2 doesn't optimize this either:
>> 
>>         object.field = other_object;
>>         object.field = other_object;
>> 
>> Same applies to -XX:+UseConcMarkSweepGC -XX:+UseCondCardMark
>> 
>> That memory barrier was added to have a memory barrier instruction and doesn't have to have a wide memory effect.
>> 
>> Roland.
>> 


From tobias.hartmann at oracle.com  Fri Jan 29 14:16:22 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 29 Jan 2016 15:16:22 +0100
Subject: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers() fails to
	restore xmm registers on 32 bit
Message-ID: <56AB7436.7020302@oracle.com>

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8148490
http://cr.openjdk.java.net/~thartmann/8148490/webrev.00/

RegisterSaver::save_live_registers() and RegisterSaver::restore_live_registers() are used by the safepoint handling code to save and restore registers. The following code is emitted to save and restore XMM/YMM registers on 32 bit: 

Save: 
   ...
   0xf34ca12e:	vmovdqu %xmm0,0xb0(%esp)
   0xf34ca137:	vmovdqu %xmm1,0xc0(%esp)
   ...
   0xf34ca16d:	vmovdqu %xmm7,0x120(%esp)
   0xf34ca176:	sub    $0x80,%esp
   0xf34ca17c:	vextractf128 $0x1,%ymm0,(%esp)
   0xf34ca183:	vextractf128 $0x1,%ymm1,0x10(%esp)
   ...
   0xf34ca1b3:	vextractf128 $0x1,%ymm7,0x70(%esp)
   ...

Restore:
   ...
   0xf34ca202:	vinsertf128 $0x1,(%esp),%ymm0,%ymm0
   0xf34ca209:	vinsertf128 $0x1,0x10(%esp),%ymm1,%ymm1
   ...
   0xf34ca239:	vinsertf128 $0x1,0x70(%esp),%ymm7,%ymm7
   0xf34ca241:	add    $0x80,%esp
   0xf34ca247:	vmovdqu 0x130(%esp),%xmm0
   0xf34ca250:	vmovdqu 0x140(%esp),%xmm1
   ...
   0xf34ca286:	vmovdqu 0x1a0(%esp),%xmm7
   ...

The stack offsets for the vmovdqu instructions are wrong, causing the XMM registers to contain random values after a safepoint. The problem is that "additional_frame_bytes" is added to the stack offset although the stack pointer is incremented just before:

283     __ addptr(rsp, additional_frame_bytes); // Save upper half of YMM registers
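
The offset arithmetic behind this can be checked with a small standalone sketch. The constants are taken from the disassembly above (xmm0 saved at 0xb0, 16 bytes per XMM slot, 0x80 bytes for the upper YMM halves); the function and constant names are hypothetical and do not appear in the patch.

```cpp
#include <cassert>

constexpr int kXmm0SaveOffset = 0xb0;        // from the "Save" listing
constexpr int kXmmSlotBytes = 16;            // 128-bit XMM register
constexpr int kAdditionalFrameBytes = 0x80;  // 8 upper YMM halves, 16 bytes each

// Offset (relative to %esp before the extra 'sub') at which xmm<reg> is saved.
constexpr int xmm_save_offset(int reg) {
  return kXmm0SaveOffset + reg * kXmmSlotBytes;
}

// Offset to use when reloading xmm<reg>. If the stack pointer has already been
// adjusted back ('add $0x80,%esp' was emitted), the save offset applies as is;
// otherwise the extra frame bytes still have to be skipped.
constexpr int xmm_restore_offset(int reg, bool sp_adjusted_back) {
  return xmm_save_offset(reg) + (sp_adjusted_back ? 0 : kAdditionalFrameBytes);
}

// What the listing above shows: the extra bytes are added even though the
// stack pointer was already adjusted.
constexpr int buggy_xmm_restore_offset(int reg) {
  return xmm_save_offset(reg) + kAdditionalFrameBytes;
}
```

With these numbers the buggy variant reproduces the 0x130, 0x140 and 0x1a0 offsets from the "Restore" listing, whereas the values written by the save code are at 0xb0, 0xc0 and 0x120.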

The regression test fails with "Test failed: array[0] = 1973.0 but should be 10.000" because the vectorized loop returns a wrong result.

I spotted and fixed the following other problems:
- the vmovdqu instructions should be emitted before restoring YMM and ZMM because they zero the upper part of the XMM registers (i.e. YMM/ZMM)
- if 'UseAVX > 2' is set/available, we save the ZMM registers as well but we do not increment 'additional_frame_words' accordingly (we need another 8*32 bytes of stack space) 
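
The extra stack space implied by the second point can be worked out as below. This is a sketch based only on the figures in this message (8 registers on 32 bit, 16 bytes per upper YMM half, 32 bytes per upper ZMM half); the function name and its parameters are hypothetical and not the code in the webrev.

```cpp
#include <cassert>

constexpr int kNumXmmRegs32 = 8;      // xmm0..xmm7 on 32 bit
constexpr int kYmmUpperBytes = 16;    // upper 128 bits of a YMM register
constexpr int kZmmUpperBytes = 32;    // upper 256 bits of a ZMM register

// Extra frame bytes needed on top of the regular XMM save area.
int additional_frame_bytes(bool save_vectors, int use_avx) {
  if (!save_vectors) {
    return 0;
  }
  int bytes = kNumXmmRegs32 * kYmmUpperBytes;   // 8*16 = 0x80
  if (use_avx > 2) {
    bytes += kNumXmmRegs32 * kZmmUpperBytes;    // another 8*32
  }
  return bytes;
}
```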

Unfortunately, I don't have access to a CPU with the AVX-512 instruction set to test the "UseAVX > 2" related changes. Michael, could you verify the changes?

The problems were introduced by the fix for JDK-8142980.

Thanks,
Tobias

From doug.simon at oracle.com  Fri Jan 29 15:34:41 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 29 Jan 2016 16:34:41 +0100
Subject: RFR: 8148507: [JVMCI] mitigate deadlocks related to JVMCI compiler
	under -Xbatch
Message-ID: <845F1D56-3194-49AE-95C1-79545F8C50AC@oracle.com>

Please review this small change to further mitigate deadlocks that can be caused by JVMCI when BackgroundCompilation is disabled.

https://bugs.openjdk.java.net/browse/JDK-8148507
http://cr.openjdk.java.net/~dnsimon/8148507

-Doug

From michael.c.berg at intel.com  Fri Jan 29 18:36:47 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Fri, 29 Jan 2016 18:36:47 +0000
Subject: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers()
	fails to restore xmm registers on 32 bit
In-Reply-To: <56AB7436.7020302@oracle.com>
References: <56AB7436.7020302@oracle.com>
Message-ID: <C568518E7B433348B114B6A7122D474756DCE7EC@FMSMSX102.amr.corp.intel.com>

Ok, I will take a look.

-Michael

-----Original Message-----
From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] 
Sent: Friday, January 29, 2016 6:16 AM
To: hotspot-compiler-dev at openjdk.java.net
Cc: Berg, Michael C
Subject: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers() fails to restore xmm registers on 32 bit

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8148490
http://cr.openjdk.java.net/~thartmann/8148490/webrev.00/

RegisterSaver::save_live_registers() and RegisterSaver::restore_live_registers() are used by the safepoint handling code to save and restore registers. The following code is emitted to save and restore XMM/YMM registers on 32 bit: 

Save: 
   ...
   0xf34ca12e:	vmovdqu %xmm0,0xb0(%esp)
   0xf34ca137:	vmovdqu %xmm1,0xc0(%esp)
   ...
   0xf34ca16d:	vmovdqu %xmm7,0x120(%esp)
   0xf34ca176:	sub    $0x80,%esp
   0xf34ca17c:	vextractf128 $0x1,%ymm0,(%esp)
   0xf34ca183:	vextractf128 $0x1,%ymm1,0x10(%esp)
   ...
   0xf34ca1b3:	vextractf128 $0x1,%ymm7,0x70(%esp)
   ...

Restore:
   ...
   0xf34ca202:	vinsertf128 $0x1,(%esp),%ymm0,%ymm0
   0xf34ca209:	vinsertf128 $0x1,0x10(%esp),%ymm1,%ymm1
   ...
   0xf34ca239:	vinsertf128 $0x1,0x70(%esp),%ymm7,%ymm7
   0xf34ca241:	add    $0x80,%esp
   0xf34ca247:	vmovdqu 0x130(%esp),%xmm0
   0xf34ca250:	vmovdqu 0x140(%esp),%xmm1
   ...
   0xf34ca286:	vmovdqu 0x1a0(%esp),%xmm7
   ...

The stack offsets for the vmovdqu instructions are wrong, causing the XMM registers to contain random values after a safepoint. The problem is that "additional_frame_bytes" is added to the stack offset although the stack pointer is incremented just before:

283     __ addptr(rsp, additional_frame_bytes); // Save upper half of YMM registers

The regression test fails with "Test failed: array[0] = 1973.0 but should be 10.000" because the vectorized loop returns a wrong result.

I spotted and fixed the following other problems:
- the vmovdqu instructions should be emitted before restoring YMM and ZMM because they zero the upper part of the XMM registers (i.e. YMM/ZMM)
- if 'UseAVX > 2' is set/available, we save the ZMM registers as well but we do not increment 'additional_frame_words' accordingly (we need another 8*32 bytes of stack space) 

Unfortunately, I don't have access to a CPU with the AVX-512 instruction set to test the "UseAVX > 2" related changes. Michael, could you verify the changes?

The problems were introduced by the fix for JDK-8142980.

Thanks,
Tobias

From dean.long at oracle.com  Fri Jan 29 19:19:16 2016
From: dean.long at oracle.com (Dean Long)
Date: Fri, 29 Jan 2016 11:19:16 -0800
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AB0868.2080307@oracle.com>
References: <56AA3ED7.4030407@oracle.com> <56AAE63A.4060905@oracle.com>
	<56AB0868.2080307@oracle.com>
Message-ID: <56ABBB34.80002@oracle.com>

On 1/28/2016 10:36 PM, Jamsheed C m wrote:
> Hi Dean,
>
> On 1/29/2016 9:40 AM, Dean Long wrote:
>> As you noticed, for this kind of bug the memory is going to be
>> consistent by the time the core file is written.
>> So to help debug this assert if it happens again, could you change it
>> to something like:
>>
>> #ifdef ASSERT
>>     address computed_address = 
>> SharedRuntime::compute_compiled_exc_handler(nm, pc, exception, 
>> force_unwind, true);
>>     vmassert(handler_address == computed_address, PTR_FORMAT " != " 
>> PTR_FORMAT, p2i(handler_address), p2i(computed_address));
>> #endif
> I got the handler_address value in this case. This value was inconsistent
> with the value in ExceptionCache.
> It still had its initial value, which was helpful in figuring out what
> went wrong.
>

In the bug report, you said all data in the core file was consistent, so 
I'm just wondering where you saw
it inconsistent.   Just to confirm what was going wrong, you suspect 
that _count was being updated before the handler?

dl

> I will make this change.
>
> Best Regards,
> Jamsheed
>>
>> dl
>>
>> On 1/28/2016 8:16 AM, Jamsheed C m wrote:
>>> Hi,
>>>
>>> Please review the fix made for issue
>>>
>>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>>
>>> Unit tests: none, as one is hard to write
>>>
>>> Other tests: jprt.
>>>
>>> Description of the issue:
>>> A valid pc match in the exception cache returning an invalid handler
>>> makes the assert fail.
>>> This happens because ExceptionCache reads are lock-free.
>>>
>>> As a fix, I have put a storestore memory barrier before the
>>> count is updated.
>>>
>>> Best Regards,
>>> Jamsheed
>>
>


From vladimir.kozlov at oracle.com  Fri Jan 29 19:39:45 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 29 Jan 2016 11:39:45 -0800
Subject: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers()
	fails to restore xmm registers on 32 bit
In-Reply-To: <56AB7436.7020302@oracle.com>
References: <56AB7436.7020302@oracle.com>
Message-ID: <56ABC001.6080302@oracle.com>

Tobias, please verify that 64-bit code works correctly.
About 32-bit code.

Please verify the correctness of the following asserts:

      assert(UseAVX > 0, "512bit vectors are supported only with EVEX");
      assert(MaxVectorSize == 64, "only 512bit vectors are supported now");

Originally we could have vectors even with only 64bit XMM registers. MaxVectorSize and UseAVX can be set on the command line 
- what happens in such a case? No vectorization?

Maybe it is done because we always save the whole 128bit XMM. Still, the MaxVectorSize == 64 condition is strange.

Thanks,
Vladimir

On 1/29/16 6:16 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8148490
> http://cr.openjdk.java.net/~thartmann/8148490/webrev.00/
>
> RegisterSaver::save_live_registers() and RegisterSaver::restore_live_registers() are used by the safepoint handling code to save and restore registers. The following code is emitted to save and restore XMM/YMM registers on 32 bit:
>
> Save:
>     ...
>     0xf34ca12e:	vmovdqu %xmm0,0xb0(%esp)
>     0xf34ca137:	vmovdqu %xmm1,0xc0(%esp)
>     ...
>     0xf34ca16d:	vmovdqu %xmm7,0x120(%esp)
>     0xf34ca176:	sub    $0x80,%esp
>     0xf34ca17c:	vextractf128 $0x1,%ymm0,(%esp)
>     0xf34ca183:	vextractf128 $0x1,%ymm1,0x10(%esp)
>     ...
>     0xf34ca1b3:	vextractf128 $0x1,%ymm7,0x70(%esp)
>     ...
>
> Restore:
>     ...
>     0xf34ca202:	vinsertf128 $0x1,(%esp),%ymm0,%ymm0
>     0xf34ca209:	vinsertf128 $0x1,0x10(%esp),%ymm1,%ymm1
>     ...
>     0xf34ca239:	vinsertf128 $0x1,0x70(%esp),%ymm7,%ymm7
>     0xf34ca241:	add    $0x80,%esp
>     0xf34ca247:	vmovdqu 0x130(%esp),%xmm0
>     0xf34ca250:	vmovdqu 0x140(%esp),%xmm1
>     ...
>     0xf34ca286:	vmovdqu 0x1a0(%esp),%xmm7
>     ...
>
> The stack offsets for the vmovdqu instructions are wrong, causing the XMM registers to contain random values after a safepoint. The problem is that "additional_frame_bytes" is added to the stack offset although the stack pointer is incremented just before:
>
> 283     __ addptr(rsp, additional_frame_bytes); // Save upper half of YMM registers
>
> The regression test fails with "Test failed: array[0] = 1973.0 but should be 10.000" because the vectorized loop returns a wrong result.
>
> I spotted and fixed the following other problems:
> - the vmovdqu instructions should be emitted before restoring YMM and ZMM because they zero the upper part of the XMM registers (i.e. YMM/ZMM)
> - if 'UseAVX > 2' is set/available, we save the ZMM registers as well but we do not increment 'additional_frame_words' accordingly (we need another 8*32 bytes of stack space)
>
> Unfortunately, I don't have access to a CPU with the AVX-512 instruction set to test the "UseAVX > 2" related changes. Michael, could you verify the changes?
>
> The problems were introduced by the fix for JDK-8142980.
>
> Thanks,
> Tobias
>

From michael.c.berg at intel.com  Fri Jan 29 22:28:19 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Fri, 29 Jan 2016 22:28:19 +0000
Subject: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers()
	fails to restore xmm registers on 32 bit
In-Reply-To: <56ABC001.6080302@oracle.com>
References: <56AB7436.7020302@oracle.com> <56ABC001.6080302@oracle.com>
Message-ID: <C568518E7B433348B114B6A7122D474756DCE938@FMSMSX102.amr.corp.intel.com>

Tobias/Vladimir:

I would change the two asserts in the 64-bit code to make the check clear:

      assert(UseAVX > 0, "up to 512bit vectors are supported with EVEX");
      assert(MaxVectorSize <= 64, "up to 512bit vectors are supported now");

As for testing with the patch applied to hotspot on a current jdk(01-29-16):

Windows sde 32-bit: skx - pass, also ran and passed part of specjvm2008
Windows 32-bit: hsw - pass, also ran and passed all of specjvm2008
Windows sde 64-bit: skx - pass, also ran and passed part of specjvm2008
Windows 64-bit: hsw -pass, also ran and passed all of specjvm2008 : caveat
Linux on skx: 32-bit - pass, also ran and passed all of specjvm2008
Linux on skx:64-bit - pass, also ran and passed all of specjvm2008

We should proceed with checking in the changelist after the usual testing.

Note: The above tests were done with the asserts changed on Windows only. The 64-bit changes are mostly cosmetic. It's the change to additional_frame_bytes that makes it correct; we used
equivalent constants in the stack adjustment beforehand, but they had not been mapped to the movdqu for the non-vector case for a few iterations of the file. Early on I did have that code, though.

Caveat: xml.transform fails both with and without the changelist. I checked this against a JDK built on 12-21-15, which is one month old, so we have a new bug that is causing this app to fail as well (on Windows 64-bit) on hsw.
I checked recent JBS traffic; the occurrence does not appear to be tracked at this time.

-Michael

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Friday, January 29, 2016 11:40 AM
To: hotspot-compiler-dev at openjdk.java.net
Cc: Berg, Michael C
Subject: Re: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers() fails to restore xmm registers on 32 bit

Tobias, please verify that 64-bit code works correctly.
About 32-bit code.

Please verify the correctness of the following asserts:

      assert(UseAVX > 0, "512bit vectors are supported only with EVEX");
      assert(MaxVectorSize == 64, "only 512bit vectors are supported now");

Originally we could have vectors even with only 64bit XMM registers. MaxVectorSize and UseAVX can be set on the command line 
- what happens in such a case? No vectorization?

Maybe it is done because we always save the whole 128bit XMM. Still, the MaxVectorSize == 64 condition is strange.

Thanks,
Vladimir

On 1/29/16 6:16 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8148490
> http://cr.openjdk.java.net/~thartmann/8148490/webrev.00/
>
> RegisterSaver::save_live_registers() and RegisterSaver::restore_live_registers() are used by the safepoint handling code to save and restore registers. The following code is emitted to save and restore XMM/YMM registers on 32 bit:
>
> Save:
>     ...
>     0xf34ca12e:	vmovdqu %xmm0,0xb0(%esp)
>     0xf34ca137:	vmovdqu %xmm1,0xc0(%esp)
>     ...
>     0xf34ca16d:	vmovdqu %xmm7,0x120(%esp)
>     0xf34ca176:	sub    $0x80,%esp
>     0xf34ca17c:	vextractf128 $0x1,%ymm0,(%esp)
>     0xf34ca183:	vextractf128 $0x1,%ymm1,0x10(%esp)
>     ...
>     0xf34ca1b3:	vextractf128 $0x1,%ymm7,0x70(%esp)
>     ...
>
> Restore:
>     ...
>     0xf34ca202:	vinsertf128 $0x1,(%esp),%ymm0,%ymm0
>     0xf34ca209:	vinsertf128 $0x1,0x10(%esp),%ymm1,%ymm1
>     ...
>     0xf34ca239:	vinsertf128 $0x1,0x70(%esp),%ymm7,%ymm7
>     0xf34ca241:	add    $0x80,%esp
>     0xf34ca247:	vmovdqu 0x130(%esp),%xmm0
>     0xf34ca250:	vmovdqu 0x140(%esp),%xmm1
>     ...
>     0xf34ca286:	vmovdqu 0x1a0(%esp),%xmm7
>     ...
>
> The stack offsets for the vmovdqu instructions are wrong, causing the XMM registers to contain random values after a safepoint. The problem is that "additional_frame_bytes" is added to the stack offset although the stack pointer is incremented just before:
>
> 283     __ addptr(rsp, additional_frame_bytes); // Save upper half of YMM registers
>
> The regression test fails with "Test failed: array[0] = 1973.0 but should be 10.000" because the vectorized loop returns a wrong result.
>
> I spotted and fixed the following other problems:
> - the vmovdqu instructions should be emitted before restoring YMM and ZMM because they zero the upper part of the XMM registers (i.e. YMM/ZMM)
> - if 'UseAVX > 2' is set/available, we save the ZMM registers as well but we do not increment 'additional_frame_words' accordingly (we need another 8*32 bytes of stack space)
>
> Unfortunately, I don't have access to a CPU with the AVX-512 instruction set to test the "UseAVX > 2" related changes. Michael, could you verify the changes?
>
> The problems were introduced by the fix for JDK-8142980.
>
> Thanks,
> Tobias
>

From vladimir.kozlov at oracle.com  Sat Jan 30 01:38:54 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 29 Jan 2016 17:38:54 -0800
Subject: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers()
	fails to restore xmm registers on 32 bit
In-Reply-To: <C568518E7B433348B114B6A7122D474756DCE938@FMSMSX102.amr.corp.intel.com>
References: <56AB7436.7020302@oracle.com> <56ABC001.6080302@oracle.com>
	<C568518E7B433348B114B6A7122D474756DCE938@FMSMSX102.amr.corp.intel.com>
Message-ID: <56AC142E.6010309@oracle.com>

Michael,

Thank you for testing the changes.
Please file a JBS bug for the xml.transform problem.

Thanks,
Vladimir

On 1/29/16 2:28 PM, Berg, Michael C wrote:
> Tobias/Vladimir:
>
> I would change the two asserts in the 64bit code to make the check clear:
>
>        assert(UseAVX > 0, "up to 512bit vectors are supported with EVEX");
>        assert(MaxVectorSize <= 64, "up to 512bit vectors are supported now");
>
> As for testing with the patch applied to hotspot on a current jdk(01-29-16):
>
> Windows sde 32-bit: skx - pass, also ran and passed part of specjvm2008
> Windows 32-bit: hsw - pass, also ran and passed all of specjvm2008
> Windows sde 64-bit: skx - pass, also ran and passed part of specjvm2008
> Windows 64-bit: hsw -pass, also ran and passed all of specjvm2008 : caveat
> Linux on skx: 32-bit - pass, also ran and passed all of specjvm2008
> Linux on skx:64-bit - pass, also ran and passed all of specjvm2008
>
> We should proceed with checking in the changelist after the usual testing.
>
> Note: The above tests were done with the asserts changed on windows only. The 64bit changes are mostly cosmetic.  It's the change to the additional_frame_bytes that makes it correct, we used
> equivalent constants in the stack adjustment beforehand, they had not been mapped to the movdqu for the non-vector case for a few iterations on the file.  Early on I did have that code though.
>
> Caveat: xml.transform fails with the changelist and without, I checked this against a 12-21-15 built jdk which is 1 month old, so we have a new bug that is causing this app to fail as well (on windows for 64bit) on hsw.
> I checked recent jbs traffic, the occurrence does not appear to be tracked at this time.
>
> -Michael
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Friday, January 29, 2016 11:40 AM
> To: hotspot-compiler-dev at openjdk.java.net
> Cc: Berg, Michael C
> Subject: Re: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers() fails to restore xmm registers on 32 bit
>
> Tobias, please verify that 64-bit code works correctly.
> About 32-bit code.
>
> Please verify correctness of next asserts:
>
>        assert(UseAVX > 0, "512bit vectors are supported only with EVEX");
>        assert(MaxVectorSize == 64, "only 512bit vectors are supported now");
>
> Originally we could have vectors even with only 64bit XMM registers. MaxVectorSize and UseAVX can be set on command line
> - what happens in such case? No vectorization?
>
> Maybe it is done because we always save the whole 128bit XMM. Still, the MaxVectorSize == 64 condition is strange.
>
> Thanks,
> Vladimir
>
> On 1/29/16 6:16 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8148490
>> http://cr.openjdk.java.net/~thartmann/8148490/webrev.00/
>>
>> RegisterSaver::save_live_registers() and RegisterSaver::restore_live_registers() are used by the safepoint handling code to save and restore registers. The following code is emitted to save and restore XMM/YMM registers on 32 bit:
>>
>> Save:
>>      ...
>>      0xf34ca12e:	vmovdqu %xmm0,0xb0(%esp)
>>      0xf34ca137:	vmovdqu %xmm1,0xc0(%esp)
>>      ...
>>      0xf34ca16d:	vmovdqu %xmm7,0x120(%esp)
>>      0xf34ca176:	sub    $0x80,%esp
>>      0xf34ca17c:	vextractf128 $0x1,%ymm0,(%esp)
>>      0xf34ca183:	vextractf128 $0x1,%ymm1,0x10(%esp)
>>      ...
>>      0xf34ca1b3:	vextractf128 $0x1,%ymm7,0x70(%esp)
>>      ...
>>
>> Restore:
>>      ...
>>      0xf34ca202:	vinsertf128 $0x1,(%esp),%ymm0,%ymm0
>>      0xf34ca209:	vinsertf128 $0x1,0x10(%esp),%ymm1,%ymm1
>>      ...
>>      0xf34ca239:	vinsertf128 $0x1,0x70(%esp),%ymm7,%ymm7
>>      0xf34ca241:	add    $0x80,%esp
>>      0xf34ca247:	vmovdqu 0x130(%esp),%xmm0
>>      0xf34ca250:	vmovdqu 0x140(%esp),%xmm1
>>      ...
>>      0xf34ca286:	vmovdqu 0x1a0(%esp),%xmm7
>>      ...
>>
>> The stack offsets for the vmovdqu instructions are wrong, causing the XMM registers to contain random values after a safepoint. The problem is that "additional_frame_bytes" is added to the stack offset although the stack pointer is incremented just before:
>>
>> 283     __ addptr(rsp, additional_frame_bytes); // Save upper half of YMM registers
>>
>> The regression test fails with "Test failed: array[0] = 1973.0 but should be 10.000" because the vectorized loop returns a wrong result.
>>
>> I spotted and fixed the following other problems:
>> - the vmovdqu instructions should be emitted before restoring YMM and ZMM because they zero the upper part of the XMM registers (i.e. YMM/ZMM)
>> - if 'UseAVX > 2' is set/available, we save the ZMM registers as well but we do not increment 'additional_frame_words' accordingly (we need another 8*32 bytes of stack space)
>>
>> Unfortunately, I don't have access to a CPU with the AVX-512 instruction set to test the "UseAVX > 2" related changes. Michael, could you verify the changes?
>>
>> The problems were introduced by the fix for JDK-8142980.
>>
>> Thanks,
>> Tobias
>>

From jamsheed.c.m at oracle.com  Sat Jan 30 04:08:44 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Sat, 30 Jan 2016 09:38:44 +0530
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56ABBB34.80002@oracle.com>
References: <56AA3ED7.4030407@oracle.com> <56AAE63A.4060905@oracle.com>
	<56AB0868.2080307@oracle.com> <56ABBB34.80002@oracle.com>
Message-ID: <56AC374C.9050800@oracle.com>


Hi Dean,

On 1/30/2016 12:49 AM, Dean Long wrote:
> On 1/28/2016 10:36 PM, Jamsheed C m wrote:
>> Hi Dean,
>>
>> On 1/29/2016 9:40 AM, Dean Long wrote:
>>> As you noticed, for this kind of bug the memory is going to 
>>> be consistent by the time the core file is written.
>>> So to help debug this assert if it happens again, could you change 
>>> it to something like:
>>>
>>> #ifdef ASSERT
>>>     address computed_address = 
>>> SharedRuntime::compute_compiled_exc_handler(nm, pc, exception, 
>>> force_unwind, true);
>>>     vmassert(handler_address == computed_address, PTR_FORMAT " != " 
>>> PTR_FORMAT, p2i(handler_address), p2i(computed_address));
>>> #endif
>> I got the handler_address value in this case. This value was inconsistent 
>> with the value in ExceptionCache.
>> It still had its initial value, and that was helpful in figuring out what 
>> had gone wrong.
>>
>
> In the bug report, you said all data in the core file was consistent, 
> so I'm just wondering where you saw
> it inconsistent.   Just to confirm what was going wrong, you suspect 
> that _count was being updated before the handler?
I meant that the ExceptionCache (heap) and ExceptionHandlerTable (heap) contents 
were consistent at the time the core file was written.
handler_address (local variable) had already captured the failing value. 
handler_address (local variable) was inconsistent with the 
ExceptionCache (heap) handler_address in the core file.

There were two failing cases.
1) Only one entry in the exception cache, and it fails.

         - here I suspect the handler_address write in the exception cache code 
got reordered well below the count update, and even below the ExceptionCache 
pointer update in nm.
2) Two entries in the exception cache for an exception, and the second entry 
causes the failure.

         - here I suspect the handler_address write in the exception cache code 
got reordered below the count update.

This reordering happens in a very small window, as this code is 
already lock protected (and has a mem barrier below).
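
A minimal sketch of the ordering involved, not the actual HotSpot code: the
type ToyExceptionCache, its fields and its capacity are hypothetical, and a
C++11 release store stands in for the storestore barrier proposed in the
webrev (the acquire load on the reader side is likewise an assumption of this
sketch). The point is only that the entry must be fully written before the
count that publishes it becomes visible to lock-free readers.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an exception cache; names and layout are
// illustrative only and do not mirror HotSpot's ExceptionCache.
struct ToyExceptionCache {
    static constexpr int kCapacity = 4;
    std::uintptr_t pc[kCapacity] = {};
    std::uintptr_t handler[kCapacity] = {};
    std::atomic<int> count{0};

    // Writer side. Assumed to be serialized by a lock held by the caller.
    bool add(std::uintptr_t p, std::uintptr_t h) {
        int n = count.load(std::memory_order_relaxed);
        if (n == kCapacity) return false;
        pc[n] = p;
        handler[n] = h;
        // Release store: the two entry writes above may not be reordered
        // below this publication of the new count.
        count.store(n + 1, std::memory_order_release);
        return true;
    }

    // Lock-free reader. Returns 0 when no entry matches the pc.
    std::uintptr_t lookup(std::uintptr_t p) const {
        // Acquire load pairs with the release store in add(), so every
        // entry below n is fully written when it is read here.
        int n = count.load(std::memory_order_acquire);
        for (int i = 0; i < n; i++) {
            if (pc[i] == p) return handler[i];
        }
        return 0;
    }
};
```

Without the ordering on the count update, a reader could observe the new
count together with a handler slot that still holds its initial value, which
matches the failing value captured in the local variable.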

Best,
Jamsheed

>
> dl
>
>> I will make this change.
>>
>> Best Regards,
>> Jamsheed
>>>
>>> dl
>>>
>>> On 1/28/2016 8:16 AM, Jamsheed C m wrote:
>>>> Hi,
>>>>
>>>> Please review the fix made for issue
>>>>
>>>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>>>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>>>
>>>> Unit tests: none, as it is hard to write one
>>>>
>>>> Other tests: jprt.
>>>>
>>>> Description of the issue:
>>>> A valid pc match in the exception cache returning an invalid handler 
>>>> makes the assert fail.
>>>> This happens because ExceptionCache reads are lock-free.
>>>>
>>>> As a fix for this I have put a storestore mem barrier before the 
>>>> count is updated.
>>>>
>>>> Best Regards,
>>>> Jamsheed
>>>
>>
>


From jamsheed.c.m at oracle.com  Sat Jan 30 05:19:09 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Sat, 30 Jan 2016 10:49:09 +0530
Subject: RFR(XS): 8143897 :Weblogic12medrec assert(handler_address ==
	SharedRuntime::compute_compiled_exc_handler(nm, pc, exception,
	force_unwind, true)) failed: Must be the same
In-Reply-To: <56AC374C.9050800@oracle.com>
References: <56AA3ED7.4030407@oracle.com> <56AAE63A.4060905@oracle.com>
	<56AB0868.2080307@oracle.com> <56ABBB34.80002@oracle.com>
	<56AC374C.9050800@oracle.com>
Message-ID: <56AC47CD.4040700@oracle.com>


On 1/30/2016 9:38 AM, Jamsheed C m wrote:
>
> Hi Dean,
>
> On 1/30/2016 12:49 AM, Dean Long wrote:
>> On 1/28/2016 10:36 PM, Jamsheed C m wrote:
>>> Hi Dean,
>>>
>>> On 1/29/2016 9:40 AM, Dean Long wrote:
>>>> As you noticed, for this kind of bug the memory is going to 
>>>> be consistent by the time the core file is written.
>>>> So to help debug this assert if it happens again, could you change 
>>>> it to something like:
>>>>
>>>> #ifdef ASSERT
>>>>     address computed_address = 
>>>> SharedRuntime::compute_compiled_exc_handler(nm, pc, exception, 
>>>> force_unwind, true);
>>>>     vmassert(handler_address == computed_address, PTR_FORMAT " != " 
>>>> PTR_FORMAT, p2i(handler_address), p2i(computed_address));
>>>> #endif
>>> I got the handler_address value in this case. This value was 
>>> inconsistent with the value in ExceptionCache.
>>> It still had its initial value, and that was helpful in figuring out 
>>> what had gone wrong.
>>>
>>
>> In the bug report, you said all data in the core file was consistent, 
>> so I'm just wondering where you saw
>> it inconsistent.   Just to confirm what was going wrong, you suspect 
>> that _count was being updated before the handler?
> I meant that the ExceptionCache (heap) and ExceptionHandlerTable (heap) contents 
> were consistent at the time the core file was written.
> handler_address (local variable) had already captured the failing value. 
> handler_address (local variable) was inconsistent with the 
> ExceptionCache (heap) handler_address in the core file.
>
> There were two failing cases.
> 1) Only one entry in the exception cache, and it fails.
>
>         - here I suspect the handler_address write in the exception cache code 
> got reordered well below the count update, and even below the ExceptionCache 
> pointer update in nm.
> 2) Two entries in the exception cache for an exception, and the second entry 
> causes the failure.
>
>         - here I suspect the handler_address write in the exception cache code 
> got reordered below the count update.
>
> This reordering happens in a very small window, as this code is already 
> lock protected (and has a mem barrier below).
I have removed the ambiguity in the bug report.

Best Regards,
Jamsheed
>
> Best,
> Jamsheed
>
>>
>> dl
>>
>>> I will make this change.
>>>
>>> Best Regards,
>>> Jamsheed
>>>>
>>>> dl
>>>>
>>>> On 1/28/2016 8:16 AM, Jamsheed C m wrote:
>>>>> Hi,
>>>>>
>>>>> Please review the fix made for issue
>>>>>
>>>>> bug url: https://bugs.openjdk.java.net/browse/JDK-8143897
>>>>> web rev: http://cr.openjdk.java.net/~thartmann/8143897/webrev.00/
>>>>>
>>>>> Unit tests: none, as it is hard to write one
>>>>>
>>>>> Other tests: jprt.
>>>>>
>>>>> Description of the issue:
>>>>> A valid pc match in the exception cache returning an invalid handler 
>>>>> makes the assert fail.
>>>>> This happens because ExceptionCache reads are lock-free.
>>>>>
>>>>> As a fix for this I have put a storestore mem barrier before the 
>>>>> count is updated.
>>>>>
>>>>> Best Regards,
>>>>> Jamsheed
>>>>
>>>
>>
>


From christian.thalinger at oracle.com  Sun Jan 31 13:48:04 2016
From: christian.thalinger at oracle.com (Christian Thalinger)
Date: Sun, 31 Jan 2016 14:48:04 +0100
Subject: RFR: 8148507: [JVMCI] mitigate deadlocks related to JVMCI
	compiler under -Xbatch
In-Reply-To: <845F1D56-3194-49AE-95C1-79545F8C50AC@oracle.com>
References: <845F1D56-3194-49AE-95C1-79545F8C50AC@oracle.com>
Message-ID: <F591CC0A-7FA7-4806-A64D-39DCC5DE1AB9@oracle.com>

Looks good.

> On Jan 29, 2016, at 4:34 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Please review this small change to further mitigate deadlocks that can be caused by JVMCI when BackgroundCompilation is disabled.
> 
> https://bugs.openjdk.java.net/browse/JDK-8148507
> http://cr.openjdk.java.net/~dnsimon/8148507
> 
> -Doug