From vladimir.kozlov at oracle.com Tue Nov 1 00:52:56 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Oct 2016 17:52:56 -0700 Subject: [9] RFR(M) 8166416: [AOT] Integrate JDK build changes and launcher 'jaotc' for AOT compiler In-Reply-To: <7bb59402-ecba-692b-9718-16414fe17efd@redhat.com> References: <58114E0C.107@oracle.com> <58124A47.5010604@oracle.com> <581304B6.7030804@oracle.com> <9f52669e-75fe-7139-78cb-d28ae1aff0a7@redhat.com> <58137448.6040205@oracle.com> <23d2851f-027d-f142-e4d1-8c42e4a011f2@redhat.com> <7bb59402-ecba-692b-9718-16414fe17efd@redhat.com> Message-ID: <94ac122b-6ff8-d038-f0c5-fb8011a661dd@oracle.com> Thank you, Andrew. I fixed compiledIC_aarch64.cpp and updated the webrev in place. Thanks, Vladimir On 10/31/16 8:35 AM, Andrew Dinn wrote: > Hi Vladimir, > > On 31/10/16 11:38, Andrew Dinn wrote: >> On 28/10/16 16:52, Vladimir Kozlov wrote: >>> Thank you, Andrew, for verifying that the build changes do not break AArch64. >>> But it would be nice if you can also apply the Hotspot changes (revert >>> hs.make.webrev changes before that since hs.webrev has them): >>> >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> and the jaotc sources (which are located in the Hotspot repo): >>> >>> http://cr.openjdk.java.net/~kvn/aot/jaotc.webrev/ >> >> I tried this and found two missing changes to compiledIC_aarch64.cpp >> (basically a missing arg in each of two calls to find_stub() -- see >> below for diff). >> >> However, I then ran into the problem Volker saw: >> >> Compiling 15 files for jdk.attach >> /home/adinn/openjdk/hs/hotspot/src/jdk.vm.ci/share/classes/module-info.java:40: >> error: module not found: jdk.vm.compiler >> jdk.vm.compiler; >> ^ >> /home/adinn/openjdk/hs/hotspot/src/jdk.vm.ci/share/classes/module-info.java:43: >> error: module not found: jdk.vm.compiler >> jdk.vm.compiler; >> >> . . . 
>> >> I assume fixing this second problem requires me to clone the graal-core >> repo into my tree and then apply the graal.webrev patch and rebuild. > > I cloned and patched the graal-core/graal tree and then copied it into > my hotspot space as follows > > $ cp /path/to/graal-core/graal \ > /otherpath/to/hs/hotspot/src/share/classes/jdk.vm.compiler > > With this and the extra tweaks to compiledIC_aarch64.cpp mentioned in > the previous reply I managed to build a slowdebug image which > successfully ran 'java Hello' and 'javac Hello.java'. > > Andrew Haley is currently trying to get Graal itself to run on AArch64. > So, this is probably good enough for now to confirm the acceptability of > the hs and jaotc change sets. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From coleen.phillimore at oracle.com Tue Nov 1 01:35:21 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 31 Oct 2016 21:35:21 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <58115536.5080205@oracle.com> References: <58115536.5080205@oracle.com> Message-ID: I looked at the runtime code and it looks fine to me. I'm pleased the changes were not more invasive. Some minor questions and nits: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/code/nmethod.cpp.udiff.html *+ virtual void set_to_interpreted(methodHandle method, CompiledICInfo& info) {* Can you pass methodHandle by const reference so that the copy constructor and destructor aren't called? http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/oops/methodCounters.hpp.udiff.html Why does this add a Method* pointer for #ifndef AOT code? This could be a lot of additional footprint. 
http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/globals.hpp.udiff.html Why are the AOT parameters in two separate sections? The intx ones should be defined with a valid range. http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/vmStructs.cpp.udiff.html Why is this added and the SA code fixed? AOT doesn't use the SA, does it? Was it added for debugging? Thanks, Coleen On 10/26/16 9:15 PM, Vladimir Kozlov wrote: > AOT JEP: > https://bugs.openjdk.java.net/browse/JDK-8166089 > Subtask: > https://bugs.openjdk.java.net/browse/JDK-8166415 > Webrev: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ > > Please review Hotspot VM part of AOT changes. > > Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot > will be built only on Linux/x64. > > AOT code is NOT linked during AOT libraries load as it happens with > normal .so libraries. AOT code entry points are not exposed (not > global) in AOT libraries. Only class data has global labels which we > look for with dlsym(klass_name). > > AOT-compiled code in AOT libraries is treated by JVM as *extension* of > existing CodeCache. When a java class is loaded JVM looks if > corresponding AOT-compiled methods exist in loaded AOT libraries and > adds links to them from java method descriptors (we have new field > Method::_aot_code). AOT-compiled code follows the same > invocation/deoptimization/unloading rules as normal JIT-compiled code. > > Calls in AOT code use the same method resolution runtime code as > calls in JITed code. The difference is that the call's destination address is > loaded indirectly because we can't patch AOT code - it is immutable > (to share between multiple JVM instances). > > Classes and Strings referenced in AOT code are resolved lazily by > calling into runtime. All mutable pointers (oops (mostly strings), > metadata) are stored and modified in a separate mutable memory (GOT > cells) - they are not embedded into AOT code. 
> > Changes include klass fingerprint generation since we need it to find > correct klass data in loaded AOT libraries. > > Thanks, > Vladimir From stefan.karlsson at oracle.com Tue Nov 1 07:28:06 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 1 Nov 2016 08:28:06 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <58115536.5080205@oracle.com> References: <58115536.5080205@oracle.com> Message-ID: <511061ab-e70d-970b-f8e3-67d87a894099@oracle.com> Hi Vladimir, I just took a quick look at the GC code. 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. Some examples: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html #include "utilities/debug.hpp" #include "utilities/macros.hpp" *+ #include "aot/aotLoader.hpp"* http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html #include "gc/g1/g1Policy.hpp" #include "gc/g1/g1RootClosures.hpp" #include "gc/g1/g1RootProcessor.hpp" #include "gc/g1/heapRegion.inline.hpp" #include "memory/allocation.inline.hpp" *+ #include "aot/aotLoader.hpp"* #include "runtime/fprofiler.hpp" #include "runtime/mutex.hpp" #include "services/management.hpp" 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); *+ if (UseAOT) {* *+ AOTLoader::oops_do(adjust_pointer_closure());* *+ }* StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); Would be: CodeBlobToOopClosure 
adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); *+ AOTLoader::oops_do(adjust_pointer_closure());* StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html // Reserved area char* low_boundary() const { return _low_boundary; } char* high_boundary() const { return _high_boundary; } *+ void set_low_boundary(char *p) { _low_boundary = p; }* *+ void set_high_boundary(char *p) { _high_boundary = p; }* *+ void set_low(char *p) { _low = p; }* *+ void set_high(char *p) { _high = p; }* *+ * bool special() const { return _special; } These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. Thanks, StefanK On 27/10/16 03:15, Vladimir Kozlov wrote: > AOT JEP: https://bugs.openjdk.java.net/browse/JDK-8166089 Subtask: > https://bugs.openjdk.java.net/browse/JDK-8166415 Webrev: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ Please review Hotspot > VM part of AOT changes. Only Linux/x64 platform is supported. 
'jaotc' > and AOT part of Hotspot will be built only on Linux/x64. AOT code is > NOT linked during AOT libraries load as it happens with normal .so > libraries. AOT code entry points are not exposed (not global) in AOT > libraries. Only class data has global labels which we look for with > dlsym(klass_name). AOT-compiled code in AOT libraries is treated by > JVM as *extension* of existing CodeCache. When a java class is loaded > JVM looks if corresponding AOT-compiled methods exist in loaded AOT > libraries and adds links to them from java method descriptors (we have > new field Method::_aot_code). AOT-compiled code follows the same > invocation/deoptimization/unloading rules as normal JIT-compiled code. > Calls in AOT code use the same method resolution runtime code as > calls in JITed code. The difference is that the call's destination address is > loaded indirectly because we can't patch AOT code - it is immutable > (to share between multiple JVM instances). Classes and Strings > referenced in AOT code are resolved lazily by calling into runtime. > All mutable pointers (oops (mostly strings), metadata) are stored and > modified in a separate mutable memory (GOT cells) - they are not > embedded into AOT code. Changes include klass fingerprint generation > since we need it to find correct klass data in loaded AOT libraries. > Thanks, Vladimir From stefan.karlsson at oracle.com Tue Nov 1 07:38:39 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 1 Nov 2016 08:38:39 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <58115536.5080205@oracle.com> References: <58115536.5080205@oracle.com> Message-ID: (resending without formatting) Hi Vladimir, I just took a quick look at the GC code. 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. 
Some examples: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html #include "utilities/debug.hpp" #include "utilities/macros.hpp" + #include "aot/aotLoader.hpp" http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html #include "gc/g1/g1Policy.hpp" #include "gc/g1/g1RootClosures.hpp" #include "gc/g1/g1RootProcessor.hpp" #include "gc/g1/heapRegion.inline.hpp" #include "memory/allocation.inline.hpp" + #include "aot/aotLoader.hpp" #include "runtime/fprofiler.hpp" #include "runtime/mutex.hpp" #include "services/management.hpp" 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); + if (UseAOT) { + AOTLoader::oops_do(adjust_pointer_closure()); + } StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); Would be: CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); + AOTLoader::oops_do(adjust_pointer_closure()); StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 
4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html // Reserved area char* low_boundary() const { return _low_boundary; } char* high_boundary() const { return _high_boundary; } + void set_low_boundary(char *p) { _low_boundary = p; } + void set_high_boundary(char *p) { _high_boundary = p; } + void set_low(char *p) { _low = p; } + void set_high(char *p) { _high = p; } + bool special() const { return _special; } These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. Thanks, StefanK On 27/10/16 03:15, Vladimir Kozlov wrote: > AOT JEP: > https://bugs.openjdk.java.net/browse/JDK-8166089 > Subtask: > https://bugs.openjdk.java.net/browse/JDK-8166415 > Webrev: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ > > Please review Hotspot VM part of AOT changes. > > Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot > will be built only on Linux/x64. > > AOT code is NOT linked during AOT libraries load as it happens with > normal .so libraries. AOT code entry points are not exposed (not > global) in AOT libraries. Only class data has global labels which we > look for with dlsym(klass_name). > > AOT-compiled code in AOT libraries is treated by JVM as *extension* of > existing CodeCache. When a java class is loaded JVM looks if > corresponding AOT-compiled methods exist in loaded AOT libraries and > adds links to them from java method descriptors (we have > new field Method::_aot_code). 
AOT-compiled code follows the same > invocation/deoptimization/unloading rules as normal JIT-compiled code. > > Calls in AOT code use the same method resolution runtime code as > calls in JITed code. The difference is that the call's destination address is > loaded indirectly because we can't patch AOT code - it is immutable > (to share between multiple JVM instances). > > Classes and Strings referenced in AOT code are resolved lazily by > calling into runtime. All mutable pointers (oops (mostly strings), > metadata) are stored and modified in a separate mutable memory (GOT > cells) - they are not embedded into AOT code. > > Changes include klass fingerprint generation since we need it to find > correct klass data in loaded AOT libraries. > > Thanks, > Vladimir From coleen.phillimore at oracle.com Tue Nov 1 10:40:23 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Nov 2016 06:40:23 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> Message-ID: <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> 5. Thanks for pointing out the logging tags, Stefan. Yes, we would prefer adding "apt" and "fingerprint" and using the composition of existing tags for logging. Thanks, Coleen > On Nov 1, 2016, at 3:38 AM, Stefan Karlsson wrote: > > (resending without formatting) > > Hi Vladimir, > > I just took a quick look at the GC code. > > 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. 
> > Some examples: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html > > #include "utilities/debug.hpp" > #include "utilities/macros.hpp" > + #include "aot/aotLoader.hpp" > > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html > > #include "gc/g1/g1Policy.hpp" > #include "gc/g1/g1RootClosures.hpp" > #include "gc/g1/g1RootProcessor.hpp" > #include "gc/g1/heapRegion.inline.hpp" > #include "memory/allocation.inline.hpp" > + #include "aot/aotLoader.hpp" > #include "runtime/fprofiler.hpp" > #include "runtime/mutex.hpp" > #include "services/management.hpp" > > 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. > > For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + if (UseAOT) { > + AOTLoader::oops_do(adjust_pointer_closure()); > + } > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > Would be: > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + AOTLoader::oops_do(adjust_pointer_closure()); > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. 
Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html > > // Reserved area > char* low_boundary() const { return _low_boundary; } > char* high_boundary() const { return _high_boundary; } > > + void set_low_boundary(char *p) { _low_boundary = p; } > + void set_high_boundary(char *p) { _high_boundary = p; } > + void set_low(char *p) { _low = p; } > + void set_high(char *p) { _high = p; } > + > bool special() const { return _special; } > > These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? > > > 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html > > Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. > > Thanks, > StefanK > >> On 27/10/16 03:15, Vladimir Kozlov wrote: >> AOT JEP: >> https://bugs.openjdk.java.net/browse/JDK-8166089 >> Subtask: >> https://bugs.openjdk.java.net/browse/JDK-8166415 >> Webrev: >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >> >> Please review Hotspot VM part of AOT changes. >> >> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot will be built only on Linux/x64. >> >> AOT code is NOT linked during AOT libraries load as it happens with normal .so libraries. AOT code entry points are not exposed (not global) in AOT libraries. Only class data has global labels which we look for with dlsym(klass_name). >> >> AOT-compiled code in AOT libraries is treated by JVM as *extension* of existing CodeCache. 
When a java class is loaded JVM looks if corresponding AOT-compiled methods exist in loaded AOT libraries and adds links to them from java method descriptors (we have new field Method::_aot_code). AOT-compiled code follows the same invocation/deoptimization/unloading rules as normal JIT-compiled code. >> >> Calls in AOT code use the same method resolution runtime code as calls in JITed code. The difference is that the call's destination address is loaded indirectly because we can't patch AOT code - it is immutable (to share between multiple JVM instances). >> >> Classes and Strings referenced in AOT code are resolved lazily by calling into runtime. All mutable pointers (oops (mostly strings), metadata) are stored and modified in a separate mutable memory (GOT cells) - they are not embedded into AOT code. >> >> Changes include klass fingerprint generation since we need it to find correct klass data in loaded AOT libraries. >> >> Thanks, >> Vladimir > > From coleen.phillimore at oracle.com Tue Nov 1 15:14:02 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Nov 2016 11:14:02 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> Message-ID: Sorry, my phone autocorrected. I meant tag aot composed with the others. Coleen Sent from my iPhone > On Nov 1, 2016, at 6:40 AM, Coleen Phillimore wrote: > > 5. Thanks for pointing out the logging tags, Stefan. Yes, we would prefer adding "apt" and "fingerprint" and using the composition of existing tags for logging. > Thanks > Coleen > > >> On Nov 1, 2016, at 3:38 AM, Stefan Karlsson wrote: >> >> (resending without formatting) >> >> Hi Vladimir, >> >> I just took a quick look at the GC code. >> >> 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. 
>> >> Some examples: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >> >> #include "utilities/debug.hpp" >> #include "utilities/macros.hpp" >> + #include "aot/aotLoader.hpp" >> >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html >> >> #include "gc/g1/g1Policy.hpp" >> #include "gc/g1/g1RootClosures.hpp" >> #include "gc/g1/g1RootProcessor.hpp" >> #include "gc/g1/heapRegion.inline.hpp" >> #include "memory/allocation.inline.hpp" >> + #include "aot/aotLoader.hpp" >> #include "runtime/fprofiler.hpp" >> #include "runtime/mutex.hpp" >> #include "services/management.hpp" >> >> 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. >> >> For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + if (UseAOT) { >> + AOTLoader::oops_do(adjust_pointer_closure()); >> + } >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> Would be: >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + AOTLoader::oops_do(adjust_pointer_closure()); >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. 
Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html >> >> // Reserved area >> char* low_boundary() const { return _low_boundary; } >> char* high_boundary() const { return _high_boundary; } >> >> + void set_low_boundary(char *p) { _low_boundary = p; } >> + void set_high_boundary(char *p) { _high_boundary = p; } >> + void set_low(char *p) { _low = p; } >> + void set_high(char *p) { _high = p; } >> + >> bool special() const { return _special; } >> >> These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? >> >> >> 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html >> >> Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. >> >> Thanks, >> StefanK >> >>> On 27/10/16 03:15, Vladimir Kozlov wrote: >>> AOT JEP: >>> https://bugs.openjdk.java.net/browse/JDK-8166089 >>> Subtask: >>> https://bugs.openjdk.java.net/browse/JDK-8166415 >>> Webrev: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> Please review Hotspot VM part of AOT changes. >>> >>> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot will be built only on Linux/x64. >>> >>> AOT code is NOT linked during AOT libraries load as it happens with normal .so libraries. AOT code entry points are not exposed (not global) in AOT libraries. Only class data has global labels which we look for with dlsym(klass_name). >>> >>> AOT-compiled code in AOT libraries is treated by JVM as *extension* of existing CodeCache. 
When a java class is loaded JVM looks if corresponding AOT-compiled methods exist in loaded AOT libraries and adds links to them from java method descriptors (we have new field Method::_aot_code). AOT-compiled code follows the same invocation/deoptimization/unloading rules as normal JIT-compiled code. >>> >>> Calls in AOT code use the same method resolution runtime code as calls in JITed code. The difference is that the call's destination address is loaded indirectly because we can't patch AOT code - it is immutable (to share between multiple JVM instances). >>> >>> Classes and Strings referenced in AOT code are resolved lazily by calling into runtime. All mutable pointers (oops (mostly strings), metadata) are stored and modified in a separate mutable memory (GOT cells) - they are not embedded into AOT code. >>> >>> Changes include klass fingerprint generation since we need it to find correct klass data in loaded AOT libraries. >>> >>> Thanks, >>> Vladimir > From trevor.d.watson at oracle.com Tue Nov 1 17:16:09 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Tue, 1 Nov 2016 17:16:09 +0000 Subject: Tests for lzcnt Message-ID: <23a928c5-704f-a3b6-2bdb-cba9bd424fdd@oracle.com> I'm working on https://bugs.openjdk.java.net/browse/JDK-8162865, which implements the inlining of LZCNT instructions on SPARC platforms that support it. I have the code implemented and have written a test-case that validates the values returned by Long.numberOfLeadingZeros() and Integer.numberOfLeadingZeros(). Looking through the hotspot tests, I see there is some testing of lzcnt in compiler/intrinsics/bmi. I've adapted TestLzcntI.java and TestLzcntL.java to check for the relevant SPARC feature as well as the x86/x64 feature. These tests work fine, but only validate that C2 generates the same results as interpreted code for a selection of random values. 
Whilst that is perfectly valid, I'd like to also be able to verify that the inlined code for the lz count for each power of 2 in an Integer and Long produces correct values (per my standalone test). Would the "bmi" directory be the appropriate place to add a new test like this or even hold a test which supports both x86/x64 and SPARC given that "bmi" appears to refer to some kind of x86/x64 cpu feature set? Or am I reading too much into "bmi" and it's just used here as a generic name? I hope that made sense. Thanks, Trevor From kirill.zhaldybin at oracle.com Tue Nov 1 17:30:09 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Tue, 1 Nov 2016 20:30:09 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part Message-ID: Dear all, Could you please review this fix for 8169003? I changed the parsing of the time string so that it no longer depends on the LC_NUMERIC locale; the test does not fail if a locale where the "floating point" is actually a comma is set. WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ CR: https://bugs.openjdk.java.net/browse/JDK-8169003 Thank you. Regards, Kirill From claes.redestad at oracle.com Tue Nov 1 21:22:45 2016 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 1 Nov 2016 22:22:45 +0100 Subject: RFR 8163553 java.lang.LinkageError from test java/lang/ThreadGroup/Stop.java In-Reply-To: <8C56B19F-22EC-4E1E-AA47-E0D629231B07@oracle.com> References: <5EA4A44D-3E66-4B76-8160-163580606FF1@oracle.com> <8C56B19F-22EC-4E1E-AA47-E0D629231B07@oracle.com> Message-ID: <581907A5.8020600@oracle.com> +1 /Claes On 2016-10-27 21:24, Paul Sandoz wrote: > Gentle reminder. > > Paul. 
> >> On 18 Oct 2016, at 11:41, Paul Sandoz wrote: >> >> Hi, >> >> Please review: >> >> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8163553-vh-mh-link-errors-not-wrapped/webrev/ >> >> This is the issue that motivated a change in the behaviour of indy wrapping Errors in BootstrapMethodError, JDK-8166974. I plan to push this issue with JDK-8166974 to hs, since they are related in behaviour even though there is no direct dependency between the patches. >> >> >> When invoking signature-polymorphic methods a similar but hardcoded dance occurs, with an appeal to Java code, to link the call site. >> >> - MethodHandle.invoke/invokeExact (and the VH methods) would wrap all Errors in LinkageError. Now they are passed through, thus an Error like ThreadDeath is not wrapped. >> >> - MethodHandle.invoke/invokeExact/invokeBasic throw Throwable, and in certain cases the Throwable is wrapped in an InternalError. In many other cases Error and RuntimeException are propagated, which I think in general is the right pattern, so I consistently applied that. >> >> - I updated StringConcatFactory to also pass through Errors and avoid unduly wrapping StringConcatException in another instance of StringConcatException. (LambdaMetafactory and associated classes required no changes.) >> >> Thanks, >> Paul. > From 1072213404 at qq.com Wed Nov 2 08:11:35 2016 From: 1072213404 at qq.com (恶灵骑士) Date: Wed, 2 Nov 2016 16:11:35 +0800 Subject: what's the function and difference between compiler c1 and c2? Message-ID: What's the function of and difference between the HotSpot compilers C1 and C2? About C1: I have found something about "volatile" and the methods LIRGenerator::do_StoreField(StoreField* x) and LIRGenerator::do_LoadField(LoadField* x) in "share/vm/c1/c1_LIRGenerator.cpp". When operating on a variable with a volatile qualifier, will it finally invoke the do_StoreField or do_LoadField method? If so, 
then by which method is do_LoadField or do_StoreField invoked?

From 1072213404 at qq.com Wed Nov 2 08:19:18 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Wed, 2 Nov 2016 16:19:18 +0800 Subject: =?gb18030?B?cGxlYXNlIGhlbHAgdW5kZXJzdGFuZGluZyB3aGF0?= =?gb18030?B?J3MgdGhlIGZ1bmN0aW9uIGFuZCBkaWZmcmVuY2Ug?= =?gb18030?B?YmV0d2VlbiBob3RzcG90IGNvbXBpbGVyIGMxIGFu?= =?gb18030?B?ZCBjMiCjvw==?= Message-ID: What's the function and difference between the HotSpot compilers C1 and C2? About C1: I have found something about "volatile" and the methods LIRGenerator::do_StoreField(StoreField* x) and LIRGenerator::do_LoadField(LoadField* x) in "share/vm/c1/c1_LIRGenerator.cpp". When operating on a variable with a volatile qualifier, will it finally invoke the do_StoreField or the do_LoadField method? If true, then by which method is do_LoadField or do_StoreField invoked?

From rednaxelafx at gmail.com Wed Nov 2 08:28:50 2016 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 2 Nov 2016 01:28:50 -0700 Subject: =?UTF-8?Q?Re=3A_please_help_understanding_what=27s_the_function_an?= =?UTF-8?Q?d_diffrence_between_hotspot_compiler_c1_and_c2_=EF=BC=9F?= In-Reply-To: References: Message-ID: Hi, I don't think I understand your question, but I'll take a shot. Are you trying to ask what the differences are between C1 and C2, with regards to how they handle volatile field accesses? For C1, yes, all Java field accesses (load/store) are represented in the HIR with LoadField and StoreField instructions. The ciField in these instructions would carry the information about whether the field is volatile or not. When lowering HIR to LIR, the LIRGenerator::do_LoadField() and do_StoreField() functions are called. What is it that you're trying to learn about these functions? For C2, it's a bit complicated, because volatile semantics involve the memory graph portion of C2's Sea-of-nodes IR. You may want to refer to [1] and [2] for some background information before you dive into the code.
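[Editorial note: to make the C1 case above concrete, here is a minimal hypothetical Java example. Each access to the volatile field below is the kind of access C1 models as a LoadField/StoreField HIR node whose ciField reports it as volatile, so do_LoadField()/do_StoreField() insert the required memory barriers during LIR lowering. The class and field names are illustrative only.]

```java
// Hypothetical illustration: the accesses to 'done' below are the kind of
// volatile field loads/stores that C1 represents as LoadField/StoreField
// HIR instructions and fences in do_LoadField()/do_StoreField().
class VolatileFlag {
    private volatile boolean done;   // volatile => ciField reports is_volatile()

    void finish() {                  // volatile store: barriers emitted around it
        done = true;
    }

    boolean isDone() {               // volatile load: acquire-style ordering
        return done;
    }

    public static void main(String[] args) {
        VolatileFlag flag = new VolatileFlag();
        flag.finish();
        System.out.println(flag.isDone());   // prints "true"
    }
}
```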
Hope it helps, Kris [1]: https://wiki.openjdk.java.net/display/HotSpot/Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation [2]: https://wiki.openjdk.java.net/display/HotSpot/C2+IR+Graph+and+Nodes On Wed, Nov 2, 2016 at 1:19 AM, ???? <1072213404 at qq.com> wrote: > what's the function and diffrence between hotspot compiler c1 and c2 ? > > > about c1? > i have found something about ?voaltile? and > methods LIRGenerator::do_StoreField(StoreField* x) and > LIRGenerator::do_LoadField(LoadField* x) in > ?share/vm/c1/c1_LIRGenerator.cpp?? > > > when operating a variable with a volatile qualifier?will it finally > invoke do_StoreField or do_LoadField method? > if true? then method do_LoadField or do_StoreField by which method?

From 1072213404 at qq.com Wed Nov 2 08:48:30 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Wed, 2 Nov 2016 16:48:30 +0800 Subject: =?gb18030?B?IHBsZWFzZSBoZWxwIHVuZGVyc3RhbmRpbmcgd2hh?= =?gb18030?B?dCdzIHRoZSByZWxhdGlvbnNoaXAgb2YgIGhvdHNw?= =?gb18030?B?b3QgY29tcGlsZXIgYzEgYW5kIGMyIKO/?= In-Reply-To: References: Message-ID: Thank you, Krystal. I think I need some time to make these things clear. And which part of the code links HIR and LIR together in OpenJDK? C1 matches client mode? C2 matches server mode? When some code is running, does just one of them work, or do both? ------------------ ???? ------------------ ???: "Krystal Mok";; ????: 2016?11?2?(???) ??4:28 ???: "????"<1072213404 at qq.com>; ??: "hotspot-dev"; ??: Re: please help understanding what's the function and diffrence between hotspot compiler c1 and c2 ? Hi, I don't think I understand your question, but I'll take a shot. Are you trying to ask what the differences are between C1 and C2, with regards to how they handle volatile field accesses? For C1, yes, all Java fields accesses (load/store) are represented in the HIR with LoadField and StoreField instructions.
The ciField in these instructions would carry the information about whether the field is volatile or not. When lowering HIR to LIR, the LIRGenerator::do_LoadField() and do_StoreField() functions are called. What is it that you're trying to learn about these functions? For C2, it's a bit complicated, because volatile semantics involve the memory graph portion of C2's Sea-of-nodes IR. You may want to refer to [1] and [2] for some background information before you dive into the code. Hope it helps, Kris [1]: https://wiki.openjdk.java.net/display/HotSpot/Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation [2]: https://wiki.openjdk.java.net/display/HotSpot/C2+IR+Graph+and+Nodes On Wed, Nov 2, 2016 at 1:19 AM, ???? <1072213404 at qq.com> wrote: what's the function and diffrence between hotspot compiler c1 and c2 ? about c1? i have found something about ?voaltile? and methods LIRGenerator::do_StoreField(StoreField* x) and LIRGenerator::do_LoadField(LoadField* x) in ?share/vm/c1/c1_LIRGenerator.cpp?? when operating a variable with a volatile qualifier?will it finally invoke do_StoreField or do_LoadField method? if true? then method do_LoadField or do_StoreField by which method? From rednaxelafx at gmail.com Wed Nov 2 08:54:10 2016 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 2 Nov 2016 01:54:10 -0700 Subject: =?UTF-8?Q?Re=3A_please_help_understanding_what=27s_the_relationshi?= =?UTF-8?Q?p_of_hotspot_compiler_c1_and_c2_=EF=BC=9F?= In-Reply-To: References: Message-ID: Hi, By "HIR" and "LIR", I specifically mean the "High-level IR" and "Low-level IR" for C1. In C1, the GraphBuilder is what parses Java bytecodes into HIR, and then the LIRGenerator is what lowers HIR into LIR, and finally the LIRAssembler is what encodes LIR into actual machine code. C1 is the "Client Compiler", and C2 is the "Server Compiler". In a HotSpot Client VM build (which is by default not available on 64-bit architectures), it only contains C1. 
In a JDK7+ HotSpot Server VM, the VM actually contains both C1 and C2 compilers. They can work together in what's called a "tiered compilation system", where methods can be interpreted first, then compiled by C1, and then further compiled by C2. In JDK7, -XX:+TieredCompilation is off by default, whereas in JDK8 it's on by default. - Kris On Wed, Nov 2, 2016 at 1:48 AM, ???? <1072213404 at qq.com> wrote: > > Thank you ?Krystal ? > i think i need some time to make these things light? > > and > which part of code link HIR and LIR together in openjdk ? > > c1 matches client mode? > c2 matches server mode? > when some code running? just one of them work or both do? > > > > > > > ------------------ ???? ------------------ > *???:* "Krystal Mok";; > *????:* 2016?11?2?(???) ??4:28 > *???:* "????"<1072213404 at qq.com>; > *??:* "hotspot-dev"; > *??:* Re: please help understanding what's the function and diffrence > between hotspot compiler c1 and c2 ? > > Hi, > > I don't think I understand your question, but I'll take a shot. > Are you trying to ask what the differences are between C1 and C2, with > regards to how they handle volatile field accesses? > > For C1, yes, all Java fields accesses (load/store) are represented in the > HIR with LoadField and StoreField instructions. The ciField in these > instructions would carry the information about whether the field is > volatile or not. > When lowering HIR to LIR, the LIRGenerator::do_LoadField() and > do_StoreField() functions are called. What is it that you're trying to > learn about these functions? > > For C2, it's a bit complicated, because volatile semantics involve the > memory graph portion of C2's Sea-of-nodes IR. You may want to refer to [1] > and [2] for some background information before you dive into the code.
> > Hope it helps, > Kris > > [1]: https://wiki.openjdk.java.net/display/HotSpot/ > Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation > [2]: https://wiki.openjdk.java.net/display/HotSpot/C2+IR+Graph+and+Nodes > > On Wed, Nov 2, 2016 at 1:19 AM, ???? <1072213404 at qq.com> wrote: > >> what's the function and diffrence between hotspot compiler c1 and c2 ? >> >> >> about c1? >> i have found something about ?voaltile? and >> methods LIRGenerator::do_StoreField(StoreField* x) and >> LIRGenerator::do_LoadField(LoadField* x) in >> ?share/vm/c1/c1_LIRGenerator.cpp?? >> >> >> when operating a variable with a volatile qualifier?will it finally >> invoke do_StoreField or do_LoadField method? >> if true? then method do_LoadField or do_StoreField by which method? > > >

From hyperdak at gmail.com Wed Nov 2 09:34:30 2016 From: hyperdak at gmail.com (=?UTF-8?B?5Lqi5Lyf5qWg?=) Date: Wed, 2 Nov 2016 17:34:30 +0800 Subject: =?UTF-8?Q?Re=3A_please_help_understanding_what=27s_the_relationshi?= =?UTF-8?Q?p_of_hotspot_compiler_c1_and_c2_=EF=BC=9F?= In-Reply-To: References: Message-ID: Hi, When tiered compilation is used (enabled by default in JDK 8), the tiered VM can use both C1 and C2 [1]. The Client VM will use C1 and the Server VM will use C2. Thanks, hyperdak [1] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#tieredcompilation

From 1072213404 at qq.com Wed Nov 2 09:40:13 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Wed, 2 Nov 2016 17:40:13 +0800 Subject: =?gb18030?B?u9i4tKO6IHBsZWFzZSBoZWxwIHVuZGVyc3RhbmRp?= =?gb18030?B?bmcgd2hhdCdzIHRoZSByZWxhdGlvbnNoaXAgb2Yg?= =?gb18030?B?aG90c3BvdCBjb21waWxlciBjMSBhbmQgYzIgo78=?= In-Reply-To: References: Message-ID: Hi, The Server VM will use C2; in this mode, which method processes 'volatile' operations, the way do_StoreField does in src/share/vm/c1/c1_LIRGenerator.cpp? Thank you! ------------------ ???? ------------------ ???: "???";; ????: 2016?11?2?(???)
??5:34 ???: "????"<1072213404 at qq.com>; ??: "Krystal Mok"; "hotspot-dev"; ??: Re: please help understanding what's the relationship of hotspot compiler c1 and c2 ? Hi, When use tiered compilation (default enable in jdk8),tiered VM can use C1 and C2 both [1].Client VM will use C1 and Server VM will use C2. Thanks, hyperdak [1] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#tieredcompilation

From aph at redhat.com Wed Nov 2 10:59:26 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 2 Nov 2016 10:59:26 +0000 Subject: =?UTF-8?B?UmU6IOWbnuWkje+8miBwbGVhc2UgaGVscCB1bmRlcnN0YW5kaW5nIHdo?= =?UTF-8?Q?at's_the_relationship_of_hotspot_compiler_c1_and_c2_=ef=bc=9f?= In-Reply-To: References: Message-ID: <3b13f1b8-e560-e666-d1a6-fde0ec474ce7@redhat.com> On 02/11/16 09:40, ???? wrote: > which method processes 'volatile' operations like method do_StoreField in src/share/vm/c1/c1_LIRGenerator.cpp?

It's done in line 1771:

    if (is_volatile && os::is_MP()) {
      __ membar_release();
    }

and 1793:

    if (!support_IRIW_for_not_multiple_copy_atomic_cpu && is_volatile && os::is_MP()) {
      __ membar();
    }

Andrew.

From vladimir.kozlov at oracle.com Thu Nov 3 02:51:12 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Nov 2016 19:51:12 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> Message-ID: <452951ff-f480-57a0-bf4b-def10a599424@oracle.com> Thank you, Coleen On 10/31/16 6:35 PM, Coleen Phillimore wrote: > > I looked at the runtime code and it looks fine to me. I'm pleased the > changes were not more invasive. Some minor questions and nits: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/code/nmethod.cpp.udiff.html > > > + virtual void set_to_interpreted(methodHandle method, CompiledICInfo& > info) { > > > Can you pass methodHandle by const reference so that the copy > constructor and destructor aren't called?
It was the original declaration for CompiledStaticCall::set_to_interpreted(): http://hg.openjdk.java.net/jdk9/hs/hotspot/file/031e87605d21/src/share/vm/code/compiledIC.hpp#l300 But your suggestion is good - I implemented it: set_to_interpreted(const methodHandle& method, I also have the same change for set_to_far(). > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/oops/methodCounters.hpp.udiff.html > > Why does this add a Method* pointer for #ifndef AOT code? This could > be a lot of additional footprint. Good catch. Put it under #if INCLUDE_AOT. > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/globals.hpp.udiff.html > > Why are the AOT parameters in two separate sections? The intx ones > should be defined with a valid range. We think the Tiered compilation flags should be together. I added the missing range(). > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/vmStructs.cpp.udiff.html > > Why is this added and the SA code fixed? AOT doesn't use the SA, does > it? Was it added for debugging? Yes, AOT does not use the SA. It is for debugging of core files, to correctly calculate the size of the instanceKlass structure - it depends on the presence of the fingerprint field: +// [EMBEDDED fingerprint] only if should_store_fingerprint()==true Ioi added that for the klass fingerprint code, which is part of the hotspot AOT changes: https://bugs.openjdk.java.net/browse/JDK-8165142 Thanks, Vladimir
AOT code entry points are not exposed (not >> global) in AOT libraries. Only class data has global labels which we >> look for with dlsym(klass_name). >> >> AOT-compiled code in AOT libraries is treated by JVM as *extension* of >> existing CodeCache. When a java class is loaded JVM looks if >> corresponding AOT-compiled methods exist in loaded AOT libraries and >> add links to them from java methods descriptors (we have new field >> Method::_aot_code). AOT-compiled code follows the same >> invocation/deoptimization/unloading rules as normal JIT-compiled code. >> >> Calls in AOT code use the same methods resolution runtime code as >> calls in JITed code. The difference is call's destination address is >> loaded indirectly because we can't patch AOT code - it is immutable >> (to share between multiple JVM instances). >> >> Classes and Strings referenced in AOT code are resolved lazily by >> calling into runtime. All mutable pointers (oops (mostly strings), >> metadata) are stored and modified in a separate mutable memory (GOT >> cells) - they are not embedded into AOT code. >> >> Changes includes klass fingerprint generation since we need it to find >> correct klass data in loaded AOT libraries. >> >> Thanks, >> Vladimir > From vladimir.kozlov at oracle.com Thu Nov 3 06:54:33 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Nov 2016 23:54:33 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> Message-ID: <3ec892ec-0a2b-da64-8ba3-c814efc1180e@oracle.com> Thank you, Stefan On 11/1/16 12:38 AM, Stefan Karlsson wrote: > (resending without formatting) > > Hi Vladimir, > > I just took a quick look at the GC code. > > 1) You need to go over the entire patch and fix all the include lines > that were added. They are are not sorted, as they should. Done. 
> > Some examples: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html > > > #include "utilities/debug.hpp" > #include "utilities/macros.hpp" > + #include "aot/aotLoader.hpp" > > > > 2) I'd prefer if the the check if AOT is enabled was folded into > AOTLoader::oops_do, so that the additions to the GC code would be less > conspicuous. Done. But I did not remove UseAOT from complex conditions, to avoid executing the checks that follow it, for example:

    + if (UseAOT && !_process_strong_tasks->is_task_claimed(GCH_PS_aot_oops_do)) {
    +   AOTLoader::oops_do(strong_roots);
    + }

> > For example: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html > > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), > CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + if (UseAOT) { > + AOTLoader::oops_do(adjust_pointer_closure()); > + } > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > Would be: > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), > CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + AOTLoader::oops_do(adjust_pointer_closure()); > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > 3) aotLoader.hpp uses implements methods using GrowableArray. This will > expose the growable array functions to all includers of that file. > Please move all that code out to an aotLoader.inline.hpp file, and then > remove the unneeded includes from the aotLoader.hpp file. > Done.
> 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html > > > // Reserved area > char* low_boundary() const { return _low_boundary; } > char* high_boundary() const { return _high_boundary; } > > + void set_low_boundary(char *p) { _low_boundary = p; } > + void set_high_boundary(char *p) { _high_boundary = p; } > + void set_low(char *p) { _low = p; } > + void set_high(char *p) { _high = p; } > + > bool special() const { return _special; } > > These seems unsafe to me, but that might be because I don't understand > how this is used. VirtualSpace has three sections, the lower, middle, > and the high. The middle section might have another alignment (large > pages) than the others. Is this property still maintained when these > functions are used? This is used only by AOT code because it does not call VirtualSpace::initialize_with_granularity() when it creates an AOTCodeHeap (inherited from CodeHeap). There is no actual memory heap reservation for AOT code. We set those boundary values to the code section addresses in the AOT library. AOT does not use the alignment or the middle section. I put #if INCLUDE_AOT around these methods to make it clear where they are used. > > 5) > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html > > > Did you discuss with the Runtime team about the naming of these tags? > The other class* tags where split up into multiple tags. For example, > classload was changed to class,load. I will reply to this in the following mail to Coleen. Thanks, Vladimir
>> >> AOT code is NOT linked during AOT libraries load as it happens with >> normal .so libraries. AOT code entry points are not exposed (not >> global) in AOT libraries. Only class data has global labels which we >> look for with dlsym(klass_name). >> >> AOT-compiled code in AOT libraries is treated by JVM as *extension* of >> existing CodeCache. When a java class is loaded JVM looks if >> corresponding AOT-compiled methods exist in loaded AOT libraries and >> add links to them from java methods descriptors (we have new field >> Method::_aot_code). AOT-compiled code follows the same >> invocation/deoptimization/unloading rules as normal JIT-compiled code. >> >> Calls in AOT code use the same methods resolution runtime code as >> calls in JITed code. The difference is call's destination address is >> loaded indirectly because we can't patch AOT code - it is immutable >> (to share between multiple JVM instances). >> >> Classes and Strings referenced in AOT code are resolved lazily by >> calling into runtime. All mutable pointers (oops (mostly strings), >> metadata) are stored and modified in a separate mutable memory (GOT >> cells) - they are not embedded into AOT code. >> >> Changes includes klass fingerprint generation since we need it to find >> correct klass data in loaded AOT libraries. >> >> Thanks, >> Vladimir > > From stefan.karlsson at oracle.com Thu Nov 3 07:57:27 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 3 Nov 2016 08:57:27 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <3ec892ec-0a2b-da64-8ba3-c814efc1180e@oracle.com> References: <58115536.5080205@oracle.com> <3ec892ec-0a2b-da64-8ba3-c814efc1180e@oracle.com> Message-ID: <3a21f4ec-3364-66e5-ab39-296e9db810cc@oracle.com> Thanks! StefanK On 03/11/16 07:54, Vladimir Kozlov wrote: > Thank you, Stefan > > On 11/1/16 12:38 AM, Stefan Karlsson wrote: >> (resending without formatting) >> >> Hi Vladimir, >> >> I just took a quick look at the GC code. 
>> >> 1) You need to go over the entire patch and fix all the include lines >> that were added. They are are not sorted, as they should. > > Done. > >> >> Some examples: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >> >> >> >> #include "utilities/debug.hpp" >> #include "utilities/macros.hpp" >> + #include "aot/aotLoader.hpp" >> >> >> >> 2) I'd prefer if the the check if AOT is enabled was folded into >> AOTLoader::oops_do, so that the additions to the GC code would be less >> conspicuous. > > Done. But I don't remove UseAOT for complex checks to avoid executing > following checks, like next: > > + if (UseAOT && > !_process_strong_tasks->is_task_claimed(GCH_PS_aot_oops_do)) { > + AOTLoader::oops_do(strong_roots); > + } > >> >> For example: >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >> >> >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >> CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + if (UseAOT) { >> + AOTLoader::oops_do(adjust_pointer_closure()); >> + } >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> >> Would be: >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >> CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + AOTLoader::oops_do(adjust_pointer_closure()); >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> >> 3) aotLoader.hpp uses implements methods using GrowableArray. This will >> expose the growable array functions to all includers of that file. 
>> Please move all that code out to an aotLoader.inline.hpp file, and then >> remove the unneeded includes from the aotLoader.hpp file. >> > > Done. > >> 4) >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html >> >> >> // Reserved area >> char* low_boundary() const { return _low_boundary; } >> char* high_boundary() const { return _high_boundary; } >> >> + void set_low_boundary(char *p) { _low_boundary = p; } >> + void set_high_boundary(char *p) { _high_boundary = p; } >> + void set_low(char *p) { _low = p; } >> + void set_high(char *p) { _high = p; } >> + >> bool special() const { return _special; } >> >> These seems unsafe to me, but that might be because I don't understand >> how this is used. VirtualSpace has three sections, the lower, middle, >> and the high. The middle section might have another alignment (large >> pages) than the others. Is this property still maintained when these >> functions are used? > > This is used only by AOT code because it does not call > VirtualSpace::initialize_with_granularity() when creates AOTCodeHeap > (inherited from CodeHeap). There is no actual memory heap reservation > for AOT code. We set those boundary values to code section addresses > in AOT library. AOT does not use alignment and middle section. > > I put #if INCLUDE_AOT around these methods to be clear where they are > used. > >> >> 5) >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html >> >> >> >> Did you discuss with the Runtime team about the naming of these tags? >> The other class* tags where split up into multiple tags. For example, >> classload was changed to class,load. > > Will reply to following Coleen's mail. 
> > Thanks, > Vladimir > >> >> Thanks, >> StefanK >> >> On 27/10/16 03:15, Vladimir Kozlov wrote: >>> AOT JEP: >>> https://bugs.openjdk.java.net/browse/JDK-8166089 >>> Subtask: >>> https://bugs.openjdk.java.net/browse/JDK-8166415 >>> Webrev: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> Please, review Hotspot VM part of AOT changes. >>> >>> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot >>> will be build only on Linux/x64. >>> >>> AOT code is NOT linked during AOT libraries load as it happens with >>> normal .so libraries. AOT code entry points are not exposed (not >>> global) in AOT libraries. Only class data has global labels which we >>> look for with dlsym(klass_name). >>> >>> AOT-compiled code in AOT libraries is treated by JVM as *extension* of >>> existing CodeCache. When a java class is loaded JVM looks if >>> corresponding AOT-compiled methods exist in loaded AOT libraries and >>> add links to them from java methods descriptors (we have new field >>> Method::_aot_code). AOT-compiled code follows the same >>> invocation/deoptimization/unloading rules as normal JIT-compiled code. >>> >>> Calls in AOT code use the same methods resolution runtime code as >>> calls in JITed code. The difference is call's destination address is >>> loaded indirectly because we can't patch AOT code - it is immutable >>> (to share between multiple JVM instances). >>> >>> Classes and Strings referenced in AOT code are resolved lazily by >>> calling into runtime. All mutable pointers (oops (mostly strings), >>> metadata) are stored and modified in a separate mutable memory (GOT >>> cells) - they are not embedded into AOT code. >>> >>> Changes includes klass fingerprint generation since we need it to find >>> correct klass data in loaded AOT libraries. 
>>> >>> Thanks, >>> Vladimir >> >>

From vladimir.kozlov at oracle.com Thu Nov 3 11:33:38 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Nov 2016 04:33:38 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> Message-ID: <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Done:

    java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so HelloWorld
    [0.060s][trace][aot,class,load] found java.lang.Object in ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000
    ...

I updated the webrev with all of your, Coleen's, and Stefan's suggestions: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ This is the delta of changes: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ Thanks, Vladimir On 11/1/16 3:40 AM, Coleen Phillimore wrote: > 5. Thanks for pointing out the logging tags Stefan. Yes we would prefer adding "aot" and "fingerprint" and using the composition of existing tags for logging. > Thanks > Coleen > > >> On Nov 1, 2016, at 3:38 AM, Stefan Karlsson wrote: >> >> (resending without formatting) >> >> Hi Vladimir, >> >> I just took a quick look at the GC code. >> >> 1) You need to go over the entire patch and fix all the include lines that were added. They are are not sorted, as they should.
>> >> Some examples: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >> >> #include "utilities/debug.hpp" >> #include "utilities/macros.hpp" >> + #include "aot/aotLoader.hpp" >> >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html >> >> #include "gc/g1/g1Policy.hpp" >> #include "gc/g1/g1RootClosures.hpp" >> #include "gc/g1/g1RootProcessor.hpp" >> #include "gc/g1/heapRegion.inline.hpp" >> #include "memory/allocation.inline.hpp" >> + #include "aot/aotLoader.hpp" >> #include "runtime/fprofiler.hpp" >> #include "runtime/mutex.hpp" >> #include "services/management.hpp" >> >> 2) I'd prefer if the the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. >> >> For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + if (UseAOT) { >> + AOTLoader::oops_do(adjust_pointer_closure()); >> + } >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> Would be: >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + AOTLoader::oops_do(adjust_pointer_closure()); >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> 3) aotLoader.hpp uses implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. 
Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html >> >> // Reserved area >> char* low_boundary() const { return _low_boundary; } >> char* high_boundary() const { return _high_boundary; } >> >> + void set_low_boundary(char *p) { _low_boundary = p; } >> + void set_high_boundary(char *p) { _high_boundary = p; } >> + void set_low(char *p) { _low = p; } >> + void set_high(char *p) { _high = p; } >> + >> bool special() const { return _special; } >> >> These seems unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections, the lower, middle, and the high. The middle section might have another alignment (large pages) than the others. Is this property still maintained when these functions are used? >> >> >> 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html >> >> Did you discuss with the Runtime team about the naming of these tags? The other class* tags where split up into multiple tags. For example, classload was changed to class,load. >> >> Thanks, >> StefanK >> >>> On 27/10/16 03:15, Vladimir Kozlov wrote: >>> AOT JEP: >>> https://bugs.openjdk.java.net/browse/JDK-8166089 >>> Subtask: >>> https://bugs.openjdk.java.net/browse/JDK-8166415 >>> Webrev: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> Please, review Hotspot VM part of AOT changes. >>> >>> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot will be build only on Linux/x64. >>> >>> AOT code is NOT linked during AOT libraries load as it happens with normal .so libraries. AOT code entry points are not exposed (not global) in AOT libraries. Only class data has global labels which we look for with dlsym(klass_name). >>> >>> AOT-compiled code in AOT libraries is treated by JVM as *extension* of existing CodeCache. 
When a java class is loaded JVM looks if corresponding AOT-compiled methods exist in loaded AOT libraries and add links to them from java methods descriptors (we have new field Method::_aot_code). AOT-compiled code follows the same invocation/deoptimization/unloading rules as normal JIT-compiled code. >>> >>> Calls in AOT code use the same methods resolution runtime code as calls in JITed code. The difference is call's destination address is loaded indirectly because we can't patch AOT code - it is immutable (to share between multiple JVM instances). >>> >>> Classes and Strings referenced in AOT code are resolved lazily by calling into runtime. All mutable pointers (oops (mostly strings), metadata) are stored and modified in a separate mutable memory (GOT cells) - they are not embedded into AOT code. >>> >>> Changes includes klass fingerprint generation since we need it to find correct klass data in loaded AOT libraries. >>> >>> Thanks, >>> Vladimir >> >> > From stefan.karlsson at oracle.com Thu Nov 3 12:47:36 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 3 Nov 2016 13:47:36 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Message-ID: Hi Vladimir, On 03/11/16 12:33, Vladimir Kozlov wrote: > Done: > > java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so > HelloWorld > [0.060s][trace][aot,class,load] found java.lang.Object in > ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 > ... > > I updated webrev with your, Coleen, and Stefan all suggestions: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ > > this is delta of changes: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ Looks good to me. 
I noticed one nit: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.cpp.udiff.html #include "precompiled.hpp" *+ * *+ #include "aot/aotLoader.hpp"* *+ #include "aot/aotCodeHeap.hpp"* The lines should be swapped. Thanks, StefanK > > Thanks, > Vladimir > > On 11/1/16 3:40 AM, Coleen Phillimore wrote: >> 5. Thanks for pointing out the logging tags Stefan. Yes we would >> prefer adding "aot" and "fingerprint" and using the composition of >> existing tags for logging. >> Thanks >> Coleen >> >> >>> On Nov 1, 2016, at 3:38 AM, Stefan Karlsson >>> wrote: >>> >>> (resending without formatting) >>> >>> Hi Vladimir, >>> >>> I just took a quick look at the GC code. >>> >>> 1) You need to go over the entire patch and fix all the include >>> lines that were added. They are not sorted, as they should be. >>> >>> Some examples: >>> >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >>> >>> >>> #include "utilities/debug.hpp" >>> #include "utilities/macros.hpp" >>> + #include "aot/aotLoader.hpp" >>> >>> >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html >>> >>> >>> #include "gc/g1/g1Policy.hpp" >>> #include "gc/g1/g1RootClosures.hpp" >>> #include "gc/g1/g1RootProcessor.hpp" >>> #include "gc/g1/heapRegion.inline.hpp" >>> #include "memory/allocation.inline.hpp" >>> + #include "aot/aotLoader.hpp" >>> #include "runtime/fprofiler.hpp" >>> #include "runtime/mutex.hpp" >>> #include "services/management.hpp" >>> >>> 2) I'd prefer if the check if AOT is enabled was folded into >>> AOTLoader::oops_do, so that the additions to the GC code would be >>> less conspicuous.
>>> >>> For example: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >>> >>> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >>> CodeBlobToOopClosure::FixRelocations); >>> CodeCache::blobs_do(&adjust_from_blobs); >>> + if (UseAOT) { >>> + AOTLoader::oops_do(adjust_pointer_closure()); >>> + } >>> StringTable::oops_do(adjust_pointer_closure()); >>> ref_processor()->weak_oops_do(adjust_pointer_closure()); >>> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >>> >>> >>> Would be: >>> >>> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >>> CodeBlobToOopClosure::FixRelocations); >>> CodeCache::blobs_do(&adjust_from_blobs); >>> + AOTLoader::oops_do(adjust_pointer_closure()); >>> StringTable::oops_do(adjust_pointer_closure()); >>> ref_processor()->weak_oops_do(adjust_pointer_closure()); >>> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >>> >>> >>> 3) aotLoader.hpp implements methods using GrowableArray. This >>> will expose the growable array functions to all includers of that >>> file. Please move all that code out to an aotLoader.inline.hpp file, >>> and then remove the unneeded includes from the aotLoader.hpp file. >>> >>> [...]
All mutable pointers (oops (mostly strings), >>>> metadata) are stored and modified in a separate mutable memory (GOT >>>> cells) - they are not embedded into AOT code. >>>> >>>> Changes includes klass fingerprint generation since we need it to >>>> find correct klass data in loaded AOT libraries. >>>> >>>> Thanks, >>>> Vladimir >>> >>> >> From bob.vandette at oracle.com Thu Nov 3 14:06:36 2016 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 3 Nov 2016 10:06:36 -0400 Subject: RFR: 8167501 ARMv7 Linux C2 compiler crashes running jtreg harness on MP systems Message-ID: <8595EB1C-1E3C-44A8-AE88-71DBEA485A54@oracle.com> Please review this JDK9 work-around for a reliability problem causing crashes and hangs running jtreg on ARMv7 MP platforms using the server compiler. This work-around disables the use of quick-enter on ARM. This enhancement was previously disabled for AARCH64 binaries. This work-around has been independently verified by running jtreg on two different MP based ARM systems. https://bugs.openjdk.java.net/browse/JDK-8167501 diff --git a/src/share/vm/runtime/sharedRuntime.cpp b/src/share/vm/runtime/sharedRuntime.cpp --- a/src/share/vm/runtime/sharedRuntime.cpp +++ b/src/share/vm/runtime/sharedRuntime.cpp @@ -1983,8 +1983,10 @@ // Handles the uncommon case in locking, i.e., contention or an inflated lock. JRT_BLOCK_ENTRY(void, SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* lock, JavaThread* thread)) // Disable ObjectSynchronizer::quick_enter() in default config - // on AARCH64 until JDK-8153107 is resolved. - if (AARCH64_ONLY((SyncFlags & 256) != 0 &&) !SafepointSynchronize::is_synchronizing()) { + // on AARCH64 and ARM until JDK-8153107 is resolved. + if (ARM_ONLY((SyncFlags & 256) != 0 &&) + AARCH64_ONLY((SyncFlags & 256) != 0 &&) + !SafepointSynchronize::is_synchronizing()) { // Only try quick_enter() if we're not trying to reach a safepoint // so that the calling thread reaches the safepoint more quickly. 
if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return; The real problem will be investigated and fixed under this bug: https://bugs.openjdk.java.net/browse/JDK-8153107 Bob. From daniel.daugherty at oracle.com Thu Nov 3 14:32:25 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 3 Nov 2016 08:32:25 -0600 Subject: RFR: 8167501 ARMv7 Linux C2 compiler crashes running jtreg harness on MP systems In-Reply-To: <8595EB1C-1E3C-44A8-AE88-71DBEA485A54@oracle.com> References: <8595EB1C-1E3C-44A8-AE88-71DBEA485A54@oracle.com> Message-ID: <7a659d9f-9625-8b12-c3b2-4c8bc6032516@oracle.com> Thumbs up. Dan On 11/3/16 8:06 AM, Bob Vandette wrote: > Please review this JDK9 work-around for a reliability problem causing crashes and hangs > running jtreg on ARMv7 MP platforms using the server compiler. > > This work-around disables the use of quick-enter on ARM. This enhancement was > previously disabled for AARCH64 binaries. > > This work-around has been independently verified by running jtreg on two different MP > based ARM systems. > > https://bugs.openjdk.java.net/browse/JDK-8167501 > > diff --git a/src/share/vm/runtime/sharedRuntime.cpp b/src/share/vm/runtime/sharedRuntime.cpp > --- a/src/share/vm/runtime/sharedRuntime.cpp > +++ b/src/share/vm/runtime/sharedRuntime.cpp > @@ -1983,8 +1983,10 @@ > // Handles the uncommon case in locking, i.e., contention or an inflated lock. > JRT_BLOCK_ENTRY(void, SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* lock, JavaThread* thread)) > // Disable ObjectSynchronizer::quick_enter() in default config > - // on AARCH64 until JDK-8153107 is resolved. > - if (AARCH64_ONLY((SyncFlags & 256) != 0 &&) !SafepointSynchronize::is_synchronizing()) { > + // on AARCH64 and ARM until JDK-8153107 is resolved. 
> + if (ARM_ONLY((SyncFlags & 256) != 0 &&) > + AARCH64_ONLY((SyncFlags & 256) != 0 &&) > + !SafepointSynchronize::is_synchronizing()) { > // Only try quick_enter() if we're not trying to reach a safepoint > // so that the calling thread reaches the safepoint more quickly. > if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return; > > > The real problem will be investigated and fixed under this bug: > > https://bugs.openjdk.java.net/browse/JDK-8153107 > > Bob. > > From coleen.phillimore at oracle.com Thu Nov 3 16:00:15 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 3 Nov 2016 12:00:15 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Message-ID: <034cc7f5-f99d-f507-a96a-d6b85b7d31d2@oracle.com> On 11/3/16 7:33 AM, Vladimir Kozlov wrote: > Done: > > java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so > HelloWorld > [0.060s][trace][aot,class,load] found java.lang.Object in > ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 > ... > > I updated webrev with your, Coleen, and Stefan all suggestions: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ > > this is delta of changes: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ Thank you for making these changes. I must have missed these CompiledIC => const methodHandle& changes when I went through a while ago. One minor change though: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.hpp.udiff.html http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.cpp.udiff.html http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.hpp.udiff.html Since instanceKlassHandle is not a real handle, can you keep that passing by value? 
I have a script that removes them for jdk 10. Sorry for the confusion about these. The difference is that methodHandle has a copy constructor and destructor, so passing by const reference avoids copying them. instanceKlassHandle and KlassHandle are dummy now and don't have these so don't add to the code. I don't need to see another webrev. Thank you for fixing the logging. Coleen > > [...]
All mutable pointers (oops (mostly strings), >>>> metadata) are stored and modified in a separate mutable memory (GOT >>>> cells) - they are not embedded into AOT code. >>>> >>>> The changes include klass fingerprint generation since we need it to >>>> find the correct klass data in loaded AOT libraries. >>>> >>>> Thanks, >>>> Vladimir >>> >>> >> From vladimir.kozlov at oracle.com Thu Nov 3 19:04:27 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Nov 2016 12:04:27 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Message-ID: Thank you, Stefan I fixed aotCodeHeap.cpp as you suggested. Vladimir On 11/3/16 5:47 AM, Stefan Karlsson wrote: > Hi Vladimir, > > On 03/11/16 12:33, Vladimir Kozlov wrote: >> Done: >> >> java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so >> HelloWorld >> [0.060s][trace][aot,class,load] found java.lang.Object in >> ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 >> ... >> >> I updated webrev with your, Coleen, and Stefan all suggestions: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ >> >> this is delta of changes: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ > > Looks good to me. > > I noticed one nit: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.cpp.udiff.html > > > #include "precompiled.hpp" > *+ * > *+ #include "aot/aotLoader.hpp"* > *+ #include "aot/aotCodeHeap.hpp"* > > The lines should be swapped. > > Thanks, > StefanK > >> >> Thanks, >> Vladimir >> >> On 11/1/16 3:40 AM, Coleen Phillimore wrote: >>> 5. Thanks for pointing out the logging tags Stefan. Yes we would >>> prefer adding "aot" and "fingerprint" and using the composition of >>> existing tags for logging.
>>> Thanks >>> Coleen >>> >>>> [...]
>>>>> >>>>> Classes and Strings referenced in AOT code are resolved lazily by >>>>> calling into runtime. All mutable pointers (oops (mostly strings), >>>>> metadata) are stored and modified in a separate mutable memory (GOT >>>>> cells) - they are not embedded into AOT code. >>>>> >>>>> Changes includes klass fingerprint generation since we need it to >>>>> find correct klass data in loaded AOT libraries. >>>>> >>>>> Thanks, >>>>> Vladimir >>>> >>>> >>> > From vladimir.kozlov at oracle.com Thu Nov 3 19:17:14 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Nov 2016 12:17:14 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <034cc7f5-f99d-f507-a96a-d6b85b7d31d2@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> <034cc7f5-f99d-f507-a96a-d6b85b7d31d2@oracle.com> Message-ID: Thank you, Coleen I reverted instanceKlassHandle changes. Vladimir On 11/3/16 9:00 AM, Coleen Phillimore wrote: > > > On 11/3/16 7:33 AM, Vladimir Kozlov wrote: >> Done: >> >> java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so >> HelloWorld >> [0.060s][trace][aot,class,load] found java.lang.Object in >> ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 >> ... >> >> I updated webrev with your, Coleen, and Stefan all suggestions: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ >> >> this is delta of changes: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ > > Thank you for making these changes. I must have missed these CompiledIC > => const methodHandle& changes when I went through a while ago. 
> > One minor change though: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.hpp.udiff.html > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.cpp.udiff.html > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.hpp.udiff.html > > > Since instanceKlassHandle is not a real handle, can you keep that > passing by value? I have a script that removes them for jdk 10. Sorry > for the confusion about these. The difference is that methodHandle has > a copy constructor and destructor, so passing by const reference avoids > copying them. instanceKlassHandle and KlassHandle are dummy now and > don't have these so don't add to the code. > > I don't need to see another webrev. Thank you for fixing the logging. > > Coleen > > [...] From david.holmes at oracle.com Thu Nov 3 19:39:35 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Nov 2016 05:39:35 +1000 Subject: Re: Reply: please help understanding what's the relationship of hotspot compiler c1 and c2? In-Reply-To: References: Message-ID: <6554c154-0fb7-1fa1-169d-57c58df57e6b@oracle.com> On 2/11/2016 7:40 PM, 恶灵骑士 wrote: > Hi, > Server VM will use C2, in this mode > which method processes 'volatile' operations like method do_StoreField in src/share/vm/c1/c1_LIRGenerator.cpp?
C2 definitions are in the .ad files (that get fed into Adlc to generate the compiler implementation). E.g. hotspot/src/cpu/x86/vm/x86_32.ad:

// Atomically load the volatile long
enc_class enc_loadL_volatile( memory mem, stackSlotL dst ) %{
  emit_opcode(cbuf,0xDF);
  int rm_byte_opcode = 0x05;
  int base = $mem$$base;
  int index = $mem$$index;
  int scale = $mem$$scale;
  int displace = $mem$$disp;
  relocInfo::relocType disp_reloc = $mem->disp_reloc(); // disp-as-oop when working with static globals
  encode_RegMem(cbuf, rm_byte_opcode, base, index, scale, displace, disp_reloc);
  store_to_stackslot( cbuf, 0x0DF, 0x07, $dst$$disp );
%}

David
-----

> Thank you !
> ------------------ ???? ------------------
> ???: "???";;
> ????: 2016?11?2?(???) ??5:34
> ???: "????"<1072213404 at qq.com>;
> ??: "Krystal Mok"; "hotspot-dev";
> ??: Re: please help understanding what's the relationship of hotspot compiler c1 and c2 ?
>
> Hi,
> When tiered compilation is used (enabled by default in JDK 8), the tiered VM can use both C1 and C2 [1]. The Client VM will use C1 and the Server VM will use C2.
> Thanks,
> hyperdak
>
> [1] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#tieredcompilation

From 1072213404 at qq.com Fri Nov 4 06:02:12 2016
From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=)
Date: Fri, 4 Nov 2016 14:02:12 +0800
Subject: help me understanding how volatile field of java was executed on hotspot
Message-ID:

For now, I know that when operating on a volatile field, it may be handled by the methods below in HotSpot:

interpreter: hotspot/src/share/vm/interpreter/templateTable.hpp
  static void getfield_or_static(int byte_no, bool is_static);
  static void putfield_or_static(int byte_no, bool is_static);

C1: hotspot/src/share/vm/c1/c1_LIRGenerator.cpp
  void LIRGenerator::do_LoadField(LoadField* x)
  void LIRGenerator::do_StoreField(StoreField* x)

C2: hotspot/src/share/vm/opto/parse.hpp
  void do_get_xxx(Node* obj, ciField* field, bool is_field);
  void do_put_xxx(Node* obj, ciField* field, bool is_field);

Is there some official doc describing these methods? And is there some way to prove that volatile field accesses are actually executed by the above three sets of methods?

Thank you!
Arron

From yang.zhang at linaro.org Fri Nov 4 07:07:53 2016
From: yang.zhang at linaro.org (Yang Zhang)
Date: Fri, 4 Nov 2016 15:07:53 +0800
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
Message-ID:

Hi,

The jdk9/hs/hotspot native libs for jtreg build failed after the push of http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/308a53dd5aee

Build command: make test-image-hotspot-jtreg-native

Could someone please help to fix it? The reason is that the dl library isn't found.
I think the following change could fix that:

------
diff --git a/make/test/JtregNative.gmk b/make/test/JtregNative.gmk
index 78e78d7..95b5747 100644
--- a/make/test/JtregNative.gmk
+++ b/make/test/JtregNative.gmk
@@ -91,7 +91,7 @@ ifeq ($(OPENJDK_TARGET_OS), linux)
   BUILD_HOTSPOT_JTREG_LIBRARIES_LDFLAGS_libtest-rwx := -z execstack
   BUILD_HOTSPOT_JTREG_EXECUTABLES_LIBS_exeinvoke := -ljvm -lpthread
   BUILD_TEST_invoke_exeinvoke.c_OPTIMIZATION := NONE
-  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -ldl
+  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -Wl,--no-as-needed -ldl
 endif
 ifeq ($(OPENJDK_TARGET_OS), windows)
------

Regards
Yang

From aph at redhat.com Fri Nov 4 09:18:01 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 4 Nov 2016 09:18:01 +0000
Subject: help me understanding how volatile field of java was executed on hotspot
In-Reply-To: References: Message-ID:

On 04/11/16 06:02, ???? wrote:
> is there some official doc describing these methods?

There's the source code that you're looking at.

> and is there some way to prove volatile fields are actually executed by the above three methods?

Beyond looking at the code, no. There must be an important reason that you're asking this, and we'd be happy to help if you told us.

Andrew.

From peter.hofer at jku.at Fri Nov 4 10:00:38 2016
From: peter.hofer at jku.at (Peter Hofer)
Date: Fri, 4 Nov 2016 11:00:38 +0100
Subject: Contribution: Lock Contention Profiler for HotSpot
Message-ID:

Hello everyone,

we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community.

Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur.
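To make the kind of event concrete, here is a minimal Java workload (a hypothetical illustration, not taken from the patches; the class name is invented) in which two threads repeatedly contend on a single monitor, producing exactly the contended-acquire and contended-release events described above:

```java
// Hypothetical workload (not from the patches): two threads repeatedly
// contend on one monitor. Each acquisition that blocks and each release
// of the contended lock corresponds to the events the profiler records,
// together with the stack traces where they happen.
public class ContendedMonitor {
    private static final Object LOCK = new Object();
    private static long counter = 0;

    static long run(int iterations) throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                synchronized (LOCK) { // blocks (contended enter) while the other thread holds LOCK
                    counter++;        // the monitor makes this increment mutually exclusive
                }                     // contended release wakes a blocked thread
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100_000)); // prints 200000
    }
}
```

Under the patched VM with -XX:+EnableEventTracing, each blocked synchronized entry and the matching release of the contended lock would show up in the recorded trace with its stack trace.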
We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks. Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc". 
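As a counterpart for the java.util.concurrent side of Patch 1, a contended ReentrantLock version of the same kind of workload (again a hypothetical sketch; the class name is invented):

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical workload (not from the patches): the same contended
// counter, but guarded by a java.util.concurrent ReentrantLock, whose
// contended acquisitions go through park()/unpark() rather than the
// VM's monitor slow path.
public class ContendedReentrantLock {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static long counter = 0;

    static long run(int iterations) throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                LOCK.lock();       // may park() this thread while the lock is held elsewhere
                try {
                    counter++;
                } finally {
                    LOCK.unlock(); // may unpark() a waiting thread
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100_000)); // prints 200000
    }
}
```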
More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/. Kind regards, Peter Hofer -- Peter Hofer Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software University of Linz [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE?16), Delft, Netherlands, 2016. From adinn at redhat.com Fri Nov 4 11:14:50 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 4 Nov 2016 11:14:50 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: Hi Peter, On 04/11/16 10:00, Peter Hofer wrote: > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > . . . This sounds very interesting. > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. Please find a free > download of the paper on our website: Have you measured the overhead this change produces when running with contention detection disabled? (i.e. do we pay to have this feature even when we don't use it). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aaron.grunthal at infinite-source.de Fri Nov 4 11:31:34 2016 From: aaron.grunthal at infinite-source.de (Aaron Grunthal) Date: Fri, 4 Nov 2016 12:31:34 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <051cbc05-24e7-f83e-e7e7-2a057f07cd76@infinite-source.de> I think for lock contention the distribution of the blocking time is of interest. Can the profiler show that or just the cumulative time? Most profilers only record the sum, which is useful for optimizing throughput bottlenecks, but when optimizing for latency the CDF also is of interest since some methods can have vastly different average and worst case behaviors which can get obscured in the averages. - Aaron On 04.11.2016 11:00, Peter Hofer wrote: > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > > Our profiler records an event when a thread fails to acquire a contended > lock and also when a thread releases a contended lock. It further > efficiently records the stack traces where these events occur. We > devised a versatile visualization tool that analyzes the recorded events > and determines when and where threads _cause_ contention by holding a > contended lock. The visualization tool can show the contention by stack > trace, by lock, by lock class, by thread, and by any combination of > those aspects. > > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. 
Please find a free > download of the paper on our website: >> http://mevss.jku.at/lct/ > > I contribute this work on behalf of Dynatrace Austria (the sponsor of > this research), my colleagues David Gnedt and Andreas Schoergenhumer, > and myself. The necessary OCAs have already been submitted. > > We provide two patches: > > Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we > described and evaluated in our paper, plus minor improvements. It > records events for Java intrinsic locks (monitors) and for > java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). > We support only Linux on 64-bit x86 hardware. > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ > > Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal > native locks only. We consider this to be useful for HotSpot developers > to find locking bottlenecks in HotSpot itself. We tested this patch only > on Linux on 64-bit x86 hardware, but it should require few changes for > other platforms. > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ >> >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ >> > > With both patches, the profiler is enabled with -XX:+EnableEventTracing. > By default, an uncompressed event trace is written to file "output.trc". > > More detailed usage information and a download of the corresponding > visualization tool is available on our website, http://mevss.jku.at/lct/. > > Kind regards, > Peter Hofer > > > -- > Peter Hofer > Christian Doppler Laboratory on Monitoring and Evolution of > Very-Large-Scale Software Systems / Institute for System Software > University of Linz > > > [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter > Moessenboeck. Efficient Tracing and Versatile Analysis of Lock > Contention in Java Applications on the Virtual Machine Level. 
> Proceedings of the 7th ACM/SPEC International Conference on Performance > Engineering (ICPE?16), Delft, Netherlands, 2016. > From peter.hofer at jku.at Fri Nov 4 12:04:12 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Fri, 4 Nov 2016 13:04:12 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> Hi Andrew, On 11/04/2016 12:14 PM, Andrew Dinn wrote: >> We described our profiler in more detail in a research paper at ICPE >> 2016. [1] In our evaluation, we found that the overhead is typically >> below 10% for common multi-threaded Java benchmarks. Please find a free >> download of the paper on our website: > > Have you measured the overhead this change produces when running with > contention detection disabled? (i.e. do we pay to have this feature even > when we don't use it). We measured only the overhead relative to an unmodified OpenJDK build. Our profiler observes only lock contention, which is generally handled via slow paths in the VM code, so this is where we added the code to record events. I don't expect this code to cause much overhead when disabled. However, we added fields to several data structures, which might make a difference. I'll run some more benchmarks and report my findings. Cheers, Peter From peter.hofer at jku.at Fri Nov 4 12:59:58 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Fri, 4 Nov 2016 13:59:58 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <051cbc05-24e7-f83e-e7e7-2a057f07cd76@infinite-source.de> References: <051cbc05-24e7-f83e-e7e7-2a057f07cd76@infinite-source.de> Message-ID: <5c925ca4-27ef-ddb6-fb92-b91297e9b676@jku.at> Hi Aaron, On 11/04/2016 12:31 PM, Aaron Grunthal wrote: > I think for lock contention the distribution of the blocking time is > of interest. Can the profiler show that or just the cumulative time? 
> > Most profilers only record the sum, which is useful for optimizing > throughput bottlenecks, but when optimizing for latency the CDF also > is of interest since some methods can have vastly different average > and worst case behaviors which can get obscured in the averages. Our visualization tool currently shows only the cumulative contention times for each stack trace, lock, lock class, thread, or any combination of those aspects. However, individual blocking times could be computed from the events that the profiler records. These times could also be computed from the lock owner thread's perspective, i.e., the time from when the owned lock becomes contended until the thread releases the lock. Individual blocking times would only work well for monitors (and native monitors) though. With java.util.concurrent locks, we observe individual park()/unpark() calls. A thread that cannot acquire a lock may call park() more than once, and we cannot distinguish this from when a thread tries to acquire a lock multiple times and calls park() once each time. We would likely need bytecode instrumentation to group multiple park() calls that are part of a single lock acquisition and use the duration from the first park() call to the return of the last park() call as the blocking time. Cheers, Peter > On 04.11.2016 11:00, Peter Hofer wrote: >> Hello everyone, >> >> we are researchers at the University of Linz and have worked on a >> lock contention profiler that is built into HotSpot. We would like >> to contribute this work to the OpenJDK community. >> >> Our profiler records an event when a thread fails to acquire a >> contended lock and also when a thread releases a contended lock. It >> further efficiently records the stack traces where these events >> occur. We devised a versatile visualization tool that analyzes the >> recorded events and determines when and where threads _cause_ >> contention by holding a contended lock. 
The visualization tool can >> show the contention by stack trace, by lock, by lock class, by >> thread, and by any combination of those aspects. >> >> [...] From adinn at redhat.com Fri Nov 4 14:21:31 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 4 Nov 2016 14:21:31 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> References: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> Message-ID: <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> On 04/11/16 12:04, Peter Hofer wrote: . . . >> Have you measured the overhead this change produces when running with >> contention detection disabled? (i.e. do we pay to have this feature even >> when we don't use it). > > We measured only the overhead relative to an unmodified OpenJDK build. > > Our profiler observes only lock contention, which is generally handled > via slow paths in the VM code, so this is where we added the code to > record events. I don't expect this code to cause much overhead when > disabled. However, we added fields to several data structures, which > might make a difference. Yes, increased footprint (in code as well as object space) would be as much a concern as increased execution time. > I'll run some more benchmarks and report my findings. Thanks very much. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From erik.joelsson at oracle.com Fri Nov 4 14:22:46 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 4 Nov 2016 15:22:46 +0100 Subject: RFR: JDK-8169255: Link gtestLauncher statically if libjvm is configured for static linking Message-ID: In the build, we have a global setting for linking libstdc++ static or dynamic on Linux. All libraries and executables that go in the product honor this setting. The gtestLauncher currently doesn't. 
This causes trouble in testing, where some machines might not have the 32-bit libstdc++.so installed. Since installing that library is not needed for just running the product, it's awkward to have to install it to run certain tests.

This patch adds the LIBCXX flags from configure when linking gtestLauncher. The resulting file actually comes out a little bit smaller, so there is no footprint overhead. The tests still pass.

Bug: https://bugs.openjdk.java.net/browse/JDK-8169255

Patch:

diff -r 246f6fb74bf1 make/lib/CompileGtest.gmk
--- a/make/lib/CompileGtest.gmk
+++ b/make/lib/CompileGtest.gmk
@@ -107,6 +107,7 @@
     LDFLAGS := $(LDFLAGS_JDKEXE), \
     LDFLAGS_unix := -L$(JVM_OUTPUTDIR)/gtest $(call SET_SHARED_LIBRARY_ORIGIN), \
     LDFLAGS_solaris := -library=stlport4, \
+    LIBS_linux := $(LIBCXX), \
     LIBS_unix := -ljvm, \
     LIBS_windows := $(JVM_OUTPUTDIR)/gtest/objs/jvm.lib, \
     COPY_DEBUG_SYMBOLS := $(GTEST_COPY_DEBUG_SYMBOLS), \

/Erik

From tim.bell at oracle.com Fri Nov 4 14:31:21 2016
From: tim.bell at oracle.com (Tim Bell)
Date: Fri, 4 Nov 2016 07:31:21 -0700
Subject: RFR: JDK-8169255: Link gtestLauncher statically if libjvm is configured for static linking
In-Reply-To: References: Message-ID:

Erik:

> In the build, we have a global setting for linking libstdc++ static or dynamic on Linux. All libraries and executables that go in the product honor this setting. The gtestLauncher currently doesn't. This causes trouble in testing where some machines might not have the 32bit libstdc++.so installed. Since installing that library is not needed for just running the product, it's awkward to have to install it to run certain tests.
>
> This patch adds the LIBCXX flags from configure when linking gtestLauncher. The resulting file actually comes out a little bit smaller, so there is no footprint overhead. The tests still pass.
> Bug: https://bugs.openjdk.java.net/browse/JDK-8169255
>
> Patch:
>
> diff -r 246f6fb74bf1 make/lib/CompileGtest.gmk
> --- a/make/lib/CompileGtest.gmk
> +++ b/make/lib/CompileGtest.gmk
> @@ -107,6 +107,7 @@
>      LDFLAGS := $(LDFLAGS_JDKEXE), \
>      LDFLAGS_unix := -L$(JVM_OUTPUTDIR)/gtest $(call SET_SHARED_LIBRARY_ORIGIN), \
>      LDFLAGS_solaris := -library=stlport4, \
> +    LIBS_linux := $(LIBCXX), \
>      LIBS_unix := -ljvm, \
>      LIBS_windows := $(JVM_OUTPUTDIR)/gtest/objs/jvm.lib, \
>      COPY_DEBUG_SYMBOLS := $(GTEST_COPY_DEBUG_SYMBOLS), \

Looks good to me.

Tim

From marcus.larsson at oracle.com Fri Nov 4 15:16:34 2016
From: marcus.larsson at oracle.com (Marcus Larsson)
Date: Fri, 4 Nov 2016 16:16:34 +0100
Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part
In-Reply-To: References: Message-ID: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com>

Hi,

Thanks for fixing this.

On 2016-11-01 18:30, Kirill Zhaldybin wrote:
> Dear all,
>
> Could you please review this fix for 8169003?
>
> I changed the parsing of the time string so that it no longer depends on the LC_NUMERIC locale, and the test does not fail if a locale where the "floating point" is actually a comma is set.
>
> WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/

ISO8601 says the decimal point can be either '.' or ',' so the test should accept either. You could let sscanf read out the decimal point as a character and just verify that it is one of the two.

In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that we won't accept "Z" suffixed strings. Please revert that.

Thanks,
Marcus

> CR: https://bugs.openjdk.java.net/browse/JDK-8169003
>
> Thank you.
> > Regards, Kirill From David.Gnedt at jku.at Fri Nov 4 11:26:28 2016 From: David.Gnedt at jku.at (David Gnedt) Date: Fri, 04 Nov 2016 12:26:28 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <581C7E740200009400009CA7@gwia.im.jku.at> Hello, I am one of the authors of this work and I gladly support this contribution. Best regards, David Gnedt >>> Peter Hofer 04.11.16 11.01 Uhr >>> Hello everyone, we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community. Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur. We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks. Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. 
> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/
> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/

Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms.

> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/
> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/

With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc".

More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/.

Kind regards,
Peter Hofer

--
Peter Hofer
Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software
University of Linz

[1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE '16), Delft, Netherlands, 2016.

From Derek.White at cavium.com Fri Nov 4 17:07:37 2016
From: Derek.White at cavium.com (White, Derek)
Date: Fri, 4 Nov 2016 17:07:37 +0000
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID:

I saw this on some of my machines also - I thought it was a configuration issue, but now I think it's due to the fix for JDK-8067744 being incompatible with some versions of gcc (which have differences with "as-needed").
Created bug: https://bugs.openjdk.java.net/browse/JDK-8169261

-----Original Message-----
From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Yang Zhang
Sent: Friday, November 04, 2016 3:08 AM
To: hotspot-dev at openjdk.java.net
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64

Hi,

jdk9/hs/hotspot native libs for jtreg build failed after the push of http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/308a53dd5aee

Build command: make test-image-hotspot-jtreg-native

Could someone please help to fix it? The reason is that the dl library isn't found. I think the following change could fix that:

------
diff --git a/make/test/JtregNative.gmk b/make/test/JtregNative.gmk
index 78e78d7..95b5747 100644
--- a/make/test/JtregNative.gmk
+++ b/make/test/JtregNative.gmk
@@ -91,7 +91,7 @@ ifeq ($(OPENJDK_TARGET_OS), linux)
   BUILD_HOTSPOT_JTREG_LIBRARIES_LDFLAGS_libtest-rwx := -z execstack
   BUILD_HOTSPOT_JTREG_EXECUTABLES_LIBS_exeinvoke := -ljvm -lpthread
   BUILD_TEST_invoke_exeinvoke.c_OPTIMIZATION := NONE
-  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -ldl
+  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -Wl,--no-as-needed -ldl
 endif
 ifeq ($(OPENJDK_TARGET_OS), windows)
------

Regards
Yang

From aph at redhat.com Fri Nov 4 17:24:23 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 4 Nov 2016 17:24:23 +0000
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID:

On 04/11/16 17:07, White, Derek wrote:
> The reason is that dl library isn't found. I think the following change could fix that:

But why isn't it found? It should be on your system at /lib/aarch64-linux-gnu/libdl.so.2 or somesuch. Or your system wouldn't work.

Andrew.
From Derek.White at cavium.com Fri Nov 4 17:54:40 2016
From: Derek.White at cavium.com (White, Derek)
Date: Fri, 4 Nov 2016 17:54:40 +0000
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID:

This is a build-time linking error. The link command does include -ldl, and the libraries do exist in the expected places. So I don't understand exactly what the issue is. But Yang's fix follows some internet wisdom that includes this claim: "Apparently it has something to do with recent versions of gcc/ld defaulting to linking with --as-needed." I haven't had time to track down a fuller explanation.

- Derek

-----Original Message-----
From: Andrew Haley [mailto:aph at redhat.com]
Sent: Friday, November 04, 2016 1:24 PM
To: White, Derek ; Yang Zhang ; hotspot-dev at openjdk.java.net
Subject: Re: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64

On 04/11/16 17:07, White, Derek wrote:
> The reason is that dl library isn't found. I think the following change could fix that:

But why isn't it found? It should be on your system at /lib/aarch64-linux-gnu/libdl.so.2 or somesuch. Or your system wouldn't work.

Andrew.

From jeremymanson at google.com Fri Nov 4 18:24:20 2016
From: jeremymanson at google.com (Jeremy Manson)
Date: Fri, 4 Nov 2016 11:24:20 -0700
Subject: Contribution: Lock Contention Profiler for HotSpot
In-Reply-To: <581C7E740200009400009CA7@gwia.im.jku.at>
References: <581C7E740200009400009CA7@gwia.im.jku.at>
Message-ID:

Why aren't these extensions to JVMTI, which already has MonitorContendedEnter and MonitorContendedEntered events? You could just add a MonitorContendedRelease event to cover what you want. Then the bulk of the tracking work can be done in JVMTI.

At Google, we've built on these JVMTI primitives quite successfully. The only internal enhancement we've had to make is to make them support j.u.c locks.
(We've also done the hotspot lock contention work, but it has been less directly useful.) Jeremy On Fri, Nov 4, 2016 at 4:26 AM, David Gnedt wrote: > Hello, > > I am one of the authors of this work and I gladly support this > contribution. > > Best regards, > David Gnedt > > >>> Peter Hofer 04.11.16 11.01 Uhr >>> > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > > Our profiler records an event when a thread fails to acquire a contended > > lock and also when a thread releases a contended lock. It further > efficiently records the stack traces where these events occur. We > devised a versatile visualization tool that analyzes the recorded events > > and determines when and where threads _cause_ contention by holding a > contended lock. The visualization tool can show the contention by stack > trace, by lock, by lock class, by thread, and by any combination of > those aspects. > > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. Please find a free > download of the paper on our website: > > http://mevss.jku.at/lct/ > > I contribute this work on behalf of Dynatrace Austria (the sponsor of > this research), my colleagues David Gnedt and Andreas Schoergenhumer, > and myself. The necessary OCAs have already been submitted. > > We provide two patches: > > Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we > described and evaluated in our paper, plus minor improvements. It > records events for Java intrinsic locks (monitors) and for > java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). > We support only Linux on 64-bit x86 hardware. 
> >
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/
> >
> > Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms.
> >
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/
> >
> > With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc".
> >
> > More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/.
> >
> > Kind regards,
> > Peter Hofer
> >
> > --
> > Peter Hofer
> > Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software
> > University of Linz
> >
> > [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE '16), Delft, Netherlands, 2016.

From dmitry.samersoff at oracle.com Fri Nov 4 18:27:36 2016
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 4 Nov 2016 21:27:36 +0300
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID: <7f5b2cc3-7a67-b7da-f715-5a4d9dac3127@oracle.com>

Andrew,

The gcc -Wl,--as-needed flag allows the linker to not link against a shared library if it thinks that the library is not necessary. It can cause an error in some cases, e.g.
libraries (-lXXX) appear on the command line in the wrong order [1]. So I think that explicitly disabling --as-needed when building tests is a good idea. 1. g++ -Wl,--no-as-needed -o test -ldl test.cxx OK. g++ -Wl,--as-needed -o test -ldl test.cxx /tmp/ccOqlI4O.o: In function `main': test.cxx:(.text+0xb7): undefined reference to `dlopen' collect2: error: ld returned 1 exit status -Dmitry On 2016-11-04 20:24, Andrew Haley wrote: > On 04/11/16 17:07, White, Derek wrote: >> The reason is that dl library isn't found. I think the following change could fix that: > > But why isn't it found? It should be on your system at /lib/aarch64-linux-gnu/libdl.so.2 > or somesuch. Or your system wouldn't work. > > Andrew. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From ceeaspb at gmail.com Fri Nov 4 19:39:28 2016 From: ceeaspb at gmail.com (Alex Bagehot) Date: Fri, 4 Nov 2016 19:39:28 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: <581C7E740200009400009CA7@gwia.im.jku.at> Message-ID: Seems release was removed: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4986044 and... https://bugs.openjdk.java.net/browse/JDK-8038441 Related, there is a dtrace/systemtap probe for exit http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2009-April/005445.html On Fri, Nov 4, 2016 at 6:24 PM, Jeremy Manson wrote: > Why aren't these extensions to JVMTI, which already has > MonitorContendedEnter and MonitorContendedEntered events? You could just > add a MonitorContendedRelease event to cover what you want. Then the bulk > of the tracking work can be done in JVMTI. > > At Google, we've built on these JVMTI primitives quite successfully. The > only internal enhancements we've had to make is to make them support j.u.c > locks. > > (We've also done the hotspot lock contention work, but it has been less > directly useful.) 
>> >> >> From david.holmes at oracle.com Sat Nov 5 18:43:52 2016 From: david.holmes at oracle.com (David Holmes) Date: Sun, 6 Nov 2016 04:43:52 +1000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <1475236951.6301.72.camel@oracle.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> Message-ID: Forking new discussion from: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 On 1/11/2016 7:44 PM, Andrew Haley wrote: > On 31/10/16 21:30, David Holmes wrote: >> >> >> On 31/10/2016 7:32 PM, Andrew Haley wrote: >>> On 30/10/16 21:26, David Holmes wrote: >>>> On 31/10/2016 4:36 AM, Andrew Haley wrote: >>>>> >>>>> And, while we're on the subject, is memory_order_conservative actually >>>>> defined anywhere? >>>> >>>> No. It was chosen to represent the current status quo that the Atomic:: >>>> ops should all be (by default) full bi-directional fences. >>> >>> Does that mean that a CAS is actually stronger than a load acquire >>> followed by a store release? And that a CAS is a release fence even >>> when it fails and no store happens? >> >> Yes. Yes. >> >> // All of the atomic operations that imply a read-modify-write >> // action guarantee a two-way memory barrier across that >> // operation. Historically these semantics reflect the strength >> // of atomic operations that are provided on SPARC/X86. We assume >> // that strength is necessary unless we can prove that a weaker >> // form is sufficiently safe. > > Mmmm, but that doesn't say anything about a CAS that fails. But fair > enough, I accept your interpretation. Granted the above was not written with load-linked/store-conditional style implementations in mind; and the historical behaviour on sparc and x86 is not affected by failure of the cas, so it isn't called out. I should fix that. 
>> But there is some contention as to whether the actual implementations >> obey this completely. > > Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified > as a > > "full barrier". That is, no memory operand is moved across the > operation, either forward or backward. Further, instructions are > issued as necessary to prevent the processor from speculating loads > across the operation and from queuing stores after the operation. > > ... which reads the same as the language you quoted above, but looking > at the assembly code I'm sure that it's really no stronger than a seq > cst load followed by a seq cst store. Are you saying that a seq_cst load followed by a seq_cst store is weaker than a full barrier? > I guess maybe I could give up fighting this and implement all AArch64 > CAS sequences as > > CAS(seq_cst); full fence > > or, even more extremely, > > full fence; CAS(relaxed); full fence > > but it all seems unreasonably heavyweight. Indeed. A couple of issues here. If you are thinking in terms of orderAccess::fence() then it needs to guarantee visibility as well as ordering - see this bug I just filed: https://bugs.openjdk.java.net/browse/JDK-8169193 So would be heavier than a "full barrier" that simply combined all four storeload membar variants. Though of course the actual implementation on a given architecture may be just as heavyweight. And of course the Atomic op must guarantee visibility of the successful store (else the atomicity aspect would not be present). That aside we do not need two "fences" surrounding the atomic op. For platforms where the atomic op is a single instruction which combines load and store then conceptually all we need is: loadload|storeload; op; storeload|storestore Note this is at odds with the commentary in atomic.hpp which says things like: // add-value-to-dest I need to check why we settled on the above formulation - I suspect it was conservatism. 
And of course for the cmpxchg it fails to account for the fact there may not be a store to order with. For load-linked/store-conditional based operations that would expand to (assume a retry loop for unrelated store failures): loadLoad|storeLoad temp = ld-linked &val cmp temp, expected jmp ne st-cond &val, newVal storeload|storestore which is fine if we actually store, but if we find the wrong value there is no store for those final barriers to sync with. That then raises the question: can subsequent loads and stores move into the ld-linked/st-cond region? The general context-free answer would be yes, but the actual details may be architecture specific and also context dependent - ie the subsequent loads/stores may be dependent on the CAS succeeding (or on it failing). So without further knowledge you would need to use a "full-barrier" after the st-cond. David ----- >>> And that a conservative load is a *store* barrier? >> >> Not sure what you mean. Atomic::load is not a r-m-w action so not >> expected to be a two-way memory barrier. > > OK. > > Thanks, > > Andrew. > From david.holmes at oracle.com Sat Nov 5 18:48:28 2016 From: david.holmes at oracle.com (David Holmes) Date: Sun, 6 Nov 2016 04:48:28 +1000 Subject: RFR: JDK-8169255: Link gtestLauncher statically if libjvm is configured for static linking In-Reply-To: References: Message-ID: <7ffb2f76-9bc9-f774-7224-e6ff30b0e33c@oracle.com> Looks good. Thanks for fixing this Erik. David On 5/11/2016 12:22 AM, Erik Joelsson wrote: > In the build, we have a global setting for linking libstdc++ static or > dynamic on Linux. All libraries and executables that go in the product > honor this setting. The gtestLauncher currently doesn't. This causes > trouble in testing where some machines might not have the 32bit > libstdc++.so installed. Since installing that library is not needed for > just running the product, it's awkward to have to install it to run > certain tests. 
> > This patch adds the LIBCXX flags from configure when linking > gtestLauncher. The resulting file actually comes out a little bit > smaller, so there is no footprint overhead. The tests still pass. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8169255 > > Patch: > > diff -r 246f6fb74bf1 make/lib/CompileGtest.gmk > --- a/make/lib/CompileGtest.gmk > +++ b/make/lib/CompileGtest.gmk > @@ -107,6 +107,7 @@ > LDFLAGS := $(LDFLAGS_JDKEXE), \ > LDFLAGS_unix := -L$(JVM_OUTPUTDIR)/gtest $(call > SET_SHARED_LIBRARY_ORIGIN), \ > LDFLAGS_solaris := -library=stlport4, \ > + LIBS_linux := $(LIBCXX), \ > LIBS_unix := -ljvm, \ > LIBS_windows := $(JVM_OUTPUTDIR)/gtest/objs/jvm.lib, \ > COPY_DEBUG_SYMBOLS := $(GTEST_COPY_DEBUG_SYMBOLS), \ > > > /Erik > From aph at redhat.com Sun Nov 6 10:54:53 2016 From: aph at redhat.com (Andrew Haley) Date: Sun, 6 Nov 2016 10:54:53 +0000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <1475236951.6301.72.camel@oracle.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> Message-ID: <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> On 05/11/16 18:43, David Holmes wrote: > Forking new discussion from: > > RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > On 1/11/2016 7:44 PM, Andrew Haley wrote: >> On 31/10/16 21:30, David Holmes wrote: >>> >>> >>> On 31/10/2016 7:32 PM, Andrew Haley wrote: >>>> On 30/10/16 21:26, David Holmes wrote: >>>>> On 31/10/2016 4:36 AM, Andrew Haley wrote: >>> >>> // All of the atomic operations that imply a read-modify-write >>> // action guarantee a two-way memory barrier across that >>> // operation. Historically these semantics reflect the strength >>> // of atomic operations that are provided on SPARC/X86. 
We assume >>> // that strength is necessary unless we can prove that a weaker >>> // form is sufficiently safe. >> >> Mmmm, but that doesn't say anything about a CAS that fails. But fair >> enough, I accept your interpretation. > > Granted the above was not written with load-linked/store-conditional > style implementations in mind; and the historical behaviour on sparc > and x86 is not affected by failure of the cas, so it isn't called > out. I should fix that. > >>> But there is some contention as to whether the actual implementations >>> obey this completely. >> >> Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified >> as a >> >> "full barrier". That is, no memory operand is moved across the >> operation, either forward or backward. Further, instructions are >> issued as necessary to prevent the processor from speculating loads >> across the operation and from queuing stores after the operation. >> >> ... which reads the same as the language you quoted above, but looking >> at the assembly code I'm sure that it's really no stronger than a seq >> cst load followed by a seq cst store. > > Are you saying that a seq_cst load followed by a seq_cst store is weaker > than a full barrier? Probably. I'm saying that when someone says "full barrier" they aren't exactly clear what that means. I know what sequential consistency is, but not "full barrier" because it's used inconsistently. For example, the above says that no memory operand is moved across the barrier, but if you have store_relaxed(a) load_seq_cst(b) store_seq_cst(c) load_relaxed(d) there's nothing to prevent load_seq_cst(b) load_relaxed(d) store_relaxed(a) store_seq_cst(c) It is true that neither store a nor load d have moved across this operation, but they have exchanged places. As far as GCC is concerned this is a correct implementation, and it does meet the requirement of sequential consistency as defined in the C++ memory model. 
>> I guess maybe I could give up fighting this and implement all AArch64 >> CAS sequences as >> >> CAS(seq_cst); full fence >> >> or, even more extremely, >> >> full fence; CAS(relaxed); full fence >> >> but it all seems unreasonably heavyweight. > > Indeed. A couple of issues here. If you are thinking in terms of > orderAccess::fence() then it needs to guarantee visibility as well as > ordering - see this bug I just filed: > > https://bugs.openjdk.java.net/browse/JDK-8169193 Ouch. Yes, I agree that something needs fixing. That comment: // Use release_store_fence to update values like the thread state, // where we don't want the current thread to continue until all our // prior memory accesses (including the new thread state) are visible // to other threads. ... seems very unhelpful, at least because a release fence (using conventional terminology) does not have that property: a release fence is only LoadStore|StoreStore. > So would be heavier than a "full barrier" that simply combined all > four storeload membar variants. Though of course the actual > implementation on a given architecture may be just as > heavyweight. And of course the Atomic op must guarantee visibility > of the successful store (else the atomicity aspect would not be > present). I don't think that's exactly right. As I understand the ARMv8 memory model, it's possible to have a CAS which imposes no memory ordering or visibility at all: it's a relaxed load and a relaxed store. Other threads can still see stale values of the store unless they attempt a CAS. This is really good: it's exactly what you want for some shared counters. > That aside we do not need two "fences" surrounding the atomic > op. 
For platforms where the atomic op is a single instruction which > combines load and store then conceptually all we need is: > > loadload|storeload; op; storeload|storestore > > Note this is at odds with the commentary in atomic.hpp which says things > like: > > // add-value-to-dest > > I need to check why we settled on the above formulation - I suspect it > was conservatism. And of course for the cmpxchg it fails to account for > the fact there may not be a store to order with. > > For load-linked/store-conditional based operations that would expand to > (assume a retry loop for unrelated store failures): > > loadLoad|storeLoad > temp = ld-linked &val > cmp temp, expected > jmp ne > st-cond &val, newVal > storeload|storestore > > which is fine if we actually store, but if we find the wrong value > there is no store for those final barriers to sync with. That then > raises the question: can subsequent loads and stores move into the > ld-linked/st-cond region? The general context-free answer would be > yes, but the actual details may be architecture specific and also > context dependent - ie the subsequent loads/stores may be dependent > on the CAS succeeding (or on it failing). So without further > knowledge you would need to use a "full-barrier" after the st-cond. On most (all?) architectures a StoreLoad fence is a full barrier, so this formulation is equivalent to what I was saying anyway. Andrew. From 1072213404 at qq.com Mon Nov 7 03:09:22 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Mon, 7 Nov 2016 11:09:22 +0800 Subject: help understanding release semantics Message-ID: Hi, in OrderAccess inline void OrderAccess::storestore() { release(); } inline void OrderAccess::loadstore() { acquire(); } the storestore() can provide complete release semantics, so why do some blogs say that release semantics include both storestore and loadstore? I can understand what the blogs say, but I am a little confused by the code. Thank you! 
Arron From david.holmes at oracle.com Mon Nov 7 04:15:36 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Nov 2016 14:15:36 +1000 Subject: help understanding release semantics In-Reply-To: References: Message-ID: <16bb2830-a0b6-7b27-d4cc-43519613c98c@oracle.com> On 7/11/2016 1:09 PM, ???? wrote: > Hi, > > > in OrderAccess > inline void OrderAccess::storestore() { release(); } > inline void OrderAccess::loadstore() { acquire(); } > the storestore() can provide complete release semantics, so why do some blogs say that release semantics include both storestore and loadstore? You are looking at a particular platform's implementation where the two things are the same at the hardware level. Conceptually it is the wrong way to express it. In orderAccess.hpp we define: acquire() == loadLoad|loadStore release() == loadStore|storeStore This is a particular definition inside hotspot such that we define an equivalence between these pairs: release_store(&x, 1) ≡ release(); x = 1; and y = load_acquire(&x) ≡ y = x; acquire(); In the more general literature this equivalence does not exist as the two statements could be reordered. acquire/release cannot be exactly expressed using loadload/loadstore etc. I actually have a presentation on all this that I just did last week. I plan to add a few updates then make it available. David > > > I can understand what the blogs say, but I am a little confused by the code. > > > Thank you! 
> > Arron > From thomas.schatzl at oracle.com Mon Nov 7 10:53:34 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Nov 2016 11:53:34 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> Message-ID: <1478516014.2646.16.camel@oracle.com> Hi, On Tue, 2016-10-25 at 19:11 -0400, Kim Barrett wrote: > > > > On Oct 21, 2016, at 9:54 PM, Kim Barrett > > wrote: > > > > > > > > On Oct 21, 2016, at 8:46 PM, Kim Barrett > > > wrote: > > > In the humongous case, if it bails because klass_or_null == NULL, > > > we must re-enqueue > > > the card ... > This update (webrev.02) reverts part of the previous change. > > In the original RFR I said: > > As a result of the changes in oops_on_card_seq_iterate_careful, we > now almost never fail to process the card. The only place where > that can occur is a stale card in a humongous region with an > in-progress allocation, where we can just ignore it. So the only > caller, refine_card, no longer needs to examine the result of the > call and enqueue the card for later reconsideration. > > Ignoring such a stale card is incorrect at the point where it was > being done. At that point we've already cleaned the card, so we must > either process the designated object(s) or, if we can't do the > processing because of in-progress allocation (klass_or_null returned > NULL), then re-queue the card for later reconsideration. 
> > So the change to refine_card to eliminate that behavior, and the > associated changes to oops_on_card_seq_iterate_careful, were a > mistake, and are being reverted by this new version. As a result, > refine_card is no longer changed at all. Thanks for catching this. Maybe it would be cleaner to call a method in the barrier set instead of inlining the dirtying + enqueuing in lines 685 to 691? Maybe as an additional RFE. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrev: > Full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02/ > Incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02.inc/ > > Testing: > Local specjbb2015 (non-perf). > Local jtreg hotspot_all. > Also tested as baseline of changes for JDK-8166811. > > Additionally, in the original RFR I also said: > > Note that [...] At present the only source of stale cards in the > concurrent case seems to be HCC eviction. [...] Doing HCC cleanup > when freeing regions might remove the need for klass_or_null > checking in the humongous case for concurrent refinement, so might > be worth looking into later. > > That was also incorrect; there are other sources of stale cards. Can you elaborate on that? > That doesn't affect this change, but may affect how JDK-8166811 > should be fixed, and removes the rationale for JDK-8166995 (which has > been resolved Won't Fix because of that). > > See also the RFR for the followup JDK-8166811. Thanks, 
Thomas From thomas.schatzl at oracle.com Mon Nov 7 10:57:05 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Nov 2016 11:57:05 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: References: Message-ID: <1478516225.2646.19.camel@oracle.com> Hi, On Sat, 2016-10-29 at 19:26 -0400, Kim Barrett wrote: > > > > On Oct 25, 2016, at 7:13 PM, Kim Barrett > > wrote: > > > > Please review this change to address missing memory barriers needed > > to > > ensure ordering between allocation and refinement in G1. > > [...] > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8166811 > > > > Webrev: > > http://cr.openjdk.java.net/~kbarrett/8166811/webrev.00/ > > [Based on http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02/] > > > ------------------------------------------------------------------- ----------- src/share/vm/gc/g1/g1RemSet.cpp 581 // The region could be young. Cards for young regions are > dirtied, 582 // so the post-barrier will filter them out. However, that > dirtying 583 // is performed concurrently. A write to a young object could > occur 584 // before the card has been dirtied, slipping past the filter. > > This is a rewording of the comment that used to be here. However, it > was not true even before these changes. As part of JDK-8014555 we > mark young region cards with g1_young_card_val(). That's the change > set that added the storeload to the post-barrier. > > I'm not quite sure what to do about this. The comment is currently > wrong. However, the storeload is considered a problem, and there > have been various ideas discussed for eliminating it that might allow > us to go back to dirtying young cards. Depends on what "dirtying" is supposed to mean in this context - setting it to "dirty" or setting it to something non-clean. One could replace "dirtied" by something less specific here to make it right again. Thanks, 
Thomas From kim.barrett at oracle.com Mon Nov 7 18:36:46 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Nov 2016 13:36:46 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1478516225.2646.19.camel@oracle.com> References: <1478516225.2646.19.camel@oracle.com> Message-ID: > On Nov 7, 2016, at 5:57 AM, Thomas Schatzl wrote: > > Hi, > > On Sat, 2016-10-29 at 19:26 -0400, Kim Barrett wrote: >>> >>> On Oct 25, 2016, at 7:13 PM, Kim Barrett >>> wrote: >>> >>> Please review this change to address missing memory barriers needed >>> to >>> ensure ordering between allocation and refinement in G1. >>> [...] >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8166811 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~kbarrett/8166811/webrev.00/ >>> [Based on http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02/] >>> >> ------------------------------------------------------------------- >> ----------- >> src/share/vm/gc/g1/g1RemSet.cpp >> 581 // The region could be young. Cards for young regions are >> dirtied, >> 582 // so the post-barrier will filter them out. However, that >> dirtying >> 583 // is performed concurrently. A write to a young object could >> occur >> 584 // before the card has been dirtied, slipping past the filter. >> >> This is a rewording of the comment that used to be here. However, it >> was not true even before these changes. As part of JDK-8014555 we >> mark young region cards with g1_young_card_val(). That's the change >> set that added the storeload to the post-barrier. >> >> I'm not quite sure what to do about this. The comment is currently >> wrong. However, the storeload is considered a problem, and there >> have been various ideas discussed for eliminating it that might allow >> us to go back to dirtying young cards. > > Depends on what "dirtying" is supposed to mean in this context - > setting it to "dirty" or setting it to something non-clean. 
> > One could replace "dirtied" by something less specific here to make it > right again. Good idea. How about this rewording (using "set to a value") // The region could be young. Cards for young regions are set to a // value that allows the post-barrier to filter them out. However, // that card setting is performed concurrently. A write to a young // object could occur before the card has been set, slipping past // the filter. From kim.barrett at oracle.com Mon Nov 7 19:07:27 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Nov 2016 14:07:27 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: References: <1478516225.2646.19.camel@oracle.com> Message-ID: <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> > On Nov 7, 2016, at 1:36 PM, Kim Barrett wrote: >>> src/share/vm/gc/g1/g1RemSet.cpp >>> 581 // The region could be young. Cards for young regions are >>> dirtied, >>> 582 // so the post-barrier will filter them out. However, that >>> dirtying >>> 583 // is performed concurrently. A write to a young object could >>> occur >>> 584 // before the card has been dirtied, slipping past the filter. >>> >>> This is a rewording of the comment that used to be here. However, it >>> was not true even before these changes. As part of JDK-8014555 we >>> mark young region cards with g1_young_card_val(). That's the change >>> set that added the storeload to the post-barrier. >>> >>> I'm not quite sure what to do about this. The comment is currently >>> wrong. However, the storeload is considered a problem, and there >>> have been various ideas discussed for eliminating it that might allow >>> us to go back to dirtying young cards. >> >> Depends on what "dirtying" is supposed to mean in this context - >> setting it to "dirty" or setting it to something non-clean. >> >> One could replace "dirtied" by something less specific here to make it >> right again. > > Good idea. 
How about this rewording (using "set to a value") > > // The region could be young. Cards for young regions are set to a > // value that allows the post-barrier to filter them out. However, > // that card setting is performed concurrently. A write to a young > // object could occur before the card has been set, slipping past > // the filter. Oops, no, that isn't right. (It's been a couple of weeks since I looked at this, and forgot part of the problem.) Part of what's wrong with the comment is that we can no longer get to that point with a young region. A young region's cards will be either g1_young_gen or clean, never dirty. Hence the filtering out of non-dirty cards a few lines before this comment will have already discarded a young card before we reach the test this comment is discussing. So the whole premise of the comment in question, that the region could be young, is false. From peter.hofer at jku.at Mon Nov 7 19:35:42 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Mon, 7 Nov 2016 20:35:42 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: <581C7E740200009400009CA7@gwia.im.jku.at> Message-ID: Hi Jeremy, On 2016-11-04 19:24, Jeremy Manson wrote: > Why aren't these extensions to JVMTI, which already has > MonitorContendedEnter and MonitorContendedEntered events? You could > just add a MonitorContendedRelease event to cover what you want. > Then the bulk of the tracking work can be done in JVMTI. One of our main goals was to make profiling very lightweight so that there is a chance that the profiler can be used on production systems. In the HotSpot code, we can record events and maintain state very efficiently. I agree that a profiler that uses only JVMTI and extension methods would be more modular. We actually tried to implement a comparable profiler using JVMTI. 
It performs very frequent state transitions to the agent and back, requires wrapping all references and data structures, and needs tagging to associate state with objects. Moreover, it cannot efficiently cache stack traces without always resolving inlined methods from the compiler's debug information (which makes a lot of difference in our HotSpot-internal profiler). The JVMTI-based profiler turned out to be rather inefficient, which is why we didn't pursue this approach further. As Alex pointed out, there used to be a MonitorContendedExit event in early versions of JVMTI. It was eliminated because the context of a monitor exit is not really safe for invoking a JVMTI callback, which is another issue that would need to be addressed first. Cheers, Peter > At Google, we've built on these JVMTI primitives quite successfully. > The only internal enhancements we've had to make is to make them support > j.u.c locks. > > (We've also done the hotspot lock contention work, but it has been less > directly useful.) > > Jeremy > > On Fri, Nov 4, 2016 at 4:26 AM, David Gnedt > wrote: > > Hello, > > I am one of the authors of this work and I gladly support this > contribution. > > Best regards, > David Gnedt > > >>> Peter Hofer > > 04.11.16 11.01 Uhr >>> > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > > Our profiler records an event when a thread fails to acquire a contended > > lock and also when a thread releases a contended lock. It further > efficiently records the stack traces where these events occur. We > devised a versatile visualization tool that analyzes the recorded events > > and determines when and where threads _cause_ contention by holding a > contended lock. The visualization tool can show the contention by stack > trace, by lock, by lock class, by thread, and by any combination of > those aspects. 
> > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. Please find a free > download of the paper on our website: > > http://mevss.jku.at/lct/ > > I contribute this work on behalf of Dynatrace Austria (the sponsor of > this research), my colleagues David Gnedt and Andreas Schoergenhumer, > and myself. The necessary OCAs have already been submitted. > > We provide two patches: > > Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we > described and evaluated in our paper, plus minor improvements. It > records events for Java intrinsic locks (monitors) and for > java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). > We support only Linux on 64-bit x86 hardware. > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ > > > Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal > native locks only. We consider this to be useful for HotSpot developers > to find locking bottlenecks in HotSpot itself. We tested this patch only > > on Linux on 64-bit x86 hardware, but it should require few changes for > other platforms. > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ > > > With both patches, the profiler is enabled with -XX:+EnableEventTracing. > > By default, an uncompressed event trace is written to file "output.trc". > > More detailed usage information and a download of the corresponding > visualization tool is available on our website, > http://mevss.jku.at/lct/. 
> > Kind regards, > Peter Hofer > > > -- > Peter Hofer > Christian Doppler Laboratory on Monitoring and Evolution of > Very-Large-Scale Software Systems / Institute for System Software > University of Linz > > > [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter > Moessenboeck. Efficient Tracing and Versatile Analysis of Lock > Contention in Java Applications on the Virtual Machine Level. > Proceedings of the 7th ACM/SPEC International Conference on Performance > Engineering (ICPE'16), Delft, Netherlands, 2016. > > > From kim.barrett at oracle.com Mon Nov 7 19:38:25 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Nov 2016 14:38:25 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1478516014.2646.16.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> Message-ID: <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl wrote: > On Tue, 2016-10-25 at 19:11 -0400, Kim Barrett wrote: >>> >>> On Oct 21, 2016, at 9:54 PM, Kim Barrett >>> wrote: >>> >>>> >>>> On Oct 21, 2016, at 8:46 PM, Kim Barrett >>>> wrote: >>>> In the humongous case, if it bails because klass_or_null == NULL, >>>> we must re-enqueue >>>> the card ... >> This update (webrev.02) reverts part of the previous change. >> >> In the original RFR I said: >> >> As a result of the changes in oops_on_card_seq_iterate_careful, we >> now almost never fail to process the card. 
The only place where >> that can occur is a stale card in a humongous region with an >> in-progress allocation, where we can just ignore it. So the only >> caller, refine_card, no longer needs to examine the result of the >> call and enqueue the card for later reconsideration. >> >> Ignoring such a stale card is incorrect at the point where it was >> being done. At that point we've already cleaned the card, so we must >> either process the designated object(s) or, if we can't do the >> processing because of in-progress allocation (klass_or_null returned >> NULL), then re-queue the card for later reconsideration. >> >> So the change to refine_card to eliminate that behavior, and the >> associated changes to oops_on_card_seq_iterate_careful, were a >> mistake, and are being reverted by this new version. As a result, >> refine_card is no longer changed at all. > > Thanks for catching this. > > Maybe it would be cleaner to call a method in the barrier set instead > of inlining the dirtying + enqueuing in lines 685 to 691? Maybe as an > additional RFE. We could use _ct_bs->invalidate(dirtyRegion). That's rather overgeneralized and inefficient for this situation, but this situation should occur *very* rarely; it requires a stale card get processed just as a humongous object is in the midst of being allocated in the same region. >> Additionally, in the original RFR I also said: >> >> Note that [...] At present the only source of stale cards in the >> concurrent case seems to be HCC eviction. [...] Doing HCC cleanup >> when freeing regions might remove the need for klass_or_null >> checking in the humongous case for concurrent refinement, so might >> be worth looking into later. >> >> That was also incorrect; there are other sources of stale cards. > > Can you elaborate on that? Here's a scenario that I've observed while running a jtreg test (I think it was hotspot/test/gc/TestHumongousReferenceObject). We have humongous object H, referring to young object Y. 
This induces a remembered set entry for card C in region R (allocated for H). H becomes unreachable. Start concurrent collection cycle. Pause Initial Mark scan_rs pushes &H->Y onto mark stack. Pause Initial Mark evac processes &H->Y, copying Y, updating &H->Y, and adding C to g1h_dcqs in update_rs. Pause Initial Mark redirty_logged_cards dirties g1h_dcqs entries, including C. Pause Initial Mark merges g1h_dcqs into java_dcqs, adding dirty C to java_dcqs. Concurrent Mark determines H is dead. Pause Cleanup frees regions for H, including R. Concurrent Refinement finally comes across stale C in now (possibly) free R. A similar situation can arise if instead of H we have old O in region R and all objects in R are unreachable before starting concurrent collection, so that Pause Cleanup frees R. From david.holmes at oracle.com Tue Nov 8 01:11:57 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Nov 2016 11:11:57 +1000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <1475236951.6301.72.camel@oracle.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> Message-ID: On 6/11/2016 8:54 PM, Andrew Haley wrote: > On 05/11/16 18:43, David Holmes wrote: >> Forking new discussion from: >> >> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >> >> On 1/11/2016 7:44 PM, Andrew Haley wrote: >>> On 31/10/16 21:30, David Holmes wrote: >>>> >>>> >>>> On 31/10/2016 7:32 PM, Andrew Haley wrote: >>>>> On 30/10/16 21:26, David Holmes wrote: >>>>>> On 31/10/2016 4:36 AM, Andrew Haley wrote: >>>> >>>> // All of the atomic operations that imply a read-modify-write >>>> // action guarantee a two-way memory barrier across that >>>> // operation. 
Historically these semantics reflect the strength >>>> // of atomic operations that are provided on SPARC/X86. We assume >>>> // that strength is necessary unless we can prove that a weaker >>>> // form is sufficiently safe. >>> >>> Mmmm, but that doesn't say anything about a CAS that fails. But fair >>> enough, I accept your interpretation. >> >> Granted the above was not written with load-linked/store-conditional >> style implementations in mind; and the historical behaviour on sparc >> and x86 is not affected by failure of the cas, so it isn't called >> out. I should fix that. >> >>>> But there is some contention as to whether the actual implementations >>>> obey this completely. >>> >>> Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified >>> as a >>> >>> "full barrier". That is, no memory operand is moved across the >>> operation, either forward or backward. Further, instructions are >>> issued as necessary to prevent the processor from speculating loads >>> across the operation and from queuing stores after the operation. >>> >>> ... which reads the same as the language you quoted above, but looking >>> at the assembly code I'm sure that it's really no stronger than a seq >>> cst load followed by a seq cst store. >> >> Are you saying that a seq_cst load followed by a seq_cst store is weaker >> than a full barrier? > > Probably. I'm saying that when someone says "full barrier" they > aren't exactly clear what that means. I know what sequential > consistency is, but not "full barrier" because it's used > inconsistently. Agreed it is not a term that has a common definition - it may just relate to no-reorderings of any loads or stores, or it may also imply visibility guarantees. Though while I know what "sequential consistency" is I do not know what exactly it means to implement an operation with seq_cst semantics. 
> For example, the above says that no memory operand is moved across the > barrier, but if you have > > store_relaxed(a) > load_seq_cst(b) > store_seq_cst(c) > load_relaxed(d) > > there's nothing to prevent > > load_seq_cst(b) > load_relaxed(d) > store_relaxed(a) > store_seq_cst(c) > > It is true that neither store a nor load d have moved across this > operation, but they have exchanged places. As far as GCC is concerned > this is a correct implementation, and it does meet the requirement of > sequential consistency as defined in the C++ memory model. It does? Then it emphasises what I just said about not knowing what it means to implement an operation with seq_cst semantics. I would have expected full ordering of all loads and stores to get "sequential consistency". >>> I guess maybe I could give up fighting this and implement all AArch64 >>> CAS sequences as >>> >>> CAS(seq_cst); full fence >>> >>> or, even more extremely, >>> >>> full fence; CAS(relaxed); full fence >>> >>> but it all seems unreasonably heavyweight. >> >> Indeed. A couple of issues here. If you are thinking in terms of >> orderAccess::fence() then it needs to guarantee visibility as well as >> ordering - see this bug I just filed: >> >> https://bugs.openjdk.java.net/browse/JDK-8169193 > > Ouch. Yes, I agree that something needs fixing. That comment: > > // Use release_store_fence to update values like the thread state, > // where we don't want the current thread to continue until all our > // prior memory accesses (including the new thread state) are visible > // to other threads. > > ... seems very unhelpful, at least because a release fence (using > conventional terminology) does not have that property: a release > fence is only LoadStore|StoreStore. In release_store_fence the release and fence are distinct memory ordering components. It is not a store combined with a "release fence" but a store between a "release" and a "fence". 
And critically in hotspot that "fence" must have visibility guarantees to ensure correctness of Dekker-duality algorithms. Note the equivalence of release() with LoadStore|StoreStore is a definition within orderAccess.hpp, it is not a general equivalence. >> So would be heavier than a "full barrier" that simply combined all >> four storeload membar variants. Though of course the actual >> implementation on a given architecture may be just as >> heavyweight. And of course the Atomic op must guarantee visibility >> of the successful store (else the atomicity aspect would not be >> present). > > I don't think that's exactly right. As I understand the ARMv8 memory > model, it's possible to have a CAS which imposes no memory ordering or > visibility at all: it's a relaxed load and a relaxed store. Other > threads can still see stale values of the store unless they attempt a > CAS. This is really good: it's exactly what you want for some shared > counters. Okay - yes - a naked "relaxed" load need not see the result of a recent successful "CAS". But the load-with-reservation within a "CAS" must see such a store I would think, to ensure things work correctly - though I suppose that could also be handled at the store-with-reservation point. Which suggests that a CAS with a "full two-way memory barrier" on ARMv8 does indeed need a fairly heavy pre- and post-op memory barrier (which makes me wonder whether the reservation using ld.acq and st.rel can be efficiently strengthened as needed, or whether plain ld and st would be more efficient within the overall sequence). >> That aside we do not need two "fences" surrounding the atomic >> op. 
For platforms where the atomic op is a single instruction which >> combines load and store then conceptually all we need is: >> >> loadload|storeload; op; storeload|storestore >> >> Note this is at odds with the commentary in atomic.hpp which says things >> like: >> >> // add-value-to-dest >> >> I need to check why we settled on the above formulation - I suspect it >> was conservatism. And of course for the cmpxchg it fails to account for >> the fact there may not be a store to order with. Just a note that, for example, SPARC does not require a CAS to succeed, for a subsequent membar to consider the CAS as a load+store. >> >> For load-linked/store-conditional based operations that would expand to >> (assume a retry loop for unrelated store failures): >> >> loadLoad|storeLoad >> temp = ld-linked &val >> cmp temp, expected >> jmp ne >> st-cond &val, newVal >> storeload|storestore >> >> which is fine if we actually store, but if we find the wrong value >> there is no store for those final barriers to sync with. That then >> raises the question: can subsequent loads and stores move into the >> ld-linked/st-cond region? The general context-free answer would be >> yes, but the actual details may be architecture specific and also >> context dependent - ie the subsequent loads/stores may be dependent >> on the CAS succeeding (or on it failing). So without further >> knowledge you would need to use a "full-barrier" after the st-cond. > > On most (all?) architectures a StoreLoad fence is a full barrier, so > this formulation is equivalent to what I was saying anyway. I'm trying to distinguish the desired semantics from any actual implementation mechanism. That fact that, for example, on SPARC and x86, the only explicit barrier needed is storeLoad, so if you have that then you effectively have a "full barrier" because the other three are implicit, is incidental. Cheers, David > Andrew. 
> From jeremymanson at google.com Tue Nov 8 07:32:14 2016 From: jeremymanson at google.com (Jeremy Manson) Date: Mon, 7 Nov 2016 23:32:14 -0800 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: <581C7E740200009400009CA7@gwia.im.jku.at> Message-ID: Fair enough. We sample instead of getting detailed information, but we are generally trying to profile an application running on lots of JVMs, so we have some scale that not everyone does. We wouldn't have been able to live with 7.8% overhead either, though. I bet someone could come up with a reasonable strategy to make MonitorContendedExit work if they were motivated. :) Jeremy On Mon, Nov 7, 2016 at 11:35 AM, Peter Hofer wrote: > Hi Jeremy, > > On 2016-11-04 19:24, Jeremy Manson wrote: > >> Why aren't these extensions to JVMTI, which already has >> MonitorContendedEnter and MonitorContendedEntered events? You could >> just add a MonitorContendedRelease event to cover what you want. >> Then the bulk of the tracking work can be done in JVMTI. >> > > One of our main goals was to make profiling very lightweight so that > there is a chance that the profiler can be used on production systems. In > the HotSpot code, we can record events and maintain state very efficiently. > > I agree that a profiler that uses only JVMTI and extension methods would > be more modular. We actually tried to implement a comparable profiler using > JVMTI. It performs very frequent state transitions to the agent and back, > requires wrapping all references and data structures, and needs tagging to > associate state with objects. Moreover, it cannot efficiently cache stack > traces without always resolving inlined methods from the compiler's debug > information (which makes a lot of difference in our HotSpot-internal > profiler). The JVMTI-based profiler turned out to be rather inefficient, > which is why we didn't pursue this approach further. 
> > As Alex pointed out, there used to be a MonitorContendedExit event in > early versions of JVMTI. It was eliminated because the context of a monitor > exit is not really safe for invoking a JVMTI callback, which is another > issue that would need to be addressed first. > > Cheers, > Peter > > At Google, we've built on these JVMTI primitives quite successfully. >> The only internal enhancements we've had to make is to make them support >> j.u.c locks. >> >> (We've also done the hotspot lock contention work, but it has been less >> directly useful.) >> >> Jeremy >> >> On Fri, Nov 4, 2016 at 4:26 AM, David Gnedt > > wrote: >> >> Hello, >> >> I am one of the authors of this work and I gladly support this >> contribution. >> >> Best regards, >> David Gnedt >> >> >>> Peter Hofer > >> >> 04.11.16 11.01 Uhr >>> >> Hello everyone, >> >> we are researchers at the University of Linz and have worked on a lock >> contention profiler that is built into HotSpot. We would like to >> contribute this work to the OpenJDK community. >> >> Our profiler records an event when a thread fails to acquire a >> contended >> >> lock and also when a thread releases a contended lock. It further >> efficiently records the stack traces where these events occur. We >> devised a versatile visualization tool that analyzes the recorded >> events >> >> and determines when and where threads _cause_ contention by holding a >> contended lock. The visualization tool can show the contention by >> stack >> trace, by lock, by lock class, by thread, and by any combination of >> those aspects. >> >> We described our profiler in more detail in a research paper at ICPE >> 2016. [1] In our evaluation, we found that the overhead is typically >> below 10% for common multi-threaded Java benchmarks. 
Please find a >> free >> download of the paper on our website: >> > http://mevss.jku.at/lct/ >> >> I contribute this work on behalf of Dynatrace Austria (the sponsor of >> this research), my colleagues David Gnedt and Andreas Schoergenhumer, >> and myself. The necessary OCAs have already been submitted. >> >> We provide two patches: >> >> Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we >> described and evaluated in our paper, plus minor improvements. It >> records events for Java intrinsic locks (monitors) and for >> java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). >> We support only Linux on 64-bit x86 hardware. >> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_ >> jdk8u102b14/ >> > jdk8u102b14/> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ >> >> >> Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal >> native locks only. We consider this to be useful for HotSpot >> developers >> to find locking bottlenecks in HotSpot itself. We tested this patch >> only >> >> on Linux on 64-bit x86 hardware, but it should require few changes for >> other platforms. >> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativeloc >> ksonly_hotspot_jdk9%2b140/ >> > cksonly_hotspot_jdk9%2b140/> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativeloc >> ksonly_jdk_jdk-9%2b140/ >> > cksonly_jdk_jdk-9%2b140/> >> >> With both patches, the profiler is enabled with >> -XX:+EnableEventTracing. >> >> By default, an uncompressed event trace is written to file >> "output.trc". >> >> More detailed usage information and a download of the corresponding >> visualization tool is available on our website, >> http://mevss.jku.at/lct/. 
>> >> Kind regards, >> Peter Hofer >> >> >> -- >> Peter Hofer >> Christian Doppler Laboratory on Monitoring and Evolution of >> Very-Large-Scale Software Systems / Institute for System Software >> University of Linz >> >> >> [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter >> Moessenboeck. Efficient Tracing and Versatile Analysis of Lock >> Contention in Java Applications on the Virtual Machine Level. >> Proceedings of the 7th ACM/SPEC International Conference on >> Performance >> Engineering (ICPE'16), Delft, Netherlands, 2016. >> >> >> >> From 1072213404 at qq.com Tue Nov 8 08:48:44 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Tue, 8 Nov 2016 16:48:44 +0800 Subject: help understanding lock instruction in OrderAccess::fence() Message-ID: hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp inline void OrderAccess::fence() { if (os::is_MP()) { // always use locked addl since mfence is sometimes expensive #ifdef AMD64 __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory"); #else __asm__ volatile ("lock; addl $0,0(%%esp)" : : : "cc", "memory"); #endif } } my classmates think that the code "addl $0,0(%%esp)" has some specific effect, because esp points to the top of the stack. Is that true? Or is "addl $0,0(%%esp)" just a no-op, needing at least one operation after the lock prefix, since otherwise the lock instruction would produce an error? Thank you ! Arron From david.holmes at oracle.com Tue Nov 8 09:35:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Nov 2016 19:35:24 +1000 Subject: help understanding lock instruction in OrderAccess::fence() In-Reply-To: References: Message-ID: On 8/11/2016 6:48 PM, 恶灵骑士 
wrote: > hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp > inline void OrderAccess::fence() { > if (os::is_MP()) { > // always use locked addl since mfence is sometimes expensive > #ifdef AMD64 > __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory"); > #else > __asm__ volatile ("lock; addl $0,0(%%esp)" : : : "cc", "memory"); > #endif > } > } > > > my classmates think that the code "addl $0,0(%%esp)" has some specific effect, > because esp points to the top of the stack. > Is that true? > Or is "addl $0,0(%%esp)" just a no-op, It is a no-op - adding zero to a value. > needing at least one operation after the lock prefix, since otherwise the lock instruction would produce an error? "lock" is not an instruction, it is an instruction prefix, so has to go before some other instruction. The "lock" prefix acts as a storeload** barrier for x86 and as per the comment can be cheaper than an explicit mfence instruction. **All the other barriers are implicit in the x86 memory model, so you only need to add a storeload barrier to get the necessary fence semantics. David > > Thank you ! > > Arron > From thomas.schatzl at oracle.com Tue Nov 8 10:01:53 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 08 Nov 2016 11:01:53 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> References: <1478516225.2646.19.camel@oracle.com> <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> Message-ID: <1478599313.2689.44.camel@oracle.com> Hi Kim, On Mon, 2016-11-07 at 14:07 -0500, Kim Barrett wrote: > > > > On Nov 7, 2016, at 1:36 PM, Kim Barrett > > wrote: > > > > > > > > > > > [...] > > > One could replace "dirtied" by something less specific here to > > > make it > > > right again. > > Good idea. How about this rewording (using "set to a value") 
> > // The region could be young. Cards for young regions are set to a > // value that allows the post-barrier to filter them out. However, > // that card setting is performed concurrently. A write to a young > // object could occur before the card has been set, slipping past > // the filter. > Oops, no, that isn't right. (It's been a couple of weeks since I > looked at this, and forgot part of the problem.) > > Part of what's wrong with the comment is that we can no longer get to > that point with a young region. A young region's cards will be either > g1_young_gen or clean, never dirty. Hence the filtering out of non- Why? I think the reason for this comment has been that the following could happen: A: allocate new young region X, allocate object, storestore, stops at the beginning of the dirty_young_block() method B: allocate new object B in X, set B.y = something-outside, making the card "Dirty" since thread A did not actually start doing dirty_young_block() yet. Refinement: scans the card; since R does not seem to synchronize with A either, you may get a "dirty" card in a young (or free, depending on whether the setting of the region flag in X has already been observed - but it must be either one) region here in this case? A: does the work in dirty_young_block() (The previous is_young() check has indeed been wrong, and is_old_or_humongous() is better) > dirty cards a few lines before this comment will have already > discarded a young card before we reach the test this comment is > discussing. So the whole premise of the comment in question, that > the region could be young, is false. I think the comment is good after all. I would even emphasize the act of setting it to "g1_young_gen" by writing something like: // The region could be young. Cards for young regions are set to // "g1_young_gen" so the post-barrier will filter them out. However, // that dirtying is performed concurrently. A write to a young object 
// could occur in the same region before the cards have been set to // that value, slipping past the filter. Because then, if somebody removes g1_young_gen, he will hopefully find this place again by searching for "g1_young_gen" and think about this situation again. (in theory :)) Thanks, Thomas From aph at redhat.com Tue Nov 8 10:18:09 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 8 Nov 2016 10:18:09 +0000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> Message-ID: <5be88ed4-54ad-26e8-14ae-d5e402141287@redhat.com> On 08/11/16 01:11, David Holmes wrote: > On 6/11/2016 8:54 PM, Andrew Haley wrote: >> On 05/11/16 18:43, David Holmes wrote: >>> Forking new discussion from: >>> >>> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >>> >>> On 1/11/2016 7:44 PM, Andrew Haley wrote: >>>> On 31/10/16 21:30, David Holmes wrote: > >> if you have >> >> store_relaxed(a) >> load_seq_cst(b) >> store_seq_cst(c) >> load_relaxed(d) >> >> there's nothing to prevent >> >> load_seq_cst(b) >> load_relaxed(d) >> store_relaxed(a) >> store_seq_cst(c) >> >> It is true that neither store a nor load d have moved across this >> operation, but they have exchanged places. As far as GCC is concerned >> this is a correct implementation, and it does meet the requirement of >> sequential consistency as defined in the C++ memory model. > > It does? Then it emphasises what I just said about not knowing what it > means to implement an operation with seq_cst semantics. I take your point, but seq_cst is not a real mystery, it's just a matter of looking it up: it's all defined in the C++11 standard. And it's not significantly different from Java volatile. 
> I would have expected full ordering of all loads and stores to get > "sequential consistency". Why? There are only two sequentially-consistent loads and stores in that block of code. Of course those two have a total order. But you surely wouldn't expect a sequentially-consistent store to be ordered with respect to a relaxed load. >> Ouch. Yes, I agree that something needs fixing. That comment: >> >> // Use release_store_fence to update values like the thread state, >> // where we don't want the current thread to continue until all our >> // prior memory accesses (including the new thread state) are visible >> // to other threads. >> >> ... seems very unhelpful, at least because a release fence (using >> conventional terminology) does not have that property: a release >> fence is only LoadStore|StoreStore. > > In release_store_fence the release and fence are distinct memory > ordering components. It is not a store combined with a "release > fence" but a store between a "release" and a "fence". And critically > in hotspot that "fence" must have visibility guarantees to ensure > correctness of Dekker-duality algorithms. Ah, that is a slightly misleading name. The "_fence" at the end of the name is really a StoreLoad fence, got it. I noticed that once before, but I'd forgotten. I guess what's intended here is a sequentially-consistent store. > Note the equivalence of release() with LoadStore|StoreStore is a > definition within orderAccess.hpp, it is not a general equivalence. OK. It would certainly be nice if HotSpot could move to using standard terminology. Then, in time, we could just use the C++11 atomics. Andrew. 
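[Editorial sketch] The idiom David describes above, release(); store; fence();, can be approximated with the standard C++11 atomics Andrew mentions. The following is a hedged sketch only, not the HotSpot implementation: thread_state is an invented stand-in, and a real release_store_fence may be implemented more efficiently on a given platform.

```cpp
#include <atomic>

std::atomic<int> thread_state{0};  // illustrative stand-in

// Sketch of release(); store; fence();  No prior access may be
// reordered after the store, and the thread does not continue past
// the trailing full fence until the store is visible to other
// threads (the property needed for Dekker-style algorithms).
void release_store_fence_sketch(int new_state) {
  std::atomic_thread_fence(std::memory_order_release);   // release()
  thread_state.store(new_state, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);   // fence()
}
```

On x86 the trailing seq_cst fence typically compiles to a locked instruction (compare OrderAccess::fence() discussed later in this thread), while the release fence is a compiler-only barrier.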
From david.holmes at oracle.com Tue Nov 8 10:35:17 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Nov 2016 20:35:17 +1000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: <5be88ed4-54ad-26e8-14ae-d5e402141287@redhat.com> References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> <5be88ed4-54ad-26e8-14ae-d5e402141287@redhat.com> Message-ID: <59d08376-fb41-cf39-3b1f-01f826e8d9e7@oracle.com> On 8/11/2016 8:18 PM, Andrew Haley wrote: > On 08/11/16 01:11, David Holmes wrote: >> On 6/11/2016 8:54 PM, Andrew Haley wrote: >>> On 05/11/16 18:43, David Holmes wrote: >>>> Forking new discussion from: >>>> >>>> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >>>> >>>> On 1/11/2016 7:44 PM, Andrew Haley wrote: >>>>> On 31/10/16 21:30, David Holmes wrote: >> >>> if you have >>> >>> store_relaxed(a) >>> load_seq_cst(b) >>> store_seq_cst(c) >>> load_relaxed(d) >>> >>> there's nothing to prevent >>> >>> load_seq_cst(b) >>> load_relaxed(d) >>> store_relaxed(a) >>> store_seq_cst(c) >>> >>> It is true that neither store a nor load d have moved across this >>> operation, but they have exchanged places. As far as GCC is concerned >>> this is a correct implementation, and it does meet the requirement of >>> sequential consistency as defined in the C++ memory model. >> >> It does? Then it emphasises what I just said about not knowing what it >> means to implement an operation with seq_cst semantics. > > I take your point, but seq_cst is not a real mystery, it's just a > matter of looking it up: it's all defined in the C++11 standard. And > it's not significantly different from Java volatile. I have looked at it of course, but still find it rather "mysterious". 
>> I would have expected full ordering of all loads and stores to get >> "sequential consistency". > > Why? There are only two sequentially-consistent loads and stores in > that block of code. Of course those two have a total order. But you > surely wouldn't expect a sequentially-consistent store to be ordered > with respect to a relaxed load. I guess I think of sequentially consistent as a global property of a system, not relative to just atomic operations. >>> Ouch. Yes, I agree that something needs fixing. That comment: >>> >>> // Use release_store_fence to update values like the thread state, >>> // where we don't want the current thread to continue until all our >>> // prior memory accesses (including the new thread state) are visible >>> // to other threads. >>> >>> ... seems very unhelpful, at least because a release fence (using >>> conventional terminology) does not have that property: a release >>> fence is only LoadStore|StoreStore. >> >> In release_store_fence the release and fence are distinct memory >> ordering components. It is not a store combined with a "release >> fence" but a store between a "release" and a "fence". And critically >> in hotspot that "fence" must have visibility guarantees to ensure >> correctness of Dekker-duality algorithms. > > Ah, that is a slightly misleading name. The "_fence" at the end of > the name is really a StoreLoad fence, got it. I noticed that once > before, but I'd forgotten. I guess what's intended here is a > sequentially-consistent store. It is intended to be: release(); store; fence(); but might be implementable in a more efficient manner when combined in a single function. I have a problem with referring to a "storeload fence". storeload is one form of memory barrier - a full fence represents all four forms to me. Terminology is a disaster in this field unfortunately - one architecture's barrier is another's fence. 
:( >> Note the equivalence of release() with LoadStore|StoreStore is a >> definition within orderAccess.hpp, it is not a general equivalence. > > OK. It would certainly be nice if HotSpot could move to using > standard terminology. Then, in time, we could just use the C++11 > atomics. The stand-alone (unbound) release() and acquire() are defined as they are to allow them to be associated with a subsequent store, or previous load, in cases where we can not access the variable directly to apply a release_store, or load_acquire operation. This is somewhat independent of the atomic API. David ----- > Andrew. > From thomas.schatzl at oracle.com Tue Nov 8 12:52:29 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 08 Nov 2016 13:52:29 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: References: Message-ID: <1478609549.2689.71.camel@oracle.com> Hi Kim, On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > Please review this change to address missing memory barriers needed > to ensure ordering between allocation and refinement in G1. > > Rather than simply adding the "obvious" barriers, this change > modifies refinement to not need any additional memory barriers. > > First, the heap region type predicate used to decide whether the card > should be processed has been changed. Previously, !is_young was > used, but that isn't really the state of interest. Rather, processing > should only occur if the region is old or humongous, not if young or > *free*. The free case (and so other cases that should be filtered > out) can happen if the card is stale, and there are several ways to > get stale cards here. So added is_old_or_humongous type predicate > and use it for filtering based on the region's type.
> > Second, moved to refine_card the card region trimming to the heap > region's allocated space, and the associated filtering, to be > co-located with the type-based filtering. An empty trimmed card > region is another indication of a stale card. > > We should filter out cards that are !is_old_or_humongous or when the > card's region is beyond the allocated space for the heap > region. Only if the card is old/humongous and covers allocated space > should we proceed with processing, and then only for the subset of > the card covering allocated space. > > Moved the card cleaning to refine_card. Having the cleaning in the > iterator seemed misplaced. Placing it in refine_card, after the card > trimming and type-based filtering also allows the fence needed for > the cleaning to serve double duty; in addition to ensuring processing > occurs after card cleaning (the original purpose), it now also > ensures processing follows the filtering. And this ensures the > necessary synchronization with allocation; we can't safely examine > some parts of the heap region object or the memory designated by the > card until after the card has been trimmed and filtered. Part of > this involved changing the storeload to a full fence, though for > supported platforms that makes no difference in the underlying > implementation. > > (This change to card cleaning would benefit from a store_fence > operation on some platforms, but that operation was phased out, and a > release_store_fence is stronger (and more expensive) than needed on > some platforms.) It would also be beneficial to make the fence conditional on is_gc_active(), but that may be another change as we previously did the storeload unconditionally too. > There is still a situation where processing can fail, namely an > in-progress humongous allocation that hasn't set the klass yet. We > continue to handle that as before. - I am not completely sure about whether this case is handled correctly.
I am mostly concerned that the information used before the fence may not be correct, but the checks expect it to be valid. Probably I am overlooking something critical somewhere. A: allocates humongous object C, sets region type, issues storestore, sets top pointers, writes the object, and then sets C.y = x to mark a card Refinement: gets card (and assuming we have no further synchronization around which is not true, e.g. the enqueuing) 592 if (!r->is_old_or_humongous()) { assume refinement thread has not received the "type" correctly yet, so must be Free. So the card will be filtered out incorrectly? That is contradictory to what I said in the other email about the comment discussion, but I only thoroughly looked at the comment aspect there. :) I think at this point in general we can't do anything but !is_young(), as we can't ignore cards in "Free" regions - they may be cards for humongous ones where the thread did not receive top and/or the type yet? - assuming this works due to other synchronization, I have another similar concern with later trimming: 653 } else { 654 // Non-humongous objects are only allocated in the old-gen during 655 // GC, so if region is old then top is stable. Humongous object 656 // allocation sets top last; if top has not yet been set, then 657 // we'll end up with an empty intersection. 658 scan_limit = r->top(); 659 } 660 if (scan_limit <= start) { 661 // If the trimmed region is empty, the card must be stale. 662 return false; 663 } Assume that the current value of top for a humongous object has not been seen yet by the thread and we end up with an empty intersection. Now, didn't we potentially just drop a card to a humongous object in waiting to scan but did not re-enqueue it? (And we did not clear the card table value either?) We may do it after the fence though I think. Maybe I am completely wrong though, what do you think?
- another stale comment: 636 // a card beyond the heap. This is not safe without a perm 637 // gen at the upper end of the heap. Could everything after "without" be removed in this sentence? We haven't had a "perm gen" for a long time... Thanks, Thomas From markus.gronlund at oracle.com Tue Nov 8 14:14:12 2016 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Tue, 8 Nov 2016 06:14:12 -0800 (PST) Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <2c9077d1-fa5b-42f6-a805-0eb343b5b22e@default> Hi Peter, Thanks for your offer to contribute this work to the OpenJDK. You will most likely need to follow the JDK Enhancement Proposal (JEP) process for this work: Please see the following link for the JEP process description: http://cr.openjdk.java.net/~mr/jep/jep-2.0-02.html Thanks Markus -----Original Message----- From: Peter Hofer [mailto:peter.hofer at jku.at] Sent: den 4 november 2016 11:01 To: hotspot-dev at openjdk.java.net Cc: David Gnedt; Andreas Schoergenhumer Subject: Contribution: Lock Contention Profiler for HotSpot Hello everyone, we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community. Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur. We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks.
Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14 > / http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hot > spot_jdk9%2b140/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk > _jdk-9%2b140/ With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc". More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/. Kind regards, Peter Hofer -- Peter Hofer Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software University of Linz [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. 
Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE'16), Delft, Netherlands, 2016. From andreas.schoergenhumer at jku.at Tue Nov 8 08:27:37 2016 From: andreas.schoergenhumer at jku.at (=?UTF-8?Q?Andreas_Sch=c3=b6rgenhumer?=) Date: Tue, 8 Nov 2016 09:27:37 +0100 Subject: Fwd: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <551c23dc-c9d0-8a22-070c-c3668ad6d63d@jku.at> Hi, I am one of the authors of this work and I gladly support this contribution. Kind regards, Andreas Schörgenhumer -------- Forwarded Message -------- Subject: Contribution: Lock Contention Profiler for HotSpot Date: Fri, 4 Nov 2016 11:00:38 +0100 From: Peter Hofer > Hello everyone, we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community. Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur. We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks. Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1.
A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc". More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/. Kind regards, Peter Hofer -- Peter Hofer Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software University of Linz [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE'16), Delft, Netherlands, 2016.
These seem to be caused by void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { if (_cb != NULL) { if (Interpreter::contains(pc())) { Method* m = this->interpreter_frame_method(); if (m != NULL) { m->name_and_sig_as_C_string(buf, buflen); st->print("j %s", buf); st->print("+%d", this->interpreter_frame_bci()); ModuleEntry* module = m->method_holder()->module(); if (module->is_named()) { module->name()->as_C_string(buf, buflen); st->print(" %s", buf); module->version()->as_C_string(buf, buflen); where module->version() returns NULL. Is this expected? Andrew. From Alan.Bateman at oracle.com Wed Nov 9 11:03:41 2016 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 9 Nov 2016 12:03:41 +0100 Subject: Segfaults in error traces caused by modules In-Reply-To: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> References: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> Message-ID: On 09/11/2016 11:42, Andrew Haley wrote: > I'm seeing repeated segfaults in error traces. These seem to be caused by > > void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { > if (_cb != NULL) { > if (Interpreter::contains(pc())) { > Method* m = this->interpreter_frame_method(); > if (m != NULL) { > m->name_and_sig_as_C_string(buf, buflen); > st->print("j %s", buf); > st->print("+%d", this->interpreter_frame_bci()); > ModuleEntry* module = m->method_holder()->module(); > if (module->is_named()) { > module->name()->as_C_string(buf, buflen); > st->print(" %s", buf); > module->version()->as_C_string(buf, buflen); > > where module->version() returns NULL. > > The version is optional and so modules may be defined to the VM with a version string of NULL. It may be that this code has only been tested with images builds, where the platform modules have version ("9" or "9-internal" ...). However with an exploded build then the platform modules don't have a version string and I assume this is where you hit this. 
-Alan From aph at redhat.com Wed Nov 9 11:15:51 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 9 Nov 2016 11:15:51 +0000 Subject: Segfaults in error traces caused by modules In-Reply-To: References: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> Message-ID: On 09/11/16 11:03, Alan Bateman wrote: > On 09/11/2016 11:42, Andrew Haley wrote: > >> I'm seeing repeated segfaults in error traces. These seem to be caused by >> >> void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { >> if (_cb != NULL) { >> if (Interpreter::contains(pc())) { >> Method* m = this->interpreter_frame_method(); >> if (m != NULL) { >> m->name_and_sig_as_C_string(buf, buflen); >> st->print("j %s", buf); >> st->print("+%d", this->interpreter_frame_bci()); >> ModuleEntry* module = m->method_holder()->module(); >> if (module->is_named()) { >> module->name()->as_C_string(buf, buflen); >> st->print(" %s", buf); >> module->version()->as_C_string(buf, buflen); >> >> where module->version() returns NULL. >> > The version is optional and so modules may be defined to the VM with a > version string of NULL. It may be that this code has only been tested > with images builds, where the platform modules have version ("9" or > "9-internal" ...). However with an exploded build then the platform > modules don't have a version string and I assume this is where you hit this. Yes. OK, so it's a bug. Thanks. Andrew. From shafi.s.ahmad at oracle.com Thu Nov 10 06:42:02 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 9 Nov 2016 22:42:02 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses Message-ID: <77e0b348-2b95-4097-ba95-906257d8893c@default> Hi, Please review the backport of following dependent backports. jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 Conflict in file src/share/vm/opto/memnode.cpp due to 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. 
Manual merge is not done as the corresponding code is not there in jdk8u-dev. Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual merge is done. webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 Conflict in file src/share/vm/opto/library_call.cpp due to 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 [JDK-8140309]. Manual merge is not done as the corresponding code is not there in jdk8u-dev. webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 Clean merge webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 Conflict in file src/share/vm/opto/library_call.cpp due to 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 [JDK-8160360] - Resolved 2. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 [JDK-8148146] - Manual merge is not done as the corresponding code is not there in jdk8u-dev. webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 Testing: jprt and jtreg Regards, Shafi > -----Original Message----- > From: Shafi Ahmad > Sent: Thursday, October 20, 2016 10:08 AM > To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Thanks Vladimir. > > I will create dependent backport of > 1. https://bugs.openjdk.java.net/browse/JDK-8136473 > 2. https://bugs.openjdk.java.net/browse/JDK-8155781 > 3. 
https://bugs.openjdk.java.net/browse/JDK-8162101 > > Regards, > Shafi > > > -----Original Message----- > > From: Vladimir Kozlov > > Sent: Wednesday, October 19, 2016 8:27 AM > > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > > produces mismatched unsafe accesses > > > > Hi Shafi, > > > > You should also consider backporting following related fixes: > > > > https://bugs.openjdk.java.net/browse/JDK-8155781 > > https://bugs.openjdk.java.net/browse/JDK-8162101 > > > > Otherwise you may hit asserts added by 8134918 changes. > > > > Thanks, > > Vladimir > > > > On 10/17/16 3:12 AM, Shafi Ahmad wrote: > > > Hi All, > > > > > > Please review the backport of JDK-8134918 - C2: Type speculation > > > produces > > mismatched unsafe accesses to jdk8u-dev. > > > > > > Please note that backport is not clean and the conflict is due to: > > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > > > 65 > > > > > > Getting debug build failure because of: > > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > > > 55 > > > > > > The above changes are done under bug# 'JDK-8136473: failed: no > > mismatched stores, except on raw memory: StoreB StoreI' which is not > > back ported to jdk8u and the current backport is on top of above change. > > > > > > Please note that I am not sure if there is any dependency between > > > these > > two changesets. 
> > > > > > open webrev: > http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > > > > testing: Passes JPRT, jtreg not completed > > > > > > Regards, > > > Shafi > > > From shafi.s.ahmad at oracle.com Thu Nov 10 07:10:20 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 9 Nov 2016 23:10:20 -0800 (PST) Subject: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 ciObjectFactory::create_new_metadata In-Reply-To: References: <2e1de7f0-cc65-47f7-9f97-cb0e56dacfe1@default> Message-ID: Hi All, May I get the second review for this backport. Regards, Shafi > -----Original Message----- > From: Shafi Ahmad > Sent: Tuesday, October 25, 2016 9:09 AM > To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > Cc: Vladimir Ivanov > Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 > ciObjectFactory::create_new_metadata > > May I get the second review for this backport. > > Regards, > Shafi > > > -----Original Message----- > > From: Shafi Ahmad > > Sent: Thursday, October 20, 2016 9:55 AM > > To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > > Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with > > jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > > > > Thank you Vladimir for the review. > > > > Please find the updated webrev link. > > http://cr.openjdk.java.net/~shshahma/8134389/webrev.01/ > > > > All, > > > > May I get 2nd review for this. 
> > > > Regards, > > Shafi > > > > > -----Original Message----- > > > From: Vladimir Kozlov > > > Sent: Wednesday, October 19, 2016 10:14 PM > > > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > > > Cc: Vladimir Ivanov; Jamsheed C M > > > Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with > > > jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > > > > > > In ciMethod.hpp you duplicated comment line: > > > > > > + // Given a certain calling environment, find the monomorphic > > > + target > > > // Given a certain calling environment, find the monomorphic > > > target > > > > > > Otherwise looks good. > > > > > > Thanks, > > > Vladimir K > > > > > > On 10/19/16 12:53 AM, Shafi Ahmad wrote: > > > > Hi All, > > > > > > > > Please review the backport of 'JDK-8134389: Crash in HotSpot with > > > jvm.dll+0x42b48 ciObjectFactory::create_new_metadata' to jdk8u-dev. > > > > > > > > Please note that backport is not clean as I was getting build failure due > to: > > > > Formal parameter 'ignore_return' in method > > > > GraphBuilder::method_return > > > is added in the fix of https://bugs.openjdk.java.net/browse/JDK- > 8164122. > > > > The current code change is done on top of aforesaid bug fix and > > > > this formal > > > parameter is referenced in this code change. > > > > * if (x != NULL && !ignore_return) { * > > > > > > > > Author of this code change suggested me, we can safely remove this > > > addition conditional expression ' && !ignore_return'. 
> > > > > > > > open webrev: > > http://cr.openjdk.java.net/~shshahma/8134389/webrev.00/ > > > > jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8134389 > > > > jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs- > > > comp/hotspot/rev/4191b33b3629 > > > > > > > > testing: Passes JPRT, jtreg on Linux [amd64] and newly added test > > > > case > > > > > > > > Regards, > > > > Shafi > > > > From harold.seigel at oracle.com Thu Nov 10 14:54:30 2016 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 10 Nov 2016 09:54:30 -0500 Subject: Segfaults in error traces caused by modules In-Reply-To: References: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> Message-ID: <66f938eb-b319-98dd-97a1-2ffef7d58d18@oracle.com> Thanks for letting us know about this. I entered https://bugs.openjdk.java.net/browse/JDK-8169551 for this issue. Harold On 11/9/2016 6:15 AM, Andrew Haley wrote: > On 09/11/16 11:03, Alan Bateman wrote: >> On 09/11/2016 11:42, Andrew Haley wrote: >> >>> I'm seeing repeated segfaults in error traces. These seem to be caused by >>> >>> void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { >>> if (_cb != NULL) { >>> if (Interpreter::contains(pc())) { >>> Method* m = this->interpreter_frame_method(); >>> if (m != NULL) { >>> m->name_and_sig_as_C_string(buf, buflen); >>> st->print("j %s", buf); >>> st->print("+%d", this->interpreter_frame_bci()); >>> ModuleEntry* module = m->method_holder()->module(); >>> if (module->is_named()) { >>> module->name()->as_C_string(buf, buflen); >>> st->print(" %s", buf); >>> module->version()->as_C_string(buf, buflen); >>> >>> where module->version() returns NULL. >>> >> The version is optional and so modules may be defined to the VM with a >> version string of NULL. It may be that this code has only been tested >> with images builds, where the platform modules have version ("9" or >> "9-internal" ...). 
However with an exploded build then the platform >> modules don't have a version string and I assume this is where you hit this. > Yes. OK, so it's a bug. Thanks. > > Andrew. > From kim.barrett at oracle.com Thu Nov 10 17:42:34 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 10 Nov 2016 12:42:34 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1478599313.2689.44.camel@oracle.com> References: <1478516225.2646.19.camel@oracle.com> <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> <1478599313.2689.44.camel@oracle.com> Message-ID: <1D73FB14-127D-4508-A9CA-F9F88F12EACD@oracle.com> > On Nov 8, 2016, at 5:01 AM, Thomas Schatzl wrote: > On Mon, 2016-11-07 at 14:07 -0500, Kim Barrett wrote: >>> >>> On Nov 7, 2016, at 1:36 PM, Kim Barrett >>> wrote: >>>> >>>>> >>>>> > [...] >>>> One could replace "dirtied" by something less specific here to >>>> make it >>>> right again. >>> Good idea. How about this rewording (using "set to a value") >>> >>> // The region could be young. Cards for young regions are set to >>> a >>> // value that allows the post-barrier to filter them >>> out. However, >>> // that card setting is performed concurrently. A write to a >>> young >>> // object could occur before the card has been set, slipping past >>> // the filter. >> Oops, no, that isn't right. (It's been a couple of weeks since I >> looked at this, and forgot part of the problem.) >> >> Part of what's wrong with the comment is that we can no longer get to >> that point with a young region. A young region's cards will be either >> g1_young_gen or clean, never dirty. Hence the filtering out of non- > > Why?
I think the reason for this comment has been that the following > could happen: > > A: allocate new young region X, allocate object, storestore, stops at > the beginning of the dirty_young_block() method > > B: allocate new object B in X, set B.y = something-outside, making the > card "Dirty" since thread A did not actually start doing > dirty_young_block() yet. > > Refinement: scans the card; since R does not seem to synchronize with A > either, you may get a "dirty" card in a young (or free, depending on > whether the setting of the region flag in X has already been observed - > but it must be either one) region here in this case? > > A: does the work in dirty_young_block() > > (The previous is_young() check has indeed been wrong, and > is_old_or_humongous() is better) You are correct. Hopefully I've refreshed my understanding sufficiently that I won't keep making similar mistakes in this discussion. > I think the comment is good after all. I would even emphasize the act > of setting it to "g1_young_gen" by writing something like: > > // The region could be young. Cards for young regions are set to > // "g1_young_gen" so the post-barrier will filter them out. However, > // that dirtying is performed concurrently. A write to a young object > // could occur in the same region before the cards have been set to > // that value, slipping past the filter. > > Because then, if somebody removes g1_young_gen, he will hopefully find > this place again by searching for "g1_young_gen" and think about this > situation again. (in theory :)) Yes, that's better. I'll make that change.
From kim.barrett at oracle.com Thu Nov 10 18:20:41 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 10 Nov 2016 13:20:41 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1478609549.2689.71.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> Message-ID: <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl wrote: > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: >> There is still a situation where processing can fail, namely an >> in-progress humongous allocation that hasn't set the klass yet. We >> continue to handle that as before. > > - I am not completely sure about whether this case is handled > correctly. I am mostly concerned that the information used before the > fence may not be the correct ones, but the checks expect them to be > valid. > > Probably I am overlooking something critical somewhere. > > A: allocates humongous object C, sets region type, issues storestore, > sets top pointers, writes the object, and then sets C.y = x to mark a > card > > Refinement: gets card (and assuming we have no further synchronization > around which is not true, e.g. the enqueuing) > > 592 if (!r->is_old_or_humongous()) { > > assume refinement thread has not received the "type" correctly yet, so > must be Free. So the card will be filtered out incorrectly? > > That is contradictory to what I said in the other email about the > comment discussion, but I only thoroughly looked at the comment aspect > there. :) > > I think at this point in general we can't do anything but !is_young(), > as we can't ignore cards in "Free" regions - they may be for cards for > humongous ones where the thread did not receive top and/or the type > yet? > > - assuming this works due to other synchronization, This is the critical point. There *is* synchronization there. 
In the scenario described, the card that was marked and enqueued after the object was created will pass through some synchronization barriers (full locks, perhaps someday lock-free but with appropriate memory barriers) along the way to refinement. This is the "easy" case. If only it were that simple... The additional checks are to deal with the possibility of stale cards. > [...] I have another > similar concern with later trimming: > > 653 } else { > 654 // Non-humongous objects are only allocated in the old-gen during > 655 // GC, so if region is old then top is stable. Humongous object > 656 // allocation sets top last; if top has not yet been set, then > 657 // we'll end up with an empty intersection. > 658 scan_limit = r->top(); > 659 } > 660 if (scan_limit <= start) { > 661 // If the trimmed region is empty, the card must be stale. > 662 return false; > 663 } > > Assume that the current value of top for a humongous object has not > been seen yet by the thread and we end up with an empty intersection. > > Now, didn't we potentially just drop a card to a humongous object in > waiting to scan but did not re-enqueue it? (And we did not clear the > card table value either?) > > We may do it after the fence though I think. > > Maybe I am completely wrong though, what do you think? If we see the old (zero) value of top in conjunction with a humongous region type, it is because this is a stale card. If this were a non-stale card, the synchronization between enqueuing the card and reaching refinement would have ensured we see an up-to-date top (as well as an up-to-date type). Card table entries for a free region are cleaned before the region can be allocated (and there are locks in the allocation path that provide the needed ordering).
Since this is a stale card and regions are allocated with clean card table entries, the dirty card table entry check having passed implies there is another (non-stale and not-yet-processed) card making its way to refinement through the usual channels, including the needed synchronization barriers. > - another stale comment: > > 636 // a card beyond the heap. This is not safe without a perm > 637 // gen at the upper end of the heap. > > Could everything after "without" be removed in this sentence? We > haven't had a "perm gen" for a long time? Yes. I'll make that change. From vladimir.kozlov at oracle.com Thu Nov 10 19:55:45 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 10 Nov 2016 11:55:45 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <77e0b348-2b95-4097-ba95-906257d8893c@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> Message-ID: On 11/9/16 10:42 PM, Shafi Ahmad wrote: > Hi, > > Please review the backport of following dependent backports. > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > Conflict in file src/share/vm/opto/memnode.cpp due to > 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. Manual merge is not done as the corresponding code is not there in jdk8u-dev. > Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual merge is done. > webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ The unaligned unsafe access methods were added in jdk 9 only. In your changes the unaligned argument is always false. You can simplify the changes. 
Also you should base changes on JDK-8140309 (original 8136473 changes were backout by 8140267): On 11/4/15 10:21 PM, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > > Same as 8136473 with only the following change: > > diff --git a/src/share/vm/opto/library_call.cpp b/src/share/vm/opto/library_call.cpp > --- a/src/share/vm/opto/library_call.cpp > +++ b/src/share/vm/opto/library_call.cpp > @@ -2527,7 +2527,7 @@ > // of safe & unsafe memory. > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > > - assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || alias_type->adr_type() == TypeOopPtr::BOTTOM || > alias_type->field() != NULL || alias_type->element() != NULL, "field, array element or unknown"); > bool mismatched = false; > if (alias_type->element() != NULL || alias_type->field() != NULL) { > > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the is_native_ptr case and the case where the unsafe method is called with a null object. > jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > Conflict in file src/share/vm/opto/library_call.cpp due to > 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 [JDK-8140309]. Manual merge is not done as the corresponding code is not there in jdk8u-dev. I explained situation with this line above. > webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ This webrev is not incremental for your 8136473 changes - library_call.cpp has part from 8136473 changes. > jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 > Clean merge > webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ Thanks seems fine. 
> jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 > Conflict in file src/share/vm/opto/library_call.cpp due to > 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 [JDK-8160360] - Resolved > 2. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 [JDK-8148146] - Manual merge is not done as the corresponding code is not there in jdk8u-dev. > webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ This webrev is not incremental in library_call.cpp. Difficult to see this part of changes. Thanks, Vladimir > jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > > Testing: jprt and jtreg > > Regards, > Shafi > >> -----Original Message----- >> From: Shafi Ahmad >> Sent: Thursday, October 20, 2016 10:08 AM >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> Thanks Vladimir. >> >> I will create dependent backport of >> 1. https://bugs.openjdk.java.net/browse/JDK-8136473 >> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 >> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 >> >> Regards, >> Shafi >> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Wednesday, October 19, 2016 8:27 AM >>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>> produces mismatched unsafe accesses >>> >>> Hi Shafi, >>> >>> You should also consider backporting following related fixes: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>> Otherwise you may hit asserts added by 8134918 changes. 
>>> >>> Thanks, >>> Vladimir >>> >>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>>> Hi All, >>>> >>>> Please review the backport of JDK-8134918 - C2: Type speculation >>>> produces >>> mismatched unsafe accesses to jdk8u-dev. >>>> >>>> Please note that backport is not clean and the conflict is due to: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>> 65 >>>> >>>> Getting debug build failure because of: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>> 55 >>>> >>>> The above changes are done under bug# 'JDK-8136473: failed: no >>> mismatched stores, except on raw memory: StoreB StoreI' which is not >>> back ported to jdk8u and the current backport is on top of above change. >>>> >>>> Please note that I am not sure if there is any dependency between >>>> these >>> two changesets. >>>> >>>> open webrev: >> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 >>>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>> >>>> testing: Passes JPRT, jtreg not completed >>>> >>>> Regards, >>>> Shafi >>>> From jesper.wilhelmsson at oracle.com Fri Nov 11 16:15:27 2016 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 11 Nov 2016 17:15:27 +0100 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved Message-ID: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> Hi, Please review this minor change to quarantine a new test that is triggering an old bug. The bug is being worked on and the test will be enabled again once the bug is fixed. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ Thanks, /Jesper From erik.gahlin at oracle.com Fri Nov 11 17:24:40 2016 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Fri, 11 Nov 2016 18:24:40 +0100 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved In-Reply-To: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> References: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> Message-ID: <5825FED8.3080503@oracle.com> Looks good. Erik > Hi, > > Please review this minor change to quarantine a new test that is > triggering an old bug. The bug is being worked on and the test will be > enabled again once the bug is fixed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 > Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ > > Thanks, > /Jesper From george.triantafillou at oracle.com Fri Nov 11 19:07:03 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Fri, 11 Nov 2016 14:07:03 -0500 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved In-Reply-To: <5825FED8.3080503@oracle.com> References: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> <5825FED8.3080503@oracle.com> Message-ID: +1 -George On 11/11/2016 12:24 PM, Erik Gahlin wrote: > Looks good. > > Erik > >> Hi, >> >> Please review this minor change to quarantine a new test that is >> triggering an old bug. The bug is being worked on and the test will >> be enabled again once the bug is fixed. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ >> >> Thanks, >> /Jesper > From jesper.wilhelmsson at oracle.com Fri Nov 11 19:40:01 2016 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 11 Nov 2016 20:40:01 +0100 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved In-Reply-To: References: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> <5825FED8.3080503@oracle.com> Message-ID: Thanks Erik and George! /Jesper Den 11/11/16 kl. 20:07, skrev George Triantafillou: > +1 > > -George > > On 11/11/2016 12:24 PM, Erik Gahlin wrote: >> Looks good. >> >> Erik >> >>> Hi, >>> >>> Please review this minor change to quarantine a new test that is triggering >>> an old bug. The bug is being worked on and the test will be enabled again >>> once the bug is fixed. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 >>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ >>> >>> Thanks, >>> /Jesper >> > From shafi.s.ahmad at oracle.com Mon Nov 14 09:03:17 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Mon, 14 Nov 2016 01:03:17 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: References: <77e0b348-2b95-4097-ba95-906257d8893c@default> Message-ID: <137be921-c1ef-48d8-b85a-301d597109c0@default> Hi Vladimir, Thanks for the review. Please find updated webrevs. All webrevs are with respect to the base changes on JDK-8140309. 
http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ Regards, Shafi > -----Original Message----- > From: Vladimir Kozlov > Sent: Friday, November 11, 2016 1:26 AM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > On 11/9/16 10:42 PM, Shafi Ahmad wrote: > > Hi, > > > > Please review the backport of following dependent backports. > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > > Conflict in file src/share/vm/opto/memnode.cpp due to 1. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK- > 8080289]. Manual merge is not done as the corresponding code is not there > in jdk8u-dev. > > Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual > merge is done. > > webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > > unaligned unsafe access methods were added in jdk 9 only. In your changes > unaligned argument is always false. You can simplify changes. > > Also you should base changes on JDK-8140309 (original 8136473 changes > were backout by 8140267): > > On 11/4/15 10:21 PM, Roland Westrelin wrote: > > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > > > > Same as 8136473 with only the following change: > > > > diff --git a/src/share/vm/opto/library_call.cpp > b/src/share/vm/opto/library_call.cpp > > --- a/src/share/vm/opto/library_call.cpp > > +++ b/src/share/vm/opto/library_call.cpp > > @@ -2527,7 +2527,7 @@ > > // of safe & unsafe memory. 
> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > > > > - assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM > || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || > alias_type->adr_type() == TypeOopPtr::BOTTOM || > > alias_type->field() != NULL || alias_type->element() != > NULL, "field, array element or unknown"); > > bool mismatched = false; > > if (alias_type->element() != NULL || alias_type->field() != NULL) { > > > > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the is_native_ptr > case and the case where the unsafe method is called with a null object. > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > > Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > [JDK-8140309]. Manual merge is not done as the corresponding code is not > there in jdk8u-dev. > > I explained situation with this line above. > > > webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > This webrev is not incremental for your 8136473 changes - library_call.cpp has > part from 8136473 changes. > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 > > Clean merge > > webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > > Thanks seems fine. > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 > > Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > > [JDK-8160360] - Resolved 2. 
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 > [JDK-8148146] - Manual merge is not done as the corresponding code is not > there in jdk8u-dev. > > webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > > This webrev is not incremental in library_call.cpp. Difficult to see this part of > changes. > > Thanks, > Vladimir > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > > > > Testing: jprt and jtreg > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Shafi Ahmad > >> Sent: Thursday, October 20, 2016 10:08 AM > >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> Thanks Vladimir. > >> > >> I will create dependent backport of > >> 1. https://bugs.openjdk.java.net/browse/JDK-8136473 > >> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 > >> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 > >> > >> Regards, > >> Shafi > >> > >>> -----Original Message----- > >>> From: Vladimir Kozlov > >>> Sent: Wednesday, October 19, 2016 8:27 AM > >>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>> produces mismatched unsafe accesses > >>> > >>> Hi Shafi, > >>> > >>> You should also consider backporting following related fixes: > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>> Otherwise you may hit asserts added by 8134918 changes. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>>> Hi All, > >>>> > >>>> Please review the backport of JDK-8134918 - C2: Type speculation > >>>> produces > >>> mismatched unsafe accesses to jdk8u-dev. 
> >>>> > >>>> Please note that backport is not clean and the conflict is due to: > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > >>>> 65 > >>>> > >>>> Getting debug build failure because of: > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > >>>> 55 > >>>> > >>>> The above changes are done under bug# 'JDK-8136473: failed: no > >>> mismatched stores, except on raw memory: StoreB StoreI' which is not > >>> back ported to jdk8u and the current backport is on top of above > change. > >>>> > >>>> Please note that I am not sure if there is any dependency between > >>>> these > >>> two changesets. > >>>> > >>>> open webrev: > >> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > >>>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>> > >>>> testing: Passes JPRT, jtreg not completed > >>>> > >>>> Regards, > >>>> Shafi > >>>> From volker.simonis at gmail.com Mon Nov 14 10:09:46 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Nov 2016 11:09:46 +0100 Subject: RFR(XS): 8169625: Libjsig build doesn't set flags for ppc64/s390 builds Message-ID: Hi, can I please have a review and sponsor for the following small change which only affects ppc64/s390x but touches a shared make file: http://cr.openjdk.java.net/~simonis/webrevs/2016/8169625/ https://bugs.openjdk.java.net/browse/JDK-8169625 It is unfortunate that the build of the libjsig library (see make/lib/CompileLibjsig.gmk) doesn't reuse the generic compiler flags used by the hotspot build (i.e. the ones specified in JVM_CFLAGS). Instead, CompileLibjsig.gmk defines its own compiler flags in LIBJSIG_CPU_FLAGS but not for ppc64 and s390x. This leads to problems if the compiler on these platforms uses other default settings as configured for the OpenJDK build. 
Thank you and best regards, Volker From erik.joelsson at oracle.com Mon Nov 14 10:14:00 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 14 Nov 2016 11:14:00 +0100 Subject: RFR(XS): 8169625: Libjsig build doesn't set flags for ppc64/s390 builds In-Reply-To: References: Message-ID: Looks good. I will push it. /Erik On 2016-11-14 11:09, Volker Simonis wrote: > Hi, > > can I please have a review and sponsor for the following small change > which only affects ppc64/s390x but touches a shared make file: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8169625/ > https://bugs.openjdk.java.net/browse/JDK-8169625 > > It is unfortunate that the build of the libjsig library (see > make/lib/CompileLibjsig.gmk) doesn't reuse the generic compiler flags > used by the hotspot build (i.e. the ones specified in JVM_CFLAGS). > Instead, CompileLibjsig.gmk defines its own compiler flags in > LIBJSIG_CPU_FLAGS but not for ppc64 and s390x. This leads to problems > if the compiler on these platforms uses other default settings as > configured for the OpenJDK build. > > Thank you and best regards, > Volker From volker.simonis at gmail.com Mon Nov 14 10:15:09 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Nov 2016 11:15:09 +0100 Subject: RFR(XS): 8169625: Libjsig build doesn't set flags for ppc64/s390 builds In-Reply-To: References: Message-ID: Thanks a lot Erik! Volker On Mon, Nov 14, 2016 at 11:14 AM, Erik Joelsson wrote: > Looks good. I will push it. 
> > /Erik > > > > On 2016-11-14 11:09, Volker Simonis wrote: >> >> Hi, >> >> can I please have a review and sponsor for the following small change >> which only affects ppc64/s390x but touches a shared make file: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8169625/ >> https://bugs.openjdk.java.net/browse/JDK-8169625 >> >> It is unfortunate that the build of the libjsig library (see >> make/lib/CompileLibjsig.gmk) doesn't reuse the generic compiler flags >> used by the hotspot build (i.e. the ones specified in JVM_CFLAGS). >> Instead, CompileLibjsig.gmk defines its own compiler flags in >> LIBJSIG_CPU_FLAGS but not for ppc64 and s390x. This leads to problems >> if the compiler on these platforms uses other default settings as >> configured for the OpenJDK build. >> >> Thank you and best regards, >> Volker > > From vladimir.kozlov at oracle.com Mon Nov 14 17:50:01 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Nov 2016 09:50:01 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <137be921-c1ef-48d8-b85a-301d597109c0@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> Message-ID: <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > Hi Vladimir, > > Thanks for the review. > > Please find updated webrevs. > > All webrevs are with respect to the base changes on JDK-8140309. > http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ Why did you keep the unaligned parameter in the changes? The test TestUnsafeUnalignedMismatchedAccesses.java will not work, since the Unsafe class in jdk8 does not have unaligned methods. How did you run it? > http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ Good. Did you run the new UnsafeAccess.java test? > http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ Good. 
Thanks, Vladimir > > Regards, > Shafi > > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Friday, November 11, 2016 1:26 AM >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: >>> Hi, >>> >>> Please review the backport of following dependent backports. >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK- >> 8080289]. Manual merge is not done as the corresponding code is not there >> in jdk8u-dev. >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual >> merge is done. >>> webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ >> >> unaligned unsafe access methods were added in jdk 9 only. In your changes >> unaligned argument is always false. You can simplify changes. >> >> Also you should base changes on JDK-8140309 (original 8136473 changes >> were backout by 8140267): >> >> On 11/4/15 10:21 PM, Roland Westrelin wrote: >> > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ >> > >> > Same as 8136473 with only the following change: >> > >> > diff --git a/src/share/vm/opto/library_call.cpp >> b/src/share/vm/opto/library_call.cpp >> > --- a/src/share/vm/opto/library_call.cpp >> > +++ b/src/share/vm/opto/library_call.cpp >> > @@ -2527,7 +2527,7 @@ >> > // of safe & unsafe memory. 
>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); >> > >> > - assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM >> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || >> alias_type->adr_type() == TypeOopPtr::BOTTOM || >> > alias_type->field() != NULL || alias_type->element() != >> NULL, "field, array element or unknown"); >> > bool mismatched = false; >> > if (alias_type->element() != NULL || alias_type->field() != NULL) { >> > >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the is_native_ptr >> case and the case where the unsafe method is called with a null object. >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 >> [JDK-8140309]. Manual merge is not done as the corresponding code is not >> there in jdk8u-dev. >> >> I explained situation with this line above. >> >>> webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >> >> This webrev is not incremental for your 8136473 changes - library_call.cpp has >> part from 8136473 changes. >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 >>> Clean merge >>> webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ >> >> Thanks seems fine. >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 >>> [JDK-8160360] - Resolved 2. 
>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 >> [JDK-8148146] - Manual merge is not done as the corresponding code is not >> there in jdk8u-dev. >>> webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ >> >> This webrev is not incremental in library_call.cpp. Difficult to see this part of >> changes. >> >> Thanks, >> Vladimir >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 >>> >>> Testing: jprt and jtreg >>> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: Shafi Ahmad >>>> Sent: Thursday, October 20, 2016 10:08 AM >>>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces >>>> mismatched unsafe accesses >>>> >>>> Thanks Vladimir. >>>> >>>> I will create dependent backport of >>>> 1. https://bugs.openjdk.java.net/browse/JDK-8136473 >>>> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 >>>> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 >>>> >>>> Regards, >>>> Shafi >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Wednesday, October 19, 2016 8:27 AM >>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>> produces mismatched unsafe accesses >>>>> >>>>> Hi Shafi, >>>>> >>>>> You should also consider backporting following related fixes: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>> Otherwise you may hit asserts added by 8134918 changes. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>>>>> Hi All, >>>>>> >>>>>> Please review the backport of JDK-8134918 - C2: Type speculation >>>>>> produces >>>>> mismatched unsafe accesses to jdk8u-dev. 
>>>>>> >>>>>> Please note that backport is not clean and the conflict is due to: >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>>>> 65 >>>>>> >>>>>> Getting debug build failure because of: >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>>>> 55 >>>>>> >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no >>>>> mismatched stores, except on raw memory: StoreB StoreI' which is not >>>>> back ported to jdk8u and the current backport is on top of above >> change. >>>>>> >>>>>> Please note that I am not sure if there is any dependency between >>>>>> these >>>>> two changesets. >>>>>> >>>>>> open webrev: >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 >>>>>> jdk9 changeset: >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>>>> >>>>>> testing: Passes JPRT, jtreg not completed >>>>>> >>>>>> Regards, >>>>>> Shafi >>>>>> From kumar.x.srinivasan at oracle.com Mon Nov 14 14:36:43 2016 From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan) Date: Mon, 14 Nov 2016 06:36:43 -0800 Subject: Note: JDK-8168010: Deprecate obsolete launcher -d32/-d64 options Message-ID: <5829CBFB.2020000@oracle.com> Hello community, This is to inform you that the -d32 and -d64 options are obsolete and are destined to be removed in JDK10, see [1] and [2], this will be Release noted for JDK9. Please make every effort to inspect your java start-up scripts and purge these options. Thanks Kumar Srinivasan [1] https://bugs.openjdk.java.net/browse/JDK-8168010 [2] https://bugs.openjdk.java.net/browse/JDK-8169646 From david.lloyd at redhat.com Mon Nov 14 18:11:00 2016 From: david.lloyd at redhat.com (David M. 
Lloyd) Date: Mon, 14 Nov 2016 12:11:00 -0600 Subject: Sporadic NPEs in compiled code Message-ID: <92cf69d3-3655-0688-9d24-54e33e2beed4@redhat.com> We observed a problem where java.net.NetworkInterface appeared to be throwing an NPE originating at a line of code corresponding to its return instruction: Caused by: java.lang.NullPointerException at java.net.NetworkInterface.(NetworkInterface.java:80) at java.net.NetworkInterface.getAll(Native Method) at java.net.NetworkInterface.getNetworkInterfaces(NetworkInterface.java:343) java.net.NetworkInterface(); Code: 0: aload_0 1: invokespecial #3 // Method java/lang/Object."":()V 4: aload_0 5: aconst_null 6: putfield #4 // Field parent:Ljava/net/NetworkInterface; 9: aload_0 10: iconst_0 11: putfield #5 // Field virtual:Z 14: return LineNumberTable: line 79: 0 line 50: 4 line 51: 9 line 80: 14 I assumed that the problem was possibly JNI-related, because of the previous stack frame, however we've begun seeing the problem in other bits of code as well, areas like this: 0: aload_0 1: invokestatic #10 // Method doInject:(Lorg/jboss/msc/service/ValueInjection;)V 4: return or this constructor: 87: aload_0 88: aload 7 90: putfield #17 // Field extensionModuleName:Ljava/lang/String; We've started testing with -XX:TieredStopAtLevel=1 and so far it seems the problems have disappeared, however, it's not clear to my hotspot-amateur mind at all whether it's C2 that is causing this or whether there is a more general timing-related race condition that is hidden by limiting the compiler in this way. The OpenJDK version is: openjdk version "1.8.0_111" OpenJDK Runtime Environment (build 1.8.0_111-b16) OpenJDK 64-Bit Server VM (build 25.111-b16, mixed mode) It's coming out of a Fedora 24 distribution. 
-- - DML From chf at redhat.com Mon Nov 14 18:25:40 2016 From: chf at redhat.com (Christine Flood) Date: Mon, 14 Nov 2016 13:25:40 -0500 (EST) Subject: JEP 189: Shenandoah: An Ultra-Low-Pause-Time Garbage Collector In-Reply-To: <432402741.14209755.1479147790128.JavaMail.zimbra@redhat.com> Message-ID: <1444338101.14210351.1479147940505.JavaMail.zimbra@redhat.com> Hi We've addressed the issues with the JEP that were brought up last summer. We've been meeting our performance goals. What do we need to do to get Shenandoah approved for OpenJDK10? Christine From vladimir.kozlov at oracle.com Mon Nov 14 19:18:37 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Nov 2016 11:18:37 -0800 Subject: Sporadic NPEs in compiled code In-Reply-To: <92cf69d3-3655-0688-9d24-54e33e2beed4@redhat.com> References: <92cf69d3-3655-0688-9d24-54e33e2beed4@redhat.com> Message-ID: <109eaf6c-9951-aa40-d393-236144fba268@oracle.com> Could be https://bugs.openjdk.java.net/browse/JDK-8038348 It may also be related to EA issues. First try to run with C2 only: -XX:-TieredCompilation Then try switching off EA as a whole: -XX:-DoEscapeAnalysis Or just a subset of EA: -XX:-OptimizePtrCompare It could also be incorrect memory instruction scheduling (above a NULL check). You can generate an hs_err file to see recent events (deoptimizations, compilations, uncommon traps): -XX:+UnlockDiagnosticVMOptions -XX:AbortVMOnException=java.lang.NullPointerException Also build a fastdebug version of the JDK and run with it to see if it hits some asserts. Thanks, Vladimir On 11/14/16 10:11 AM, David M. 
Lloyd wrote: > We observed a problem where java.net.NetworkInterface appeared to be > throwing an NPE originating at a line of code corresponding to its > return instruction: > > Caused by: java.lang.NullPointerException > at java.net.NetworkInterface.(NetworkInterface.java:80) > at java.net.NetworkInterface.getAll(Native Method) > at > java.net.NetworkInterface.getNetworkInterfaces(NetworkInterface.java:343) > > java.net.NetworkInterface(); > Code: > 0: aload_0 > 1: invokespecial #3 // Method > java/lang/Object."":()V > 4: aload_0 > 5: aconst_null > 6: putfield #4 // Field > parent:Ljava/net/NetworkInterface; > 9: aload_0 > 10: iconst_0 > 11: putfield #5 // Field virtual:Z > 14: return > LineNumberTable: > line 79: 0 > line 50: 4 > line 51: 9 > line 80: 14 > > I assumed that the problem was possibly JNI-related, because of the > previous stack frame, however we've begun seeing the problem in other > bits of code as well, areas like this: > > 0: aload_0 > 1: invokestatic #10 // Method > doInject:(Lorg/jboss/msc/service/ValueInjection;)V > 4: return > > or this constructor: > > 87: aload_0 > 88: aload 7 > 90: putfield #17 // Field > extensionModuleName:Ljava/lang/String; > > We've started testing with -XX:TieredStopAtLevel=1 and so far it seems > the problems have disappeared, however, it's not clear to my > hotspot-amateur mind at all whether it's C2 that is causing this or > whether there is a more general timing-related race condition that is > hidden by limiting the compiler in this way. > > The OpenJDK version is: > > openjdk version "1.8.0_111" > OpenJDK Runtime Environment (build 1.8.0_111-b16) > OpenJDK 64-Bit Server VM (build 25.111-b16, mixed mode) > > It's coming out of a Fedora 24 distribution. 
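[Editor's note] The flag bisection suggested in Vladimir's reply above can be scripted so that each suspect optimization is toggled in turn. This is a minimal sketch, not from the original thread: the `JAVA` and `APP` variables and the `Hello` workload are placeholders for whatever command reproduces the sporadic NPE.

```shell
#!/bin/sh
# Sketch: run the same workload under progressively narrower optimization
# settings to see which compiler phase makes the sporadic NPE come or go.
# JAVA/APP and "Hello" are assumed placeholders -- substitute the real
# launcher line for the failing application.
JAVA="${JAVA:-java}"
APP="${APP:-Hello}"

for FLAGS in \
    "-XX:-TieredCompilation" \
    "-XX:-DoEscapeAnalysis" \
    "-XX:-OptimizePtrCompare" \
    "-XX:+UnlockDiagnosticVMOptions -XX:AbortVMOnException=java.lang.NullPointerException"
do
    echo "=== $JAVA $FLAGS $APP ==="
    # Uncomment to actually run the workload under each setting:
    # $JAVA $FLAGS $APP
done
```

Since `-XX:-TieredCompilation` forces C2-only compilation, a failure that persists there but disappears with `-XX:-DoEscapeAnalysis` would point at an EA-related C2 optimization; the `AbortVMOnException` run makes the VM abort at the NPE and produce an hs_err file recording recent deoptimizations, compilations, and uncommon traps.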
From shafi.s.ahmad at oracle.com Tue Nov 15 06:34:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Mon, 14 Nov 2016 22:34:42 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> Message-ID: <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> Hi Vladimir, Thanks for the review. > -----Original Message----- > From: Vladimir Kozlov > Sent: Monday, November 14, 2016 11:20 PM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > On 11/14/16 1:03 AM, Shafi Ahmad wrote: > > Hi Vladimir, > > > > Thanks for the review. > > > > Please find updated webrevs. > > > > All webrevs are with respect to the base changes on JDK-8140309. > > http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > > Why did you keep the unaligned parameter in the changes? The fix of JDK-8136473 caused many problems after integration (see JDK-8140267). The fix was backed out and re-implemented with JDK-8140309 by slightly changing the assert: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-November/019696.html The code change for JDK-8140309 is the JDK-8136473 change with one assert slightly modified. The jdk9 original changeset is http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c As this is a backport, I kept the changes as they are. > > The test TestUnsafeUnalignedMismatchedAccesses.java will not work > since the Unsafe class in jdk8 does not have the unaligned methods. > How did you run it? I am sorry, it looks like there is some issue with my testing. I ran the jtreg tests after merging the changes, but somehow the test did not run and I only checked the list of failing tests in the jtreg results.
When I run the test case separately it fails, as you already pointed out. $java -jar ~/Tools/jtreg/lib/jtreg.jar -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java Test results: failed: 1 Report written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTreport/html/report.html Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork Error: /scratch/shshahma/Java/jdk8u-dev-8140309_01/hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: cannot find symbol UNSAFE.putIntUnaligned(array, UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); Not sure if we should push without the test case. > > > http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > > Good. Did you run the new UnsafeAccess.java test? Due to the same process issue this test was not run either, and when I run it separately it fails. It passes after making the changes below: 1. Added /othervm 2. Replaced the import statement 'import jdk.internal.misc.Unsafe;' with 'import sun.misc.Unsafe;' Updated webrev: http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > > > http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ I am getting a similar compilation error as above for the added test case. Not sure if we can push without the test case. Regards, Shafi > > Good. > > Thanks, > Vladimir > > > > > Regards, > > Shafi > > > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Friday, November 11, 2016 1:26 AM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >>> Hi, > >>> > >>> Please review the backport of following dependent backports.
> >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. Manual merge is not done as the corresponding code is not > >> there in jdk8u-dev. > >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and > >>> manual > >> merge is done. > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >> > >> unaligned unsafe access methods were added in jdk 9 only. In your > >> changes unaligned argument is always false. You can simplify changes. > >> > >> Also you should base changes on JDK-8140309 (original 8136473 changes > >> were backout by 8140267): > >> > >> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >> > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >> > > >> > Same as 8136473 with only the following change: > >> >
> >> > diff --git a/src/share/vm/opto/library_call.cpp b/src/share/vm/opto/library_call.cpp
> >> > --- a/src/share/vm/opto/library_call.cpp
> >> > +++ b/src/share/vm/opto/library_call.cpp
> >> > @@ -2527,7 +2527,7 @@
> >> >    // of safe & unsafe memory.
> >> >    if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder);
> >> >
> >> > -  assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> > +  assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> >           alias_type->field() != NULL || alias_type->element() != NULL, "field, array element or unknown");
> >> >    bool mismatched = false;
> >> >    if (alias_type->element() != NULL || alias_type->field() != NULL) {
> >> >
> >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >> is_native_ptr case and the case where the unsafe method is called with a null object.
> >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > >> [JDK-8140309]. Manual merge is not done as the corresponding code is > >> not there in jdk8u-dev. > >> > >> I explained situation with this line above. > >> > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >> > >> This webrev is not incremental for your 8136473 changes - > >> library_call.cpp has part from 8136473 changes. > >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> Clean merge > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >> > >> Thanks seems fine. > >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >>> [JDK-8160360] - Resolved 2. > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 > >> [JDK-8148146] - Manual merge is not done as the corresponding code is > >> not there in jdk8u-dev. > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >> > >> This webrev is not incremental in library_call.cpp. Difficult to see > >> this part of changes. 
> >> > >> Thanks, > >> Vladimir > >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >>> > >>> Testing: jprt and jtreg > >>> > >>> Regards, > >>> Shafi > >>> > >>>> -----Original Message----- > >>>> From: Shafi Ahmad > >>>> Sent: Thursday, October 20, 2016 10:08 AM > >>>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>> produces mismatched unsafe accesses > >>>> > >>>> Thanks Vladimir. > >>>> > >>>> I will create dependent backport of 1. > >>>> https://bugs.openjdk.java.net/browse/JDK-8136473 > >>>> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>> > >>>> Regards, > >>>> Shafi > >>>> > >>>>> -----Original Message----- > >>>>> From: Vladimir Kozlov > >>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>> produces mismatched unsafe accesses > >>>>> > >>>>> Hi Shafi, > >>>>> > >>>>> You should also consider backporting following related fixes: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>>> > >>>>> Otherwise you may hit asserts added by 8134918 changes. > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>>>>> Hi All, > >>>>>> > >>>>>> Please review the backport of JDK-8134918 - C2: Type speculation > >>>>>> produces > >>>>> mismatched unsafe accesses to jdk8u-dev. > >>>>>> > >>>>>> Please note that backport is not clean and the conflict is due to: > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.
> >>>>>> 1 > >>>>>> 65 > >>>>>> > >>>>>> Getting debug build failure because of: > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>>>>> 1 > >>>>>> 55 > >>>>>> > >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no > >>>>> mismatched stores, except on raw memory: StoreB StoreI' which is > >>>>> not back ported to jdk8u and the current backport is on top of > >>>>> above > >> change. > >>>>>> > >>>>>> Please note that I am not sure if there is any dependency > >>>>>> between these > >>>>> two changesets. > >>>>>> > >>>>>> open webrev: > >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > >>>>>> jdk9 changeset: > >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>>>> > >>>>>> testing: Passes JPRT, jtreg not completed > >>>>>> > >>>>>> Regards, > >>>>>> Shafi > >>>>>> From thomas.schatzl at oracle.com Tue Nov 15 10:21:04 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 15 Nov 2016 11:21:04 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> Message-ID: <1479205264.3251.13.camel@oracle.com> Hi Kim, On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: > > > > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl > om> wrote: > > On Tue, 2016-10-25 at 19:11 -0400, Kim 
Barrett wrote: > > > > > > > > > > > > > > > On Oct 21, 2016, at 9:54 PM, Kim Barrett > > > > wrote: > > > > > > > > > > > > > > > > > > > On Oct 21, 2016, at 8:46 PM, Kim Barrett > > > > > wrote: > > > > > In the humongous case, if it bails because klass_or_null == > > > > > NULL, > > > > > we must re-enqueue > > > > > the card? > > > This update (webrev.02) reverts part of the previous change. > > > > > > In the original RFR I said: > > > > > > As a result of the changes in oops_on_card_seq_iterate_careful, > > > we > > > now almost never fail to process the card. The only place > > > where > > > that can occur is a stale card in a humongous region with an > > > in-progress allocation, where we can just ignore it. So the > > > only > > > caller, refine_card, no longer needs to examine the result of > > > the > > > call and enqueue the card for later reconsideration. > > > > > > Ignoring such a stale card is incorrect at the point where it was > > > being done. At that point we've already cleaned the card, so we > > > must > > > either process the designated object(s) or, if we can't do the > > > processing because of in-progress allocation (klass_or_null > > > returned > > > NULL), then re-queue the card for later reconsideration. > > > > > > So the change to refine_card to eliminate that behavior, and the > > > associated changes to oops_on_card_seq_iterate_careful, were a > > > mistake, and are being reverted by this new version. As a > > > result, > > > refine_card is no longer changed at all. > > Thanks for catching this. > > > > Maybe it would be cleaner to call a method in the barrier set > > instead of inlining the dirtying + enqueuing in lines 685 to 691? > > Maybe as an additional RFE.
> We could use _ct_bs->invalidate(dirtyRegion). That's rather > overgeneralized and inefficient for this situation, but this > situation should occur *very* rarely; it requires a stale card get > processed just as a humongous object is in the midst of being > allocated in the same region. I kind of think for these reasons we should use _ct_bs->invalidate() as it seems clearer to me. There is the mentioned drawback of having no other more efficient way, so I will let you decide about this. > > > Additionally, in the original RFR I also said: > > > > > > Note that [...] At present the only source of stale cards in > > > the concurrent case seems to be HCC eviction. [...] Doing HCC > > > cleanup when freeing regions might remove the need for > > > klass_or_null checking in the humongous case for concurrent > > > refinement, so might be worth looking into later. > > > > > > That was also incorrect; there are other sources of stale cards. > > Can you elaborate on that? > Here's a scenario that I've observed while running a jtreg test (I > think it was hotspot/test/gc/TestHumongousReferenceObject). > > We have humongous object H, referring to young object Y. This > induces a remembered set entry for card C in region R (allocated for > H). > > H becomes unreachable. > Start concurrent collection cycle. > Pause Initial Mark scan_rs pushes &H->Y onto mark stack. > Pause Initial Mark evac processes &H->Y, copying Y, updating &H->Y, > and adding C to g1h_dcqs in update_rs. > Pause Initial Mark redirty_logged_cards dirties g1h_dcqs entries, > including C. > Pause Initial Mark merges g1h_dcqs into java_dcqs, adding dirty C to > java_dcqs. > Concurrent Mark determines H is dead. > Pause Cleanup frees regions for H, including R. > Concurrent Refinement finally comes across stale C in now (possibly) > free R.
> > A similar situation can arise if instead of H we have old O in region > R and all objects in R are unreachable before starting concurrent > collection, so that Pause Cleanup frees R. Okay, thanks, understood. Thomas From thomas.schatzl at oracle.com Tue Nov 15 10:26:48 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 15 Nov 2016 11:26:48 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> Message-ID: <1479205608.3251.18.camel@oracle.com> Hi, On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: > > > > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl > > wrote: > > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > > > > > > There is still a situation where processing can fail, namely an > > > in-progress humongous allocation that hasn't set the klass > > > yet. We continue to handle that as before. > > - I am not completely sure about whether this case is handled > > correctly. I am mostly concerned that the information used before > > the fence may not be the correct ones, but the checks expect them > > to be valid. > > > > Probably I am overlooking something critical somewhere. > > > > A: allocates humongous object C, sets region type, issues > > storestore, sets top pointers, writes the object, and then sets C.y > > = x to mark a card > > > > Refinement: gets card (and assuming we have no further > > synchronization around which is not true, e.g. the enqueuing) > > > > 592   if (!r->is_old_or_humongous()) { > > > > assume refinement thread has not received the "type" correctly yet, > > so must be Free. So the card will be filtered out incorrectly? > > > > That is contradictory to what I said in the other email about the > > comment discussion, but I only thoroughly looked at the comment > > aspect there.
:) > > I think at this point in general we can't do anything but > > !is_young(), as we can't ignore cards in "Free" regions - they may > > be for cards for humongous ones where the thread did not receive > > top and/or the type yet? > > > > - assuming this works due to other synchronization, > This is the critical point. There *is* synchronization there. Okay, thanks. I just wanted to make sure that we are aware of that we are using this other synchronization here. > In the scenario described, the card that was marked and enqueued > after the object was created will pass through some synchronization > barriers (full locks, perhaps someday lock-free but with appropriate > memory barriers) along the way to refinement. > > This is the "easy" case. If only it were that simple... > > The additional checks are to deal with the possibility of stale > cards. > > > > > [...] I have another > > similar concern with later trimming: > >
> > 653 } else {
> > 654   // Non-humongous objects are only allocated in the old-gen during
> > 655   // GC, so if region is old then top is stable.  Humongous object
> > 656   // allocation sets top last; if top has not yet been set, then
> > 657   // we'll end up with an empty intersection.
> > 658   scan_limit = r->top();
> > 659 }
> > 660 if (scan_limit <= start) {
> > 661   // If the trimmed region is empty, the card must be stale.
> > 662   return false;
> > 663 }
> >
> > Assume that the current value of top for a humongous object has not > > been seen yet by the thread and we end up with an empty > > intersection. > > > > Now, didn't we potentially just drop a card to a humongous object > > in waiting to scan but did not re-enqueue it? (And we did not clear > > the card table value either?) > > > > We may do it after the fence though I think. > > > > Maybe I am completely wrong though, what do you think?
> If we see the old (zero) value of top in conjunction with a humongous > region type, it is because this is a stale card. If this were a > non-stale card, the synchronization between enqueuing the card and > reaching refinement would have ensured we see an up-to-date top (as > well as an up-to-date type). Card table entries for a free region > are cleaned before the region can be allocated (and there are locks > in the allocation path that provide the needed ordering). Since this > is a stale card and regions are allocated with clean card table > entries, the dirty card table entry check having passed implies there > is another (non-stale and not-yet-processed) card making its way to > refinement through the usual channels, including the needed > synchronization barriers. Thanks. Again I was mostly worried about noting this reliance on previous synchronization down somewhere, even if it is only the mailing list. It may be useful to note this in the code too. This would save the next one working on this code looking through old mailing list threads. Maybe I am a bit overly concerned about making sure that these thoughts are provided in the proper place though. Or maybe everyone thinks that everything is clear :) > > > > - another stale comment: > >
> 636   // a card beyond the heap.  This is not safe without a perm
> 637   // gen at the upper end of the heap.
> > > > Could everything after "without" be removed in this sentence? We > haven't had a "perm gen" for a long time? > Yes. I'll make that change. > Thanks. Thanks, Thomas From trevor.d.watson at oracle.com Tue Nov 15 11:57:50 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Tue, 15 Nov 2016 11:57:50 +0000 Subject: RFR: 8162865 Implementation of SPARC lzcnt Message-ID: I have implemented the code to use the lzcnt instruction for both integer and long countLeadingZeros() methods on SPARC platforms supporting the vis3 instruction set.
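The semantics such an intrinsic has to preserve are easy to state in plain Java. The sketch below is not Trevor's actual jtreg test (the class and method names are invented); it cross-checks a loop-based count of leading zeros against the library methods the lzcnt instruction would intrinsify:

```java
// Hypothetical reference implementation; Clz, clz32 and clz64 are
// invented names, not part of the patch under review.
public class Clz {
    // Count zero bits from bit 31 down to the first set bit.
    static int clz32(int x) {
        int n = 0;
        while (n < 32 && (x & (0x80000000 >>> n)) == 0) n++;
        return n;
    }

    // Same idea for 64-bit values.
    static int clz64(long x) {
        int n = 0;
        while (n < 64 && (x & (0x8000000000000000L >>> n)) == 0) n++;
        return n;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int s : samples) {
            if (clz32(s) != Integer.numberOfLeadingZeros(s))
                throw new AssertionError("clz32 mismatch for " + s);
        }
        long[] lsamples = {0L, 1L, -1L, 1L << 40, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long s : lsamples) {
            if (clz64(s) != Long.numberOfLeadingZeros(s))
                throw new AssertionError("clz64 mismatch for " + s);
        }
        System.out.println("ok"); // prints "ok"
    }
}
```

Note the edge case: an input of zero must return the full width (32 or 64), which is exactly the kind of boundary a correctness test for the intrinsic needs to cover.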
Current "bmi" tests for the above are updated so that they run on both SPARC and x86 platforms. I've also implemented a test to ensure that Integer.countLeadingZeros() and Long.countLeadingZeros() return the correct values when C2 runs. This test is currently under the intrinsics "bmi" tests for want of somewhere better (they do apply to both SPARC and x86 though). http://cr.openjdk.java.net/~alanbur/8162865/ Thanks, Trevor From vladimir.kozlov at oracle.com Tue Nov 15 19:29:51 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Nov 2016 11:29:51 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> Message-ID: <582B622F.7030909@oracle.com> Hi Shafi, You should not backport tests which use only new JDK 9 APIs, like the TestUnsafeUnalignedMismatchedAccesses.java test. But it is perfectly fine to modify a backport by removing the parts of the changes which use a new API - for example, the 8162101 changes in the OpaqueAccesses.java test which use the getIntUnaligned() method. It is unfortunate that the 8140309 changes also include code which processes the new Unsafe unaligned intrinsics from JDK 9. It should not be backported, but it will simplify this and the following backports. So I agree with the changes you did for the 8140309 backport. Thanks, Vladimir On 11/14/16 10:34 PM, Shafi Ahmad wrote: > Hi Vladimir, > > Thanks for the review.
> >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Monday, November 14, 2016 11:20 PM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > >> > Hi Vladimir, > >> > > >> > Thanks for the review. > >> > > >> > Please find updated webrevs. > >> > > >> > All webrevs are with respect to the base changes on JDK-8140309. > >> >http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > >> > >> Why did you keep the unaligned parameter in the changes? > > The fix of JDK-8136473 caused many problems after integration (see JDK-8140267). > > The fix was backed out and re-implemented with JDK-8140309 by slightly changing the assert: > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-November/019696.html > > The code change for JDK-8140309 is the JDK-8136473 change with one assert slightly modified. > > The jdk9 original changeset is http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > > As this is a backport, I kept the changes as they are. > >> > >> The test TestUnsafeUnalignedMismatchedAccesses.java will not work > >> since the Unsafe class in jdk8 does not have the unaligned methods. > >> How did you run it? > > I am sorry, it looks like there is some issue with my testing. > > I ran the jtreg tests after merging the changes, but somehow the test did not run and I only checked the list of failing tests in the jtreg results.
> > $java -jar ~/Tools/jtreg/lib/jtreg.jar -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java > > Test results: failed: 1 > > Report written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTreport/html/report.html > > Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > > Error: > > /scratch/shshahma/Java/jdk8u-dev-8140309_01/hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: cannot find symbol > > UNSAFE.putIntUnaligned(array, UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > > Not sure if we should push without the test case. > >> > >> >http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > >> > >> Good. Did you run the new UnsafeAccess.java test? > > Due to the same process issue this test was not run either, and when I run it separately it fails. > > It passes after making the changes below: > > 1. Added /othervm > > 2. Replaced the import statement 'import jdk.internal.misc.Unsafe;' with 'import sun.misc.Unsafe;' > > Updated webrev: http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > >> > >> >http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > > I am getting a similar compilation error as above for the added test case. Not sure if we can push without the test case. > > Regards, > > Shafi > >> > >> Good. > >> > >> Thanks, > >> Vladimir > >> > >> > > >> > Regards, > >> > Shafi > >> > > >> > > >> > > >> >> -----Original Message----- > >> >> From: Vladimir Kozlov > >> >> Sent: Friday, November 11, 2016 1:26 AM > >> >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> >> mismatched unsafe accesses > >> >> > >> >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >> >>> Hi, > >> >>> > >> >>> Please review the backport of following dependent backports.
> >> >>> > >> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > >> >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >> >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. Manual merge is not done as the corresponding code is not > >> >> there in jdk8u-dev. > >> >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and > >> >>> manual > >> >> merge is done. > >> >>> webrev link: > >> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >> >> > >> >> unaligned unsafe access methods were added in jdk 9 only. In your > >> >> changes unaligned argument is always false. You can simplify changes. > >> >> > >> >> Also you should base changes on JDK-8140309 (original 8136473 changes > >> >> were backout by 8140267): > >> >> > >> >> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >> >> > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >> >> > > >> >> > Same as 8136473 with only the following change: > >> >> >
> >> >> > diff --git a/src/share/vm/opto/library_call.cpp b/src/share/vm/opto/library_call.cpp
> >> >> > --- a/src/share/vm/opto/library_call.cpp
> >> >> > +++ b/src/share/vm/opto/library_call.cpp
> >> >> > @@ -2527,7 +2527,7 @@
> >> >> >    // of safe & unsafe memory.
> >> >> >    if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder);
> >> >> >
> >> >> > -  assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> >> > +  assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> >> >           alias_type->field() != NULL || alias_type->element() != NULL, "field, array element or unknown");
> >> >> >    bool mismatched = false;
> >> >> >    if (alias_type->element() != NULL || alias_type->field() != NULL) {
> >> >> >
> >> >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >> >> is_native_ptr case and the case where the unsafe method is called with a > >> null object.
> >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >> >>> > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >> >>> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > >> >> [JDK-8140309]. Manual merge is not done as the corresponding code is > >> >> not there in jdk8u-dev. > >> >> > >> >> I explained situation with this line above. > >> >> > >> >>> webrev link: > >>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >> >> > >> >> This webrev is not incremental for your 8136473 changes - > >> >> library_call.cpp has part from 8136473 changes. > >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >> >>> > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 > >> >>> Clean merge > >> >>> webrev link: > >>http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >> >> > >> >> Thanks seems fine. > >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >> >>> > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >> >>> [JDK-8160360] - Resolved 2. > >> >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 > >> >> [JDK-8148146] - Manual merge is not done as the corresponding code is > >> >> not there in jdk8u-dev. > >> >>> webrev link: > >>http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >> >> > >> >> This webrev is not incremental in library_call.cpp. Difficult to see > >> >> this part of changes. 
> >> >> > >> >> Thanks, > >> >> Vladimir > >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >> >>> > >> >>> Testing: jprt and jtreg > >> >>> > >> >>> Regards, > >> >>> Shafi > >> >>> > >> >>>> -----Original Message----- > >> >>>> From: Shafi Ahmad > >> >>>> Sent: Thursday, October 20, 2016 10:08 AM > >> >>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >> >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >> >>>> produces mismatched unsafe accesses > >> >>>> > >> >>>> Thanks Vladimir. > >> >>>> > >> >>>> I will create dependent backport of 1. > >> >>>>https://bugs.openjdk.java.net/browse/JDK-8136473 > >> >>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > >> >>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > >> >>>> > >> >>>> Regards, > >> >>>> Shafi > >> >>>> > >> >>>>> -----Original Message----- > >> >>>>> From: Vladimir Kozlov > >> >>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >> >>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >> >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >> >>>>> produces mismatched unsafe accesses > >> >>>>> > >> >>>>> Hi Shafi, > >> >>>>> > >> >>>>> You should also consider backporting following related fixes: > >> >>>>> > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8155781 > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8162101 > >> >>>>> > >> >>>>> Otherwise you may hit asserts added by 8134918 changes. > >> >>>>> > >> >>>>> Thanks, > >> >>>>> Vladimir > >> >>>>> > >> >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >> >>>>>> Hi All, > >> >>>>>> > >> >>>>>> Please review the backport of JDK-8134918 - C2: Type speculation > >> >>>>>> produces > >> >>>>> mismatched unsafe accesses to jdk8u-dev. > >> >>>>>> > >> >>>>>> Please note that backport is not clean and the conflict is due to: > >> >>>>>> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. 
> >> >>>>>> 1 > >> >>>>>> 65 > >> >>>>>> > >> >>>>>> Getting debug build failure because of: > >> >>>>>> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >> >>>>>> 1 > >> >>>>>> 55 > >> >>>>>> > >> >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no > >> >>>>> mismatched stores, except on raw memory: StoreB StoreI' which is > >> >>>>> not back ported to jdk8u and the current backport is on top of > >> >>>>> above > >> >> change. > >> >>>>>> > >> >>>>>> Please note that I am not sure if there is any dependency > >> >>>>>> between these > >> >>>>> two changesets. > >> >>>>>> > >> >>>>>> open webrev: > >> >>>>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >> >>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >> >>>>>> jdk9 changeset: > >> >>>>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >> >>>>>> > >> >>>>>> testing: Passes JPRT, jtreg not completed > >> >>>>>> > >> >>>>>> Regards, > >> >>>>>> Shafi > >> >>>>>> > From kim.barrett at oracle.com Tue Nov 15 23:58:24 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 15 Nov 2016 18:58:24 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479205264.3251.13.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> Message-ID: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> > On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: 
> > Hi Kim, > > On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>> >>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >> om> wrote: >>> Maybe it would be cleaner to call a method in the barrier set >>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>> Maybe as an additional RFE. >> We could use _ct_bs->invalidate(dirtyRegion). That's rather >> overgeneralized and inefficient for this situation, but this >> situation should occur *very* rarely; it requires a stale card get >> processed just as a humongous object is in the midst of being >> allocated in the same region. > > I kind of think for these reasons we should use _ct_bs->invalidate() as > it seems clearer to me. There is the mentioned drawback of having no > other more efficient way, so I will let you decide about this. I've made the change to call invalidate, and also updated some comments. CR: https://bugs.openjdk.java.net/browse/JDK-8166607 Webrevs: full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ Also, see RFR: 8166811, where I've included a webrev combining the latest changes for 8166607 and 8166811, since they are rather intertwined. I think I'll do as Erik suggested and push the two together. 
From kim.barrett at oracle.com Wed Nov 16 00:00:02 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 15 Nov 2016 19:00:02 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1479205608.3251.18.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> Message-ID: <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> > On Nov 15, 2016, at 5:26 AM, Thomas Schatzl wrote: > > Hi, > > On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: >>> >>> On Nov 8, 2016, at 7:52 AM, Thomas Schatzl >> om> wrote: >>> On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: >>> - assuming this works due to other synchronization, >> This is the critical point. There *is* synchronization there. > > Okay, thanks. I just wanted to make sure that we are aware of that we > are using this other synchronization here. > > Thanks. Again I was mostly worried about noting this reliance on > previous synchronization down somewhere, even if it is only the mailing > list. > > It may be useful to note this in the code too. This would save the next > one working on this code looking through old mailing list threads. > > Maybe I am a bit overly concerned about making sure that these thoughts > are provided in the proper place though. Or maybe everyone thinks that > everything is clear :) I've updated some comments to mention that external synchronization. CR: https://bugs.openjdk.java.net/browse/JDK-8166811 Webrevs: full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ Also, since this set of changes is rather intertwined with the changes for 8166607, here is a combined webrev for both: http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ I think I'll do as Erik suggested and push the two together. 
From thomas.schatzl at oracle.com Wed Nov 16 09:06:54 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 16 Nov 2016 10:06:54 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> Message-ID: <1479287214.2466.35.camel@oracle.com> Hi Kim, On Tue, 2016-11-15 at 19:00 -0500, Kim Barrett wrote: > > > > On Nov 15, 2016, at 5:26 AM, Thomas Schatzl > com> wrote: > > > > Hi, > > > > On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: > > > > > > > > > > > > > > > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl > > > le.c > > > > om> wrote: > > > > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > > > > - assuming this works due to other synchronization, > > > This is the critical point.  There *is* synchronization there. > > Okay, thanks. I just wanted to make sure that we are aware of that > > we > > are using this other synchronization here. > > > > Thanks. Again I was mostly worried about noting this reliance on > > previous synchronization down somewhere, even if it is only the > > mailing > > list. > > > > It may be useful to note this in the code too. This would save the > > next > > one working on this code looking through old mailing list threads. > > > > Maybe I am a bit overly concerned about making sure that these > > thoughts are provided in the proper place though. Or maybe everyone > > thinks that everything is clear :) > I've updated some comments to mention that external synchronization.
581   // The region could be young.  Cards for young regions are set to
582   // g1_young_gen, so the post-barrier will filter them out.  However,
583   // that marking is performed concurrently.  A write to a young
584   // object could occur before the card has been marked young, slipping
585   // past the filter.

I would prefer if the text would not change terminology for the same thing mid-paragraph, from "setting" to "marking". The advantage of it reading better seems to be smaller than the potential confusion. Everything else looks very nice. Thanks for considering my comments. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166811 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ > incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ > > Also, since this set of changes is rather intertwined with the > changes > for 8166607, here is a combined webrev for both: > http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ > > I think I'll do as Erik suggested and push the two together. Just fyi, you can push two commits at once, or one commit having two CR-number lines. I think it is sufficient to commit these two changes in a single push job, but I do not see a need for making it a single commit. Either way is fine with me. Thanks,
Thomas From thomas.schatzl at oracle.com Wed Nov 16 09:21:27 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 16 Nov 2016 10:21:27 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: <1479288087.2466.36.camel@oracle.com> Hi Kim, On Tue, 2016-11-15 at 18:58 -0500, Kim Barrett wrote: > > > > On Nov 15, 2016, at 5:21 AM, Thomas Schatzl > com> wrote: > > > > Hi Kim, > > > > On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: > > > > > > > > > > > > > > > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl > > > le.c > > > > om> wrote: > > > > Maybe it would be cleaner to call a method in the barrier set > > > > instead of inlining the dirtying + enqueuing in lines 685 to > > > > 691? > > > > Maybe as an additional RFE. > > > We could use _ct_bs->invalidate(dirtyRegion).  That's rather > > > overgeneralized and inefficient for this situation, but this > > > situation should occur *very* rarely; it requires a stale card > > > get > > > processed just as a humongous object is in the midst of being > > > allocated in the same region. > > I kind of think for these reasons we should use _ct_bs- > > >invalidate() as > > it seems clearer to me.
There is the mentioned drawback of having > > no > > other more efficient way, so I will let you decide about this. > I've made the change to call invalidate, and also updated some > comments. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > thanks, looks good. Thomas From shafi.s.ahmad at oracle.com Wed Nov 16 12:52:24 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 16 Nov 2016 04:52:24 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <582B622F.7030909@oracle.com> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> Message-ID: <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Hi Vladimir, Thank you for the review and feedback. Please find updated webrevs: http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed the test case as it use only jdk9 APIs. http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed test methods testFixedOffsetHeaderArray17() and testFixedOffsetHeader17() which referenced jdk9 API UNSAFE.getIntUnaligned. Regards, Shafi > -----Original Message----- > From: Vladimir Kozlov > Sent: Wednesday, November 16, 2016 1:00 AM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Hi Shafi > > You should not backport tests which use only new JDK 9 APIs. Like > TestUnsafeUnalignedMismatchedAccesses.java test. > > But it is perfectly fine to modify backport by removing part of changes which > use a new API. For example, 8162101 changes in OpaqueAccesses.java test > which use getIntUnaligned() method.
> > It is unfortunate that 8140309 changes include also code which process new > Unsafe Unaligned intrinsics from JDK 9. It should not be backported but it will > simplify this and following backports. So I agree with changes you did for > 8140309 backport. > > Thanks, > Vladimir > > On 11/14/16 10:34 PM, Shafi Ahmad wrote: > > Hi Vladimir, > > > > Thanks for the review. > > > >> -----Original Message----- > > > >> From: Vladimir Kozlov > > > >> Sent: Monday, November 14, 2016 11:20 PM > > > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > > > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > > > >> mismatched unsafe accesses > > > >> > > > >> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > > > >> > Hi Vladimir, > > > >> > > > > >> > Thanks for the review. > > > >> > > > > >> > Please find updated webrevs. > > > >> > > > > >> > All webrevs are with respect to the base changes on JDK-8140309. > > > >> >http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > > > >> > > > >> Why you kept unaligned parameter in changes? > > > > The fix of JDK-8136473 caused many problems after integration (see JDK- > 8140267). > > > > The fix was backed out and re-implemented with JDK-8140309 by slightly > changing the assert: > > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- > Novem > > ber/019696.html > > > > The code change for the fix of JDK-8140309 is code changes for JDK-8136473 > by slightly changing one assert. > > > > jdk9 original changeset is > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > > > > As this is a backport so I keep the changes as it is. > > > >> > > > >> The test TestUnsafeUnalignedMismatchedAccesses.java will not work > >> since > > > >> since Unsafe class in jdk8 does not have unaligned methods. > > > >> Hot did you run it? > > > > I am sorry, looks there is some issue with my testing. 
> > > > I have run jtreg test after merging the changes but somehow the test does > not run and I verified only the failing list of jtreg result. > > > > When I run the test case separately it is failing as you already pointed out > the same. > > > > $java -jar ~/Tools/jtreg/lib/jtreg.jar > > -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ > > > hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedA > > ccesses.java > > > > Test results: failed: 1 > > > > Report written to > > /scratch/shshahma/Java/jdk8u-dev- > 8140309_01/JTreport/html/report.html > > > > Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > > > > Error: > > > > /scratch/shshahma/Java/jdk8u-dev- > 8140309_01/hotspot/test/compiler/intr > > insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: > > cannot find symbol > > > > UNSAFE.putIntUnaligned(array, > > UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > > > > Not sure if we should push without the test case. > > > >> > > > >> >http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > > > >> > > > >> Good. Did you run new UnsafeAccess.java test? > > > > Due to same process issue the test case is not run and when I run it > separately it fails. > > > > It passes after doing below changes: > > > > 1. Added /othervm > > > > 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by 'import > sun.misc.Unsafe;' > > > > Updated webrev: > > http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > > > >> > > > >> >http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > > > > I am getting the similar compilation error as above for added test case. Not > sure if we can push without the test case. > > > > Regards, > > > > Shafi > > > >> > > > >> Good. 
> > > >> > > > >> Thanks, > > > >> Vladimir > > > >> > > > >> > > > > >> > Regards, > > > >> > Shafi > > > >> > > > > >> > > > > >> > > > > >> >> -----Original Message----- > > > >> >> From: Vladimir Kozlov > > > >> >> Sent: Friday, November 11, 2016 1:26 AM > > > >> >> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >> >> > > > >> >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >> >> produces > > > >> >> mismatched unsafe accesses > > > >> >> > > > >> >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > > > >> >>> Hi, > > > >> >>> > > > >> >>> Please review the backport of following dependent backports. > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 > > > >> >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 > >> >>>[JDK- > > > >> >> 8080289]. Manual merge is not done as the corresponding code is > >> >> not > > > >> >> there in jdk8u-dev. > > > >> >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and > > > >> >>> manual > > > >> >> merge is done. > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > > > >> >> > > > >> >> unaligned unsafe access methods were added in jdk 9 only. In your > > > >> >> changes unaligned argument is always false. You can simplify changes. 
> > > >> >> > > > >> >> Also you should base changes on JDK-8140309 (original 8136473 > >> >> changes > > > >> >> were backout by 8140267): > > > >> >> > > > >> >> On 11/4/15 10:21 PM, Roland Westrelin wrote: > > > >> >> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > > > >> >> > > > > >> >> > Same as 8136473 with only the following change: > > > >> >> > > > > >> >> > diff --git a/src/share/vm/opto/library_call.cpp > > > >> >> b/src/share/vm/opto/library_call.cpp > > > >> >> > --- a/src/share/vm/opto/library_call.cpp > > > >> >> > +++ b/src/share/vm/opto/library_call.cpp > > > >> >> > @@ -2527,7 +2527,7 @@ > > > >> >> > // of safe & unsafe memory. > > > >> >> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > > > >> >> > > > > >> >> > - assert(is_native_ptr || alias_type->adr_type() == > > > >> >> TypeOopPtr::BOTTOM > > > >> >> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || > > > >> >> alias_type->adr_type() == TypeOopPtr::BOTTOM || > > > >> >> > alias_type->field() != NULL || alias_type->element() != > > > >> >> NULL, "field, array element or unknown"); > > > >> >> > bool mismatched = false; > > > >> >> > if (alias_type->element() != NULL || alias_type->field() != NULL) { > > > >> >> > > > > >> >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > > > >> >> is_native_ptr case and the case where the unsafe method is called > >> >> with a > > > >> null object. > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > > > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > > >> >>> > > > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > > > >> >> [JDK-8140309]. Manual merge is not done as the corresponding code > >> >> is > > > >> >> not there in jdk8u-dev. > > > >> >> > > > >> >> I explained situation with this line above. 
> > > >> >> > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > > >> >> > > > >> >> This webrev is not incremental for your 8136473 changes - > > > >> >> library_call.cpp has part from 8136473 changes. > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 > > > >> >>> Clean merge > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > > > >> >> > > > >> >> Thanks seems fine. > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 > > > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > > >> > >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > > > >> >>> [JDK-8160360] - Resolved 2. > > > >> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 > >> >>73 > > > >> >> [JDK-8148146] - Manual merge is not done as the corresponding code > >> >> is > > > >> >> not there in jdk8u-dev. > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > > > >> >> > > > >> >> This webrev is not incremental in library_call.cpp. Difficult to > >> >> see > > > >> >> this part of changes. 
> > > >> >> > > > >> >> Thanks, > > > >> >> Vladimir > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > > > >> >>> > > > >> >>> Testing: jprt and jtreg > > > >> >>> > > > >> >>> Regards, > > > >> >>> Shafi > > > >> >>> > > > >> >>>> -----Original Message----- > > > >> >>>> From: Shafi Ahmad > > > >> >>>> Sent: Thursday, October 20, 2016 10:08 AM > > > >> >>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >> >>>> > > > >> >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > > > >> >>>> produces mismatched unsafe accesses > > > >> >>>> > > > >> >>>> Thanks Vladimir. > > > >> >>>> > > > >> >>>> I will create dependent backport of 1. > > > >> >>>>https://bugs.openjdk.java.net/browse/JDK-8136473 > > > >> >>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > > > >> >>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > > > >> >>>> > > > >> >>>> Regards, > > > >> >>>> Shafi > > > >> >>>> > > > >> >>>>> -----Original Message----- > > > >> >>>>> From: Vladimir Kozlov > > > >> >>>>> Sent: Wednesday, October 19, 2016 8:27 AM > > > >> >>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >> >>>>> > > > >> >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > > > >> >>>>> produces mismatched unsafe accesses > > > >> >>>>> > > > >> >>>>> Hi Shafi, > > > >> >>>>> > > > >> >>>>> You should also consider backporting following related fixes: > > > >> >>>>> > > > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8155781 > > > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8162101 > > > >> >>>>> > > > >> >>>>> Otherwise you may hit asserts added by 8134918 changes. 
> > > >> >>>>> > > > >> >>>>> Thanks, > > > >> >>>>> Vladimir > > > >> >>>>> > > > >> >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > > > >> >>>>>> Hi All, > > > >> >>>>>> > > > >> >>>>>> Please review the backport of JDK-8134918 - C2: Type > >> >>>>>> speculation > > > >> >>>>>> produces > > > >> >>>>> mismatched unsafe accesses to jdk8u-dev. > > > >> >>>>>> > > > >> >>>>>> Please note that backport is not clean and the conflict is due to: > > > >> >>>>>> > > > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > > > >> >>>>>> 1 > > > >> >>>>>> 65 > > > >> >>>>>> > > > >> >>>>>> Getting debug build failure because of: > > > >> >>>>>> > > > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > > > >> >>>>>> 1 > > > >> >>>>>> 55 > > > >> >>>>>> > > > >> >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no > > > >> >>>>> mismatched stores, except on raw memory: StoreB StoreI' which > >> >>>>> is > > > >> >>>>> not back ported to jdk8u and the current backport is on top of > > > >> >>>>> above > > > >> >> change. > > > >> >>>>>> > > > >> >>>>>> Please note that I am not sure if there is any dependency > > > >> >>>>>> between these > > > >> >>>>> two changesets. 
> > > >> >>>>>> > > > >> >>>>>> open webrev: > > > >> >>>>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > > >> >>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > > > >> >>>>>> jdk9 changeset: > > > >> >>>>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > >> >>>>>> > > > >> >>>>>> testing: Passes JPRT, jtreg not completed > > > >> >>>>>> > > > >> >>>>>> Regards, > > > >> >>>>>> Shafi > > > >> >>>>>> > > From kevin.walls at oracle.com Wed Nov 16 15:57:10 2016 From: kevin.walls at oracle.com (Kevin Walls) Date: Wed, 16 Nov 2016 15:57:10 +0000 Subject: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 ciObjectFactory::create_new_metadata In-Reply-To: References: <2e1de7f0-cc65-47f7-9f97-cb0e56dacfe1@default> Message-ID: <970a44a7-ebbc-e04c-5891-875c93c0aa58@oracle.com> Hi Shafi - yes, backport looks good, Regards Kevin On 10/11/2016 07:10, Shafi Ahmad wrote: > Hi All, > > May I get the second review for this backport. > > Regards, > Shafi > >> -----Original Message----- >> From: Shafi Ahmad >> Sent: Tuesday, October 25, 2016 9:09 AM >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >> Cc: Vladimir Ivanov >> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 >> ciObjectFactory::create_new_metadata >> >> May I get the second review for this backport. >> >> Regards, >> Shafi >> >>> -----Original Message----- >>> From: Shafi Ahmad >>> Sent: Thursday, October 20, 2016 9:55 AM >>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >>> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with >>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata >>> >>> Thank you Vladimir for the review. >>> >>> Please find the updated webrev link. >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.01/ >>> >>> All, >>> >>> May I get 2nd review for this. 
>>> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Wednesday, October 19, 2016 10:14 PM >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>> Cc: Vladimir Ivanov; Jamsheed C M >>>> Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata >>>> >>>> In ciMethod.hpp you duplicated comment line: >>>> >>>> + // Given a certain calling environment, find the monomorphic >>>> + target >>>> // Given a certain calling environment, find the monomorphic >>>> target >>>> >>>> Otherwise looks good. >>>> >>>> Thanks, >>>> Vladimir K >>>> >>>> On 10/19/16 12:53 AM, Shafi Ahmad wrote: >>>>> Hi All, >>>>> >>>>> Please review the backport of 'JDK-8134389: Crash in HotSpot with >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata' to jdk8u-dev. >>>>> Please note that backport is not clean as I was getting build failure due >> to: >>>>> Formal parameter 'ignore_return' in method >>>>> GraphBuilder::method_return >>>> is added in the fix of https://bugs.openjdk.java.net/browse/JDK- >> 8164122. >>>>> The current code change is done on top of aforesaid bug fix and >>>>> this formal >>>> parameter is referenced in this code change. >>>>> * if (x != NULL && !ignore_return) { * >>>>> >>>>> Author of this code change suggested me, we can safely remove this >>>> addition conditional expression ' && !ignore_return'. 
>>>>> open webrev: >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.00/ >>>>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8134389 >>>>> jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs- >>>> comp/hotspot/rev/4191b33b3629 >>>>> testing: Passes JPRT, jtreg on Linux [amd64] and newly added test >>>>> case >>>>> >>>>> Regards, >>>>> Shafi >>>>> From shafi.s.ahmad at oracle.com Wed Nov 16 16:32:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 16 Nov 2016 08:32:42 -0800 (PST) Subject: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 ciObjectFactory::create_new_metadata In-Reply-To: <970a44a7-ebbc-e04c-5891-875c93c0aa58@oracle.com> References: <2e1de7f0-cc65-47f7-9f97-cb0e56dacfe1@default> <970a44a7-ebbc-e04c-5891-875c93c0aa58@oracle.com> Message-ID: <5e62d234-df62-44b5-826a-c041a002e548@default> Thank you Kevin for the review. Regards, Shafi > -----Original Message----- > From: Kevin Walls > Sent: Wednesday, November 16, 2016 9:27 PM > To: Shafi Ahmad; Vladimir Kozlov; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 > ciObjectFactory::create_new_metadata > > > Hi Shafi - yes, backport looks good, > > Regards > Kevin > > On 10/11/2016 07:10, Shafi Ahmad wrote: > > Hi All, > > > > May I get the second review for this backport. > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Shafi Ahmad > >> Sent: Tuesday, October 25, 2016 9:09 AM > >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >> Cc: Vladimir Ivanov > >> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with > >> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > >> > >> May I get the second review for this backport. 
> >> > >> Regards, > >> Shafi > >> > >>> -----Original Message----- > >>> From: Shafi Ahmad > >>> Sent: Thursday, October 20, 2016 9:55 AM > >>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >>> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with > >>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > >>> > >>> Thank you Vladimir for the review. > >>> > >>> Please find the updated webrev link. > >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.01/ > >>> > >>> All, > >>> > >>> May I get 2nd review for this. > >>> > >>> Regards, > >>> Shafi > >>> > >>>> -----Original Message----- > >>>> From: Vladimir Kozlov > >>>> Sent: Wednesday, October 19, 2016 10:14 PM > >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>> Cc: Vladimir Ivanov; Jamsheed C M > >>>> Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with > >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > >>>> > >>>> In ciMethod.hpp you duplicated comment line: > >>>> > >>>> + // Given a certain calling environment, find the monomorphic > >>>> + target > >>>> // Given a certain calling environment, find the monomorphic > >>>> target > >>>> > >>>> Otherwise looks good. > >>>> > >>>> Thanks, > >>>> Vladimir K > >>>> > >>>> On 10/19/16 12:53 AM, Shafi Ahmad wrote: > >>>>> Hi All, > >>>>> > >>>>> Please review the backport of 'JDK-8134389: Crash in HotSpot with > >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata' to jdk8u-dev. > >>>>> Please note that backport is not clean as I was getting build > >>>>> failure due > >> to: > >>>>> Formal parameter 'ignore_return' in method > >>>>> GraphBuilder::method_return > >>>> is added in the fix of https://bugs.openjdk.java.net/browse/JDK- > >> 8164122. > >>>>> The current code change is done on top of aforesaid bug fix and > >>>>> this formal > >>>> parameter is referenced in this code change. 
> >>>>> * if (x != NULL && !ignore_return) { * > >>>>> > >>>>> Author of this code change suggested me, we can safely remove this > >>>> addition conditional expression ' && !ignore_return'. > >>>>> open webrev: > >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.00/ > >>>>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8134389 > >>>>> jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs- > >>>> comp/hotspot/rev/4191b33b3629 > >>>>> testing: Passes JPRT, jtreg on Linux [amd64] and newly added test > >>>>> case > >>>>> > >>>>> Regards, > >>>>> Shafi > >>>>> > From david.buck at oracle.com Wed Nov 16 16:44:03 2016 From: david.buck at oracle.com (david buck) Date: Thu, 17 Nov 2016 01:44:03 +0900 Subject: RFR(S)[8u]: 8158639: C2 compilation fails with SIGSEGV In-Reply-To: References: Message-ID: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> (moving to hotspot-dev for more exposure.) Jamsheed, thanks once again reviewing my backport! Any reviewers out there willing to chime in? Cheers, -Buck -------- Forwarded Message -------- Subject: Re: RFR[8u]: 8158639: C2 compilation fails with SIGSEGV Date: Wed, 16 Nov 2016 21:48:10 +0530 From: Jamsheed C m Organization: Oracle Corporation To: david buck , hotspot-compiler-dev at openjdk.java.net Thanks for fixing. new webrev looks good to me (not a reviewer). Best Regards, Jamsheed On 11/16/2016 4:31 PM, david buck wrote: > Hi Jamsheed! > > Thank you for catching the mistake! I have modified the backport to > include the relevant change from 8072008 [0]. Here is an updated webrev: > > http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_02/ > > In the new chunk of code added, the only difference from the code in > JDK 9 is I had to add a call to err_msg() as JDK 8 does not have > variadic macro version of assert() [1]. > > I have reran all tests (both JPRT and manual) with no issues. 
> > Cheers, > -Buck > > [0] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9988b390777b > > [1] https://bugs.openjdk.java.net/browse/JDK-8080775 > > On 2016/11/16 16:29, Jamsheed C m wrote: >> Hi David, >> >> this change is missing >> >> JVMState* VirtualCallGenerator::generate(JVMState* jvms) { >> >> ... >> >> if (kit.gvn().type(receiver)->higher_equal(TypePtr::NULL_PTR)) { >> assert(Bytecodes::is_invoke(kit.java_bc()), "%d: %s", kit.java_bc(), >> Bytecodes::name(kit.java_bc())); >> ciMethod* declared_method = >> kit.method()->get_method_at_bci(kit.bci()); >> int arg_size = >> declared_method->signature()->arg_size_for_bc(kit.java_bc()); >> kit.inc_sp(arg_size); // restore arguments >> kit.uncommon_trap(Deoptimization::Reason_null_check, >> Deoptimization::Action_none, >> NULL, "null receiver"); >> >> >> Best Regards, >> >> Jamsheed >> >> >> On 11/15/2016 8:55 PM, david buck wrote: >>> Hi! >>> >>> Please review the backported changes of JDK-8158639 to 8u: >>> >>> It is a very straightforward backport. The only two differences are: >>> >>> - I added a convenience macro, get_method_at_bci(), from the change >>> for 8072008 to make the backport cleaner. >>> >>> - I had to modify (remove) the package used for the testcase. 
>>> >>> Bug Report: >>> [ 8158639: C2 compilation fails with SIGSEGV ] >>> https://bugs.openjdk.java.net/browse/JDK-8158639 >>> >>> JDK 9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/119a2a3cc29b >>> >>> 8u-dev Webrev: >>> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_01/ >>> >>> Testing: >>> Manual verification and JPRT (default and hotspot testsets) >>> >>> Cheers, >>> -Buck >> From vladimir.kozlov at oracle.com Wed Nov 16 16:51:09 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 16 Nov 2016 08:51:09 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <4332d26a-0efa-4582-9068-f28fb7ebd109@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Message-ID: Looks good. I would suggest to run all jtreg tests (or even RBT) when you apply all changes before pushing this. Thanks, Vladimir On 11/16/16 4:52 AM, Shafi Ahmad wrote: > Hi Vladimir, > > Thank you for the review and feedback. > > Please find updated webrevs: > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed the test case as it use only jdk9 APIs. > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed test methods testFixedOffsetHeaderArray17() and testFixedOffsetHeader17() which referenced jdk9 API UNSAFE.getIntUnaligned. > > > Regards, > Shafi > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Wednesday, November 16, 2016 1:00 AM >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> Hi Shafi >> >> You should not backport tests which use only new JDK 9 APIs. Like >> TestUnsafeUnalignedMismatchedAccesses.java test. 
>> >> But it is perfectly fine to modify backport by removing part of changes which >> use a new API. For example, 8162101 changes in OpaqueAccesses.java test >> which use getIntUnaligned() method. >> >> It is unfortunate that 8140309 changes include also code which process new >> Unsafe Unaligned intrinsics from JDK 9. It should not be backported but it will >> simplify this and following backports. So I agree with changes you did for >> 8140309 backport. >> >> Thanks, >> Vladimir >> >> On 11/14/16 10:34 PM, Shafi Ahmad wrote: >>> Hi Vladimir, >>> >>> Thanks for the review. >>> >>>> -----Original Message----- >>> >>>> From: Vladimir Kozlov >>> >>>> Sent: Monday, November 14, 2016 11:20 PM >>> >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>> >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >>> >>>> mismatched unsafe accesses >>> >>>> >>> >>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: >>> >>>>> Hi Vladimir, >>> >>>>> >>> >>>>> Thanks for the review. >>> >>>>> >>> >>>>> Please find updated webrevs. >>> >>>>> >>> >>>>> All webrevs are with respect to the base changes on JDK-8140309. >>> >>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ >>> >>>> >>> >>>> Why you kept unaligned parameter in changes? >>> >>> The fix of JDK-8136473 caused many problems after integration (see JDK- >> 8140267). >>> >>> The fix was backed out and re-implemented with JDK-8140309 by slightly >> changing the assert: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- >> Novem >>> ber/019696.html >>> >>> The code change for the fix of JDK-8140309 is code changes for JDK-8136473 >> by slightly changing one assert. >>> >>> jdk9 original changeset is >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c >>> >>> As this is a backport so I keep the changes as it is. 
>>> >>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not work >>> >>>> since Unsafe class in jdk8 does not have unaligned methods. >>> >>>> How did you run it? >>> >>> I am sorry, looks there is some issue with my testing. >>> >>> I have run jtreg test after merging the changes but somehow the test does >> not run and I verified only the failing list of jtreg result. >>> >>> When I run the test case separately it is failing as you already pointed out >> the same. >>> >>> $java -jar ~/Tools/jtreg/lib/jtreg.jar >>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ >>> >> hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedA >>> ccesses.java >>> >>> Test results: failed: 1 >>> >>> Report written to >>> /scratch/shshahma/Java/jdk8u-dev- >> 8140309_01/JTreport/html/report.html >>> >>> Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork >>> >>> Error: >>> >>> /scratch/shshahma/Java/jdk8u-dev- >> 8140309_01/hotspot/test/compiler/intr >>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: >>> cannot find symbol >>> >>> UNSAFE.putIntUnaligned(array, >>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); >>> >>> Not sure if we should push without the test case. >>> >>>> >>> >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ >>> >>>> >>> >>>> Good. Did you run new UnsafeAccess.java test? >>> >>> Due to same process issue the test case is not run and when I run it >> separately it fails. >>> >>> It passes after doing below changes: >>> >>> 1. Added /othervm >>> >>> 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by 'import >> sun.misc.Unsafe;' >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ >>> >>>> >>> >>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ >>> >>> I am getting a similar compilation error as above for the added test case. Not >> sure if we can push without the test case.
>>> >>> Regards, >>> >>> Shafi >>> >>>> >>> >>>> Good. >>> >>>> >>> >>>> Thanks, >>> >>>> Vladimir >>> >>>> >>> >>>>> >>> >>>>> Regards, >>> >>>>> Shafi >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>>> -----Original Message----- >>> >>>>>> From: Vladimir Kozlov >>> >>>>>> Sent: Friday, November 11, 2016 1:26 AM >>> >>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>> >>> >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>>> produces >>> >>>>>> mismatched unsafe accesses >>> >>>>>> >>> >>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: >>> >>>>>>> Hi, >>> >>>>>>> >>> >>>>>>> Please review the backport of following dependent backports. >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 >>> >>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 >>>>>>> [JDK- >>> >>>>>> 8080289]. Manual merge is not done as the corresponding code is >>>>>> not >>> >>>>>> there in jdk8u-dev. >>> >>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and >>> >>>>>>> manual >>> >>>>>> merge is done. >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ >>> >>>>>> >>> >>>>>> unaligned unsafe access methods were added in jdk 9 only. In your >>> >>>>>> changes unaligned argument is always false. You can simplify changes. 
>>> >>>>>> >>> >>>>>> Also you should base changes on JDK-8140309 (original 8136473 >>>>>> changes >>> >>>>>> were backout by 8140267): >>> >>>>>> >>> >>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: >>> >>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ >>> >>>>>> > >>> >>>>>> > Same as 8136473 with only the following change: >>> >>>>>> > >>> >>>>>> > diff --git a/src/share/vm/opto/library_call.cpp >>> >>>>>> b/src/share/vm/opto/library_call.cpp >>> >>>>>> > --- a/src/share/vm/opto/library_call.cpp >>> >>>>>> > +++ b/src/share/vm/opto/library_call.cpp >>> >>>>>> > @@ -2527,7 +2527,7 @@ >>> >>>>>> > // of safe & unsafe memory. >>> >>>>>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); >>> >>>>>> > >>> >>>>>> > - assert(is_native_ptr || alias_type->adr_type() == >>> >>>>>> TypeOopPtr::BOTTOM >>> >>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || >>> >>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || >>> >>>>>> > alias_type->field() != NULL || alias_type->element() != >>> >>>>>> NULL, "field, array element or unknown"); >>> >>>>>> > bool mismatched = false; >>> >>>>>> > if (alias_type->element() != NULL || alias_type->field() != NULL) { >>> >>>>>> > >>> >>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the >>> >>>>>> is_native_ptr case and the case where the unsafe method is called >>>>>> with a >>> >>>> null object. >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>> >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> >>>>>>> >>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 >>> >>>>>> [JDK-8140309]. Manual merge is not done as the corresponding code >>>>>> is >>> >>>>>> not there in jdk8u-dev. >>> >>>>>> >>> >>>>>> I explained situation with this line above. 
>>> >>>>>> >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>> >>>>>> >>> >>>>>> This webrev is not incremental for your 8136473 changes - >>> >>>>>> library_call.cpp has part from 8136473 changes. >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 >>> >>>>>>> Clean merge >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ >>> >>>>>> >>> >>>>>> Thanks seems fine. >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> >>>> >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 >>> >>>>>>> [JDK-8160360] - Resolved 2. >>> >>>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 >>>>>> 73 >>> >>>>>> [JDK-8148146] - Manual merge is not done as the corresponding code >>>>>> is >>> >>>>>> not there in jdk8u-dev. >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ >>> >>>>>> >>> >>>>>> This webrev is not incremental in library_call.cpp. Difficult to >>>>>> see >>> >>>>>> this part of changes. 
>>> >>>>>> >>> >>>>>> Thanks, >>> >>>>>> Vladimir >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 >>> >>>>>>> >>> >>>>>>> Testing: jprt and jtreg >>> >>>>>>> >>> >>>>>>> Regards, >>> >>>>>>> Shafi >>> >>>>>>> >>> >>>>>>>> -----Original Message----- >>> >>>>>>>> From: Shafi Ahmad >>> >>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM >>> >>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net >>>>>>>> >>> >>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation >>> >>>>>>>> produces mismatched unsafe accesses >>> >>>>>>>> >>> >>>>>>>> Thanks Vladimir. >>> >>>>>>>> >>> >>>>>>>> I will create dependent backport of 1. >>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 >>> >>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 >>> >>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>>>>>>> >>> >>>>>>>> Regards, >>> >>>>>>>> Shafi >>> >>>>>>>> >>> >>>>>>>>> -----Original Message----- >>> >>>>>>>>> From: Vladimir Kozlov >>> >>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM >>> >>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>>>>> >>> >>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>> >>>>>>>>> produces mismatched unsafe accesses >>> >>>>>>>>> >>> >>>>>>>>> Hi Shafi, >>> >>>>>>>>> >>> >>>>>>>>> You should also consider backporting following related fixes: >>> >>>>>>>>> >>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>>>>>>>> >>> >>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. >>> >>>>>>>>> >>> >>>>>>>>> Thanks, >>> >>>>>>>>> Vladimir >>> >>>>>>>>> >>> >>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>> >>>>>>>>>> Hi All, >>> >>>>>>>>>> >>> >>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type >>>>>>>>>> speculation >>> >>>>>>>>>> produces >>> >>>>>>>>> mismatched unsafe accesses to jdk8u-dev. 
>>> >>>>>>>>>> >>> >>>>>>>>>> Please note that backport is not clean and the conflict is due to: >>> >>>>>>>>>> >>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>> >>>>>>>>>> 1 >>> >>>>>>>>>> 65 >>> >>>>>>>>>> >>> >>>>>>>>>> Getting debug build failure because of: >>> >>>>>>>>>> >>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>> >>>>>>>>>> 1 >>> >>>>>>>>>> 55 >>> >>>>>>>>>> >>> >>>>>>>>>> The above changes are done under bug# 'JDK-8136473: failed: no >>> >>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' which >>>>>>>>> is >>> >>>>>>>>> not back ported to jdk8u and the current backport is on top of >>> >>>>>>>>> above >>> >>>>>> change. >>> >>>>>>>>>> >>> >>>>>>>>>> Please note that I am not sure if there is any dependency >>> >>>>>>>>>> between these >>> >>>>>>>>> two changesets. >>> >>>>>>>>>> >>> >>>>>>>>>> open webrev: >>> >>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>> >>>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>> >>>>>>>>>> jdk9 changeset: >>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>> >>>>>>>>>> >>> >>>>>>>>>> testing: Passes JPRT, jtreg not completed >>> >>>>>>>>>> >>> >>>>>>>>>> Regards, >>> >>>>>>>>>> Shafi >>> >>>>>>>>>> >>> From kim.barrett at oracle.com Wed Nov 16 17:28:09 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Nov 2016 12:28:09 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479288087.2466.36.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> 
<36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> <1479288087.2466.36.camel@oracle.com> Message-ID: <47742160-0A06-48C9-BDBD-76F453C33A68@oracle.com> > On Nov 16, 2016, at 4:21 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Tue, 2016-11-15 at 18:58 -0500, Kim Barrett wrote: >>> >>> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl >> com> wrote: >>> >>> Hi Kim, >>> >>> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>> >>>>> >>>>> >>>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>>> le.c >>>>> om> wrote: >>>>> Maybe it would be cleaner to call a method in the barrier set >>>>> instead of inlining the dirtying + enqueuing in lines 685 to >>>>> 691? >>>>> Maybe as an additional RFE. >>>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>>> overgeneralized and inefficient for this situation, but this >>>> situation should occur *very* rarely; it requires a stale card >>>> get >>>> processed just as a humongous object is in the midst of being >>>> allocated in the same region. >>> I kind of think for these reasons we should use _ct_bs- >>>> invalidate() as >>> it seems clearer to me. There is the mentioned drawback of having >>> no >>> other more efficient way, so I will let you decide about this. >> I've made the change to call invalidate, and also updated some >> comments. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166607 >> >> Webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ >> incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ >> > > thanks, looks good. > > Thomas Thanks. 
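The klass_or_null_acquire pattern reviewed in this thread is an instance of the general safe-publication idiom: an object's body is initialized with plain stores, then its klass is published with a release store, so a concurrent reader (here, the refinement thread processing a possibly stale card) must acquire-load the klass and treat null as "not yet published". A minimal, standalone Java sketch of that idiom follows; all names are illustrative and this is not HotSpot code.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Standalone analogue of the klass_or_null_acquire pattern: the
// "allocator" initializes the object body with plain stores, then
// publishes it by release-storing the klass; the "refiner" acquire-loads
// the klass and skips the object (stale card) while it is still null.
public class KlassPublishSketch {
    static int payload;    // object body, written with an ordinary store
    static Object klass;   // publication point, accessed via VarHandle

    static final VarHandle KLASS;
    static {
        try {
            KLASS = MethodHandles.lookup()
                    .findStaticVarHandle(KlassPublishSketch.class, "klass", Object.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void allocate() {
        payload = 42;                  // plain store: initialize the body
        KLASS.setRelease("DemoKlass"); // release store: publish the object
    }

    static Object refine() {
        Object k = KLASS.getAcquire(); // acquire load pairs with the release store
        if (k == null) {
            return null;               // stale card: object not published yet
        }
        // Once the klass is visible, acquire/release ordering guarantees
        // the body stores are visible too.
        if (payload != 42) throw new AssertionError("init not visible");
        return k;
    }

    public static void main(String[] args) {
        if (refine() != null) throw new AssertionError("published too early");
        allocate();
        if (!"DemoKlass".equals(refine())) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The sketch runs single-threaded for determinism; the comments describe the concurrent intent that the acquire/release pair exists to serve.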
From kim.barrett at oracle.com Wed Nov 16 18:02:07 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Nov 2016 13:02:07 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1479287214.2466.35.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> <1479287214.2466.35.camel@oracle.com> Message-ID: <1A54AB8B-C2A8-4F17-BC98-E76FE815A009@oracle.com> > On Nov 16, 2016, at 4:06 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Tue, 2016-11-15 at 19:00 -0500, Kim Barrett wrote: >> I've updated some comments to mention that external synchronization. > > 581 // The region could be young. Cards for young regions are set > to > 582 // g1_young_gen, so the post-barrier will filter them > out. However, > 583 // that marking is performed concurrently. A write to a young > 584 // object could occur before the card has been marked young, > slipping > 585 // past the filter. > > I would prefer if the text would not change terminology for the same > thing mid-paragraph, from "setting" to "marking". The advantage of it > reading better seems to be smaller than the potential confusion. // The region could be young. Cards for young regions are // distinctly marked (set to g1_young_gen), so the post-barrier will // filter them out. However, that marking is performed // concurrently. A write to a young object could occur before the // card has been marked young, slipping past the filter. Better? > > Everything else looks very nice. > > Thanks for considering my comments. Thanks, and thank you for reviewing so carefully. 
>> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166811 >> >> Webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ >> incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ >> >> Also, since this set of changes is rather intertwined with the >> changes >> for 8166607, here is a combined webrev for both: >> http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ >> >> I think I'll do as Erik suggested and push the two together. > > Just fyi, you can push two commits at once, or one commit having two > CR-number lines. > I think it is sufficient to commit these two changes in a single push > job, but I do not see a need for making it a single commit. > > Either way is fine with me. Perhaps I sowed confusion with the combined webrev. The purpose of that was to make it easy to see the combined effect of the two changes. I'm planning to do one push of two change sets. > > Thanks, > Thomas From gromero at linux.vnet.ibm.com Thu Nov 17 01:45:50 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 16 Nov 2016 23:45:50 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation Message-ID: <582D0BCE.2030209@linux.vnet.ibm.com> Hi, Currently, optimization for building fdlibm is disabled, except for the "solaris" OS target [1].
As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, sin(), cos(), and tan() perform very poorly in comparison to the same methods in Math class [2]: Math StrictMath ========= ========== sin 0m29.984s 1m41.184s cos 0m30.031s 1m41.200s tan 0m31.772s 1m46.976s asin 0m4.577s 0m4.543s acos 0m4.539s 0m4.525s atan 0m12.929s 0m12.896s exp 0m1.071s 0m4.570s log 0m3.272s 0m14.239s log10 0m4.362s 0m20.236s sqrt 0m0.913s 0m0.981s cbrt 0m10.786s 0m10.808s sinh 0m4.438s 0m4.433s cosh 0m4.496s 0m4.478s tanh 0m3.360s 0m3.353s expm1 0m4.076s 0m4.094s log1p 0m13.518s 0m13.527s IEEEremainder 0m38.803s 0m38.909s atan2 0m20.100s 0m20.057s pow 0m14.096s 0m19.938s hypot 0m5.136s 0m5.122s Switching on the -O3 optimization can damage the precision of those methods; nonetheless it's possible to avoid that side effect and yet get huge benefits from the -O3 optimization on PPC64 if -fno-expensive-optimizations is passed in addition to the -O3 optimization flag. In that sense the following change is proposed to resolve the issue: diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 @@ -33,10 +33,16 @@ # libfdlibm is statically linked with libjava below and not delivered into the # product on its own.
-BUILD_LIBFDLIBM_OPTIMIZATION := HIGH +BUILD_LIBFDLIBM_OPTIMIZATION := NONE -ifneq ($(OPENJDK_TARGET_OS), solaris) - BUILD_LIBFDLIBM_OPTIMIZATION := NONE +ifeq ($(OPENJDK_TARGET_OS), solaris) + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH +endif + +ifeq ($(OPENJDK_TARGET_OS), linux) + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH + endif endif LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm @@ -51,6 +57,7 @@ CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ CFLAGS_windows_debug := -DLOGGING, \ CFLAGS_aix := -qfloat=nomaf, \ + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ DISABLED_WARNINGS_gcc := sign-compare, \ DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ ARFLAGS := $(ARFLAGS), \ diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100 +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500 @@ -569,16 +569,19 @@ $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES)) # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS. 
- $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) ifneq ($(DEBUG_LEVEL),release) # Pickup extra debug dependent variables for CFLAGS $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) else $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) endif # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS. After enabling the optimization it's possible to gain up to 3x in performance for the aforementioned methods without losing precision: StrictMath, original StrictMath, optimized ============================ ============================ sin 1.7136493465700542 1m41.184s 1.7136493465700542 0m33.895s cos 0.1709843554185943 1m41.200s 0.1709843554185943 0m33.884s tan -5.5500322522995315E7 1m46.976s -5.5500322522995315E7 0m36.461s asin NaN 0m4.543s NaN 0m3.175s acos NaN 0m4.525s NaN 0m3.211s atan 1.5707961389886132E8 0m12.896s 1.5707961389886132E8 0m7.100s exp Infinity 0m4.570s Infinity 0m3.187s log 1.7420680845245087E9 0m14.239s 1.7420680845245087E9 0m7.170s log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 0m9.610s sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 0m0.948s cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 0m10.786s sinh Infinity 0m4.433s Infinity 0m3.179s cosh Infinity 0m4.478s Infinity 0m3.174s tanh 9.999999971990079E7 0m3.353s 9.999999971990079E7 0m3.208s expm1 Infinity 0m4.094s Infinity 0m3.185s log1p
1.7420681029451895E9 0m13.527s 1.7420681029451895E9 0m8.756s IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 0m10.510s pow Infinity 0m19.938s Infinity 0m20.204s hypot 5.000000099033372E15 0m5.122s 5.000000099033372E15 0m5.130s I believe that, as FC has passed but FEC has not, the change can, after due scrutiny and review, be pushed if a special exception approval grants it. Once on 9, I'll request the downport to 8. Could I open a bug to address that issue? Thank you very much. Regards, Gustavo [1] http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39 [2] https://github.com/gromero/strictmath (comparison script used to get the results) From david.holmes at oracle.com Thu Nov 17 02:31:48 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Nov 2016 12:31:48 +1000 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <582D0BCE.2030209@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> Message-ID: <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> Adding in build-dev as they need to scrutinize all build changes. David On 17/11/2016 11:45 AM, Gustavo Romero wrote: > Hi, > > Currently, optimization for building fdlibm is disabled, except for the > "solaris" OS target [1].
> > As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, > sin(), cos(), and tan() perform verify poor in comparison to the same methods > in Math class [2]: > > Math StrictMath > ========= ========== > sin 0m29.984s 1m41.184s > cos 0m30.031s 1m41.200s > tan 0m31.772s 1m46.976s > asin 0m4.577s 0m4.543s > acos 0m4.539s 0m4.525s > atan 0m12.929s 0m12.896s > exp 0m1.071s 0m4.570s > log 0m3.272s 0m14.239s > log10 0m4.362s 0m20.236s > sqrt 0m0.913s 0m0.981s > cbrt 0m10.786s 0m10.808s > sinh 0m4.438s 0m4.433s > cosh 0m4.496s 0m4.478s > tanh 0m3.360s 0m3.353s > expm1 0m4.076s 0m4.094s > log1p 0m13.518s 0m13.527s > IEEEremainder 0m38.803s 0m38.909s > atan2 0m20.100s 0m20.057s > pow 0m14.096s 0m19.938s > hypot 0m5.136s 0m5.122s > > > Switching on the O3 optimization can damage precision of those methods, > nonetheless it's possible to avoid that side effect and yet get huge benefits of > the -O3 optimization on PPC64 if -fno-expensive-optimizations is passed in > addition to the -O3 optimization flag. > > In that sense the following change is proposed to resolve the issue: > > diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk > --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 > +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 > @@ -33,10 +33,16 @@ > # libfdlibm is statically linked with libjava below and not delivered into the > # product on its own. 
> > -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +BUILD_LIBFDLIBM_OPTIMIZATION := NONE > > -ifneq ($(OPENJDK_TARGET_OS), solaris) > - BUILD_LIBFDLIBM_OPTIMIZATION := NONE > +ifeq ($(OPENJDK_TARGET_OS), solaris) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +endif > + > +ifeq ($(OPENJDK_TARGET_OS), linux) > + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > + endif > endif > > LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm > @@ -51,6 +57,7 @@ > CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ > CFLAGS_windows_debug := -DLOGGING, \ > CFLAGS_aix := -qfloat=nomaf, \ > + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ > DISABLED_WARNINGS_gcc := sign-compare, \ > DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ > ARFLAGS := $(ARFLAGS), \ > > > diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk > --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100 > +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500 > @@ -569,16 +569,19 @@ > $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES)) > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS. 
> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) > + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ > + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) > ifneq ($(DEBUG_LEVEL),release) > # Pickup extra debug dependent variables for CFLAGS > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) > else > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) > endif > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS. > > > After enabling the optimization it's possible to again up to 3x on performance > regarding the aforementioned methods without losing precision: > > StrictMath, original StrictMath, optimized > ============================ ============================ > sin 1.7136493465700542 1m41.184s 1.7136493465700542 0m33.895s > cos 0.1709843554185943 1m41.200s 0.1709843554185943 0m33.884s > tan -5.5500322522995315E7 1m46.976s -5.5500322522995315E7 0m36.461s > asin NaN 0m4.543s NaN 0m3.175s > acos NaN 0m4.525s NaN 0m3.211s > atan 1.5707961389886132E8 0m12.896s 1.5707961389886132E8 0m7.100s > exp Infinity 0m4.570s Infinity 0m3.187s > log 1.7420680845245087E9 0m14.239s 1.7420680845245087E9 0m7.170s > log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 0m9.610s > sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 0m0.948s > cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 0m10.786s > sinh Infinity 0m4.433s Infinity 0m3.179s > cosh Infinity 0m4.478s Infinity 0m3.174s > tanh 9.999999971990079E7 0m3.353s 
9.999999971990079E7 0m3.208s > expm1 Infinity 0m4.094s Infinity 0m3.185s > log1p 1.7420681029451895E9 0m13.527s 1.7420681029451895E9 0m8.756s > IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s > atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 0m10.510s > pow Infinity 0m19.938s Infinity 0m20.204s > hypot 5.000000099033372E15 0m5.122s 5.000000099033372E15 0m5.130s > > > I believe that as the FC is passed but FEC is not the change can, after the due > scrutiny and review, be pushed if a special exception approval grants it. Once > on 9, I'll request the downport to 8. > > Could I open a bug to address that issue? > > Thank you very much. > > > Regards, > Gustavo > > [1] http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39 > [2] https://github.com/gromero/strictmath (comparison script used to get the results) > From erik.joelsson at oracle.com Thu Nov 17 09:17:33 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Thu, 17 Nov 2016 10:17:33 +0100 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> Message-ID: <9dea2dbf-4413-c03e-1cd6-8aceb0e263a0@oracle.com> Hello, Overall this looks reasonable to me. However, if we want to introduce a new possible tuple for specifying compilation flags to SetupNativeCompilation, we (the build team) would prefer if we used OPENJDK_TARGET_CPU instead of OPENJDK_TARGET_CPU_ARCH. /Erik On 2016-11-17 03:31, David Holmes wrote: > Adding in build-dev as they need to scrutinize all build changes. > > David > > On 17/11/2016 11:45 AM, Gustavo Romero wrote: >> Hi, >> >> Currently, optimization for building fdlibm is disabled, except for the >> "solaris" OS target [1]. 
>> >> As a consequence on PPC64 (Linux) StrictMath methods like, but not >> limited to, >> sin(), cos(), and tan() perform verify poor in comparison to the same >> methods >> in Math class [2]: >> >> Math StrictMath >> ========= ========== >> sin 0m29.984s 1m41.184s >> cos 0m30.031s 1m41.200s >> tan 0m31.772s 1m46.976s >> asin 0m4.577s 0m4.543s >> acos 0m4.539s 0m4.525s >> atan 0m12.929s 0m12.896s >> exp 0m1.071s 0m4.570s >> log 0m3.272s 0m14.239s >> log10 0m4.362s 0m20.236s >> sqrt 0m0.913s 0m0.981s >> cbrt 0m10.786s 0m10.808s >> sinh 0m4.438s 0m4.433s >> cosh 0m4.496s 0m4.478s >> tanh 0m3.360s 0m3.353s >> expm1 0m4.076s 0m4.094s >> log1p 0m13.518s 0m13.527s >> IEEEremainder 0m38.803s 0m38.909s >> atan2 0m20.100s 0m20.057s >> pow 0m14.096s 0m19.938s >> hypot 0m5.136s 0m5.122s >> >> >> Switching on the O3 optimization can damage precision of those methods, >> nonetheless it's possible to avoid that side effect and yet get huge >> benefits of >> the -O3 optimization on PPC64 if -fno-expensive-optimizations is >> passed in >> addition to the -O3 optimization flag. >> >> In that sense the following change is proposed to resolve the issue: >> >> diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk >> --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 >> +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 >> @@ -33,10 +33,16 @@ >> # libfdlibm is statically linked with libjava below and not >> delivered into the >> # product on its own. 
>> >> -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH >> +BUILD_LIBFDLIBM_OPTIMIZATION := NONE >> >> -ifneq ($(OPENJDK_TARGET_OS), solaris) >> - BUILD_LIBFDLIBM_OPTIMIZATION := NONE >> +ifeq ($(OPENJDK_TARGET_OS), solaris) >> + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH >> +endif >> + >> +ifeq ($(OPENJDK_TARGET_OS), linux) >> + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) >> + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH >> + endif >> endif >> >> LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm >> @@ -51,6 +57,7 @@ >> CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ >> CFLAGS_windows_debug := -DLOGGING, \ >> CFLAGS_aix := -qfloat=nomaf, \ >> + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ >> DISABLED_WARNINGS_gcc := sign-compare, \ >> DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ >> ARFLAGS := $(ARFLAGS), \ >> >> >> diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk >> --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 >> +0100 >> +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 >> -0500 >> @@ -569,16 +569,19 @@ >> $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) >> $$($1_EXTRA_OBJECT_FILES)) >> >> # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS >> dependent variables for CFLAGS. 
>> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) >> $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) >> + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) >> $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ >> + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) >> ifneq ($(DEBUG_LEVEL),release) >> # Pickup extra debug dependent variables for CFLAGS >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) >> + >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) >> else >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) >> + >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) >> endif >> >> # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS >> dependent variables for CXXFLAGS. 
>> >> >> After enabling the optimization it's possible to gain up to 3x in >> performance >> for the aforementioned methods without losing precision: >> >> StrictMath, original StrictMath, optimized >> ============================ >> ============================ >> sin 1.7136493465700542 1m41.184s 1.7136493465700542 >> 0m33.895s >> cos 0.1709843554185943 1m41.200s 0.1709843554185943 >> 0m33.884s >> tan -5.5500322522995315E7 1m46.976s >> -5.5500322522995315E7 0m36.461s >> asin NaN 0m4.543s >> NaN 0m3.175s >> acos NaN 0m4.525s >> NaN 0m3.211s >> atan 1.5707961389886132E8 0m12.896s >> 1.5707961389886132E8 0m7.100s >> exp Infinity 0m4.570s Infinity 0m3.187s >> log 1.7420680845245087E9 0m14.239s >> 1.7420680845245087E9 0m7.170s >> log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 >> 0m9.610s >> sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 >> 0m0.948s >> cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 >> 0m10.786s >> sinh Infinity 0m4.433s Infinity 0m3.179s >> cosh Infinity 0m4.478s Infinity 0m3.174s >> tanh 9.999999971990079E7 0m3.353s 9.999999971990079E7 >> 0m3.208s >> expm1 Infinity 0m4.094s Infinity 0m3.185s >> log1p 1.7420681029451895E9 0m13.527s >> 1.7420681029451895E9 0m8.756s >> IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s >> atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 >> 0m10.510s >> pow Infinity 0m19.938s Infinity 0m20.204s >> hypot 5.000000099033372E15 0m5.122s >> 5.000000099033372E15 0m5.130s >> >> >> I believe that, as the FC is passed but the FEC is not, the change can, >> after the due >> scrutiny and review, be pushed if a special exception approval grants >> it. Once >> on 9, I'll request the downport to 8. >> >> Could I open a bug to address that issue? >> >> Thank you very much. 
>> >> Regards, >> Gustavo >> >> [1] >> http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39 >> [2] https://github.com/gromero/strictmath (comparison script used to >> get the results) >> From thomas.schatzl at oracle.com Thu Nov 17 11:28:06 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Nov 2016 12:28:06 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> Message-ID: <1479382086.2891.24.camel@oracle.com> Hi Kim, while unconsciously dwelling on the issue I think there is one unanswered question: On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: > > > > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl > om> wrote: > > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > > > > > > There is still a situation where processing can fail, namely an > > > in-progress humongous allocation that hasn't set the klass > > > yet. We > > > continue to handle that as before. > > - I am not completely sure about whether this case is handled > > correctly. I am mostly concerned that the information used before > > the > > fence may not be the correct ones, but the checks expect them to be > > valid. > > > > Probably I am overlooking something critical somewhere. > > > > A: allocates humongous object C, sets region type, issues > > storestore, sets top pointers, writes the object, and then sets C.y > > = x to mark a > > card > > > > Refinement: gets card (and assuming we have no further > > synchronization > > around which is not true, e.g. the enqueuing) > > > > 592   if (!r->is_old_or_humongous()) { > > > > assume refinement thread has not received the "type" correctly yet, > > so must be Free. So the card will be filtered out incorrectly? 
> > > > That is contradictory to what I said in the other email about the > > comment discussion, but I only thoroughly looked at the comment > > aspect there. :) > > > > I think at this point in general we can't do anything but > > !is_young(), as we can't ignore cards in "Free" regions - they may > > be for cards for humongous ones where the thread did not receive > > top and/or the type yet? Here, combined with the scenario described in the other thread (I will repeat it for clarity): " A: allocate new young region X, allocate object, storestore, stops at the beginning of the dirty_young_block() method B: allocate new object B in X, set B.y = something-outside, making the card "Dirty" since thread A did not actually start doing dirty_young_block() yet. Refinement: scans the card; since R does not seem to synchronize with A either, you may get a "dirty" card in a young (or free, depending on whether the setting of the region flag in X has already been observed - but it must be either one) region here in this case? A: does the work in dirty_young_block()" Since thread A allocated the region X, the top and region type of region X are set by A. Now, in this scenario, refinement gets the dirty card from thread B first (because eg. it happens that thread B's queue just got full), and A is still busy marking the card table. The region type change (caused by A) for region X may not have been observed by the refinement yet, so it may still be Free? So the check in g1RemSet.cpp ?597 ? if (!r->is_old_or_humongous()) { may filter the card out wrongly when processing the card from thread B as far as I can see. That's why I remarked about only being able to filter out using is_young() here. For the refinement thread, "top" is current (after the fence), but the region type not (may still be "Free" until the refinement "synchronizes" with thread A in some way), doesn't it? 
The change to "top" must have been observed already after the fence (in line 684) though and is safe to use (the allocation of the TLAB for thread B sets top using appropriate barriers, and the refinement will synchronize with whatever thread B set). Probably I am overlooking something about how the type of region X set by thread A can be visible to refinement if it only "synchronizes" with thread B (that did not write the type of region X). Thanks, ? Thomas From thomas.schatzl at oracle.com Thu Nov 17 11:31:38 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Nov 2016 12:31:38 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1A54AB8B-C2A8-4F17-BC98-E76FE815A009@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> <1479287214.2466.35.camel@oracle.com> <1A54AB8B-C2A8-4F17-BC98-E76FE815A009@oracle.com> Message-ID: <1479382298.2891.25.camel@oracle.com> Hi Kim, On Wed, 2016-11-16 at 13:02 -0500, Kim Barrett wrote: > > > > On Nov 16, 2016, at 4:06 AM, Thomas Schatzl > com> wrote: > > > > Hi Kim, > > > > On Tue, 2016-11-15 at 19:00 -0500, Kim Barrett wrote: > > > > > > I've updated some comments to mention that external > > > synchronization. > > ?581???// The region could be young.??Cards for young regions are > > set > > to > > ?582???// g1_young_gen, so the post-barrier will filter them > > out.??However, > > ?583???// that marking is performed concurrently.??A write to a > > young > > ?584???// object could occur before the card has been marked young, > > slipping > > ?585???// past the filter. > > > > I would prefer if the text would not change terminology for the > > same > > thing mid-paragraph, from "setting" to "marking". The advantage of > > it > > reading better seems to be smaller than the potential confusion. > ? 
// The region could be young.??Cards for young regions are > ? // distinctly marked (set to g1_young_gen), so the post-barrier > will > ? // filter them out.??However, that marking is performed > ? // concurrently.??A write to a young object could occur before the > ? // card has been marked young, slipping past the filter. > > Better? ? better :) Thanks, ? Thomas From tobias.hartmann at oracle.com Thu Nov 17 12:42:26 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 17 Nov 2016 13:42:26 +0100 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled Message-ID: <582DA5B2.4020307@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8169711 http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. 
In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). I fixed this by always creating the interpreter method entries for intrinsified methods but replace them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. Tested with regression test, JPRT and RBT (running). Thanks, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8169867 From erik.helin at oracle.com Thu Nov 17 14:13:14 2016 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 17 Nov 2016 15:13:14 +0100 Subject: JEP 189: Shenandoah: An Ultra-Low-Pause-Time Garbage Collector In-Reply-To: <1444338101.14210351.1479147940505.JavaMail.zimbra@redhat.com> References: <1444338101.14210351.1479147940505.JavaMail.zimbra@redhat.com> Message-ID: <76e1a719-c786-5c84-7287-4053d4e96021@oracle.com> On 11/14/2016 07:25 PM, Christine Flood wrote: > Hi > > We've addressed the issues with the JEP that were brought up last summer. > We've been meeting our performance goals. > > What do we need to do to get Shenandoah approved for OpenJDK10? Hi Christine, I read through the JEP, thanks for making the suggested changes. One thing I'm missing though are the operating systems you intend to support? 
The JEP mentions that Red Hat will support Shenandoah for the arm64 and amd64 CPU architectures, but doesn't mention any operating systems. I would strongly prefer that the JEP suggested by Roman, "GC Interface: Better isolation of GC implementations" [0], is integrated before this JEP is submitted in order to ensure that the code can co-exist side-by-side with the existing GC algorithms (and be maintained effectively by another contributor). Would you mind adding a dependency in the Shenandoah JEP on Roman's "GCInterface" JEP? As for the JEP process, please see http://openjdk.java.net/jeps/1 and http://cr.openjdk.java.net/~mr/jep/jep-2.0-02.html. Thanks, Erik [0]: https://bugs.openjdk.java.net/browse/JDK-8163329 > Christine > From vladimir.x.ivanov at oracle.com Thu Nov 17 14:34:35 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 17 Nov 2016 17:34:35 +0300 Subject: RFR(S)[8u]: 8158639: C2 compilation fails with SIGSEGV In-Reply-To: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> References: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> Message-ID: <5bc4a55e-e2d1-4bcf-ccf4-382e65239d82@oracle.com> Looks good (not a 8u Reviewer). Best regards, Vladimir Ivanov On 11/16/16 7:44 PM, david buck wrote: > (moving to hotspot-dev for more exposure.) > > Jamsheed, thanks once again reviewing my backport! > > Any reviewers out there willing to chime in? > > Cheers, > -Buck > > > -------- Forwarded Message -------- > Subject: Re: RFR[8u]: 8158639: C2 compilation fails with SIGSEGV > Date: Wed, 16 Nov 2016 21:48:10 +0530 > From: Jamsheed C m > Organization: Oracle Corporation > To: david buck , > hotspot-compiler-dev at openjdk.java.net > > > Thanks for fixing. new webrev looks good to me (not a reviewer). > > Best Regards, > Jamsheed > On 11/16/2016 4:31 PM, david buck wrote: >> Hi Jamsheed! >> >> Thank you for catching the mistake! I have modified the backport to >> include the relevant change from 8072008 [0]. 
Here is an updated webrev: >> >> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_02/ >> >> In the new chunk of code added, the only difference from the code in >> JDK 9 is I had to add a call to err_msg() as JDK 8 does not have >> variadic macro version of assert() [1]. >> >> I have reran all tests (both JPRT and manual) with no issues. >> >> Cheers, >> -Buck >> >> [0] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9988b390777b >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8080775 >> >> On 2016/11/16 16:29, Jamsheed C m wrote: >>> Hi David, >>> >>> this change is missing >>> >>> JVMState* VirtualCallGenerator::generate(JVMState* jvms) { >>> >>> ... >>> >>> if (kit.gvn().type(receiver)->higher_equal(TypePtr::NULL_PTR)) { >>> assert(Bytecodes::is_invoke(kit.java_bc()), "%d: %s", kit.java_bc(), >>> Bytecodes::name(kit.java_bc())); >>> ciMethod* declared_method = >>> kit.method()->get_method_at_bci(kit.bci()); >>> int arg_size = >>> declared_method->signature()->arg_size_for_bc(kit.java_bc()); >>> kit.inc_sp(arg_size); // restore arguments >>> kit.uncommon_trap(Deoptimization::Reason_null_check, >>> Deoptimization::Action_none, >>> NULL, "null receiver"); >>> >>> >>> Best Regards, >>> >>> Jamsheed >>> >>> >>> On 11/15/2016 8:55 PM, david buck wrote: >>>> Hi! >>>> >>>> Please review the backported changes of JDK-8158639 to 8u: >>>> >>>> It is a very straightforward backport. The only two differences are: >>>> >>>> - I added a convenience macro, get_method_at_bci(), from the change >>>> for 8072008 to make the backport cleaner. >>>> >>>> - I had to modify (remove) the package used for the testcase. 
>>>> >>>> Bug Report: >>>> [ 8158639: C2 compilation fails with SIGSEGV ] >>>> https://bugs.openjdk.java.net/browse/JDK-8158639 >>>> >>>> JDK 9 changeset: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/119a2a3cc29b >>>> >>>> 8u-dev Webrev: >>>> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_01/ >>>> >>>> Testing: >>>> Manual verification and JPRT (default and hotspot testsets) >>>> >>>> Cheers, >>>> -Buck >>> > From coleen.phillimore at oracle.com Thu Nov 17 15:42:25 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 17 Nov 2016 10:42:25 -0500 Subject: RFR(S)[8u]: 8158639: C2 compilation fails with SIGSEGV In-Reply-To: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> References: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> Message-ID: This looks like a good backport of the original bug fix. Reviewed. Coleen On 11/16/16 11:44 AM, david buck wrote: > (moving to hotspot-dev for more exposure.) > > Jamsheed, thanks once again reviewing my backport! > > Any reviewers out there willing to chime in? > > Cheers, > -Buck > > > -------- Forwarded Message -------- > Subject: Re: RFR[8u]: 8158639: C2 compilation fails with SIGSEGV > Date: Wed, 16 Nov 2016 21:48:10 +0530 > From: Jamsheed C m > Organization: Oracle Corporation > To: david buck , > hotspot-compiler-dev at openjdk.java.net > > > Thanks for fixing. new webrev looks good to me (not a reviewer). > > Best Regards, > Jamsheed > On 11/16/2016 4:31 PM, david buck wrote: >> Hi Jamsheed! >> >> Thank you for catching the mistake! I have modified the backport to >> include the relevant change from 8072008 [0]. Here is an updated webrev: >> >> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_02/ >> >> In the new chunk of code added, the only difference from the code in >> JDK 9 is I had to add a call to err_msg() as JDK 8 does not have >> variadic macro version of assert() [1]. >> >> I have reran all tests (both JPRT and manual) with no issues. 
>> >> Cheers, >> -Buck >> >> [0] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9988b390777b >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8080775 >> >> On 2016/11/16 16:29, Jamsheed C m wrote: >>> Hi David, >>> >>> this change is missing >>> >>> JVMState* VirtualCallGenerator::generate(JVMState* jvms) { >>> >>> ... >>> >>> if (kit.gvn().type(receiver)->higher_equal(TypePtr::NULL_PTR)) { >>> assert(Bytecodes::is_invoke(kit.java_bc()), "%d: %s", >>> kit.java_bc(), >>> Bytecodes::name(kit.java_bc())); >>> ciMethod* declared_method = >>> kit.method()->get_method_at_bci(kit.bci()); >>> int arg_size = >>> declared_method->signature()->arg_size_for_bc(kit.java_bc()); >>> kit.inc_sp(arg_size); // restore arguments >>> kit.uncommon_trap(Deoptimization::Reason_null_check, >>> Deoptimization::Action_none, >>> NULL, "null receiver"); >>> >>> >>> Best Regards, >>> >>> Jamsheed >>> >>> >>> On 11/15/2016 8:55 PM, david buck wrote: >>>> Hi! >>>> >>>> Please review the backported changes of JDK-8158639 to 8u: >>>> >>>> It is a very straightforward backport. The only two differences are: >>>> >>>> - I added a convenience macro, get_method_at_bci(), from the change >>>> for 8072008 to make the backport cleaner. >>>> >>>> - I had to modify (remove) the package used for the testcase. 
>>>> >>>> Bug Report: >>>> [ 8158639: C2 compilation fails with SIGSEGV ] >>>> https://bugs.openjdk.java.net/browse/JDK-8158639 >>>> >>>> JDK 9 changeset: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/119a2a3cc29b >>>> >>>> 8u-dev Webrev: >>>> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_01/ >>>> >>>> Testing: >>>> Manual verification and JPRT (default and hotspot testsets) >>>> >>>> Cheers, >>>> -Buck >>> > From thomas.schatzl at oracle.com Thu Nov 17 17:07:41 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Nov 2016 18:07:41 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1479382086.2891.24.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479382086.2891.24.camel@oracle.com> Message-ID: <1479402461.2522.21.camel@oracle.com> Hi Kim, On Thu, 2016-11-17 at 12:28 +0100, Thomas Schatzl wrote: > Hi Kim, > > [...] > So the check in g1RemSet.cpp > > ?597 ? if (!r->is_old_or_humongous()) { > > may filter the card out wrongly when processing the card from thread > B > as far as I can see. > > That's why I remarked about only being able to filter out using > is_young() here. For the refinement thread, "top" is current (after > the > fence), but the region type not (may still be "Free" until the > refinement "synchronizes" with thread A in some way), doesn't it? > > The change to "top" must have been observed already after the fence > (in > line 684) though and is safe to use (the allocation of the TLAB for > thread B sets top using appropriate barriers, and the refinement will > synchronize with whatever thread B set). > > Probably I am overlooking something about how the type of region X > set by thread A can be visible to refinement if it only > "synchronizes" with thread B (that did not write the type of region > X). ? I think it is good. Erik gave me the hint (and probably you already mentioned it somewhere). 
That case can only happen for young regions, and we can ignore them. We only allocate into humongous regions once. Thanks, ? Thomas From trevor.d.watson at oracle.com Thu Nov 17 17:29:14 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Thu, 17 Nov 2016 17:29:14 +0000 Subject: Unsafe compareAnd* Message-ID: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com> I'm working on an implementation of the C2 code for compareAndExchangeShort on SPARC. I've only implemented this function so far, and no compareAndSwapShort equivalent. When I run the test in hotspot/test/compiler/unsafe/JdkInternalMiscUnsafeAccessTestShort.java it fails because Unsafe.compareAndSwapShort() returns an incorrect value. This test passes without my implementation of compareAndExchangeShort. If I comment out the Unsafe.compareAndSwapShort() tests, the Unsafe.compareAndExchangeShort tests run successfully but the Unsafe.weakCompareAndSwapShort() tests subsequently fail. Can anyone tell me why it might be that an implementation for CompareAndExchangeS would trigger a failure in Unsafe.compareAndSwapShort()? Thanks, Trevor From erik.helin at oracle.com Thu Nov 17 17:28:28 2016 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 17 Nov 2016 18:28:28 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> Message-ID: On 11/16/2016 01:00 AM, Kim Barrett wrote: >> On Nov 15, 2016, at 5:26 AM, Thomas Schatzl wrote: >> >> Hi, >> >> On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: >>>> >>>> On Nov 8, 2016, at 7:52 AM, Thomas Schatzl >>> om> wrote: >>>> On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: >>>> - assuming this works due to other synchronization, >>> This is the critical point. 
There *is* synchronization there. >> >> Okay, thanks. I just wanted to make sure that we are aware of that we >> are using this other synchronization here. >> >> Thanks. Again I was mostly worried about noting this reliance on >> previous synchronization down somewhere, even if it is only the mailing >> list. >> >> It may be useful to note this in the code too. This would save the next >> one working on this code looking through old mailing list threads. >> >> Maybe I am a bit overly concerned about making sure that these thoughts >> are provided in the proper place though. Or maybe everyone thinks that >> everything is clear :) > > I've updated some comments to mention that external synchronization. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166811 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ First of all, thanks for doing this tricky work. One initial comment: 659 // Iterate over the objects overlapping the card designated by 660 // card_ptr, applying cl to all references in the region. This 661 // is a helper for G1RemSet::refine_card, and is tightly coupled 662 // with it. In the first sentence you mention the now removed argument card_ptr. Maybe just reword this to "Iterate over the objects covered by the memory region, applying cl to all references in the region"? I will have to sleep on this review, the synchronization to make all of this hold together seems to be all over place :) (not your fault, pre-existing). I will continue this review tomorrow morning with a fresh brain. Thanks, Erik > incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ > > Also, since this set of changes is rather intertwined with the changes > for 8166607, here is a combined webrev for both: > http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ > > I think I'll do as Erik suggested and push the two together. 
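[Editorial note, not part of the original mail.] The allocation-then-publish ordering debated in this thread (initialize the region's type and top, issue a storestore barrier, then let other threads observe the region) has a direct analogue in Java's release/acquire publication. The sketch below is a generic, hypothetical illustration of that pattern using VarHandle; it is plain Java, not HotSpot code, and the Region fields merely stand in for the region metadata:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class Publish {
    static final class Region {
        int type; // stands in for the region type
        int top;  // stands in for the top pointer
    }

    static Region region; // the published reference
    static final VarHandle REGION;

    static {
        try {
            REGION = MethodHandles.lookup()
                    .findStaticVarHandle(Publish.class, "region", Region.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void writer() {
        Region r = new Region();
        r.type = 1;           // plain writes to the fields...
        r.top = 42;
        REGION.setRelease(r); // ...then a release store publishes them
    }

    static Region reader() {
        // Acquire load: if it observes the reference, it also observes
        // the field writes that happened before the release store.
        return (Region) REGION.getAcquire();
    }

    public static void main(String[] args) {
        writer();
        Region r = reader();
        System.out.println(r.type + " " + r.top); // prints "1 42"
    }
}
```

Without the release/acquire pair, a concurrent reader could in principle observe the published reference before the field writes become visible, which is the analogue of the refinement thread seeing a stale region type.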
> From joe.darcy at oracle.com Thu Nov 17 17:35:05 2016 From: joe.darcy at oracle.com (joe darcy) Date: Thu, 17 Nov 2016 09:35:05 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <582D0BCE.2030209@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> Message-ID: Hello, On 11/16/2016 5:45 PM, Gustavo Romero wrote: > Hi, > > Currently, optimization for building fdlibm is disabled, except for the > "solaris" OS target [1]. The reason for that is that historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. > > As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, > sin(), cos(), and tan() perform very poorly in comparison to the same methods > in Math class [2]: If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I intend to port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9. Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases. 
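[Editorial note, not part of the original mail.] The "different algorithm" caveat matters for result checking too: two implementations can agree to many decimal digits and still differ in the final bit, and StrictMath conformance is defined bit-for-bit per input. A small illustrative sketch (hypothetical, not JDK test code) of the exact-bits comparison such checking requires:

```java
// Illustrative sketch: StrictMath conformance is bit-for-bit, so results
// must be compared via their exact bit patterns, not numeric closeness.
public class BitwiseCheck {
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        double r = StrictMath.sin(1.0);
        double oneUlpOff = Math.nextUp(r); // differs from r by a single ulp

        // Numerically the two are almost indistinguishable...
        System.out.println(Math.abs(oneUlpOff - r)); // ~1e-16

        // ...but a bitwise check still tells them apart.
        System.out.println(sameBits(r, oneUlpOff));  // prints "false"

        // It also distinguishes values that '==' conflates:
        System.out.println(0.0 == -0.0);             // prints "true"
        System.out.println(sameBits(0.0, -0.0));     // prints "false"
    }
}
```

A tolerance- or sum-based comparison cannot see a one-ulp discrepancy, but `Double.doubleToLongBits` can, and it also distinguishes values such as 0.0 and -0.0 that `==` conflates.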
> > Math StrictMath > ========= ========== > sin 0m29.984s 1m41.184s > cos 0m30.031s 1m41.200s > tan 0m31.772s 1m46.976s > asin 0m4.577s 0m4.543s > acos 0m4.539s 0m4.525s > atan 0m12.929s 0m12.896s > exp 0m1.071s 0m4.570s > log 0m3.272s 0m14.239s > log10 0m4.362s 0m20.236s > sqrt 0m0.913s 0m0.981s > cbrt 0m10.786s 0m10.808s > sinh 0m4.438s 0m4.433s > cosh 0m4.496s 0m4.478s > tanh 0m3.360s 0m3.353s > expm1 0m4.076s 0m4.094s > log1p 0m13.518s 0m13.527s > IEEEremainder 0m38.803s 0m38.909s > atan2 0m20.100s 0m20.057s > pow 0m14.096s 0m19.938s > hypot 0m5.136s 0m5.122s > > > Switching on the O3 optimization can damage precision of those methods, > nonetheless it's possible to avoid that side effect and yet get huge benefits of > the -O3 optimization on PPC64 if -fno-expensive-optimizations is passed in > addition to the -O3 optimization flag. > > In that sense the following change is proposed to resolve the issue: > > diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk > --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 > +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 > @@ -33,10 +33,16 @@ > # libfdlibm is statically linked with libjava below and not delivered into the > # product on its own. 
> > -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +BUILD_LIBFDLIBM_OPTIMIZATION := NONE > > -ifneq ($(OPENJDK_TARGET_OS), solaris) > - BUILD_LIBFDLIBM_OPTIMIZATION := NONE > +ifeq ($(OPENJDK_TARGET_OS), solaris) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +endif > + > +ifeq ($(OPENJDK_TARGET_OS), linux) > + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > + endif > endif > > LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm > @@ -51,6 +57,7 @@ > CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ > CFLAGS_windows_debug := -DLOGGING, \ > CFLAGS_aix := -qfloat=nomaf, \ > + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ > DISABLED_WARNINGS_gcc := sign-compare, \ > DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ > ARFLAGS := $(ARFLAGS), \ > > > diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk > --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100 > +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500 > @@ -569,16 +569,19 @@ > $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES)) > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS. 
> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) > + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ > + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) > ifneq ($(DEBUG_LEVEL),release) > # Pickup extra debug dependent variables for CFLAGS > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) > else > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) > endif > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS. > > > After enabling the optimization it's possible to again up to 3x on performance > regarding the aforementioned methods without losing precision: > > StrictMath, original StrictMath, optimized > ============================ ============================ > sin 1.7136493465700542 1m41.184s 1.7136493465700542 0m33.895s > cos 0.1709843554185943 1m41.200s 0.1709843554185943 0m33.884s > tan -5.5500322522995315E7 1m46.976s -5.5500322522995315E7 0m36.461s > asin NaN 0m4.543s NaN 0m3.175s > acos NaN 0m4.525s NaN 0m3.211s > atan 1.5707961389886132E8 0m12.896s 1.5707961389886132E8 0m7.100s > exp Infinity 0m4.570s Infinity 0m3.187s > log 1.7420680845245087E9 0m14.239s 1.7420680845245087E9 0m7.170s > log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 0m9.610s > sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 0m0.948s > cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 0m10.786s > sinh Infinity 0m4.433s Infinity 0m3.179s > cosh Infinity 0m4.478s Infinity 0m3.174s > tanh 9.999999971990079E7 0m3.353s 
9.999999971990079E7 0m3.208s > expm1 Infinity 0m4.094s Infinity 0m3.185s > log1p 1.7420681029451895E9 0m13.527s 1.7420681029451895E9 0m8.756s > IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s > atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 0m10.510s > pow Infinity 0m19.938s Infinity 0m20.204s > hypot 5.000000099033372E15 0m5.122s 5.000000099033372E15 0m5.130s > > > I believe that as the FC is passed but FEC is not the change can, after the due > scrutiny and review, be pushed if a special exception approval grants it. Once > on 9, I'll request the downport to 8. Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking to see if the optimized versions are indeed equivalent to the non-optimized ones. The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ. Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area. Cheers, -Joe From gromero at linux.vnet.ibm.com Thu Nov 17 17:45:59 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Thu, 17 Nov 2016 15:45:59 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> Message-ID: <582DECD7.4020901@linux.vnet.ibm.com> Hi David, On 17-11-2016 00:31, David Holmes wrote: > Adding in build-dev as they need to scrutinize all build changes. Thanks a lot. 
Regards, Gustavo From gromero at linux.vnet.ibm.com Thu Nov 17 17:47:40 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Thu, 17 Nov 2016 15:47:40 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <9dea2dbf-4413-c03e-1cd6-8aceb0e263a0@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> <9dea2dbf-4413-c03e-1cd6-8aceb0e263a0@oracle.com> Message-ID: <582DED3C.5030507@linux.vnet.ibm.com> Hi Erik, On 17-11-2016 07:17, Erik Joelsson wrote: > Overall this looks reasonable to me. However, if we want to introduce a new possible tuple for specifying compilation flags to SetupNativeCompilation, we (the build team) would prefer if we used > OPENJDK_TARGET_CPU instead of OPENJDK_TARGET_CPU_ARCH. Got it. Thanks a lot for that info. I'll take that into account. Regards, Gustavo From gromero at linux.vnet.ibm.com Thu Nov 17 18:31:00 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Thu, 17 Nov 2016 16:31:00 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <582D0BCE.2030209@linux.vnet.ibm.com> Message-ID: <582DF764.70504@linux.vnet.ibm.com> Hi Joe, Thanks a lot for your valuable comments. On 17-11-2016 15:35, joe darcy wrote: >> Currently, optimization for building fdlibm is disabled, except for the >> "solaris" OS target [1]. > > The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the > Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm optimization is off even for x86_x64 as it, AFAICS regarding gcc 5 only, does not affect the precision, even if setting -O3 does not improve the performance as much as on PPC64. 
>> As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to,
>> sin(), cos(), and tan() perform very poorly in comparison to the same methods
>> in Math class [2]:

> If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I have intentions to
> port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9.

Yes, I'm doing my work against 9. So is there any problem if I proceed with my
change? I understand that there is no conflict as JDK-8134780 progresses and
replaces the StrictMath methods by their counterparts in Java. Please advise.

Is it intended to downport JDK-8134780 to 8?

> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases.

I agree. It's just that the issue on StrictMath methods was first noted due to
that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64.

> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking whether the optimized versions are indeed equivalent to the non-optimized ones.
> The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ.

That's a really good point, thanks for letting me know about that. I'll re-test my
change under that perspective.

> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area.

Got it. By "the JDK math library regression tests" you mean exactly which test
suite? the jtreg tests?

For testing against JCK/TCK I'll need some help on that.

Thank you very much.
Regards,
Gustavo

From paul.sandoz at oracle.com Thu Nov 17 18:48:50 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Thu, 17 Nov 2016 10:48:50 -0800
Subject: Unsafe compareAnd*
In-Reply-To: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com>
References: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com>
Message-ID: <77841DFC-7C6D-48E9-B036-0B2373905EBF@oracle.com>

Hi Trevor,

The compareAndSwapShort (non-intrinsic) implementation defers to the compareAndExchangeShortVolatile implementation, and the weakCompareAndSwapShortVolatile implementation defers to (the stronger) compareAndSwapShort implementation [*]:

i.e. weakCompareAndSwapShortVolatile -> compareAndSwapShort -> compareAndExchangeShortVolatile

@HotSpotIntrinsicCandidate
public final short compareAndExchangeShortVolatile(Object o, long offset,
                                                   short expected,
                                                   short x) {
    if ((offset & 3) == 3) {
        throw new IllegalArgumentException("Update spans the word, not supported");
    }
    ...
}

@HotSpotIntrinsicCandidate
public final boolean compareAndSwapShort(Object o, long offset,
                                         short expected,
                                         short x) {
    return compareAndExchangeShortVolatile(o, offset, expected, x) == expected;
}

@HotSpotIntrinsicCandidate
public final boolean weakCompareAndSwapShortVolatile(Object o, long offset,
                                                     short expected,
                                                     short x) {
    return compareAndSwapShort(o, offset, expected, x);
}

I think that explains why you are observing failing Unsafe.weakCompareAndSwapShort() tests.

Paul.

[*] Note, we really need to change the names here to be consistent with the schema on VarHandles

> On 17 Nov 2016, at 09:29, Trevor Watson wrote:
>
> I'm working on an implementation of the C2 code for compareAndExchangeShort on SPARC.
>
> I've only implemented this function so far, and no compareAndSwapShort equivalent.
>
> When I run the test in hotspot/test/compiler/unsafe/JdkInternalMiscUnsafeAccessTestShort.java it fails because Unsafe.compareAndSwapShort() returns an incorrect value. This test passes without my implementation of compareAndExchangeShort.
> > If I comment out the Unsafe.compareAndSwapShort() tests, the Unsafe.compareAndExchangeShort tests run successfully but the Unsafe.weakCompareAndSwapShort() tests subsequently fail.
> >
> > Can anyone tell me why it might be that an implementation for CompareAndExchangeS would trigger a failure in Unsafe.compareAndSwapShort()?
> >
> > Thanks,
> > Trevor

From vladimir.kozlov at oracle.com Thu Nov 17 19:34:29 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 17 Nov 2016 11:34:29 -0800
Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled
In-Reply-To: <582DA5B2.4020307@oracle.com>
References: <582DA5B2.4020307@oracle.com>
Message-ID: 

Hi Tobias,

It is a little inconsistent: the CRC32 intrinsics check their flag in the generate_CRC32* methods. Maybe we should do the same for FMA and, instead of the assert in generate_math_entry(), return NULL if the flag is false.

Thanks,
Vladimir

On 11/17/16 4:42 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8169711
> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/
>
> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()).
>
> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code.
> > The problem is that if an intrinsic is enabled during dumping but disabled when re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test).
> >
> > I fixed this by always creating the interpreter method entries for intrinsified methods but replacing them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. This way, we patch the trampoline destination address even if the intrinsic is disabled, but just execute the Java bytecodes instead of the stub.
> >
> > While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because
> > 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked, and
> > 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized.
> >
> > I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals.
> >
> > Tested with regression test, JPRT and RBT (running).
> > Thanks,
> > Tobias
> >
> > [1] https://bugs.openjdk.java.net/browse/JDK-8169867

From kim.barrett at oracle.com Thu Nov 17 21:06:19 2016
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 17 Nov 2016 16:06:19 -0500
Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement
In-Reply-To: 
References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com>
Message-ID: <4459FFB6-3866-414E-B511-B28591DB5A6C@oracle.com>

> On Nov 17, 2016, at 12:28 PM, Erik Helin wrote:
>
> First of all, thanks for doing this tricky work. One initial comment:
>
> 659 // Iterate over the objects overlapping the card designated by
> 660 // card_ptr, applying cl to all references in the region. This
> 661 // is a helper for G1RemSet::refine_card, and is tightly coupled
> 662 // with it.
>
> In the first sentence you mention the now removed argument card_ptr. Maybe just reword this to "Iterate over the objects covered by the memory region, applying cl to all references in the region"?

You're right, I missed updating the comment when the signature was changed. Changing to:

// Iterate over the objects overlapping part of a card, applying cl
// to all references in the region. This is a helper for
// G1RemSet::refine_card, and is tightly coupled with it.

which is still immediately followed by:

// mr: the memory region covered by the card, trimmed to the
// allocated space for this region. Must not be empty.
From kim.barrett at oracle.com Thu Nov 17 21:20:35 2016
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 17 Nov 2016 16:20:35 -0500
Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement
In-Reply-To: <1479402461.2522.21.camel@oracle.com>
References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479382086.2891.24.camel@oracle.com> <1479402461.2522.21.camel@oracle.com>
Message-ID: 

> On Nov 17, 2016, at 12:07 PM, Thomas Schatzl wrote:
>
> Hi Kim,
>
> On Thu, 2016-11-17 at 12:28 +0100, Thomas Schatzl wrote:
>> Hi Kim,
>>
>> [...]
>>
>> So the check in g1RemSet.cpp
>>
>> 597 if (!r->is_old_or_humongous()) {
>>
>> may filter the card out wrongly when processing the card from thread B
>> as far as I can see.
>>
>> That's why I remarked about only being able to filter out using
>> is_young() here. For the refinement thread, "top" is current (after the
>> fence), but the region type is not (it may still be "Free" until the
>> refinement "synchronizes" with thread A in some way), isn't it?
>>
>> The change to "top" must have been observed already after the fence (in
>> line 684) though and is safe to use (the allocation of the TLAB for
>> thread B sets top using appropriate barriers, and the refinement will
>> synchronize with whatever thread B set).
>>
>> Probably I am overlooking something about how the type of region X
>> set by thread A can be visible to refinement if it only
>> "synchronizes" with thread B (that did not write the type of region X).
>
> I think it is good. Erik gave me the hint (and probably you already
> mentioned it somewhere). That case can only happen for young regions,
> and we can ignore them.
>
> We only allocate into humongous regions once.
>
> Thanks,
> Thomas

Kudos to Erik for helping you answer your question. I was still struggling to understand the scenario you were trying to describe.
From ioi.lam at oracle.com Thu Nov 17 21:31:39 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 17 Nov 2016 13:31:39 -0800 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: References: <582DA5B2.4020307@oracle.com> Message-ID: <582E21BB.1060704@oracle.com> Hi Tobias, The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. Thanks - Ioi On 11/17/16 11:34 AM, Vladimir Kozlov wrote: > Hi Tobias, > > It is a little inconsistent. CRC32 instrinsics check their flag in > generate_CRC32* methods. > May be we should do the same for FMA instead of assert in > generate_math_entry() return NULL if flag is false. > > Thanks, > Vladimir > > On 11/17/16 4:42 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8169711 >> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >> >> When dumping metadata with class data sharing (CDS), >> Method::unlink_method() takes care of removing all entry points of >> methods that will be shared. The _i2i and _from_interpreted entries >> are set to the corresponding address in the _cds_entry_table (see >> AbstractInterpreter::entry_for_cds_method()). This address points to >> a trampoline in shared space that jumps to the actual (unshared) >> interpreter method entry at runtime (see >> AbstractInterpreter::update_cds_entry_table()). >> >> Intrinsic methods may have a special interpreter entry (for example, >> 'Interpreter::java_lang_math_fmaF') and if they are shared, their >> entry points are set to such a trampoline that is patched at runtime >> to jump to the interpreter stub containing the intrinsic code. >> >> The problem is that if an intrinsic is enabled during dumping but >> disabled during re-using the shared archive, the trampoline is not >> patched and therefore still refers to the old stub address that was >> only valid during dumping. 
In debug, we hit the "should be correctly >> set during dump time" assert in Method::link_method() because the >> method entries are inconsistent. In product, we crash because we jump >> to an invalid address through the unpatched trampoline. This problem >> exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >> >> I fixed this by always creating the interpreter method entries for >> intrinsified methods but replace them with vanilla entries in >> TemplateInterpreterGenerator::generate_method_entry() if the >> intrinsic is disabled at runtime. Like this, we patch the trampoline >> destination address even if the intrinsic is disabled but just >> execute the Java bytecodes instead of the stub. >> >> While testing, I noticed that the assert in Method::link_method() is >> not always triggered (sometimes we just crash). This is because >> 1) the "_from_compiled_entry == NULL" check in >> Method::restore_unshareable_info() is always false and therefore >> link_method() is not invoked and >> 2) in Method::link_method() we only execute the check if the adapter >> (which is shared) was not yet initialized. >> >> I filed JDK-8169867 [1] for this because I'm not too familiar with >> the CDS internals. >> >> Tested with regression test, JPRT and RBT (running). >> >> Thanks, >> Tobias >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >> From joe.darcy at oracle.com Thu Nov 17 21:33:48 2016 From: joe.darcy at oracle.com (joe darcy) Date: Thu, 17 Nov 2016 13:33:48 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <582DF764.70504@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> Message-ID: Hi Gustavo, On 11/17/2016 10:31 AM, Gustavo Romero wrote: > Hi Joe, > > Thanks a lot for your valuable comments. > > On 17-11-2016 15:35, joe darcy wrote: >>> Currently, optimization for building fdlibm is disabled, except for the >>> "solaris" OS target [1]. 
>> The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the >> Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. > oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm > optimization is off even for x86_x64 as it, AFAICS regarding gcc 5 only, does > not affect the precision, even if setting -O3 does not improve the performance > as much as on PPC64. The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul of these fdlibm coding practices. >>> As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, >>> sin(), cos(), and tan() perform verify poor in comparison to the same methods >>> in Math class [2]: >> If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I have intentions to >> port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9. > Yes, I'm doing my work against 9. So is there any problem if I proceed with my > change? I understand that there is no conflict as JDK-8134780 progresses and > replaces the StrictMath methods by their counterparts in Java. Please, advice. If I manage to finish the fdlibm C -> Java port in JDK 9, the changes you are proposing would eventually be removed as unneeded since the C code wouldn't be there to get compiled anymore. > > Is it intended to downport JDK-8134780 to 8? 
Such a backport would be technically possible, but we at Oracle don't currently plan to do so.

>> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases.
> I agree. It's just that the issue on StrictMath methods was first noted due to
> that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64.

Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. In particular, if Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that uses a platform-specific feature (say, fused multiply-add instructions), then just compiling fdlibm more aggressively wouldn't necessarily make up that gap.

To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior for all those methods, but such an approach was much less technically tractable 20+ years ago, without the benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/.

>> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking whether the optimized versions are indeed equivalent to the non-optimized ones.
>> The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ.
> That's a really good point, thanks for letting me know about that. I'll re-test my
> change under that perspective.
> > >> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area.
> Got it. By "the JDK math library regression tests" you mean exactly which test
> suite? the jtreg tests?

Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where they are offhand.

A note on methodologies: when I've been writing tests for my port, I've tried to include test cases that exercise all the branch points in the code. Due to the large input space (~2^64 for a single-argument method), random sampling alone is an inefficient way to try to find differences in behavior.

> For testing against JCK/TCK I'll need some help on that.

I believe the JCK/TCK does have additional testcases relevant here.

HTH; thanks,

-Joe

From chris.plummer at oracle.com Thu Nov 17 21:48:36 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 17 Nov 2016 13:48:36 -0800
Subject: PPC64: Poor StrictMath performance due to non-optimized compilation
In-Reply-To: 
References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com>
Message-ID: <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com>

On 11/17/16 1:33 PM, joe darcy wrote:
> Hi Gustavo,
>
> On 11/17/2016 10:31 AM, Gustavo Romero wrote:
>> Hi Joe,
>>
>> Thanks a lot for your valuable comments.
>>
>> On 17-11-2016 15:35, joe darcy wrote:
>>>> Currently, optimization for building fdlibm is disabled, except for
>>>> the "solaris" OS target [1].
>>> The reason for that is because historically the Solaris compilers
>>> have had sufficient discipline and control regarding floating-point
>>> semantics and compiler optimizations to still implement the
>>> Java-mandated results when optimization was enabled. The gcc family
>>> of compilers, for example, has lacked such discipline.
>> oh, I see. Thanks for clarifying that.
I was exactly wondering why >> fdlibm >> optimization is off even for x86_x64 as it, AFAICS regarding gcc 5 >> only, does >> not affect the precision, even if setting -O3 does not improve the >> performance >> as much as on PPC64. > > The fdlibm code relies on aliasing a two-element array of int with a > double to do bit-level reads and writes of floating-point values. As I > understand it, the C spec allows compilers to assume values of > different types don't overlap in memory. The compilation environment > has to be configured in such a way that the C compiler disables code > generation and optimization techniques that would run afoul of these > fdlibm coding practices. This is the strict aliasing issue right? It's a long standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more than 12 years since I last dealt with fdlibm and compiler aliasing issues. Chris > >>>> As a consequence on PPC64 (Linux) StrictMath methods like, but not >>>> limited to, >>>> sin(), cos(), and tan() perform verify poor in comparison to the >>>> same methods >>>> in Math class [2]: >>> If you are doing your work against JDK 9, note that the pow, hypot, >>> and cbrt fdlibm methods required by StrictMath have been ported to >>> Java (JDK-8134780: Port fdlibm to Java). I have intentions to >>> port the remaining methods to Java, but it is unclear whether or not >>> this will occur for JDK 9. >> Yes, I'm doing my work against 9. So is there any problem if I >> proceed with my >> change? I understand that there is no conflict as JDK-8134780 >> progresses and >> replaces the StrictMath methods by their counterparts in Java. >> Please, advice. > > If I manage to finish the fdlibm C -> Java port in JDK 9, the changes > you are proposing would eventually be removed as unneeded since the C > code wouldn't be there to get compiled anymore. > >> >> Is it intended to downport JDK-8134780 to 8? 
> > Such a backport would be technically possible, but we at Oracle don't > currently plan to do so. > >> >> >>> Methods in the Math class, such as pow, are often intrinsified and >>> use a different algorithm so a straight performance comparison may >>> not be as fair or meaningful in those cases. >> I agree. It's just that the issue on StrictMath methods was first >> noted due to >> that huge gap (Math vs StrictMath) on PPC64, which is not prominent >> on x64. > > Depending on how Math.{sin, cos} is implemented on PPC64, compiling > the fdlibm sin/cos with more aggressive optimizations should not be > expected to close the performance gap. In particular, if Math.{sin, > cos} is an intrinsic on PPC64 (I haven't checked the sources) that > used platform-specific feature (say fused multiply add instructions) > then just compiling fdlibm more aggressively wouldn't necessarily make > up that gap. > > To allow cross-platform and cross-release reproducibility, StrictMath > is specified to use the particular fdlibm algorithms, which precludes > using better algorithms developed more recently. If we were to start > with a clean slate today, to get such reproducibility we would specify > correctly-rounded behavior of all those methods, but such an approach > was much less tractable technical 20+ years ago without benefit of the > research that was been done in the interim, such as the work of Prof. > Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. > >> >> >>> Accumulating the the results of the functions and comparisons the >>> sums is not a sufficiently robust way of checking to see if the >>> optimized versions are indeed equivalent to the non-optimized ones. >>> The specification of StrictMath requires a particular result for >>> each set of floating-point arguments and sums get round-away >>> low-order bits that differ. >> That's really good point, thanks for letting me know about that. I'll >> re-test my >> change under that perspective. 
>> >> >>> Running the JDK math library regression tests and corresponding JCK >>> tests is recommended for work in this area. >> Got it. By "the JDK math library regression tests" you mean exactly >> which test >> suite? the jtreg tests? > > Specifically, the regression tests under test/java/lang/Math and > test/java/lang/StrictMath in the jdk repository. There are some other > math library tests in the hotspot repo, but I don't know where they > are offhand. > > A note on methodologies, when I've been writing test for my port I've > tried to include test cases that exercise all the branches point in > the code. Due to the large input space (~2^64 for a single-argument > method), random sampling alone is an inefficient way to try to find > differences in behavior. >> For testing against JCK/TCK I'll need some help on that. >> > > I believe the JCK/TCK does have additional testcases relevant here. > > HTH; thanks, > > -Joe From Derek.White at cavium.com Thu Nov 17 22:47:58 2016 From: Derek.White at cavium.com (White, Derek) Date: Thu, 17 Nov 2016 22:47:58 +0000 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> Message-ID: Hi Joe, Although neither a floating point expert (as I think I've proven to you over the years), or a gcc expert, I checked with our in-house gcc expert and got this following answer: "Yes using -fno-strict-aliasing fixes the issues. Also there are many forks of fdlibm which has this fixed including the code inside glibc. 
" FWIW, - Derek -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Chris Plummer Sent: Thursday, November 17, 2016 4:49 PM To: joe darcy ; Gustavo Romero ; ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net Cc: build-dev Subject: Re: PPC64: Poor StrictMath performance due to non-optimized compilation On 11/17/16 1:33 PM, joe darcy wrote: > Hi Gustavo, > > > On 11/17/2016 10:31 AM, Gustavo Romero wrote: >> Hi Joe, >> >> Thanks a lot for your valuable comments. >> >> On 17-11-2016 15:35, joe darcy wrote: >>>> Currently, optimization for building fdlibm is disabled, except for >>>> the "solaris" OS target [1]. >>> The reason for that is because historically the Solaris compilers >>> have had sufficient discipline and control regarding floating-point >>> semantics and compiler optimizations to still implement the >>> Java-mandated results when optimization was enabled. The gcc family >>> of compilers, for example, has lacked such discipline. >> oh, I see. Thanks for clarifying that. I was exactly wondering why >> fdlibm optimization is off even for x86_x64 as it, AFAICS regarding >> gcc 5 only, does not affect the precision, even if setting -O3 does >> not improve the performance as much as on PPC64. > > The fdlibm code relies on aliasing a two-element array of int with a > double to do bit-level reads and writes of floating-point values. As I > understand it, the C spec allows compilers to assume values of > different types don't overlap in memory. The compilation environment > has to be configured in such a way that the C compiler disables code > generation and optimization techniques that would run afoul of these > fdlibm coding practices. This is the strict aliasing issue right? It's a long standing problem with fdlibm that kept getting worse as gcc got smarter. 
IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more than 12 years since I last dealt with fdlibm and compiler aliasing issues. Chris > >>>> As a consequence on PPC64 (Linux) StrictMath methods like, but not >>>> limited to, sin(), cos(), and tan() perform verify poor in >>>> comparison to the same methods in Math class [2]: >>> If you are doing your work against JDK 9, note that the pow, hypot, >>> and cbrt fdlibm methods required by StrictMath have been ported to >>> Java (JDK-8134780: Port fdlibm to Java). I have intentions to port >>> the remaining methods to Java, but it is unclear whether or not this >>> will occur for JDK 9. >> Yes, I'm doing my work against 9. So is there any problem if I >> proceed with my change? I understand that there is no conflict as >> JDK-8134780 progresses and replaces the StrictMath methods by their >> counterparts in Java. >> Please, advice. > > If I manage to finish the fdlibm C -> Java port in JDK 9, the changes > you are proposing would eventually be removed as unneeded since the C > code wouldn't be there to get compiled anymore. > >> >> Is it intended to downport JDK-8134780 to 8? > > Such a backport would be technically possible, but we at Oracle don't > currently plan to do so. > >> >> >>> Methods in the Math class, such as pow, are often intrinsified and >>> use a different algorithm so a straight performance comparison may >>> not be as fair or meaningful in those cases. >> I agree. It's just that the issue on StrictMath methods was first >> noted due to that huge gap (Math vs StrictMath) on PPC64, which is >> not prominent on x64. > > Depending on how Math.{sin, cos} is implemented on PPC64, compiling > the fdlibm sin/cos with more aggressive optimizations should not be > expected to close the performance gap. 
In particular, if Math.{sin, > cos} is an intrinsic on PPC64 (I haven't checked the sources) that > used platform-specific feature (say fused multiply add instructions) > then just compiling fdlibm more aggressively wouldn't necessarily make > up that gap. > > To allow cross-platform and cross-release reproducibility, StrictMath > is specified to use the particular fdlibm algorithms, which precludes > using better algorithms developed more recently. If we were to start > with a clean slate today, to get such reproducibility we would specify > correctly-rounded behavior of all those methods, but such an approach > was much less tractable technical 20+ years ago without benefit of the > research that was been done in the interim, such as the work of Prof. > Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. > >> >> >>> Accumulating the the results of the functions and comparisons the >>> sums is not a sufficiently robust way of checking to see if the >>> optimized versions are indeed equivalent to the non-optimized ones. >>> The specification of StrictMath requires a particular result for >>> each set of floating-point arguments and sums get round-away >>> low-order bits that differ. >> That's really good point, thanks for letting me know about that. I'll >> re-test my change under that perspective. >> >> >>> Running the JDK math library regression tests and corresponding JCK >>> tests is recommended for work in this area. >> Got it. By "the JDK math library regression tests" you mean exactly >> which test >> suite? the jtreg tests? > > Specifically, the regression tests under test/java/lang/Math and > test/java/lang/StrictMath in the jdk repository. There are some other > math library tests in the hotspot repo, but I don't know where they > are offhand. > > A note on methodologies, when I've been writing test for my port I've > tried to include test cases that exercise all the branches point in > the code. 
Due to the large input space (~2^64 for a single-argument > method), random sampling alone is an inefficient way to try to find > differences in behavior. >> For testing against JCK/TCK I'll need some help on that. >> > > I believe the JCK/TCK does have additional testcases relevant here. > > HTH; thanks, > > -Joe From trevor.d.watson at oracle.com Fri Nov 18 07:58:58 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Fri, 18 Nov 2016 07:58:58 +0000 Subject: Unsafe compareAnd* In-Reply-To: <77841DFC-7C6D-48E9-B036-0B2373905EBF@oracle.com> References: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com> <77841DFC-7C6D-48E9-B036-0B2373905EBF@oracle.com> Message-ID: <0fc2f433-3ee9-d24f-2081-ecf174ec5b75@oracle.com> Thanks for the explanation, Paul. On 17/11/16 18:48, Paul Sandoz wrote: > Hi Trevor, > > The compareAndSwapShort (non-intrinsic) implementation defers to the compareAndExchangeShortVolatile implementation, and the weakCompareAndSwapShortVolatile implementation defers to (the stronger) compareAndSwapShort implementation [*]: > > i.e. weakCompareAndSwapShortVolatile -> compareAndSwapShort -> compareAndExchangeShortVolatile > > @HotSpotIntrinsicCandidate > public final short compareAndExchangeShortVolatile(Object o, long offset, > short expected, > short x) { > if ((offset & 3) == 3) { > throw new IllegalArgumentException("Update spans the word, not supported"); > } > > ... > } > > @HotSpotIntrinsicCandidate > public final boolean compareAndSwapShort(Object o, long offset, > short expected, > short x) { > return compareAndExchangeShortVolatile(o, offset, expected, x) == expected; > } > > @HotSpotIntrinsicCandidate > public final boolean weakCompareAndSwapShortVolatile(Object o, long offset, > short expected, > short x) { > return compareAndSwapShort(o, offset, expected, x); > } > > I think that explains why you are observing failing Unsafe.weakCompareAndSwapShort() tests. > > Paul.
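The delegation chain Paul describes can be modeled in plain Java to show why a single bad intrinsic surfaces in every layer built on top of it. The sketch below is illustrative only (no real Unsafe, all names made up): if the exchange primitive returns a wrong witness value, both the strong and the weak swap built on it misreport the result of a CAS that actually succeeded.

```java
// Toy model of the delegation chain; not JDK internals.
public class DelegationModel {
    static short slot = 42;

    // Stand-in for a (possibly buggy) compareAndExchange intrinsic.
    // When buggy, the store still happens but the witness is wrong.
    static short exchange(short expected, short x, boolean buggy) {
        short witness = slot;
        if (witness == expected) slot = x;
        return buggy ? (short) (witness + 1) : witness;
    }

    // Models compareAndSwapShort: built on exchange.
    static boolean swap(short expected, short x, boolean buggy) {
        return exchange(expected, x, buggy) == expected;
    }

    // Models weakCompareAndSwapShortVolatile: built on swap.
    static boolean weakSwap(short expected, short x, boolean buggy) {
        return swap(expected, x, buggy);
    }

    public static void main(String[] args) {
        slot = 42;
        System.out.println("correct exchange -> swap: " + swap((short) 42, (short) 7, false));
        slot = 42;
        System.out.println("buggy exchange -> weakSwap: " + weakSwap((short) 42, (short) 7, true));
    }
}
```

This matches the failure pattern Trevor reports: a wrong compareAndExchange result makes the compareAndSwap and weakCompareAndSwap tests fail even though the exchange-only tests may pass.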
> > [*] Note, we really need to change the names here to be consistent with the schema on VarHandles > > >> On 17 Nov 2016, at 09:29, Trevor Watson wrote: >> >> I'm working on an implementation of the C2 code for compareAndExchangeShort on SPARC. >> >> I've only implemented this function so far, and no compareAndSwapShort equivalent. >> >> When I run the test in hotspot/test/compiler/unsafe/JdkInternalMiscUnsafeAccessTestShort.java it fails because Unsafe.compareAndSwapShort() returns an incorrect value. This test passes without my implementation of compareAndExchangeShort. >> >> If I comment out the Unsafe.compareAndSwapShort() tests, the Unsafe.compareAndExchangeShort tests run successfully but the Unsafe.weakCompareAndSwapShort() tests subsequently fail. >> >> Can anyone tell me why it might be that an implementation for CompareAndExchangeS would trigger a failure in Unsafe.compareAndSwapShort()? >> >> Thanks, >> Trevor > From tobias.hartmann at oracle.com Fri Nov 18 08:33:36 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 18 Nov 2016 09:33:36 +0100 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: <582E21BB.1060704@oracle.com> References: <582DA5B2.4020307@oracle.com> <582E21BB.1060704@oracle.com> Message-ID: <582EBCE0.7090506@oracle.com> Thanks for the reviews, Vladimir and Ioi! As Vladimir suggested, I moved the UseFMA check into TemplateInterpreterGenerator::generate_math_entry(): http://cr.openjdk.java.net/~thartmann/8169711/webrev.01/ Best regards, Tobias On 17.11.2016 22:31, Ioi Lam wrote: > Hi Tobias, > > The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. > > Thanks > - Ioi > > On 11/17/16 11:34 AM, Vladimir Kozlov wrote: >> Hi Tobias, >> >> It is a little inconsistent. CRC32 intrinsics check their flag in generate_CRC32* methods.
>> Maybe we should do the same for FMA: instead of the assert in generate_math_entry(), return NULL if the flag is false. >> >> Thanks, >> Vladimir >> >> On 11/17/16 4:42 AM, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch: >>> https://bugs.openjdk.java.net/browse/JDK-8169711 >>> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >>> >>> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). >>> >>> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. >>> >>> The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >>> >>> I fixed this by always creating the interpreter method entries for intrinsified methods but replacing them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime.
Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. >>> >>> While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because >>> 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and >>> 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. >>> >>> I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. >>> >>> Tested with regression test, JPRT and RBT (running). >>> >>> Thanks, >>> Tobias >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >>> > From erik.helin at oracle.com Fri Nov 18 12:59:12 2016 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 18 Nov 2016 13:59:12 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: On 11/16/2016 12:58 AM, Kim Barrett wrote: >> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: >> >> Hi Kim, >> >> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>> >>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>> om> wrote: >>>> Maybe it would 
be cleaner to call a method in the barrier set >>>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>>> Maybe as an additional RFE. >>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>> overgeneralized and inefficient for this situation, but this >>> situation should occur *very* rarely; it requires a stale card get >>> processed just as a humongous object is in the midst of being >>> allocated in the same region. >> >> I kind of think for these reasons we should use _ct_bs->invalidate() as >> it seems clearer to me. There is the mentioned drawback of having no >> other more efficient way, so I will let you decide about this. > > I've made the change to call invalidate, and also updated some comments. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ Again, thanks for all your hard work on this patch series! I've been over this patch (and the other one) many times now, and I think this is good. At least I can't come up with any reason why it wouldn't work (this is one of the trickiest parts of G1). Thanks, Erik > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > > Also, see RFR: 8166811, where I've included a webrev combining the > latest changes for 8166607 and 8166811, since they are rather > intertwined. I think I'll do as Erik suggested and push the two > together.
> > From erik.helin at oracle.com Fri Nov 18 13:00:33 2016 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 18 Nov 2016 14:00:33 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <4459FFB6-3866-414E-B511-B28591DB5A6C@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> <4459FFB6-3866-414E-B511-B28591DB5A6C@oracle.com> Message-ID: <635c9aa8-62d6-dda8-6fee-741e213686ce@oracle.com> On 11/17/2016 10:06 PM, Kim Barrett wrote: >> On Nov 17, 2016, at 12:28 PM, Erik Helin wrote: >> >> First of all, thanks for doing this tricky work. One initial comment: >> >> 659 // Iterate over the objects overlapping the card designated by >> 660 // card_ptr, applying cl to all references in the region. This >> 661 // is a helper for G1RemSet::refine_card, and is tightly coupled >> 662 // with it. >> >> In the first sentence you mention the now removed argument card_ptr. Maybe just reword this to "Iterate over the objects covered by the memory region, applying cl to all references in the region"? > > You're right, I missed updating the comment when the signature was changed. > Changing to: > > // Iterate over the objects overlapping part of a card, applying cl > // to all references in the region. This is a helper for > // G1RemSet::refine_card, and is tightly coupled with it. > > which is still immediately followed by: > > // mr: the memory region covered by the card, trimmed to the > // allocated space for this region. Must not be empty. Ok, looks good. I think this patch is good to go now, thanks for taking this on, appreciate it.
Erik From kim.barrett at oracle.com Fri Nov 18 14:03:50 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 18 Nov 2016 09:03:50 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: > On Nov 15, 2016, at 6:58 PM, Kim Barrett wrote: > >> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: >> >> Hi Kim, >> >> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>> >>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>> om> wrote: >>>> Maybe it would be cleaner to call a method in the barrier set >>>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>>> Maybe as an additional RFE. >>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>> overgeneralized and inefficient for this situation, but this >>> situation should occur *very* rarely; it requires a stale card get >>> processed just as a humongous object is in the midst of being >>> allocated in the same region. >> >> I kind of think for these reasons we should use _ct_bs->invalidate() as >> it seems clearer to me. There is the mentioned drawback of having no >> other more efficient way, so I will let you decide about this. > > I've made the change to call invalidate, and also updated some comments. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > > Also, see RFR: 8166811, where I've included a webrev combining the > latest changes for 8166607 and 8166811, since they are rather > intertwined. I think I'll do as Erik suggested and push the two > together. Sorry folks, but I want to revert this part and go back to the old code where it locked the shared queue and enqueued there. If the executing invocation of refine_card is from a Java thread, e.g. this is the "mutator helps with refinement" case, calling invalidate would enqueue to the current thread's buffer. But that is effectively a reentrant call to enqueue, and the Java thread case of enqueue is not reentrant-safe. Only enqueue to the shared queue is reentrant-safe. I think that scenario presently can't happen, since the mutator helps case is dealt with by the mutator processing its own buffer. In that situation, all the cards in the buffer came from writes by this thread to an object this thread either allocated or has access to, so the klass must be there. But that's getting uncomfortably subtle in what is already difficult-to-analyze code. Also, we've talked about changing the mutator helps case to not immediately process its own buffer but instead add its buffer to the pending buffer list and process the next (FIFO ordered) buffer, in order to let its buffer age. (I have a change for that in my post-JDK 9 collection of pending changes. The mutator-invoked enqueue might be reentrant-safe in that change, but I don't think I want to make that guarantee.)
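The reentrancy hazard Kim describes can be illustrated with a tiny plain-Java model (this is not HotSpot's dirty-card queue code; all names are made up): a thread-local buffer whose enqueue drains the buffer in place when it fills is not safe to re-enter, because a nested enqueue during the drain mutates the very state the drain is iterating over.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a non-reentrant thread-local enqueue (illustrative names only).
public class ReentrantEnqueueDemo {
    static final int CAP = 2;
    static final int[] buf = new int[CAP];
    static int n = 0;
    static boolean draining = false;
    static final List<Integer> refined = new ArrayList<>();

    static void enqueue(int card) {
        buf[n++] = card;
        if (n == CAP) {          // buffer full: drain it in place
            draining = true;
            n = 0;               // unsafe if refine() re-enters enqueue()
            for (int i = 0; i < CAP; i++) {
                refine(buf[i]);
            }
            draining = false;
        }
    }

    static void refine(int card) {
        refined.add(card);
        if (draining && card == 1) {
            // Models refinement itself enqueuing more cards mid-drain.
            enqueue(100);
            enqueue(101);
        }
    }

    public static void main(String[] args) {
        enqueue(1);
        enqueue(2);
        // Card 2 is never refined and 101 is refined twice:
        System.out.println(refined);
    }
}
```

A lock-protected shared queue avoids this because a nested enqueue appends under the lock rather than rewinding an index mid-drain, which matches Kim's point that only the shared-queue path is reentrant-safe.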
From thomas.schatzl at oracle.com Fri Nov 18 14:28:27 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 18 Nov 2016 15:28:27 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: <1479479307.2483.5.camel@oracle.com> Hi Kim, On Fri, 2016-11-18 at 09:03 -0500, Kim Barrett wrote: > > > > On Nov 15, 2016, at 6:58 PM, Kim Barrett > > wrote: > > > > > > > > On Nov 15, 2016, at 5:21 AM, Thomas Schatzl > > e.com> wrote: > > > > > > Hi Kim, > > > > > > On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: > > > > > > > > > > > > > > > > > > > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl > > > > acle.c > > > > > om> wrote: > > > > > Maybe it would be cleaner to call a method in the barrier set > > > > > instead of inlining the dirtying + enqueuing in lines 685 to > > > > > 691? > > > > > Maybe as an additional RFE. > > > > We could use _ct_bs->invalidate(dirtyRegion). That's rather > > > > overgeneralized and inefficient for this situation, but this > > > > situation should occur *very* rarely; it requires a stale card > > > > get > > > > processed just as a humongous object is in the midst of being > > > > allocated in the same region. > > > I kind of think for these reasons we should use _ct_bs- > > > >invalidate() as > > > it seems clearer to me.
There is the mentioned drawback of having > > > no > > > other more efficient way, so I will let you decide about this. > > I've made the change to call invalidate, and also updated some > > comments. > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8166607 > > > > Webrevs: > > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ > > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > > > > Also, see RFR: 8166811, where I've included a webrev combining the > > latest changes for 8166607 and 8166811, since they are rather > > intertwined. I think I'll do as Erik suggested and push the two > > together. > Sorry folks, but I want to revert this part and go back to the old > code where it locked the shared queue and enqueued there. > You mean the invalidate() call? If you think this is better, it has > only been a suggestion. > If the executing invocation of refine_card is from a Java thread, > e.g. this is the "mutator helps with refinement" case, calling > invalidate would enqueue to the current thread's buffer. But that is > effectively a reentrant call to enqueue, and the Java thread case of > enqueue is not reentrant-safe. Only enqueue to the shared queue is > reentrant-safe. Yes, that's bad. One could extract this code out and put into the barrier set though - as another CR. Thanks,
Thomas From kim.barrett at oracle.com Fri Nov 18 14:32:33 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 18 Nov 2016 09:32:33 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479479307.2483.5.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> <1479479307.2483.5.camel@oracle.com> Message-ID: > On Nov 18, 2016, at 9:28 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Fri, 2016-11-18 at 09:03 -0500, Kim Barrett wrote: >>> >>> On Nov 15, 2016, at 6:58 PM, Kim Barrett >>> wrote: >>> >>>> >>>> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl >>> e.com> wrote: >>>> >>>> Hi Kim, >>>> >>>> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>>> >>>>>> >>>>>> >>>>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>>>> acle.c >>>>>> om> wrote: >>>>>> Maybe it would be cleaner to call a method in the barrier set >>>>>> instead of inlining the dirtying + enqueuing in lines 685 to >>>>>> 691? >>>>>> Maybe as an additional RFE. >>>>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>>>> overgeneralized and inefficient for this situation, but this >>>>> situation should occur *very* rarely; it requires a stale card >>>>> get >>>>> processed just as a humongous object is in the midst of being >>>>> allocated in the same region. 
>>>> I kind of think for these reasons we should use _ct_bs- >>>>> invalidate() as >>>> it seems clearer to me. There is the mentioned drawback of having >>>> no >>>> other more efficient way, so I will let you decide about this. >>> I've made the change to call invalidate, and also updated some >>> comments. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8166607 >>> >>> Webrevs: >>> full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ >>> incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ >>> >>> Also, see RFR: 8166811, where I've included a webrev combining the >>> latest changes for 8166607 and 8166811, since they are rather >>> intertwined. I think I'll do as Erik suggested and push the two >>> together. >> Sorry folks, but I want to revert this part and go back to the old >> code where it locked the shared queue and enqueued there. >> > > You mean the invalidate() call? If you think this is better, it has > only been a suggestion. Yes. It seemed like a good idea at the time... From erik.joelsson at oracle.com Fri Nov 18 15:30:20 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 18 Nov 2016 16:30:20 +0100 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images Message-ID: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Hello, Please review this change which removes the $ARCH sub directory in the lib directory of the runtime images, which is an outstanding issue from the new runtime images. Most of the changes are in the build, but there are some in hotspot and launcher source. I have verified -testset hotspot and default in JPRT as well as tried to run as many jtreg tests as possible locally. I could only really find two tests that needed to be adjusted.
Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 /Erik From vladimir.kozlov at oracle.com Fri Nov 18 16:09:01 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Nov 2016 08:09:01 -0800 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: <582EBCE0.7090506@oracle.com> References: <582DA5B2.4020307@oracle.com> <582E21BB.1060704@oracle.com> <582EBCE0.7090506@oracle.com> Message-ID: <3bc033fc-a2cb-ee72-1d88-9f6b422620f7@oracle.com> Looks good. Thanks, Vladimir On 11/18/16 12:33 AM, Tobias Hartmann wrote: > Thanks for the reviews, Vladimir and Ioi! > > As Vladimir suggested, I moved the UseFMA check into TemplateInterpreterGenerator::generate_math_entry(): > http://cr.openjdk.java.net/~thartmann/8169711/webrev.01/ > > Best regards, > Tobias > > On 17.11.2016 22:31, Ioi Lam wrote: >> Hi Tobias, >> >> The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. >> >> Thanks >> - Ioi >> >> On 11/17/16 11:34 AM, Vladimir Kozlov wrote: >>> Hi Tobias, >>> >>> It is a little inconsistent. CRC32 intrinsics check their flag in generate_CRC32* methods. >>> Maybe we should do the same for FMA instead of assert in generate_math_entry() return NULL if flag is false. >>> >>> Thanks, >>> Vladimir >>> >>> On 11/17/16 4:42 AM, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch: >>>> https://bugs.openjdk.java.net/browse/JDK-8169711 >>>> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >>>> >>>> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()).
This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). >>>> >>>> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. >>>> >>>> The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >>>> >>>> I fixed this by always creating the interpreter method entries for intrinsified methods but replace them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. >>>> >>>> While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because >>>> 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and >>>> 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. >>>> >>>> I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. >>>> >>>> Tested with regression test, JPRT and RBT (running). 
>>>> >>>> Thanks, >>>> Tobias >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >>>> >> From tim.bell at oracle.com Fri Nov 18 16:34:03 2016 From: tim.bell at oracle.com (Tim Bell) Date: Fri, 18 Nov 2016 08:34:03 -0800 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Message-ID: <9846f58b-0c80-b8ce-674f-cf3fd239b01f@oracle.com> Erik: > Please review this change which removes the $ARCH sub directory in the > lib directory of the runtime images, which is an outstanding issue from > the new runtime images. Most of the changes are in the build, but there > are some in hotspot and launcher source. I have verified -testset > hotspot and default in JPRT as well as tried to run as many jtreg tests > as possible locally. I could only really find two tests that needed to > be adjusted. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 hotspot/test/runtime/ThreadSignalMask/exeThreadSignalMask.c jdk/make/copy/Copy-java.desktop.gmk jdk/src/java.base/unix/classes/java/lang/ProcessImpl.java These legal notices need to be updated for 2016. No need to redo the webrev if this is all the feedback you get. Looks fine otherwise. Tim From vladimir.kozlov at oracle.com Fri Nov 18 16:41:29 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Nov 2016 08:41:29 -0800 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Message-ID: <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Finally! :) Hotspot changes look fine to me. But you missed hotspot/make/hotspot.script file. Our colleagues at RH and SAP should test these changes on their platforms.
Next step would be removal of client/server sub-directories on platforms where we have only Server JVM (64-bit JDK has only Server JVM). Thanks, Vladimir On 11/18/16 7:30 AM, Erik Joelsson wrote: > Hello, > > Please review this change which removes the $ARCH sub directory in the > lib directory of the runtime images, which is an outstanding issue from > the new runtime images. Most of the changes are in the build, but there > are some in hotspot and launcher source. I have verified -testset > hotspot and default in JPRT as well as tried to run as many jtreg tests > as possible locally. I could only really find two tests that needed to > be adjusted. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > > /Erik > From magnus.ihse.bursie at oracle.com Fri Nov 18 20:40:58 2016 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 18 Nov 2016 21:40:58 +0100 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Message-ID: On 2016-11-18 16:30, Erik Joelsson wrote: > Hello, > > Please review this change which removes the $ARCH sub directory in the > lib directory of the runtime images, which is an outstanding issue > from the new runtime images. Most of the changes are in the build, but > there are some in hotspot and launcher source. I have verified > -testset hotspot and default in JPRT as well as tried to run as many > jtreg tests as possible locally. I could only really find two tests > that needed to be adjusted. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 Looks good to me. If anything, the switch statement in ProcessImpl.java seems superfluous now, and you could possibly prune that bit even harder. Nice to see this go. 
:) /Magnus From kim.barrett at oracle.com Fri Nov 18 20:53:45 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 18 Nov 2016 15:53:45 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479479307.2483.5.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> <1479479307.2483.5.camel@oracle.com> Message-ID: <6740CCD9-671C-485B-8E39-5F1DEF249830@oracle.com> > On Nov 18, 2016, at 9:28 AM, Thomas Schatzl wrote: > On Fri, 2016-11-18 at 09:03 -0500, Kim Barrett wrote: >> If the executing invocation of refine_card is from a Java thread, >> e.g. this is the "mutator helps with refinement" case, calling >> invalidate would enqueue to the current thread's buffer. But that is >> effectively a reentrant call to enqueue, and the Java thread case of >> enqueue is not reentrant-safe. Only enqueue to the shared queue is >> reentrant-safe. > > Yes, that's bad. One could extract this code out and put into the > barrier set though - as another CR. 
https://bugs.openjdk.java.net/browse/JDK-8170020 From tobias.hartmann at oracle.com Mon Nov 21 06:05:19 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 21 Nov 2016 07:05:19 +0100 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: <3bc033fc-a2cb-ee72-1d88-9f6b422620f7@oracle.com> References: <582DA5B2.4020307@oracle.com> <582E21BB.1060704@oracle.com> <582EBCE0.7090506@oracle.com> <3bc033fc-a2cb-ee72-1d88-9f6b422620f7@oracle.com> Message-ID: <58328E9F.5010303@oracle.com> Thanks again, Vladimir! Best regards, Tobias On 18.11.2016 17:09, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 11/18/16 12:33 AM, Tobias Hartmann wrote: >> Thanks for the reviews, Vladimir and Ioi! >> >> As Vladimir suggested, I moved the UseFMA check into TemplateInterpreterGenerator::generate_math_entry(): >> http://cr.openjdk.java.net/~thartmann/8169711/webrev.01/ >> >> Best regards, >> Tobias >> >> On 17.11.2016 22:31, Ioi Lam wrote: >>> Hi Tobias, >>> >>> The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. >>> >>> Thanks >>> - Ioi >>> >>> On 11/17/16 11:34 AM, Vladimir Kozlov wrote: >>>> Hi Tobias, >>>> >>>> It is a little inconsistent. CRC32 intrinsics check their flag in generate_CRC32* methods. >>>> Maybe we should do the same for FMA instead of assert in generate_math_entry() return NULL if flag is false. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 11/17/16 4:42 AM, Tobias Hartmann wrote: >>>>> Hi, >>>>> >>>>> please review the following patch: >>>>> https://bugs.openjdk.java.net/browse/JDK-8169711 >>>>> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >>>>> >>>>> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared.
The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). >>>>> >>>>> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. >>>>> >>>>> The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >>>>> >>>>> I fixed this by always creating the interpreter method entries for intrinsified methods but replace them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. >>>>> >>>>> While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because >>>>> 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and >>>>> 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. 
>>>>> >>>>> I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. >>>>> >>>>> Tested with regression test, JPRT and RBT (running). >>>>> >>>>> Thanks, >>>>> Tobias >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >>>>> >>> From shafi.s.ahmad at oracle.com Mon Nov 21 06:29:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Sun, 20 Nov 2016 22:29:42 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Message-ID: Hi All, May I get the second review on this. I am putting together all the webrevs to make it simple for reviewer. http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ Please note that I tested with jprt, all jtreg and rbt tests. Regards, Shafi > -----Original Message----- > From: Vladimir Kozlov > Sent: Wednesday, November 16, 2016 10:21 PM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Looks good. > > I would suggest to run all jtreg tests (or even RBT) when you apply all > changes before pushing this. > > Thanks, > Vladimir > > On 11/16/16 4:52 AM, Shafi Ahmad wrote: > > Hi Vladimir, > > > > Thank you for the review and feedback. > > > > Please find updated webrevs: > > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed > the test case as it use only jdk9 APIs. 
> > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed > test methods testFixedOffsetHeaderArray17() and > testFixedOffsetHeader17() which referenced jdk9 API > UNSAFE.getIntUnaligned. > > > > > > Regards, > > Shafi > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Wednesday, November 16, 2016 1:00 AM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> Hi Shafi > >> > >> You should not backport tests which use only new JDK 9 APIs. Like > >> TestUnsafeUnalignedMismatchedAccesses.java test. > >> > >> But it is perfectly fine to modify backport by removing part of > >> changes which use a new API. For example, 8162101 changes in > >> OpaqueAccesses.java test which use getIntUnaligned() method. > >> > >> It is unfortunate that 8140309 changes include also code which > >> process new Unsafe Unaligned intrinsics from JDK 9. It should not be > >> backported but it will simplify this and following backports. So I > >> agree with changes you did for > >> 8140309 backport. > >> > >> Thanks, > >> Vladimir > >> > >> On 11/14/16 10:34 PM, Shafi Ahmad wrote: > >>> Hi Vladimir, > >>> > >>> Thanks for the review. > >>> > >>>> -----Original Message----- > >>> > >>>> From: Vladimir Kozlov > >>> > >>>> Sent: Monday, November 14, 2016 11:20 PM > >>> > >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>> > >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>> produces > >>> > >>>> mismatched unsafe accesses > >>> > >>>> > >>> > >>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > >>> > >>>>> Hi Vladimir, > >>> > >>>>> > >>> > >>>>> Thanks for the review. > >>> > >>>>> > >>> > >>>>> Please find updated webrevs. > >>> > >>>>> > >>> > >>>>> All webrevs are with respect to the base changes on JDK-8140309. 
> >>> > >>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > >>> > >>>> > >>> > >>>> Why you kept unaligned parameter in changes? > >>> > >>> The fix of JDK-8136473 caused many problems after integration (see > >>> JDK- > >> 8140267). > >>> > >>> The fix was backed out and re-implemented with JDK-8140309 by > >>> slightly > >> changing the assert: > >>> > >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- > >> Novem > >>> ber/019696.html > >>> > >>> The code change for the fix of JDK-8140309 is code changes for > >>> JDK-8136473 > >> by slightly changing one assert. > >>> > >>> jdk9 original changeset is > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > >>> > >>> As this is a backport so I keep the changes as it is. > >>> > >>>> > >>> > >>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not work > >>>> since > >>> > >>>> since Unsafe class in jdk8 does not have unaligned methods. > >>> > >>>> Hot did you run it? > >>> > >>> I am sorry, looks there is some issue with my testing. > >>> > >>> I have run jtreg test after merging the changes but somehow the test > >>> does > >> not run and I verified only the failing list of jtreg result. > >>> > >>> When I run the test case separately it is failing as you already > >>> pointed out > >> the same. 
> >>> > >>> $java -jar ~/Tools/jtreg/lib/jtreg.jar > >>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ > >>> > >> > hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatched > >> A > >>> ccesses.java > >>> > >>> Test results: failed: 1 > >>> > >>> Report written to > >>> /scratch/shshahma/Java/jdk8u-dev- > >> 8140309_01/JTreport/html/report.html > >>> > >>> Results written to > >>> /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > >>> > >>> Error: > >>> > >>> /scratch/shshahma/Java/jdk8u-dev- > >> 8140309_01/hotspot/test/compiler/intr > >>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: > >>> cannot find symbol > >>> > >>> UNSAFE.putIntUnaligned(array, > >>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > >>> > >>> Not sure if we should push without the test case. > >>> > >>>> > >>> > >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > >>> > >>>> > >>> > >>>> Good. Did you run new UnsafeAccess.java test? > >>> > >>> Due to same process issue the test case is not run and when I run it > >> separately it fails. > >>> > >>> It passes after doing below changes: > >>> > >>> 1. Added /othervm > >>> > >>> 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by > >>> 'import > >> sun.misc.Unsafe;' > >>> > >>> Updated webrev: > >>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > >>> > >>>> > >>> > >>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > >>> > >>> I am getting the similar compilation error as above for added test > >>> case. Not > >> sure if we can push without the test case. > >>> > >>> Regards, > >>> > >>> Shafi > >>> > >>>> > >>> > >>>> Good. 
> >>> > >>>> > >>> > >>>> Thanks, > >>> > >>>> Vladimir > >>> > >>>> > >>> > >>>>> > >>> > >>>>> Regards, > >>> > >>>>> Shafi > >>> > >>>>> > >>> > >>>>> > >>> > >>>>> > >>> > >>>>>> -----Original Message----- > >>> > >>>>>> From: Vladimir Kozlov > >>> > >>>>>> Sent: Friday, November 11, 2016 1:26 AM > >>> > >>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>> > >>> > >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>>> produces > >>> > >>>>>> mismatched unsafe accesses > >>> > >>>>>> > >>> > >>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >>> > >>>>>>> Hi, > >>> > >>>>>>> > >>> > >>>>>>> Please review the backport of following dependent backports. > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 > >>> > >>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 > >>>>>>> [JDK- > >>> > >>>>>> 8080289]. Manual merge is not done as the corresponding code is > >>>>>> not > >>> > >>>>>> there in jdk8u-dev. > >>> > >>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp > >>>>>>> and > >>> > >>>>>>> manual > >>> > >>>>>> merge is done. > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> unaligned unsafe access methods were added in jdk 9 only. In your > >>> > >>>>>> changes unaligned argument is always false. You can simplify > changes. 
> >>> > >>>>>> > >>> > >>>>>> Also you should base changes on JDK-8140309 (original 8136473 > >>>>>> changes > >>> > >>>>>> were backout by 8140267): > >>> > >>>>>> > >>> > >>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >>> > >>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >>> > >>>>>> > > >>> > >>>>>> > Same as 8136473 with only the following change: > >>> > >>>>>> > > >>> > >>>>>> > diff --git a/src/share/vm/opto/library_call.cpp > >>> > >>>>>> b/src/share/vm/opto/library_call.cpp > >>> > >>>>>> > --- a/src/share/vm/opto/library_call.cpp > >>> > >>>>>> > +++ b/src/share/vm/opto/library_call.cpp > >>> > >>>>>> > @@ -2527,7 +2527,7 @@ > >>> > >>>>>> > // of safe & unsafe memory. > >>> > >>>>>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > >>> > >>>>>> > > >>> > >>>>>> > - assert(is_native_ptr || alias_type->adr_type() == > >>> > >>>>>> TypeOopPtr::BOTTOM > >>> > >>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || > >>> > >>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || > >>> > >>>>>> > alias_type->field() != NULL || alias_type->element() != > >>> > >>>>>> NULL, "field, array element or unknown"); > >>> > >>>>>> > bool mismatched = false; > >>> > >>>>>> > if (alias_type->element() != NULL || alias_type->field() != NULL) > { > >>> > >>>>>> > > >>> > >>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >>> > >>>>>> is_native_ptr case and the case where the unsafe method is called > >>>>>> with a > >>> > >>>> null object. > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >>> > >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> > >>>>>>> > >>> > >>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.16 > >>>> 5 > >>> > >>>>>> [JDK-8140309]. 
Manual merge is not done as the corresponding code > >>>>>> is > >>> > >>>>>> not there in jdk8u-dev. > >>> > >>>>>> > >>> > >>>>>> I explained situation with this line above. > >>> > >>>>>> > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> This webrev is not incremental for your 8136473 changes - > >>> > >>>>>> library_call.cpp has part from 8136473 changes. > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> > >>>>>>> Clean merge > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> Thanks seems fine. > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> > >>>> > >>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >>> > >>>>>>> [JDK-8160360] - Resolved 2. > >>> > >>>> > >>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 > >>>>>> 73 > >>> > >>>>>> [JDK-8148146] - Manual merge is not done as the corresponding > >>>>>> code is > >>> > >>>>>> not there in jdk8u-dev. > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> This webrev is not incremental in library_call.cpp. Difficult to > >>>>>> see > >>> > >>>>>> this part of changes. 
> >>> > >>>>>> > >>> > >>>>>> Thanks, > >>> > >>>>>> Vladimir > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >>> > >>>>>>> > >>> > >>>>>>> Testing: jprt and jtreg > >>> > >>>>>>> > >>> > >>>>>>> Regards, > >>> > >>>>>>> Shafi > >>> > >>>>>>> > >>> > >>>>>>>> -----Original Message----- > >>> > >>>>>>>> From: Shafi Ahmad > >>> > >>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM > >>> > >>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >>>>>>>> > >>> > >>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >>> > >>>>>>>> produces mismatched unsafe accesses > >>> > >>>>>>>> > >>> > >>>>>>>> Thanks Vladimir. > >>> > >>>>>>>> > >>> > >>>>>>>> I will create dependent backport of 1. > >>> > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 > >>> > >>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> > >>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>>>>>>> > >>> > >>>>>>>> Regards, > >>> > >>>>>>>> Shafi > >>> > >>>>>>>> > >>> > >>>>>>>>> -----Original Message----- > >>> > >>>>>>>>> From: Vladimir Kozlov > >>> > >>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >>> > >>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>>>>> > >>> > >>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>> > >>>>>>>>> produces mismatched unsafe accesses > >>> > >>>>>>>>> > >>> > >>>>>>>>> Hi Shafi, > >>> > >>>>>>>>> > >>> > >>>>>>>>> You should also consider backporting following related fixes: > >>> > >>>>>>>>> > >>> > >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> > >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>>>>>>>> > >>> > >>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. 
> >>> > >>>>>>>>> > >>> > >>>>>>>>> Thanks, > >>> > >>>>>>>>> Vladimir > >>> > >>>>>>>>> > >>> > >>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>> > >>>>>>>>>> Hi All, > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type > >>>>>>>>>> speculation > >>> > >>>>>>>>>> produces > >>> > >>>>>>>>> mismatched unsafe accesses to jdk8u-dev. > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Please note that backport is not clean and the conflict is due to: > >>> > >>>>>>>>>> > >>> > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>> > >>>>>>>>>> 1 > >>> > >>>>>>>>>> 65 > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Getting debug build failure because of: > >>> > >>>>>>>>>> > >>> > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>> > >>>>>>>>>> 1 > >>> > >>>>>>>>>> 55 > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> The above changes are done under bug# 'JDK-8136473: failed: > >>>>>>>>>> no > >>> > >>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' which > >>>>>>>>> is > >>> > >>>>>>>>> not back ported to jdk8u and the current backport is on top of > >>> > >>>>>>>>> above > >>> > >>>>>> change. > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Please note that I am not sure if there is any dependency > >>> > >>>>>>>>>> between these > >>> > >>>>>>>>> two changesets. 
> >>> > >>>>>>>>>> > >>> > >>>>>>>>>> open webrev: > >>> > >>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>> > >>>>>>>>>> jdk9 bug > >>>>>>>>>> link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >>> > >>>>>>>>>> jdk9 changeset: > >>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> testing: Passes JPRT, jtreg not completed > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Regards, > >>> > >>>>>>>>>> Shafi > >>> > >>>>>>>>>> > >>> From ioi.lam at oracle.com Mon Nov 21 06:58:13 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 20 Nov 2016 22:58:13 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method Message-ID: <58329B05.6070602@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8169867 http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ Thanks to Tobias for finding the bug. I have done the following: + integrated Tobias' suggested fix + fixed Method::restore_unshareable_info to call Method::link_method + added comments and a diagram to illustrate how the CDS method entry trampolines work. BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. It's basically an extra level of indirection to get to the adapter. However, the word "trampoline" is usually used for an extra jump in executable code, so it may be a little confusing when we use it for a data pointer here. Any suggestions for a better name? Testing: [1] I have tested Tobias' TestInterpreterMethodEntries.java class and now it produces the correct assertion. I won't check in this test, though, since it won't assert anymore after Tobias fixes 8169711. 
# after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: # should be correctly set during dump time [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist All tests passed. Thanks - Ioi From erik.helin at oracle.com Mon Nov 21 07:21:05 2016 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 21 Nov 2016 08:21:05 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: <9976738f-1dc6-f3c0-3ec3-6229741b7db0@oracle.com> On 11/18/2016 03:03 PM, Kim Barrett wrote: >> On Nov 15, 2016, at 6:58 PM, Kim Barrett wrote: >> >>> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: >>> >>> Hi Kim, >>> >>> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>>> >>>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>>> om> wrote: >>>>> Maybe it would be cleaner to call a method in the barrier set >>>>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>>>> Maybe as an additional RFE. >>>> We could use _ct_bs->invalidate(dirtyRegion). 
That's rather >>>> overgeneralized and inefficient for this situation, but this >>>> situation should occur *very* rarely; it requires a stale card get >>>> processed just as a humongous object is in the midst of being >>>> allocated in the same region. >>> >>> I kind of think for these reasons we should use _ct_bs->invalidate() as >>> it seems clearer to me. There is the mentioned drawback of having no >>> other more efficient way, so I will let you decide about this. >> >> I've made the change to call invalidate, and also updated some comments. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166607 >> >> Webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ >> incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ >> >> Also, see RFR: 8166811, where I've included a webrev combining the >> latest changes for 8166607 and 8166811, since they are rather >> intertwined. I think I'll do as Erik suggested and push the two >> together. > > Sorry folks, but I want to revert this part and go back to the old > code where it locked the shared queue and enqueued there. > > If the executing invocation of refine_card is from a Java thread, > e.g. this is the "mutator helps with refinement" case, calling > invalidate would enqueue to the current thread's buffer. But that is > effectively a reentrant call to enqueue, and the Java thread case of > enqueue is not reentrant-safe. Only enqueue to the shared queue is > reentrant-safe. > > I think that scenario presently can't happen, since the mutator helps > case is dealt with by the mutator processing it's own buffer. In that > situation, all the cards in the buffer came from writes by this thread > to an object this thread either allocated or has access to, so the > klass must be there. But that's getting uncomfortably subtle in what > is already difficult to analyze code. Agree, lets revert to the old code. Thanks for being so careful about this change. 
> Also, we've talked about changing the mutator helps case to not > immediately process it's own buffer but instead add its buffer to the > pending buffer list and process the next (FIFO ordered) buffer, in > order to let its buffer age. (I have a change for that in my post-JDK > 9 collection of pending changes. The mutator-invoked enqueue might be > reentrant-safe in that change, but I don't think I want to make that > guarantee.) It is hard as it is to keep track of all the synchronization and guarantees spread out in the code to make the card refinement work, so I would prefer to keep it simple as just revert back to the existing code. From tobias.hartmann at oracle.com Mon Nov 21 07:53:32 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 21 Nov 2016 08:53:32 +0100 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <58329B05.6070602@oracle.com> References: <58329B05.6070602@oracle.com> Message-ID: <5832A7FC.8030505@oracle.com> Hi Ioi, this looks good to me, the detailed description including the diagram is very nice and helps to understand the complex implementation! For the record: the test mentioned in [1] is part of my fix for JDK-8169711. Best regards, Tobias On 21.11.2016 07:58, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8169867 > http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ > > Thanks to Tobias for finding the bug. I have done the following > > + integrated Tobias' suggested fix > + fixed Method::restore_unshareable_info to call Method::link_method > + added comments and a diagram to illustrate how the CDS method entry > trampolines work. > > BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. > It's basically an extra level of indirection to get to the adapter. However. 
> The word "trampoline" usually is used for and extra jump in executable code, > so it may be a little confusing when we use it for a data pointer here. > > Any suggest for a better name? > > > Testing: > [1] I have tested Tobias' TestInterpreterMethodEntries.java class and > now it produces the correct assertion. I won't check in this test, though, > since it won't assert anymore after Tobias fixes 8169711. > > # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 > # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: > # should be correctly set during dump time > > [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist > All tests passed. > > Thanks > - Ioi > From goetz.lindenmaier at sap.com Mon Nov 21 11:35:20 2016 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 21 Nov 2016 11:35:20 +0000 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: <1156e3bb13944547977e904fd8d79ccb@DEROTE13DE08.global.corp.sap> Hi, we appreciate this change a lot, and also if /server would go away. I built and tested it on linuxppcle, aixppc and linuxs390. There is still a place that refers to a removed variables and breaks the build: jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME You can probably just replace LIBARCHNAME by ARCH which is set to the same value. I would propose to remove VM_CPU from hotspot/test/test_env.sh after you removed the last place where it is used. (VM_BITS is dead, too.) Best regards, Goetz. 
> -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of Vladimir Kozlov > Sent: Freitag, 18. November 2016 17:41 > To: Erik Joelsson ; build-dev dev at openjdk.java.net>; core-libs-dev ; > hotspot-dev developers > Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > and Solaris images > > Finally! :) > > Hotspot changes looks fine to me. But you missed > hotspot/make/hotspot.script file. > > Our colleges in RH and SAP should test these changes on their platforms. > > Next step would be removal of client/server sub-directories on platforms > where we have only Server JVM (64-bit JDK has only Server JVM). > > Thanks, > Vladimir > > On 11/18/16 7:30 AM, Erik Joelsson wrote: > > Hello, > > > > Please review this change which removes the $ARCH sub directory in the > > lib directory of the runtime images, which is an outstanding issue from > > the new runtime images. Most of the changes are in the build, but there > > are some in hotspot and launcher source. I have verified -testset > > hotspot and default in JPRT as well as tried to run as many jtreg tests > > as possible locally. I could only really find two tests that needed to > > be adjusted. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > > > > /Erik > > From goetz.lindenmaier at sap.com Mon Nov 21 13:10:15 2016 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 21 Nov 2016 13:10:15 +0000 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: Hi, linuxx86_64 has the same issue. 
I tested it with the jdk9/hs repo: jdk/src/java.base/unix/native/libjli/ergo_i586.c: In function ServerClassMachineImpl: jdk/src/java.base/unix/native/libjli/ergo_i586.c:196:30: error: expected ) before LIBARCHNAME Best regards, Goetz > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Montag, 21. November 2016 12:35 > To: 'Vladimir Kozlov' ; Erik Joelsson > ; build-dev ; > core-libs-dev ; hotspot-dev developers > > Subject: RE: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > and Solaris images > > Hi, > > we appreciate this change a lot, and also if /server would go away. > > I built and tested it on linuxppcle, aixppc and linuxs390. > > There is still a place that refers to a removed variables > and breaks the build: > jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME > You can probably just replace LIBARCHNAME by ARCH which is set to > the same value. > > I would propose to remove VM_CPU from hotspot/test/test_env.sh after > you > removed the last place where it is used. (VM_BITS is dead, too.) > > Best regards, > Goetz. > > > -----Original Message----- > > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > > Behalf Of Vladimir Kozlov > > Sent: Freitag, 18. November 2016 17:41 > > To: Erik Joelsson ; build-dev > dev at openjdk.java.net>; core-libs-dev ; > > hotspot-dev developers > > Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > > and Solaris images > > > > Finally! :) > > > > Hotspot changes looks fine to me. But you missed > > hotspot/make/hotspot.script file. > > > > Our colleges in RH and SAP should test these changes on their platforms. > > > > Next step would be removal of client/server sub-directories on platforms > > where we have only Server JVM (64-bit JDK has only Server JVM). 
> > > > Thanks, > > Vladimir > > > > On 11/18/16 7:30 AM, Erik Joelsson wrote: > > > Hello, > > > > > > Please review this change which removes the $ARCH sub directory in the > > > lib directory of the runtime images, which is an outstanding issue from > > > the new runtime images. Most of the changes are in the build, but there > > > are some in hotspot and launcher source. I have verified -testset > > > hotspot and default in JPRT as well as tried to run as many jtreg tests > > > as possible locally. I could only really find two tests that needed to > > > be adjusted. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > > > > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > > > > > > /Erik > > > From erik.joelsson at oracle.com Mon Nov 21 13:26:54 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 21 Nov 2016 14:26:54 +0100 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: Hello Goetz, Thanks for trying this out. Note that the ergo* files were removed in JDK-8169001 which is currently in jdk9/dev but not yet in hs. /Erik On 2016-11-21 14:10, Lindenmaier, Goetz wrote: > Hi, > > linuxx86_64 has the same issue. I tested it with the jdk9/hs repo: > > jdk/src/java.base/unix/native/libjli/ergo_i586.c: In function ServerClassMachineImpl: > jdk/src/java.base/unix/native/libjli/ergo_i586.c:196:30: error: expected ) before LIBARCHNAME > > Best regards, > Goetz > >> -----Original Message----- >> From: Lindenmaier, Goetz >> Sent: Montag, 21. November 2016 12:35 >> To: 'Vladimir Kozlov' ; Erik Joelsson >> ; build-dev ; >> core-libs-dev ; hotspot-dev developers >> >> Subject: RE: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux >> and Solaris images >> >> Hi, >> >> we appreciate this change a lot, and also if /server would go away. 
>> >> I built and tested it on linuxppcle, aixppc and linuxs390. >> >> There is still a place that refers to a removed variable >> and breaks the build: >> jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME >> You can probably just replace LIBARCHNAME by ARCH which is set to >> the same value. >> >> I would propose to remove VM_CPU from hotspot/test/test_env.sh after >> you >> removed the last place where it is used. (VM_BITS is dead, too.) >> >> Best regards, >> Goetz. >> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >>> Behalf Of Vladimir Kozlov >>> Sent: Freitag, 18. November 2016 17:41 >>> To: Erik Joelsson ; build-dev >> dev at openjdk.java.net>; core-libs-dev ; >>> hotspot-dev developers >>> Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux >>> and Solaris images >>> >>> Finally! :) >>> >>> Hotspot changes look fine to me. But you missed >>> hotspot/make/hotspot.script file. >>> >>> Our colleagues at RH and SAP should test these changes on their platforms. >>> >>> Next step would be removal of client/server sub-directories on platforms >>> where we have only Server JVM (64-bit JDK has only Server JVM). >>> >>> Thanks, >>> Vladimir >>> >>> On 11/18/16 7:30 AM, Erik Joelsson wrote: >>>> Hello, >>>> >>>> Please review this change which removes the $ARCH sub directory in the >>>> lib directory of the runtime images, which is an outstanding issue from >>>> the new runtime images. Most of the changes are in the build, but there >>>> are some in hotspot and launcher source. I have verified -testset >>>> hotspot and default in JPRT as well as tried to run as many jtreg tests >>>> as possible locally. I could only really find two tests that needed to >>>> be adjusted. 
>>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 >>>> >>>> Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 >>>> >>>> /Erik >>>> From goetz.lindenmaier at sap.com Mon Nov 21 13:43:21 2016 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 21 Nov 2016 13:43:21 +0000 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: Ah, ok, so this is fine. Best regards, Goetz. > -----Original Message----- > From: Erik Joelsson [mailto:erik.joelsson at oracle.com] > Sent: Montag, 21. November 2016 14:27 > To: Lindenmaier, Goetz ; Vladimir Kozlov > ; build-dev ; > core-libs-dev ; hotspot-dev developers > > Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > and Solaris images > > Hello Goetz, > > Thanks for trying this out. Note that the ergo* files were removed in > JDK-8169001 which is currently in jdk9/dev but not yet in hs. > > /Erik > > > On 2016-11-21 14:10, Lindenmaier, Goetz wrote: > > Hi, > > > > linuxx86_64 has the same issue. I tested it with the jdk9/hs repo: > > > > jdk/src/java.base/unix/native/libjli/ergo_i586.c: In function > ServerClassMachineImpl: > > jdk/src/java.base/unix/native/libjli/ergo_i586.c:196:30: error: expected ) > before LIBARCHNAME > > > > Best regards, > > Goetz > > > >> -----Original Message----- > >> From: Lindenmaier, Goetz > >> Sent: Montag, 21. November 2016 12:35 > >> To: 'Vladimir Kozlov' ; Erik Joelsson > >> ; build-dev ; > >> core-libs-dev ; hotspot-dev > developers > >> > >> Subject: RE: RFR: JDK-8066474: Remove the lib/$ARCH directory from > Linux > >> and Solaris images > >> > >> Hi, > >> > >> we appreciate this change a lot, and also if /server would go away. > >> > >> I built and tested it on linuxppcle, aixppc and linuxs390. 
> >> > >> There is still a place that refers to a removed variable > >> and breaks the build: > >> jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME > >> You can probably just replace LIBARCHNAME by ARCH which is set to > >> the same value. > >> > >> I would propose to remove VM_CPU from hotspot/test/test_env.sh after > >> you > >> removed the last place where it is used. (VM_BITS is dead, too.) > >> > >> Best regards, > >> Goetz. > >> > >>> -----Original Message----- > >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > >>> Behalf Of Vladimir Kozlov > >>> Sent: Freitag, 18. November 2016 17:41 > >>> To: Erik Joelsson ; build-dev >>> dev at openjdk.java.net>; core-libs-dev dev at openjdk.java.net>; > >>> hotspot-dev developers > >>> Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from > Linux > >>> and Solaris images > >>> > >>> Finally! :) > >>> > >>> Hotspot changes look fine to me. But you missed > >>> hotspot/make/hotspot.script file. > >>> > >>> Our colleagues at RH and SAP should test these changes on their platforms. > >>> > >>> Next step would be removal of client/server sub-directories on > platforms > >>> where we have only Server JVM (64-bit JDK has only Server JVM). > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> On 11/18/16 7:30 AM, Erik Joelsson wrote: > >>>> Hello, > >>>> > >>>> Please review this change which removes the $ARCH sub directory in > the > >>>> lib directory of the runtime images, which is an outstanding issue from > >>>> the new runtime images. Most of the changes are in the build, but > there > >>>> are some in hotspot and launcher source. I have verified -testset > >>>> hotspot and default in JPRT as well as tried to run as many jtreg tests > >>>> as possible locally. I could only really find two tests that needed to > >>>> be adjusted. 
> >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > >>>> > >>>> Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > >>>> > >>>> /Erik > >>>> From aph at redhat.com Mon Nov 21 15:15:58 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:15:58 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled Message-ID: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> JavaThread::interp_only_mode is a 32-bit sized field. In the assembly code we read it as a 64-bit xword, causing false positives. This means that as soon as we attach a JVMTI debugger everything runs very slowly. http://cr.openjdk.java.net/~aph/8170098/ From aph at redhat.com Mon Nov 21 15:18:13 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:18:13 +0000 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References Message-ID: In the entry of TemplateInterpreterGenerator::generate_Reference_get_entry the sender's SP is saved in r13, a call-clobbered register. We need to save it in a register which is not call-clobbered when we call g1_write_barrier_pre(). It would be better to convert all usages of r13 as senderSP to r19, but this is less risky. I'll do it in JDK 10. http://cr.openjdk.java.net/~aph/8170100/ Andrew. From adinn at redhat.com Mon Nov 21 15:21:08 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 21 Nov 2016 15:21:08 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: <57a5fe39-1f0b-3c0c-c2ee-409769a32c0d@redhat.com> On 21/11/16 15:15, Andrew Haley wrote: > JavaThread::interp_only_mode is a 32-bit sized field. In the assembly > code we read it as a 64-bit xword, causing false positives. This > means that as soon as we attach a JVMTI debugger everything runs very > slowly. 
> > http://cr.openjdk.java.net/~aph/8170098/ Looks good (not an official review). regards, Andrew Dinn ----------- From aph at redhat.com Mon Nov 21 15:22:16 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:22:16 +0000 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues Message-ID: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> JVMCI nearly works, but there are multiple minor bugs which make it non-functional. It's not possible to separate these into multiple issues, so this is a composite patch. The handling of some relocs is wrong. Narrow klasses and OOPs have only partial support, returning Unimplemented() Register numbering for float registers is wrong Scratch registers r8 and r9 aren't marked as non-allocatable. http://cr.openjdk.java.net/~aph/8170106 Andrew. From adinn at redhat.com Mon Nov 21 15:23:29 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 21 Nov 2016 15:23:29 +0000 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References In-Reply-To: References: Message-ID: On 21/11/16 15:18, Andrew Haley wrote: > In the entry of > TemplateInterpreterGenerator::generate_Reference_get_entry the > sender's SP is saved in r13, a call-clobbered register. We need to > save it in a register which is not call-clobbered when we call > g1_write_barrier_pre(). > > It would be better to convert all usages of r13 as senderSP to r19, > but this is less risky. I'll do it in JDK 10. > > http://cr.openjdk.java.net/~aph/8170100/ Looks good to me (not an official review). I agree that postponing the full change to JDK10 is wiser. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Mon Nov 21 15:34:07 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:34:07 +0000 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References In-Reply-To: References: Message-ID: <569efbb2-9292-e789-8738-c55ad521ab04@redhat.com> On 21/11/16 15:23, Andrew Dinn wrote: > Looks good to me (not an official review). Shouldn't you be a JDK9 reviewer by now? IMO you have enough experience. I'll propose you if you like. Andrew. From adinn at redhat.com Mon Nov 21 15:50:37 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 21 Nov 2016 15:50:37 +0000 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues In-Reply-To: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> References: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> Message-ID: <8456c3d4-bbc4-e6d9-d98b-2f185a630bf3@redhat.com> On 21/11/16 15:22, Andrew Haley wrote: > JVMCI nearly works, but there are multiple minor bugs which make it > non-functional. It's not possible to separate these into multiple > issues, so this is a composite patch. > > The handling of some relocs is wrong. > Narrow klasses and OOPs have only partial support, returning Unimplemented() > Register numbering for float registers is wrong > Scratch registers r8 and r9 aren't marked as non-allocatable. > > http://cr.openjdk.java.net/~aph/8170106 I guess I probably ought to review this. All the code changes look sensible and appear correct by eyeball. Whether they are really needed or, indeed, are /all/ that is needed is far from obvious. I could at least build the tree and test that it runs ok. Are you able to provide any special instructions needed to achieve that? (esp the latter). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Mon Nov 21 15:52:00 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:52:00 +0000 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues In-Reply-To: <8456c3d4-bbc4-e6d9-d98b-2f185a630bf3@redhat.com> References: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> <8456c3d4-bbc4-e6d9-d98b-2f185a630bf3@redhat.com> Message-ID: <4121dae2-d69f-abc4-f425-973cc4515e52@redhat.com> On 21/11/16 15:50, Andrew Dinn wrote: > I guess I probably ought to review this. All the code changes look > sensible and appear correct by eyeball. Whether they are really needed > or, indeed, are /all/ that is needed is far from obvious. I could at > least build the tree and test that it runs ok. Are you able to provide > any special instructions needed to achieve that? (esp the latter). It'll be tricky without the Graal patches which are needed to make things run. Andrew. From doug.simon at oracle.com Mon Nov 21 15:54:27 2016 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 21 Nov 2016 16:54:27 +0100 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues In-Reply-To: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> References: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> Message-ID: <29BBCC34-9BF6-41C4-BE14-6ED90639A7E6@oracle.com> (including hotspot-compiler-dev) > On 21 Nov 2016, at 16:22, Andrew Haley wrote: > > JVMCI nearly works, but there are multiple minor bugs which make it > non-functional. It's not possible to separate these into multiple > issues, so this is a composite patch. > > The handling of some relocs is wrong. > Narrow klasses and OOPs have only partial support, returning Unimplemented() > Register numbering for float registers is wrong > Scratch registers r8 and r9 aren't marked as non-allocatable. > > http://cr.openjdk.java.net/~aph/8170106 > > Andrew. 
From kirill.zhaldybin at oracle.com Mon Nov 21 16:38:37 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 21 Nov 2016 19:38:37 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> Message-ID: <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> Marcus, Thank you for reviewing the fix! >> WebRev: >> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ > > ISO8601 says the decimal point can be either '.' or ',' so the test > should accept either. You could let sscanf read out the decimal point > as a character and just verify that it is one of the two. > > In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that > we won't accept "Z" suffixed strings. Please revert that. I agree that ISO8601 allows adding "Z" to the time (and as far as I understand date/time without delimiters is legal too) but these are unit tests. Hence they cover the existing code and should pass only if the result corresponds to the existing code, and fail otherwise. The current code in os::iso8601_time formats the date/time string as %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d, so we should not consider any other format as valid. Could you please let me know your opinion? Thank you. Regards, Kirill > > Thanks, > Marcus > >> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >> >> Thank you. >> >> Regards, Kirill > From rwestrel at redhat.com Mon Nov 21 17:24:47 2016 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 21 Nov 2016 18:24:47 +0100 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: > http://cr.openjdk.java.net/~aph/8170098/ Looks good to me. Roland. 
From rwestrel at redhat.com Mon Nov 21 17:25:56 2016 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 21 Nov 2016 18:25:56 +0100 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References In-Reply-To: References: Message-ID: > http://cr.openjdk.java.net/~aph/8170100/ That looks good to me. Roland. From vladimir.kozlov at oracle.com Mon Nov 21 17:48:39 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Nov 2016 09:48:39 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <58329B05.6070602@oracle.com> References: <58329B05.6070602@oracle.com> Message-ID: On 11/20/16 10:58 PM, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8169867 > http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ Looks good to me. > > > Thanks to Tobias for finding the bug. I have done the following > > + integrated Tobias' suggested fix > + fixed Method::restore_unshareable_info to call Method::link_method > + added comments and a diagram to illustrate how the CDS method entry > trampolines work. > > BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. > It's basically an extra level of indirection to get to the adapter. > However, > the word "trampoline" is usually used for an extra jump in executable > code, > so it may be a little confusing when we use it for a data pointer here. > > Any suggestion for a better name? _adapter_cds_entry ? Thanks, Vladimir > > > Testing: > [1] I have tested Tobias' TestInterpreterMethodEntries.java class and > now it produces the correct assertion. I won't check in this test, > though, > since it won't assert anymore after Tobias fixes 8169711. 
> > # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error > (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), > pid=16840, tid=16843 > # assert(entry != __null && entry == _i2i_entry && entry == > _from_interpreted_entry) failed: > # should be correctly set during dump time > > [2] Ran RBT in fastdebug build for > hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist > All tests passed. > > Thanks > - Ioi > From dmitry.samersoff at oracle.com Mon Nov 21 18:46:53 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 21 Nov 2016 21:46:53 +0300 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: Andrew, Should the code in MethodHandles::jump_from_method_handle() be changed as well? -Dmitry On 2016-11-21 18:15, Andrew Haley wrote: > JavaThread::interp_only_mode is a 32-bit sized field. In the assembly code we read it as a 64-bit xword, causing false positives. This means that as soon as we attach a JVMTI debugger everything runs very slowly. > > http://cr.openjdk.java.net/~aph/8170098/ > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From aph at redhat.com Mon Nov 21 19:00:37 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 19:00:37 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> Hi, On 21/11/16 18:46, Dmitry Samersoff wrote: > Should the code in MethodHandles::jump_from_method_handle() be changed > as well? I can't see where. 
We don't seem to be calling a native function in there. Can you tell me more about the code path you have in mind? Thanks, Andrew. From dmitry.samersoff at oracle.com Mon Nov 21 19:17:10 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 21 Nov 2016 22:17:10 +0300 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> Message-ID: <42c88db7-2bc6-f0e4-5090-a56e53c28eab@oracle.com> On 2016-11-21 22:00, Andrew Haley wrote: > Hi, > > On 21/11/16 18:46, Dmitry Samersoff wrote: > >> Should the code in MethodHandles::jump_from_method_handle() be changed >> as well? > > I can't see where. We don't seem to be calling a native function in > there. > Can you tell me more about the code path you have in mind? methodHandles_aarch64.cpp:106 __ ldrb(rscratch1, Address(rthread, JavaThread::interp_only_mode_offset())); __ cbnz(rscratch1, run_compiled_code); -Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From gromero at linux.vnet.ibm.com Tue Nov 22 00:27:10 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:27:10 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> Message-ID: <583390DE.5050406@linux.vnet.ibm.com> Hi Joe, On 17-11-2016 19:33, joe darcy wrote: >>>> Currently, optimization for building fdlibm is disabled, except for the >>>> "solaris" OS target [1]. 
>>> The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the >>> Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. >> oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm >> optimization is off even for x86_64 as it, AFAICS regarding gcc 5 only, does >> not affect the precision, even if setting -O3 does not improve the performance >> as much as on PPC64. > The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values > of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul > of these fdlibm coding practices. On discussing with the Power toolchain folks we narrowed down the issue on PPC64 to the FMA. -fno-strict-aliasing has no effect and when used with an aggressive optimization does not solve the issue on precision. Thus -ffp-contract=off is the best option we have for now to optimize the fdlibm on PPC64. >>> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases. >> I agree. It's just that the issue on StrictMath methods was first noted due to >> that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64. > Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. 
In particular, if > Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that uses platform-specific features (say, fused multiply-add instructions) then just compiling fdlibm more aggressively wouldn't > necessarily make up that gap. In our case (PPC64) it does close the gap. Non-optimized code will suffer a lot, for instance, from load-hit-store issues. Contrary to what happens on PPC64, the gap on x64 seems to be quite small as you said. > > To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were > to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior of all those methods, but such an approach was much less tractable technically 20+ years ago > without benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. > >> >> >>> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking to see if the optimized versions are indeed equivalent to the non-optimized ones. >>> The specification of StrictMath requires a particular result for each set of floating-point arguments and sums round away low-order bits that differ. >> That's a really good point, thanks for letting me know about that. I'll re-test my >> change under that perspective. >> >> >>> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area. >> Got it. By "the JDK math library regression tests" you mean exactly which test >> suite? the jtreg tests? > Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where 
> > A note on methodologies, when I've been writing test for my port I've tried to include test cases that exercise all the branches point in the code. Due to the large input space (~2^64 for a > single-argument method), random sampling alone is an inefficient way to try to find differences in behavior. >> For testing against JCK/TCK I'll need some help on that. >> > > I believe the JCK/TCK does have additional testcases relevant here. > > HTH; thanks, > > -Joe > Thank you very much for the valuable comments. I'll send a webrev accordingly for review. I filed a bug: https://bugs.openjdk.java.net/browse/JDK-8170153 Best regards, Gustavo From gromero at linux.vnet.ibm.com Tue Nov 22 00:34:37 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:34:37 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> Message-ID: <5833929D.9000602@linux.vnet.ibm.com> Hi Chris, On 17-11-2016 19:48, Chris Plummer wrote: >> The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume >> values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would >> run afoul of these fdlibm coding practices. > This is the strict aliasing issue right? It's a long standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more > than 12 years since I last dealt with fdlibm and compiler aliasing issues. 
I've tested with -O3 and -fno-strict-aliasing as you suggested but it did not fix the fp precision issue on PPC64. After finding that -fno-expensive-optimizations solved the problem, we narrowed down the problem to the FMA: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 Thank you. Regards, Gustavo From gromero at linux.vnet.ibm.com Tue Nov 22 00:41:34 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:41:34 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> Message-ID: <5833943E.9010807@linux.vnet.ibm.com> Hi Derek, On 17-11-2016 20:47, White, Derek wrote: > Hi Joe, > > Although neither a floating point expert (as I think I've proven to you over the years), nor a gcc expert, I checked with our in-house gcc expert and got the following answer: > > "Yes using -fno-strict-aliasing fixes the issues. Also there are many forks of fdlibm which has this fixed including the code inside glibc. " I've tried -O3 and -fno-strict-aliasing on PPC64 but it didn't work. Disabling the FMA fixed the issue, though. Do you know if the gap between Math and StrictMath is also huge on aarch64? Thank you. 
Regards, Gustavo From joe.darcy at oracle.com Tue Nov 22 00:42:10 2016 From: joe.darcy at oracle.com (joe darcy) Date: Mon, 21 Nov 2016 16:42:10 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5833929D.9000602@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> <5833929D.9000602@linux.vnet.ibm.com> Message-ID: <8d21cafc-a4f5-0ed7-1f8f-4c40ccf4ecbe@oracle.com> Hello, On 11/21/2016 4:34 PM, Gustavo Romero wrote: > Hi Chris, > > On 17-11-2016 19:48, Chris Plummer wrote: >>> The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume >>> values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would >>> run afoul of these fdlibm coding practices. >> This is the strict aliasing issue, right? It's a long-standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more >> than 12 years since I last dealt with fdlibm and compiler aliasing issues. > I've tested with -O3 and -fno-strict-aliasing as you suggested but it did not > fix the fp precision issue on PPC64. > > After finding that -fno-expensive-optimizations solved the problem, we narrowed > down the problem to the FMA: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > That makes sense; an FMA will by its nature provide different results than separate (unfused) multiply and add operations. While the polynomials used in fdlibm would benefit performance-wise from implicit replacement with FMA, such a replacement would violate the StrictMath contract. 
Therefore, if FDLIBM is left in C sources, it must be compiled in such a way that FMA is *not* substituted for multiply and add. Thanks, -Joe From gromero at linux.vnet.ibm.com Tue Nov 22 00:43:49 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:43:49 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation Message-ID: <583394C5.3030206@linux.vnet.ibm.com> Hi, Could the following change be reviewed, please? webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ bug: https://bugs.openjdk.java.net/browse/JDK-8170153 It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds up the StrictMath methods (in some cases up to 3x) on that platform. On PPC64 fdlibm optimization can be done without precision issues if floating-point expression contraction is disabled, i.e. if the compiler does not use floating-point multiply-add (FMA). For further details please refer to gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 No regression was observed on Math and StrictMath tests: Passed: java/lang/Math/AbsPositiveZero.java Passed: java/lang/Math/Atan2Tests.java Passed: java/lang/Math/CeilAndFloorTests.java Passed: java/lang/Math/CubeRootTests.java Passed: java/lang/Math/DivModTests.java Passed: java/lang/Math/ExactArithTests.java Passed: java/lang/Math/Expm1Tests.java Passed: java/lang/Math/FusedMultiplyAddTests.java Passed: java/lang/Math/HyperbolicTests.java Passed: java/lang/Math/HypotTests.java Passed: java/lang/Math/IeeeRecommendedTests.java Passed: java/lang/Math/Log10Tests.java Passed: java/lang/Math/Log1pTests.java Passed: java/lang/Math/MinMax.java Passed: java/lang/Math/MultiplicationTests.java Passed: java/lang/Math/PowTests.java Passed: java/lang/Math/Rint.java Passed: java/lang/Math/RoundTests.java Passed: java/lang/Math/SinCosCornerCasesTests.java Passed: java/lang/Math/TanTests.java Passed: 
java/lang/Math/WorstCaseTests.java Test results: passed: 21 Passed: java/lang/StrictMath/CubeRootTests.java Passed: java/lang/StrictMath/ExactArithTests.java Passed: java/lang/StrictMath/Expm1Tests.java Passed: java/lang/StrictMath/HyperbolicTests.java Passed: java/lang/StrictMath/HypotTests.java Passed: java/lang/StrictMath/Log10Tests.java Passed: java/lang/StrictMath/Log1pTests.java Passed: java/lang/StrictMath/PowTests.java Test results: passed: 8 and also on the following hotspot tests: Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java Passed: compiler/intrinsics/mathexact/AddExactICondTest.java Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java Passed: compiler/intrinsics/mathexact/CompareTest.java Passed: compiler/intrinsics/mathexact/DecExactITest.java Passed: 
compiler/intrinsics/mathexact/DecExactLTest.java Passed: compiler/intrinsics/mathexact/GVNTest.java Passed: compiler/intrinsics/mathexact/IncExactITest.java Passed: compiler/intrinsics/mathexact/IncExactLTest.java Passed: compiler/intrinsics/mathexact/MulExactICondTest.java Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactILoadTest.java Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java Passed: compiler/intrinsics/mathexact/SubExactICondTest.java Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java Test results: passed: 50 Thank you. 
Regards, Gustavo From chris.plummer at oracle.com Tue Nov 22 01:33:08 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 21 Nov 2016 17:33:08 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583390DE.5050406@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <583390DE.5050406@linux.vnet.ibm.com> Message-ID: On 11/21/16 4:27 PM, Gustavo Romero wrote: > Hi Joe, > > On 17-11-2016 19:33, joe darcy wrote: >>>>> Currently, optimization for building fdlibm is disabled, except for the >>>>> "solaris" OS target [1]. >>>> The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the >>>> Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. >>> oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm >>> optimization is off even for x86_64 as it, AFAICS regarding gcc 5 only, does >>> not affect the precision, even if setting -O3 does not improve the performance >>> as much as on PPC64. >> The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values >> of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul >> of these fdlibm coding practices. > On discussing with the Power toolchain folks we narrowed down the issue on PPC64 > to the FMA. -fno-strict-aliasing has no effect and when used with an aggressive > optimization does not solve the issue on precision. Thus -ffp-contract=off is > the best option we have for now to optimize the fdlibm on PPC64. Ah! 
I should have thought of this. I dealt with fdlibm FMA issues on ppc about 15 years ago. At the time -mno-fused-madd was the solution. I don't think -ffp-contract=off existed back then. Chris > > >>>> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases. >>> I agree. It's just that the issue on StrictMath methods was first noted due to >>> that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64. >> Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. In particular, if >> Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that used platform-specific features (say fused multiply add instructions) then just compiling fdlibm more aggressively wouldn't >> necessarily make up that gap. > In our case (PPC64) it does close the gap. Non-optimized code will suffer a lot, > for instance, from load-hit-store issues. Contrary to what happens on PPC64, the > gap on x64 seems to be quite small as you said. > > >> To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were >> to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior of all those methods, but such an approach was much less tractable technically 20+ years ago >> without the benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. >> >>> >>>> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking to see if the optimized versions are indeed equivalent to the non-optimized ones. 
>>>> The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ. >>> That's a really good point, thanks for letting me know about that. I'll re-test my >>> change under that perspective. >>> >>> >>>> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area. >>> Got it. By "the JDK math library regression tests" you mean exactly which test >>> suite? The jtreg tests? >> Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where >> they are offhand. >> >> A note on methodology: when I've been writing tests for my port I've tried to include test cases that exercise all the branch points in the code. Due to the large input space (~2^64 for a >> single-argument method), random sampling alone is an inefficient way to try to find differences in behavior. >>> For testing against JCK/TCK I'll need some help on that. >>> >> I believe the JCK/TCK does have additional testcases relevant here. >> >> HTH; thanks, >> >> -Joe >> > Thank you very much for the valuable comments. > > I'll send a webrev accordingly for review. > > I filed a bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > > Best regards, > Gustavo > From jiangli.zhou at Oracle.COM Tue Nov 22 04:33:02 2016 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Mon, 21 Nov 2016 20:33:02 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <58329B05.6070602@oracle.com> References: <58329B05.6070602@oracle.com> Message-ID: <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> Hi Ioi, Looks good. I also have one suggestion. To make it a little easier to read, could you please fold the 'else if (_i2i_entry != NULL)' block starting at line 1039 and 'if (!is_shared())' 
block starting at line 1047 into one block? 1032 if (is_shared()) { 1033 address entry = Interpreter::entry_for_cds_method(h_method); 1034 assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, 1035 "should be correctly set during dump time"); 1036 if (adapter() != NULL) { 1037 return; 1038 } 1039 } else if (_i2i_entry != NULL) { 1040 return; 1041 } 1042 assert( _code == NULL, "nothing compiled yet" ); 1043 1044 // Setup interpreter entrypoint 1045 assert(this == h_method(), "wrong h_method()" ); 1046 1047 if (!is_shared()) { 1048 assert(adapter() == NULL, "init'd to NULL"); 1049 address entry = Interpreter::entry_for_method(h_method); 1050 assert(entry != NULL, "interpreter entry must be non-null"); 1051 // Sets both _i2i_entry and _from_interpreted_entry 1052 set_interpreter_entry(entry); 1053 } Thanks, Jiangli > On Nov 20, 2016, at 10:58 PM, Ioi Lam wrote: > > https://bugs.openjdk.java.net/browse/JDK-8169867 > http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ > > Thanks to Tobias for finding the bug. I have done the following > > + integrated Tobias' suggested fix > + fixed Method::restore_unshareable_info to call Method::link_method > + added comments and a diagram to illustrate how the CDS method entry > trampolines work. > > BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. > It's basically an extra level of indirection to get to the adapter. However, > the word "trampoline" is usually used for an extra jump in executable code, > so it may be a little confusing when we use it for a data pointer here. > > Any suggestions for a better name? > > > Testing: > [1] I have tested Tobias' TestInterpreterMethodEntries.java class and > now it produces the correct assertion. I won't check in this test, though, > since it won't assert anymore after Tobias fixes 8169711. 
> > # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 > # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: > # should be correctly set during dump time > > [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist > All tests passed. > > Thanks > - Ioi > From ioi.lam at oracle.com Tue Nov 22 07:05:42 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 21 Nov 2016 23:05:42 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> References: <58329B05.6070602@oracle.com> <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> Message-ID: <5833EE46.8020304@oracle.com> On 11/21/16 8:33 PM, Jiangli Zhou wrote: > Hi Ioi, > > Looks good. > > I also have one suggestion. To make it a little easier to read, > could you please fold the 'else if (_i2i_entry != NULL)' block > starting at line 1039 and 'if (!is_shared())' block starting at line > 1047 into one block? 
> 1032 if (is_shared()) { > 1033 address entry = Interpreter::entry_for_cds_method(h_method); > 1034 assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, > 1035 "should be correctly set during dump time"); > 1036 if (adapter() != NULL) { > 1037 return; > 1038 } > 1039 } else if (_i2i_entry != NULL) { > 1040 return; > 1041 } > 1042 assert( _code == NULL, "nothing compiled yet" ); > 1043 > 1044 // Setup interpreter entrypoint > 1045 assert(this == h_method(), "wrong h_method()" ); > 1046 > 1047 if (!is_shared()) { > 1048 assert(adapter() == NULL, "init'd to NULL"); > 1049 address entry = Interpreter::entry_for_method(h_method); > 1050 assert(entry != NULL, "interpreter entry must be non-null"); > 1051 // Sets both _i2i_entry and _from_interpreted_entry > 1052 set_interpreter_entry(entry); > 1053 } Hi Jiangli, The line assert( _code == NULL, "nothing compiled yet" ); is necessary before we call set_interpreter_entry(entry); That's because the _from_interpreted_entry would be different if the method has been compiled. So this means I cannot simply move the block starting at #1047 to above #1042. Thanks - Ioi > Thanks, > Jiangli > >> On Nov 20, 2016, at 10:58 PM, Ioi Lam wrote: >> >> https://bugs.openjdk.java.net/browse/JDK-8169867 >> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >> >> Thanks to Tobias for finding the bug. I have done the following >> >> + integrated Tobias' suggested fix >> + fixed Method::restore_unshareable_info to call Method::link_method >> + added comments and a diagram to illustrate how the CDS method entry >> trampolines work. >> >> BTW, I am a little unhappy about the name >> ConstMethod::_adapter_trampoline. >> It's basically an extra level of indirection to get to the adapter. >> However, >> the word "trampoline" is usually used for an extra jump in >> executable code, >> so it may be a little confusing when we use it for a data pointer here. 
>> >> Any suggestions for a better name? >> >> >> Testing: >> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >> now it produces the correct assertion. I won't check in this test, >> though, >> since it won't assert anymore after Tobias fixes 8169711. >> >> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error >> (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), >> pid=16840, tid=16843 >> # assert(entry != __null && entry == _i2i_entry && entry == >> _from_interpreted_entry) failed: >> # should be correctly set during dump time >> >> [2] Ran RBT in fastdebug build for >> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >> All tests passed. >> >> Thanks >> - Ioi >> > From aph at redhat.com Tue Nov 22 10:08:39 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 22 Nov 2016 10:08:39 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <42c88db7-2bc6-f0e4-5090-a56e53c28eab@oracle.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> <42c88db7-2bc6-f0e4-5090-a56e53c28eab@oracle.com> Message-ID: On 21/11/16 19:17, Dmitry Samersoff wrote: > On 2016-11-21 22:00, Andrew Haley wrote: >> >> On 21/11/16 18:46, Dmitry Samersoff wrote: >> >>> Should the code in MethodHandles::jump_from_method_handle() be changed >>> as well? >> >> I can't see where. We don't seem to be calling a native function in >> there. >> Can you tell me more about the code path you have in mind? > > methodHandles_aarch64.cpp:106 > > __ ldrb(rscratch1, Address(rthread, JavaThread::interp_only_mode_offset())); > > __ cbnz(rscratch1, run_compiled_code); Oh, I see. I guess it would have been a good idea for me to change this, but unless we see a big-endian ARM it doesn't matter. Thanks, Andrew. 
From trevor.d.watson at oracle.com Tue Nov 22 10:25:26 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Tue, 22 Nov 2016 10:25:26 +0000 Subject: Ping: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: References: Message-ID: <42f837bb-bb59-1dab-14fc-578cd95de101@oracle.com> On 15/11/16 11:57, Trevor Watson wrote: > I have implemented the code to use the lzcnt instruction for both > integer and long countLeadingZeros() methods on SPARC platforms > supporting the vis3 instruction set. > > Current "bmi" tests for the above are updated so that they run on both > SPARC and x86 platforms. > > I've also implemented a test to ensure that Integer.countLeadingZeros() > and Long.countLeadingZeros() return the correct values when C2 runs. > This test is currently under the intrinsics "bmi" tests for want of > somewhere better (they do apply to both SPARC and x86 though). > > http://cr.openjdk.java.net/~alanbur/8162865/ > > Thanks, > Trevor From marcus.larsson at oracle.com Tue Nov 22 12:32:21 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Tue, 22 Nov 2016 13:32:21 +0100 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> Message-ID: Hi, On 2016-11-21 17:38, Kirill Zhaldybin wrote: > Marcus, > > Thank you for reviewing the fix! >>> WebRev: >>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >> >> ISO8601 says the decimal point can be either '.' or ',' so the test >> should accept either. You could let sscanf read out the decimal point >> as a character and just verify that it is one of the two. >> >> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that >> we won't accept "Z" suffixed strings. Please revert that. 
> I agree that ISO8601 could add "Z" to time (and as far as I understand > date/time without delimiters is legal too) but these are the unit tests. > Hence they cover the existing code and they should pass only if the > result corresponds to existing code and fail otherwise. > The current code from os::iso8601_time format date/time string > %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not consider > any other format as valid. > > Could you please let me know your opinion? I think the test should verify the intended behavior, not the implementation. If we refactor or change something in iso8601_time() we shouldn't be failing the test if it still conforms to ISO8601, IMO. Thanks, Marcus > > Thank you. > > Regards, Kirill > >> >> Thanks, >> Marcus >> >>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>> >>> Thank you. >>> >>> Regards, Kirill >> > From kirill.zhaldybin at oracle.com Tue Nov 22 13:24:07 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Tue, 22 Nov 2016 16:24:07 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> Message-ID: <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> Marcus, Thank you for prompt reply! Could you please read comments inline? I'm looking forward to your reply. Thank you. Regards, Kirill On 22.11.2016 15:32, Marcus Larsson wrote: > Hi, > > > On 2016-11-21 17:38, Kirill Zhaldybin wrote: >> Marcus, >> >> Thank you for reviewing the fix! >>>> WebRev: >>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>> >>> ISO8601 says the decimal point can be either '.' or ',' so the test >>> should accept either. You could let sscanf read out the decimal >>> point as a character and just verify that it is one of the two. 
>>> >>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that >>> we won't accept "Z" suffixed strings. Please revert that. >> I agree that ISO8601 could add "Z" to time (and as far as I >> understand date/time without delimiters is legal too) but these are >> the unit tests. >> Hence they cover the existing code and they should pass only if the >> result corresponds to existing code and fail otherwise. >> The current code from os::iso8601_time format date/time string >> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >> consider any other format as valid. >> >> Could you please let me know your opinion? > > I think the test should verify the intended behavior, not the > implementation. If we refactor or change something in iso8601_time() > we shouldn't be failing the test if it still conforms to ISO8601, IMO. I would agree with you if we were talking about a functional test. But since it is an unit test I think we should keep it as close to implementation as possible. If the implementation is changed unintentionally the test fails and signals us that something is broken. If it is an intentional change the test must be updated correspondingly. > > Thanks, > Marcus > >> >> Thank you. >> >> Regards, Kirill >> >>> >>> Thanks, >>> Marcus >>> >>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>> >> > From stefan.karlsson at oracle.com Tue Nov 22 14:54:55 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 15:54:55 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist Message-ID: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Hi all, Please, review this patch to fix a bug in ChunkManager::list_index(): http://cr.openjdk.java.net/~stefank/8169931/webrev.01 There's a great description of the bug in the bug report: https://bugs.openjdk.java.net/browse/JDK-8169931 There are two conceptual parts of the metaspace. 
The _class_ metaspace, and the _non-class_ metaspace. They have different chunk sizes, and while querying for the list index of a humongous chunk in the class metaspace, the code accidentally matched the size against the MediumChunk size of the non-class metaspace. I've changed the code to not query against the global ChunkSizes enum, but rather the values stored inside the ChunkManager instances. Therefore, the list_index() function was changed into an instance method. I've written a unit test that provoked the bug. It's a simplified test with vm asserts instead of gtest asserts. The reason is that the ChunkManager class is currently located in metaspace.cpp, and is not accessible from the gtest unit tests. Testing: jprt, Kitchensink, parallel class loading tests Thanks, StefanK From mikael.gerdin at oracle.com Tue Nov 22 16:08:24 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 22 Nov 2016 17:08:24 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: <41dc02b9-edee-0060-07c4-2fd8220cbda7@oracle.com> Hi Stefan, On 2016-11-22 15:54, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 The change looks good to me. /Mikael > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ metaspace, > and the _non-class_ metaspace. They have different chunk sizes, and > while querying for the list index of a humongous chunk in the class > metaspace, the code accidentally matched the size against the > MediumChunk size of the non-class metaspace. 
> > I've changed the code to not query against the global ChunkSizes enum, > but rather the values stored inside the ChunkManager instances. > Therefore, the list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK From thomas.stuefe at gmail.com Tue Nov 22 17:09:12 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 22 Nov 2016 18:09:12 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: Hi Stefan, this change looks good! Small nitpick: there already exists a function returning a pointer to the free list by chunk index (ChunkManager::free_chunks(index)). You could have implemented ChunkManager::list_chunk_size() using this function (return free_chunks(index)->size()) and add your assert to ChunkManager::free_chunks(index) instead. Or, alternatively, just use free_chunks(index)->size() directly instead of adding list_chunk_size(). Kind Regards, Thomas On Tue, Nov 22, 2016 at 3:54 PM, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ metaspace, > and the _non-class_ metaspace. 
They have different chunk sizes, and while > querying for the list index of a humongous chunk in the class metaspace, > the code accidentally matched the size against the MediumChunk size of the > non-class metaspace. > > I've changed the code to not query against the global ChunkSizes enum, but > rather the values stored inside the ChunkManager instances. Therefore, the > list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK > From jiangli.zhou at oracle.com Tue Nov 22 17:55:51 2016 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 22 Nov 2016 09:55:51 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <5833EE46.8020304@oracle.com> References: <58329B05.6070602@oracle.com> <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> <5833EE46.8020304@oracle.com> Message-ID: <70FCC942-9363-489D-8CDC-6441CB246953@oracle.com> Hi Ioi, > On Nov 21, 2016, at 11:05 PM, Ioi Lam wrote: > > > > On 11/21/16 8:33 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> Looks good. >> >> I also have one suggestion. To make it a little easier to read, could you please fold the 'else if (_i2i_entry != NULL)' block starting at line 1039 and 'if (!is_shared())' block starting at line 1047 into one block? 
>> 1032 if (is_shared()) { >> 1033 address entry = Interpreter::entry_for_cds_method(h_method); >> 1034 assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, >> 1035 "should be correctly set during dump time"); >> 1036 if (adapter() != NULL) { >> 1037 return; >> 1038 } >> 1039 } else if (_i2i_entry != NULL) { >> 1040 return; >> 1041 } >> 1042 assert( _code == NULL, "nothing compiled yet" ); >> 1043 >> 1044 // Setup interpreter entrypoint >> 1045 assert(this == h_method(), "wrong h_method()" ); >> 1046 >> 1047 if (!is_shared()) { >> 1048 assert(adapter() == NULL, "init'd to NULL"); >> 1049 address entry = Interpreter::entry_for_method(h_method); >> 1050 assert(entry != NULL, "interpreter entry must be non-null"); >> 1051 // Sets both _i2i_entry and _from_interpreted_entry >> 1052 set_interpreter_entry(entry); >> 1053 } > > Hi Jiangli, > > The line > > assert( _code == NULL, "nothing compiled yet" ); > > is necessary before we call > > set_interpreter_entry(entry); > > That's because the _from_interpreted_entry would be different if the method has been compiled. > > So this means I cannot simply move the block starting at #1047 to above #1042. Ok. Thanks, Jiangli > > Thanks > - Ioi > >> Thanks, >> Jiangli >> >>> On Nov 20, 2016, at 10:58 PM, Ioi Lam > wrote: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>> >>> Thanks to Tobias for finding the bug. I have done the following >>> >>> + integrated Tobias' suggested fix >>> + fixed Method::restore_unshareable_info to call Method::link_method >>> + added comments and a diagram to illustrate how the CDS method entry >>> trampolines work. >>> >>> BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. >>> It's basically an extra level of indirection to get to the adapter. However. 
>>> The word "trampoline" is usually used for an extra jump in executable code, >>> so it may be a little confusing when we use it for a data pointer here. >>> >>> Any suggestions for a better name? >>> >>> >>> Testing: >>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>> now it produces the correct assertion. I won't check in this test, though, >>> since it won't assert anymore after Tobias fixes 8169711. >>> >>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 >>> # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: >>> # should be correctly set during dump time >>> >>> [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>> All tests passed. >>> >>> Thanks >>> - Ioi >>> >> > From coleen.phillimore at oracle.com Tue Nov 22 19:30:24 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 22 Nov 2016 14:30:24 -0500 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: Can you put this test at the end of the file with // Unit Tests and an explanation why this is here so people don't try to port the whole thing to gtest? I was looking for uses of list_index and found this code, which looks wrong: assert((word_size <= chunk->word_size()) || list_index(chunk->word_size() == HumongousIndex), "Non-humongous variable sized chunk"); This change looks good though. 
Thanks, Coleen On 11/22/16 9:54 AM, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different chunk > sizes, and while querying for the list index of a humongous chunk in > the class metaspace, the code accidentally matched the size against > the MediumChunk size of the non-class metaspace. > > I've changed the code to not query against the global ChunkSizes enum, > but rather the values stored inside the ChunkManager instances. > Therefore, the list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK From vladimir.kozlov at oracle.com Tue Nov 22 20:04:09 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 22 Nov 2016 12:04:09 -0800 Subject: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: References: Message-ID: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> Hi Trevor, Do you have performance numbers? UseVIS is too broad a flag to control only the generation of these instructions. To be consistent with the x86 code, please add a UseCountLeadingZerosInstruction flag to globals_sparc.hpp and its setting in vm_version_sparc.cpp (based on has_vis3()), similar to what is done for x86. Maybe name the new instructions *ZerosIvis instead of *ZerosI1 to make it clear that VIS is used. Indentation in the new test is all over the place. Please fix. 
Thanks, Vladimir On 11/15/16 3:57 AM, Trevor Watson wrote: > I have implemented the code to use the lzcnt instruction for both > integer and long countLeadingZeros() methods on SPARC platforms > supporting the vis3 instruction set. > > Current "bmi" tests for the above are updated so that they run on both > SPARC and x86 platforms. > > I've also implemented a test to ensure that Integer.countLeadingZeros() > and Long.countLeadingZeros() return the correct values when C2 runs. > This test is currently under the intrinsics "bmi" tests for want of > somewhere better (they do apply to both SPARC and x86 though). > > http://cr.openjdk.java.net/~alanbur/8162865/ > > Thanks, > Trevor From stefan.karlsson at oracle.com Tue Nov 22 21:05:10 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:05:10 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: <9f67e983-a058-ad27-e0ce-2300e6bb8a56@oracle.com> Hi Coleen, On 2016-11-22 20:30, Coleen Phillimore wrote: > > Can you put this test at the end of the file with // Unit Tests and an > explanation why this is here so people don't try to port the whole > thing to gtest? Sure. > > I was looking for uses of list_index and found this code, which looks > wrong: > > assert((word_size <= chunk->word_size()) || > list_index(chunk->word_size() == HumongousIndex), > "Non-humongous variable sized chunk"); I'll fix that assert. > > > This change looks good though. Thanks, I'll send out a new patch. StefanK > > Thanks, > Coleen > > On 11/22/16 9:54 AM, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. 
The _class_ >> metaspace, and the _non-class_ metaspace. They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes >> enum, but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified >> test with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > From stefan.karlsson at oracle.com Tue Nov 22 21:06:09 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:06:09 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: Hi Thomas, On 2016-11-22 18:09, Thomas Stüfe wrote: > Hi Stefan, > > this change looks good! Thanks! > > Small nitpick, there already exists a function returning a pointer to > the free list by chunk index (ChunkManager::free_chunks(index)). You > could have implemented ChunkManager::list_chunk_size() using this > function (return free_chunks(index)->size()) and add your assert to > ChunkManager::free_chunks(index) instead. Or, alternatively, just use > free_chunks(index)->size() directly instead of adding list_chunk_size(). Sure. I'll send out a new patch including your suggestion. 
Thanks, StefanK > > Kind Regards, Thomas > > > On Tue, Nov 22, 2016 at 3:54 PM, Stefan Karlsson > > wrote: > > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different > chunk sizes, and while querying for the list index of a humongous > chunk in the class metaspace, the code accidentally matched the > size against the MediumChunk size of the non-class metaspace. > > I've changed the code to not query against the global ChunkSizes > enum, but rather the values stored inside the ChunkManager > instances. Therefore, the list_index() function was changed into > an instance method. > > I've written a unit test that provoked the bug. It's a simplified > test with vm asserts instead of gtest asserts. The reason is that > the ChunkManager class is currently located in metaspace.cpp, and > is not accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK > > From stefan.karlsson at oracle.com Tue Nov 22 21:06:29 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:06:29 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <41dc02b9-edee-0060-07c4-2fd8220cbda7@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <41dc02b9-edee-0060-07c4-2fd8220cbda7@oracle.com> Message-ID: <8a47ecc7-8195-8df8-df70-a2657cd683f8@oracle.com> Thanks, Mikael. 
StefanK On 2016-11-22 17:08, Mikael Gerdin wrote: > Hi Stefan, > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > The change looks good to me. > > /Mikael > >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ metaspace, >> and the _non-class_ metaspace. They have different chunk sizes, and >> while querying for the list index of a humongous chunk in the class >> metaspace, the code accidentally matched the size against the >> MediumChunk size of the non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified test >> with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. 
>> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Tue Nov 22 21:37:51 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:37:51 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Hi all, Here is the updated patch, with the changes suggested by Coleen and Thomas: http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta http://cr.openjdk.java.net/~stefank/8169931/webrev.02 Changes to the previous patch: * Removed list_chunk_size and instead used free_chunks(index)->size() * Removed the const qualifier from list_index, since free_chunks isn't declared const. Fixing this would have been too large a change for this bug fix. * Moved ChunkManager_test_list_index into the unit test section of metaspace.cpp * Fixed a broken assert Thanks, StefanK On 2016-11-22 15:54, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different chunk > sizes, and while querying for the list index of a humongous chunk in > the class metaspace, the code accidentally matched the size against > the MediumChunk size of the non-class metaspace. > > I've changed the code to not query against the global ChunkSizes enum, > but rather the values stored inside the ChunkManager instances. > Therefore, the list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. 
It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK From coleen.phillimore at oracle.com Tue Nov 22 22:48:30 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 22 Nov 2016 17:48:30 -0500 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: <6b4f5c80-b74f-eda2-7f3a-1f6e4610bcba@oracle.com> Looks good! Thanks, Coleen On 11/22/16 4:37 PM, Stefan Karlsson wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for > this bug fix. > * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ >> metaspace, and the _non-class_ metaspace. 
They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes >> enum, but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified >> test with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > > From david.holmes at oracle.com Wed Nov 23 05:08:28 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Nov 2016 15:08:28 +1000 Subject: Presentation: Understanding OrderAccess Message-ID: This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. 
http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf Cheers, David From erik.helin at oracle.com Wed Nov 23 07:09:28 2016 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 23 Nov 2016 08:09:28 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: <00efb9dd-6477-3ec4-590e-a1732d5af82f@oracle.com> On 11/22/2016 10:37 PM, Stefan Karlsson wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 Hey StefanK, thanks for taking care of this! The patch looks good to me, Reviewed. Thanks, Erik > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for this > bug fix. > * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ >> metaspace, and the _non-class_ metaspace. They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. 
>> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified >> test with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > > From thomas.stuefe at gmail.com Wed Nov 23 07:42:12 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 23 Nov 2016 08:42:12 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: Hi Stefan, this looks fine! Thanks, Thomas On Tue, Nov 22, 2016 at 10:37 PM, Stefan Karlsson < stefan.karlsson at oracle.com> wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for this bug > fix. 
> * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: > >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ metaspace, >> and the _non-class_ metaspace. They have different chunk sizes, and while >> querying for the list index of a humongous chunk in the class metaspace, >> the code accidentally matched the size against the MediumChunk size of the >> non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. Therefore, >> the list_index() function was changed into an instance method. >> >> I've written a unit test that provoked the bug. It's a simplified test >> with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. 
>> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK >> > > > From mikael.gerdin at oracle.com Wed Nov 23 09:42:26 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 23 Nov 2016 10:42:26 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: Hi Stefan, On 2016-11-22 22:37, Stefan Karlsson wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 Updated webrev looks good to me as well. /Mikael > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for this > bug fix. > * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ >> metaspace, and the _non-class_ metaspace. They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. 
>> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance method. >> >> I've written a unit test that provoked the bug. It's a simplified test >> with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > > From aph at redhat.com Wed Nov 23 10:40:49 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 23 Nov 2016 10:40:49 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: On 23/11/16 05:08, David Holmes wrote: > This is a presentation I recently gave internally to the runtime and > serviceability teams that may be of more general interest to hotspot > developers. > > http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf That's pretty cool; nicely done. I'd quibble about a couple of minor things: In Data Race Example: Using Barriers, the use of a naked StoreStore is rather terrifying. In real-world code it'd be better to use StoreStore|LoadStore or release unless the author really knows what they're doing. The use of "fence" to mean a full barrier is rather idiosyncratic; it confused me the first time I saw it in HotSpot source, and from time to time it still does. But, as I said, these are minor criticisms. Andrew. 
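[Editor's note] Andrew's preference for a release barrier over a naked StoreStore can be illustrated with a short sketch. This is a hypothetical publisher/consumer written with portable C++11 atomics, not code taken from the presentation or from HotSpot's OrderAccess; the names `payload`, `published`, `publisher`, and `consumer` are invented for illustration:

```cpp
#include <atomic>
#include <cassert>

static int payload = 0;                    // plain data being published
static std::atomic<bool> published{false}; // publication flag

void publisher() {
  payload = 42;                                     // ordinary store
  published.store(true, std::memory_order_release); // release: the payload
                                                    // store is ordered before
                                                    // the flag store
}

int consumer() {
  if (published.load(std::memory_order_acquire)) {  // acquire pairs with the
    return payload;                                 // release above, so a true
  }                                                 // flag implies payload==42
  return -1;                                        // not yet published
}
```

Run on a single thread, publisher() followed by consumer() returns 42. Across threads, the release/acquire pair is what guarantees that a consumer that observes the flag also observes the payload; a bare StoreStore on the writer side orders the two stores but, without a matching read-side barrier, promises the reader nothing.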
From tobias.hartmann at oracle.com Wed Nov 23 11:42:00 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 23 Nov 2016 12:42:00 +0100 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Message-ID: <58358088.1090709@oracle.com> Hi Shafi, On 21.11.2016 07:29, Shafi Ahmad wrote: > Hi All, > > May I get the second review on this. > > I am putting together all the webrevs to make it simple for reviewer. > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ > http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ This looks good to me (not an 8u reviewer). Best regards, Tobias > > Please note that I tested with jprt, all jtreg and rbt tests. > > Regards, > Shafi > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Wednesday, November 16, 2016 10:21 PM >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> Looks good. >> >> I would suggest to run all jtreg tests (or even RBT) when you apply all >> changes before pushing this. >> >> Thanks, >> Vladimir >> >> On 11/16/16 4:52 AM, Shafi Ahmad wrote: >>> Hi Vladimir, >>> >>> Thank you for the review and feedback. >>> >>> Please find updated webrevs: >>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed >> the test case as it use only jdk9 APIs. 
>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed >> test methods testFixedOffsetHeaderArray17() and >> testFixedOffsetHeader17() which referenced jdk9 API >> UNSAFE.getIntUnaligned. >>> >>> >>> Regards, >>> Shafi >>> >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Wednesday, November 16, 2016 1:00 AM >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >>>> mismatched unsafe accesses >>>> >>>> Hi Shafi >>>> >>>> You should not backport tests which use only new JDK 9 APIs. Like >>>> TestUnsafeUnalignedMismatchedAccesses.java test. >>>> >>>> But it is perfectly fine to modify backport by removing part of >>>> changes which use a new API. For example, 8162101 changes in >>>> OpaqueAccesses.java test which use getIntUnaligned() method. >>>> >>>> It is unfortunate that 8140309 changes include also code which >>>> process new Unsafe Unaligned intrinsics from JDK 9. It should not be >>>> backported but it will simplify this and following backports. So I >>>> agree with changes you did for >>>> 8140309 backport. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 11/14/16 10:34 PM, Shafi Ahmad wrote: >>>>> Hi Vladimir, >>>>> >>>>> Thanks for the review. >>>>> >>>>>> -----Original Message----- >>>>> >>>>>> From: Vladimir Kozlov >>>>> >>>>>> Sent: Monday, November 14, 2016 11:20 PM >>>>> >>>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>>> >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>>> produces >>>>> >>>>>> mismatched unsafe accesses >>>>> >>>>>> >>>>> >>>>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: >>>>> >>>>>>> Hi Vladimir, >>>>> >>>>>>> >>>>> >>>>>>> Thanks for the review. >>>>> >>>>>>> >>>>> >>>>>>> Please find updated webrevs. >>>>> >>>>>>> >>>>> >>>>>>> All webrevs are with respect to the base changes on JDK-8140309. 
>>>>> >>>>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ >>>>> >>>>>> >>>>> >>>>>> Why you kept unaligned parameter in changes? >>>>> >>>>> The fix of JDK-8136473 caused many problems after integration (see >>>>> JDK- >>>> 8140267). >>>>> >>>>> The fix was backed out and re-implemented with JDK-8140309 by >>>>> slightly >>>> changing the assert: >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- >>>> Novem >>>>> ber/019696.html >>>>> >>>>> The code change for the fix of JDK-8140309 is code changes for >>>>> JDK-8136473 >>>> by slightly changing one assert. >>>>> >>>>> jdk9 original changeset is >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c >>>>> >>>>> As this is a backport so I keep the changes as it is. >>>>> >>>>>> >>>>> >>>>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not work >>>>>> since >>>>> >>>>>> since Unsafe class in jdk8 does not have unaligned methods. >>>>> >>>>>> Hot did you run it? >>>>> >>>>> I am sorry, looks there is some issue with my testing. >>>>> >>>>> I have run jtreg test after merging the changes but somehow the test >>>>> does >>>> not run and I verified only the failing list of jtreg result. >>>>> >>>>> When I run the test case separately it is failing as you already >>>>> pointed out >>>> the same. 
>>>>> >>>>> $java -jar ~/Tools/jtreg/lib/jtreg.jar >>>>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ >>>>> >>>> >> hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatched >>>> A >>>>> ccesses.java >>>>> >>>>> Test results: failed: 1 >>>>> >>>>> Report written to >>>>> /scratch/shshahma/Java/jdk8u-dev- >>>> 8140309_01/JTreport/html/report.html >>>>> >>>>> Results written to >>>>> /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork >>>>> >>>>> Error: >>>>> >>>>> /scratch/shshahma/Java/jdk8u-dev- >>>> 8140309_01/hotspot/test/compiler/intr >>>>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: >>>>> cannot find symbol >>>>> >>>>> UNSAFE.putIntUnaligned(array, >>>>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); >>>>> >>>>> Not sure if we should push without the test case. >>>>> >>>>>> >>>>> >>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ >>>>> >>>>>> >>>>> >>>>>> Good. Did you run new UnsafeAccess.java test? >>>>> >>>>> Due to same process issue the test case is not run and when I run it >>>> separately it fails. >>>>> >>>>> It passes after doing below changes: >>>>> >>>>> 1. Added /othervm >>>>> >>>>> 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by >>>>> 'import >>>> sun.misc.Unsafe;' >>>>> >>>>> Updated webrev: >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ >>>>> >>>>>> >>>>> >>>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ >>>>> >>>>> I am getting the similar compilation error as above for added test >>>>> case. Not >>>> sure if we can push without the test case. >>>>> >>>>> Regards, >>>>> >>>>> Shafi >>>>> >>>>>> >>>>> >>>>>> Good. 
>>>>> >>>>>> >>>>> >>>>>> Thanks, >>>>> >>>>>> Vladimir >>>>> >>>>>> >>>>> >>>>>>> >>>>> >>>>>>> Regards, >>>>> >>>>>>> Shafi >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>>> -----Original Message----- >>>>> >>>>>>>> From: Vladimir Kozlov >>>>> >>>>>>>> Sent: Friday, November 11, 2016 1:26 AM >>>>> >>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>>>> >>>>> >>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>>>>> produces >>>>> >>>>>>>> mismatched unsafe accesses >>>>> >>>>>>>> >>>>> >>>>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: >>>>> >>>>>>>>> Hi, >>>>> >>>>>>>>> >>>>> >>>>>>>>> Please review the backport of following dependent backports. >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 >>>>> >>>>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 >>>>>>>>> [JDK- >>>>> >>>>>>>> 8080289]. Manual merge is not done as the corresponding code is >>>>>>>> not >>>>> >>>>>>>> there in jdk8u-dev. >>>>> >>>>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp >>>>>>>>> and >>>>> >>>>>>>>> manual >>>>> >>>>>>>> merge is done. >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> unaligned unsafe access methods were added in jdk 9 only. In your >>>>> >>>>>>>> changes unaligned argument is always false. You can simplify >> changes. 
>>>>> >>>>>>>> >>>>> >>>>>>>> Also you should base changes on JDK-8140309 (original 8136473 >>>>>>>> changes >>>>> >>>>>>>> were backout by 8140267): >>>>> >>>>>>>> >>>>> >>>>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: >>>>> >>>>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ >>>>> >>>>>>>> > >>>>> >>>>>>>> > Same as 8136473 with only the following change: >>>>> >>>>>>>> > >>>>> >>>>>>>> > diff --git a/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> b/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> > --- a/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> > +++ b/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> > @@ -2527,7 +2527,7 @@ >>>>> >>>>>>>> > // of safe & unsafe memory. >>>>> >>>>>>>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); >>>>> >>>>>>>> > >>>>> >>>>>>>> > - assert(is_native_ptr || alias_type->adr_type() == >>>>> >>>>>>>> TypeOopPtr::BOTTOM >>>>> >>>>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || >>>>> >>>>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || >>>>> >>>>>>>> > alias_type->field() != NULL || alias_type->element() != >>>>> >>>>>>>> NULL, "field, array element or unknown"); >>>>> >>>>>>>> > bool mismatched = false; >>>>> >>>>>>>> > if (alias_type->element() != NULL || alias_type->field() != NULL) >> { >>>>> >>>>>>>> > >>>>> >>>>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the >>>>> >>>>>>>> is_native_ptr case and the case where the unsafe method is called >>>>>>>> with a >>>>> >>>>>> null object. >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>>>> >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>>>> >>>>>>>>> >>>>> >>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.16 >>>>>> 5 >>>>> >>>>>>>> [JDK-8140309]. 
Manual merge is not done as the corresponding code >>>>>>>> is >>>>> >>>>>>>> not there in jdk8u-dev. >>>>> >>>>>>>> >>>>> >>>>>>>> I explained situation with this line above. >>>>> >>>>>>>> >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> This webrev is not incremental for your 8136473 changes - >>>>> >>>>>>>> library_call.cpp has part from 8136473 changes. >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> >>>>>>>>> Clean merge >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> Thanks seems fine. >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>>>> >>>>>> >>>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 >>>>> >>>>>>>>> [JDK-8160360] - Resolved 2. >>>>> >>>>>> >>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 >>>>>>>> 73 >>>>> >>>>>>>> [JDK-8148146] - Manual merge is not done as the corresponding >>>>>>>> code is >>>>> >>>>>>>> not there in jdk8u-dev. >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> This webrev is not incremental in library_call.cpp. Difficult to >>>>>>>> see >>>>> >>>>>>>> this part of changes. 
>>>>> >>>>>>>> >>>>> >>>>>>>> Thanks, >>>>> >>>>>>>> Vladimir >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 >>>>> >>>>>>>>> >>>>> >>>>>>>>> Testing: jprt and jtreg >>>>> >>>>>>>>> >>>>> >>>>>>>>> Regards, >>>>> >>>>>>>>> Shafi >>>>> >>>>>>>>> >>>>> >>>>>>>>>> -----Original Message----- >>>>> >>>>>>>>>> From: Shafi Ahmad >>>>> >>>>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM >>>>> >>>>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net >>>>>>>>>> >>>>> >>>>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>> >>>>>>>>>> produces mismatched unsafe accesses >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Thanks Vladimir. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> I will create dependent backport of 1. >>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 >>>>> >>>>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> >>>>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Regards, >>>>> >>>>>>>>>> Shafi >>>>> >>>>>>>>>> >>>>> >>>>>>>>>>> -----Original Message----- >>>>> >>>>>>>>>>> From: Vladimir Kozlov >>>>> >>>>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM >>>>> >>>>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>>>>>>> >>>>> >>>>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>> >>>>>>>>>>> produces mismatched unsafe accesses >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Hi Shafi, >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> You should also consider backporting following related fixes: >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. 
>>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Thanks, >>>>> >>>>>>>>>>> Vladimir >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>>>> >>>>>>>>>>>> Hi All, >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type >>>>>>>>>>>> speculation >>>>> >>>>>>>>>>>> produces >>>>> >>>>>>>>>>> mismatched unsafe accesses to jdk8u-dev. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Please note that backport is not clean and the conflict is due to: >>>>> >>>>>>>>>>>> >>>>> >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>>>> >>>>>>>>>>>> 1 >>>>> >>>>>>>>>>>> 65 >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Getting debug build failure because of: >>>>> >>>>>>>>>>>> >>>>> >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>>>> >>>>>>>>>>>> 1 >>>>> >>>>>>>>>>>> 55 >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> The above changes are done under bug# 'JDK-8136473: failed: >>>>>>>>>>>> no >>>>> >>>>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' which >>>>>>>>>>> is >>>>> >>>>>>>>>>> not back ported to jdk8u and the current backport is on top of >>>>> >>>>>>>>>>> above >>>>> >>>>>>>> change. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Please note that I am not sure if there is any dependency >>>>> >>>>>>>>>>>> between these >>>>> >>>>>>>>>>> two changesets. 
>>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> open webrev: >>>>> >>>>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>>> >>>>>>>>>>>> jdk9 bug >>>>>>>>>>>> link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>>>> >>>>>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> testing: Passes JPRT, jtreg not completed >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Regards, >>>>> >>>>>>>>>>>> Shafi >>>>> >>>>>>>>>>>> >>>>> From shafi.s.ahmad at oracle.com Wed Nov 23 11:47:34 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 23 Nov 2016 03:47:34 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <58358088.1090709@oracle.com> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> <58358088.1090709@oracle.com> Message-ID: <341e37fe-0e73-4f20-afbd-33cdbe42ffba@default> Thank you very much Vladimir and Tobias for reviewing it. Regards, Shafi > -----Original Message----- > From: Tobias Hartmann > Sent: Wednesday, November 23, 2016 5:12 PM > To: Shafi Ahmad; Vladimir Kozlov; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Hi Shafi, > > On 21.11.2016 07:29, Shafi Ahmad wrote: > > Hi All, > > > > May I get the second review on this. > > > > I am putting together all the webrevs to make it simple for reviewer. > > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ > > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ > > http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > > http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > > This looks good to me (not a 8u reviewer). 
> > Best regards, > Tobias > > > > > Please note that I tested with jprt, all jtreg and rbt tests. > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Wednesday, November 16, 2016 10:21 PM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> Looks good. > >> > >> I would suggest to run all jtreg tests (or even RBT) when you apply > >> all changes before pushing this. > >> > >> Thanks, > >> Vladimir > >> > >> On 11/16/16 4:52 AM, Shafi Ahmad wrote: > >>> Hi Vladimir, > >>> > >>> Thank you for the review and feedback. > >>> > >>> Please find updated webrevs: > >>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => > Removed > >> the test case as it use only jdk9 APIs. > >>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => > Removed > >> test methods testFixedOffsetHeaderArray17() and > >> testFixedOffsetHeader17() which referenced jdk9 API > >> UNSAFE.getIntUnaligned. > >>> > >>> > >>> Regards, > >>> Shafi > >>> > >>> > >>>> -----Original Message----- > >>>> From: Vladimir Kozlov > >>>> Sent: Wednesday, November 16, 2016 1:00 AM > >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>> produces mismatched unsafe accesses > >>>> > >>>> Hi Shafi > >>>> > >>>> You should not backport tests which use only new JDK 9 APIs. Like > >>>> TestUnsafeUnalignedMismatchedAccesses.java test. > >>>> > >>>> But it is perfectly fine to modify backport by removing part of > >>>> changes which use a new API. For example, 8162101 changes in > >>>> OpaqueAccesses.java test which use getIntUnaligned() method. > >>>> > >>>> It is unfortunate that 8140309 changes include also code which > >>>> process new Unsafe Unaligned intrinsics from JDK 9. It should not > >>>> be backported but it will simplify this and following backports. 
So > >>>> I agree with changes you did for > >>>> 8140309 backport. > >>>> > >>>> Thanks, > >>>> Vladimir > >>>> > >>>> On 11/14/16 10:34 PM, Shafi Ahmad wrote: > >>>>> Hi Vladimir, > >>>>> > >>>>> Thanks for the review. > >>>>> > >>>>>> -----Original Message----- > >>>>> > >>>>>> From: Vladimir Kozlov > >>>>> > >>>>>> Sent: Monday, November 14, 2016 11:20 PM > >>>>> > >>>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>>> > >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>>> produces > >>>>> > >>>>>> mismatched unsafe accesses > >>>>> > >>>>>> > >>>>> > >>>>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > >>>>> > >>>>>>> Hi Vladimir, > >>>>> > >>>>>>> > >>>>> > >>>>>>> Thanks for the review. > >>>>> > >>>>>>> > >>>>> > >>>>>>> Please find updated webrevs. > >>>>> > >>>>>>> > >>>>> > >>>>>>> All webrevs are with respect to the base changes on JDK-8140309. > >>>>> > >>>>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > >>>>> > >>>>>> > >>>>> > >>>>>> Why you kept unaligned parameter in changes? > >>>>> > >>>>> The fix of JDK-8136473 caused many problems after integration (see > >>>>> JDK- > >>>> 8140267). > >>>>> > >>>>> The fix was backed out and re-implemented with JDK-8140309 by > >>>>> slightly > >>>> changing the assert: > >>>>> > >>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- > >>>> Novem > >>>>> ber/019696.html > >>>>> > >>>>> The code change for the fix of JDK-8140309 is code changes for > >>>>> JDK-8136473 > >>>> by slightly changing one assert. > >>>>> > >>>>> jdk9 original changeset is > >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > >>>>> > >>>>> As this is a backport so I keep the changes as it is. > >>>>> > >>>>>> > >>>>> > >>>>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not > work > >>>>>> since > >>>>> > >>>>>> since Unsafe class in jdk8 does not have unaligned methods. > >>>>> > >>>>>> How did you run it? 
> >>>>> > >>>>> I am sorry, looks there is some issue with my testing. > >>>>> > >>>>> I have run jtreg test after merging the changes but somehow the > >>>>> test does > >>>> not run and I verified only the failing list of jtreg result. > >>>>> > >>>>> When I run the test case separately it is failing as you already > >>>>> pointed out > >>>> the same. > >>>>> > >>>>> $java -jar ~/Tools/jtreg/lib/jtreg.jar > >>>>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ > >>>>> > >>>> > >> > hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatched > >>>> A > >>>>> ccesses.java > >>>>> > >>>>> Test results: failed: 1 > >>>>> > >>>>> Report written to > >>>>> /scratch/shshahma/Java/jdk8u-dev- > >>>> 8140309_01/JTreport/html/report.html > >>>>> > >>>>> Results written to > >>>>> /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > >>>>> > >>>>> Error: > >>>>> > >>>>> /scratch/shshahma/Java/jdk8u-dev- > >>>> 8140309_01/hotspot/test/compiler/intr > >>>>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: > error: > >>>>> cannot find symbol > >>>>> > >>>>> UNSAFE.putIntUnaligned(array, > >>>>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > >>>>> > >>>>> Not sure if we should push without the test case. > >>>>> > >>>>>> > >>>>> > >>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > >>>>> > >>>>>> > >>>>> > >>>>>> Good. Did you run new UnsafeAccess.java test? > >>>>> > >>>>> Due to same process issue the test case is not run and when I run > >>>>> it > >>>> separately it fails. > >>>>> > >>>>> It passes after doing below changes: > >>>>> > >>>>> 1. Added /othervm > >>>>> > >>>>> 2. 
replaced import statement 'import jdk.internal.misc.Unsafe;' > >>>>> by 'import > >>>> sun.misc.Unsafe;' > >>>>> > >>>>> Updated webrev: > >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > >>>>> > >>>>>> > >>>>> > >>>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > >>>>> > >>>>> I am getting the similar compilation error as above for added test > >>>>> case. Not > >>>> sure if we can push without the test case. > >>>>> > >>>>> Regards, > >>>>> > >>>>> Shafi > >>>>> > >>>>>> > >>>>> > >>>>>> Good. > >>>>> > >>>>>> > >>>>> > >>>>>> Thanks, > >>>>> > >>>>>> Vladimir > >>>>> > >>>>>> > >>>>> > >>>>>>> > >>>>> > >>>>>>> Regards, > >>>>> > >>>>>>> Shafi > >>>>> > >>>>>>> > >>>>> > >>>>>>> > >>>>> > >>>>>>> > >>>>> > >>>>>>>> -----Original Message----- > >>>>> > >>>>>>>> From: Vladimir Kozlov > >>>>> > >>>>>>>> Sent: Friday, November 11, 2016 1:26 AM > >>>>> > >>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>>>> > >>>>> > >>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>>>>> produces > >>>>> > >>>>>>>> mismatched unsafe accesses > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >>>>> > >>>>>>>>> Hi, > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> Please review the backport of following dependent backports. > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8136473 > >>>>> > >>>>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 > >>>>>>>>> [JDK- > >>>>> > >>>>>>>> 8080289]. Manual merge is not done as the corresponding code is > >>>>>>>> not > >>>>> > >>>>>>>> there in jdk8u-dev. > >>>>> > >>>>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp > >>>>>>>>> and > >>>>> > >>>>>>>>> manual > >>>>> > >>>>>>>> merge is done. 
> >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> unaligned unsafe access methods were added in jdk 9 only. In > >>>>>>>> your > >>>>> > >>>>>>>> changes unaligned argument is always false. You can simplify > >> changes. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> Also you should base changes on JDK-8140309 (original 8136473 > >>>>>>>> changes > >>>>> > >>>>>>>> were backout by 8140267): > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >>>>> > >>>>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > Same as 8136473 with only the following change: > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > diff --git a/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> b/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> > --- a/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> > +++ b/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> > @@ -2527,7 +2527,7 @@ > >>>>> > >>>>>>>> > // of safe & unsafe memory. > >>>>> > >>>>>>>> > if (need_mem_bar) > insert_mem_bar(Op_MemBarCPUOrder); > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > - assert(is_native_ptr || alias_type->adr_type() == > >>>>> > >>>>>>>> TypeOopPtr::BOTTOM > >>>>> > >>>>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM > || > >>>>> > >>>>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || > >>>>> > >>>>>>>> > alias_type->field() != NULL || alias_type->element() != > >>>>> > >>>>>>>> NULL, "field, array element or unknown"); > >>>>> > >>>>>>>> > bool mismatched = false; > >>>>> > >>>>>>>> > if (alias_type->element() != NULL || alias_type->field() != > NULL) > >> { > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >>>>> > >>>>>>>> is_native_ptr case and the case where the unsafe method is > >>>>>>>> called with a > >>>>> > >>>>>> null object. 
> >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8134918 > >>>>> > >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>>>> > >>>>>>>>> > >>>>> > >>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.16 > >>>>>> 5 > >>>>> > >>>>>>>> [JDK-8140309]. Manual merge is not done as the corresponding > >>>>>>>> code is > >>>>> > >>>>>>>> not there in jdk8u-dev. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> I explained situation with this line above. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> This webrev is not incremental for your 8136473 changes - > >>>>> > >>>>>>>> library_call.cpp has part from 8136473 changes. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8155781 > >>>>> > >>>>>>>>> Clean merge > >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> Thanks seems fine. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8162101 > >>>>> > >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>>>> > >>>>>> > >>>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >>>>> > >>>>>>>>> [JDK-8160360] - Resolved 2. 
> >>>>> > >>>>>> > >>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 > >>>>>>>> 73 > >>>>> > >>>>>>>> [JDK-8148146] - Manual merge is not done as the corresponding > >>>>>>>> code is > >>>>> > >>>>>>>> not there in jdk8u-dev. > >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> This webrev is not incremental in library_call.cpp. Difficult > >>>>>>>> to see > >>>>> > >>>>>>>> this part of changes. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> Thanks, > >>>>> > >>>>>>>> Vladimir > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> Testing: jprt and jtreg > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> Regards, > >>>>> > >>>>>>>>> Shafi > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>>> -----Original Message----- > >>>>> > >>>>>>>>>> From: Shafi Ahmad > >>>>> > >>>>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM > >>>>> > >>>>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >>>>>>>>>> > >>>>> > >>>>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>> > >>>>>>>>>> produces mismatched unsafe accesses > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>> Thanks Vladimir. > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>> I will create dependent backport of 1. 
> >>>>> > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 > >>>>> > >>>>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>>> > >>>>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>> Regards, > >>>>> > >>>>>>>>>> Shafi > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>>> -----Original Message----- > >>>>> > >>>>>>>>>>> From: Vladimir Kozlov > >>>>> > >>>>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >>>>> > >>>>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>> > >>>>>>>>>>> produces mismatched unsafe accesses > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Hi Shafi, > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> You should also consider backporting following related fixes: > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>>> > >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Thanks, > >>>>> > >>>>>>>>>>> Vladimir > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>>>> > >>>>>>>>>>>> Hi All, > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type > >>>>>>>>>>>> speculation > >>>>> > >>>>>>>>>>>> produces > >>>>> > >>>>>>>>>>> mismatched unsafe accesses to jdk8u-dev. > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Please note that backport is not clean and the conflict is due > to: > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. 
> >>>>> > >>>>>>>>>>>> 1 > >>>>> > >>>>>>>>>>>> 65 > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Getting debug build failure because of: > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>>>> > >>>>>>>>>>>> 1 > >>>>> > >>>>>>>>>>>> 55 > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> The above changes are done under bug# 'JDK-8136473: > failed: > >>>>>>>>>>>> no > >>>>> > >>>>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' > >>>>>>>>>>> which is > >>>>> > >>>>>>>>>>> not back ported to jdk8u and the current backport is on top > >>>>>>>>>>> of > >>>>> > >>>>>>>>>>> above > >>>>> > >>>>>>>> change. > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Please note that I am not sure if there is any dependency > >>>>> > >>>>>>>>>>>> between these > >>>>> > >>>>>>>>>>> two changesets. > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> open webrev: > >>>>> > >>>>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>>> > >>>>>>>>>>>> jdk9 bug > >>>>>>>>>>>> link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >>>>> > >>>>>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> testing: Passes JPRT, jtreg not completed > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Regards, > >>>>> > >>>>>>>>>>>> Shafi > >>>>> > >>>>>>>>>>>> > >>>>> From stefan.karlsson at oracle.com Wed Nov 23 11:53:12 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:53:12 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <6b4f5c80-b74f-eda2-7f3a-1f6e4610bcba@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> <6b4f5c80-b74f-eda2-7f3a-1f6e4610bcba@oracle.com> Message-ID: Thanks, Coleen! StefanK On 2016-11-22 23:48, Coleen Phillimore wrote: > Looks good! 
> Thanks, > Coleen > > On 11/22/16 4:37 PM, Stefan Karlsson wrote: >> Hi all, >> >> Here is the updated patch, with changes suggested by Coleen and Thomas: >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02 >> >> Changes to the previous patch: >> * Removed list_chunk_size and instead used free_chunks(index)->size() >> * Removed the const qualifier from list_index, since free_chunks isn't >> declared const. Fixing this would have been a too large change for >> this bug fix. >> * Moved ChunkManager_test_list_index into the unit test section of >> metaspace.cpp >> * Fixed a broken assert >> >> Thanks, >> StefanK >> >> >> On 2016-11-22 15:54, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix a bug in ChunkManager::list_index(): >>> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >>> >>> There's a great description of the bug in the bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8169931 >>> >>> There are two conceptual parts of the metaspace. The _class_ >>> metaspace, and the _non-class_ metaspace. They have different chunk >>> sizes, and while querying for the list index of a humongous chunk in >>> the class metaspace, the code accidentally matched the size against >>> the MediumChunk size of the non-class metaspace. >>> >>> I've changed the code to not query against the global ChunkSizes >>> enum, but rather the values stored inside the ChunkManager instances. >>> Therefore, the list_index() function was changed into an instance >>> method. >>> >>> I've written a unit test that provoked the bug. It's a simplified >>> test with vm asserts instead of gtest asserts. The reason is that the >>> ChunkManager class is currently located in metaspace.cpp, and is not >>> accessible from the gtest unit tests. 
>>> >>> Testing: jprt, Kitchensink, parallel class loading tests >>> >>> Thanks, >>> StefanK >> >> > From stefan.karlsson at oracle.com Wed Nov 23 11:54:01 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:54:01 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <00efb9dd-6477-3ec4-590e-a1732d5af82f@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> <00efb9dd-6477-3ec4-590e-a1732d5af82f@oracle.com> Message-ID: Thanks, Erik. StefanK On 2016-11-23 08:09, Erik Helin wrote: > On 11/22/2016 10:37 PM, Stefan Karlsson wrote: >> Hi all, >> >> Here are the update patch, with changes suggested by Coleen and Thomas: >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Hey StefanK, thanks for taking care of this! The patch looks good to me, > Reviewed. > > Thanks, > Erik > >> Changes to the previous patch: >> * Removed list_chunk_size and instead used free_chunks(index)->size() >> * Removed the const qualifier from list_index, since free_chunks isn't >> declared const. Fixing this would have been a too large change for this >> bug fix. >> * Moved ChunkManager_test_list_index into the unit test section of >> metaspace.cpp >> * Fixed a broken assert >> >> Thanks, >> StefanK >> >> >> On 2016-11-22 15:54, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix a bug in ChunkManager::list_index(): >>> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >>> >>> There's a great description of the bug in the bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8169931 >>> >>> There are two conceptual parts of the metaspace. The _class_ >>> metaspace, and the _non-class_ metaspace. 
They have different chunk >>> sizes, and while querying for the list index of a humongous chunk in >>> the class metaspace, the code accidentally matched the size against >>> the MediumChunk size of the non-class metaspace. >>> >>> I've changed the code to not query against the global ChunkSizes enum, >>> but rather the values stored inside the ChunkManager instances. >>> Therefore, the list_index() function was changed into an instance >>> method. >>> >>> I've written a unit test that provoked the bug. It's a simplified test >>> with vm asserts instead of gtest asserts. The reason is that the >>> ChunkManager class is currently located in metaspace.cpp, and is not >>> accessible from the gtest unit tests. >>> >>> Testing: jprt, Kitchensink, parallel class loading tests >>> >>> Thanks, >>> StefanK >> >> From stefan.karlsson at oracle.com Wed Nov 23 11:54:14 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:54:14 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: <218f8b02-3138-70a1-26d4-cbb6ebc4e243@oracle.com> Thanks, Thomas. StefanK On 2016-11-23 08:42, Thomas Stüfe wrote: > Hi Stefan, > > this looks fine! > > Thanks, > Thomas > > On Tue, Nov 22, 2016 at 10:37 PM, Stefan Karlsson > > wrote: > > Hi all, > > Here is the updated patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks > isn't declared const. Fixing this would have been a too large change > for this bug fix. 
> * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: > > Hi all, > > Please, review this patch to fix a bug in > ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different > chunk sizes, and while querying for the list index of a > humongous chunk in the class metaspace, the code accidentally > matched the size against the MediumChunk size of the non-class > metaspace. > > I've changed the code to not query against the global ChunkSizes > enum, but rather the values stored inside the ChunkManager > instances. Therefore, the list_index() function was changed into > an instance method. > > I've written a unit test that provoked the bug. It's a > simplified test with vm asserts instead of gtest asserts. The > reason is that the ChunkManager class is currently located in > metaspace.cpp, and is not accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK > > > > From stefan.karlsson at oracle.com Wed Nov 23 11:54:28 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:54:28 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: Thanks, Mikael. 
StefanK On 2016-11-23 10:42, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-22 22:37, Stefan Karlsson wrote: >> Hi all, >> >> Here are the update patch, with changes suggested by Coleen and Thomas: >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Updated webrev looks good to me as well. > /Mikael > >> >> Changes to the previous patch: >> * Removed list_chunk_size and instead used free_chunks(index)->size() >> * Removed the const qualifier from list_index, since free_chunks isn't >> declared const. Fixing this would have been a too large change for this >> bug fix. >> * Moved ChunkManager_test_list_index into the unit test section of >> metaspace.cpp >> * Fixed a broken assert >> >> Thanks, >> StefanK >> >> >> On 2016-11-22 15:54, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix a bug in ChunkManager::list_index(): >>> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >>> >>> There's a great description of the bug in the bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8169931 >>> >>> There are two conceptual parts of the metaspace. The _class_ >>> metaspace, and the _non-class_ metaspace. They have different chunk >>> sizes, and while querying for the list index of a humongous chunk in >>> the class metaspace, the code accidentally matched the size against >>> the MediumChunk size of the non-class metaspace. >>> >>> I've changed the code to not query against the global ChunkSizes enum, >>> but rather the values stored inside the ChunkManager instances. >>> Therefore, the list_index() function was changed into an instance >>> method. >>> >>> I've written a unit test that provoked the bug. It's a simplified test >>> with vm asserts instead of gtest asserts. The reason is that the >>> ChunkManager class is currently located in metaspace.cpp, and is not >>> accessible from the gtest unit tests. 
>>> >>> Testing: jprt, Kitchensink, parallel class loading tests >>> >>> Thanks, >>> StefanK >> >> From igor.ignatyev at oracle.com Wed Nov 23 12:46:15 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 23 Nov 2016 15:46:15 +0300 Subject: RFR(XXS) : register closed @requires property setter Message-ID: Hi all, could you please review the changeset which registers closed vm property setter (for @requires expressions)? this setter is registered as optional, so test execution won't fail if the file doesn't exist. webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 Thanks, -- Igor From dmitry.fazunenko at oracle.com Wed Nov 23 13:24:48 2016 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenenko) Date: Wed, 23 Nov 2016 16:24:48 +0300 Subject: RFR(XXS) : register closed @requires property setter In-Reply-To: References: Message-ID: Hi Igor, The change itself looks good to me. Would you provide a bit more information in the CR. "register closed @requires property setter" doesn't provide enough information to understand the reasons why it's necessary. Thanks, Dima On 23.11.2016 15:46, Igor Ignatyev wrote: > Hi all, > > could you please review the changeset which registers closed vm property setter (for @requires expressions)? > this setter is registered as optional, so test execution won't fail if the file doesn't exist. > > webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ > webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ > JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 > > Thanks, > --
Igor From volker.simonis at gmail.com Wed Nov 23 14:05:33 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 23 Nov 2016 15:05:33 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583394C5.3030206@linux.vnet.ibm.com> References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: Hi Gustavo, thanks a lot for tracking this down! The change looks good and I can sponsor it once you get another review from the build group and the FC Extension Request is approved. In general I'd advise to sign the OCTLA [1] to get access to the Java SE TCK [2] as this contains quite a lot of additional conformance tests which can be quite valuable for changes like this. Regards, Volker [1] http://openjdk.java.net/legal/octla-java-se-8.pdf [2] http://openjdk.java.net/groups/conformance/JckAccess/ On Tue, Nov 22, 2016 at 1:43 AM, Gustavo Romero wrote: > Hi, > > Could the following change be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds up the > StrictMath methods (in some cases up to 3x) on that platform. > > On PPC64 fdlibm optimization can be done without precision issues if > floating-point expression contraction is disabled, i.e. if the compiler does not > use floating-point multiply-add (FMA). 
For further details please refer to gcc > bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > No regression was observed on Math and StrictMath tests: > > Passed: java/lang/Math/AbsPositiveZero.java > Passed: java/lang/Math/Atan2Tests.java > Passed: java/lang/Math/CeilAndFloorTests.java > Passed: java/lang/Math/CubeRootTests.java > Passed: java/lang/Math/DivModTests.java > Passed: java/lang/Math/ExactArithTests.java > Passed: java/lang/Math/Expm1Tests.java > Passed: java/lang/Math/FusedMultiplyAddTests.java > Passed: java/lang/Math/HyperbolicTests.java > Passed: java/lang/Math/HypotTests.java > Passed: java/lang/Math/IeeeRecommendedTests.java > Passed: java/lang/Math/Log10Tests.java > Passed: java/lang/Math/Log1pTests.java > Passed: java/lang/Math/MinMax.java > Passed: java/lang/Math/MultiplicationTests.java > Passed: java/lang/Math/PowTests.java > Passed: java/lang/Math/Rint.java > Passed: java/lang/Math/RoundTests.java > Passed: java/lang/Math/SinCosCornerCasesTests.java > Passed: java/lang/Math/TanTests.java > Passed: java/lang/Math/WorstCaseTests.java > Test results: passed: 21 > > Passed: java/lang/StrictMath/CubeRootTests.java > Passed: java/lang/StrictMath/ExactArithTests.java > Passed: java/lang/StrictMath/Expm1Tests.java > Passed: java/lang/StrictMath/HyperbolicTests.java > Passed: java/lang/StrictMath/HypotTests.java > Passed: java/lang/StrictMath/Log10Tests.java > Passed: java/lang/StrictMath/Log1pTests.java > Passed: java/lang/StrictMath/PowTests.java > Test results: passed: 8 > > and also on the following hotspot tests: > > Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java > Passed: 
compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java > Passed: compiler/intrinsics/mathexact/AddExactICondTest.java > Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/CompareTest.java > Passed: compiler/intrinsics/mathexact/DecExactITest.java > Passed: compiler/intrinsics/mathexact/DecExactLTest.java > Passed: compiler/intrinsics/mathexact/GVNTest.java > Passed: compiler/intrinsics/mathexact/IncExactITest.java > Passed: compiler/intrinsics/mathexact/IncExactLTest.java > Passed: compiler/intrinsics/mathexact/MulExactICondTest.java > Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java > Passed: 
compiler/intrinsics/mathexact/NegExactILoadTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java > Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java > Passed: compiler/intrinsics/mathexact/SubExactICondTest.java > Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java > Test results: passed: 50 > > Thank you. > > > Regards, > Gustavo > From erik.joelsson at oracle.com Wed Nov 23 14:29:52 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 23 Nov 2016 15:29:52 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583394C5.3030206@linux.vnet.ibm.com> References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> Build changes look ok. In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. /Erik On 2016-11-22 01:43, Gustavo Romero wrote: > Hi, > > Could the following change be reviewed, please? 
> > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds up the > StrictMath methods (in some cases up to 3x) on that platform. > > On PPC64 fdlibm optimization can be done without precision issues if > floating-point expression contraction is disable, i.e. if the compiler does not > use floating-point multiply-add (FMA). For further details please refer to gcc > bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > No regression was observed on Math and StrictMath tests: > > Passed: java/lang/Math/AbsPositiveZero.java > Passed: java/lang/Math/Atan2Tests.java > Passed: java/lang/Math/CeilAndFloorTests.java > Passed: java/lang/Math/CubeRootTests.java > Passed: java/lang/Math/DivModTests.java > Passed: java/lang/Math/ExactArithTests.java > Passed: java/lang/Math/Expm1Tests.java > Passed: java/lang/Math/FusedMultiplyAddTests.java > Passed: java/lang/Math/HyperbolicTests.java > Passed: java/lang/Math/HypotTests.java > Passed: java/lang/Math/IeeeRecommendedTests.java > Passed: java/lang/Math/Log10Tests.java > Passed: java/lang/Math/Log1pTests.java > Passed: java/lang/Math/MinMax.java > Passed: java/lang/Math/MultiplicationTests.java > Passed: java/lang/Math/PowTests.java > Passed: java/lang/Math/Rint.java > Passed: java/lang/Math/RoundTests.java > Passed: java/lang/Math/SinCosCornerCasesTests.java > Passed: java/lang/Math/TanTests.java > Passed: java/lang/Math/WorstCaseTests.java > Test results: passed: 21 > > Passed: java/lang/StrictMath/CubeRootTests.java > Passed: java/lang/StrictMath/ExactArithTests.java > Passed: java/lang/StrictMath/Expm1Tests.java > Passed: java/lang/StrictMath/HyperbolicTests.java > Passed: java/lang/StrictMath/HypotTests.java > Passed: java/lang/StrictMath/Log10Tests.java > Passed: java/lang/StrictMath/Log1pTests.java > Passed: 
java/lang/StrictMath/PowTests.java > Test results: passed: 8 > > and also on the following hotspot tests: > > Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java > Passed: compiler/intrinsics/mathexact/AddExactICondTest.java > Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/CompareTest.java > Passed: compiler/intrinsics/mathexact/DecExactITest.java > Passed: compiler/intrinsics/mathexact/DecExactLTest.java > Passed: compiler/intrinsics/mathexact/GVNTest.java > Passed: compiler/intrinsics/mathexact/IncExactITest.java > Passed: compiler/intrinsics/mathexact/IncExactLTest.java > Passed: compiler/intrinsics/mathexact/MulExactICondTest.java > Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java > 
Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoadTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java > Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java > Passed: compiler/intrinsics/mathexact/SubExactICondTest.java > Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java > Test results: passed: 50 > > Thank you. > > > Regards, > Gustavo > From martin.doerr at sap.com Wed Nov 23 14:38:09 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 23 Nov 2016 14:38:09 +0000 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: <8d52c2bcc0c0473e8e79d3f794ca81f3@dewdfe13de06.global.corp.sap> Hi Gustavo, thanks for providing the webrevs. 
I have run the StrictMath jck tests which fail when building with -O3 and without -ffp-contract=off: FailedTests: api/java_lang/StrictMath/desc.html#acos javasoft.sqe.tests.api.java.lang.StrictMath.acos_test api/java_lang/StrictMath/desc.html#asin javasoft.sqe.tests.api.java.lang.StrictMath.asin_test api/java_lang/StrictMath/desc.html#atan javasoft.sqe.tests.api.java.lang.StrictMath.atan_test api/java_lang/StrictMath/desc.html#atan2 javasoft.sqe.tests.api.java.lang.StrictMath.atan2_test api/java_lang/StrictMath/desc.html#cos javasoft.sqe.tests.api.java.lang.StrictMath.cos_test api/java_lang/StrictMath/desc.html#exp javasoft.sqe.tests.api.java.lang.StrictMath.exp_test api/java_lang/StrictMath/desc.html#log javasoft.sqe.tests.api.java.lang.StrictMath.log_test api/java_lang/StrictMath/desc.html#sin javasoft.sqe.tests.api.java.lang.StrictMath.sin_test api/java_lang/StrictMath/desc.html#tan javasoft.sqe.tests.api.java.lang.StrictMath.tan_test api/java_lang/StrictMath/index.html#expm1 javasoft.sqe.tests.api.java.lang.StrictMath.expm1Tests -TestCaseID ALL api/java_lang/StrictMath/index.html#log10 javasoft.sqe.tests.api.java.lang.StrictMath.log10Tests -TestCaseID ALL api/java_lang/StrictMath/index.html#log1p javasoft.sqe.tests.api.java.lang.StrictMath.log1pTests -TestCaseID ALL All of them have passed when building with -O3 and -ffp-contract=off (on linuxppc64le). So thumbs up from my side. Thanks and best regards, Martin -----Original Message----- From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Volker Simonis Sent: Mittwoch, 23. November 2016 15:06 To: Gustavo Romero Cc: build-dev ; ppc-aix-port-dev at openjdk.java.net; Java Core Libs ; hotspot-dev at openjdk.java.net Subject: Re: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation Hi Gustavo, thanks a lot for tracking this down! 
The change looks good and I can sponsor it once you get another review from the build group and the FC Extension Request is approved. In general I'd advise to sign the OCTLA [1] to get access to the Java SE TCK [2] as this contains quite a lot of additional conformance tests which can be quite valuable for changes like this. Regards, Volker [1] http://openjdk.java.net/legal/octla-java-se-8.pdf [2] http://openjdk.java.net/groups/conformance/JckAccess/ On Tue, Nov 22, 2016 at 1:43 AM, Gustavo Romero wrote: > Hi, > > Could the following change be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds > up the StrictMath methods (in some cases up to 3x) on that platform. > > On PPC64 fdlibm optimization can be done without precision issues if > floating-point expression contraction is disabled, i.e. if the compiler > does not use floating-point multiply-add (FMA). 
For further details > please refer to gcc > bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > No regression was observed on Math and StrictMath tests: > > Passed: java/lang/Math/AbsPositiveZero.java > Passed: java/lang/Math/Atan2Tests.java > Passed: java/lang/Math/CeilAndFloorTests.java > Passed: java/lang/Math/CubeRootTests.java > Passed: java/lang/Math/DivModTests.java > Passed: java/lang/Math/ExactArithTests.java > Passed: java/lang/Math/Expm1Tests.java > Passed: java/lang/Math/FusedMultiplyAddTests.java > Passed: java/lang/Math/HyperbolicTests.java > Passed: java/lang/Math/HypotTests.java > Passed: java/lang/Math/IeeeRecommendedTests.java > Passed: java/lang/Math/Log10Tests.java > Passed: java/lang/Math/Log1pTests.java > Passed: java/lang/Math/MinMax.java > Passed: java/lang/Math/MultiplicationTests.java > Passed: java/lang/Math/PowTests.java > Passed: java/lang/Math/Rint.java > Passed: java/lang/Math/RoundTests.java > Passed: java/lang/Math/SinCosCornerCasesTests.java > Passed: java/lang/Math/TanTests.java > Passed: java/lang/Math/WorstCaseTests.java > Test results: passed: 21 > > Passed: java/lang/StrictMath/CubeRootTests.java > Passed: java/lang/StrictMath/ExactArithTests.java > Passed: java/lang/StrictMath/Expm1Tests.java > Passed: java/lang/StrictMath/HyperbolicTests.java > Passed: java/lang/StrictMath/HypotTests.java > Passed: java/lang/StrictMath/Log10Tests.java > Passed: java/lang/StrictMath/Log1pTests.java > Passed: java/lang/StrictMath/PowTests.java > Test results: passed: 8 > > and also on the following hotspot tests: > > Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java > Passed: > compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java > Passed: > compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java > Passed: > compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java > Passed: > 
compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java > Passed: > compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java > Passed: > compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java > Passed: compiler/intrinsics/mathexact/AddExactICondTest.java > Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/CompareTest.java > Passed: compiler/intrinsics/mathexact/DecExactITest.java > Passed: compiler/intrinsics/mathexact/DecExactLTest.java > Passed: compiler/intrinsics/mathexact/GVNTest.java > Passed: compiler/intrinsics/mathexact/IncExactITest.java > Passed: compiler/intrinsics/mathexact/IncExactLTest.java > Passed: compiler/intrinsics/mathexact/MulExactICondTest.java > Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java > 
Passed: compiler/intrinsics/mathexact/NegExactILoadTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java > Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java > Passed: compiler/intrinsics/mathexact/SubExactICondTest.java > Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java > Test results: passed: 50 > > Thank you. > > > Regards, > Gustavo > From gromero at linux.vnet.ibm.com Wed Nov 23 15:28:05 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 23 Nov 2016 13:28:05 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: <5835B585.5040807@linux.vnet.ibm.com> Hi Volker, On 23-11-2016 12:05, Volker Simonis wrote: > thanks a lot for tracking this down! Happy to contribute :) > The change looks good and I a can sponsor it once you get another > review from the build group and the FC Extension Request was approved. Thanks a lot for sponsoring it! > In general I'd advise to sign the OCTLA [1] to get access to the Java > SE TCK [2] as this contains quite a lot of additional conformance > tests which can be quite valuable for changes like this. 
> > Regards, > Volker > > [1] http://openjdk.java.net/legal/octla-java-se-8.pdf > [2] http://openjdk.java.net/groups/conformance/JckAccess/ Right. I'll check the documentation and find a way to get access to the TCK. Best regards, Gustavo From gromero at linux.vnet.ibm.com Wed Nov 23 15:29:51 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 23 Nov 2016 13:29:51 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <8d52c2bcc0c0473e8e79d3f794ca81f3@dewdfe13de06.global.corp.sap> References: <583394C5.3030206@linux.vnet.ibm.com> <8d52c2bcc0c0473e8e79d3f794ca81f3@dewdfe13de06.global.corp.sap> Message-ID: <5835B5EF.7070006@linux.vnet.ibm.com> Hi Martin, On 23-11-2016 12:38, Doerr, Martin wrote: > Hi Gustavo, > > thanks for providing the webrevs. > > I have ran the StrictMath jck tests which fail when building with -O3 and without -ffp-contract=off: > FailedTests: > api/java_lang/StrictMath/desc.html#acos javasoft.sqe.tests.api.java.lang.StrictMath.acos_test > api/java_lang/StrictMath/desc.html#asin javasoft.sqe.tests.api.java.lang.StrictMath.asin_test > api/java_lang/StrictMath/desc.html#atan javasoft.sqe.tests.api.java.lang.StrictMath.atan_test > api/java_lang/StrictMath/desc.html#atan2 javasoft.sqe.tests.api.java.lang.StrictMath.atan2_test > api/java_lang/StrictMath/desc.html#cos javasoft.sqe.tests.api.java.lang.StrictMath.cos_test > api/java_lang/StrictMath/desc.html#exp javasoft.sqe.tests.api.java.lang.StrictMath.exp_test > api/java_lang/StrictMath/desc.html#log javasoft.sqe.tests.api.java.lang.StrictMath.log_test > api/java_lang/StrictMath/desc.html#sin javasoft.sqe.tests.api.java.lang.StrictMath.sin_test > api/java_lang/StrictMath/desc.html#tan javasoft.sqe.tests.api.java.lang.StrictMath.tan_test > api/java_lang/StrictMath/index.html#expm1 javasoft.sqe.tests.api.java.lang.StrictMath.expm1Tests -TestCaseID ALL > api/java_lang/StrictMath/index.html#log10 
javasoft.sqe.tests.api.java.lang.StrictMath.log10Tests -TestCaseID ALL > api/java_lang/StrictMath/index.html#log1p javasoft.sqe.tests.api.java.lang.StrictMath.log1pTests -TestCaseID ALL > > All of them have passed when building with -O3 and -ffp-contract=off (on linuxppc64le). Thank you very much for running the additional StrictMath jck tests against the change! Best regards, Gustavo From gromero at linux.vnet.ibm.com Wed Nov 23 15:33:43 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 23 Nov 2016 13:33:43 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> Message-ID: <5835B6D7.4020101@linux.vnet.ibm.com> Hi Erik, On 23-11-2016 12:29, Erik Joelsson wrote: > Build changes look ok. > > In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. Thanks a lot for reviewing the change. Regards, Gustavo From martin.doerr at sap.com Wed Nov 23 16:20:32 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 23 Nov 2016 16:20:32 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> Hi David, thank you very much for the presentation. I think it provides a good guideline for hotspot development. Would you like to add something about multi-copy atomicity? E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). It is needed in the following scenario: - Different threads write 2 variables. - Readers of these 2 variables expect a globally consistent order of the write accesses. 
In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". (While taking a look at it, the condition "#if !(defined SPARC || defined IA32 || defined AMD64)" is not accurate and should be improved. E.g. s390 is multi-copy atomic.) I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64. Thanks and best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of David Holmes Sent: Mittwoch, 23. November 2016 06:08 To: hotspot-dev developers Subject: Presentation: Understanding OrderAccess This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf Cheers, David From adinn at redhat.com Wed Nov 23 16:30:29 2016 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 23 Nov 2016 16:30:29 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <1effc60c-94ce-f42c-8756-310737969799@jku.at> References: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> <1effc60c-94ce-f42c-8756-310737969799@jku.at> Message-ID: <3263b517-d397-22f3-8351-cb36b9fe539a@redhat.com> On 23/11/16 16:06, Peter Hofer wrote: > I finally got around to measuring the change in execution times between > disabling the profiler in a patched OpenJDK and an entirely unmodified > OpenJDK. I did this for the benchmarks of the DaCapo and scalabench suites. > > For many benchmarks, there is some difference even when the profiler is > not enabled. Still, the disabled case was not something that we > optimized for. I think that most, if not all of that cost can be shaved > off by revisiting changes to frequent code paths and to the object layouts. . . . 
Thanks very much for doing this! Am I safe to assume the y axis measures execution time? The differences never appear to be very great but a few of the tests show a couple of percent points which is maybe a little troubling. It would probably help if you could improve on that. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From peter.hofer at jku.at Wed Nov 23 16:06:59 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Wed, 23 Nov 2016 17:06:59 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> References: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> Message-ID: <1effc60c-94ce-f42c-8756-310737969799@jku.at> Hi Andrew, I finally got around to measuring the change in execution times between disabling the profiler in a patched OpenJDK and an entirely unmodified OpenJDK. I did this for the benchmarks of the DaCapo and scalabench suites. For many benchmarks, there is some difference even when the profiler is not enabled. Still, the disabled case was not something that we optimized for. I think that most, if not all of that cost can be shaved off by revisiting changes to frequent code paths and to the object layouts. Here are the results for the JDK 8u patch: > http://ssw.jku.at/General/Staff/PH/lct/unmodified-vs-disabled/jdk8u.pdf For the JDK 9 patch (tracing only native locks): > http://ssw.jku.at/General/Staff/PH/lct/unmodified-vs-disabled/jdk9-nativeonly.pdf I measured this on a openSUSE 13.2 system with a single Intel Core i7-4790K processor, using a fixed Java heap size of 8 GB. Cheers, Peter On 11/04/2016 03:21 PM, Andrew Dinn wrote: > On 04/11/16 12:04, Peter Hofer wrote: > . . . 
>>> Have you measured the overhead this change produces when running with >>> contention detection disabled? (i.e. do we pay to have this feature even >>> when we don't use it). >> >> We measured only the overhead relative to an unmodified OpenJDK build. >> >> Our profiler observes only lock contention, which is generally handled >> via slow paths in the VM code, so this is where we added the code to >> record events. I don't expect this code to cause much overhead when >> disabled. However, we added fields to several data structures, which >> might make a difference. > > Yes, increased footprint (in code as well as object space) would be as > much a concern as increased execution time. > >> I'll run some more benchmarks and report my findings. > > Thanks very much. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From igor.ignatyev at oracle.com Thu Nov 24 12:14:31 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 24 Nov 2016 15:14:31 +0300 Subject: RFR(XXS) : 8170228 : register closed @requires property setter In-Reply-To: References: Message-ID: <9D707B36-466F-4739-994B-2695E020D030@oracle.com> Dima, thanks for the review. I've added more detail to the bug report. I hope it'll be enough for descendants to understand why it was needed. Thanks, -- Igor > On Nov 23, 2016, at 4:24 PM, Dmitry Fazunenenko wrote: > > Hi Igor, > > The change itself looks good to me. > > Would you provide a bit more information in the CR? > "register closed @requires property setter" doesn't provide enough information to understand the reasons why it's necessary. > > Thanks, > Dima > > On 23.11.2016 15:46, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the changeset which registers closed vm property setter (for @requires expressions)? 
>> this setter is registered as optional, so test execution won't fail if the file doesn't exist. >> >> webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ >> webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ >> JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 >> >> Thanks, >> -- Igor > From marcus.larsson at oracle.com Thu Nov 24 14:35:37 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 24 Nov 2016 15:35:37 +0100 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> Message-ID: Hi, On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: > Marcus, > > Thank you for the prompt reply! > > Could you please read comments inline? > I'm looking forward to your reply. > > Thank you. > > Regards, Kirill > > On 22.11.2016 15:32, Marcus Larsson wrote: >> Hi, >> >> >> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>> Marcus, >>> >>> Thank you for reviewing the fix! >>>>> WebRev: >>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>> >>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>> should accept either. You could let sscanf read out the decimal >>>> point as a character and just verify that it is one of the two. >>>> >>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>> that we won't accept "Z" suffixed strings. Please revert that. >>> I agree that ISO8601 could add "Z" to time (and as far as I >>> understand date/time without delimiters is legal too) but these are >>> the unit tests. >>> Hence they cover the existing code and they should pass only if the >>> result corresponds to existing code and fail otherwise. 
>>> The current code from os::iso8601_time formats the date/time string as >>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>> consider any other format as valid. >>> >>> Could you please let me know your opinion? >> >> I think the test should verify the intended behavior, not the >> implementation. If we refactor or change something in iso8601_time() >> we shouldn't be failing the test if it still conforms to ISO8601, IMO. > I would agree with you if we were talking about a functional test. But > since it is a unit test I think we should keep it as close to > the implementation as possible. > If the implementation is changed unintentionally the test fails and > signals us that something is broken. > If it is an intentional change the test must be updated correspondingly. I still think it's unnecessary noise, but if you insist I'm fine with it. If we're not going to accept anything other than the current implementation then you should also remove the if-case for the Z suffix, since the test will fail for that anyway. Thanks, Marcus > >> >> Thanks, >> Marcus >> >>> >>> Thank you. >>> >>> Regards, Kirill >>> >>>> >>>> Thanks, >>>> Marcus >>>> >>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>> >>>>> Thank you. >>>>> >>>>> Regards, Kirill >>>> >>> >> > From vladimir.x.ivanov at oracle.com Thu Nov 24 20:26:45 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 24 Nov 2016 23:26:45 +0300 Subject: RFR(XXS) : register closed @requires property setter In-Reply-To: References: Message-ID: Reviewed. Best regards, Vladimir Ivanov On 11/23/16 3:46 PM, Igor Ignatyev wrote: > Hi all, > > could you please review the changeset which registers closed vm property setter (for @requires expressions)? > this setter is registered as optional, so test execution won't fail if the file doesn't exist. 
> > webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ > webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ > JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 > > Thanks, > ? Igor > From david.holmes at oracle.com Fri Nov 25 10:38:39 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Nov 2016 20:38:39 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 The bug is not public unfortunately for non-technical reasons - but see my eval below. Background: if you load the JVM from the primordial thread of a process (not done by the java launcher since JDK 6), there is an artificial stack limit imposed on the initial thread (by sticking the guard page at the limit position of the actual stack) of the minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is ignored for the main thread even if the true stack is, say, 8M. This limitation dates back 10-15 years and is no longer relevant today and should be removed (see below). I've also added additional explanatory notes. webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ Testing was manually done by modifying the launcher to not run the VM in a new thread, and checking the resulting stack size used. This change will only affect hosted JVMs launched with a -Xss value > 2M. Thanks, David ----- Bug eval: JDK-4441425 limits the stack to 8M as a safeguard against an unlimited value from getrlimit in 1.3.1, but further constrained that to 2M in 1.4.0 due to JDK-4466587. 
By 1.4.2 we have the basic form of the current problematic code:

#ifndef IA64
  if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K;
#else
  // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small
  if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K;
#endif

  _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1);

  if (max_size && _initial_thread_stack_size > max_size) {
    _initial_thread_stack_size = max_size;
  }

This was added by JDK-4678676 to allow the stack of the main thread to
be _reduced_ below the default 2M/4M if the -Xss value was smaller than
that.** There was no intent to allow the stack size to follow -Xss
arbitrarily due to the operational constraints imposed by the OS/glibc
at the time when dealing with the primordial process thread.

** It could not actually change the actual stack size of course, but
set the guard pages to limit use to the expected stack size.

In JDK 6, under JDK-6316197, the launcher was changed to create the JVM
in a new thread, so that it was not limited by the idiosyncrasies of the
OS or thread library primordial thread handling. However, the stack size
limitations remained in place in case the VM was launched from the
primordial thread of a user application via the JNI invocation API.

I believe it should be safe to remove the 2M limitation now.
From volker.simonis at gmail.com Fri Nov 25 13:32:37 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 25 Nov 2016 14:32:37 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5835B6D7.4020101@linux.vnet.ibm.com> References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> <5835B6D7.4020101@linux.vnet.ibm.com> Message-ID: Hi Gustavo, we've realized that we have exactly the same problem on Linux/s390 so I hope you don't mind that I've updated the bug and the webrev to also include the fix for Linux/s390: http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.top/ http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.jdk/ https://bugs.openjdk.java.net/browse/JDK-8170153 The top-level change stays the same (I've only added the current reviewers) and for the jdk change I've just added Linux/s390 as another platform which can compile fdlibm with HIGH optimization. Thanks, Volker On Wed, Nov 23, 2016 at 4:33 PM, Gustavo Romero wrote: > Hi Erik, > > On 23-11-2016 12:29, Erik Joelsson wrote: >> Build changes look ok. >> >> In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. > > Thanks a lot for reviewing the change. > > > Regards, > Gustavo > From erik.joelsson at oracle.com Fri Nov 25 14:06:27 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 25 Nov 2016 15:06:27 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> <5835B6D7.4020101@linux.vnet.ibm.com> Message-ID: <98b7942d-837e-0166-93de-9ea256bb1ecf@oracle.com> Looks good. 
/Erik On 2016-11-25 14:32, Volker Simonis wrote: > Hi Gustavo, > > we've realized that we have exactly the same problem on Linux/s390 so > I hope you don't mind that I've updated the bug and the webrev to also > include the fix for Linux/s390: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.top/ > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.jdk/ > https://bugs.openjdk.java.net/browse/JDK-8170153 > > The top-level change stays the same (I've only added the current > reviewers) and for the jdk change I've just added Linux/s390 as > another platform which can compile fdlibm with HIGH optimization. > > Thanks, > Volker > > On Wed, Nov 23, 2016 at 4:33 PM, Gustavo Romero > wrote: >> Hi Erik, >> >> On 23-11-2016 12:29, Erik Joelsson wrote: >>> Build changes look ok. >>> >>> In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. >> Thanks a lot for reviewing the change. >> >> >> Regards, >> Gustavo >> From kirill.zhaldybin at oracle.com Fri Nov 25 17:23:52 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Fri, 25 Nov 2016 20:23:52 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> Message-ID: <583873A8.8000106@oracle.com> Marcus, Here are a new webrev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ I addressed your comment about "if-case for the Z suffix". Could you please let me know your opinion? Thank you. Regards, Kirill On 24.11.2016 17:35, Marcus Larsson wrote: > Hi, > > > On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >> Marcus, >> >> Thank you for prompt reply! >> >> Could you please read comments inline? 
>> I'm looking forward to your reply. >> >> Thank you. >> >> Regards, Kirill >> >> On 22.11.2016 15:32, Marcus Larsson wrote: >>> Hi, >>> >>> >>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>> Marcus, >>>> >>>> Thank you for reviewing the fix! >>>>>> WebRev: >>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>> >>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>> should accept either. You could let sscanf read out the decimal >>>>> point as a character and just verify that it is one of the two. >>>>> >>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>> understand date/time without delimiters is legal too) but these are >>>> the unit tests. >>>> Hence they cover the existing code and they should pass only if the >>>> result corresponds to existing code and fail otherwise. >>>> The current code from os::iso8601_time format date/time string >>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>> consider any other format as valid. >>>> >>>> Could you please let me know your opinion? >>> >>> I think the test should verify the intended behavior, not the >>> implementation. If we refactor or change something in iso8601_time() >>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >> I would agree with you if we were talking about a functional test. But >> since it is an unit test I think we should keep it as close to >> implementation as possible. >> If the implementation is changed unintentionally the test fails and >> signals us that something is broken. >> If it is an intentional change the test must be updated correspondingly. > > I still think it's unnecessary noise, but if you insist I'm fine with it. 
> > If we're not going to accept anything else than the current > implementation then you should also remove the if-case for the Z suffix, > since the test will fail for that anyway. > > Thanks, > Marcus > >> >>> >>> Thanks, >>> Marcus >>> >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>>> >>>>> >>>>> Thanks, >>>>> Marcus >>>>> >>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, Kirill >>>>> >>>> >>> >> > From igor.ignatyev at oracle.com Fri Nov 25 20:01:49 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 25 Nov 2016 23:01:49 +0300 Subject: RFR(XXS) : register closed @requires property setter In-Reply-To: References: Message-ID: <688B8C65-700C-4DAD-B959-BE5429688ACF@oracle.com> Vladimir, thanks a lot for your Review. ? Igor > On Nov 24, 2016, at 11:26 PM, Vladimir Ivanov wrote: > > Reviewed. > > Best regards, > Vladimir Ivanov > > On 11/23/16 3:46 PM, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the changeset which registers closed vm property setter (for @requires expressions)? >> this setter is register as optional, so test execution won?t fail if the file doesn?t exist. >> >> webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ >> webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ >> JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 >> >> Thanks, >> ? Igor >> From ioi.lam at oracle.com Mon Nov 28 03:58:19 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 27 Nov 2016 19:58:19 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <5832A7FC.8030505@oracle.com> References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> Message-ID: <583BAB5B.4020404@oracle.com> I found a problem in my previous patch. 
Here's the fix (on top of the previous patch):

diff -r 3404f61c7081 src/share/vm/oops/method.cpp
--- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800
+++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800
@@ -1031,11 +1031,13 @@
   // leftover methods that weren't linked.
   if (is_shared()) {
     address entry = Interpreter::entry_for_cds_method(h_method);
-    assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry,
+    assert(entry != NULL && entry == _i2i_entry,
            "should be correctly set during dump time");
     if (adapter() != NULL) {
       return;
     }
+    assert(entry == _from_interpreted_entry,
+           "should be correctly set during dump time");
   } else if (_i2i_entry != NULL) {
     return;
   }

The problem is: if the method has been compiled, then a shared method's
_from_interpreted_entry would be different than _i2i_entry (see
Method::set_code()).

I am not sure if Method::link_method() would ever be called after
it's been compiled, but I think it's safer to make the asserts no
stronger than before this patch.

Thanks
- Ioi

On 11/20/16 11:53 PM, Tobias Hartmann wrote:
> Hi Ioi,
>
> this looks good to me, the detailed description including the diagram is very nice and helps to understand the complex implementation!
>
> For the record: the test mentioned in [1] is part of my fix for JDK-8169711.
>
> Best regards,
> Tobias
>
> On 21.11.2016 07:58, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8169867
>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/
>>
>> Thanks to Tobias for finding the bug. I have done the following
>>
>> + integrated Tobias' suggested fix
>> + fixed Method::restore_unshareable_info to call Method::link_method
>> + added comments and a diagram to illustrate how the CDS method entry
>> trampolines work.
>>
>> BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline.
>> It's basically an extra level of indirection to get to the adapter. However.
>> The word "trampoline" usually is used for and extra jump in executable code, >> so it may be a little confusing when we use it for a data pointer here. >> >> Any suggest for a better name? >> >> >> Testing: >> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >> now it produces the correct assertion. I won't check in this test, though, >> since it won't assert anymore after Tobias fixes 8169711. >> >> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 >> # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: >> # should be correctly set during dump time >> >> [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >> All tests passed. >> >> Thanks >> - Ioi >> From david.holmes at oracle.com Mon Nov 28 05:55:35 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Nov 2016 15:55:35 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> References: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> Message-ID: Hi Martin On 24/11/2016 2:20 AM, Doerr, Martin wrote: > Hi David, > > thank you very much for the presentation. I think it provides a good guideline for hotspot development. Thanks. > > Would you like to add something about multi-copy atomicity? Not really. :) > E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). > > It is needed in the following scenario: > - Different threads write 2 variables. > - Readers of these 2 variables expect a globally consistent order of the write accesses. 
> > In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > (While taking a look at it, the condition "#if !(defined SPARC || defined IA32 || defined AMD64)" is not accurate and should better get improved. E.g. s390 is multi-copy atomic.) > > > I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :( Cheers, David > > Thanks and best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of David Holmes > Sent: Mittwoch, 23. November 2016 06:08 > To: hotspot-dev developers > Subject: Presentation: Understanding OrderAccess > > This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. > > http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf > > Cheers, > David > From david.holmes at oracle.com Mon Nov 28 06:08:34 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Nov 2016 16:08:34 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: <0b9c05c9-2d56-d448-550e-1c83d1ed7aec@oracle.com> On 23/11/2016 8:40 PM, Andrew Haley wrote: > On 23/11/16 05:08, David Holmes wrote: >> This is a presentation I recently gave internally to the runtime and >> serviceability teams that may be of more general interest to hotspot >> developers. 
>> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf > > That's pretty cool; nicely done. Thanks Andrew. > I'd quibble about a couple of minor things: > > In Data Race Example: Using Barriers, the use of a naked StoreStore is > rather terrifying. In real-world code it'd be better to use > StoreStore|LoadStore or release unless the author really knows what > they're doing. It would all depend on the exact code of course. The simple flag+data example doesn't require it. > The use of "fence" to mean a full barrier is rather idiosyncratic; it > confused me the first time I saw it in HotSpot source, and from time > to time it still does. Yeah not sure the detailed history there - possibly related to x86 mfence. Cheers, David > But, as I said, these are minor criticisms. > > Andrew. > From marcus.larsson at oracle.com Mon Nov 28 10:06:27 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Mon, 28 Nov 2016 11:06:27 +0100 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <583873A8.8000106@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: Hi, On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: > Marcus, > > Here are a new webrev: > http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ Looks ok. Thanks, Marcus > > I addressed your comment about "if-case for the Z suffix". > > Could you please let me know your opinion? > > Thank you. > > Regards, Kirill > > On 24.11.2016 17:35, Marcus Larsson wrote: >> Hi, >> >> >> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>> Marcus, >>> >>> Thank you for prompt reply! >>> >>> Could you please read comments inline? >>> I'm looking forward to your reply. >>> >>> Thank you. 
>>> >>> Regards, Kirill >>> >>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>> Hi, >>>> >>>> >>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>> Marcus, >>>>> >>>>> Thank you for reviewing the fix! >>>>>>> WebRev: >>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>> >>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>> should accept either. You could let sscanf read out the decimal >>>>>> point as a character and just verify that it is one of the two. >>>>>> >>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>> understand date/time without delimiters is legal too) but these are >>>>> the unit tests. >>>>> Hence they cover the existing code and they should pass only if the >>>>> result corresponds to existing code and fail otherwise. >>>>> The current code from os::iso8601_time format date/time string >>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>> consider any other format as valid. >>>>> >>>>> Could you please let me know your opinion? >>>> >>>> I think the test should verify the intended behavior, not the >>>> implementation. If we refactor or change something in iso8601_time() >>>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >>> I would agree with you if we were talking about a functional test. But >>> since it is an unit test I think we should keep it as close to >>> implementation as possible. >>> If the implementation is changed unintentionally the test fails and >>> signals us that something is broken. >>> If it is an intentional change the test must be updated >>> correspondingly. >> >> I still think it's unnecessary noise, but if you insist I'm fine with >> it. 
>> >> If we're not going to accept anything else than the current >> implementation then you should also remove the if-case for the Z suffix, >> since the test will fail for that anyway. >> >> Thanks, >> Marcus >> >>> >>>> >>>> Thanks, >>>> Marcus >>>> >>>>> >>>>> Thank you. >>>>> >>>>> Regards, Kirill >>>>> >>>>>> >>>>>> Thanks, >>>>>> Marcus >>>>>> >>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Regards, Kirill >>>>>> >>>>> >>>> >>> >> > From martin.doerr at sap.com Mon Nov 28 10:43:22 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 28 Nov 2016 10:43:22 +0000 Subject: Presentation: Understanding OrderAccess Message-ID: Hi David, I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). The term "multiple-copy atomicity" is described as "... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. Since you have asked about C++11, there's an example implementation for PPC [3]. Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. 
But I guess the Java memory model is beyond the scope of your presentation. Best regards, Martin [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf [2] http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030212.html [3] http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 06:56 To: Doerr, Martin ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess Hi Martin On 24/11/2016 2:20 AM, Doerr, Martin wrote: > Hi David, > > thank you very much for the presentation. I think it provides a good guideline for hotspot development. Thanks. > > Would you like to add something about multi-copy atomicity? Not really. :) > E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). > > It is needed in the following scenario: > - Different threads write 2 variables. > - Readers of these 2 variables expect a globally consistent order of the write accesses. > > In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > (While taking a look at it, the condition "#if !(defined SPARC || > defined IA32 || defined AMD64)" is not accurate and should better get > improved. E.g. s390 is multi-copy atomic.) > > > I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? 
:( Cheers, David > > Thanks and best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of David Holmes > Sent: Mittwoch, 23. November 2016 06:08 > To: hotspot-dev developers > Subject: Presentation: Understanding OrderAccess > > This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. > > http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderA > ccess-v1.1.pdf > > Cheers, > David > From aph at redhat.com Mon Nov 28 10:50:55 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 28 Nov 2016 10:50:55 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <0b9c05c9-2d56-d448-550e-1c83d1ed7aec@oracle.com> References: <0b9c05c9-2d56-d448-550e-1c83d1ed7aec@oracle.com> Message-ID: <8f4d6742-3592-7539-b176-028522ac2d32@redhat.com> On 28/11/16 06:08, David Holmes wrote: > On 23/11/2016 8:40 PM, Andrew Haley wrote: >> On 23/11/16 05:08, David Holmes wrote: > >> I'd quibble about a couple of minor things: >> >> In Data Race Example: Using Barriers, the use of a naked StoreStore is >> rather terrifying. In real-world code it'd be better to use >> StoreStore|LoadStore or release unless the author really knows what >> they're doing. > > It would all depend on the exact code of course. The simple flag+data > example doesn't require it. Ya, but it's a rare case: it's a bit like teaching someone to use a chainsaw before they've learned to use a knife and fork. :-) Andrew. 
From aph at redhat.com Mon Nov 28 10:59:09 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 28 Nov 2016 10:59:09 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> Message-ID: <681454af-691b-268b-8328-a636b67a8afa@redhat.com> On 28/11/16 05:55, David Holmes wrote: > I still can't get my head around the C++11 terminology for this and how > you are expected to use it - what does it mean for an individual > operation to be "sequentially consistent" ? :( It means that a set of atomic::seq_cst loads and stores form a sequentially consistent order. So, if your program uses *only* atomic::seq_cst operations, "... the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program." Andrew. Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Trans. Comput. C-28,9 (Sept. 1979), 690-691. From igor.ignatyev at oracle.com Mon Nov 28 12:19:05 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 28 Nov 2016 15:19:05 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: Hi Kirill, looks good to me, thanks for fixing that. Cheers, ? Igor > On Nov 28, 2016, at 1:06 PM, Marcus Larsson wrote: > > Hi, > > > On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: >> Marcus, >> >> Here are a new webrev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ > > Looks ok. 
> > Thanks, > Marcus > >> >> I addressed your comment about "if-case for the Z suffix". >> >> Could you please let me know your opinion? >> >> Thank you. >> >> Regards, Kirill >> >> On 24.11.2016 17:35, Marcus Larsson wrote: >>> Hi, >>> >>> >>> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>>> Marcus, >>>> >>>> Thank you for prompt reply! >>>> >>>> Could you please read comments inline? >>>> I'm looking forward to your reply. >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>>> >>>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>>> Hi, >>>>> >>>>> >>>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>>> Marcus, >>>>>> >>>>>> Thank you for reviewing the fix! >>>>>>>> WebRev: >>>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>>> >>>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>>> should accept either. You could let sscanf read out the decimal >>>>>>> point as a character and just verify that it is one of the two. >>>>>>> >>>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>>> understand date/time without delimiters is legal too) but these are >>>>>> the unit tests. >>>>>> Hence they cover the existing code and they should pass only if the >>>>>> result corresponds to existing code and fail otherwise. >>>>>> The current code from os::iso8601_time format date/time string >>>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>>> consider any other format as valid. >>>>>> >>>>>> Could you please let me know your opinion? >>>>> >>>>> I think the test should verify the intended behavior, not the >>>>> implementation. If we refactor or change something in iso8601_time() >>>>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >>>> I would agree with you if we were talking about a functional test. 
But >>>> since it is an unit test I think we should keep it as close to >>>> implementation as possible. >>>> If the implementation is changed unintentionally the test fails and >>>> signals us that something is broken. >>>> If it is an intentional change the test must be updated correspondingly. >>> >>> I still think it's unnecessary noise, but if you insist I'm fine with it. >>> >>> If we're not going to accept anything else than the current >>> implementation then you should also remove the if-case for the Z suffix, >>> since the test will fail for that anyway. >>> >>> Thanks, >>> Marcus >>> >>>> >>>>> >>>>> Thanks, >>>>> Marcus >>>>> >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, Kirill >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Marcus >>>>>>> >>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>>> >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Regards, Kirill >>>>>>> >>>>>> >>>>> >>>> >>> >> > From tobias.hartmann at oracle.com Mon Nov 28 12:40:33 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 28 Nov 2016 13:40:33 +0100 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <583BAB5B.4020404@oracle.com> References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> <583BAB5B.4020404@oracle.com> Message-ID: <583C25C1.1000605@oracle.com> Hi Ioi, On 28.11.2016 04:58, Ioi Lam wrote: > I found a problem in my previous patch. Here's the fix (on top of he previous patch): > > diff -r 3404f61c7081 src/share/vm/oops/method.cpp > --- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800 > +++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800 > @@ -1031,11 +1031,13 @@ > // leftover methods that weren't linked. 
> if (is_shared()) { > address entry = Interpreter::entry_for_cds_method(h_method); > - assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, > + assert(entry != NULL && entry == _i2i_entry, > "should be correctly set during dump time"); > if (adapter() != NULL) { > return; > } > + assert(entry == _from_interpreted_entry, > + "should be correctly set during dump time"); > } else if (_i2i_entry != NULL) { > return; > } > > The problem is: if the method has been compiled, then a shared method's > _from_interpreted_entry would be different than _i2i_entry (see > Method::set_code()). > > I am not sure if Method::link_method() would ever be called after > it's been compiled, but I think it's safer to make the asserts no > stronger than before this patch. That looks reasonable to me! Thanks, Tobias > Thanks > - Ioi > > > On 11/20/16 11:53 PM, Tobias Hartmann wrote: >> Hi Ioi, >> >> this looks good to me, the detailed description including the diagram is very nice and helps to understand the complex implementation! >> >> For the record: the test mentioned in [1] is part of my fix for JDK-8169711. >> >> Best regards, >> Tobias >> >> On 21.11.2016 07:58, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>> >>> Thanks to Tobias for finding the bug. I have done the following >>> >>> + integrated Tobias' suggested fix >>> + fixed Method::restore_unshareable_info to call Method::link_method >>> + added comments and a diagram to illustrate how the CDS method entry >>> trampolines work. >>> >>> BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. >>> It's basically an extra level of indirection to get to the adapter. However. >>> The word "trampoline" usually is used for and extra jump in executable code, >>> so it may be a little confusing when we use it for a data pointer here. >>> >>> Any suggest for a better name? 
>>> >>> >>> Testing: >>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>> now it produces the correct assertion. I won't check in this test, though, >>> since it won't assert anymore after Tobias fixes 8169711. >>> >>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 >>> # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: >>> # should be correctly set during dump time >>> >>> [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>> All tests passed. >>> >>> Thanks >>> - Ioi >>> > From david.holmes at oracle.com Mon Nov 28 12:55:51 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Nov 2016 22:55:51 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: Hi Martin, On 28/11/2016 8:43 PM, Doerr, Martin wrote: > Hi David, > > I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. > I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). > > The term "multiple-copy atomicity" is described as > "... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". > > I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. > The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". Thanks for the reminder of that discussion. :) > A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. 
Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose. > Since you have asked about C++11, there's an example implementation for PPC [3]. > Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" loads observe writes in a globally consistent order. 
:) > >> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >> >> It is needed in the following scenario: >> - Different threads write 2 variables. >> - Readers of these 2 variables expect a globally consistent order of the write accesses. >> >> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". > > Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > >> (While taking a look at it, the condition "#if !(defined SPARC || >> defined IA32 || defined AMD64)" is not accurate and should better get >> improved. E.g. s390 is multi-copy atomic.) >> >> >> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. > > I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :( > > Cheers, > David > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Mittwoch, 23. November 2016 06:08 >> To: hotspot-dev developers >> Subject: Presentation: Understanding OrderAccess >> >> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. 
>> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderA >> ccess-v1.1.pdf >> >> Cheers, >> David >> From kirill.zhaldybin at oracle.com Mon Nov 28 13:01:25 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 28 Nov 2016 16:01:25 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: Markus, Thank you for review! Regards, Kirill On 28.11.2016 13:06, Marcus Larsson wrote: > Hi, > > > On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: >> Marcus, >> >> Here are a new webrev: >> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ > > Looks ok. > > Thanks, > Marcus > >> >> I addressed your comment about "if-case for the Z suffix". >> >> Could you please let me know your opinion? >> >> Thank you. >> >> Regards, Kirill >> >> On 24.11.2016 17:35, Marcus Larsson wrote: >>> Hi, >>> >>> >>> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>>> Marcus, >>>> >>>> Thank you for prompt reply! >>>> >>>> Could you please read comments inline? >>>> I'm looking forward to your reply. >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>>> >>>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>>> Hi, >>>>> >>>>> >>>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>>> Marcus, >>>>>> >>>>>> Thank you for reviewing the fix! >>>>>>>> WebRev: >>>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>>>> >>>>>>> >>>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>>> should accept either. You could let sscanf read out the decimal >>>>>>> point as a character and just verify that it is one of the two. 
>>>>>>> >>>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>>> understand date/time without delimiters is legal too) but these are >>>>>> the unit tests. >>>>>> Hence they cover the existing code and they should pass only if the >>>>>> result corresponds to existing code and fail otherwise. >>>>>> The current code from os::iso8601_time format date/time string >>>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>>> consider any other format as valid. >>>>>> >>>>>> Could you please let me know your opinion? >>>>> >>>>> I think the test should verify the intended behavior, not the >>>>> implementation. If we refactor or change something in iso8601_time() >>>>> we shouldn't be failing the test if it still conforms to ISO8601, >>>>> IMO. >>>> I would agree with you if we were talking about a functional test. But >>>> since it is an unit test I think we should keep it as close to >>>> implementation as possible. >>>> If the implementation is changed unintentionally the test fails and >>>> signals us that something is broken. >>>> If it is an intentional change the test must be updated >>>> correspondingly. >>> >>> I still think it's unnecessary noise, but if you insist I'm fine >>> with it. >>> >>> If we're not going to accept anything else than the current >>> implementation then you should also remove the if-case for the Z >>> suffix, >>> since the test will fail for that anyway. >>> >>> Thanks, >>> Marcus >>> >>>> >>>>> >>>>> Thanks, >>>>> Marcus >>>>> >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, Kirill >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Marcus >>>>>>> >>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>>> >>>>>>>> Thank you. 
>>>>>>>> >>>>>>>> Regards, Kirill >>>>>>> >>>>>> >>>>> >>>> >>> >> > From kirill.zhaldybin at oracle.com Mon Nov 28 13:01:53 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 28 Nov 2016 16:01:53 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: <0b1cf3ff-5317-cd1c-e5e3-e43ef215bdb7@oracle.com> Igor, Thank you for review! Regards, Kirill On 28.11.2016 15:19, Igor Ignatyev wrote: > Hi Kirill, > > looks good to me, thanks for fixing that. > > Cheers, > ? Igor > >> On Nov 28, 2016, at 1:06 PM, Marcus Larsson wrote: >> >> Hi, >> >> >> On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: >>> Marcus, >>> >>> Here are a new webrev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ >> Looks ok. >> >> Thanks, >> Marcus >> >>> I addressed your comment about "if-case for the Z suffix". >>> >>> Could you please let me know your opinion? >>> >>> Thank you. >>> >>> Regards, Kirill >>> >>> On 24.11.2016 17:35, Marcus Larsson wrote: >>>> Hi, >>>> >>>> >>>> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>>>> Marcus, >>>>> >>>>> Thank you for prompt reply! >>>>> >>>>> Could you please read comments inline? >>>>> I'm looking forward to your reply. >>>>> >>>>> Thank you. >>>>> >>>>> Regards, Kirill >>>>> >>>>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>>>> Marcus, >>>>>>> >>>>>>> Thank you for reviewing the fix! >>>>>>>>> WebRev: >>>>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>>>> should accept either. 
You could let sscanf read out the decimal >>>>>>>> point as a character and just verify that it is one of the two. >>>>>>>> >>>>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>>>> understand date/time without delimiters is legal too) but these are >>>>>>> the unit tests. >>>>>>> Hence they cover the existing code and they should pass only if the >>>>>>> result corresponds to existing code and fail otherwise. >>>>>>> The current code from os::iso8601_time format date/time string >>>>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>>>> consider any other format as valid. >>>>>>> >>>>>>> Could you please let me know your opinion? >>>>>> I think the test should verify the intended behavior, not the >>>>>> implementation. If we refactor or change something in iso8601_time() >>>>>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >>>>> I would agree with you if we were talking about a functional test. But >>>>> since it is an unit test I think we should keep it as close to >>>>> implementation as possible. >>>>> If the implementation is changed unintentionally the test fails and >>>>> signals us that something is broken. >>>>> If it is an intentional change the test must be updated correspondingly. >>>> I still think it's unnecessary noise, but if you insist I'm fine with it. >>>> >>>> If we're not going to accept anything else than the current >>>> implementation then you should also remove the if-case for the Z suffix, >>>> since the test will fail for that anyway. >>>> >>>> Thanks, >>>> Marcus >>>> >>>>>> Thanks, >>>>>> Marcus >>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Regards, Kirill >>>>>>> >>>>>>>> Thanks, >>>>>>>> Marcus >>>>>>>> >>>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>>>> >>>>>>>>> Thank you. 
>>>>>>>>> >>>>>>>>> Regards, Kirill From gromero at linux.vnet.ibm.com Mon Nov 28 13:24:40 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 28 Nov 2016 11:24:40 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> <5835B6D7.4020101@linux.vnet.ibm.com> Message-ID: <583C3018.5080109@linux.vnet.ibm.com> Hi Volker, Sorry for not replying earlier, it was day-off on Friday here... On 25-11-2016 11:32, Volker Simonis wrote: > Hi Gustavo, > > we've realized that we have exactly the same problem on Linux/s390 so > I hope you don't mind that I've updated the bug and the webrev to also > include the fix for Linux/s390: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.top/ > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.jdk/ > https://bugs.openjdk.java.net/browse/JDK-8170153 > > The top-level change stays the same (I've only added the current > reviewers) and for the jdk change I've just added Linux/s390 as > another platform which can compile fdlibm with HIGH optimization. Actually, it's really cool to know that an analysis on PPC64 contributed also to the s390 arch! :) Thanks for providing the updated webrevs. Regards, Gustavo From stefan.karlsson at oracle.com Mon Nov 28 13:52:20 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 14:52:20 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist Message-ID: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Hi all, Please, review this patch to fix metaspace initialization. 
http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8170395 The fix for JDK-8169931 introduced a new assert to ensure that we always try to allocate chunks that are any of the three fixed sizes (specialized, small, medium) or a humongous chunk (if it is larger than the medium chunk size). During metaspace initialization an initial metaspace chunk is allocated. The size of some of the metaspace instances can be specified on the command line. For example: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version If this size is smaller than the medium chunk size and at the same time doesn't match the specialized or small chunk size, then we end up hitting the assert mentioned above: # # Internal Error (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, tid=31646 # assert(size > free_chunks(MediumIndex)->size()) failed: Not a humongous chunk # ======================================================================== The most important part of the fix is this line: + // Adjust to one of the fixed chunk sizes (unless humongous) + const size_t adjusted = adjust_initial_chunk_size(requested); which ensures that we always request one of the specialized, small, medium, or humongous chunk sizes, even if the requested size is none of these. Most of the other code is refactoring to unify the non-class metaspace and the class metaspace code paths to get rid of some of the existing code duplication, bring the chunk size calculation nearer to the actual chunk allocation, and make it easier to write a unit test for the new adjust_initial_chunk_size function. ======================================================================== The patch for JDK-8169931 was backed out with JDK-8170355 and will be reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
Testing: jprt, unit test, parts of PIT testing (including CDS tests), failing test Thanks, StefanK From michail.chernov at oracle.com Mon Nov 28 13:57:23 2016 From: michail.chernov at oracle.com (Michail Chernov) Date: Mon, 28 Nov 2016 16:57:23 +0300 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Stefan, Could you please add simple regression test for this case? Thanks, Michail On 28.11.2016 16:52, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we > always try to allocate chunks that are any of the three fixed sizes > (specialized, small, medium) or a humongous chunk (if it is larger > then the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. 
For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same > time doesn't match the specialized or small chunk size, then we end up > hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither > of these. > > Most of the other code is refactoring to unify the non-class metaspace > and the class metaspace code paths to get rid of some of the existing > code duplication, bring the chunk size calculation nearer to the the > actual chunk allocation, and make it easier to write a unit test for > the new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. > > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK From martin.doerr at sap.com Mon Nov 28 14:37:16 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 28 Nov 2016 14:37:16 +0000 Subject: [JUNK] Re: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: <0acb8779574543ff80607e460a81061f@dewdfe13de06.global.corp.sap> Hi David, > Problem there, I think, is that fence() is really not special in that regard. 
You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try: I think the comment in orderAccess.hpp is not bad: // Finally, we define a "fence" operation, as a bidirectional barrier. // It guarantees that any memory access preceding the fence is not // reordered w.r.t. any memory accesses subsequent to the fence in program // order. One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. The same is valid for B. Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... > but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. 
Best regards, Martin [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering [5] http://g.oswego.edu/dl/jmm/cookbook.html -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 13:56 To: Doerr, Martin ; hotspot-dev developers Subject: [JUNK] Re: Presentation: Understanding OrderAccess Hi Martin, On 28/11/2016 8:43 PM, Doerr, Martin wrote: > Hi David, > > I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. > I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). > > The term "multiple-copy atomicity" is described as "... in a machine > which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". > > I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. > The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". Thanks for the reminder of that discussion. :) > A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomicarchitectures to use just for this purpose. > Since you have asked about C++11, there's an example implementation for PPC [3]. > Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. 
Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. > But I guess the Java memory model is beyond the scope of your presentation. Oh yes way out of scope! :) Cheers, David > Best regards, > Martin > > > [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf > [2] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030 > 212.html [3] > http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Montag, 28. November 2016 06:56 > To: Doerr, Martin ; hotspot-dev developers > > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin > > On 24/11/2016 2:20 AM, Doerr, Martin wrote: >> Hi David, >> >> thank you very much for the presentation. I think it provides a good guideline for hotspot development. > > Thanks. > >> >> Would you like to add something about multi-copy atomicity? > > Not really. :) > >> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >> >> It is needed in the following scenario: >> - Different threads write 2 variables. >> - Readers of these 2 variables expect a globally consistent order of the write accesses. >> >> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". > > Hmmm ... I know this code was discussed at length a couple of years ago ... 
and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > >> (While taking a look at it, the condition "#if !(defined SPARC || >> defined IA32 || defined AMD64)" is not accurate and should better get >> improved. E.g. s390 is multi-copy atomic.) >> >> >> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. > > I still can't get my head around the C++11 terminology for this and > how you are expected to use it - what does it mean for an individual > operation to be "sequentially consistent" ? :( > > Cheers, > David > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Mittwoch, 23. November 2016 06:08 >> To: hotspot-dev developers >> Subject: Presentation: Understanding OrderAccess >> >> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. >> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-Order >> A >> ccess-v1.1.pdf >> >> Cheers, >> David >> From gromero at linux.vnet.ibm.com Mon Nov 28 16:28:00 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 28 Nov 2016 14:28:00 -0200 Subject: RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation Message-ID: <583C5B10.8040204@linux.vnet.ibm.com> Hi all, I'm re-sending due to JDK title update to include s390x and aarch64 archs. Could the following webrev be reviewed, please? webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/v2/ webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/v2/jdk/ bug: https://bugs.openjdk.java.net/browse/JDK-8170153 Thank you. 
Regards, Gustavo From martin.doerr at sap.com Mon Nov 28 16:29:28 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 28 Nov 2016 16:29:28 +0000 Subject: Presentation: Understanding OrderAccess Message-ID: Hi David, sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that. > Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try: I think the comment in orderAccess.hpp is not bad: // Finally, we define a "fence" operation, as a bidirectional barrier. // It guarantees that any memory access preceding the fence is not // reordered w.r.t. any memory accesses subsequent to the fence in program // order. One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... > but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? 
"Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. Best regards, Martin [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering [5] http://g.oswego.edu/dl/jmm/cookbook.html -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 13:56 To: Doerr, Martin ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess Hi Martin, On 28/11/2016 8:43 PM, Doerr, Martin wrote: > Hi David, > > I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. > I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). > > The term "multiple-copy atomicity" is described as "... in a machine > which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". > > I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. > The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". Thanks for the reminder of that discussion. :) > A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. Problem there, I think, is that fence() is really not special in that regard. 
You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomicarchitectures to use just for this purpose. > Since you have asked about C++11, there's an example implementation for PPC [3]. > Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. > But I guess the Java memory model is beyond the scope of your presentation. Oh yes way out of scope! :) Cheers, David > Best regards, > Martin > > > [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf > [2] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030 > 212.html [3] > http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Montag, 28. November 2016 06:56 > To: Doerr, Martin ; hotspot-dev developers > > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin > > On 24/11/2016 2:20 AM, Doerr, Martin wrote: >> Hi David, >> >> thank you very much for the presentation. I think it provides a good guideline for hotspot development. > > Thanks. > >> >> Would you like to add something about multi-copy atomicity? > > Not really. :) > >> E.g. 
there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >> >> It is needed in the following scenario: >> - Different threads write 2 variables. >> - Readers of these 2 variables expect a globally consistent order of the write accesses. >> >> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". > > Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > >> (While taking a look at it, the condition "#if !(defined SPARC || >> defined IA32 || defined AMD64)" is not accurate and should better get >> improved. E.g. s390 is multi-copy atomic.) >> >> >> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. > > I still can't get my head around the C++11 terminology for this and > how you are expected to use it - what does it mean for an individual > operation to be "sequentially consistent" ? :( > > Cheers, > David > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Mittwoch, 23. November 2016 06:08 >> To: hotspot-dev developers >> Subject: Presentation: Understanding OrderAccess >> >> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. 
>> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf >> >> Cheers, >> David >> From mikael.gerdin at oracle.com Mon Nov 28 16:45:08 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 28 Nov 2016 17:45:08 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Hi Stefan, On 2016-11-28 14:52, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ Overall I think this change looks good. One thing I noticed is that the first parameter to VirtualSpaceList::get_new_chunk is actually ignored so you might want to just get rid of it, it's just confusing to see it. If you decide to do something about get_new_chunk I think it wouldn't hurt to have the names of the parameters changed as well, "grow_chunks_by_words" is actually "requested_chunk_size" and "medium_chunk_bunch" could be something like "suggested_commit_granularity". You might want to make the "const size_t" constants you moved out of the enum to either be "static" (which would be static in the C-sense) or add them in an anonymous namespace since otherwise they will pollute the global symbol namespace (more so than an enum which is strictly file scoped). The rest of the change looks good to me. /Mikael > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we always > try to allocate chunks that are any of the three fixed sizes > (specialized, small, medium) or a humongous chunk (if it is larger than > the medium chunk size). > > During metaspace initialization an initial metaspace chunk is allocated. 
> The size of some of the metaspace instances can be specified on the > command line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same time > doesn't match the specialized or small chunk size, then we end up > hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither > of these. > > Most of the other code is refactoring to unify the non-class metaspace > and the class metaspace code paths to get rid of some of the existing > code duplication, bring the chunk size calculation nearer to the > actual chunk allocation, and make it easier to write a unit test for the > new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK From thomas.stuefe at gmail.com Mon Nov 28 16:48:11 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 28 Nov 2016 17:48:11 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Stefan, This looks good. Some small remarks: - Metaspace::verify_initialized () : could be made static. For clarity I'd also either rename it to something like "verify_global_initialization" or to just roll the code out into its only caller, Metaspace::initialize(). - I never liked the ChunkSizes enum names, because they do not indicate they are sizes, and now that the encompassing enum name "ChunkSizes" is gone they are even less clear. Would it be possible to rename the former enum values to "...Size" for better code clarity, e.g. "MediumChunkSize" instead of "MediumChunk"? - Metaspace::get_space_manager(MetadataType mdtype) - asserting for mdType==Class||NonClassType instead of != MetadaTypeCount could be a bit clearer. - SpaceManager::adjust_initial_chunk_size () - could we rename this to a more generic name like "::next_larger_chunksize" or similar? I also wonder whether this could be combined somehow with SpaceManager::calc_chunk_size(), which wants to do something similar (calculate a fitting chunk size for a given smaller allocation size) Kind Regards, Thomas On Mon, Nov 28, 2016 at 2:52 PM, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. 
> > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we always > try to allocate chunks that are any of the three fixed sizes (specialized, > small, medium) or a humongous chunk (if it is larger then the medium chunk > size). > > During metaspace initialization an initial metaspace chunk is allocated. > The size of some of the metaspace instances can be specified on the command > line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same time > doesn't match the specialized or small chunk size, then we end up hitting > the assert mentioned above: > # > # Internal Error (/scratch/opt/jprt/T/P1/142848 > .erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, > tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a humongous > chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither of > these. > > Most of the other code is refactoring to unify the non-class metaspace and > the class metaspace code paths to get rid of some of the existing code > duplication, bring the chunk size calculation nearer to the the actual > chunk allocation, and make it easier to write a unit test for the new > adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK > From erik.joelsson at oracle.com Mon Nov 28 16:55:00 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 28 Nov 2016 17:55:00 +0100 Subject: RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583C5B10.8040204@linux.vnet.ibm.com> References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: <1b332dd2-aa9f-e24b-faaf-b95eacd11dac@oracle.com> Looks good. /Erik On 2016-11-28 17:28, Gustavo Romero wrote: > Hi all, > > I'm re-sending due to JDK title update to include s390x and aarch64 archs. > > Could the following webrev be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/v2/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/v2/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > Thank you. > > > Regards, > Gustavo > From thomas.stuefe at gmail.com Mon Nov 28 16:58:37 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 28 Nov 2016 17:58:37 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Message-ID: On Mon, Nov 28, 2016 at 5:45 PM, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-28 14:52, Stefan Karlsson wrote: > >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> > > Overall I think this change looks good. > One thing I noticed is that the first parameter to > VirtualSpaceList::get_new_chunk > is actually ignored so you might want to just get rid of it, it's just > confusing to see it. 
If you decide to do something about get_new_chunk I > think it wouldn't hurt to have the names of the parameters changed as well, > "grow_chunks_by_words" is actually "requested_chunk_size" and > "medium_chunk_bunch" could be something like "suggested_commit_granularity" > > +1 to that, this would make the code quite a bit clearer. I also had a hard time understanding the "make_current" flag in SpaceManager::add_chunk() until I (hope I) understood that it only matters for humongous chunks where we differentiate between (a) preallocating a still-unused humongous chunk for future allocations (initial chunk) or (b) allocating a humongous chunk for immediate consumption by a larger-than-medium-chunk memory request. I never saw (b) in real life, however, the only humongous chunks I ever see are the initial chunks. Does this ever happen? > You might want to make the "const size_t" constants you moved out of the > enum to either be "static" (which would be static in the C-sense) or add > them in an anonymous namespace since otherwise they will pollute the global > symbol namespace (more so than an enum which is strictly file scoped). > > The rest of the change looks good to me. > > /Mikael > > > https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we always >> try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger then >> the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is allocated. >> The size of some of the metaspace instances can be specified on the >> command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same time >> doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/ >> memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for the >> new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK >> > From stefan.karlsson at oracle.com Mon Nov 28 18:44:20 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 19:44:20 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <8142bf8c-0fda-5f15-4747-452ac09b578e@oracle.com> Hi Thomas, On 2016-11-28 17:48, Thomas Stüfe wrote: > Hi Stefan, > > This looks good. Thanks. > Some small remarks: > > - Metaspace::verify_initialized () : could be made static. For clarity > I'd also either rename it to something like > "verify_global_initialization" or to just roll the code out into its > only caller, Metaspace::initialize(). I'll rename the function and make it static. Personally, I want verbose verification and debugging code to get out of the way of the other code. That's why I moved it to a separate function. > > - I never liked the ChunkSizes enum names, because they do not > indicate they are sizes, and now that the encompassing enum name > "ChunkSizes" is gone they are even less clear. Would it be possible to > rename the former enum values to "...Size" for better code clarity, > e.g. "MediumChunkSize" instead of "MediumChunk"? I sort of agree, but changing it will affect large parts of metaspace.cpp, which makes it hard to see the other changes in this patch. I'd rather revert back to the enum, and maybe deal with that cleanup as a separate enhancement. > > - Metaspace::get_space_manager(MetadataType mdtype) - asserting for > mdType==Class||NonClassType instead of != MetadataTypeCount could be a > bit clearer. The assert is copied from the other getters in the file, so I'd like to keep it for consistency. 
Maybe we should get rid of MetadataTypeCount and that assert, and let the code that converts back and forth between MetadataType and integers do the assert check? That would need to be handled as a separate enhancement. > > - SpaceManager::adjust_initial_chunk_size () - could we rename this to > a more generic name like "::next_larger_chunksize" or similar? I chose the name because it is a helper for a specific use-case and call site. I also considered giving it a more generic name, but I couldn't immediately come up with a name that accurately described the function. The proposed next_larger_chunksize isn't describing the function correctly, since adjust_initial_chunk_size(SmallChunk) returns SmallChunk and not MediumChunk. If we can figure out a spot-on name, I'd be happy to change it. > I also wonder whether this could be combined somehow with > SpaceManager::calc_chunk_size(), which wants to do something similar > (calculate a fitting chunk size for a given smaller allocation size) I briefly thought about that as well, but then skipped that thought because of the heuristics involved in calc_chunk_size(). Thanks for reviewing, StefanK > > > Kind Regards, Thomas > > > On Mon, Nov 28, 2016 at 2:52 PM, Stefan Karlsson > > wrote: > > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > > https://bugs.openjdk.java.net/browse/JDK-8170395 > > > The fix for JDK-8169931 introduced a new assert to ensure that we > always try to allocate chunks that are any of the three fixed > sizes (specialized, small, medium) or a humongous chunk (if it is > larger than the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. 
For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same > time doesn't match the specialized or small chunk size, then we > end up hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, > small, medium, or humongous chunk size, even if the requested size > is neither of these. > > Most of the other code is refactoring to unify the non-class > metaspace and the class metaspace code paths to get rid of some of > the existing code duplication, bring the chunk size calculation > nearer to the the actual chunk allocation, and make it easier to > write a unit test for the new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will > be reintroduced as JDK-8170358 when this patch has been reviewed > and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS > tests), failing test > > Thanks, > StefanK > > From kim.barrett at oracle.com Mon Nov 28 18:49:30 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 28 Nov 2016 13:49:30 -0500 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Message-ID: <2C5018D2-204D-4C81-96C5-E003941DA731@oracle.com> > On Nov 28, 2016, at 11:45 AM, Mikael Gerdin wrote: > You might want to make the "const size_t" constants you moved out of the enum to either be "static" (which would be static in the C-sense) or add them in an anonymous namespace since otherwise they will pollute the global symbol namespace (more so than an enum which is strictly file scoped). C++ const declarations at namespace scope have internal linkage unless explicitly declared to have external linkage. From stefan.karlsson at oracle.com Mon Nov 28 19:23:32 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 20:23:32 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Message-ID: Hi Mikael, On 2016-11-28 17:45, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-28 14:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > > Overall I think this change looks good. > One thing I noticed is that the first parameter to > VirtualSpaceList::get_new_chunk > is actually ignored so you might want to just get rid of it, it's just > confusing to see it. 
If you decide to do something about get_new_chunk > I think it wouldn't hurt to have the names of the parameters changed > as well, "grow_chunks_by_words" is actually "requested_chunk_size" and > "medium_chunk_bunch" could be something like > "suggested_commit_granularity" I'll fix this and the surrounding code. > > You might want to make the "const size_t" constants you moved out of > the enum to either be "static" (which would be static in the C-sense) > or add them in an anonymous namespace since otherwise they will > pollute the global symbol namespace (more so than an enum which is > strictly file scoped). I'm going to revert back to using an enum, for now. > > The rest of the change looks good to me. Thanks, StefanK > > /Mikael > >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we always >> try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger then >> the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is allocated. >> The size of some of the metaspace instances can be specified on the >> command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same time >> doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for the >> new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and >> pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Mon Nov 28 19:29:29 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 20:29:29 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <98082fd3-37d2-e2d4-842f-26e5ea38dcbc@oracle.com> Hi Michail, On 2016-11-28 14:57, Michail Chernov wrote: > Hi Stefan, > > > Could you please add simple regression test for this case? The failure below was found with one of the test cases in: runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java Is this enough or do you want an explicit regression test that simply invokes: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version ? Thanks, StefanK > > > Thanks, > > Michail > > > On 28.11.2016 16:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we >> always try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger >> then the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is >> allocated. The size of some of the metaspace instances can be >> specified on the command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same >> time doesn't match the specialized or small chunk size, then we end >> up hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is >> neither of these. >> >> Most of the other code is refactoring to unify the non-class >> metaspace and the class metaspace code paths to get rid of some of >> the existing code duplication, bring the chunk size calculation >> nearer to the the actual chunk allocation, and make it easier to >> write a unit test for the new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and >> pushed. >> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK > From michail.chernov at oracle.com Mon Nov 28 19:47:45 2016 From: michail.chernov at oracle.com (Michail Chernov) Date: Mon, 28 Nov 2016 11:47:45 -0800 (PST) Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist Message-ID: Hi Stefan, Since the bug was caught in existing test, I don't see any reason to make additional test for this case. 
Thanks for explanation! Michail ----- Original Message ----- From: stefan.karlsson at oracle.com To: michail.chernov at oracle.com, hotspot-dev at openjdk.java.net Sent: Monday, 28 November 2016 22:29:33 GMT +03:00 Subject: Re: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist Hi Michail, On 2016-11-28 14:57, Michail Chernov wrote: Hi Stefan, Could you please add simple regression test for this case? The failure below was found with one of the test cases in: runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java Is this enough or do you want an explicit regression test that simply invokes: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version ? Thanks, StefanK Thanks, Michail On 28.11.2016 16:52, Stefan Karlsson wrote: Hi all, Please, review this patch to fix metaspace initialization. http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8170395 The fix for JDK-8169931 introduced a new assert to ensure that we always try to allocate chunks that are any of the three fixed sizes (specialized, small, medium) or a humongous chunk (if it is larger than the medium chunk size). During metaspace initialization an initial metaspace chunk is allocated. The size of some of the metaspace instances can be specified on the command line. 
For example: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version If this size is smaller than the medium chunk size and at the same time doesn't match the specialized or small chunk size, then we end up hitting the assert mentioned above: # # Internal Error (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, tid=31646 # assert(size > free_chunks(MediumIndex)->size()) failed: Not a humongous chunk # ======================================================================== The most important part of the fix is this line: + // Adjust to one of the fixed chunk sizes (unless humongous) + const size_t adjusted = adjust_initial_chunk_size(requested); which ensures that we always request either of a specialized, small, medium, or humongous chunk size, even if the requested size is neither of these. Most of the other code is refactoring to unify the non-class metaspace and the class metaspace code paths to get rid of some of the existing code duplication, bring the chunk size calculation nearer to the the actual chunk allocation, and make it easier to write a unit test for the new adjust_initial_chunk_size function. ======================================================================== The patch for JDK-8169931 was backed out with JDK-8170355 and will be reintroduced as JDK-8170358 when this patch has been reviewed and pushed. Testing: jprt, unit test, parts of PIT testing (including CDS tests), failing test Thanks, StefanK From stefan.karlsson at oracle.com Mon Nov 28 19:49:26 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 20:49:26 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: Message-ID: <688aff6d-5bca-fd30-282b-91a8ae31e7f9@oracle.com> Thanks, Michail. 
StefanK On 2016-11-28 20:47, Michail Chernov wrote: > Hi Stefan, > > Since the bug was caught in existing test, I don't see any reason to > make additional test for this case. Thanks for explanation! > > Michail > > ----- Original Message ----- > From: stefan.karlsson at oracle.com > To: michail.chernov at oracle.com, hotspot-dev at openjdk.java.net > Sent: Monday, 28 November 2016 22:29:33 GMT +03:00 > Subject: Re: RFR: 8170395: Metaspace initialization queries the wrong > chunk freelist > > Hi Michail, > > On 2016-11-28 14:57, Michail Chernov wrote: > > Hi Stefan, > > > Could you please add simple regression test for this case? > > The failure below was found with one of the test cases in: > runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java > > Is this enough or do you want an explicit regression test that simply > invokes: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version ? > > Thanks, > StefanK > > > > Thanks, > > Michail > > > On 28.11.2016 16:52, Stefan Karlsson wrote: > > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that > we always try to allocate chunks that are any of the three > fixed sizes (specialized, small, medium) or a humongous chunk > (if it is larger than the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. 
For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the > same time doesn't match the specialized or small chunk size, > then we end up hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not > a humongous chunk > # > > ======================================================================== > > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, > small, medium, or humongous chunk size, even if the requested > size is neither of these. > > Most of the other code is refactoring to unify the non-class > metaspace and the class metaspace code paths to get rid of > some of the existing code duplication, bring the chunk size > calculation nearer to the the actual chunk allocation, and > make it easier to write a unit test for the new > adjust_initial_chunk_size function. > > ======================================================================== > > > The patch for JDK-8169931 was backed out with JDK-8170355 and > will be reintroduced as JDK-8170358 when this patch has been > reviewed and pushed. 
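For readers skimming the thread, the adjustment described in the quoted fix can be sketched roughly as follows. This is an illustrative sketch with placeholder chunk-size constants, not the actual metaspace.cpp code:

```cpp
// Sketch of the idea behind adjust_initial_chunk_size(): round a
// requested initial chunk size up to the next fixed chunk size, and
// leave anything above the medium size alone as a humongous request.
// The constants below are placeholders, not HotSpot's real values.
#include <cstddef>
#include <cassert>

const size_t SpecializedChunk = 128;   // words (illustrative)
const size_t SmallChunk       = 512;   // words (illustrative)
const size_t MediumChunk      = 8192;  // words (illustrative)

size_t adjust_initial_chunk_size(size_t requested) {
  if (requested <= SpecializedChunk) return SpecializedChunk;
  if (requested <= SmallChunk)       return SmallChunk;
  if (requested <= MediumChunk)      return MediumChunk;
  return requested;  // humongous: larger than the medium chunk size
}
```

With such an adjustment, a command-line value that falls between two of the fixed sizes ends up requesting a size the chunk freelists actually know about, so the "Not a humongous chunk" assert can no longer fire during initialization.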
> > Testing: jprt, unit test, parts of PIT testing (including CDS > tests), failing test > > Thanks, > StefanK > > > From dean.long at oracle.com Mon Nov 28 20:01:44 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 28 Nov 2016 12:01:44 -0800 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: Hi David, On 11/25/16 2:38 AM, David Holmes wrote: > However, the stack size limitations remained in place in case the VM > was launched from the primordial thread of a user application via the > JNI invocation API. why is the JNI invocation API no longer a problem? Does it create a new thread like the launcher? dl From vladimir.kozlov at oracle.com Mon Nov 28 20:18:05 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 28 Nov 2016 12:18:05 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <583BAB5B.4020404@oracle.com> References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> <583BAB5B.4020404@oracle.com> Message-ID: Hi Ioi, Did you have updated webrev? And you did not comment on my suggestion: >> Any suggest for a better name? > > _adapter_cds_entry ? Thanks, Vladimir On 11/27/16 7:58 PM, Ioi Lam wrote: > I found a problem in my previous patch. Here's the fix (on top of the > previous patch): > > diff -r 3404f61c7081 src/share/vm/oops/method.cpp > --- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800 > +++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800 > @@ -1031,11 +1031,13 @@ > // leftover methods that weren't linked.
> if (is_shared()) { > address entry = Interpreter::entry_for_cds_method(h_method); > - assert(entry != NULL && entry == _i2i_entry && entry == > _from_interpreted_entry, > + assert(entry != NULL && entry == _i2i_entry, > "should be correctly set during dump time"); > if (adapter() != NULL) { > return; > } > + assert(entry == _from_interpreted_entry, > + "should be correctly set during dump time"); > } else if (_i2i_entry != NULL) { > return; > } > > The problem is: if the method has been compiled, then a shared method's > _from_interpreted_entry would be different than _i2i_entry (see > Method::set_code()). > > I am not sure if Method::link_method() would ever be called after > it's been compiled, but I think it's safer to make the asserts no > stronger than before this patch. > > Thanks > - Ioi > > > On 11/20/16 11:53 PM, Tobias Hartmann wrote: >> Hi Ioi, >> >> this looks good to me, the detailed description including the diagram >> is very nice and helps to understand the complex implementation! >> >> For the record: the test mentioned in [1] is part of my fix for >> JDK-8169711. >> >> Best regards, >> Tobias >> >> On 21.11.2016 07:58, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>> >>> >>> Thanks to Tobias for finding the bug. I have done the following >>> >>> + integrated Tobias' suggested fix >>> + fixed Method::restore_unshareable_info to call Method::link_method >>> + added comments and a diagram to illustrate how the CDS method entry >>> trampolines work. >>> >>> BTW, I am a little unhappy about the name >>> ConstMethod::_adapter_trampoline. >>> It's basically an extra level of indirection to get to the adapter. >>> However. >>> The word "trampoline" usually is used for an extra jump in >>> executable code, >>> so it may be a little confusing when we use it for a data pointer here. >>> >>> Any suggest for a better name?
>>> >>> >>> Testing: >>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>> now it produces the correct assertion. I won't check in this >>> test, though, >>> since it won't assert anymore after Tobias fixes 8169711. >>> >>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error >>> (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), >>> pid=16840, tid=16843 >>> # assert(entry != __null && entry == _i2i_entry && entry == >>> _from_interpreted_entry) failed: >>> # should be correctly set during dump time >>> >>> [2] Ran RBT in fastdebug build for >>> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>> All tests passed. >>> >>> Thanks >>> - Ioi >>> > From stefan.karlsson at oracle.com Mon Nov 28 21:06:08 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 22:06:08 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi again, This set of patches resolve some of the comments given by Mikael and Thomas: Entire patch: http://cr.openjdk.java.net/~stefank/8170395/webrev.02 Delta patches: http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter I consider pushing the last patch as a separate changeset. This is the entire patch without the unused_parameter patch: http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 Thanks, StefanK On 2016-11-28 14:52, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. 
> > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we > always try to allocate chunks that are any of the three fixed sizes > (specialized, small, medium) or a humongous chunk (if it is larger > then the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same > time doesn't match the specialized or small chunk size, then we end up > hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither > of these. > > Most of the other code is refactoring to unify the non-class metaspace > and the class metaspace code paths to get rid of some of the existing > code duplication, bring the chunk size calculation nearer to the the > actual chunk allocation, and make it easier to write a unit test for > the new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK From david.holmes at oracle.com Mon Nov 28 21:22:29 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 07:22:29 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: Hi Martin, I've added Erik explicitly to the cc as he and I have been discussing fences and "visibility", and of course he most recently revised the descriptions in orderAccess.hpp On 29/11/2016 2:29 AM, Doerr, Martin wrote: > Hi David, > > sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that. > >> Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? > > This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try: > > I think the comment in orderAccess.hpp is not bad: > // Finally, we define a "fence" operation, as a bidirectional barrier. > // It guarantees that any memory access preceding the fence is not // reordered w.r.t. any memory accesses subsequent to the fence in program // order. > > One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. > If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. > Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. But again that attribution of global properties is not something I think is necessarily implied or intended by OrderAccess. Or maybe it is, but as it is only an issue on non-multicopy-atomic systems, it has never been called out explicitly. ?? 
And those global properties must also be a part of the other barriers (as the fence is just the combination of them all) - but I don't know how you would describe the effects of the other barriers (like loadload) in "global" terms. David ----- > >> Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... >> but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > > "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] > > So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. > > > Best regards, > Martin > > [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering > [5] http://g.oswego.edu/dl/jmm/cookbook.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Monday, 28 November 2016 13:56 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin, > > On 28/11/2016 8:43 PM, Doerr, Martin wrote: >> Hi David, >> >> I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. >> I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). >> >> The term "multiple-copy atomicity" is described as "...
in a machine >> which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". >> >> I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. >> The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". > > Thanks for the reminder of that discussion. :) > >> A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. > > Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose. > >> Since you have asked about C++11, there's an example implementation for PPC [3]. >> Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. > > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > >> Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. >> But I guess the Java memory model is beyond the scope of your presentation. > > Oh yes way out of scope!
:) > > Cheers, > David > >> Best regards, >> Martin >> >> >> [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf >> [2] >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030 >> 212.html [3] >> http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Montag, 28. November 2016 06:56 >> To: Doerr, Martin ; hotspot-dev developers >> >> Subject: Re: Presentation: Understanding OrderAccess >> >> Hi Martin >> >> On 24/11/2016 2:20 AM, Doerr, Martin wrote: >>> Hi David, >>> >>> thank you very much for the presentation. I think it provides a good guideline for hotspot development. >> >> Thanks. >> >>> >>> Would you like to add something about multi-copy atomicity? >> >> Not really. :) >> >>> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >>> >>> It is needed in the following scenario: >>> - Different threads write 2 variables. >>> - Readers of these 2 variables expect a globally consistent order of the write accesses. >>> >>> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". >> >> Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... >> >>> (While taking a look at it, the condition "#if !(defined SPARC || >>> defined IA32 || defined AMD64)" is not accurate and should better get >>> improved. E.g. s390 is multi-copy atomic.) >>> >>> >>> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. 
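The IRIW ("Independent Reads of Independent Writes") scenario that this thread keeps coming back to can be sketched with C++11 atomics. This is an illustrative sketch, not HotSpot code: two writers store to independent locations, two readers read them in opposite orders, and under `memory_order_seq_cst` all four threads must agree on a single total order of the stores.

```cpp
// IRIW sketch with seq_cst atomics. The outcome r1==1,r2==0 together
// with r3==1,r4==0 would mean the two readers saw the two writes in
// opposite orders; memory_order_seq_cst forbids that outcome.
#include <atomic>
#include <thread>
#include <cassert>

std::atomic<int> x(0), y(0);

// Runs one IRIW round; returns true if the seq_cst-forbidden outcome
// was (as guaranteed) not observed.
bool iriw_once() {
  x.store(0, std::memory_order_seq_cst);
  y.store(0, std::memory_order_seq_cst);
  int r1 = 0, r2 = 0, r3 = 0, r4 = 0;
  std::thread w1([] { x.store(1, std::memory_order_seq_cst); });
  std::thread w2([] { y.store(1, std::memory_order_seq_cst); });
  std::thread rd1([&] {
    r1 = x.load(std::memory_order_seq_cst);
    r2 = y.load(std::memory_order_seq_cst);
  });
  std::thread rd2([&] {
    r3 = y.load(std::memory_order_seq_cst);
    r4 = x.load(std::memory_order_seq_cst);
  });
  w1.join(); w2.join(); rd1.join(); rd2.join();
  // Readers disagreeing on the order of the two independent writes
  // is the multi-copy-atomicity violation discussed above.
  return !(r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0);
}
```

On a non-multi-copy-atomic machine such as PPC, relaxed loads here could legally observe the forbidden outcome, which is why the seq_cst mapping (or, in HotSpot terms, an OrderAccess::fence() between the reads) is needed.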
>> >> I still can't get my head around the C++11 terminology for this and >> how you are expected to use it - what does it mean for an individual >> operation to be "sequentially consistent" ? :( >> >> Cheers, >> David >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >>> Behalf Of David Holmes >>> Sent: Mittwoch, 23. November 2016 06:08 >>> To: hotspot-dev developers >>> Subject: Presentation: Understanding OrderAccess >>> >>> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. >>> >>> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-Order >>> A >>> ccess-v1.1.pdf >>> >>> Cheers, >>> David >>> From david.holmes at oracle.com Mon Nov 28 21:25:50 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 07:25:50 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> Hi Dean, On 29/11/2016 6:01 AM, dean.long at oracle.com wrote: > Hi David, > > > On 11/25/16 2:38 AM, David Holmes wrote: >> However, the stack size limitations remained in place in case the VM >> was launched from the primordial thread of a user application via the >> JNI invocation API. > > why is the JNI invocation API no longer a problem? Does it create a new > thread like the launcher? No, the JNI invocation API is unchanged. What has changed now are the conditions that required the 2MB limit due to the behaviour of the thread library (this goes back to LinuxThreads and the IA64 port). 
Thanks, David > dl From ioi.lam at oracle.com Mon Nov 28 23:03:50 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 28 Nov 2016 15:03:50 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> <583BAB5B.4020404@oracle.com> Message-ID: <583CB7D6.2020207@oracle.com> On 11/28/16 12:18 PM, Vladimir Kozlov wrote: > Hi Ioi, > > Did you have updated webrev? > I didn't update the webrev. The only change from the previous webrev is the diff below > And you did not comment on my suggestion: > > >> Any suggest for a better name? > > > > _adapter_cds_entry ? > Thanks for the suggestion. I think "entry" may be confusing with other use of the word, such as _i2i_entry -- in this case this pointer doesn't point to the entry point of executable code. I think I'll just leave the names as is for now, and maybe file an RFE to rename it in JDK10. Thanks - Ioi > Thanks, > Vladimir > > On 11/27/16 7:58 PM, Ioi Lam wrote: >> I found a problem in my previous patch. Here's the fix (on top of he >> previous patch): >> >> diff -r 3404f61c7081 src/share/vm/oops/method.cpp >> --- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800 >> +++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800 >> @@ -1031,11 +1031,13 @@ >> // leftover methods that weren't linked. 
>> if (is_shared()) { >> address entry = Interpreter::entry_for_cds_method(h_method); >> - assert(entry != NULL && entry == _i2i_entry && entry == >> _from_interpreted_entry, >> + assert(entry != NULL && entry == _i2i_entry, >> "should be correctly set during dump time"); >> if (adapter() != NULL) { >> return; >> } >> + assert(entry == _from_interpreted_entry, >> + "should be correctly set during dump time"); >> } else if (_i2i_entry != NULL) { >> return; >> } >> >> The problem is: if the method has been compiled, then a shared method's >> _from_interpreted_entry would be different than _i2i_entry (see >> Method::set_code()). >> >> I am not sure if Method::link_method() would ever be called after >> it's been compiled, but I think it's safer to make the asserts no >> stronger than before this patch. >> >> Thanks >> - Ioi >> >> >> On 11/20/16 11:53 PM, Tobias Hartmann wrote: >>> Hi Ioi, >>> >>> this looks good to me, the detailed description including the diagram >>> is very nice and helps to understand the complex implementation! >>> >>> For the record: the test mentioned in [1] is part of my fix for >>> JDK-8169711. >>> >>> Best regards, >>> Tobias >>> >>> On 21.11.2016 07:58, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>>> >>>> >>>> >>>> Thanks to Tobias for finding the bug. I have done the following >>>> >>>> + integrated Tobias' suggested fix >>>> + fixed Method::restore_unshareable_info to call Method::link_method >>>> + added comments and a diagram to illustrate how the CDS method entry >>>> trampolines work. >>>> >>>> BTW, I am a little unhappy about the name >>>> ConstMethod::_adapter_trampoline. >>>> It's basically an extra level of indirection to get to the adapter. >>>> However. >>>> The word "trampoline" usually is used for and extra jump in >>>> executable code, >>>> so it may be a little confusing when we use it for a data pointer >>>> here. 
>>>> >>>> Any suggest for a better name? >>>> >>>> >>>> Testing: >>>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>>> now it produces the correct assertion. I won't check in this >>>> test, though, >>>> since it won't assert anymore after Tobias fixes 8169711. >>>> >>>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>>> # >>>> # A fatal error has been detected by the Java Runtime Environment: >>>> # >>>> # Internal Error >>>> (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), >>>> pid=16840, tid=16843 >>>> # assert(entry != __null && entry == _i2i_entry && entry == >>>> _from_interpreted_entry) failed: >>>> # should be correctly set during dump time >>>> >>>> [2] Ran RBT in fastdebug build for >>>> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>>> >>>> All tests passed. >>>> >>>> Thanks >>>> - Ioi >>>> >> From davidcholmes at aapt.net.au Tue Nov 29 00:16:58 2016 From: davidcholmes at aapt.net.au (David Holmes) Date: Tue, 29 Nov 2016 10:16:58 +1000 Subject: TEST - please ignore Message-ID: <01dd01d249d5$e6a52510$b3ef6f30$@aapt.net.au> From dean.long at oracle.com Tue Nov 29 06:59:27 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 28 Nov 2016 22:59:27 -0800 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> References: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> Message-ID: <4e08fd2c-476c-31fe-2c19-f1b2962199ad@oracle.com> On 11/28/16 1:25 PM, David Holmes wrote: > Hi Dean, > > On 29/11/2016 6:01 AM, dean.long at oracle.com wrote: >> Hi David, >> >> >> On 11/25/16 2:38 AM, David Holmes wrote: >>> However, the stack size limitations remained in place in case the VM >>> was launched from the primordial thread of a user application via the >>> JNI invocation API. >> >> why is the JNI invocation API no longer a problem? Does it create a new >> thread like the launcher? 
> > No, the JNI invocation API is unchanged. What has changed now are the > conditions that required the 2MB limit due to the behaviour of the > thread library (this goes back to LinuxThreads and the IA64 port). > Let me see if I have it straight. The stack size limit was needed for the primordial thread on LinuxThreads (I remember those days!). We can still start the JVM on the primordial thread if we use a custom launcher or the JNI invocation API, but we no longer need the 2MB limit because we no longer support LinuxThreads. Based on the comment in os::Linux::capture_initial_stack, I'd also like to know if pthread_getattr_np() is now reliable on the primordial thread. If so, we could remove a lot of ugly code. dl > Thanks, > David > >> dl From thomas.stuefe at gmail.com Tue Nov 29 07:13:01 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 29 Nov 2016 08:13:01 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Stefan, looks fine. There is a trailing ; after the "smallest_chunk_size" method (no need to do a webrev for that). Thanks for taking my suggestions. Best regards, Thomas On Mon, Nov 28, 2016 at 10:06 PM, Stefan Karlsson < stefan.karlsson at oracle.com> wrote: > Hi again, > > This set of patches resolve some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.ve > rify_global_initialization > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > I consider pushing the last patch as a separate changeset. 
> > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: > >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we always >> try to allocate chunks that are any of the three fixed sizes (specialized, >> small, medium) or a humongous chunk (if it is larger then the medium chunk >> size). >> >> During metaspace initialization an initial metaspace chunk is allocated. >> The size of some of the metaspace instances can be specified on the command >> line. For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same time >> doesn't match the specialized or small chunk size, then we end up hitting >> the assert mentioned above: >> # >> # Internal Error (/scratch/opt/jprt/T/P1/142848 >> .erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, >> tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither of >> these. 
>> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing code >> duplication, bring the chunk size calculation nearer to the the actual >> chunk allocation, and make it easier to write a unit test for the new >> adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. >> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK >> > > > From david.holmes at oracle.com Tue Nov 29 09:20:26 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 19:20:26 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <4e08fd2c-476c-31fe-2c19-f1b2962199ad@oracle.com> References: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> <4e08fd2c-476c-31fe-2c19-f1b2962199ad@oracle.com> Message-ID: On 29/11/2016 4:59 PM, dean.long at oracle.com wrote: > On 11/28/16 1:25 PM, David Holmes wrote: > >> Hi Dean, >> >> On 29/11/2016 6:01 AM, dean.long at oracle.com wrote: >>> Hi David, >>> >>> >>> On 11/25/16 2:38 AM, David Holmes wrote: >>>> However, the stack size limitations remained in place in case the VM >>>> was launched from the primordial thread of a user application via the >>>> JNI invocation API. >>> >>> why is the JNI invocation API no longer a problem? Does it create a new >>> thread like the launcher? >> >> No, the JNI invocation API is unchanged. What has changed now are the >> conditions that required the 2MB limit due to the behaviour of the >> thread library (this goes back to LinuxThreads and the IA64 port). >> > > Let me see if I have it straight. The stack size limit was needed for > the primordial thread on LinuxThreads (I remember those days!). 
We can > still start the JVM on the primordial thread if we use a custom launcher > or the JNI invocation API, but we no longer need the 2MB limit because > we no longer support LinuxThreads. Yes. There were some other reasons why the 2MB limit was needed but those no longer exist either (ie ia64 port, alt-stack usage) > Based on the comment in os::Linux::capture_initial_stack, I'd also like > to know if pthread_getattr_np() is now reliable on the primordial > thread. If so, we could remove a lot of ugly code. I expect that it would be, but that would require a lot more extensive testing of different Linuxes. This can be done as part of planned future cleanup work in 10. Thanks, David > dl > > >> Thanks, >> David >> >>> dl > From stefan.karlsson at oracle.com Tue Nov 29 09:32:05 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Nov 2016 10:32:05 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Thomas, On 2016-11-29 08:13, Thomas Stüfe wrote: > Hi Stefan, > > looks fine. There is a trailing ; after the "smallest_chunk_size" method > (no need to do a webrev for that). Will fix. > > Thanks for taking my suggestions. Thanks for reviewing. StefanK > > Best regards, Thomas > > > > On Mon, Nov 28, 2016 at 10:06 PM, Stefan Karlsson > > wrote: > > Hi again, > > This set of patches resolve some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization > > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > > I consider pushing the last patch as a separate changeset.
> > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: > > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > > https://bugs.openjdk.java.net/browse/JDK-8170395 > > > The fix for JDK-8169931 introduced a new assert to ensure that > we always try to allocate chunks that are any of the three fixed > sizes (specialized, small, medium) or a humongous chunk (if it > is larger then the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the > same time doesn't match the specialized or small chunk size, > then we end up hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, > small, medium, or humongous chunk size, even if the requested > size is neither of these. 
> > Most of the other code is refactoring to unify the non-class > metaspace and the class metaspace code paths to get rid of some > of the existing code duplication, bring the chunk size > calculation nearer to the actual chunk allocation, and make > it easier to write a unit test for the new > adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and > will be reintroduced as JDK-8170358 when this patch has been > reviewed and pushed. > > Testing: jprt, unit test, parts of PIT testing (including CDS > tests), failing test > > Thanks, > StefanK > > > > From volker.simonis at gmail.com Tue Nov 29 09:41:10 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 29 Nov 2016 10:41:10 +0100 Subject: RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583C5B10.8040204@linux.vnet.ibm.com> References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: Thanks Gustavo, the change looks good. So now we're just waiting for another review from somebody of the aarch64 folks. Once we have that and the fc-request is approved I'll push the changes. Regards, Volker On Mon, Nov 28, 2016 at 5:28 PM, Gustavo Romero wrote: > Hi all, > > I'm re-sending due to JDK title update to include s390x and aarch64 archs. > > Could the following webrev be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/v2/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/v2/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > Thank you. > > > Regards, > Gustavo > From thomas.stuefe at gmail.com Tue Nov 29 10:39:51 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 29 Nov 2016 11:39:51 +0100 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: Hi David, thanks for the good explanation.
Change looks good, I really like the comment in capture_initial_stack(). Question, with -Xss given and being smaller than current thread stack size, guard pages may appear in the middle of the invoking thread stack? I always thought this is a bit dangerous. If your model is to have the VM created from the main thread, which then goes off to do different things, and have other threads then attach and run java code, main thread later may crash in unrelated native code just because it reached the stack depth of the java threads? Or am I misunderstanding something? Thanks, Thomas On Fri, Nov 25, 2016 at 11:38 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 > > The bug is not public unfortunately for non-technical reasons - but see my > eval below. > > Background: if you load the JVM from the primordial thread of a process > (not done by the java launcher since JDK 6), there is an artificial stack > limit imposed on the initial thread (by sticking the guard page at the > limit position of the actual stack) of the minimum of the -Xss setting and > 2M. So if you set -Xss to > 2M it is ignored for the main thread even if > the true stack is, say, 8M. This limitation dates back 10-15 years and is > no longer relevant today and should be removed (see below). I've also added > additional explanatory notes. > > webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > Testing was manually done by modifying the launcher to not run the VM in a > new thread, and checking the resulting stack size used. > > This change will only affect hosted JVMs launched with a -Xss value > 2M. > > Thanks, > David > ----- > > Bug eval: > > JDK-4441425 limits the stack to 8M as a safeguard against an unlimited > value from getrlimit in 1.3.1, but further constrained that to 2M in 1.4.0 > due to JDK-4466587.
> > By 1.4.2 we have the basic form of the current problematic code: > > #ifndef IA64 > if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; > #else > // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small > if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; > #endif > > _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); > > if (max_size && _initial_thread_stack_size > max_size) { > _initial_thread_stack_size = max_size; > } > > This was added by JDK-4678676 to allow the stack of the main thread to be > _reduced_ below the default 2M/4M if the -Xss value was smaller than > that.** There was no intent to allow the stack size to follow -Xss > arbitrarily due to the operational constraints imposed by the OS/glibc at > the time when dealing with the primordial process thread. > > ** It could not actually change the actual stack size of course, but set > the guard pages to limit use to the expected stack size. > > In JDK 6, under JDK-6316197, the launcher was changed to create the JVM in > a new thread, so that it was not limited by the idiosyncrasies of the OS or > thread library primordial thread handling. However, the stack size > limitations remained in place in case the VM was launched from the > primordial thread of a user application via the JNI invocation API. > > I believe it should be safe to remove the 2M limitation now. > From mikael.gerdin at oracle.com Tue Nov 29 10:53:04 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 29 Nov 2016 11:53:04 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <7f8d3b14-bc33-28ea-d630-d189c68d5d00@oracle.com> Hi Stefan, On 2016-11-28 22:06, Stefan Karlsson wrote: > Hi again, > > This set of patches resolves some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 Looks good!
/Mikael > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization > > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > I consider pushing the last patch as a separate changeset. > > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we >> always try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger >> then the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is >> allocated. The size of some of the metaspace instances can be >> specified on the command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same >> time doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for >> the new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK > > From per.liden at oracle.com Tue Nov 29 11:11:52 2016 From: per.liden at oracle.com (Per Liden) Date: Tue, 29 Nov 2016 12:11:52 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <6eae82f0-b73e-0596-dcca-9ff7a1efcd23@oracle.com> Hi Stefan, On 2016-11-28 22:06, Stefan Karlsson wrote: > Hi again, > > This set of patches resolves some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 Looks good, just a few comments. metaspace.cpp ------------- 751 static size_t specialized_chunk_size(bool is_class) { return (size_t) is_class ? ClassSpecializedChunk : SpecializedChunk; } 752 static size_t small_chunk_size(bool is_class) { return (size_t) is_class ? ClassSmallChunk : SmallChunk; } 753 static size_t medium_chunk_size(bool is_class) { return (size_t) is_class ? ClassMediumChunk : MediumChunk; } The size_t casts above bind to is_class and not to the result of ?: so you probably want to do: return is_class ? (size_t)A : (size_t)B; ... or perhaps just skip the casts. 760 size_t specialized_chunk_size() { return specialized_chunk_size(is_class()); } 761 size_t small_chunk_size() { return small_chunk_size(is_class()); } 762 size_t medium_chunk_size() { return medium_chunk_size(is_class()); } 763 764 size_t smallest_chunk_size() { return smallest_chunk_size(is_class()); } 765 766 size_t medium_chunk_bunch() { return medium_chunk_size() * MediumChunkMultiple; } More of a style thing, but it looks like these functions could also be const, no? I don't need to see a new webrev.
cheers, Per > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization > > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > I consider pushing the last patch as a separate changeset. > > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we >> always try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger >> then the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is >> allocated. The size of some of the metaspace instances can be >> specified on the command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same >> time doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for >> the new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK > > From stefan.karlsson at oracle.com Tue Nov 29 11:47:26 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Nov 2016 12:47:26 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <6eae82f0-b73e-0596-dcca-9ff7a1efcd23@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <6eae82f0-b73e-0596-dcca-9ff7a1efcd23@oracle.com> Message-ID: Hi Per, On 2016-11-29 12:11, Per Liden wrote: > Hi Stefan, > > On 2016-11-28 22:06, Stefan Karlsson wrote: >> Hi again, >> >> This set of patches resolve some of the comments given by Mikael and >> Thomas: >> >> Entire patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > Looks good, just a few comments. > > metaspace.cpp > ------------- > > 751 static size_t specialized_chunk_size(bool is_class) { return > (size_t) is_class ? ClassSpecializedChunk : SpecializedChunk; } > 752 static size_t small_chunk_size(bool is_class) { return > (size_t) is_class ? ClassSmallChunk : SmallChunk; } > 753 static size_t medium_chunk_size(bool is_class) { return > (size_t) is_class ? ClassMediumChunk : MediumChunk; } > > The size_t casts above binds to is_class and not the result from ?: so > you probably you want to do: > > return is_class ? (size_t)A : (size_t)B; > > ... or perhaps just skip the casts. > Sure. This cast existed before my changes, but I can remove it since it's obviously wrong. 
> > 760 size_t specialized_chunk_size() { return > specialized_chunk_size(is_class()); } > 761 size_t small_chunk_size() { return > small_chunk_size(is_class()); } > 762 size_t medium_chunk_size() { return > medium_chunk_size(is_class()); } > 763 > 764 size_t smallest_chunk_size() { return > smallest_chunk_size(is_class()); } > 765 > 766 size_t medium_chunk_bunch() { return medium_chunk_size() * > MediumChunkMultiple; } > > More of a style thing, but it looks like these functions could also be > const, no? Yes, and many other functions in that file. I'll update these since I changed them. > > I don't need to see a new webrev. Thanks for reviewing, StefanK > > cheers, > Per > >> >> Delta patches: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization >> >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter >> >> >> I consider pushing the last patch as a separate changeset. >> >> This is the entire patch without the unused_parameter patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 >> >> Thanks, >> StefanK >> >> On 2016-11-28 14:52, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix metaspace initialization. >>> >>> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8170395 >>> >>> The fix for JDK-8169931 introduced a new assert to ensure that we >>> always try to allocate chunks that are any of the three fixed sizes >>> (specialized, small, medium) or a humongous chunk (if it is larger >>> then the medium chunk size). >>> >>> During metaspace initialization an initial metaspace chunk is >>> allocated. The size of some of the metaspace instances can be >>> specified on the command line. 
For example: >>> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >>> >>> If this size is smaller than the medium chunk size and at the same >>> time doesn't match the specialized or small chunk size, then we end up >>> hitting the assert mentioned above: >>> # >>> # Internal Error >>> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >>> >>> pid=31643, tid=31646 >>> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >>> humongous chunk >>> # >>> >>> ======================================================================== >>> >>> The most important part of the fix is this line: >>> + // Adjust to one of the fixed chunk sizes (unless humongous) >>> + const size_t adjusted = adjust_initial_chunk_size(requested); >>> >>> which ensures that we always request either of a specialized, small, >>> medium, or humongous chunk size, even if the requested size is neither >>> of these. >>> >>> Most of the other code is refactoring to unify the non-class metaspace >>> and the class metaspace code paths to get rid of some of the existing >>> code duplication, bring the chunk size calculation nearer to the the >>> actual chunk allocation, and make it easier to write a unit test for >>> the new adjust_initial_chunk_size function. >>> >>> ======================================================================== >>> >>> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >>> reintroduced as JDK-8170358 when this patch has been reviewed and >>> pushed. 
>>> >>> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >>> failing test >>> >>> Thanks, >>> StefanK >> >> From stefan.karlsson at oracle.com Tue Nov 29 11:47:51 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Nov 2016 12:47:51 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <7f8d3b14-bc33-28ea-d630-d189c68d5d00@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <7f8d3b14-bc33-28ea-d630-d189c68d5d00@oracle.com> Message-ID: <4235288e-c437-56a7-6de4-38f577c8fa7e@oracle.com> Thanks, Mikael! StefanK On 2016-11-29 11:53, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-28 22:06, Stefan Karlsson wrote: >> Hi again, >> >> This set of patches resolve some of the comments given by Mikael and >> Thomas: >> >> Entire patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > Looks good! > /Mikael > >> >> Delta patches: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization >> >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter >> >> >> I consider pushing the last patch as a separate changeset. >> >> This is the entire patch without the unused_parameter patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 >> >> Thanks, >> StefanK >> >> On 2016-11-28 14:52, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix metaspace initialization. >>> >>> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8170395 >>> >>> The fix for JDK-8169931 introduced a new assert to ensure that we >>> always try to allocate chunks that are any of the three fixed sizes >>> (specialized, small, medium) or a humongous chunk (if it is larger >>> then the medium chunk size). >>> >>> During metaspace initialization an initial metaspace chunk is >>> allocated. 
The size of some of the metaspace instances can be >>> specified on the command line. For example: >>> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >>> >>> If this size is smaller than the medium chunk size and at the same >>> time doesn't match the specialized or small chunk size, then we end up >>> hitting the assert mentioned above: >>> # >>> # Internal Error >>> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >>> >>> pid=31643, tid=31646 >>> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >>> humongous chunk >>> # >>> >>> ======================================================================== >>> >>> The most important part of the fix is this line: >>> + // Adjust to one of the fixed chunk sizes (unless humongous) >>> + const size_t adjusted = adjust_initial_chunk_size(requested); >>> >>> which ensures that we always request either of a specialized, small, >>> medium, or humongous chunk size, even if the requested size is neither >>> of these. >>> >>> Most of the other code is refactoring to unify the non-class metaspace >>> and the class metaspace code paths to get rid of some of the existing >>> code duplication, bring the chunk size calculation nearer to the the >>> actual chunk allocation, and make it easier to write a unit test for >>> the new adjust_initial_chunk_size function. >>> >>> ======================================================================== >>> >>> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >>> reintroduced as JDK-8170358 when this patch has been reviewed and >>> pushed. 
>>> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >>> failing test >>> >>> Thanks, >>> StefanK >> >> From david.holmes at oracle.com Tue Nov 29 11:59:44 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 21:59:44 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> Hi Thomas, On 29/11/2016 8:39 PM, Thomas Stüfe wrote: > Hi David, > > thanks for the good explanation. Change looks good, I really like the > comment in capture_initial_stack(). > > Question, with -Xss given and being smaller than current thread stack > size, guard pages may appear in the middle of the invoking thread stack? > I always thought this is a bit dangerous. If your model is to have the > VM created from the main thread, which then goes off to do different > things, and have other threads then attach and run java code, main > thread later may crash in unrelated native code just because it reached > the stack depth of the java threads? Or am I misunderstanding something? There is no change to the general behaviour other than allowing a primordial process thread that launches the VM, to now not have an effective stack limited at 2MB. The current logic will insert guard pages wherever -Xss states (as long as less than 2MB else 2MB), while with the fix the guard pages will be inserted above 2MB - as dictated by -Xss. David ----- > Thanks, Thomas > > > On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > wrote: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 > > > The bug is not public unfortunately for non-technical reasons - but > see my eval below.
> > Background: if you load the JVM from the primordial thread of a > process (not done by the java launcher since JDK 6), there is an > artificial stack limit imposed on the initial thread (by sticking > the guard page at the limit position of the actual stack) of the > minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is > ignored for the main thread even if the true stack is, say, 8M. This > limitation dates back 10-15 years and is no longer relevant today > and should be removed (see below). I've also added additional > explanatory notes. > > webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > > Testing was manually done by modifying the launcher to not run the > VM in a new thread, and checking the resulting stack size used. > > This change will only affect hosted JVMs launched with a -Xss value > > 2M. > > Thanks, > David > ----- > > Bug eval: > > JDK-4441425 limits the stack to 8M as a safeguard against an > unlimited value from getrlimit in 1.3.1, but further constrained > that to 2M in 1.4.0 due to JDK-4466587. > > By 1.4.2 we have the basic form of the current problematic code: > > #ifndef IA64 > if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; > #else > // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small > if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; > #endif > > _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); > > if (max_size && _initial_thread_stack_size > max_size) { > _initial_thread_stack_size = max_size; > } > > This was added by JDK-4678676 to allow the stack of the main thread > to be _reduced_ below the default 2M/4M if the -Xss value was > smaller than that.** There was no intent to allow the stack size to > follow -Xss arbitrarily due to the operational constraints imposed > by the OS/glibc at the time when dealing with the primordial process > thread. 
> > ** It could not actually change the actual stack size of course, but > set the guard pages to limit use to the expected stack size. > > In JDK 6, under JDK-6316197, the launcher was changed to create the > JVM in a new thread, so that it was not limited by the > idiosyncrasies of the OS or thread library primordial thread > handling. However, the stack size > limitations remained in place in > case the VM was launched from the > primordial thread of a user > application via the JNI invocation API. > > I believe it should be safe to remove the 2M limitation now. > > From david.holmes at oracle.com Tue Nov 29 12:25:45 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 22:25:45 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> Message-ID: <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> I just realized I overlooked the case where ThreadStackSize=0 and the stack is unlimited. In that case it isn't clear where the guard pages will get inserted - I do know that I don't get a stackoverflow error. This needs further investigation. David On 29/11/2016 9:59 PM, David Holmes wrote: > Hi Thomas, > > On 29/11/2016 8:39 PM, Thomas Stüfe wrote: >> Hi David, >> >> thanks for the good explanation. Change looks good, I really like the >> comment in capture_initial_stack(). >> >> Question, with -Xss given and being smaller than current thread stack >> size, guard pages may appear in the middle of the invoking thread stack? >> I always thought this is a bit dangerous. If your model is to have the >> VM created from the main thread, which then goes off to do different >> things, and have other threads then attach and run java code, main >> thread later may crash in unrelated native code just because it reached >> the stack depth of the java threads? Or am I misunderstanding something?
> > There is no change to the general behaviour other than allowing a > primordial process thread that launches the VM, to now not have an > effective stack limited at 2MB. The current logic will insert guard > pages where ever -Xss states (as long as less than 2MB else 2MB), while > with the fix the guard pages will be inserted above 2MB - as dictated by > -Xss. > > David > ----- > >> Thanks, Thomas >> >> >> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > wrote: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >> >> >> The bug is not public unfortunately for non-technical reasons - but >> see my eval below. >> >> Background: if you load the JVM from the primordial thread of a >> process (not done by the java launcher since JDK 6), there is an >> artificial stack limit imposed on the initial thread (by sticking >> the guard page at the limit position of the actual stack) of the >> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >> ignored for the main thread even if the true stack is, say, 8M. This >> limitation dates back 10-15 years and is no longer relevant today >> and should be removed (see below). I've also added additional >> explanatory notes. >> >> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >> >> >> Testing was manually done by modifying the launcher to not run the >> VM in a new thread, and checking the resulting stack size used. >> >> This change will only affect hosted JVMs launched with a -Xss value >> > 2M. >> >> Thanks, >> David >> ----- >> >> Bug eval: >> >> JDK-4441425 limits the stack to 8M as a safeguard against an >> unlimited value from getrlimit in 1.3.1, but further constrained >> that to 2M in 1.4.0 due to JDK-4466587. 
>> >> By 1.4.2 we have the basic form of the current problematic code: >> >> #ifndef IA64 >> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >> #else >> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >> small >> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >> #endif >> >> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >> >> if (max_size && _initial_thread_stack_size > max_size) { >> _initial_thread_stack_size = max_size; >> } >> >> This was added by JDK-4678676 to allow the stack of the main thread >> to be _reduced_ below the default 2M/4M if the -Xss value was >> smaller than that.** There was no intent to allow the stack size to >> follow -Xss arbitrarily due to the operational constraints imposed >> by the OS/glibc at the time when dealing with the primordial process >> thread. >> >> ** It could not actually change the actual stack size of course, but >> set the guard pages to limit use to the expected stack size. >> >> In JDK 6, under JDK-6316197, the launcher was changed to create the >> JVM in a new thread, so that it was not limited by the >> idiosyncracies of the OS or thread library primordial thread >> handling. However, the stack size limitations remained in place in >> case the VM was launched from the primordial thread of a user >> application via the JNI invocation API. >> >> I believe it should be safe to remove the 2M limitation now. >> >> From martin.doerr at sap.com Tue Nov 29 13:08:16 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 29 Nov 2016 13:08:16 +0000 Subject: Presentation: Understanding OrderAccess Message-ID: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> Hi David and Erik, > But again that attribution of global properties is not something I think is necessarily implied or intended by OrderAccess. > Or maybe it is, but as it is only an issue on non-multicopy-atomic systems, it has never been called out explicitly. ?? 
And > those global properties must also be a part of the other barriers (as the fence is just the combination of them all) - but I > don't know how you would describe the affects of the other barriers (like loadload) in "global" terms. I think the global properties are implicitly assumed on multicopy-atomic systems and most people don't think about them. But they are important as soon as more than 2 threads are involved, especially on PPC64 and Aarch64. That's why I'd appreciate if they could be added to hotspot documentations or presentations. Also storeStore barriers are expected to be transitive or "cumulative" as the property is called in PPC64 documentation. If one thread releases something which is based on something else which was written by another thread, a third thread which acquires it, is expected to see that in a consistent way. Do you agree? loadStore and loadLoad barriers are much simpler as they basically require the following accesses to occur late enough without any global synchronization requirements. Best regards, Martin -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 22:22 To: Doerr, Martin ; hotspot-dev developers Cc: ERIK.OSTERLUND Subject: Re: Presentation: Understanding OrderAccess Hi Martin, I've added Erik explicitly to the cc as he and I have been discussing fences and "visibility", and of course he most recently revised the descriptions in orderAccess.hpp On 29/11/2016 2:29 AM, Doerr, Martin wrote: > Hi David, > > sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that. > >> Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? > > This is really hard to explain. 
Maybe there are better explanations out there, but I'll give it a try: > > I think the comment in orderAccess.hpp is not bad:
> // Finally, we define a "fence" operation, as a bidirectional barrier.
> // It guarantees that any memory access preceding the fence is not
> // reordered w.r.t. any memory accesses subsequent to the fence in program
> // order.
> > One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. > If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. > Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. But again that attribution of global properties is not something I think is necessarily implied or intended by OrderAccess. Or maybe it is, but as it is only an issue on non-multicopy-atomic systems, it has never been called out explicitly. ?? And those global properties must also be a part of the other barriers (as the fence is just the combination of them all) - but I don't know how you would describe the effects of the other barriers (like loadload) in "global" terms. David ----- > >> Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... >> but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > > "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] > > So acquire+release orders wrt.
all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. > > > Best regards, > Martin > > [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering > [5] http://g.oswego.edu/dl/jmm/cookbook.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Monday, 28. November 2016 13:56 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin, > > On 28/11/2016 8:43 PM, Doerr, Martin wrote: >> Hi David, >> >> I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. >> I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). >> >> The term "multiple-copy atomicity" is described as "... in a machine >> which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". >> >> I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. >> The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". > > Thanks for the reminder of that discussion. :) > >> A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. > > Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose. > >> Since you have asked about C++11, there's an example implementation for PPC [3].
>> Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" loads observe writes in a globally consistent order. > > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses, not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > >> Btw.: We have implemented the Java volatile accesses very similarly to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. >> But I guess the Java memory model is beyond the scope of your presentation. > > Oh yes way out of scope! :) > > Cheers, > David > >> Best regards, >> Martin >> >> >> [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf >> [2] >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030212.html >> [3] >> http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Monday, 28. November 2016 06:56 >> To: Doerr, Martin ; hotspot-dev developers >> >> Subject: Re: Presentation: Understanding OrderAccess >> >> Hi Martin >> >> On 24/11/2016 2:20 AM, Doerr, Martin wrote: >>> Hi David, >>> >>> thank you very much for the presentation. I think it provides a good guideline for hotspot development. >> >> Thanks. >> >>> >>> Would you like to add something about multi-copy atomicity? >> >> Not really. :) >> >>> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >>> >>> It is needed in the following scenario: >>> - Different threads write 2 variables. >>> - Readers of these 2 variables expect a globally consistent order of the write accesses.
>>> >>> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". >> >> Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... >> >>> (While taking a look at it, the condition "#if !(defined SPARC || >>> defined IA32 || defined AMD64)" is not accurate and should be improved. E.g. s390 is multi-copy atomic.) >>> >>> >>> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64. >> >> I still can't get my head around the C++11 terminology for this and >> how you are expected to use it - what does it mean for an individual >> operation to be "sequentially consistent"? :( >> >> Cheers, >> David >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >>> Behalf Of David Holmes >>> Sent: Wednesday, 23. November 2016 06:08 >>> To: hotspot-dev developers >>> Subject: Presentation: Understanding OrderAccess >>> >>> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers.
>>> >>> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf >>> >>> Cheers, >>> David >>> From thomas.stuefe at gmail.com Tue Nov 29 13:43:53 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 29 Nov 2016 14:43:53 +0100 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> Message-ID: Hi David, On Tue, Nov 29, 2016 at 12:59 PM, David Holmes wrote: > Hi Thomas, > > On 29/11/2016 8:39 PM, Thomas Stüfe wrote: > >> Hi David, >> >> thanks for the good explanation. Change looks good, I really like the >> comment in capture_initial_stack(). >> >> Question, with -Xss given and being smaller than the current thread stack >> size, guard pages may appear in the middle of the invoking thread stack? >> I always thought this is a bit dangerous. If your model is to have the >> VM created from the main thread, which then goes off to do different >> things, and have other threads then attach and run Java code, the main >> thread later may crash in unrelated native code just because it reached >> the stack depth of the Java threads? Or am I misunderstanding something? >> > > There is no change to the general behaviour other than allowing a > primordial process thread that launches the VM to now not have an > effective stack limited at 2MB. The current logic will insert guard pages > wherever -Xss states (as long as less than 2MB, else 2MB), while with the > fix the guard pages will be inserted above 2MB - as dictated by -Xss. > > Thank you for this answer. I know my question was outside the scope of your patch. Thomas > David > ----- > > Thanks, Thomas >> >> >> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > wrote: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >> >> >> The bug is not public unfortunately for non-technical reasons - but >> see my eval below.
>> >> Background: if you load the JVM from the primordial thread of a >> process (not done by the java launcher since JDK 6), there is an >> artificial stack limit imposed on the initial thread (by sticking >> the guard page at the limit position of the actual stack) of the >> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >> ignored for the main thread even if the true stack is, say, 8M. This >> limitation dates back 10-15 years and is no longer relevant today >> and should be removed (see below). I've also added additional >> explanatory notes. >> >> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >> >> >> Testing was manually done by modifying the launcher to not run the >> VM in a new thread, and checking the resulting stack size used. >> >> This change will only affect hosted JVMs launched with a -Xss value >> > 2M. >> >> Thanks, >> David >> ----- >> >> Bug eval: >> >> JDK-4441425 limits the stack to 8M as a safeguard against an >> unlimited value from getrlimit in 1.3.1, but further constrained >> that to 2M in 1.4.0 due to JDK-4466587. >> >> By 1.4.2 we have the basic form of the current problematic code: >> >> #ifndef IA64 >> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >> #else >> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >> small >> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >> #endif >> >> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >> >> if (max_size && _initial_thread_stack_size > max_size) { >> _initial_thread_stack_size = max_size; >> } >> >> This was added by JDK-4678676 to allow the stack of the main thread >> to be _reduced_ below the default 2M/4M if the -Xss value was >> smaller than that.** There was no intent to allow the stack size to >> follow -Xss arbitrarily due to the operational constraints imposed >> by the OS/glibc at the time when dealing with the primordial process >> thread. 
>> >> ** It could not actually change the actual stack size of course, but >> set the guard pages to limit use to the expected stack size. >> >> In JDK 6, under JDK-6316197, the launcher was changed to create the >> JVM in a new thread, so that it was not limited by the >> idiosyncrasies of the OS or thread library primordial thread >> handling. However, the stack size limitations remained in place in >> case the VM was launched from the primordial thread of a user >> application via the JNI invocation API. >> >> I believe it should be safe to remove the 2M limitation now. >> >> From aph at redhat.com Tue Nov 29 14:07:23 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 14:07:23 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> Message-ID: <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> On 29/11/16 13:08, Doerr, Martin wrote: > Also storeStore barriers are expected to be transitive or "cumulative" as the property is called in PPC64 documentation. > If one thread releases something which is based on something else which was written by another thread, a third thread > which acquires it is expected to see that in a consistent way. Do you agree? It depends. What exactly do you mean by "is based on"? Andrew. From aph at redhat.com Tue Nov 29 15:35:16 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 15:35:16 +0000 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: On 29/11/16 09:41, Volker Simonis wrote: > Thanks Gustavo, > > the change looks good. > > So now we're just waiting for another review from somebody of the aarch64 folks. > Once we have that and the fc-request is approved I'll push the changes.
One thing I don't understand: cos 0.17098435541865692 1m7.433s 0.1709843554185943 0m56.678s sin 1.7136493465700289 1m10.654s 1.7136493465700542 0m57.114s Do you know what causes the lower digits to be different? Is it that Math and StrictMath use different algorithms, not just different optimization levels? Andrew. From gromero at linux.vnet.ibm.com Tue Nov 29 16:31:58 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Tue, 29 Nov 2016 14:31:58 -0200 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: <583DAD7E.7020807@linux.vnet.ibm.com> Hi Andrew, On 29-11-2016 13:35, Andrew Haley wrote: > On 29/11/16 09:41, Volker Simonis wrote: >> Thanks Gustavo, >> >> the change looks good. >> >> So now we're just waiting for another review from somebody of the aarch64 folks. >> Once we have that and the fc-request is approved I'll push the changes. > > One thing I don't understand: > > cos 0.17098435541865692 1m7.433s 0.1709843554185943 0m56.678s > sin 1.7136493465700289 1m10.654s 1.7136493465700542 0m57.114s > > Do you know what causes the lower digits to be different? Is > it that Math and StrictMath use different algorithms, not just > different optimization levels? I don't know exactly what's the root cause for that difference (in the result). The difference is not present on x64, however on PPC64 even with -O0 (as it is by now) that difference exists. Math methods are intrisified, but StricMath are not. But I understand that Math and StrictMath share the fdlibm code since I already changed some code in fdlibm that reflected both on Math and StrictMath, so it's not clear to me where the Math relaxation occurs on PPC64 (given that such a relaxation is allowed [1]). For sure others much more experienced than I can comment about difference. 
Regards, Gustavo [1] https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html From martin.doerr at sap.com Tue Nov 29 17:50:32 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 29 Nov 2016 17:50:32 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> Message-ID: <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> Hi Andrew, I mean a scenario like the one in 5.1 "Cumulative Barriers for WRC" in [1]. Thread 1 reads a value from Thread 0, Thread 1 publishes something e.g. by a releasing store (which could be lwsync + store on PPC64) and Thread 2 acquires this value (or relies on address-dependency-based ordering). The barrier must order Thread 0's store wrt. Thread 1's store in this case. E.g. Thread 1 could have updated a data structure referencing stuff from Thread 0. I think we all rely on Thread 2 seeing at least the same changes from Thread 0 when accessing this data structure. So this "cumulative" property is relevant for hotspot's OrderAccess functions. Best regards, Martin [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Tuesday, 29. November 2016 15:07 To: Doerr, Martin ; David Holmes ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess On 29/11/16 13:08, Doerr, Martin wrote: > Also storeStore barriers are expected to be transitive or "cumulative" as the property is called in PPC64 documentation. > If one thread releases something which is based on something else > which was written by another thread, a third thread which acquires it is expected to see that in a consistent way. Do you agree? It depends. What exactly do you mean by "is based on"? Andrew.
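Martin's WRC scenario can be sketched with C++11 atomics instead of OrderAccess (the function and variable names below are invented for illustration, not taken from the thread). With release/acquire on every link of the chain, happens-before is transitive, so the C++ memory model already guarantees the "cumulative" outcome Martin expects: Thread 2 must see Thread 0's store.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};

// One run of the write-to-read-causality (WRC) shape:
//   T0: x = 1 (release)
//   T1: reads x == 1 (acquire), then publishes y = 1 (release)
//   T2: reads y == 1 (acquire), then reads x
// Returns T2's view of x; the C++11 model requires the result to be 1.
int wrc_once() {
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);
    int r = -1;
    std::thread t0([] { x.store(1, std::memory_order_release); });
    std::thread t1([] {
        while (x.load(std::memory_order_acquire) != 1) { }  // sync with T0
        y.store(1, std::memory_order_release);              // "based on" x
    });
    std::thread t2([&r] {
        while (y.load(std::memory_order_acquire) != 1) { }  // sync with T1
        r = x.load(std::memory_order_relaxed);  // happens-after T0's store
    });
    t0.join(); t1.join(); t2.join();
    return r;
}
```

Bare address dependencies (or `memory_order_consume`, which Andrew warns about) would need separate treatment; the sketch deliberately sticks to acquire, which Andrew says is fine.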
From daniel.daugherty at oracle.com Tue Nov 29 17:57:50 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 29 Nov 2016 10:57:50 -0700 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> Message-ID: Sorry for being late to this party! Seems like thread stack sizes are very much on folks' minds lately... > webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ src/os/linux/vm/os_linux.cpp L936: // a user-specified value known to be greater than the minimum needed. Perhaps: ... known to be at least the minimum needed. As enforced by this code in os::Posix::set_minimum_stack_sizes(): _java_thread_min_stack_allowed = MAX2(_java_thread_min_stack_allowed, JavaThread::stack_guard_zone_size() + JavaThread::stack_shadow_zone_size() + (4 * BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K); _java_thread_min_stack_allowed = align_size_up(_java_thread_min_stack_allowed, vm_page_size()); size_t stack_size_in_bytes = ThreadStackSize * K; if (stack_size_in_bytes != 0 && stack_size_in_bytes < _java_thread_min_stack_allowed) { // The '-Xss' and '-XX:ThreadStackSize=N' options both set // ThreadStackSize so we go with "Java thread stack size" instead // of "ThreadStackSize" to be more friendly. tty->print_cr("\nThe Java thread stack size specified is too small. " "Specify at least " SIZE_FORMAT "k", _java_thread_min_stack_allowed / K); return JNI_ERR; } L939: // can not do anything to emulate a larger stack than what has been provided by Typo: 'can not' -> 'cannot' L943: // Mamimum stack size is the easy part, get it from RLIMIT_STACK Typo: 'Mamimum' -> 'Maximum' nit - please add a '.' to the end. Thumbs up! I don't need to see a new webrev if you decide to make the minor edits above.
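For reference, the two pieces of arithmetic in the check Dan quotes can be sketched as follows (the helper names here are invented for illustration, though the first mirrors the behaviour of HotSpot's align_size_up for power-of-two alignments):

```cpp
#include <cassert>
#include <cstddef>

// Round sz up to the next multiple of a power-of-two alignment, as the
// page-size rounding of _java_thread_min_stack_allowed does above.
size_t align_size_up_sketch(size_t sz, size_t alignment) {
    return (sz + alignment - 1) & ~(alignment - 1);
}

// The shape of the -Xss sanity check: a requested stack size of 0 means
// "use the default"; any other value below the computed minimum fails.
bool stack_size_ok(size_t requested, size_t min_allowed) {
    return requested == 0 || requested >= min_allowed;
}
```

The 228k minimum used in the usage check below is an arbitrary stand-in, not the value HotSpot computes, which depends on guard, shadow, and word sizes as shown in the quoted code.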
Dan On 11/29/16 5:25 AM, David Holmes wrote: > I just realized I overlooked the case where ThreadStackSize=0 and the > stack is unlimited. In that case it isn't clear where the guard pages > will get inserted - I do know that I don't get a stackoverflow error. > > This needs further investigation. > > David > > On 29/11/2016 9:59 PM, David Holmes wrote: >> Hi Thomas, >> >> On 29/11/2016 8:39 PM, Thomas Stüfe wrote: >>> Hi David, >>> >>> thanks for the good explanation. Change looks good, I really like the >>> comment in capture_initial_stack(). >>> >>> Question, with -Xss given and being smaller than current thread stack >>> size, guard pages may appear in the middle of the invoking thread >>> stack? >>> I always thought this is a bit dangerous. If your model is to have the >>> VM created from the main thread, which then goes off to do different >>> things, and have other threads then attach and run Java code, main >>> thread later may crash in unrelated native code just because it reached >>> the stack depth of the Java threads? Or am I misunderstanding >>> something? >> >> There is no change to the general behaviour other than allowing a >> primordial process thread that launches the VM, to now not have an >> effective stack limited at 2MB. The current logic will insert guard >> pages wherever -Xss states (as long as less than 2MB else 2MB), while >> with the fix the guard pages will be inserted above 2MB - as dictated by >> -Xss. >> >> David >> ----- >> >>> Thanks, Thomas >>> >>> >>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >> > wrote: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>> >>> >>> The bug is not public unfortunately for non-technical reasons - but >>> see my eval below.
>>> >>> Background: if you load the JVM from the primordial thread of a >>> process (not done by the java launcher since JDK 6), there is an >>> artificial stack limit imposed on the initial thread (by sticking >>> the guard page at the limit position of the actual stack) of the >>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M >>> it is >>> ignored for the main thread even if the true stack is, say, 8M. >>> This >>> limitation dates back 10-15 years and is no longer relevant today >>> and should be removed (see below). I've also added additional >>> explanatory notes. >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>> >>> >>> Testing was manually done by modifying the launcher to not run the >>> VM in a new thread, and checking the resulting stack size used. >>> >>> This change will only affect hosted JVMs launched with a -Xss value >>> > 2M. >>> >>> Thanks, >>> David >>> ----- >>> >>> Bug eval: >>> >>> JDK-4441425 limits the stack to 8M as a safeguard against an >>> unlimited value from getrlimit in 1.3.1, but further constrained >>> that to 2M in 1.4.0 due to JDK-4466587. 
>>> >>> By 1.4.2 we have the basic form of the current problematic code: >>> >>> #ifndef IA64 >>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>> #else >>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>> small >>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>> #endif >>> >>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>> >>> if (max_size && _initial_thread_stack_size > max_size) { >>> _initial_thread_stack_size = max_size; >>> } >>> >>> This was added by JDK-4678676 to allow the stack of the main thread >>> to be _reduced_ below the default 2M/4M if the -Xss value was >>> smaller than that.** There was no intent to allow the stack size to >>> follow -Xss arbitrarily due to the operational constraints imposed >>> by the OS/glibc at the time when dealing with the primordial >>> process >>> thread. >>> >>> ** It could not actually change the actual stack size of course, >>> but >>> set the guard pages to limit use to the expected stack size. >>> >>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>> JVM in a new thread, so that it was not limited by the >>> idiosyncrasies of the OS or thread library primordial thread >>> handling. However, the stack size limitations remained in place in >>> case the VM was launched from the primordial thread of a user >>> application via the JNI invocation API. >>> >>> I believe it should be safe to remove the 2M limitation now.
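The historical clamp in the quoted eval can be modelled against the POSIX getrlimit API. This is a simplified sketch of the pre-fix behaviour being removed, not the actual os_linux.cpp code (the function name is invented, and the IA64 branch is omitted):

```cpp
#include <cassert>
#include <sys/resource.h>
#include <unistd.h>

// Model of the old primordial-thread logic: take the soft RLIMIT_STACK,
// cap it at 2M (the non-IA64 branch quoted above), then round down to a
// page boundary as "rlim.rlim_cur & ~(page_size() - 1)" did.
size_t clamped_initial_stack_size() {
    struct rlimit rlim;
    if (getrlimit(RLIMIT_STACK, &rlim) != 0) return 0;
    const rlim_t two_m = 2 * 1024 * 1024;
    rlim_t cur = rlim.rlim_cur;
    if (cur == RLIM_INFINITY || cur > two_m) cur = two_m;  // the 2M cap
    size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    return static_cast<size_t>(cur) & ~(page - 1);         // page-align down
}
```

With the fix under review, the cap is gone and the guard pages follow -Xss instead; the sketch only models the legacy behaviour so the effect of removing it is concrete.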
>>> >>> From volker.simonis at gmail.com Tue Nov 29 18:06:05 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 29 Nov 2016 19:06:05 +0100 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583DAD7E.7020807@linux.vnet.ibm.com> References: <583C5B10.8040204@linux.vnet.ibm.com> <583DAD7E.7020807@linux.vnet.ibm.com> Message-ID: On Tue, Nov 29, 2016 at 5:31 PM, Gustavo Romero wrote: > Hi Andrew, > > On 29-11-2016 13:35, Andrew Haley wrote: >> On 29/11/16 09:41, Volker Simonis wrote: >>> Thanks Gustavo, >>> >>> the change looks good. >>> >>> So now we're just waiting for another review from somebody of the aarch64 folks. >>> Once we have that and the fc-request is approved I'll push the changes. >> >> One thing I don't understand: >> >> cos 0.17098435541865692 1m7.433s 0.1709843554185943 0m56.678s >> sin 1.7136493465700289 1m10.654s 1.7136493465700542 0m57.114s >> >> Do you know what causes the lower digits to be different? Is >> it that Math and StrictMath use different algorithms, not just >> different optimization levels? > > I don't know exactly what's the root cause for that difference (in the result). > The difference is not present on x64, however on PPC64 even with -O0 (as it is > by now) that difference exists. > > Math methods are intrinsified, but StrictMath methods are not. But I understand that Math > and StrictMath share the fdlibm code since I already changed some code in fdlibm > that reflected both on Math and StrictMath, so it's not clear to me where the > Math relaxation occurs on PPC64 (given that such a relaxation is allowed [1]). > I think the difference is because Math functions can be intrinsified (and optimized) while StrictMath functions cannot. HotSpot has different ways of intrinsifying the Math functions. If the CPU supports the corresponding function, the VM generates special nodes for that.
Otherwise, if there exist special optimized assembler stubs for a function (e.g. see "StubRoutines::_dsin = generate_libmSin()" in stubGenerator_x86_64.cpp) the VM makes use of them. Otherwise it still uses leaf-calls into HotSpot's internal C++ implementation of the functions (e.g. SharedRuntime::dsin() in sharedRuntimeTrig.cpp) which are faster than doing a native call into the fdlibm version. The implementation in SharedRuntime doesn't have to be "strict" so it probably uses fused multiplication, and it is also built with full optimization without '-ffp-contract=off' (which is OK in this case). @Andrew: are you fine with Gustavo's latest version of the change? > For sure others much more experienced than I am can comment on the difference. > > > Regards, > Gustavo > > [1] https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html > From aph at redhat.com Tue Nov 29 18:15:09 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 18:15:09 +0000 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583C5B10.8040204@linux.vnet.ibm.com> <583DAD7E.7020807@linux.vnet.ibm.com> Message-ID: <3c3aa7f0-01c7-46ae-ce8d-414d43213e4a@redhat.com> On 29/11/16 18:06, Volker Simonis wrote: > @Andrew: are you fine with Gustavo's latest version of the change? Sure. The StrictMath versions all seem to give the same results. Andrew.
From gromero at linux.vnet.ibm.com Tue Nov 29 18:37:01 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Tue, 29 Nov 2016 16:37:01 -0200 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <3c3aa7f0-01c7-46ae-ce8d-414d43213e4a@redhat.com> References: <583C5B10.8040204@linux.vnet.ibm.com> <583DAD7E.7020807@linux.vnet.ibm.com> <3c3aa7f0-01c7-46ae-ce8d-414d43213e4a@redhat.com> Message-ID: <583DCACD.3090803@linux.vnet.ibm.com> Hi Erik, Volker, Andrew On 29-11-2016 16:15, Andrew Haley wrote: > On 29/11/16 18:06, Volker Simonis wrote: >> @Andrew: are you fine with Gustavos latest version of the change? > > Sure. The StrictMath versions all seem to give the same results. > > Andrew. > Thanks for reviewing the change! I changed the "FC Extension Request" status to "reviewed". Regards, Gustavo From aph at redhat.com Tue Nov 29 19:01:56 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 19:01:56 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> Message-ID: <612df825-e963-2518-db9e-bf713dbf166a@redhat.com> On 29/11/16 17:50, Doerr, Martin wrote: > I mean a scenario like in 5.1 " Cumulative Barriers for WRC" in [1]. > > Thread 1 reads a value from Thread 0, Thread 1 publishes something > e.g. by a releasing store (which could be lwsync + store on PPC64) > and Thread 2 acquires this value (or relies on address dependency > based ordering). > > The barrier must order Thread 0's store wrt. Thread 1's store in this case. > > E.g. Thread 1 could have updated a data structure referencing stuff > from Thread 0. 
I think we all rely on Thread 2 seeing at least > the same changes from Thread 0 when accessing this data > structure. So this "cumulative" property is relevant for hotspot's > OrderAccess functions. You can't rely on address dependency ordering in a language like C++ unless you use something like memory_order_consume: the compiler is capable of optimizing your code so that it doesn't use the address you think it should be using. That example is only valid for assembly code. Acquire is fine. Andrew. From david.holmes at oracle.com Wed Nov 30 07:22:49 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Nov 2016 17:22:49 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> Message-ID: <0bf53099-6f87-419c-ca5c-af6437002929@oracle.com> Thanks for the review Dan. Unfortunately I overlooked one case - see my other emails. :) Cheers, David On 30/11/2016 3:57 AM, Daniel D. Daugherty wrote: > Sorry for being late to this party! Seems like thread stack sizes are > very much on folks' minds lately... > >> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > src/os/linux/vm/os_linux.cpp > L936: // a user-specified value known to be greater than the > minimum needed. > Perhaps: ... known to be at least the minimum needed.
> > As enforced by this code in os::Posix::set_minimum_stack_sizes(): > > _java_thread_min_stack_allowed = > MAX2(_java_thread_min_stack_allowed, > JavaThread::stack_guard_zone_size() + > JavaThread::stack_shadow_zone_size() + > (4 * BytesPerWord > COMPILER2_PRESENT(+ 2)) * 4 * K); > > _java_thread_min_stack_allowed = > align_size_up(_java_thread_min_stack_allowed, vm_page_size()); > > size_t stack_size_in_bytes = ThreadStackSize * K; > if (stack_size_in_bytes != 0 && > stack_size_in_bytes < _java_thread_min_stack_allowed) { > // The '-Xss' and '-XX:ThreadStackSize=N' options both set > // ThreadStackSize so we go with "Java thread stack size" instead > // of "ThreadStackSize" to be more friendly. > tty->print_cr("\nThe Java thread stack size specified is too > small. " > "Specify at least " SIZE_FORMAT "k", > _java_thread_min_stack_allowed / K); > return JNI_ERR; > } > > L939: // can not do anything to emulate a larger stack than what > has been provided by > Typo: 'can not' -> 'cannot' > > L943: // Mamimum stack size is the easy part, get it from > RLIMIT_STACK > Typo: 'Mamimum' -> 'Maximum' > nit - please add a '.' to the end. > > > Thumbs up! > > I don't need to see a new webrev if you decide to make the > minor edits above. > > Dan > > > > On 11/29/16 5:25 AM, David Holmes wrote: >> I just realized I overlooked the case where ThreadStackSize=0 and the >> stack is unlimited. In that case it isn't clear where the guard pages >> will get inserted - I do know that I don't get a stackoverflow error. >> >> This needs further investigation. >> >> David >> >> On 29/11/2016 9:59 PM, David Holmes wrote: >>> Hi Thomas, >>> >>> On 29/11/2016 8:39 PM, Thomas Stüfe wrote: >>>> Hi David, >>>> >>>> thanks for the good explanation. Change looks good, I really like the >>>> comment in capture_initial_stack(). >>>> >>>> Question, with -Xss given and being smaller than current thread stack >>>> size, guard pages may appear in the middle of the invoking thread >>>> stack?
>>>> I always thought this is a bit dangerous. If your model is to have the >>>> VM created from the main thread, which then goes off to do different >>>> things, and have other threads then attach and run Java code, main >>>> thread later may crash in unrelated native code just because it reached >>>> the stack depth of the Java threads? Or am I misunderstanding >>>> something? >>> >>> There is no change to the general behaviour other than allowing a >>> primordial process thread that launches the VM, to now not have an >>> effective stack limited at 2MB. The current logic will insert guard >>> pages wherever -Xss states (as long as less than 2MB else 2MB), while >>> with the fix the guard pages will be inserted above 2MB - as dictated by >>> -Xss. >>> >>> David >>> ----- >>> >>>> Thanks, Thomas >>>> >>>> >>>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >>>> > wrote: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>>> >>>> >>>> The bug is not public unfortunately for non-technical reasons - but >>>> see my eval below. >>>> >>>> Background: if you load the JVM from the primordial thread of a >>>> process (not done by the java launcher since JDK 6), there is an >>>> artificial stack limit imposed on the initial thread (by sticking >>>> the guard page at the limit position of the actual stack) of the >>>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M >>>> it is >>>> ignored for the main thread even if the true stack is, say, 8M. >>>> This >>>> limitation dates back 10-15 years and is no longer relevant today >>>> and should be removed (see below). I've also added additional >>>> explanatory notes. >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>>> >>>> >>>> Testing was manually done by modifying the launcher to not run the >>>> VM in a new thread, and checking the resulting stack size used. >>>> >>>> This change will only affect hosted JVMs launched with a -Xss value >>>> > 2M.
>>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> Bug eval: >>>> >>>> JDK-4441425 limits the stack to 8M as a safeguard against an >>>> unlimited value from getrlimit in 1.3.1, but further constrained >>>> that to 2M in 1.4.0 due to JDK-4466587. >>>> >>>> By 1.4.2 we have the basic form of the current problematic code: >>>> >>>> #ifndef IA64 >>>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>>> #else >>>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>>> small >>>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>>> #endif >>>> >>>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>>> >>>> if (max_size && _initial_thread_stack_size > max_size) { >>>> _initial_thread_stack_size = max_size; >>>> } >>>> >>>> This was added by JDK-4678676 to allow the stack of the main thread >>>> to be _reduced_ below the default 2M/4M if the -Xss value was >>>> smaller than that.** There was no intent to allow the stack size to >>>> follow -Xss arbitrarily due to the operational constraints imposed >>>> by the OS/glibc at the time when dealing with the primordial >>>> process >>>> thread. >>>> >>>> ** It could not actually change the actual stack size of course, >>>> but >>>> set the guard pages to limit use to the expected stack size. >>>> >>>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>>> JVM in a new thread, so that it was not limited by the >>>> idiosyncracies of the OS or thread library primordial thread >>>> handling. However, the stack size limitations remained in place in >>>> case the VM was launched from the primordial thread of a user >>>> application via the JNI invocation API. >>>> >>>> I believe it should be safe to remove the 2M limitation now. 
>>>> >>>> > From david.holmes at oracle.com Wed Nov 30 07:35:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Nov 2016 17:35:24 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> Message-ID: <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> On 29/11/2016 10:25 PM, David Holmes wrote: > I just realized I overlooked the case where ThreadStackSize=0 and the > stack is unlimited. In that case it isn't clear where the guard pages > will get inserted - I do know that I don't get a stackoverflow error. > > This needs further investigation. So what happens here is that the massive stack-size causes stack-bottom to be higher than stack-top! So we will set a guard-page goodness knows where, and we can consume the current stack until such time as we hit an unmapped or protected region at which point we are killed. I'm not sure what to do here. My gut feel is that in such a case we should not attempt to create a guard page in the initial thread. That would require using a sentinel value for the stack-size. Though it also presents a problem for stack-bottom - which is implicitly zero. It may also give false positives in the is_initial_thread() check! Thoughts? Suggestions? > David > > On 29/11/2016 9:59 PM, David Holmes wrote: >> Hi Thomas, >> >> On 29/11/2016 8:39 PM, Thomas St?fe wrote: >>> Hi David, >>> >>> thanks for the good explanation. Change looks good, I really like the >>> comment in capture_initial_stack(). >>> >>> Question, with -Xss given and being smaller than current thread stack >>> size, guard pages may appear in the middle of the invoking thread stack? >>> I always thought this is a bit dangerous. 
If your model is to have the >>> VM created from the main thread, which then goes off to do different >>> things, and have other threads then attach and run java code, main >>> thread later may crash in unrelated native code just because it reached >>> the stack depth of the hava threads? Or am I misunderstanding something? >> >> There is no change to the general behaviour other than allowing a >> primordial process thread that launches the VM, to now not have an >> effective stack limited at 2MB. The current logic will insert guard >> pages where ever -Xss states (as long as less than 2MB else 2MB), while >> with the fix the guard pages will be inserted above 2MB - as dictated by >> -Xss. >> >> David >> ----- >> >>> Thanks, Thomas >>> >>> >>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >> > wrote: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>> >>> >>> The bug is not public unfortunately for non-technical reasons - but >>> see my eval below. >>> >>> Background: if you load the JVM from the primordial thread of a >>> process (not done by the java launcher since JDK 6), there is an >>> artificial stack limit imposed on the initial thread (by sticking >>> the guard page at the limit position of the actual stack) of the >>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >>> ignored for the main thread even if the true stack is, say, 8M. This >>> limitation dates back 10-15 years and is no longer relevant today >>> and should be removed (see below). I've also added additional >>> explanatory notes. >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>> >>> >>> Testing was manually done by modifying the launcher to not run the >>> VM in a new thread, and checking the resulting stack size used. >>> >>> This change will only affect hosted JVMs launched with a -Xss value >>> > 2M. 
>>> >>> Thanks, >>> David >>> ----- >>> >>> Bug eval: >>> >>> JDK-4441425 limits the stack to 8M as a safeguard against an >>> unlimited value from getrlimit in 1.3.1, but further constrained >>> that to 2M in 1.4.0 due to JDK-4466587. >>> >>> By 1.4.2 we have the basic form of the current problematic code: >>> >>> #ifndef IA64 >>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>> #else >>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>> small >>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>> #endif >>> >>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>> >>> if (max_size && _initial_thread_stack_size > max_size) { >>> _initial_thread_stack_size = max_size; >>> } >>> >>> This was added by JDK-4678676 to allow the stack of the main thread >>> to be _reduced_ below the default 2M/4M if the -Xss value was >>> smaller than that.** There was no intent to allow the stack size to >>> follow -Xss arbitrarily due to the operational constraints imposed >>> by the OS/glibc at the time when dealing with the primordial process >>> thread. >>> >>> ** It could not actually change the actual stack size of course, but >>> set the guard pages to limit use to the expected stack size. >>> >>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>> JVM in a new thread, so that it was not limited by the >>> idiosyncracies of the OS or thread library primordial thread >>> handling. However, the stack size limitations remained in place in >>> case the VM was launched from the primordial thread of a user >>> application via the JNI invocation API. >>> >>> I believe it should be safe to remove the 2M limitation now. 
>>> >>> From thomas.stuefe at gmail.com Wed Nov 30 08:17:03 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 30 Nov 2016 09:17:03 +0100 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> Message-ID: On Wed, Nov 30, 2016 at 8:35 AM, David Holmes wrote: > On 29/11/2016 10:25 PM, David Holmes wrote: > >> I just realized I overlooked the case where ThreadStackSize=0 and the >> stack is unlimited. In that case it isn't clear where the guard pages >> will get inserted - I do know that I don't get a stackoverflow error. >> >> This needs further investigation. >> > > So what happens here is that the massive stack-size causes stack-bottom to > be higher than stack-top! So we will set a guard-page goodness knows where, > and we can consume the current stack until such time as we hit an unmapped > or protected region at which point we are killed. > > I'm not sure what to do here. My gut feel is that in such a case we should > not attempt to create a guard page in the initial thread. That would > require using a sentinel value for the stack-size. Though it also presents > a problem for stack-bottom - which is implicitly zero. It may also give > false positives in the is_initial_thread() check! > > Thoughts? Suggestions? > > Maybe I am overlooking something, but should os::capture_initial_thread() not call pthread_getattr_np() first to handle the case where the VM was created on a pthread which is not the primordial thread and may have a different stack size than what getrlimit returns? And fall back to getrlimit only if pthread_getattr_np() fails? And then we also should handle RLIM_INFINITY. For that case, I also think not setting guard pages would be safest. 
We also may just refuse to run in that case, because the workaround for the user is easy - just set the limit before process start. Note that on AIX, we currently refuse to run on the primordial thread because it may have different page sizes than pthreads and it is impossible to get the exact stack locations. Thomas > > David >> >> On 29/11/2016 9:59 PM, David Holmes wrote: >> >>> Hi Thomas, >>> >>> On 29/11/2016 8:39 PM, Thomas St?fe wrote: >>> >>>> Hi David, >>>> >>>> thanks for the good explanation. Change looks good, I really like the >>>> comment in capture_initial_stack(). >>>> >>>> Question, with -Xss given and being smaller than current thread stack >>>> size, guard pages may appear in the middle of the invoking thread stack? >>>> I always thought this is a bit dangerous. If your model is to have the >>>> VM created from the main thread, which then goes off to do different >>>> things, and have other threads then attach and run java code, main >>>> thread later may crash in unrelated native code just because it reached >>>> the stack depth of the hava threads? Or am I misunderstanding something? >>>> >>> >>> There is no change to the general behaviour other than allowing a >>> primordial process thread that launches the VM, to now not have an >>> effective stack limited at 2MB. The current logic will insert guard >>> pages where ever -Xss states (as long as less than 2MB else 2MB), while >>> with the fix the guard pages will be inserted above 2MB - as dictated by >>> -Xss. >>> >>> David >>> ----- >>> >>> Thanks, Thomas >>>> >>>> >>>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >>> > wrote: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>>> >>>> >>>> The bug is not public unfortunately for non-technical reasons - but >>>> see my eval below. 
>>>> >>>> Background: if you load the JVM from the primordial thread of a >>>> process (not done by the java launcher since JDK 6), there is an >>>> artificial stack limit imposed on the initial thread (by sticking >>>> the guard page at the limit position of the actual stack) of the >>>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >>>> ignored for the main thread even if the true stack is, say, 8M. This >>>> limitation dates back 10-15 years and is no longer relevant today >>>> and should be removed (see below). I've also added additional >>>> explanatory notes. >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>>> >>>> >>>> Testing was manually done by modifying the launcher to not run the >>>> VM in a new thread, and checking the resulting stack size used. >>>> >>>> This change will only affect hosted JVMs launched with a -Xss value >>>> > 2M. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> Bug eval: >>>> >>>> JDK-4441425 limits the stack to 8M as a safeguard against an >>>> unlimited value from getrlimit in 1.3.1, but further constrained >>>> that to 2M in 1.4.0 due to JDK-4466587. 
>>>> >>>> By 1.4.2 we have the basic form of the current problematic code: >>>> >>>> #ifndef IA64 >>>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>>> #else >>>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>>> small >>>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>>> #endif >>>> >>>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>>> >>>> if (max_size && _initial_thread_stack_size > max_size) { >>>> _initial_thread_stack_size = max_size; >>>> } >>>> >>>> This was added by JDK-4678676 to allow the stack of the main thread >>>> to be _reduced_ below the default 2M/4M if the -Xss value was >>>> smaller than that.** There was no intent to allow the stack size to >>>> follow -Xss arbitrarily due to the operational constraints imposed >>>> by the OS/glibc at the time when dealing with the primordial process >>>> thread. >>>> >>>> ** It could not actually change the actual stack size of course, but >>>> set the guard pages to limit use to the expected stack size. >>>> >>>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>>> JVM in a new thread, so that it was not limited by the >>>> idiosyncracies of the OS or thread library primordial thread >>>> handling. However, the stack size limitations remained in place in >>>> case the VM was launched from the primordial thread of a user >>>> application via the JNI invocation API. >>>> >>>> I believe it should be safe to remove the 2M limitation now. 
>>>> >>>> >>>> From martin.doerr at sap.com Wed Nov 30 08:36:21 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 30 Nov 2016 08:36:21 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <612df825-e963-2518-db9e-bf713dbf166a@redhat.com> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> <612df825-e963-2518-db9e-bf713dbf166a@redhat.com> Message-ID: <32c3e619ba3e4dcf9525e596b5c91312@dewdfe13de06.global.corp.sap>

Hi Andrew,

I know that. My point was the global effect of Thread 1's barrier.

Best regards, Martin

-----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Tuesday, 29 November 2016 20:02 To: Doerr, Martin ; David Holmes ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess

On 29/11/16 17:50, Doerr, Martin wrote: > I mean a scenario like in 5.1 "Cumulative Barriers for WRC" in [1]. > > Thread 1 reads a value from Thread 0, Thread 1 publishes something > e.g. by a releasing store (which could be lwsync + store on PPC64) and > Thread 2 acquires this value (or relies on address-dependency-based > ordering). > > The barrier must order Thread 0's store wrt. Thread 1's store in this case. > > E.g. Thread 1 could have updated a data structure referencing stuff > from Thread 0. I think we all rely on Thread 2 seeing at least the > same changes from Thread 0 when accessing this data structure. So this > "cumulative" property is relevant for hotspot's OrderAccess functions.

You can't rely on address dependency ordering in a language like C++ unless you use something like memory_order_consume: the compiler is capable of optimizing your code so that it doesn't use the address you think it should be using. That example is only valid for assembly code. Acquire is fine.

Andrew.
From david.holmes at oracle.com Wed Nov 30 08:46:47 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Nov 2016 18:46:47 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> Message-ID: On 30/11/2016 6:17 PM, Thomas St?fe wrote: > On Wed, Nov 30, 2016 at 8:35 AM, David Holmes > wrote: > > On 29/11/2016 10:25 PM, David Holmes wrote: > > I just realized I overlooked the case where ThreadStackSize=0 > and the > stack is unlimited. In that case it isn't clear where the guard > pages > will get inserted - I do know that I don't get a stackoverflow > error. > > This needs further investigation. > > > So what happens here is that the massive stack-size causes > stack-bottom to be higher than stack-top! So we will set a > guard-page goodness knows where, and we can consume the current > stack until such time as we hit an unmapped or protected region at > which point we are killed. > > I'm not sure what to do here. My gut feel is that in such a case we > should not attempt to create a guard page in the initial thread. > That would require using a sentinel value for the stack-size. Though > it also presents a problem for stack-bottom - which is implicitly > zero. It may also give false positives in the is_initial_thread() check! > > Thoughts? Suggestions? > > > Maybe I am overlooking something, but should > os::capture_initial_thread() not call pthread_getattr_np() first to > handle the case where the VM was created on a pthread which is not the > primordial thread and may have a different stack size than what > getrlimit returns? And fall back to getrlimit only if > pthread_getattr_np() fails? 
My understanding of the problem (which likely no longer exists) is that pthread_getattr_np didn't fail as such but returned bogus values - so the problem was not detectable and so we just had to not use pthread_getattr_np. > And then we also should handle > RLIM_INFINITY. For that case, I also think not setting guard pages would > be safest. > > We also may just refuse to run in that case, because the workaround for > the user is easy - just set the limit before process start. Note that on > AIX, we currently refuse to run on the primordial thread because it may > have different page sizes than pthreads and it is impossible to get the > exact stack locations. I was wondering why the AIX set up seemed so simple in comparison :) Thanks, David > > Thomas > > > > David > > On 29/11/2016 9:59 PM, David Holmes wrote: > > Hi Thomas, > > On 29/11/2016 8:39 PM, Thomas St?fe wrote: > > Hi David, > > thanks for the good explanation. Change looks good, I > really like the > comment in capture_initial_stack(). > > Question, with -Xss given and being smaller than current > thread stack > size, guard pages may appear in the middle of the > invoking thread stack? > I always thought this is a bit dangerous. If your model > is to have the > VM created from the main thread, which then goes off to > do different > things, and have other threads then attach and run java > code, main > thread later may crash in unrelated native code just > because it reached > the stack depth of the hava threads? Or am I > misunderstanding something? > > > There is no change to the general behaviour other than > allowing a > primordial process thread that launches the VM, to now not > have an > effective stack limited at 2MB. The current logic will > insert guard > pages where ever -Xss states (as long as less than 2MB else > 2MB), while > with the fix the guard pages will be inserted above 2MB - as > dictated by > -Xss. 
> > David > ----- > > Thanks, Thomas > > > On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > >> wrote: > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8170307 > > > > > The bug is not public unfortunately for > non-technical reasons - but > see my eval below. > > Background: if you load the JVM from the primordial > thread of a > process (not done by the java launcher since JDK 6), > there is an > artificial stack limit imposed on the initial thread > (by sticking > the guard page at the limit position of the actual > stack) of the > minimum of the -Xss setting and 2M. So if you set > -Xss to > 2M it is > ignored for the main thread even if the true stack > is, say, 8M. This > limitation dates back 10-15 years and is no longer > relevant today > and should be removed (see below). I've also added > additional > explanatory notes. > > webrev: > http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > > > > Testing was manually done by modifying the launcher > to not run the > VM in a new thread, and checking the resulting stack > size used. > > This change will only affect hosted JVMs launched > with a -Xss value > > 2M. > > Thanks, > David > ----- > > Bug eval: > > JDK-4441425 limits the stack to 8M as a safeguard > against an > unlimited value from getrlimit in 1.3.1, but further > constrained > that to 2M in 1.4.0 due to JDK-4466587. 
> > By 1.4.2 we have the basic form of the current > problematic code: > > #ifndef IA64 > if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * > K * K; > #else > // Problem still exists RH7.2 (IA64 anyway) but > 2MB is a little > small > if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * > K * K; > #endif > > _initial_thread_stack_size = rlim.rlim_cur & > ~(page_size() - 1); > > if (max_size && _initial_thread_stack_size > > max_size) { > _initial_thread_stack_size = max_size; > } > > This was added by JDK-4678676 to allow the stack of > the main thread > to be _reduced_ below the default 2M/4M if the -Xss > value was > smaller than that.** There was no intent to allow > the stack size to > follow -Xss arbitrarily due to the operational > constraints imposed > by the OS/glibc at the time when dealing with the > primordial process > thread. > > ** It could not actually change the actual stack > size of course, but > set the guard pages to limit use to the expected > stack size. > > In JDK 6, under JDK-6316197, the launcher was > changed to create the > JVM in a new thread, so that it was not limited by the > idiosyncracies of the OS or thread library > primordial thread > handling. However, the stack size limitations > remained in place in > case the VM was launched from the primordial thread > of a user > application via the JNI invocation API. > > I believe it should be safe to remove the 2M > limitation now. > > > From daniel.daugherty at oracle.com Wed Nov 30 15:10:14 2016 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 30 Nov 2016 08:10:14 -0700 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <0bf53099-6f87-419c-ca5c-af6437002929@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> <0bf53099-6f87-419c-ca5c-af6437002929@oracle.com> Message-ID: <32d6ce82-3279-1e3b-b23e-aa37ec79a459@oracle.com> On 11/30/16 12:22 AM, David Holmes wrote: > Thanks for the review Dan. Unfortunately I overlooked one case - see > my other emails. :) Yup. I always read the entire review thread before posting my review (and sometimes update said review with "Update:" lines). I poked around a bit in the code, but couldn't come up with an "aha moment" on the -XX:ThreadStackSize=0 issue. It looked like the few comments I had might still be useful when you find your way out of the current quagmire... :-) Gotta love these thread stack size issues... :-( Dan > > Cheers, > David > > On 30/11/2016 3:57 AM, Daniel D. Daugherty wrote: >> Sorry for being late to this party! Seems like thread stack sizes are >> very much on folks minds lately... >> >>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >> >> src/os/linux/vm/os_linux.cpp >> L936: // a user-specified value known to be greater than the >> minimum needed. >> Perhaps: ... known to be at least the minimum needed. 
>> >> As enforced by this code in >> os::Posix::set_minimum_stack_sizes(): >> >> _java_thread_min_stack_allowed = >> MAX2(_java_thread_min_stack_allowed, >> JavaThread::stack_guard_zone_size() + >> JavaThread::stack_shadow_zone_size() + >> (4 * BytesPerWord >> COMPILER2_PRESENT(+ 2)) * 4 * K); >> >> _java_thread_min_stack_allowed = >> align_size_up(_java_thread_min_stack_allowed, vm_page_size()); >> >> size_t stack_size_in_bytes = ThreadStackSize * K; >> if (stack_size_in_bytes != 0 && >> stack_size_in_bytes < _java_thread_min_stack_allowed) { >> // The '-Xss' and '-XX:ThreadStackSize=N' options both set >> // ThreadStackSize so we go with "Java thread stack size" >> instead >> // of "ThreadStackSize" to be more friendly. >> tty->print_cr("\nThe Java thread stack size specified is too >> small. " >> "Specify at least " SIZE_FORMAT "k", >> _java_thread_min_stack_allowed / K); >> return JNI_ERR; >> } >> >> L939: // can not do anything to emulate a larger stack than what >> has been provided by >> Typo: 'can not' -> 'cannot' >> >> L943: // Mamimum stack size is the easy part, get it from >> RLIMIT_STACK >> Typo: 'Mamimum' -> 'Maximum' >> nit - please add a '.' to the end. >> >> >> Thumbs up! >> >> I don't need to see a new webrev if you decide to make the >> minor edits above. >> >> Dan >> >> >> >> On 11/29/16 5:25 AM, David Holmes wrote: >>> I just realized I overlooked the case where ThreadStackSize=0 and the >>> stack is unlimited. In that case it isn't clear where the guard pages >>> will get inserted - I do know that I don't get a stackoverflow error. >>> >>> This needs further investigation. >>> >>> David >>> >>> On 29/11/2016 9:59 PM, David Holmes wrote: >>>> Hi Thomas, >>>> >>>> On 29/11/2016 8:39 PM, Thomas St?fe wrote: >>>>> Hi David, >>>>> >>>>> thanks for the good explanation. Change looks good, I really like the >>>>> comment in capture_initial_stack(). 
>>>>> >>>>> Question, with -Xss given and being smaller than current thread stack >>>>> size, guard pages may appear in the middle of the invoking thread >>>>> stack? >>>>> I always thought this is a bit dangerous. If your model is to have >>>>> the >>>>> VM created from the main thread, which then goes off to do different >>>>> things, and have other threads then attach and run java code, main >>>>> thread later may crash in unrelated native code just because it >>>>> reached >>>>> the stack depth of the hava threads? Or am I misunderstanding >>>>> something? >>>> >>>> There is no change to the general behaviour other than allowing a >>>> primordial process thread that launches the VM, to now not have an >>>> effective stack limited at 2MB. The current logic will insert guard >>>> pages where ever -Xss states (as long as less than 2MB else 2MB), >>>> while >>>> with the fix the guard pages will be inserted above 2MB - as >>>> dictated by >>>> -Xss. >>>> >>>> David >>>> ----- >>>> >>>>> Thanks, Thomas >>>>> >>>>> >>>>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >>>>> >>>> > wrote: >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>>>> >>>>> >>>>> The bug is not public unfortunately for non-technical reasons >>>>> - but >>>>> see my eval below. >>>>> >>>>> Background: if you load the JVM from the primordial thread of a >>>>> process (not done by the java launcher since JDK 6), there is an >>>>> artificial stack limit imposed on the initial thread (by sticking >>>>> the guard page at the limit position of the actual stack) of the >>>>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M >>>>> it is >>>>> ignored for the main thread even if the true stack is, say, 8M. >>>>> This >>>>> limitation dates back 10-15 years and is no longer relevant today >>>>> and should be removed (see below). I've also added additional >>>>> explanatory notes. 
>>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>>>> >>>>> >>>>> Testing was manually done by modifying the launcher to not run >>>>> the >>>>> VM in a new thread, and checking the resulting stack size used. >>>>> >>>>> This change will only affect hosted JVMs launched with a -Xss >>>>> value >>>>> > 2M. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> Bug eval: >>>>> >>>>> JDK-4441425 limits the stack to 8M as a safeguard against an >>>>> unlimited value from getrlimit in 1.3.1, but further constrained >>>>> that to 2M in 1.4.0 due to JDK-4466587. >>>>> >>>>> By 1.4.2 we have the basic form of the current problematic code: >>>>> >>>>> #ifndef IA64 >>>>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>>>> #else >>>>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>>>> small >>>>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>>>> #endif >>>>> >>>>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - >>>>> 1); >>>>> >>>>> if (max_size && _initial_thread_stack_size > max_size) { >>>>> _initial_thread_stack_size = max_size; >>>>> } >>>>> >>>>> This was added by JDK-4678676 to allow the stack of the main >>>>> thread >>>>> to be _reduced_ below the default 2M/4M if the -Xss value was >>>>> smaller than that.** There was no intent to allow the stack >>>>> size to >>>>> follow -Xss arbitrarily due to the operational constraints >>>>> imposed >>>>> by the OS/glibc at the time when dealing with the primordial >>>>> process >>>>> thread. >>>>> >>>>> ** It could not actually change the actual stack size of course, >>>>> but >>>>> set the guard pages to limit use to the expected stack size. >>>>> >>>>> In JDK 6, under JDK-6316197, the launcher was changed to >>>>> create the >>>>> JVM in a new thread, so that it was not limited by the >>>>> idiosyncracies of the OS or thread library primordial thread >>>>> handling. 
However, the stack size limitations remained in >>>>> place in >>>>> case the VM was launched from the primordial thread of a user >>>>> application via the JNI invocation API. >>>>> >>>>> I believe it should be safe to remove the 2M limitation now. >>>>> >>>>> >> From trevor.d.watson at oracle.com Wed Nov 30 15:29:50 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Wed, 30 Nov 2016 15:29:50 +0000 Subject: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> References: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> Message-ID: <8e47a2d0-c823-4d74-89bf-831c08a8f10d@oracle.com> Hi Vladimir, Thanks for the review. Comments inline below... On 22/11/16 20:04, Vladimir Kozlov wrote: > Do you have performance numbers? I've spent a lot of time looking at performance and it's proving verify difficult to precisely quantify either on a T5 or an S7. However, overall, it would appear that using the native lzcnt instruction is around 10% quicker than the current implementation which uses POPC. > UseVIS is too wide flag to control only these instructions generation. > > To be consistent with x86 code please add > UseCountLeadingZerosInstruction flag to globals_sparc.hpp and its > setting in vm_version_sparc.cpp (based on has_vis3()) similar to what is > done for x86. I've done this and it actually proved useful in testing as I was able to turn off lzcnt and use popc and vice-versa :) > May be name new instructions *ZerosIvis instead of *ZerosI1 to be clear > that VIS is used. Done. > Indention in the new test is all over place. Please, fix. I've fixed it (I hope) and broken the test up into separate Integer and Long tests to be consistent with the rest of the BMI tests in that directory. I've run the jtreg bmi tests on Solaris 12 SPARC and x86 and am awaiting the results of a jprt (hotspot) run on all platforms. 
The code review is in the same place as before: >> http://cr.openjdk.java.net/~alanbur/8162865/ Thanks, Trevor From igor.nunes at eldorado.org.br Wed Nov 30 16:51:59 2016 From: igor.nunes at eldorado.org.br (Igor Henrique Soares Nunes) Date: Wed, 30 Nov 2016 16:51:59 +0000 Subject: [8u] request for approval: "8168318 : PPC64: Use cmpldi instead of li/cmpld" Message-ID: Hi all, Could you please approve the backport of the following ppc64-only improvement to jdk8u-dev: 8168318: PPC64: Use cmpldi instead of li/cmpld Bug: https://bugs.openjdk.java.net/browse/JDK-8168318 Webrev: https://igorsnunes.github.io/openjdk/webrev/8168318/ Review: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-October/024809.html URL: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/622d3fe587f2 Thank you and best regards, Igor Nunes From rob.mckenna at oracle.com Wed Nov 30 17:40:23 2016 From: rob.mckenna at oracle.com (Rob McKenna) Date: Wed, 30 Nov 2016 17:40:23 +0000 Subject: [8u] request for approval: "8168318 : PPC64: Use cmpldi instead of li/cmpld" In-Reply-To: References: Message-ID: <20161130174023.GA2448@vimes> Hi Igor, As this is an enhancement request, please follow the enhancement approval request process: http://openjdk.java.net/projects/jdk8u/enhancement-template.html http://openjdk.java.net/projects/jdk8u/groundrules.html -Rob On 30/11/16 04:51, Igor Henrique Soares Nunes wrote: > Hi all, > > Could you please approve the backport of the following ppc64-only improvement to jdk8u-dev: > > 8168318: PPC64: Use cmpldi instead of li/cmpld > > Bug: https://bugs.openjdk.java.net/browse/JDK-8168318 > Webrev: https://igorsnunes.github.io/openjdk/webrev/8168318/ > Review: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-October/024809.html > URL: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/622d3fe587f2 > > Thank you and best regards, > Igor Nunes > From jcowgill at debian.org Wed Nov 30 17:50:33 2016 From: jcowgill at debian.org (James Cowgill) Date: Wed, 
30 Nov 2016 17:50:33 +0000 Subject: JDK 9 fails to build on MIPS Message-ID: <53391318-5ee3-28d4-b7bd-a51037de6032@debian.org> Hi, Firstly I have never submitted anything to OpenJDK before, so apologies if I haven't done things the right way. I also have no bug number for this. OpenJDK 9 does not build on MIPS machines and hasn't for some time. This is due to code in hotspot which assumes NSIG <= 65, which is not the case on MIPS since MIPS has 127 signal numbers. I've attached an initial patch which converts the offending code in hotspot/src/os/linux/vm/jsig.c to use sigset_t instead of an array to store the used signals. I notice the AIX implementation of jsig.c already does this. Originally from: https://bugs.debian.org/841173 Thanks, James -------------- next part -------------- A non-text attachment was scrubbed... Name: mips-sigset-hotspot.diff Type: text/x-patch Size: 3570 bytes Desc: not available URL: From vladimir.kozlov at oracle.com Wed Nov 30 19:19:10 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 30 Nov 2016 11:19:10 -0800 Subject: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: <8e47a2d0-c823-4d74-89bf-831c08a8f10d@oracle.com> References: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> <8e47a2d0-c823-4d74-89bf-831c08a8f10d@oracle.com> Message-ID: <583F262E.9020604@oracle.com> Looks good. Only one small issue - the new test files should have only the 2016 year: * Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved. The changes will have to wait until the JDK 10 repo is open. This is an enhancement and we are done with new features in JDK 9 already. Thanks, Vladimir On 11/30/16 7:29 AM, Trevor Watson wrote: > Hi Vladimir, > > Thanks for the review. Comments inline below... > > On 22/11/16 20:04, Vladimir Kozlov wrote: >> Do you have performance numbers? > > I've spent a lot of time looking at performance and it's proving very difficult to precisely quantify either on a T5 or an S7.
However, overall, it would appear that using the native lzcnt > instruction is around 10% quicker than the current implementation which uses POPC. > >> UseVIS is too wide a flag to control only the generation of these instructions. >> >> To be consistent with the x86 code, please add the >> UseCountLeadingZerosInstruction flag to globals_sparc.hpp and its >> setting in vm_version_sparc.cpp (based on has_vis3()), similar to what is >> done for x86. > > I've done this and it actually proved useful in testing as I was able to turn off lzcnt and use popc and vice-versa :) > >> Maybe name the new instructions *ZerosIvis instead of *ZerosI1 to make clear >> that VIS is used. > > Done. > >> Indentation in the new test is all over the place. Please fix. > > I've fixed it (I hope) and broken the test up into separate Integer and Long tests to be consistent with the rest of the BMI tests in that directory. > > I've run the jtreg bmi tests on Solaris 12 SPARC and x86 and am awaiting the results of a jprt (hotspot) run on all platforms. > > The code review is in the same place as before: >>> http://cr.openjdk.java.net/~alanbur/8162865/ > > Thanks, > Trevor From thomas.stuefe at gmail.com Wed Nov 30 19:33:16 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 30 Nov 2016 20:33:16 +0100 Subject: JDK 9 fails to build on MIPS In-Reply-To: <53391318-5ee3-28d4-b7bd-a51037de6032@debian.org> References: <53391318-5ee3-28d4-b7bd-a51037de6032@debian.org> Message-ID: Hi James, In general I like your patch - we used sigset_t in the AIX port instead of masks and this would be a good cleanup for the other platforms too. But in this case, is the problem not that the mips signal.h header fails to define NSIG? We have NSIG and _NSIG. _NSIG seems to be the platform-dependent maximum including real-time signals. NSIG excludes real-time signals, and seems to be 32 (SIGRTMIN) on all Linux platforms I checked.
I may have searched incorrectly (I used http://lxr.free-electrons.com/ident?v=3.2&i=NSIG), but I found that NSIG was missing from signal.h on some architectures, mips being among them. I do not know why, but I would like to understand the reason. Do you define NSIG to be _NSIG? The VM currently does not use real-time signals, so NSIG should be sufficient. If NSIG is really missing on mips, then maybe defining it locally as SIGRTMIN would be a less invasive change. If we were to change the hand-written bitmask to sigset_t, we probably should also take a look at the arrays of length NSIG (sigact, sigflags, pending_signals) and the associated checks. This would be a bigger cleanup. --- Apart from all that, I'd suggest moving the sigset initialization in os_linux.cpp from the "__attribute__((constructor))" function to os::signal_init_pd(). I'd suggest a similar move for jsig.c, but do not see a suitable initialization function there. Maybe someone else has an idea? Thanks & Kind Regards, Thomas On Wed, Nov 30, 2016 at 6:50 PM, James Cowgill wrote: > Hi, > > Firstly I have never submitted anything to OpenJDK before so apologies > if I haven't done things the right way. I also have no bug number for this. > > OpenJDK 9 does not build on MIPS machines and hasn't for some time. This > is due to code in hotspot which assumes NSIG <= 65 which is not the case > on MIPS since MIPS has 127 signal numbers. > > I've attached an initial patch which converts the offending code in > hotspot/src/os/linux/vm/jsig.c to use sigset_t instead of an array to > store the used signals. I notice the AIX implementation of jsig.c > already does this.
> > Originally from: https://bugs.debian.org/841173 > > Thanks, > James > > > From max.ockner at oracle.com Wed Nov 30 19:57:00 2016 From: max.ockner at oracle.com (Max Ockner) Date: Wed, 30 Nov 2016 14:57:00 -0500 Subject: RFR(s): 8169206: TemplateInterpreter::_continuation_entry is never referenced Message-ID: <583F2F0C.7050206@oracle.com> Hello everyone! Please review this small fix which removes some dead code from the interpreter. TemplateInterpreter::_continuation_entry table and its accessor are never called from anywhere else. Bug: https://bugs.openjdk.java.net/browse/JDK-8169206 Webrev: http://cr.openjdk.java.net/~mockner/8169206.01/ Tested with java -version. Thanks, Max From frederic.parain at oracle.com Wed Nov 30 20:51:31 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Wed, 30 Nov 2016 15:51:31 -0500 Subject: RFR(s): 8169206: TemplateInterpreter::_continuation_entry is never referenced In-Reply-To: <583F2F0C.7050206@oracle.com> References: <583F2F0C.7050206@oracle.com> Message-ID: <8ea4d1ca-ece1-9e09-42bd-fd4cad7f2658@oracle.com> Looks good to me. Fred On 11/30/2016 02:57 PM, Max Ockner wrote: > Hello everyone! > > Please review this small fix which removes some dead code from the > interpreter. TemplateInterpreter::_continuation_entry table and its > accessor are never called from anywhere else. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8169206 > Webrev: http://cr.openjdk.java.net/~mockner/8169206.01/ > > Tested with java -version. > > Thanks, > Max > > > > From chf at redhat.com Wed Nov 30 21:59:56 2016 From: chf at redhat.com (Christine Flood) Date: Wed, 30 Nov 2016 16:59:56 -0500 (EST) Subject: Java heap size defaults when running with CGroups in Linux. 
In-Reply-To: <162383910.709983.1479672832499.JavaMail.zimbra@redhat.com> Message-ID: <321822099.1218801.1480543196266.JavaMail.zimbra@redhat.com> The problem is that when running the JVM inside of a cgroup, such as docker, the JVM bases its default heap parameters on the size of the whole machine's memory, not on the memory available to the container. This causes errors as discussed in this blog entry: http://matthewkwilliams.com/index.php/2016/03/17/docker-cgroups-memory-constraints-and-java-cautionary-tale/ Basically the JVM dies in a non-obvious manner. The solution I propose is to add a parameter -XX:+UseCGroupLimits to the JVM which states that you should look to the CGroup when calculating default heap sizes. Webrev is here: http://cr.openjdk.java.net/~andrew/rh1390708/webrev.01/ Christine From mikael.vidstedt at oracle.com Wed Nov 30 22:53:41 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 30 Nov 2016 14:53:41 -0800 Subject: Java heap size defaults when running with CGroups in Linux. In-Reply-To: <321822099.1218801.1480543196266.JavaMail.zimbra@redhat.com> References: <321822099.1218801.1480543196266.JavaMail.zimbra@redhat.com> Message-ID: Out of curiosity, why wouldn't this be the default behavior? That is, in which cases is it not a good idea to use the cgroup information when sizing the JVM? Cheers, Mikael > On Nov 30, 2016, at 1:59 PM, Christine Flood wrote: > > > The problem is that when running the JVM inside of a cgroup, such as docker, the JVM bases its default heap parameters on the size of the whole machine's memory, not on the memory available to the container. This causes errors as discussed in this blog entry: http://matthewkwilliams.com/index.php/2016/03/17/docker-cgroups-memory-constraints-and-java-cautionary-tale/ > > Basically the JVM dies in a non-obvious manner.
> > The solution I propose is to add a parameter -XX:+UseCGroupLimits to the JVM which states that you should look to the CGroup when calculating default heap sizes. > > Webrev is here: http://cr.openjdk.java.net/~andrew/rh1390708/webrev.01/ > > > Christine