From vladimir.kozlov at oracle.com Tue Nov 1 00:52:56 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 31 Oct 2016 17:52:56 -0700 Subject: [9] RFR(M) 8166416: [AOT] Integrate JDK build changes and launcher 'jaotc' for AOT compiler In-Reply-To: <7bb59402-ecba-692b-9718-16414fe17efd@redhat.com> References: <58114E0C.107@oracle.com> <58124A47.5010604@oracle.com> <581304B6.7030804@oracle.com> <9f52669e-75fe-7139-78cb-d28ae1aff0a7@redhat.com> <58137448.6040205@oracle.com> <23d2851f-027d-f142-e4d1-8c42e4a011f2@redhat.com> <7bb59402-ecba-692b-9718-16414fe17efd@redhat.com> Message-ID: <94ac122b-6ff8-d038-f0c5-fb8011a661dd@oracle.com> Thank you, Andrew. I fixed compiledIC_aarch64.cpp and updated the webrev in place. Thanks, Vladimir On 10/31/16 8:35 AM, Andrew Dinn wrote: > Hi Vladimir, > > On 31/10/16 11:38, Andrew Dinn wrote: >> On 28/10/16 16:52, Vladimir Kozlov wrote: >>> Thank you, Andrew, for verifying that the build changes do not break AArch64. >>> But it would be nice if you can also apply the Hotspot changes (revert >>> hs.make.webrev changes before that since hs.webrev has them): >>> >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> and the jaotc sources (which are located in the Hotspot repo): >>> >>> http://cr.openjdk.java.net/~kvn/aot/jaotc.webrev/ >> >> I tried this and found two missing changes to compiledIC_aarch64.cpp >> (basically a missing arg in each of two calls to find_stub() -- see >> below for diff). >> >> However, I then ran into the problem Volker saw: >> >> Compiling 15 files for jdk.attach >> /home/adinn/openjdk/hs/hotspot/src/jdk.vm.ci/share/classes/module-info.java:40: >> error: module not found: jdk.vm.compiler >> jdk.vm.compiler; >> ^ >> /home/adinn/openjdk/hs/hotspot/src/jdk.vm.ci/share/classes/module-info.java:43: >> error: module not found: jdk.vm.compiler >> jdk.vm.compiler; >> >> . . . 
>> >> I assume fixing this second problem requires me to clone the graal-core >> repo into my tree and then apply the graal.webrev patch and rebuild. > > I cloned and patched the graal-core/graal tree and then copied it into > my hotspot space as follows > > $ cp /path/to/graal-core/graal \ > /otherpath/to/hs/hotspot/src/share/classes/jdk.vm.compiler > > With this and the extra tweaks to compiledIC_aarch64.cpp mentioned in > the previous reply I managed to build a slowdebug image which > successfully ran 'java Hello' and 'javac Hello.java'. > > Andrew Haley is currently trying to get Graal itself to run on AArch64. > So, this is probably good enough for now to confirm the acceptability of > the hs and jaotc change sets. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From coleen.phillimore at oracle.com Tue Nov 1 01:35:21 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Mon, 31 Oct 2016 21:35:21 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <58115536.5080205@oracle.com> References: <58115536.5080205@oracle.com> Message-ID: I looked at the runtime code and it looks fine to me. I'm pleased the changes were not more invasive. Some minor questions and nits: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/code/nmethod.cpp.udiff.html *+ virtual void set_to_interpreted(methodHandle method, CompiledICInfo& info) {* Can you pass methodHandle by const reference so that the copy constructor and destructor aren't called? http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/oops/methodCounters.hpp.udiff.html Why does this add a Method* pointer for #ifndef AOT code? This could be a lot of additional footprint. 
http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/globals.hpp.udiff.html Why are the AOT parameters in two separate sections? The intx ones should be defined with a valid range. http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/vmStructs.cpp.udiff.html Why is this added and the SA code fixed? AOT doesn't use the SA, does it? Was it added for debugging? Thanks, Coleen On 10/26/16 9:15 PM, Vladimir Kozlov wrote: > AOT JEP: > https://bugs.openjdk.java.net/browse/JDK-8166089 > Subtask: > https://bugs.openjdk.java.net/browse/JDK-8166415 > Webrev: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ > > Please review Hotspot VM part of AOT changes. > > Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot > will be built only on Linux/x64. > > AOT code is NOT linked during AOT libraries load as it happens with > normal .so libraries. AOT code entry points are not exposed (not > global) in AOT libraries. Only class data has global labels which we > look for with dlsym(klass_name). > > AOT-compiled code in AOT libraries is treated by JVM as *extension* of > existing CodeCache. When a java class is loaded JVM looks if > corresponding AOT-compiled methods exist in loaded AOT libraries and > adds links to them from java method descriptors (we have new field > Method::_aot_code). AOT-compiled code follows the same > invocation/deoptimization/unloading rules as normal JIT-compiled code. > > Calls in AOT code use the same method resolution runtime code as > calls in JITed code. The difference is that the call's destination address is > loaded indirectly because we can't patch AOT code - it is immutable > (to share between multiple JVM instances). > > Classes and Strings referenced in AOT code are resolved lazily by > calling into runtime. All mutable pointers (oops (mostly strings), > metadata) are stored and modified in a separate mutable memory (GOT > cells) - they are not embedded into AOT code. 
> > Changes include klass fingerprint generation since we need it to find > correct klass data in loaded AOT libraries. > > Thanks, > Vladimir From stefan.karlsson at oracle.com Tue Nov 1 07:28:06 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 1 Nov 2016 08:28:06 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <58115536.5080205@oracle.com> References: <58115536.5080205@oracle.com> Message-ID: <511061ab-e70d-970b-f8e3-67d87a894099@oracle.com> Hi Vladimir, I just took a quick look at the GC code. 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. Some examples: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html #include "utilities/debug.hpp" #include "utilities/macros.hpp" *+ #include "aot/aotLoader.hpp"* http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html #include "gc/g1/g1Policy.hpp" #include "gc/g1/g1RootClosures.hpp" #include "gc/g1/g1RootProcessor.hpp" #include "gc/g1/heapRegion.inline.hpp" #include "memory/allocation.inline.hpp" *+ #include "aot/aotLoader.hpp"* #include "runtime/fprofiler.hpp" #include "runtime/mutex.hpp" #include "services/management.hpp" 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); *+ if (UseAOT) {* *+ AOTLoader::oops_do(adjust_pointer_closure());* *+ }* StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); Would be: CodeBlobToOopClosure 
adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); *+ AOTLoader::oops_do(adjust_pointer_closure());* StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html // Reserved area char* low_boundary() const { return _low_boundary; } char* high_boundary() const { return _high_boundary; } *+ void set_low_boundary(char *p) { _low_boundary = p; }* *+ void set_high_boundary(char *p) { _high_boundary = p; }* *+ void set_low(char *p) { _low = p; }* *+ void set_high(char *p) { _high = p; }* *+ * bool special() const { return _special; } These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. Thanks, StefanK On 27/10/16 03:15, Vladimir Kozlov wrote: > AOT JEP: https://bugs.openjdk.java.net/browse/JDK-8166089 Subtask: > https://bugs.openjdk.java.net/browse/JDK-8166415 Webrev: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ Please review Hotspot > VM part of AOT changes. Only Linux/x64 platform is supported. 
'jaotc' > and AOT part of Hotspot will be built only on Linux/x64. AOT code is > NOT linked during AOT libraries load as it happens with normal .so > libraries. AOT code entry points are not exposed (not global) in AOT > libraries. Only class data has global labels which we look for with > dlsym(klass_name). AOT-compiled code in AOT libraries is treated by > JVM as *extension* of existing CodeCache. When a java class is loaded > JVM looks if corresponding AOT-compiled methods exist in loaded AOT > libraries and adds links to them from java method descriptors (we have > new field Method::_aot_code). AOT-compiled code follows the same > invocation/deoptimization/unloading rules as normal JIT-compiled code. > Calls in AOT code use the same method resolution runtime code as > calls in JITed code. The difference is that the call's destination address is > loaded indirectly because we can't patch AOT code - it is immutable > (to share between multiple JVM instances). Classes and Strings > referenced in AOT code are resolved lazily by calling into runtime. > All mutable pointers (oops (mostly strings), metadata) are stored and > modified in a separate mutable memory (GOT cells) - they are not > embedded into AOT code. Changes include klass fingerprint generation > since we need it to find correct klass data in loaded AOT libraries. > Thanks, Vladimir From stefan.karlsson at oracle.com Tue Nov 1 07:38:39 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 1 Nov 2016 08:38:39 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <58115536.5080205@oracle.com> References: <58115536.5080205@oracle.com> Message-ID: (resending without formatting) Hi Vladimir, I just took a quick look at the GC code. 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. 
Some examples: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html #include "utilities/debug.hpp" #include "utilities/macros.hpp" + #include "aot/aotLoader.hpp" http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html #include "gc/g1/g1Policy.hpp" #include "gc/g1/g1RootClosures.hpp" #include "gc/g1/g1RootProcessor.hpp" #include "gc/g1/heapRegion.inline.hpp" #include "memory/allocation.inline.hpp" + #include "aot/aotLoader.hpp" #include "runtime/fprofiler.hpp" #include "runtime/mutex.hpp" #include "services/management.hpp" 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); + if (UseAOT) { + AOTLoader::oops_do(adjust_pointer_closure()); + } StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); Would be: CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); CodeCache::blobs_do(&adjust_from_blobs); + AOTLoader::oops_do(adjust_pointer_closure()); StringTable::oops_do(adjust_pointer_closure()); ref_processor()->weak_oops_do(adjust_pointer_closure()); PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 
4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html // Reserved area char* low_boundary() const { return _low_boundary; } char* high_boundary() const { return _high_boundary; } + void set_low_boundary(char *p) { _low_boundary = p; } + void set_high_boundary(char *p) { _high_boundary = p; } + void set_low(char *p) { _low = p; } + void set_high(char *p) { _high = p; } + bool special() const { return _special; } These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. Thanks, StefanK On 27/10/16 03:15, Vladimir Kozlov wrote: > AOT JEP: > https://bugs.openjdk.java.net/browse/JDK-8166089 > Subtask: > https://bugs.openjdk.java.net/browse/JDK-8166415 > Webrev: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ > > Please review Hotspot VM part of AOT changes. > > Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot > will be built only on Linux/x64. > > AOT code is NOT linked during AOT libraries load as it happens with > normal .so libraries. AOT code entry points are not exposed (not > global) in AOT libraries. Only class data has global labels which we > look for with dlsym(klass_name). > > AOT-compiled code in AOT libraries is treated by JVM as *extension* of > existing CodeCache. When a java class is loaded JVM looks if > corresponding AOT-compiled methods exist in loaded AOT libraries and > adds links to them from java method descriptors (we have > new field Method::_aot_code). 
AOT-compiled code follows the same > invocation/deoptimization/unloading rules as normal JIT-compiled code. > > Calls in AOT code use the same method resolution runtime code as > calls in JITed code. The difference is that the call's destination address is > loaded indirectly because we can't patch AOT code - it is immutable > (to share between multiple JVM instances). > > Classes and Strings referenced in AOT code are resolved lazily by > calling into runtime. All mutable pointers (oops (mostly strings), > metadata) are stored and modified in a separate mutable memory (GOT > cells) - they are not embedded into AOT code. > > Changes include klass fingerprint generation since we need it to find > correct klass data in loaded AOT libraries. > > Thanks, > Vladimir From coleen.phillimore at oracle.com Tue Nov 1 10:40:23 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Nov 2016 06:40:23 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> Message-ID: <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> 5. Thanks for pointing out the logging tags, Stefan. Yes, we would prefer adding "apt" and "fingerprint" and using the composition of existing tags for logging. Thanks, Coleen > On Nov 1, 2016, at 3:38 AM, Stefan Karlsson wrote: > > (resending without formatting) > > Hi Vladimir, > > I just took a quick look at the GC code. > > 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. 
> > Some examples: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html > > #include "utilities/debug.hpp" > #include "utilities/macros.hpp" > + #include "aot/aotLoader.hpp" > > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html > > #include "gc/g1/g1Policy.hpp" > #include "gc/g1/g1RootClosures.hpp" > #include "gc/g1/g1RootProcessor.hpp" > #include "gc/g1/heapRegion.inline.hpp" > #include "memory/allocation.inline.hpp" > + #include "aot/aotLoader.hpp" > #include "runtime/fprofiler.hpp" > #include "runtime/mutex.hpp" > #include "services/management.hpp" > > 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. > > For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + if (UseAOT) { > + AOTLoader::oops_do(adjust_pointer_closure()); > + } > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > Would be: > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + AOTLoader::oops_do(adjust_pointer_closure()); > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. 
Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html > > // Reserved area > char* low_boundary() const { return _low_boundary; } > char* high_boundary() const { return _high_boundary; } > > + void set_low_boundary(char *p) { _low_boundary = p; } > + void set_high_boundary(char *p) { _high_boundary = p; } > + void set_low(char *p) { _low = p; } > + void set_high(char *p) { _high = p; } > + > bool special() const { return _special; } > > These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? > > > 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html > > Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. > > Thanks, > StefanK > >> On 27/10/16 03:15, Vladimir Kozlov wrote: >> AOT JEP: >> https://bugs.openjdk.java.net/browse/JDK-8166089 >> Subtask: >> https://bugs.openjdk.java.net/browse/JDK-8166415 >> Webrev: >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >> >> Please review Hotspot VM part of AOT changes. >> >> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot will be built only on Linux/x64. >> >> AOT code is NOT linked during AOT libraries load as it happens with normal .so libraries. AOT code entry points are not exposed (not global) in AOT libraries. Only class data has global labels which we look for with dlsym(klass_name). >> >> AOT-compiled code in AOT libraries is treated by JVM as *extension* of existing CodeCache. 
When a java class is loaded JVM looks if corresponding AOT-compiled methods exist in loaded AOT libraries and adds links to them from java method descriptors (we have new field Method::_aot_code). AOT-compiled code follows the same invocation/deoptimization/unloading rules as normal JIT-compiled code. >> >> Calls in AOT code use the same method resolution runtime code as calls in JITed code. The difference is that the call's destination address is loaded indirectly because we can't patch AOT code - it is immutable (to share between multiple JVM instances). >> >> Classes and Strings referenced in AOT code are resolved lazily by calling into runtime. All mutable pointers (oops (mostly strings), metadata) are stored and modified in a separate mutable memory (GOT cells) - they are not embedded into AOT code. >> >> Changes include klass fingerprint generation since we need it to find correct klass data in loaded AOT libraries. >> >> Thanks, >> Vladimir > > From coleen.phillimore at oracle.com Tue Nov 1 15:14:02 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 1 Nov 2016 11:14:02 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> Message-ID: Sorry, my phone autocorrected. I meant tag aot composed with the others. Coleen Sent from my iPhone > On Nov 1, 2016, at 6:40 AM, Coleen Phillimore wrote: > > 5. Thanks for pointing out the logging tags, Stefan. Yes, we would prefer adding "apt" and "fingerprint" and using the composition of existing tags for logging. > Thanks > Coleen > > >> On Nov 1, 2016, at 3:38 AM, Stefan Karlsson wrote: >> >> (resending without formatting) >> >> Hi Vladimir, >> >> I just took a quick look at the GC code. >> >> 1) You need to go over the entire patch and fix all the include lines that were added. They are not sorted, as they should be. 
>> >> Some examples: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >> >> #include "utilities/debug.hpp" >> #include "utilities/macros.hpp" >> + #include "aot/aotLoader.hpp" >> >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html >> >> #include "gc/g1/g1Policy.hpp" >> #include "gc/g1/g1RootClosures.hpp" >> #include "gc/g1/g1RootProcessor.hpp" >> #include "gc/g1/heapRegion.inline.hpp" >> #include "memory/allocation.inline.hpp" >> + #include "aot/aotLoader.hpp" >> #include "runtime/fprofiler.hpp" >> #include "runtime/mutex.hpp" >> #include "services/management.hpp" >> >> 2) I'd prefer if the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. >> >> For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + if (UseAOT) { >> + AOTLoader::oops_do(adjust_pointer_closure()); >> + } >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> Would be: >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + AOTLoader::oops_do(adjust_pointer_closure()); >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> 3) aotLoader.hpp implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. 
Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html >> >> // Reserved area >> char* low_boundary() const { return _low_boundary; } >> char* high_boundary() const { return _high_boundary; } >> >> + void set_low_boundary(char *p) { _low_boundary = p; } >> + void set_high_boundary(char *p) { _high_boundary = p; } >> + void set_low(char *p) { _low = p; } >> + void set_high(char *p) { _high = p; } >> + >> bool special() const { return _special; } >> >> These seem unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections: low, middle, and high. The middle section might have a different alignment (large pages) than the others. Is this property still maintained when these functions are used? >> >> >> 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html >> >> Did you discuss with the Runtime team about the naming of these tags? The other class* tags were split up into multiple tags. For example, classload was changed to class,load. >> >> Thanks, >> StefanK >> >>> On 27/10/16 03:15, Vladimir Kozlov wrote: >>> AOT JEP: >>> https://bugs.openjdk.java.net/browse/JDK-8166089 >>> Subtask: >>> https://bugs.openjdk.java.net/browse/JDK-8166415 >>> Webrev: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> Please review Hotspot VM part of AOT changes. >>> >>> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot will be built only on Linux/x64. >>> >>> AOT code is NOT linked during AOT libraries load as it happens with normal .so libraries. AOT code entry points are not exposed (not global) in AOT libraries. Only class data has global labels which we look for with dlsym(klass_name). >>> >>> AOT-compiled code in AOT libraries is treated by JVM as *extension* of existing CodeCache. 
When a java class is loaded JVM looks if corresponding AOT-compiled methods exist in loaded AOT libraries and adds links to them from java method descriptors (we have new field Method::_aot_code). AOT-compiled code follows the same invocation/deoptimization/unloading rules as normal JIT-compiled code. >>> >>> Calls in AOT code use the same method resolution runtime code as calls in JITed code. The difference is that the call's destination address is loaded indirectly because we can't patch AOT code - it is immutable (to share between multiple JVM instances). >>> >>> Classes and Strings referenced in AOT code are resolved lazily by calling into runtime. All mutable pointers (oops (mostly strings), metadata) are stored and modified in a separate mutable memory (GOT cells) - they are not embedded into AOT code. >>> >>> Changes include klass fingerprint generation since we need it to find correct klass data in loaded AOT libraries. >>> >>> Thanks, >>> Vladimir > From trevor.d.watson at oracle.com Tue Nov 1 17:16:09 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Tue, 1 Nov 2016 17:16:09 +0000 Subject: Tests for lzcnt Message-ID: <23a928c5-704f-a3b6-2bdb-cba9bd424fdd@oracle.com> I'm working on https://bugs.openjdk.java.net/browse/JDK-8162865, which implements the inlining of LZCNT instructions on SPARC platforms that support it. I have the code implemented and have written a test-case that validates the values returned by Long.numberOfLeadingZeros() and Integer.numberOfLeadingZeros(). Looking through the hotspot tests, I see there is some testing of lzcnt in compiler/intrinsics/bmi. I've adapted TestLzcntI.java and TestLzcntL.java to check for the relevant SPARC feature as well as the x86/x64 feature. These tests work fine, but only validate that C2 generates the same results as interpreted code for a selection of random values. 
Whilst that is perfectly valid, I'd like to also be able to verify that the inlined code for the lz count for each power of 2 in an Integer and Long produces correct values (per my standalone test). Would the "bmi" directory be the appropriate place to add a new test like this or even hold a test which supports both x86/x64 and SPARC given that "bmi" appears to refer to some kind of x86/x64 cpu feature set? Or am I reading too much into "bmi" and it's just used here as a generic name? I hope that made sense. Thanks, Trevor From kirill.zhaldybin at oracle.com Tue Nov 1 17:30:09 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Tue, 1 Nov 2016 20:30:09 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part Message-ID: Dear all, Could you please review this fix for 8169003? I changed the parsing of the time string so that it no longer depends on the LC_NUMERIC locale; the test does not fail if a locale where the "floating point" is actually a comma is set. WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ CR: https://bugs.openjdk.java.net/browse/JDK-8169003 Thank you. Regards, Kirill From claes.redestad at oracle.com Tue Nov 1 21:22:45 2016 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 1 Nov 2016 22:22:45 +0100 Subject: RFR 8163553 java.lang.LinkageError from test java/lang/ThreadGroup/Stop.java In-Reply-To: <8C56B19F-22EC-4E1E-AA47-E0D629231B07@oracle.com> References: <5EA4A44D-3E66-4B76-8160-163580606FF1@oracle.com> <8C56B19F-22EC-4E1E-AA47-E0D629231B07@oracle.com> Message-ID: <581907A5.8020600@oracle.com> +1 /Claes On 2016-10-27 21:24, Paul Sandoz wrote: > Gentle reminder. > > Paul. 
> >> On 18 Oct 2016, at 11:41, Paul Sandoz wrote: >> >> Hi, >> >> Please review: >> >> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8163553-vh-mh-link-errors-not-wrapped/webrev/ >> >> This is the issue that motivated a change in the behaviour of indy wrapping Errors in BootstrapMethodError, JDK-8166974. I plan to push this issue with JDK-8166974 to hs, since they are related in behaviour even though there is no direct dependency between the patches. >> >> >> When invoking signature-polymorphic methods a similar but hardcoded dance occurs, with an appeal to Java code, to link the call site. >> >> - MethodHandle.invoke/invokeExact (and the VH methods) would wrap all Errors in LinkageError. Now they are passed through, thus an Error like ThreadDeath is not wrapped. >> >> - MethodHandle.invoke/invokeExact/invokeBasic throw Throwable, and in certain cases the Throwable is wrapped in an InternalError. In many other cases Error and RuntimeException are propagated, which I think in general is the right pattern, so I consistently applied that. >> >> - I updated StringConcatFactory to also pass through Errors and avoid unduly wrapping StringConcatException in another instance of StringConcatException. (LambdaMetafactory and associated classes required no changes.) >> >> Thanks, >> Paul. > From 1072213404 at qq.com Wed Nov 2 08:11:35 2016 From: 1072213404 at qq.com (恶灵骑士) Date: Wed, 2 Nov 2016 16:11:35 +0800 Subject: what's the function and difference between compiler c1 and c2? Message-ID: What's the function of and difference between the HotSpot compilers C1 and C2? About C1: I have found something about "volatile" and the methods LIRGenerator::do_StoreField(StoreField* x) and LIRGenerator::do_LoadField(LoadField* x) in "share/vm/c1/c1_LIRGenerator.cpp". When operating on a variable with a volatile qualifier, will it finally invoke the do_StoreField or do_LoadField method? If so, 
then by which method is do_LoadField or do_StoreField invoked?

From 1072213404 at qq.com Wed Nov 2 08:19:18 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Wed, 2 Nov 2016 16:19:18 +0800 Subject: =?gb18030?B?cGxlYXNlIGhlbHAgdW5kZXJzdGFuZGluZyB3aGF0?= =?gb18030?B?J3MgdGhlIGZ1bmN0aW9uIGFuZCBkaWZmcmVuY2Ug?= =?gb18030?B?YmV0d2VlbiBob3RzcG90IGNvbXBpbGVyIGMxIGFu?= =?gb18030?B?ZCBjMiCjvw==?= Message-ID: What's the function and difference between the HotSpot compilers C1 and C2? About C1: I have found something about "volatile" and the methods LIRGenerator::do_StoreField(StoreField* x) and LIRGenerator::do_LoadField(LoadField* x) in "share/vm/c1/c1_LIRGenerator.cpp". When operating on a variable with a volatile qualifier, will it finally invoke the do_StoreField or the do_LoadField method? If true, then by which method is do_LoadField or do_StoreField invoked?

From rednaxelafx at gmail.com Wed Nov 2 08:28:50 2016 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 2 Nov 2016 01:28:50 -0700 Subject: =?UTF-8?Q?Re=3A_please_help_understanding_what=27s_the_function_an?= =?UTF-8?Q?d_diffrence_between_hotspot_compiler_c1_and_c2_=EF=BC=9F?= In-Reply-To: References: Message-ID: Hi, I don't think I understand your question, but I'll take a shot. Are you trying to ask what the differences are between C1 and C2, with regards to how they handle volatile field accesses? For C1, yes, all Java field accesses (load/store) are represented in the HIR with LoadField and StoreField instructions. The ciField in these instructions would carry the information about whether the field is volatile or not. When lowering HIR to LIR, the LIRGenerator::do_LoadField() and do_StoreField() functions are called. What is it that you're trying to learn about these functions? For C2, it's a bit complicated, because volatile semantics involve the memory graph portion of C2's Sea-of-nodes IR. You may want to refer to [1] and [2] for some background information before you dive into the code.
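[Editorial note: to make the C1 case above concrete, here is a minimal hypothetical Java example. Each access to the volatile field below is the kind of access C1 models as a LoadField/StoreField HIR node whose ciField reports it as volatile, so do_LoadField()/do_StoreField() insert the required memory barriers during LIR lowering. The class and field names are illustrative only.]

```java
// Hypothetical illustration: the accesses to 'done' below are the kind of
// volatile field loads/stores that C1 represents as LoadField/StoreField
// HIR instructions and fences in do_LoadField()/do_StoreField().
class VolatileFlag {
    private volatile boolean done;   // volatile => ciField reports is_volatile()

    void finish() {                  // volatile store: barriers emitted around it
        done = true;
    }

    boolean isDone() {               // volatile load: acquire-style ordering
        return done;
    }

    public static void main(String[] args) {
        VolatileFlag flag = new VolatileFlag();
        flag.finish();
        System.out.println(flag.isDone());   // prints "true"
    }
}
```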
Hope it helps, Kris [1]: https://wiki.openjdk.java.net/display/HotSpot/Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation [2]: https://wiki.openjdk.java.net/display/HotSpot/C2+IR+Graph+and+Nodes On Wed, Nov 2, 2016 at 1:19 AM, ???? <1072213404 at qq.com> wrote: > what's the function and diffrence between hotspot compiler c1 and c2 ? > > > about c1? > i have found something about ?voaltile? and > methods LIRGenerator::do_StoreField(StoreField* x) and > LIRGenerator::do_LoadField(LoadField* x) in > ?share/vm/c1/c1_LIRGenerator.cpp?? > > > when operating a variable with a volatile qualifier?will it finally > invoke do_StoreField or do_LoadField method? > if true? then method do_LoadField or do_StoreField by which method?

From 1072213404 at qq.com Wed Nov 2 08:48:30 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Wed, 2 Nov 2016 16:48:30 +0800 Subject: =?gb18030?B?IHBsZWFzZSBoZWxwIHVuZGVyc3RhbmRpbmcgd2hh?= =?gb18030?B?dCdzIHRoZSByZWxhdGlvbnNoaXAgb2YgIGhvdHNw?= =?gb18030?B?b3QgY29tcGlsZXIgYzEgYW5kIGMyIKO/?= In-Reply-To: References: Message-ID: Thank you, Krystal. I think I need some time to make these things clear. And which part of the code links HIR and LIR together in OpenJDK? C1 matches client mode? C2 matches server mode? When some code is running, does just one of them work, or do both? ------------------ ???? ------------------ ???: "Krystal Mok";; ????: 2016?11?2?(???) ??4:28 ???: "????"<1072213404 at qq.com>; ??: "hotspot-dev"; ??: Re: please help understanding what's the function and diffrence between hotspot compiler c1 and c2 ? Hi, I don't think I understand your question, but I'll take a shot. Are you trying to ask what the differences are between C1 and C2, with regards to how they handle volatile field accesses? For C1, yes, all Java fields accesses (load/store) are represented in the HIR with LoadField and StoreField instructions.
The ciField in these instructions would carry the information about whether the field is volatile or not. When lowering HIR to LIR, the LIRGenerator::do_LoadField() and do_StoreField() functions are called. What is it that you're trying to learn about these functions? For C2, it's a bit complicated, because volatile semantics involve the memory graph portion of C2's Sea-of-nodes IR. You may want to refer to [1] and [2] for some background information before you dive into the code. Hope it helps, Kris [1]: https://wiki.openjdk.java.net/display/HotSpot/Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation [2]: https://wiki.openjdk.java.net/display/HotSpot/C2+IR+Graph+and+Nodes On Wed, Nov 2, 2016 at 1:19 AM, ???? <1072213404 at qq.com> wrote: what's the function and diffrence between hotspot compiler c1 and c2 ? about c1? i have found something about ?voaltile? and methods LIRGenerator::do_StoreField(StoreField* x) and LIRGenerator::do_LoadField(LoadField* x) in ?share/vm/c1/c1_LIRGenerator.cpp?? when operating a variable with a volatile qualifier?will it finally invoke do_StoreField or do_LoadField method? if true? then method do_LoadField or do_StoreField by which method? From rednaxelafx at gmail.com Wed Nov 2 08:54:10 2016 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 2 Nov 2016 01:54:10 -0700 Subject: =?UTF-8?Q?Re=3A_please_help_understanding_what=27s_the_relationshi?= =?UTF-8?Q?p_of_hotspot_compiler_c1_and_c2_=EF=BC=9F?= In-Reply-To: References: Message-ID: Hi, By "HIR" and "LIR", I specifically mean the "High-level IR" and "Low-level IR" for C1. In C1, the GraphBuilder is what parses Java bytecodes into HIR, and then the LIRGenerator is what lowers HIR into LIR, and finally the LIRAssembler is what encodes LIR into actual machine code. C1 is the "Client Compiler", and C2 is the "Server Compiler". In a HotSpot Client VM build (which is by default not available on 64-bit architectures), it only contains C1. 
In a JDK7+ HotSpot Server VM, the VM actually contains both C1 and C2 compilers. They can work together in what's called a "tiered compilation system", where methods can be interpreted first, then compiled by C1, and then further compiled by C2. In JDK7, -XX:+TieredCompilation is off by default, whereas in JDK8 it's on by default. - Kris On Wed, Nov 2, 2016 at 1:48 AM, ???? <1072213404 at qq.com> wrote: > > Thank you ?Krystal ? > i think i need some time to make these things light? > > and > which part of code link HIR and LIR together in openjdk ? > > c1 matches client mode? > c2 matches server mode? > when some code running? just one of them work or both do? > > > > > > > ------------------ ???? ------------------ > *???:* "Krystal Mok";; > *????:* 2016?11?2?(???) ??4:28 > *???:* "????"<1072213404 at qq.com>; > *??:* "hotspot-dev"; > *??:* Re: please help understanding what's the function and diffrence > between hotspot compiler c1 and c2 ? > > Hi, > > I don't think I understand your question, but I'll take a shot. > Are you trying to ask what the differences are between C1 and C2, with > regards to how they handle volatile field accesses? > > For C1, yes, all Java fields accesses (load/store) are represented in the > HIR with LoadField and StoreField instructions. The ciField in these > instructions would carry the information about whether the field is > volatile or not. > When lowering HIR to LIR, the LIRGenerator::do_LoadField() and > do_StoreField() functions are called. What is it that you're trying to > learn about these functions? > > For C2, it's a bit complicated, because volatile semantics involve the > memory graph portion of C2's Sea-of-nodes IR. You may want to refer to [1] > and [2] for some background information before you dive into the code.
> > Hope it helps, > Kris > > [1]: https://wiki.openjdk.java.net/display/HotSpot/ > Overview+of+Ideal%2C+C2%27s+high+level+intermediate+representation > [2]: https://wiki.openjdk.java.net/display/HotSpot/C2+IR+Graph+and+Nodes > > On Wed, Nov 2, 2016 at 1:19 AM, ???? <1072213404 at qq.com> wrote: > >> what's the function and diffrence between hotspot compiler c1 and c2 ? >> >> >> about c1? >> i have found something about ?voaltile? and >> methods LIRGenerator::do_StoreField(StoreField* x) and >> LIRGenerator::do_LoadField(LoadField* x) in >> ?share/vm/c1/c1_LIRGenerator.cpp?? >> >> >> when operating a variable with a volatile qualifier?will it finally >> invoke do_StoreField or do_LoadField method? >> if true? then method do_LoadField or do_StoreField by which method? > > >

From hyperdak at gmail.com Wed Nov 2 09:34:30 2016 From: hyperdak at gmail.com (=?UTF-8?B?5Lqi5Lyf5qWg?=) Date: Wed, 2 Nov 2016 17:34:30 +0800 Subject: =?UTF-8?Q?Re=3A_please_help_understanding_what=27s_the_relationshi?= =?UTF-8?Q?p_of_hotspot_compiler_c1_and_c2_=EF=BC=9F?= In-Reply-To: References: Message-ID: Hi, When tiered compilation is used (enabled by default in JDK 8), the tiered VM can use both C1 and C2 [1]. The Client VM will use C1 and the Server VM will use C2. Thanks, hyperdak [1] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#tieredcompilation

From 1072213404 at qq.com Wed Nov 2 09:40:13 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Wed, 2 Nov 2016 17:40:13 +0800 Subject: =?gb18030?B?u9i4tKO6IHBsZWFzZSBoZWxwIHVuZGVyc3RhbmRp?= =?gb18030?B?bmcgd2hhdCdzIHRoZSByZWxhdGlvbnNoaXAgb2Yg?= =?gb18030?B?aG90c3BvdCBjb21waWxlciBjMSBhbmQgYzIgo78=?= In-Reply-To: References: Message-ID: Hi, The Server VM will use C2; in this mode, which method processes 'volatile' operations, the way do_StoreField does in src/share/vm/c1/c1_LIRGenerator.cpp? Thank you! ------------------ ???? ------------------ ???: "???";; ????: 2016?11?2?(???)
??5:34 ???: "????"<1072213404 at qq.com>; ??: "Krystal Mok"; "hotspot-dev"; ??: Re: please help understanding what's the relationship of hotspot compiler c1 and c2 ? Hi, When use tiered compilation (default enable in jdk8),tiered VM can use C1 and C2 both [1].Client VM will use C1 and Server VM will use C2. Thanks, hyperdak [1] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#tieredcompilation

From aph at redhat.com Wed Nov 2 10:59:26 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 2 Nov 2016 10:59:26 +0000 Subject: =?UTF-8?B?UmU6IOWbnuWkje+8miBwbGVhc2UgaGVscCB1bmRlcnN0YW5kaW5nIHdo?= =?UTF-8?Q?at's_the_relationship_of_hotspot_compiler_c1_and_c2_=ef=bc=9f?= In-Reply-To: References: Message-ID: <3b13f1b8-e560-e666-d1a6-fde0ec474ce7@redhat.com> On 02/11/16 09:40, ???? wrote: > which method processes 'volatile' operations like method do_StoreField in src/share/vm/c1/c1_LIRGenerator.cpp?

It's done in line 1771:

    if (is_volatile && os::is_MP()) {
      __ membar_release();
    }

and 1793:

    if (!support_IRIW_for_not_multiple_copy_atomic_cpu && is_volatile && os::is_MP()) {
      __ membar();
    }

Andrew.

From vladimir.kozlov at oracle.com Thu Nov 3 02:51:12 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Nov 2016 19:51:12 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> Message-ID: <452951ff-f480-57a0-bf4b-def10a599424@oracle.com> Thank you, Coleen On 10/31/16 6:35 PM, Coleen Phillimore wrote: > > I looked at the runtime code and it looks fine to me. I'm pleased the > changes were not more invasive. Some minor questions and nits: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/code/nmethod.cpp.udiff.html > > > + virtual void set_to_interpreted(methodHandle method, CompiledICInfo& > info) { > > > Can you pass methodHandle by const reference so that the copy > constructor and destructor aren't called?
It was the original declaration for CompiledStaticCall::set_to_interpreted(): http://hg.openjdk.java.net/jdk9/hs/hotspot/file/031e87605d21/src/share/vm/code/compiledIC.hpp#l300 But your suggestion is good - I implemented it: set_to_interpreted(const methodHandle& method, I also have the same change for set_to_far(). > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/oops/methodCounters.hpp.udiff.html > > Why does this add a Method* pointer for #ifndef AOT code? This could > be a lot of additional footprint. Good catch. Put it under #if INCLUDE_AOT. > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/globals.hpp.udiff.html > > Why are the AOT parameters in two separate sections? The intx ones > should be defined with a valid range. We think the Tiered compilation flags should be together. I added the missing range(). > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/runtime/vmStructs.cpp.udiff.html > > Why is this added and the SA code fixed? AOT doesn't use the SA, does > it? Was it added for debugging? Yes, AOT does not use the SA. It is for debugging of core files, to correctly calculate the size of the instanceKlass structure - it depends on the presence of the fingerprint field: +// [EMBEDDED fingerprint] only if should_store_fingerprint()==true Ioi added that for the klass fingerprint code, which is part of the hotspot AOT changes: https://bugs.openjdk.java.net/browse/JDK-8165142 Thanks, Vladimir
AOT code entry points are not exposed (not >> global) in AOT libraries. Only class data has global labels which we >> look for with dlsym(klass_name). >> >> AOT-compiled code in AOT libraries is treated by JVM as *extension* of >> existing CodeCache. When a java class is loaded JVM looks if >> corresponding AOT-compiled methods exist in loaded AOT libraries and >> add links to them from java methods descriptors (we have new field >> Method::_aot_code). AOT-compiled code follows the same >> invocation/deoptimization/unloading rules as normal JIT-compiled code. >> >> Calls in AOT code use the same methods resolution runtime code as >> calls in JITed code. The difference is call's destination address is >> loaded indirectly because we can't patch AOT code - it is immutable >> (to share between multiple JVM instances). >> >> Classes and Strings referenced in AOT code are resolved lazily by >> calling into runtime. All mutable pointers (oops (mostly strings), >> metadata) are stored and modified in a separate mutable memory (GOT >> cells) - they are not embedded into AOT code. >> >> Changes includes klass fingerprint generation since we need it to find >> correct klass data in loaded AOT libraries. >> >> Thanks, >> Vladimir > From vladimir.kozlov at oracle.com Thu Nov 3 06:54:33 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Nov 2016 23:54:33 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> Message-ID: <3ec892ec-0a2b-da64-8ba3-c814efc1180e@oracle.com> Thank you, Stefan On 11/1/16 12:38 AM, Stefan Karlsson wrote: > (resending without formatting) > > Hi Vladimir, > > I just took a quick look at the GC code. > > 1) You need to go over the entire patch and fix all the include lines > that were added. They are are not sorted, as they should. Done. 
> > Some examples: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html > > > #include "utilities/debug.hpp" > #include "utilities/macros.hpp" > + #include "aot/aotLoader.hpp" > > > > 2) I'd prefer if the the check if AOT is enabled was folded into > AOTLoader::oops_do, so that the additions to the GC code would be less > conspicuous. Done. But I did not remove UseAOT from complex conditions, to avoid executing the checks that follow it, for example:

    + if (UseAOT && !_process_strong_tasks->is_task_claimed(GCH_PS_aot_oops_do)) {
    +   AOTLoader::oops_do(strong_roots);
    + }

> > For example: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html > > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), > CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + if (UseAOT) { > + AOTLoader::oops_do(adjust_pointer_closure()); > + } > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > Would be: > > CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), > CodeBlobToOopClosure::FixRelocations); > CodeCache::blobs_do(&adjust_from_blobs); > + AOTLoader::oops_do(adjust_pointer_closure()); > StringTable::oops_do(adjust_pointer_closure()); > ref_processor()->weak_oops_do(adjust_pointer_closure()); > PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); > > 3) aotLoader.hpp uses implements methods using GrowableArray. This will > expose the growable array functions to all includers of that file. > Please move all that code out to an aotLoader.inline.hpp file, and then > remove the unneeded includes from the aotLoader.hpp file. > Done.
> 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html > > > // Reserved area > char* low_boundary() const { return _low_boundary; } > char* high_boundary() const { return _high_boundary; } > > + void set_low_boundary(char *p) { _low_boundary = p; } > + void set_high_boundary(char *p) { _high_boundary = p; } > + void set_low(char *p) { _low = p; } > + void set_high(char *p) { _high = p; } > + > bool special() const { return _special; } > > These seems unsafe to me, but that might be because I don't understand > how this is used. VirtualSpace has three sections, the lower, middle, > and the high. The middle section might have another alignment (large > pages) than the others. Is this property still maintained when these > functions are used? This is used only by AOT code because it does not call VirtualSpace::initialize_with_granularity() when it creates an AOTCodeHeap (inherited from CodeHeap). There is no actual memory heap reservation for AOT code. We set those boundary values to the code section addresses in the AOT library. AOT does not use the alignment or the middle section. I put #if INCLUDE_AOT around these methods to make it clear where they are used. > > 5) > http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html > > > Did you discuss with the Runtime team about the naming of these tags? > The other class* tags where split up into multiple tags. For example, > classload was changed to class,load. I will reply to this in the following mail to Coleen. Thanks, Vladimir
>> >> AOT code is NOT linked during AOT libraries load as it happens with >> normal .so libraries. AOT code entry points are not exposed (not >> global) in AOT libraries. Only class data has global labels which we >> look for with dlsym(klass_name). >> >> AOT-compiled code in AOT libraries is treated by JVM as *extension* of >> existing CodeCache. When a java class is loaded JVM looks if >> corresponding AOT-compiled methods exist in loaded AOT libraries and >> add links to them from java methods descriptors (we have new field >> Method::_aot_code). AOT-compiled code follows the same >> invocation/deoptimization/unloading rules as normal JIT-compiled code. >> >> Calls in AOT code use the same methods resolution runtime code as >> calls in JITed code. The difference is call's destination address is >> loaded indirectly because we can't patch AOT code - it is immutable >> (to share between multiple JVM instances). >> >> Classes and Strings referenced in AOT code are resolved lazily by >> calling into runtime. All mutable pointers (oops (mostly strings), >> metadata) are stored and modified in a separate mutable memory (GOT >> cells) - they are not embedded into AOT code. >> >> Changes includes klass fingerprint generation since we need it to find >> correct klass data in loaded AOT libraries. >> >> Thanks, >> Vladimir > > From stefan.karlsson at oracle.com Thu Nov 3 07:57:27 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 3 Nov 2016 08:57:27 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <3ec892ec-0a2b-da64-8ba3-c814efc1180e@oracle.com> References: <58115536.5080205@oracle.com> <3ec892ec-0a2b-da64-8ba3-c814efc1180e@oracle.com> Message-ID: <3a21f4ec-3364-66e5-ab39-296e9db810cc@oracle.com> Thanks! StefanK On 03/11/16 07:54, Vladimir Kozlov wrote: > Thank you, Stefan > > On 11/1/16 12:38 AM, Stefan Karlsson wrote: >> (resending without formatting) >> >> Hi Vladimir, >> >> I just took a quick look at the GC code. 
>> >> 1) You need to go over the entire patch and fix all the include lines >> that were added. They are are not sorted, as they should. > > Done. > >> >> Some examples: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >> >> >> >> #include "utilities/debug.hpp" >> #include "utilities/macros.hpp" >> + #include "aot/aotLoader.hpp" >> >> >> >> 2) I'd prefer if the the check if AOT is enabled was folded into >> AOTLoader::oops_do, so that the additions to the GC code would be less >> conspicuous. > > Done. But I don't remove UseAOT for complex checks to avoid executing > following checks, like next: > > + if (UseAOT && > !_process_strong_tasks->is_task_claimed(GCH_PS_aot_oops_do)) { > + AOTLoader::oops_do(strong_roots); > + } > >> >> For example: >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >> >> >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >> CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + if (UseAOT) { >> + AOTLoader::oops_do(adjust_pointer_closure()); >> + } >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> >> Would be: >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >> CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + AOTLoader::oops_do(adjust_pointer_closure()); >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> >> 3) aotLoader.hpp uses implements methods using GrowableArray. This will >> expose the growable array functions to all includers of that file. 
>> Please move all that code out to an aotLoader.inline.hpp file, and then >> remove the unneeded includes from the aotLoader.hpp file. >> > > Done. > >> 4) >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html >> >> >> // Reserved area >> char* low_boundary() const { return _low_boundary; } >> char* high_boundary() const { return _high_boundary; } >> >> + void set_low_boundary(char *p) { _low_boundary = p; } >> + void set_high_boundary(char *p) { _high_boundary = p; } >> + void set_low(char *p) { _low = p; } >> + void set_high(char *p) { _high = p; } >> + >> bool special() const { return _special; } >> >> These seems unsafe to me, but that might be because I don't understand >> how this is used. VirtualSpace has three sections, the lower, middle, >> and the high. The middle section might have another alignment (large >> pages) than the others. Is this property still maintained when these >> functions are used? > > This is used only by AOT code because it does not call > VirtualSpace::initialize_with_granularity() when creates AOTCodeHeap > (inherited from CodeHeap). There is no actual memory heap reservation > for AOT code. We set those boundary values to code section addresses > in AOT library. AOT does not use alignment and middle section. > > I put #if INCLUDE_AOT around these methods to be clear where they are > used. > >> >> 5) >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html >> >> >> >> Did you discuss with the Runtime team about the naming of these tags? >> The other class* tags where split up into multiple tags. For example, >> classload was changed to class,load. > > Will reply to following Coleen's mail. 
> > Thanks, > Vladimir > >> >> Thanks, >> StefanK >> >> On 27/10/16 03:15, Vladimir Kozlov wrote: >>> AOT JEP: >>> https://bugs.openjdk.java.net/browse/JDK-8166089 >>> Subtask: >>> https://bugs.openjdk.java.net/browse/JDK-8166415 >>> Webrev: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> Please, review Hotspot VM part of AOT changes. >>> >>> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot >>> will be build only on Linux/x64. >>> >>> AOT code is NOT linked during AOT libraries load as it happens with >>> normal .so libraries. AOT code entry points are not exposed (not >>> global) in AOT libraries. Only class data has global labels which we >>> look for with dlsym(klass_name). >>> >>> AOT-compiled code in AOT libraries is treated by JVM as *extension* of >>> existing CodeCache. When a java class is loaded JVM looks if >>> corresponding AOT-compiled methods exist in loaded AOT libraries and >>> add links to them from java methods descriptors (we have new field >>> Method::_aot_code). AOT-compiled code follows the same >>> invocation/deoptimization/unloading rules as normal JIT-compiled code. >>> >>> Calls in AOT code use the same methods resolution runtime code as >>> calls in JITed code. The difference is call's destination address is >>> loaded indirectly because we can't patch AOT code - it is immutable >>> (to share between multiple JVM instances). >>> >>> Classes and Strings referenced in AOT code are resolved lazily by >>> calling into runtime. All mutable pointers (oops (mostly strings), >>> metadata) are stored and modified in a separate mutable memory (GOT >>> cells) - they are not embedded into AOT code. >>> >>> Changes includes klass fingerprint generation since we need it to find >>> correct klass data in loaded AOT libraries. 
>>> >>> Thanks, >>> Vladimir >> >>

From vladimir.kozlov at oracle.com Thu Nov 3 11:33:38 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Nov 2016 04:33:38 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> Message-ID: <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Done:

    java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so HelloWorld
    [0.060s][trace][aot,class,load] found java.lang.Object in ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000
    ...

I updated the webrev with all of your, Coleen's, and Stefan's suggestions: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ This is the delta of changes: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ Thanks, Vladimir On 11/1/16 3:40 AM, Coleen Phillimore wrote: > 5. Thanks for pointing out the logging tags Stefan. Yes we would prefer adding "aot" and "fingerprint" and using the composition of existing tags for logging. > Thanks > Coleen > > >> On Nov 1, 2016, at 3:38 AM, Stefan Karlsson wrote: >> >> (resending without formatting) >> >> Hi Vladimir, >> >> I just took a quick look at the GC code. >> >> 1) You need to go over the entire patch and fix all the include lines that were added. They are are not sorted, as they should.
>> >> Some examples: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >> >> #include "utilities/debug.hpp" >> #include "utilities/macros.hpp" >> + #include "aot/aotLoader.hpp" >> >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html >> >> #include "gc/g1/g1Policy.hpp" >> #include "gc/g1/g1RootClosures.hpp" >> #include "gc/g1/g1RootProcessor.hpp" >> #include "gc/g1/heapRegion.inline.hpp" >> #include "memory/allocation.inline.hpp" >> + #include "aot/aotLoader.hpp" >> #include "runtime/fprofiler.hpp" >> #include "runtime/mutex.hpp" >> #include "services/management.hpp" >> >> 2) I'd prefer if the the check if AOT is enabled was folded into AOTLoader::oops_do, so that the additions to the GC code would be less conspicuous. >> >> For example: http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + if (UseAOT) { >> + AOTLoader::oops_do(adjust_pointer_closure()); >> + } >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> Would be: >> >> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), CodeBlobToOopClosure::FixRelocations); >> CodeCache::blobs_do(&adjust_from_blobs); >> + AOTLoader::oops_do(adjust_pointer_closure()); >> StringTable::oops_do(adjust_pointer_closure()); >> ref_processor()->weak_oops_do(adjust_pointer_closure()); >> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >> >> 3) aotLoader.hpp uses implements methods using GrowableArray. This will expose the growable array functions to all includers of that file. 
Please move all that code out to an aotLoader.inline.hpp file, and then remove the unneeded includes from the aotLoader.hpp file. 4) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/virtualspace.hpp.udiff.html >> >> // Reserved area >> char* low_boundary() const { return _low_boundary; } >> char* high_boundary() const { return _high_boundary; } >> >> + void set_low_boundary(char *p) { _low_boundary = p; } >> + void set_high_boundary(char *p) { _high_boundary = p; } >> + void set_low(char *p) { _low = p; } >> + void set_high(char *p) { _high = p; } >> + >> bool special() const { return _special; } >> >> These seems unsafe to me, but that might be because I don't understand how this is used. VirtualSpace has three sections, the lower, middle, and the high. The middle section might have another alignment (large pages) than the others. Is this property still maintained when these functions are used? >> >> >> 5) http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/logging/logTag.hpp.udiff.html >> >> Did you discuss with the Runtime team about the naming of these tags? The other class* tags where split up into multiple tags. For example, classload was changed to class,load. >> >> Thanks, >> StefanK >> >>> On 27/10/16 03:15, Vladimir Kozlov wrote: >>> AOT JEP: >>> https://bugs.openjdk.java.net/browse/JDK-8166089 >>> Subtask: >>> https://bugs.openjdk.java.net/browse/JDK-8166415 >>> Webrev: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/ >>> >>> Please, review Hotspot VM part of AOT changes. >>> >>> Only Linux/x64 platform is supported. 'jaotc' and AOT part of Hotspot will be build only on Linux/x64. >>> >>> AOT code is NOT linked during AOT libraries load as it happens with normal .so libraries. AOT code entry points are not exposed (not global) in AOT libraries. Only class data has global labels which we look for with dlsym(klass_name). >>> >>> AOT-compiled code in AOT libraries is treated by JVM as *extension* of existing CodeCache. 
When a java class is loaded JVM looks if corresponding AOT-compiled methods exist in loaded AOT libraries and add links to them from java methods descriptors (we have new field Method::_aot_code). AOT-compiled code follows the same invocation/deoptimization/unloading rules as normal JIT-compiled code. >>> >>> Calls in AOT code use the same methods resolution runtime code as calls in JITed code. The difference is call's destination address is loaded indirectly because we can't patch AOT code - it is immutable (to share between multiple JVM instances). >>> >>> Classes and Strings referenced in AOT code are resolved lazily by calling into runtime. All mutable pointers (oops (mostly strings), metadata) are stored and modified in a separate mutable memory (GOT cells) - they are not embedded into AOT code. >>> >>> Changes includes klass fingerprint generation since we need it to find correct klass data in loaded AOT libraries. >>> >>> Thanks, >>> Vladimir >> >> > From stefan.karlsson at oracle.com Thu Nov 3 12:47:36 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 3 Nov 2016 13:47:36 +0100 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Message-ID: Hi Vladimir, On 03/11/16 12:33, Vladimir Kozlov wrote: > Done: > > java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so > HelloWorld > [0.060s][trace][aot,class,load] found java.lang.Object in > ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 > ... > > I updated webrev with your, Coleen, and Stefan all suggestions: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ > > this is delta of changes: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ Looks good to me. 
I noticed one nit: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.cpp.udiff.html #include "precompiled.hpp" *+ * *+ #include "aot/aotLoader.hpp"* *+ #include "aot/aotCodeHeap.hpp"* The lines should be swapped. Thanks, StefanK > > Thanks, > Vladimir > > On 11/1/16 3:40 AM, Coleen Phillimore wrote: >> 5. Thanks for pointing out the logging tags Stefan. Yes we would >> prefer adding "aot" and "fingerprint" and using the composition of >> existing tags for logging. >> Thanks >> Coleen >> >> >>> On Nov 1, 2016, at 3:38 AM, Stefan Karlsson >>> wrote: >>> >>> (resending without formatting) >>> >>> Hi Vladimir, >>> >>> I just took a quick look at the GC code. >>> >>> 1) You need to go over the entire patch and fix all the include >>> lines that were added. They are not sorted, as they should be. >>> >>> Some examples: >>> >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/memory/metaspace.cpp.udiff.html >>> >>> >>> #include "utilities/debug.hpp" >>> #include "utilities/macros.hpp" >>> + #include "aot/aotLoader.hpp" >>> >>> >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/g1/g1RootProcessor.cpp.udiff.html >>> >>> >>> #include "gc/g1/g1Policy.hpp" >>> #include "gc/g1/g1RootClosures.hpp" >>> #include "gc/g1/g1RootProcessor.hpp" >>> #include "gc/g1/heapRegion.inline.hpp" >>> #include "memory/allocation.inline.hpp" >>> + #include "aot/aotLoader.hpp" >>> #include "runtime/fprofiler.hpp" >>> #include "runtime/mutex.hpp" >>> #include "services/management.hpp" >>> >>> 2) I'd prefer if the check if AOT is enabled was folded into >>> AOTLoader::oops_do, so that the additions to the GC code would be >>> less conspicuous.
>>> >>> For example: >>> http://cr.openjdk.java.net/~kvn/aot/hs.webrev/src/share/vm/gc/parallel/psMarkSweep.cpp.udiff.html >>> >>> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >>> CodeBlobToOopClosure::FixRelocations); >>> CodeCache::blobs_do(&adjust_from_blobs); >>> + if (UseAOT) { >>> + AOTLoader::oops_do(adjust_pointer_closure()); >>> + } >>> StringTable::oops_do(adjust_pointer_closure()); >>> ref_processor()->weak_oops_do(adjust_pointer_closure()); >>> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >>> >>> >>> Would be: >>> >>> CodeBlobToOopClosure adjust_from_blobs(adjust_pointer_closure(), >>> CodeBlobToOopClosure::FixRelocations); >>> CodeCache::blobs_do(&adjust_from_blobs); >>> + AOTLoader::oops_do(adjust_pointer_closure()); >>> StringTable::oops_do(adjust_pointer_closure()); >>> ref_processor()->weak_oops_do(adjust_pointer_closure()); >>> PSScavenge::reference_processor()->weak_oops_do(adjust_pointer_closure()); >>> >>> >>> 3) aotLoader.hpp implements methods using GrowableArray. This >>> will expose the growable array functions to all includers of that >>> file. Please move all that code out to an aotLoader.inline.hpp file, >>> and then remove the unneeded includes from the aotLoader.hpp file. >>> >>> [...]
All mutable pointers (oops (mostly strings), >>>> metadata) are stored and modified in a separate mutable memory (GOT >>>> cells) - they are not embedded into AOT code. >>>> >>>> Changes includes klass fingerprint generation since we need it to >>>> find correct klass data in loaded AOT libraries. >>>> >>>> Thanks, >>>> Vladimir >>> >>> >> From bob.vandette at oracle.com Thu Nov 3 14:06:36 2016 From: bob.vandette at oracle.com (Bob Vandette) Date: Thu, 3 Nov 2016 10:06:36 -0400 Subject: RFR: 8167501 ARMv7 Linux C2 compiler crashes running jtreg harness on MP systems Message-ID: <8595EB1C-1E3C-44A8-AE88-71DBEA485A54@oracle.com> Please review this JDK9 work-around for a reliability problem causing crashes and hangs running jtreg on ARMv7 MP platforms using the server compiler. This work-around disables the use of quick-enter on ARM. This enhancement was previously disabled for AARCH64 binaries. This work-around has been independently verified by running jtreg on two different MP based ARM systems. https://bugs.openjdk.java.net/browse/JDK-8167501 diff --git a/src/share/vm/runtime/sharedRuntime.cpp b/src/share/vm/runtime/sharedRuntime.cpp --- a/src/share/vm/runtime/sharedRuntime.cpp +++ b/src/share/vm/runtime/sharedRuntime.cpp @@ -1983,8 +1983,10 @@ // Handles the uncommon case in locking, i.e., contention or an inflated lock. JRT_BLOCK_ENTRY(void, SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* lock, JavaThread* thread)) // Disable ObjectSynchronizer::quick_enter() in default config - // on AARCH64 until JDK-8153107 is resolved. - if (AARCH64_ONLY((SyncFlags & 256) != 0 &&) !SafepointSynchronize::is_synchronizing()) { + // on AARCH64 and ARM until JDK-8153107 is resolved. + if (ARM_ONLY((SyncFlags & 256) != 0 &&) + AARCH64_ONLY((SyncFlags & 256) != 0 &&) + !SafepointSynchronize::is_synchronizing()) { // Only try quick_enter() if we're not trying to reach a safepoint // so that the calling thread reaches the safepoint more quickly. 
if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return; The real problem will be investigated and fixed under this bug: https://bugs.openjdk.java.net/browse/JDK-8153107 Bob. From daniel.daugherty at oracle.com Thu Nov 3 14:32:25 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 3 Nov 2016 08:32:25 -0600 Subject: RFR: 8167501 ARMv7 Linux C2 compiler crashes running jtreg harness on MP systems In-Reply-To: <8595EB1C-1E3C-44A8-AE88-71DBEA485A54@oracle.com> References: <8595EB1C-1E3C-44A8-AE88-71DBEA485A54@oracle.com> Message-ID: <7a659d9f-9625-8b12-c3b2-4c8bc6032516@oracle.com> Thumbs up. Dan On 11/3/16 8:06 AM, Bob Vandette wrote: > Please review this JDK9 work-around for a reliability problem causing crashes and hangs > running jtreg on ARMv7 MP platforms using the server compiler. > > This work-around disables the use of quick-enter on ARM. This enhancement was > previously disabled for AARCH64 binaries. > > This work-around has been independently verified by running jtreg on two different MP > based ARM systems. > > https://bugs.openjdk.java.net/browse/JDK-8167501 > > diff --git a/src/share/vm/runtime/sharedRuntime.cpp b/src/share/vm/runtime/sharedRuntime.cpp > --- a/src/share/vm/runtime/sharedRuntime.cpp > +++ b/src/share/vm/runtime/sharedRuntime.cpp > @@ -1983,8 +1983,10 @@ > // Handles the uncommon case in locking, i.e., contention or an inflated lock. > JRT_BLOCK_ENTRY(void, SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock* lock, JavaThread* thread)) > // Disable ObjectSynchronizer::quick_enter() in default config > - // on AARCH64 until JDK-8153107 is resolved. > - if (AARCH64_ONLY((SyncFlags & 256) != 0 &&) !SafepointSynchronize::is_synchronizing()) { > + // on AARCH64 and ARM until JDK-8153107 is resolved. 
> + if (ARM_ONLY((SyncFlags & 256) != 0 &&) > + AARCH64_ONLY((SyncFlags & 256) != 0 &&) > + !SafepointSynchronize::is_synchronizing()) { > // Only try quick_enter() if we're not trying to reach a safepoint > // so that the calling thread reaches the safepoint more quickly. > if (ObjectSynchronizer::quick_enter(_obj, thread, lock)) return; > > > The real problem will be investigated and fixed under this bug: > > https://bugs.openjdk.java.net/browse/JDK-8153107 > > Bob. > > From coleen.phillimore at oracle.com Thu Nov 3 16:00:15 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 3 Nov 2016 12:00:15 -0400 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Message-ID: <034cc7f5-f99d-f507-a96a-d6b85b7d31d2@oracle.com> On 11/3/16 7:33 AM, Vladimir Kozlov wrote: > Done: > > java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so > HelloWorld > [0.060s][trace][aot,class,load] found java.lang.Object in > ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 > ... > > I updated webrev with your, Coleen, and Stefan all suggestions: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ > > this is delta of changes: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ Thank you for making these changes. I must have missed these CompiledIC => const methodHandle& changes when I went through a while ago. One minor change though: http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.hpp.udiff.html http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.cpp.udiff.html http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.hpp.udiff.html Since instanceKlassHandle is not a real handle, can you keep that passing by value? 
I have a script that removes them for jdk 10. Sorry for the confusion about these. The difference is that methodHandle has a copy constructor and destructor, so passing by const reference avoids copying them. instanceKlassHandle and KlassHandle are dummy now and don't have these so don't add to the code. I don't need to see another webrev. Thank you for fixing the logging. Coleen > > [...]
All mutable pointers (oops (mostly strings), >>>> metadata) are stored and modified in a separate mutable memory (GOT >>>> cells) - they are not embedded into AOT code. >>>> >>>> The changes include klass fingerprint generation since we need it to >>>> find the correct klass data in loaded AOT libraries. >>>> >>>> Thanks, >>>> Vladimir >>> >>> >> From vladimir.kozlov at oracle.com Thu Nov 3 19:04:27 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Nov 2016 12:04:27 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> Message-ID: Thank you, Stefan I fixed aotCodeHeap.cpp as you suggested. Vladimir On 11/3/16 5:47 AM, Stefan Karlsson wrote: > Hi Vladimir, > > On 03/11/16 12:33, Vladimir Kozlov wrote: >> Done: >> >> java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so >> HelloWorld >> [0.060s][trace][aot,class,load] found java.lang.Object in >> ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 >> ... >> >> I updated webrev with your, Coleen, and Stefan all suggestions: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ >> >> this is delta of changes: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ > > Looks good to me. > > I noticed one nit: > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.cpp.udiff.html > > > #include "precompiled.hpp" > *+ * > *+ #include "aot/aotLoader.hpp"* > *+ #include "aot/aotCodeHeap.hpp"* > > The lines should be swapped. > > Thanks, > StefanK > >> >> Thanks, >> Vladimir >> >> On 11/1/16 3:40 AM, Coleen Phillimore wrote: >>> 5. Thanks for pointing out the logging tags Stefan. Yes we would >>> prefer adding "aot" and "fingerprint" and using the composition of >>> existing tags for logging.
>>> Thanks >>> Coleen >>> >>>> [...]
>>>>> >>>>> Classes and Strings referenced in AOT code are resolved lazily by >>>>> calling into runtime. All mutable pointers (oops (mostly strings), >>>>> metadata) are stored and modified in a separate mutable memory (GOT >>>>> cells) - they are not embedded into AOT code. >>>>> >>>>> Changes includes klass fingerprint generation since we need it to >>>>> find correct klass data in loaded AOT libraries. >>>>> >>>>> Thanks, >>>>> Vladimir >>>> >>>> >>> > From vladimir.kozlov at oracle.com Thu Nov 3 19:17:14 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Nov 2016 12:17:14 -0700 Subject: [9] RFR(L) 8166413: Integrate VM part of AOT changes In-Reply-To: <034cc7f5-f99d-f507-a96a-d6b85b7d31d2@oracle.com> References: <58115536.5080205@oracle.com> <009CBB23-B815-4D73-8CC9-3F0BC4230455@oracle.com> <7f3a7f36-1119-df16-27e2-4b8c1b560363@oracle.com> <034cc7f5-f99d-f507-a96a-d6b85b7d31d2@oracle.com> Message-ID: Thank you, Coleen I reverted instanceKlassHandle changes. Vladimir On 11/3/16 9:00 AM, Coleen Phillimore wrote: > > > On 11/3/16 7:33 AM, Vladimir Kozlov wrote: >> Done: >> >> java -Xlog:aot+class+load=trace -XX:AOTLibrary=./libhelloworld.so >> HelloWorld >> [0.060s][trace][aot,class,load] found java.lang.Object in >> ./libhelloworld.so for classloader 0x7f1fb80eef30 tid=0x00007f1fb801b000 >> ... >> >> I updated webrev with your, Coleen, and Stefan all suggestions: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.2/ >> >> this is delta of changes: >> >> http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/ > > Thank you for making these changes. I must have missed these CompiledIC > => const methodHandle& changes when I went through a while ago. 
> > One minor change though: > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotCodeHeap.hpp.udiff.html > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.cpp.udiff.html > > http://cr.openjdk.java.net/~kvn/aot/hs.webrev.delta.2/src/share/vm/aot/aotLoader.hpp.udiff.html > > > Since instanceKlassHandle is not a real handle, can you keep that > passing by value? I have a script that removes them for jdk 10. Sorry > for the confusion about these. The difference is that methodHandle has > a copy constructor and destructor, so passing by const reference avoids > copying them. instanceKlassHandle and KlassHandle are dummy now and > don't have these so don't add to the code. > > I don't need to see another webrev. Thank you for fixing the logging. > > Coleen > > [...] From david.holmes at oracle.com Thu Nov 3 19:39:35 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 4 Nov 2016 05:39:35 +1000 Subject: Re: Reply: please help understanding what's the relationship of hotspot compiler c1 and c2? In-Reply-To: References: Message-ID: <6554c154-0fb7-1fa1-169d-57c58df57e6b@oracle.com> On 2/11/2016 7:40 PM, 恶灵骑士 wrote: > Hi, > Server VM will use C2, in this mode > which method processes 'volatile' operations like method do_StoreField in src/share/vm/c1/c1_LIRGenerator.cpp?
C2 definitions are in the .ad files (that get fed into Adlc to generate the compiler implementation). E.g. hotspot/src/cpu/x86/vm/x86_32.ad:

// Atomically load the volatile long
enc_class enc_loadL_volatile( memory mem, stackSlotL dst ) %{
  emit_opcode(cbuf,0xDF);
  int rm_byte_opcode = 0x05;
  int base = $mem$$base;
  int index = $mem$$index;
  int scale = $mem$$scale;
  int displace = $mem$$disp;
  relocInfo::relocType disp_reloc = $mem->disp_reloc(); // disp-as-oop when working with static globals
  encode_RegMem(cbuf, rm_byte_opcode, base, index, scale, displace, disp_reloc);
  store_to_stackslot( cbuf, 0x0DF, 0x07, $dst$$disp );
%}

David
-----

> Thank you !
> ------------------ ???? ------------------
> ???: "???";;
> ????: 2016?11?2?(???) ??5:34
> ???: "????"<1072213404 at qq.com>;
> ??: "Krystal Mok"; "hotspot-dev";
> ??: Re: please help understanding what's the relationship of hotspot compiler c1 and c2 ?
>
> Hi,
> When tiered compilation is used (enabled by default in JDK 8), the tiered VM can use both C1 and C2 [1]. The Client VM will use C1 and the Server VM will use C2.
> Thanks,
> hyperdak
>
> [1] http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#tieredcompilation

From 1072213404 at qq.com Fri Nov 4 06:02:12 2016
From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=)
Date: Fri, 4 Nov 2016 14:02:12 +0800
Subject: help me understanding how volatile field of java was executed on hotspot
Message-ID:

For now, I know that when operating on a volatile field, it may be handled by the methods below in HotSpot:

interpreter: hotspot/src/share/vm/interpreter/templateTable.hpp
  static void getfield_or_static(int byte_no, bool is_static);
  static void putfield_or_static(int byte_no, bool is_static);

C1: hotspot/src/share/vm/c1/c1_LIRGenerator.cpp
  void LIRGenerator::do_LoadField(LoadField* x)
  void LIRGenerator::do_StoreField(StoreField* x)

C2: hotspot/src/share/vm/opto/parse.hpp
  void do_get_xxx(Node* obj, ciField* field, bool is_field);
  void do_put_xxx(Node* obj, ciField* field, bool is_field);

Is there some official doc describing these methods? And is there some way to prove that volatile field accesses are actually executed by the above three sets of methods?

Thank you!
Arron

From yang.zhang at linaro.org Fri Nov 4 07:07:53 2016
From: yang.zhang at linaro.org (Yang Zhang)
Date: Fri, 4 Nov 2016 15:07:53 +0800
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
Message-ID:

Hi,

The jdk9/hs/hotspot native libs for jtreg build failed after the push of http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/308a53dd5aee

Build command: make test-image-hotspot-jtreg-native

Could someone please help to fix it? The reason is that the dl library isn't found.
I think the following change could fix that:

------
diff --git a/make/test/JtregNative.gmk b/make/test/JtregNative.gmk
index 78e78d7..95b5747 100644
--- a/make/test/JtregNative.gmk
+++ b/make/test/JtregNative.gmk
@@ -91,7 +91,7 @@ ifeq ($(OPENJDK_TARGET_OS), linux)
   BUILD_HOTSPOT_JTREG_LIBRARIES_LDFLAGS_libtest-rwx := -z execstack
   BUILD_HOTSPOT_JTREG_EXECUTABLES_LIBS_exeinvoke := -ljvm -lpthread
   BUILD_TEST_invoke_exeinvoke.c_OPTIMIZATION := NONE
-  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -ldl
+  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -Wl,--no-as-needed -ldl
 endif
 ifeq ($(OPENJDK_TARGET_OS), windows)
------

Regards
Yang

From aph at redhat.com Fri Nov 4 09:18:01 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 4 Nov 2016 09:18:01 +0000
Subject: help me understanding how volatile field of java was executed on hotspot
In-Reply-To: References: Message-ID:

On 04/11/16 06:02, ???? wrote:
> is there some official doc describing these methods?

There's the source code that you're looking at.

> and is there some way to prove volatile fields are actually executed by the above three methods?

Beyond looking at the code, no. There must be an important reason that you're asking this, and we'd be happy to help if you told us.

Andrew.

From peter.hofer at jku.at Fri Nov 4 10:00:38 2016
From: peter.hofer at jku.at (Peter Hofer)
Date: Fri, 4 Nov 2016 11:00:38 +0100
Subject: Contribution: Lock Contention Profiler for HotSpot
Message-ID:

Hello everyone,

we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community.

Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur.
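To make the kind of event concrete, here is a minimal Java workload (a hypothetical illustration, not taken from the patches; the class name is invented) in which two threads repeatedly contend on a single monitor, producing exactly the contended-acquire and contended-release events described above:

```java
// Hypothetical workload (not from the patches): two threads repeatedly
// contend on one monitor. Each acquisition that blocks and each release
// of the contended lock corresponds to the events the profiler records,
// together with the stack traces where they happen.
public class ContendedMonitor {
    private static final Object LOCK = new Object();
    private static long counter = 0;

    static long run(int iterations) throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                synchronized (LOCK) { // blocks (contended enter) while the other thread holds LOCK
                    counter++;        // the monitor makes this increment mutually exclusive
                }                     // contended release wakes a blocked thread
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100_000)); // prints 200000
    }
}
```

Under the patched VM with -XX:+EnableEventTracing, each blocked synchronized entry and the matching release of the contended lock would show up in the recorded trace with its stack trace.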
We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks. Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc". 
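As a counterpart for the java.util.concurrent side of Patch 1, a contended ReentrantLock version of the same kind of workload (again a hypothetical sketch; the class name is invented):

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical workload (not from the patches): the same contended
// counter, but guarded by a java.util.concurrent ReentrantLock, whose
// contended acquisitions go through park()/unpark() rather than the
// VM's monitor slow path.
public class ContendedReentrantLock {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static long counter = 0;

    static long run(int iterations) throws InterruptedException {
        counter = 0;
        Runnable work = () -> {
            for (int i = 0; i < iterations; i++) {
                LOCK.lock();       // may park() this thread while the lock is held elsewhere
                try {
                    counter++;
                } finally {
                    LOCK.unlock(); // may unpark() a waiting thread
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100_000)); // prints 200000
    }
}
```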
More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/. Kind regards, Peter Hofer -- Peter Hofer Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software University of Linz [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE?16), Delft, Netherlands, 2016. From adinn at redhat.com Fri Nov 4 11:14:50 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 4 Nov 2016 11:14:50 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: Hi Peter, On 04/11/16 10:00, Peter Hofer wrote: > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > . . . This sounds very interesting. > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. Please find a free > download of the paper on our website: Have you measured the overhead this change produces when running with contention detection disabled? (i.e. do we pay to have this feature even when we don't use it). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aaron.grunthal at infinite-source.de Fri Nov 4 11:31:34 2016 From: aaron.grunthal at infinite-source.de (Aaron Grunthal) Date: Fri, 4 Nov 2016 12:31:34 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <051cbc05-24e7-f83e-e7e7-2a057f07cd76@infinite-source.de> I think for lock contention the distribution of the blocking time is of interest. Can the profiler show that or just the cumulative time? Most profilers only record the sum, which is useful for optimizing throughput bottlenecks, but when optimizing for latency the CDF also is of interest since some methods can have vastly different average and worst case behaviors which can get obscured in the averages. - Aaron On 04.11.2016 11:00, Peter Hofer wrote: > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > > Our profiler records an event when a thread fails to acquire a contended > lock and also when a thread releases a contended lock. It further > efficiently records the stack traces where these events occur. We > devised a versatile visualization tool that analyzes the recorded events > and determines when and where threads _cause_ contention by holding a > contended lock. The visualization tool can show the contention by stack > trace, by lock, by lock class, by thread, and by any combination of > those aspects. > > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. 
Please find a free > download of the paper on our website: >> http://mevss.jku.at/lct/ > > I contribute this work on behalf of Dynatrace Austria (the sponsor of > this research), my colleagues David Gnedt and Andreas Schoergenhumer, > and myself. The necessary OCAs have already been submitted. > > We provide two patches: > > Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we > described and evaluated in our paper, plus minor improvements. It > records events for Java intrinsic locks (monitors) and for > java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). > We support only Linux on 64-bit x86 hardware. > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ > > Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal > native locks only. We consider this to be useful for HotSpot developers > to find locking bottlenecks in HotSpot itself. We tested this patch only > on Linux on 64-bit x86 hardware, but it should require few changes for > other platforms. > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ >> >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ >> > > With both patches, the profiler is enabled with -XX:+EnableEventTracing. > By default, an uncompressed event trace is written to file "output.trc". > > More detailed usage information and a download of the corresponding > visualization tool is available on our website, http://mevss.jku.at/lct/. > > Kind regards, > Peter Hofer > > > -- > Peter Hofer > Christian Doppler Laboratory on Monitoring and Evolution of > Very-Large-Scale Software Systems / Institute for System Software > University of Linz > > > [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter > Moessenboeck. Efficient Tracing and Versatile Analysis of Lock > Contention in Java Applications on the Virtual Machine Level. 
> Proceedings of the 7th ACM/SPEC International Conference on Performance > Engineering (ICPE?16), Delft, Netherlands, 2016. > From peter.hofer at jku.at Fri Nov 4 12:04:12 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Fri, 4 Nov 2016 13:04:12 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> Hi Andrew, On 11/04/2016 12:14 PM, Andrew Dinn wrote: >> We described our profiler in more detail in a research paper at ICPE >> 2016. [1] In our evaluation, we found that the overhead is typically >> below 10% for common multi-threaded Java benchmarks. Please find a free >> download of the paper on our website: > > Have you measured the overhead this change produces when running with > contention detection disabled? (i.e. do we pay to have this feature even > when we don't use it). We measured only the overhead relative to an unmodified OpenJDK build. Our profiler observes only lock contention, which is generally handled via slow paths in the VM code, so this is where we added the code to record events. I don't expect this code to cause much overhead when disabled. However, we added fields to several data structures, which might make a difference. I'll run some more benchmarks and report my findings. Cheers, Peter From peter.hofer at jku.at Fri Nov 4 12:59:58 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Fri, 4 Nov 2016 13:59:58 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <051cbc05-24e7-f83e-e7e7-2a057f07cd76@infinite-source.de> References: <051cbc05-24e7-f83e-e7e7-2a057f07cd76@infinite-source.de> Message-ID: <5c925ca4-27ef-ddb6-fb92-b91297e9b676@jku.at> Hi Aaron, On 11/04/2016 12:31 PM, Aaron Grunthal wrote: > I think for lock contention the distribution of the blocking time is > of interest. Can the profiler show that or just the cumulative time? 
> > Most profilers only record the sum, which is useful for optimizing > throughput bottlenecks, but when optimizing for latency the CDF also > is of interest since some methods can have vastly different average > and worst case behaviors which can get obscured in the averages. Our visualization tool currently shows only the cumulative contention times for each stack trace, lock, lock class, thread, or any combination of those aspects. However, individual blocking times could be computed from the events that the profiler records. These times could also be computed from the lock owner thread's perspective, i.e., the time from when the owned lock becomes contended until the thread releases the lock. Individual blocking times would only work well for monitors (and native monitors) though. With java.util.concurrent locks, we observe individual park()/unpark() calls. A thread that cannot acquire a lock may call park() more than once, and we cannot distinguish this from when a thread tries to acquire a lock multiple times and calls park() once each time. We would likely need bytecode instrumentation to group multiple park() calls that are part of a single lock acquisition and use the duration from the first park() call to the return of the last park() call as the blocking time. Cheers, Peter > On 04.11.2016 11:00, Peter Hofer wrote: >> Hello everyone, >> >> we are researchers at the University of Linz and have worked on a >> lock contention profiler that is built into HotSpot. We would like >> to contribute this work to the OpenJDK community. >> >> Our profiler records an event when a thread fails to acquire a >> contended lock and also when a thread releases a contended lock. It >> further efficiently records the stack traces where these events >> occur. We devised a versatile visualization tool that analyzes the >> recorded events and determines when and where threads _cause_ >> contention by holding a contended lock. 
The visualization tool can >> show the contention by stack trace, by lock, by lock class, by >> thread, and by any combination of those aspects. >> >> [...] From adinn at redhat.com Fri Nov 4 14:21:31 2016 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 4 Nov 2016 14:21:31 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> References: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> Message-ID: <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> On 04/11/16 12:04, Peter Hofer wrote: . . . >> Have you measured the overhead this change produces when running with >> contention detection disabled? (i.e. do we pay to have this feature even >> when we don't use it). > > We measured only the overhead relative to an unmodified OpenJDK build. > > Our profiler observes only lock contention, which is generally handled > via slow paths in the VM code, so this is where we added the code to > record events. I don't expect this code to cause much overhead when > disabled. However, we added fields to several data structures, which > might make a difference. Yes, increased footprint (in code as well as object space) would be as much a concern as increased execution time. > I'll run some more benchmarks and report my findings. Thanks very much. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From erik.joelsson at oracle.com Fri Nov 4 14:22:46 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 4 Nov 2016 15:22:46 +0100 Subject: RFR: JDK-8169255: Link gtestLauncher statically if libjvm is configured for static linking Message-ID: In the build, we have a global setting for linking libstdc++ static or dynamic on Linux. All libraries and executables that go in the product honor this setting. The gtestLauncher currently doesn't. 
This causes trouble in testing, where some machines might not have the 32-bit libstdc++.so installed. Since installing that library is not needed for just running the product, it's awkward to have to install it to run certain tests.

This patch adds the LIBCXX flags from configure when linking gtestLauncher. The resulting file actually comes out a little bit smaller, so there is no footprint overhead. The tests still pass.

Bug: https://bugs.openjdk.java.net/browse/JDK-8169255

Patch:

diff -r 246f6fb74bf1 make/lib/CompileGtest.gmk
--- a/make/lib/CompileGtest.gmk
+++ b/make/lib/CompileGtest.gmk
@@ -107,6 +107,7 @@
     LDFLAGS := $(LDFLAGS_JDKEXE), \
     LDFLAGS_unix := -L$(JVM_OUTPUTDIR)/gtest $(call SET_SHARED_LIBRARY_ORIGIN), \
     LDFLAGS_solaris := -library=stlport4, \
+    LIBS_linux := $(LIBCXX), \
     LIBS_unix := -ljvm, \
     LIBS_windows := $(JVM_OUTPUTDIR)/gtest/objs/jvm.lib, \
     COPY_DEBUG_SYMBOLS := $(GTEST_COPY_DEBUG_SYMBOLS), \

/Erik

From tim.bell at oracle.com Fri Nov 4 14:31:21 2016
From: tim.bell at oracle.com (Tim Bell)
Date: Fri, 4 Nov 2016 07:31:21 -0700
Subject: RFR: JDK-8169255: Link gtestLauncher statically if libjvm is configured for static linking
In-Reply-To: References: Message-ID:

Erik:

> In the build, we have a global setting for linking libstdc++ static or dynamic on Linux. All libraries and executables that go in the product honor this setting. The gtestLauncher currently doesn't. This causes trouble in testing where some machines might not have the 32bit libstdc++.so installed. Since installing that library is not needed for just running the product, it's awkward to have to install it to run certain tests.
>
> This patch adds the LIBCXX flags from configure when linking gtestLauncher. The resulting file actually comes out a little bit smaller, so there is no footprint overhead. The tests still pass.
> Bug: https://bugs.openjdk.java.net/browse/JDK-8169255
>
> Patch:
>
> diff -r 246f6fb74bf1 make/lib/CompileGtest.gmk
> --- a/make/lib/CompileGtest.gmk
> +++ b/make/lib/CompileGtest.gmk
> @@ -107,6 +107,7 @@
>      LDFLAGS := $(LDFLAGS_JDKEXE), \
>      LDFLAGS_unix := -L$(JVM_OUTPUTDIR)/gtest $(call SET_SHARED_LIBRARY_ORIGIN), \
>      LDFLAGS_solaris := -library=stlport4, \
> +    LIBS_linux := $(LIBCXX), \
>      LIBS_unix := -ljvm, \
>      LIBS_windows := $(JVM_OUTPUTDIR)/gtest/objs/jvm.lib, \
>      COPY_DEBUG_SYMBOLS := $(GTEST_COPY_DEBUG_SYMBOLS), \

Looks good to me.

Tim

From marcus.larsson at oracle.com Fri Nov 4 15:16:34 2016
From: marcus.larsson at oracle.com (Marcus Larsson)
Date: Fri, 4 Nov 2016 16:16:34 +0100
Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part
In-Reply-To: References: Message-ID: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com>

Hi,

Thanks for fixing this.

On 2016-11-01 18:30, Kirill Zhaldybin wrote:
> Dear all,
>
> Could you please review this fix for 8169003?
>
> I changed the parsing of the time string so that it no longer depends on the LC_NUMERIC locale, and the test does not fail if a locale where the "floating point" is actually a comma is set.
>
> WebRev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/

ISO8601 says the decimal point can be either '.' or ',' so the test should accept either. You could let sscanf read out the decimal point as a character and just verify that it is one of the two.

In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that we won't accept "Z" suffixed strings. Please revert that.

Thanks,
Marcus

> CR: https://bugs.openjdk.java.net/browse/JDK-8169003
>
> Thank you.
> > Regards, Kirill From David.Gnedt at jku.at Fri Nov 4 11:26:28 2016 From: David.Gnedt at jku.at (David Gnedt) Date: Fri, 04 Nov 2016 12:26:28 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <581C7E740200009400009CA7@gwia.im.jku.at> Hello, I am one of the authors of this work and I gladly support this contribution. Best regards, David Gnedt >>> Peter Hofer 04.11.16 11.01 Uhr >>> Hello everyone, we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community. Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur. We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks. Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. 
> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/
> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/

Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms.

> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/
> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/

With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc".

More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/.

Kind regards,
Peter Hofer

--
Peter Hofer
Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software
University of Linz

[1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE '16), Delft, Netherlands, 2016.

From Derek.White at cavium.com Fri Nov 4 17:07:37 2016
From: Derek.White at cavium.com (White, Derek)
Date: Fri, 4 Nov 2016 17:07:37 +0000
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID:

I saw this on some of my machines also - I thought it was a configuration issue, but now I think it's due to the fix for JDK-8067744 being incompatible with some versions of gcc (which have differences with "as-needed").
Created bug: https://bugs.openjdk.java.net/browse/JDK-8169261

-----Original Message-----
From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Yang Zhang
Sent: Friday, November 04, 2016 3:08 AM
To: hotspot-dev at openjdk.java.net
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64

Hi,

jdk9/hs/hotspot native libs for jtreg build failed after the push of http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/308a53dd5aee

Build command: make test-image-hotspot-jtreg-native

Could someone please help to fix it? The reason is that the dl library isn't found. I think the following change could fix that:

------
diff --git a/make/test/JtregNative.gmk b/make/test/JtregNative.gmk
index 78e78d7..95b5747 100644
--- a/make/test/JtregNative.gmk
+++ b/make/test/JtregNative.gmk
@@ -91,7 +91,7 @@ ifeq ($(OPENJDK_TARGET_OS), linux)
   BUILD_HOTSPOT_JTREG_LIBRARIES_LDFLAGS_libtest-rwx := -z execstack
   BUILD_HOTSPOT_JTREG_EXECUTABLES_LIBS_exeinvoke := -ljvm -lpthread
   BUILD_TEST_invoke_exeinvoke.c_OPTIMIZATION := NONE
-  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -ldl
+  BUILD_HOTSPOT_JTREG_EXECUTABLES_LDFLAGS_exeFPRegs := -Wl,--no-as-needed -ldl
 endif
 ifeq ($(OPENJDK_TARGET_OS), windows)
------

Regards
Yang

From aph at redhat.com Fri Nov 4 17:24:23 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 4 Nov 2016 17:24:23 +0000
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID:

On 04/11/16 17:07, White, Derek wrote:
> The reason is that dl library isn't found. I think the following change could fix that:

But why isn't it found? It should be on your system at /lib/aarch64-linux-gnu/libdl.so.2 or somesuch. Or your system wouldn't work.

Andrew.
From Derek.White at cavium.com Fri Nov 4 17:54:40 2016
From: Derek.White at cavium.com (White, Derek)
Date: Fri, 4 Nov 2016 17:54:40 +0000
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID:

This is a build-time linking error. The link command does include -ldl, and the libraries do exist in the expected places. So I don't understand exactly what the issue is. But Yang's fix follows some internet wisdom that includes this claim: "Apparently it has something to do with recent versions of gcc/ld defaulting to linking with --as-needed." I haven't had time to track down a fuller explanation.

- Derek

-----Original Message-----
From: Andrew Haley [mailto:aph at redhat.com]
Sent: Friday, November 04, 2016 1:24 PM
To: White, Derek ; Yang Zhang ; hotspot-dev at openjdk.java.net
Subject: Re: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64

On 04/11/16 17:07, White, Derek wrote:
> The reason is that dl library isn't found. I think the following change could fix that:

But why isn't it found? It should be on your system at /lib/aarch64-linux-gnu/libdl.so.2 or somesuch. Or your system wouldn't work.

Andrew.

From jeremymanson at google.com Fri Nov 4 18:24:20 2016
From: jeremymanson at google.com (Jeremy Manson)
Date: Fri, 4 Nov 2016 11:24:20 -0700
Subject: Contribution: Lock Contention Profiler for HotSpot
In-Reply-To: <581C7E740200009400009CA7@gwia.im.jku.at>
References: <581C7E740200009400009CA7@gwia.im.jku.at>
Message-ID:

Why aren't these extensions to JVMTI, which already has MonitorContendedEnter and MonitorContendedEntered events? You could just add a MonitorContendedRelease event to cover what you want. Then the bulk of the tracking work can be done in JVMTI.

At Google, we've built on these JVMTI primitives quite successfully. The only internal enhancement we've had to make is to make them support j.u.c locks.
(We've also done the hotspot lock contention work, but it has been less directly useful.) Jeremy On Fri, Nov 4, 2016 at 4:26 AM, David Gnedt wrote: > Hello, > > I am one of the authors of this work and I gladly support this > contribution. > > Best regards, > David Gnedt > > >>> Peter Hofer 04.11.16 11.01 Uhr >>> > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > > Our profiler records an event when a thread fails to acquire a contended > > lock and also when a thread releases a contended lock. It further > efficiently records the stack traces where these events occur. We > devised a versatile visualization tool that analyzes the recorded events > > and determines when and where threads _cause_ contention by holding a > contended lock. The visualization tool can show the contention by stack > trace, by lock, by lock class, by thread, and by any combination of > those aspects. > > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. Please find a free > download of the paper on our website: > > http://mevss.jku.at/lct/ > > I contribute this work on behalf of Dynatrace Austria (the sponsor of > this research), my colleagues David Gnedt and Andreas Schoergenhumer, > and myself. The necessary OCAs have already been submitted. > > We provide two patches: > > Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we > described and evaluated in our paper, plus minor improvements. It > records events for Java intrinsic locks (monitors) and for > java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). > We support only Linux on 64-bit x86 hardware. 
> >
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/
> >
> > Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms.
> >
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/
> > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/
> >
> > With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc".
> >
> > More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/.
> >
> > Kind regards,
> > Peter Hofer
> >
> > --
> > Peter Hofer
> > Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software
> > University of Linz
> >
> > [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE '16), Delft, Netherlands, 2016.

From dmitry.samersoff at oracle.com Fri Nov 4 18:27:36 2016
From: dmitry.samersoff at oracle.com (Dmitry Samersoff)
Date: Fri, 4 Nov 2016 21:27:36 +0300
Subject: jdk9/hs/hotspot make native libs for test build failure both on x86 and aarch64
In-Reply-To: References: Message-ID: <7f5b2cc3-7a67-b7da-f715-5a4d9dac3127@oracle.com>

Andrew,

The gcc -Wl,--as-needed flag allows the linker to not link against a shared library if it thinks that the library is not necessary. It can cause an error in some cases, e.g.
libraries (-lXXX) appear on the command line in the wrong order [1]. So I think that explicitly disabling --as-needed when building tests is a good idea. 1. g++ -Wl,--no-as-needed -o test -ldl test.cxx OK. g++ -Wl,--as-needed -o test -ldl test.cxx /tmp/ccOqlI4O.o: In function `main': test.cxx:(.text+0xb7): undefined reference to `dlopen' collect2: error: ld returned 1 exit status -Dmitry On 2016-11-04 20:24, Andrew Haley wrote: > On 04/11/16 17:07, White, Derek wrote: >> The reason is that dl library isn't found. I think the following change could fix that: > > But why isn't it found? It should be on your system at /lib/aarch64-linux-gnu/libdl.so.2 > or somesuch. Or your system wouldn't work. > > Andrew. > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From ceeaspb at gmail.com Fri Nov 4 19:39:28 2016 From: ceeaspb at gmail.com (Alex Bagehot) Date: Fri, 4 Nov 2016 19:39:28 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: <581C7E740200009400009CA7@gwia.im.jku.at> Message-ID: Seems release was removed: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4986044 and... https://bugs.openjdk.java.net/browse/JDK-8038441 Related, there is a dtrace/systemtap probe for exit http://mail.openjdk.java.net/pipermail/distro-pkg-dev/2009-April/005445.html On Fri, Nov 4, 2016 at 6:24 PM, Jeremy Manson wrote: > Why aren't these extensions to JVMTI, which already has > MonitorContendedEnter and MonitorContendedEntered events? You could just > add a MonitorContendedRelease event to cover what you want. Then the bulk > of the tracking work can be done in JVMTI. > > At Google, we've built on these JVMTI primitives quite successfully. The > only internal enhancements we've had to make is to make them support j.u.c > locks. > > (We've also done the hotspot lock contention work, but it has been less > directly useful.) 
>> >> >> From david.holmes at oracle.com Sat Nov 5 18:43:52 2016 From: david.holmes at oracle.com (David Holmes) Date: Sun, 6 Nov 2016 04:43:52 +1000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <1475236951.6301.72.camel@oracle.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> Message-ID: Forking new discussion from: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 On 1/11/2016 7:44 PM, Andrew Haley wrote: > On 31/10/16 21:30, David Holmes wrote: >> >> >> On 31/10/2016 7:32 PM, Andrew Haley wrote: >>> On 30/10/16 21:26, David Holmes wrote: >>>> On 31/10/2016 4:36 AM, Andrew Haley wrote: >>>>> >>>>> And, while we're on the subject, is memory_order_conservative actually >>>>> defined anywhere? >>>> >>>> No. It was chosen to represent the current status quo that the Atomic:: >>>> ops should all be (by default) full bi-directional fences. >>> >>> Does that mean that a CAS is actually stronger than a load acquire >>> followed by a store release? And that a CAS is a release fence even >>> when it fails and no store happens? >> >> Yes. Yes. >> >> // All of the atomic operations that imply a read-modify-write >> // action guarantee a two-way memory barrier across that >> // operation. Historically these semantics reflect the strength >> // of atomic operations that are provided on SPARC/X86. We assume >> // that strength is necessary unless we can prove that a weaker >> // form is sufficiently safe. > > Mmmm, but that doesn't say anything about a CAS that fails. But fair > enough, I accept your interpretation. Granted the above was not written with load-linked/store-conditional style implementations in mind; and the historical behaviour on sparc and x86 is not affected by failure of the cas, so it isn't called out. I should fix that. 
>> But there is some contention as to whether the actual implementations >> obey this completely. > > Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified > as a > > "full barrier". That is, no memory operand is moved across the > operation, either forward or backward. Further, instructions are > issued as necessary to prevent the processor from speculating loads > across the operation and from queuing stores after the operation. > > ... which reads the same as the language you quoted above, but looking > at the assembly code I'm sure that it's really no stronger than a seq > cst load followed by a seq cst store. Are you saying that a seq_cst load followed by a seq_cst store is weaker than a full barrier? > I guess maybe I could give up fighting this and implement all AArch64 > CAS sequences as > > CAS(seq_cst); full fence > > or, even more extremely, > > full fence; CAS(relaxed); full fence > > but it all seems unreasonably heavyweight. Indeed. A couple of issues here. If you are thinking in terms of orderAccess::fence() then it needs to guarantee visibility as well as ordering - see this bug I just filed: https://bugs.openjdk.java.net/browse/JDK-8169193 So would be heavier than a "full barrier" that simply combined all four storeload membar variants. Though of course the actual implementation on a given architecture may be just as heavyweight. And of course the Atomic op must guarantee visibility of the successful store (else the atomicity aspect would not be present). That aside we do not need two "fences" surrounding the atomic op. For platforms where the atomic op is a single instruction which combines load and store then conceptually all we need is: loadload|storeload; op; storeload|storestore Note this is at odds with the commentary in atomic.hpp which says things like: // add-value-to-dest I need to check why we settled on the above formulation - I suspect it was conservatism. 
And of course for the cmpxchg it fails to account for the fact there may not be a store to order with. For load-linked/store-conditional based operations that would expand to (assume a retry loop for unrelated store failures): loadLoad|storeLoad temp = ld-linked &val cmp temp, expected jmp ne st-cond &val, newVal storeload|storestore which is fine if we actually store, but if we find the wrong value there is no store for those final barriers to sync with. That then raises the question: can subsequent loads and stores move into the ld-linked/st-cond region? The general context-free answer would be yes, but the actual details may be architecture specific and also context dependent - ie the subsequent loads/stores may be dependent on the CAS succeeding (or on it failing). So without further knowledge you would need to use a "full-barrier" after the st-cond. David ----- >>> And that a conservative load is a *store* barrier? >> >> Not sure what you mean. Atomic::load is not a r-m-w action so not >> expected to be a two-way memory barrier. > > OK. > > Thanks, > > Andrew. > From david.holmes at oracle.com Sat Nov 5 18:48:28 2016 From: david.holmes at oracle.com (David Holmes) Date: Sun, 6 Nov 2016 04:48:28 +1000 Subject: RFR: JDK-8169255: Link gtestLauncher statically if libjvm is configured for static linking In-Reply-To: References: Message-ID: <7ffb2f76-9bc9-f774-7224-e6ff30b0e33c@oracle.com> Looks good. Thanks for fixing this Erik. David On 5/11/2016 12:22 AM, Erik Joelsson wrote: > In the build, we have a global setting for linking libstdc++ static or > dynamic on Linux. All libraries and executables that go in the product > honor this setting. The gtestLauncher currently doesn't. This causes > trouble in testing where some machines might not have the 32bit > libstdc++.so installed. Since installing that library is not needed for > just running the product, it's awkward to have to install it to run > certain tests. 
> > This patch adds the LIBCXX flags from configure when linking > gtestLauncher. The resulting file actually comes out a little bit > smaller, so there is no footprint overhead. The tests still pass. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8169255 > > Patch: > > diff -r 246f6fb74bf1 make/lib/CompileGtest.gmk > --- a/make/lib/CompileGtest.gmk > +++ b/make/lib/CompileGtest.gmk > @@ -107,6 +107,7 @@ > LDFLAGS := $(LDFLAGS_JDKEXE), \ > LDFLAGS_unix := -L$(JVM_OUTPUTDIR)/gtest $(call > SET_SHARED_LIBRARY_ORIGIN), \ > LDFLAGS_solaris := -library=stlport4, \ > + LIBS_linux := $(LIBCXX), \ > LIBS_unix := -ljvm, \ > LIBS_windows := $(JVM_OUTPUTDIR)/gtest/objs/jvm.lib, \ > COPY_DEBUG_SYMBOLS := $(GTEST_COPY_DEBUG_SYMBOLS), \ > > > /Erik > From aph at redhat.com Sun Nov 6 10:54:53 2016 From: aph at redhat.com (Andrew Haley) Date: Sun, 6 Nov 2016 10:54:53 +0000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <1475236951.6301.72.camel@oracle.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> Message-ID: <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> On 05/11/16 18:43, David Holmes wrote: > Forking new discussion from: > > RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 > > On 1/11/2016 7:44 PM, Andrew Haley wrote: >> On 31/10/16 21:30, David Holmes wrote: >>> >>> >>> On 31/10/2016 7:32 PM, Andrew Haley wrote: >>>> On 30/10/16 21:26, David Holmes wrote: >>>>> On 31/10/2016 4:36 AM, Andrew Haley wrote: >>> >>> // All of the atomic operations that imply a read-modify-write >>> // action guarantee a two-way memory barrier across that >>> // operation. Historically these semantics reflect the strength >>> // of atomic operations that are provided on SPARC/X86. 
We assume >>> // that strength is necessary unless we can prove that a weaker >>> // form is sufficiently safe. >> >> Mmmm, but that doesn't say anything about a CAS that fails. But fair >> enough, I accept your interpretation. > > Granted the above was not written with load-linked/store-conditional > style implementations in mind; and the historical behaviour on sparc > and x86 is not affected by failure of the cas, so it isn't called > out. I should fix that. > >>> But there is some contention as to whether the actual implementations >>> obey this completely. >> >> Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified >> as a >> >> "full barrier". That is, no memory operand is moved across the >> operation, either forward or backward. Further, instructions are >> issued as necessary to prevent the processor from speculating loads >> across the operation and from queuing stores after the operation. >> >> ... which reads the same as the language you quoted above, but looking >> at the assembly code I'm sure that it's really no stronger than a seq >> cst load followed by a seq cst store. > > Are you saying that a seq_cst load followed by a seq_cst store is weaker > than a full barrier? Probably. I'm saying that when someone says "full barrier" they aren't exactly clear what that means. I know what sequential consistency is, but not "full barrier" because it's used inconsistently. For example, the above says that no memory operand is moved across the barrier, but if you have store_relaxed(a) load_seq_cst(b) store_seq_cst(c) load_relaxed(d) there's nothing to prevent load_seq_cst(b) load_relaxed(d) store_relaxed(a) store_seq_cst(c) It is true that neither store a nor load d have moved across this operation, but they have exchanged places. As far as GCC is concerned this is a correct implementation, and it does meet the requirement of sequential consistency as defined in the C++ memory model. 
>> I guess maybe I could give up fighting this and implement all AArch64 >> CAS sequences as >> >> CAS(seq_cst); full fence >> >> or, even more extremely, >> >> full fence; CAS(relaxed); full fence >> >> but it all seems unreasonably heavyweight. > > Indeed. A couple of issues here. If you are thinking in terms of > orderAccess::fence() then it needs to guarantee visibility as well as > ordering - see this bug I just filed: > > https://bugs.openjdk.java.net/browse/JDK-8169193 Ouch. Yes, I agree that something needs fixing. That comment: // Use release_store_fence to update values like the thread state, // where we don't want the current thread to continue until all our // prior memory accesses (including the new thread state) are visible // to other threads. ... seems very unhelpful, at least because a release fence (using conventional terminology) does not have that property: a release fence is only LoadStore|StoreStore. > So would be heavier than a "full barrier" that simply combined all > four storeload membar variants. Though of course the actual > implementation on a given architecture may be just as > heavyweight. And of course the Atomic op must guarantee visibility > of the successful store (else the atomicity aspect would not be > present). I don't think that's exactly right. As I understand the ARMv8 memory model, it's possible to have a CAS which imposes no memory ordering or visibility at all: it's a relaxed load and a relaxed store. Other threads can still see stale values of the store unless they attempt a CAS. This is really good: it's exactly what you want for some shared counters. > That aside we do not need two "fences" surrounding the atomic > op. 
For platforms where the atomic op is a single instruction which > combines load and store then conceptually all we need is: > > loadload|storeload; op; storeload|storestore > > Note this is at odds with the commentary in atomic.hpp which says things > like: > > // add-value-to-dest > > I need to check why we settled on the above formulation - I suspect it > was conservatism. And of course for the cmpxchg it fails to account for > the fact there may not be a store to order with. > > For load-linked/store-conditional based operations that would expand to > (assume a retry loop for unrelated store failures): > > loadLoad|storeLoad > temp = ld-linked &val > cmp temp, expected > jmp ne > st-cond &val, newVal > storeload|storestore > > which is fine if we actually store, but if we find the wrong value > there is no store for those final barriers to sync with. That then > raises the question: can subsequent loads and stores move into the > ld-linked/st-cond region? The general context-free answer would be > yes, but the actual details may be architecture specific and also > context dependent - ie the subsequent loads/stores may be dependent > on the CAS succeeding (or on it failing). So without further > knowledge you would need to use a "full-barrier" after the st-cond. On most (all?) architectures a StoreLoad fence is a full barrier, so this formulation is equivalent to what I was saying anyway. Andrew. From 1072213404 at qq.com Mon Nov 7 03:09:22 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Mon, 7 Nov 2016 11:09:22 +0800 Subject: help understanding release semantics Message-ID: Hi, in OrderAccess inline void OrderAccess::storestore() { release(); } inline void OrderAccess::loadstore() { acquire(); } the storestore() can provide complete release semantics, so why do some blogs say that release semantics include both storestore and loadstore? I can understand what the blogs say, but I am a little confused by the code. Thank you! 
Arron From david.holmes at oracle.com Mon Nov 7 04:15:36 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 7 Nov 2016 14:15:36 +1000 Subject: help understanding release semantics In-Reply-To: References: Message-ID: <16bb2830-a0b6-7b27-d4cc-43519613c98c@oracle.com> On 7/11/2016 1:09 PM, ???? wrote: > Hi, > > > in OrderAccess > inline void OrderAccess::storestore() { release(); } > inline void OrderAccess::loadstore() { acquire(); } > the storestore() can provide complete release semantics, so why do some blogs say that release semantics include both storestore and loadstore? You are looking at a particular platform's implementation where the two things are the same at the hardware level. Conceptually it is the wrong way to express it. In orderAccess.hpp we define: acquire() == loadLoad|loadStore release() == loadStore|storeStore This is a particular definition inside hotspot such that we define an equivalence between these pairs: release_store(&x, 1) ≡ release(); x = 1; and y = load_acquire(&x) ≡ y = x; acquire(); In the more general literature this equivalence does not exist as the two statements could be reordered. acquire/release cannot be exactly expressed using loadload/loadstore etc. I actually have a presentation on all this that I just did last week. I plan to add a few updates then make it available. David > > > I can understand what the blogs say, but I am a little confused by the code. > > > Thank you! 
> > Arron > From thomas.schatzl at oracle.com Mon Nov 7 10:53:34 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Nov 2016 11:53:34 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> Message-ID: <1478516014.2646.16.camel@oracle.com> Hi, On Tue, 2016-10-25 at 19:11 -0400, Kim Barrett wrote: > > > > On Oct 21, 2016, at 9:54 PM, Kim Barrett > > wrote: > > > > > > > > On Oct 21, 2016, at 8:46 PM, Kim Barrett > > > wrote: > > > In the humongous case, if it bails because klass_or_null == NULL, > > > we must re-enqueue > > > the card ... > This update (webrev.02) reverts part of the previous change. > > In the original RFR I said: > > As a result of the changes in oops_on_card_seq_iterate_careful, we > now almost never fail to process the card. The only place where > that can occur is a stale card in a humongous region with an > in-progress allocation, where we can just ignore it. So the only > caller, refine_card, no longer needs to examine the result of the > call and enqueue the card for later reconsideration. > > Ignoring such a stale card is incorrect at the point where it was > being done. At that point we've already cleaned the card, so we must > either process the designated object(s) or, if we can't do the > processing because of in-progress allocation (klass_or_null returned > NULL), then re-queue the card for later reconsideration. 
> > So the change to refine_card to eliminate that behavior, and the > associated changes to oops_on_card_seq_iterate_careful, were a > mistake, and are being reverted by this new version. As a result, > refine_card is no longer changed at all. Thanks for catching this. Maybe it would be cleaner to call a method in the barrier set instead of inlining the dirtying + enqueuing in lines 685 to 691? Maybe as an additional RFE. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrev: > Full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02/ > Incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02.inc/ > > Testing: > Local specjbb2015 (non-perf). > Local jtreg hotspot_all. > Also tested as baseline of changes for JDK-8166811. > > Additionally, in the original RFR I also said: > > Note that [...] At present the only source of stale cards in the > concurrent case seems to be HCC eviction. [...] Doing HCC cleanup > when freeing regions might remove the need for klass_or_null > checking in the humongous case for concurrent refinement, so might > be worth looking into later. > > That was also incorrect; there are other sources of stale cards. Can you elaborate on that? > That doesn't affect this change, but may affect how JDK-8166811 > should be fixed, and removes the rationale for JDK-8166995 (which has > been resolved Won't Fix because of that). > > See also the RFR for the followup JDK-8166811. Thanks, 
Thomas From thomas.schatzl at oracle.com Mon Nov 7 10:57:05 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 07 Nov 2016 11:57:05 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: References: Message-ID: <1478516225.2646.19.camel@oracle.com> Hi, On Sat, 2016-10-29 at 19:26 -0400, Kim Barrett wrote: > > > > On Oct 25, 2016, at 7:13 PM, Kim Barrett > > wrote: > > > > Please review this change to address missing memory barriers needed > > to > > ensure ordering between allocation and refinement in G1. > > [...] > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8166811 > > > > Webrev: > > http://cr.openjdk.java.net/~kbarrett/8166811/webrev.00/ > > [Based on http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02/] > > > ------------------------------------------------------------------- ----------- src/share/vm/gc/g1/g1RemSet.cpp 581 // The region could be young. Cards for young regions are > dirtied, 582 // so the post-barrier will filter them out. However, that > dirtying 583 // is performed concurrently. A write to a young object could > occur 584 // before the card has been dirtied, slipping past the filter. > > This is a rewording of the comment that used to be here. However, it > was not true even before these changes. As part of JDK-8014555 we > mark young region cards with g1_young_card_val(). That's the change > set that added the storeload to the post-barrier. > > I'm not quite sure what to do about this. The comment is currently > wrong. However, the storeload is considered a problem, and there > have been various ideas discussed for eliminating it that might allow > us to go back to dirtying young cards. Depends on what "dirtying" is supposed to mean in this context - setting it to "dirty" or setting it to something non-clean. One could replace "dirtied" by something less specific here to make it right again. Thanks, 
Thomas From kim.barrett at oracle.com Mon Nov 7 18:36:46 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Nov 2016 13:36:46 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1478516225.2646.19.camel@oracle.com> References: <1478516225.2646.19.camel@oracle.com> Message-ID: > On Nov 7, 2016, at 5:57 AM, Thomas Schatzl wrote: > > Hi, > > On Sat, 2016-10-29 at 19:26 -0400, Kim Barrett wrote: >>> >>> On Oct 25, 2016, at 7:13 PM, Kim Barrett >>> wrote: >>> >>> Please review this change to address missing memory barriers needed >>> to >>> ensure ordering between allocation and refinement in G1. >>> [...] >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8166811 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~kbarrett/8166811/webrev.00/ >>> [Based on http://cr.openjdk.java.net/~kbarrett/8166607/webrev.02/] >>> >> ------------------------------------------------------------------- >> ----------- >> src/share/vm/gc/g1/g1RemSet.cpp >> 581 // The region could be young. Cards for young regions are >> dirtied, >> 582 // so the post-barrier will filter them out. However, that >> dirtying >> 583 // is performed concurrently. A write to a young object could >> occur >> 584 // before the card has been dirtied, slipping past the filter. >> >> This is a rewording of the comment that used to be here. However, it >> was not true even before these changes. As part of JDK-8014555 we >> mark young region cards with g1_young_card_val(). That's the change >> set that added the storeload to the post-barrier. >> >> I'm not quite sure what to do about this. The comment is currently >> wrong. However, the storeload is considered a problem, and there >> have been various ideas discussed for eliminating it that might allow >> us to go back to dirtying young cards. > > Depends on what "dirtying" is supposed to mean in this context - > setting it to "dirty" or setting it to something non-clean. 
> > One could replace "dirtied" by something less specific here to make it > right again. Good idea. How about this rewording (using "set to a value") // The region could be young. Cards for young regions are set to a // value that allows the post-barrier to filter them out. However, // that card setting is performed concurrently. A write to a young // object could occur before the card has been set, slipping past // the filter. From kim.barrett at oracle.com Mon Nov 7 19:07:27 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Nov 2016 14:07:27 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: References: <1478516225.2646.19.camel@oracle.com> Message-ID: <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> > On Nov 7, 2016, at 1:36 PM, Kim Barrett wrote: >>> src/share/vm/gc/g1/g1RemSet.cpp >>> 581 // The region could be young. Cards for young regions are >>> dirtied, >>> 582 // so the post-barrier will filter them out. However, that >>> dirtying >>> 583 // is performed concurrently. A write to a young object could >>> occur >>> 584 // before the card has been dirtied, slipping past the filter. >>> >>> This is a rewording of the comment that used to be here. However, it >>> was not true even before these changes. As part of JDK-8014555 we >>> mark young region cards with g1_young_card_val(). That's the change >>> set that added the storeload to the post-barrier. >>> >>> I'm not quite sure what to do about this. The comment is currently >>> wrong. However, the storeload is considered a problem, and there >>> have been various ideas discussed for eliminating it that might allow >>> us to go back to dirtying young cards. >> >> Depends on what "dirtying" is supposed to mean in this context - >> setting it to "dirty" or setting it to something non-clean. >> >> One could replace "dirtied" by something less specific here to make it >> right again. > > Good idea. 
How about this rewording (using "set to a value") > > // The region could be young. Cards for young regions are set to a > // value that allows the post-barrier to filter them out. However, > // that card setting is performed concurrently. A write to a young > // object could occur before the card has been set, slipping past > // the filter. Oops, no, that isn't right. (It's been a couple of weeks since I looked at this, and forgot part of the problem.) Part of what's wrong with the comment is that we can no longer get to that point with a young region. A young region's cards will be either g1_young_gen or clean, never dirty. Hence the filtering out of non-dirty cards a few lines before this comment will have already discarded a young card before we reach the test this comment is discussing. So the whole premise of the comment in question, that the region could be young, is false. From peter.hofer at jku.at Mon Nov 7 19:35:42 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Mon, 7 Nov 2016 20:35:42 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: <581C7E740200009400009CA7@gwia.im.jku.at> Message-ID: Hi Jeremy, On 2016-11-04 19:24, Jeremy Manson wrote: > Why aren't these extensions to JVMTI, which already has > MonitorContendedEnter and MonitorContendedEntered events? You could > just add a MonitorContendedRelease event to cover what you want. > Then the bulk of the tracking work can be done in JVMTI. One of our main goals was to make profiling very lightweight so that there is a chance that the profiler can be used on production systems. In the HotSpot code, we can record events and maintain state very efficiently. I agree that a profiler that uses only JVMTI and extension methods would be more modular. We actually tried to implement a comparable profiler using JVMTI. 
It performs very frequent state transitions to the agent and back, requires wrapping all references and data structures, and needs tagging to associate state with objects. Moreover, it cannot efficiently cache stack traces without always resolving inlined methods from the compiler's debug information (which makes a lot of difference in our HotSpot-internal profiler). The JVMTI-based profiler turned out to be rather inefficient, which is why we didn't pursue this approach further. As Alex pointed out, there used to be a MonitorContendedExit event in early versions of JVMTI. It was eliminated because the context of a monitor exit is not really safe for invoking a JVMTI callback, which is another issue that would need to be addressed first. Cheers, Peter > At Google, we've built on these JVMTI primitives quite successfully. > The only internal enhancements we've had to make is to make them support > j.u.c locks. > > (We've also done the hotspot lock contention work, but it has been less > directly useful.) > > Jeremy > > On Fri, Nov 4, 2016 at 4:26 AM, David Gnedt > wrote: > > Hello, > > I am one of the authors of this work and I gladly support this > contribution. > > Best regards, > David Gnedt > > >>> Peter Hofer > > 04.11.16 11.01 Uhr >>> > Hello everyone, > > we are researchers at the University of Linz and have worked on a lock > contention profiler that is built into HotSpot. We would like to > contribute this work to the OpenJDK community. > > Our profiler records an event when a thread fails to acquire a contended > > lock and also when a thread releases a contended lock. It further > efficiently records the stack traces where these events occur. We > devised a versatile visualization tool that analyzes the recorded events > > and determines when and where threads _cause_ contention by holding a > contended lock. The visualization tool can show the contention by stack > trace, by lock, by lock class, by thread, and by any combination of > those aspects. 
> > We described our profiler in more detail in a research paper at ICPE > 2016. [1] In our evaluation, we found that the overhead is typically > below 10% for common multi-threaded Java benchmarks. Please find a free > download of the paper on our website: > > http://mevss.jku.at/lct/ > > I contribute this work on behalf of Dynatrace Austria (the sponsor of > this research), my colleagues David Gnedt and Andreas Schoergenhumer, > and myself. The necessary OCAs have already been submitted. > > We provide two patches: > > Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we > described and evaluated in our paper, plus minor improvements. It > records events for Java intrinsic locks (monitors) and for > java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). > We support only Linux on 64-bit x86 hardware. > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ > > > Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal > native locks only. We consider this to be useful for HotSpot developers > to find locking bottlenecks in HotSpot itself. We tested this patch only > > on Linux on 64-bit x86 hardware, but it should require few changes for > other platforms. > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ > > > > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ > > > With both patches, the profiler is enabled with -XX:+EnableEventTracing. > > By default, an uncompressed event trace is written to file "output.trc". > > More detailed usage information and a download of the corresponding > visualization tool is available on our website, > http://mevss.jku.at/lct/. 
> > Kind regards, > Peter Hofer > > > -- > Peter Hofer > Christian Doppler Laboratory on Monitoring and Evolution of > Very-Large-Scale Software Systems / Institute for System Software > University of Linz > > > [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter > Moessenboeck. Efficient Tracing and Versatile Analysis of Lock > Contention in Java Applications on the Virtual Machine Level. > Proceedings of the 7th ACM/SPEC International Conference on Performance > Engineering (ICPE'16), Delft, Netherlands, 2016. > > > From kim.barrett at oracle.com Mon Nov 7 19:38:25 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 7 Nov 2016 14:38:25 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1478516014.2646.16.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> Message-ID: <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl wrote: > On Tue, 2016-10-25 at 19:11 -0400, Kim Barrett wrote: >>> >>> On Oct 21, 2016, at 9:54 PM, Kim Barrett >>> wrote: >>> >>>> >>>> On Oct 21, 2016, at 8:46 PM, Kim Barrett >>>> wrote: >>>> In the humongous case, if it bails because klass_or_null == NULL, >>>> we must re-enqueue >>>> the card ... >> This update (webrev.02) reverts part of the previous change. >> >> In the original RFR I said: >> >> As a result of the changes in oops_on_card_seq_iterate_careful, we >> now almost never fail to process the card. 
The only place where >> that can occur is a stale card in a humongous region with an >> in-progress allocation, where we can just ignore it. So the only >> caller, refine_card, no longer needs to examine the result of the >> call and enqueue the card for later reconsideration. >> >> Ignoring such a stale card is incorrect at the point where it was >> being done. At that point we've already cleaned the card, so we must >> either process the designated object(s) or, if we can't do the >> processing because of in-progress allocation (klass_or_null returned >> NULL), then re-queue the card for later reconsideration. >> >> So the change to refine_card to eliminate that behavior, and the >> associated changes to oops_on_card_seq_iterate_careful, were a >> mistake, and are being reverted by this new version. As a result, >> refine_card is no longer changed at all. > > Thanks for catching this. > > Maybe it would be cleaner to call a method in the barrier set instead > of inlining the dirtying + enqueuing in lines 685 to 691? Maybe as an > additional RFE. We could use _ct_bs->invalidate(dirtyRegion). That's rather overgeneralized and inefficient for this situation, but this situation should occur *very* rarely; it requires a stale card get processed just as a humongous object is in the midst of being allocated in the same region. >> Additionally, in the original RFR I also said: >> >> Note that [...] At present the only source of stale cards in the >> concurrent case seems to be HCC eviction. [...] Doing HCC cleanup >> when freeing regions might remove the need for klass_or_null >> checking in the humongous case for concurrent refinement, so might >> be worth looking into later. >> >> That was also incorrect; there are other sources of stale cards. > > Can you elaborate on that? Here's a scenario that I've observed while running a jtreg test (I think it was hotspot/test/gc/TestHumongousReferenceObject). We have humongous object H, referring to young object Y. 
This induces a remembered set entry for card C in region R (allocated for H). H becomes unreachable. Start concurrent collection cycle. Pause Initial Mark scan_rs pushes &H->Y onto mark stack. Pause Initial Mark evac processes &H->Y, copying Y, updating &H->Y, and adding C to g1h_dcqs in update_rs. Pause Initial Mark redirty_logged_cards dirties g1h_dcqs entries, including C. Pause Initial Mark merges g1h_dcqs into java_dcqs, adding dirty C to java_dcqs. Concurrent Mark determines H is dead. Pause Cleanup frees regions for H, including R. Concurrent Refinement finally comes across stale C in now (possibly) free R. A similar situation can arise if instead of H we have old O in region R and all objects in R are unreachable before starting concurrent collection, so that Pause Cleanup frees R. From david.holmes at oracle.com Tue Nov 8 01:11:57 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Nov 2016 11:11:57 +1000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <1475236951.6301.72.camel@oracle.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> Message-ID: On 6/11/2016 8:54 PM, Andrew Haley wrote: > On 05/11/16 18:43, David Holmes wrote: >> Forking new discussion from: >> >> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >> >> On 1/11/2016 7:44 PM, Andrew Haley wrote: >>> On 31/10/16 21:30, David Holmes wrote: >>>> >>>> >>>> On 31/10/2016 7:32 PM, Andrew Haley wrote: >>>>> On 30/10/16 21:26, David Holmes wrote: >>>>>> On 31/10/2016 4:36 AM, Andrew Haley wrote: >>>> >>>> // All of the atomic operations that imply a read-modify-write >>>> // action guarantee a two-way memory barrier across that >>>> // operation. 
Historically these semantics reflect the strength >>>> // of atomic operations that are provided on SPARC/X86. We assume >>>> // that strength is necessary unless we can prove that a weaker >>>> // form is sufficiently safe. >>> >>> Mmmm, but that doesn't say anything about a CAS that fails. But fair >>> enough, I accept your interpretation. >> >> Granted the above was not written with load-linked/store-conditional >> style implementations in mind; and the historical behaviour on sparc >> and x86 is not affected by failure of the cas, so it isn't called >> out. I should fix that. >> >>>> But there is some contention as to whether the actual implementations >>>> obey this completely. >>> >>> Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is specified >>> as a >>> >>> "full barrier". That is, no memory operand is moved across the >>> operation, either forward or backward. Further, instructions are >>> issued as necessary to prevent the processor from speculating loads >>> across the operation and from queuing stores after the operation. >>> >>> ... which reads the same as the language you quoted above, but looking >>> at the assembly code I'm sure that it's really no stronger than a seq >>> cst load followed by a seq cst store. >> >> Are you saying that a seq_cst load followed by a seq_cst store is weaker >> than a full barrier? > > Probably. I'm saying that when someone says "full barrier" they > aren't exactly clear what that means. I know what sequential > consistency is, but not "full barrier" because it's used > inconsistently. Agreed it is not a term that has a common definition - it may just relate to no-reorderings of any loads or stores, or it may also imply visibility guarantees. Though while I know what "sequential consistency" is I do not know what exactly it means to implement an operation with seq_cst semantics. 
> For example, the above says that no memory operand is moved across the > barrier, but if you have > > store_relaxed(a) > load_seq_cst(b) > store_seq_cst(c) > load_relaxed(d) > > there's nothing to prevent > > load_seq_cst(b) > load_relaxed(d) > store_relaxed(a) > store_seq_cst(c) > > It is true that neither store a nor load d have moved across this > operation, but they have exchanged places. As far as GCC is concerned > this is a correct implementation, and it does meet the requirement of > sequential consistency as defined in the C++ memory model. It does? Then it emphasises what I just said about not knowing what it means to implement an operation with seq_cst semantics. I would have expected full ordering of all loads and stores to get "sequential consistency". >>> I guess maybe I could give up fighting this and implement all AArch64 >>> CAS sequences as >>> >>> CAS(seq_cst); full fence >>> >>> or, even more extremely, >>> >>> full fence; CAS(relaxed); full fence >>> >>> but it all seems unreasonably heavyweight. >> >> Indeed. A couple of issues here. If you are thinking in terms of >> orderAccess::fence() then it needs to guarantee visibility as well as >> ordering - see this bug I just filed: >> >> https://bugs.openjdk.java.net/browse/JDK-8169193 > > Ouch. Yes, I agree that something needs fixing. That comment: > > // Use release_store_fence to update values like the thread state, > // where we don't want the current thread to continue until all our > // prior memory accesses (including the new thread state) are visible > // to other threads. > > ... seems very unhelpful, at least because a release fence (using > conventional terminology) does not have that property: a release > fence is only LoadStore|StoreStore. In release_store_fence the release and fence are distinct memory ordering components. It is not a store combined with a "release fence" but a store between a "release" and a "fence". 
And critically in hotspot that "fence" must have visibility guarantees to ensure correctness of Dekker-duality algorithms. Note the equivalence of release() with LoadStore|StoreStore is a definition within orderAccess.hpp, it is not a general equivalence. >> So would be heavier than a "full barrier" that simply combined all >> four storeload membar variants. Though of course the actual >> implementation on a given architecture may be just as >> heavyweight. And of course the Atomic op must guarantee visibility >> of the successful store (else the atomicity aspect would not be >> present). > > I don't think that's exactly right. As I understand the ARMv8 memory > model, it's possible to have a CAS which imposes no memory ordering or > visibility at all: it's a relaxed load and a relaxed store. Other > threads can still see stale values of the store unless they attempt a > CAS. This is really good: it's exactly what you want for some shared > counters. Okay - yes - a naked "relaxed" load need not see the result of a recent successful "CAS". But the load-with-reservation within a "CAS" must see such a store I would think, to ensure things work correctly - though I suppose that could also be handled at the store-with-reservation point. Which suggests that a CAS with a "full two-way memory barrier" on ARMv8 does indeed need a fairly heavy pre- and post-op memory barrier (which makes me wonder whether the reservation using ld.acq and st.rel can be efficiently strengthened as needed, or whether plain ld and st would be more efficient within the overall sequence). >> That aside we do not need two "fences" surrounding the atomic >> op. 
For platforms where the atomic op is a single instruction which >> combines load and store then conceptually all we need is: >> >> loadload|storeload; op; storeload|storestore >> >> Note this is at odds with the commentary in atomic.hpp which says things >> like: >> >> // add-value-to-dest >> >> I need to check why we settled on the above formulation - I suspect it >> was conservatism. And of course for the cmpxchg it fails to account for >> the fact there may not be a store to order with. Just a note that, for example, SPARC does not require a CAS to succeed, for a subsequent membar to consider the CAS as a load+store. >> >> For load-linked/store-conditional based operations that would expand to >> (assume a retry loop for unrelated store failures): >> >> loadLoad|storeLoad >> temp = ld-linked &val >> cmp temp, expected >> jmp ne >> st-cond &val, newVal >> storeload|storestore >> >> which is fine if we actually store, but if we find the wrong value >> there is no store for those final barriers to sync with. That then >> raises the question: can subsequent loads and stores move into the >> ld-linked/st-cond region? The general context-free answer would be >> yes, but the actual details may be architecture specific and also >> context dependent - ie the subsequent loads/stores may be dependent >> on the CAS succeeding (or on it failing). So without further >> knowledge you would need to use a "full-barrier" after the st-cond. > > On most (all?) architectures a StoreLoad fence is a full barrier, so > this formulation is equivalent to what I was saying anyway. I'm trying to distinguish the desired semantics from any actual implementation mechanism. That fact that, for example, on SPARC and x86, the only explicit barrier needed is storeLoad, so if you have that then you effectively have a "full barrier" because the other three are implicit, is incidental. Cheers, David > Andrew. 
> From jeremymanson at google.com Tue Nov 8 07:32:14 2016 From: jeremymanson at google.com (Jeremy Manson) Date: Mon, 7 Nov 2016 23:32:14 -0800 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: <581C7E740200009400009CA7@gwia.im.jku.at> Message-ID: Fair enough. We sample instead of getting detailed information, but we are generally trying to profile an application running on lots of JVMs, so we have some scale that not everyone does. We wouldn't have been able to live with 7.8% overhead either, though. I bet someone could come up with a reasonable strategy to make MonitorContendedExit work if they were motivated. :) Jeremy On Mon, Nov 7, 2016 at 11:35 AM, Peter Hofer wrote: > Hi Jeremy, > > On 2016-11-04 19:24, Jeremy Manson wrote: > >> Why aren't these extensions to JVMTI, which already has >> MonitorContendedEnter and MonitorContendedEntered events? You could >> just add a MonitorContendedRelease event to cover what you want. >> Then the bulk of the tracking work can be done in JVMTI. >> > > One of our main goals was to make profiling very lightweight so that > there is a chance that the profiler can be used on production systems. In > the HotSpot code, we can record events and maintain state very efficiently. > > I agree that a profiler that uses only JVMTI and extension methods would > be more modular. We actually tried to implement a comparable profiler using > JVMTI. It performs very frequent state transitions to the agent and back, > requires wrapping all references and data structures, and needs tagging to > associate state with objects. Moreover, it cannot efficiently cache stack > traces without always resolving inlined methods from the compiler's debug > information (which makes a lot of difference in our HotSpot-internal > profiler). The JVMTI-based profiler turned out to be rather inefficient, > which is why we didn't pursue this approach further. 
> > As Alex pointed out, there used to be a MonitorContendedExit event in > early versions of JVMTI. It was eliminated because the context of a monitor > exit is not really safe for invoking a JVMTI callback, which is another > issue that would need to be addressed first. > > Cheers, > Peter > > At Google, we've built on these JVMTI primitives quite successfully. >> The only internal enhancements we've had to make is to make them support >> j.u.c locks. >> >> (We've also done the hotspot lock contention work, but it has been less >> directly useful.) >> >> Jeremy >> >> On Fri, Nov 4, 2016 at 4:26 AM, David Gnedt > > wrote: >> >> Hello, >> >> I am one of the authors of this work and I gladly support this >> contribution. >> >> Best regards, >> David Gnedt >> >> >>> Peter Hofer > >> >> 04.11.16 11.01 Uhr >>> >> Hello everyone, >> >> we are researchers at the University of Linz and have worked on a lock >> contention profiler that is built into HotSpot. We would like to >> contribute this work to the OpenJDK community. >> >> Our profiler records an event when a thread fails to acquire a >> contended >> >> lock and also when a thread releases a contended lock. It further >> efficiently records the stack traces where these events occur. We >> devised a versatile visualization tool that analyzes the recorded >> events >> >> and determines when and where threads _cause_ contention by holding a >> contended lock. The visualization tool can show the contention by >> stack >> trace, by lock, by lock class, by thread, and by any combination of >> those aspects. >> >> We described our profiler in more detail in a research paper at ICPE >> 2016. [1] In our evaluation, we found that the overhead is typically >> below 10% for common multi-threaded Java benchmarks. 
Please find a >> free >> download of the paper on our website: >> > http://mevss.jku.at/lct/ >> >> I contribute this work on behalf of Dynatrace Austria (the sponsor of >> this research), my colleagues David Gnedt and Andreas Schoergenhumer, >> and myself. The necessary OCAs have already been submitted. >> >> We provide two patches: >> >> Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we >> described and evaluated in our paper, plus minor improvements. It >> records events for Java intrinsic locks (monitors) and for >> java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). >> We support only Linux on 64-bit x86 hardware. >> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_ >> jdk8u102b14/ >> > jdk8u102b14/> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ >> >> >> Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal >> native locks only. We consider this to be useful for HotSpot >> developers >> to find locking bottlenecks in HotSpot itself. We tested this patch >> only >> >> on Linux on 64-bit x86 hardware, but it should require few changes for >> other platforms. >> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativeloc >> ksonly_hotspot_jdk9%2b140/ >> > cksonly_hotspot_jdk9%2b140/> >> > >> http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativeloc >> ksonly_jdk_jdk-9%2b140/ >> > cksonly_jdk_jdk-9%2b140/> >> >> With both patches, the profiler is enabled with >> -XX:+EnableEventTracing. >> >> By default, an uncompressed event trace is written to file >> "output.trc". >> >> More detailed usage information and a download of the corresponding >> visualization tool is available on our website, >> http://mevss.jku.at/lct/. 
>> >> Kind regards, >> Peter Hofer >> >> >> -- >> Peter Hofer >> Christian Doppler Laboratory on Monitoring and Evolution of >> Very-Large-Scale Software Systems / Institute for System Software >> University of Linz >> >> >> [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter >> Moessenboeck. Efficient Tracing and Versatile Analysis of Lock >> Contention in Java Applications on the Virtual Machine Level. >> Proceedings of the 7th ACM/SPEC International Conference on >> Performance >> Engineering (ICPE'16), Delft, Netherlands, 2016. >> >> >> >> From 1072213404 at qq.com Tue Nov 8 08:48:44 2016 From: 1072213404 at qq.com (=?gb18030?B?tvHB6cbvyr8=?=) Date: Tue, 8 Nov 2016 16:48:44 +0800 Subject: help understanding lock instruction in OrderAccess::fence() Message-ID: hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp inline void OrderAccess::fence() { if (os::is_MP()) { // always use locked addl since mfence is sometimes expensive #ifdef AMD64 __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory"); #else __asm__ volatile ("lock; addl $0,0(%%esp)" : : : "cc", "memory"); #endif } } my classmates think that the code "addl $0,0(%%esp)" has some specific effect, because esp points to the top of the stack. Is that true? Or is "addl $0,0(%%esp)" just a no-op, needing at least one operation after the lock prefix, since otherwise the lock instruction would produce an error? Thank you ! Arron From david.holmes at oracle.com Tue Nov 8 09:35:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Nov 2016 19:35:24 +1000 Subject: help understanding lock instruction in OrderAccess::fence() In-Reply-To: References: Message-ID: On 8/11/2016 6:48 PM, 恶灵骑士 
wrote: > hotspot/src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp > inline void OrderAccess::fence() { > if (os::is_MP()) { > // always use locked addl since mfence is sometimes expensive > #ifdef AMD64 > __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory"); > #else > __asm__ volatile ("lock; addl $0,0(%%esp)" : : : "cc", "memory"); > #endif > } > } > > > my classmates think that the code "addl $0,0(%%esp)" has some specific effect, > because esp points to the top of the stack. > Is that true? > Or is "addl $0,0(%%esp)" just a no-op, It is a no-op - adding zero to a value. > needing at least one operation after the lock prefix, since otherwise the lock instruction would produce an error? "lock" is not an instruction, it is an instruction prefix, so has to go before some other instruction. The "lock" prefix acts as a storeload** barrier for x86 and as per the comment can be cheaper than an explicit mfence instruction. **All the other barriers are implicit in the x86 memory model, so you only need to add a storeload barrier to get the necessary fence semantics. David > > Thank you ! > > Arron > From thomas.schatzl at oracle.com Tue Nov 8 10:01:53 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 08 Nov 2016 11:01:53 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> References: <1478516225.2646.19.camel@oracle.com> <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> Message-ID: <1478599313.2689.44.camel@oracle.com> Hi Kim, On Mon, 2016-11-07 at 14:07 -0500, Kim Barrett wrote: > > > > On Nov 7, 2016, at 1:36 PM, Kim Barrett > > wrote: > > > > > > > > > > > [...] > > > One could replace "dirtied" by something less specific here to > > > make it > > > right again. > > Good idea. How about this rewording (using "set to a value") 
> > // The region could be young. Cards for young regions are set to a > // value that allows the post-barrier to filter them out. However, > // that card setting is performed concurrently. A write to a young > // object could occur before the card has been set, slipping past > // the filter. > Oops, no, that isn't right. (It's been a couple of weeks since I > looked at this, and forgot part of the problem.) > > Part of what's wrong with the comment is that we can no longer get to > that point with a young region. A young region's cards will be either > g1_young_gen or clean, never dirty. Hence the filtering out of non- Why? I think the reason for this comment has been that the following could happen: A: allocate new young region X, allocate object, storestore, stops at the beginning of the dirty_young_block() method B: allocate new object B in X, set B.y = something-outside, making the card "Dirty" since thread A did not actually start doing dirty_young_block() yet. Refinement: scans the card; since R does not seem to synchronize with A either, you may get a "dirty" card in a young (or free, depending on whether the setting of the region flag in X has already been observed - but it must be either one) region here in this case? A: does the work in dirty_young_block() (The previous is_young() check has indeed been wrong, and is_old_or_humongous() is better) > dirty cards a few lines before this comment will have already > discarded a young card before we reach the test this comment is > discussing. So the whole premise of the comment in question, that > the region could be young, is false. I think the comment is good after all. I would even emphasize the act of setting it to "g1_young_gen" by writing something like: // The region could be young. Cards for young regions are set to // "g1_young_gen" so the post-barrier will filter them out. However, // that dirtying is performed concurrently. A write to a young object 
// could occur in the same region before the cards have been set to // that value, slipping past the filter. Because then, if somebody removes g1_young_gen, he will hopefully find this place again by searching for "g1_young_gen" and think about this situation again. (in theory :)) Thanks, Thomas From aph at redhat.com Tue Nov 8 10:18:09 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 8 Nov 2016 10:18:09 +0000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> Message-ID: <5be88ed4-54ad-26e8-14ae-d5e402141287@redhat.com> On 08/11/16 01:11, David Holmes wrote: > On 6/11/2016 8:54 PM, Andrew Haley wrote: >> On 05/11/16 18:43, David Holmes wrote: >>> Forking new discussion from: >>> >>> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >>> >>> On 1/11/2016 7:44 PM, Andrew Haley wrote: >>>> On 31/10/16 21:30, David Holmes wrote: > >> if you have >> >> store_relaxed(a) >> load_seq_cst(b) >> store_seq_cst(c) >> load_relaxed(d) >> >> there's nothing to prevent >> >> load_seq_cst(b) >> load_relaxed(d) >> store_relaxed(a) >> store_seq_cst(c) >> >> It is true that neither store a nor load d have moved across this >> operation, but they have exchanged places. As far as GCC is concerned >> this is a correct implementation, and it does meet the requirement of >> sequential consistency as defined in the C++ memory model. > > It does? Then it emphasises what I just said about not knowing what it > means to implement an operation with seq_cst semantics. I take your point, but seq_cst is not a real mystery, it's just a matter of looking it up: it's all defined in the C++11 standard. And it's not significantly different from Java volatile. 
> I would have expected full ordering of all loads and stores to get > "sequential consistency". Why? There are only two sequentially-consistent loads and stores in that block of code. Of course those two have a total order. But you surely wouldn't expect a sequentially-consistent store to be ordered with respect to a relaxed load. >> Ouch. Yes, I agree that something needs fixing. That comment: >> >> // Use release_store_fence to update values like the thread state, >> // where we don't want the current thread to continue until all our >> // prior memory accesses (including the new thread state) are visible >> // to other threads. >> >> ... seems very unhelpful, at least because a release fence (using >> conventional terminology) does not have that property: a release >> fence is only LoadStore|StoreStore. > > In release_store_fence the release and fence are distinct memory > ordering components. It is not a store combined with a "release > fence" but a store between a "release" and a "fence". And critically > in hotspot that "fence" must have visibility guarantees to ensure > correctness of Dekker-duality algorithms. Ah, that is a slightly misleading name. The "_fence" at the end of the name is really a StoreLoad fence, got it. I noticed that once before, but I'd forgotten. I guess what's intended here is a sequentially-consistent store. > Note the equivalence of release() with LoadStore|StoreStore is a > definition within orderAccess.hpp, it is not a general equivalence. OK. It would certainly be nice if HotSpot could move to using standard terminology. Then, in time, we could just use the C++11 atomics. Andrew. 
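[Editorial sketch] The idiom David describes above, release(); store; fence();, can be approximated with the standard C++11 atomics Andrew mentions. The following is a hedged sketch only, not the HotSpot implementation: thread_state is an invented stand-in, and a real release_store_fence may be implemented more efficiently on a given platform.

```cpp
#include <atomic>

std::atomic<int> thread_state{0};  // illustrative stand-in

// Sketch of release(); store; fence();  No prior access may be
// reordered after the store, and the thread does not continue past
// the trailing full fence until the store is visible to other
// threads (the property needed for Dekker-style algorithms).
void release_store_fence_sketch(int new_state) {
  std::atomic_thread_fence(std::memory_order_release);   // release()
  thread_state.store(new_state, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);   // fence()
}
```

On x86 the trailing seq_cst fence typically compiles to a locked instruction (compare OrderAccess::fence() discussed later in this thread), while the release fence is a compiler-only barrier.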
From david.holmes at oracle.com Tue Nov 8 10:35:17 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 8 Nov 2016 20:35:17 +1000 Subject: Memory ordering properties of Atomic::r-m-w operations In-Reply-To: <5be88ed4-54ad-26e8-14ae-d5e402141287@redhat.com> References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com> <6ee4f1c6-f638-c5b9-7475-8fb6aeabf20b@oracle.com> <14c2eff4-4f90-caa0-17a7-835e6f1f1167@oracle.com> <1cbb094f-b29b-c6b3-1e50-bed21b140fcb@oracle.com> <333a37b3-7686-63d6-1852-ae963954bdd7@redhat.com> <5be88ed4-54ad-26e8-14ae-d5e402141287@redhat.com> Message-ID: <59d08376-fb41-cf39-3b1f-01f826e8d9e7@oracle.com> On 8/11/2016 8:18 PM, Andrew Haley wrote: > On 08/11/16 01:11, David Holmes wrote: >> On 6/11/2016 8:54 PM, Andrew Haley wrote: >>> On 05/11/16 18:43, David Holmes wrote: >>>> Forking new discussion from: >>>> >>>> RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64 >>>> >>>> On 1/11/2016 7:44 PM, Andrew Haley wrote: >>>>> On 31/10/16 21:30, David Holmes wrote: >> >>> if you have >>> >>> store_relaxed(a) >>> load_seq_cst(b) >>> store_seq_cst(c) >>> load_relaxed(d) >>> >>> there's nothing to prevent >>> >>> load_seq_cst(b) >>> load_relaxed(d) >>> store_relaxed(a) >>> store_seq_cst(c) >>> >>> It is true that neither store a nor load d have moved across this >>> operation, but they have exchanged places. As far as GCC is concerned >>> this is a correct implementation, and it does meet the requirement of >>> sequential consistency as defined in the C++ memory model. >> >> It does? Then it emphasises what I just said about not knowing what it >> means to implement an operation with seq_cst semantics. > > I take your point, but seq_cst is not a real mystery, it's just a > matter of looking it up: it's all defined in the C++11 standard. And > it's not significantly different from Java volatile. I have looked at it of course, but still find it rather "mysterious". 
>> I would have expected full ordering of all loads and stores to get >> "sequential consistency". > > Why? There are only two sequentially-consistent loads and stores in > that block of code. Of course those two have a total order. But you > surely wouldn't expect a sequentially-consistent store to be ordered > with respect to a relaxed load. I guess I think of sequentially consistent as a global property of a system, not relative to just atomic operations. >>> Ouch. Yes, I agree that something needs fixing. That comment: >>> >>> // Use release_store_fence to update values like the thread state, >>> // where we don't want the current thread to continue until all our >>> // prior memory accesses (including the new thread state) are visible >>> // to other threads. >>> >>> ... seems very unhelpful, at least because a release fence (using >>> conventional terminology) does not have that property: a release >>> fence is only LoadStore|StoreStore. >> >> In release_store_fence the release and fence are distinct memory >> ordering components. It is not a store combined with a "release >> fence" but a store between a "release" and a "fence". And critically >> in hotspot that "fence" must have visibility guarantees to ensure >> correctness of Dekker-duality algorithms. > > Ah, that is a slightly misleading name. The "_fence" at the end of > the name is really a StoreLoad fence, got it. I noticed that once > before, but I'd forgotten. I guess what's intended here is a > sequentially-consistent store. It is intended to be: release(); store; fence(); but might be implementable in a more efficient manner when combined in a single function. I have a problem with referring to a "storeload fence". storeload is one form of memory barrier - a full fence represents all four forms to me. Terminology is a disaster in this field unfortunately - one architecture's barrier is another's fence. 
:( >> Note the equivalence of release() with LoadStore|StoreStore is a >> definition within orderAccess.hpp, it is not a general equivalence. > > OK. It would certainly be nice if HotSpot could move to using > standard terminology. Then, in time, we could just use the C++11 > atomics. The stand-alone (unbound) release() and acquire() are defined as they are to allow them to be associated with a subsequent store, or previous load, in cases where we can not access the variable directly to apply a release_store, or load_acquire operation. This is somewhat independent of the atomic API. David ----- > Andrew. > From thomas.schatzl at oracle.com Tue Nov 8 12:52:29 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 08 Nov 2016 13:52:29 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: References: Message-ID: <1478609549.2689.71.camel@oracle.com> Hi Kim, On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > Please review this change to address missing memory barriers needed > to ensure ordering between allocation and refinement in G1. > > Rather than simply adding the "obvious" barriers, this change > modifies refinement to not need any additional memory barriers. > > First, the heap region type predicate used to decide whether the card > should be processed has been changed. Previously, !is_young was > used, but that isn't really the state of interest. Rather, processing > should only occur if the region is old or humongous, not if young or > *free*. The free case (and so other cases that should be filtered > out) can happen if the card is stale, and there are several ways to > get stale cards here. So added is_old_or_humongous type predicate > and use it for filtering based on the region's type.
> > Second, moved to refine_card the card region trimming to the heap > region's allocated space, and the associated filtering, to be > co-located with the type-based filtering. An empty trimmed card > region is another indication of a stale card. > > We should filter out cards that are !is_old_or_humongous or when the > card's region is beyond the allocated space for the heap > region. Only if the card is old/humongous and covers allocated space > should we proceed with processing, and then only for the subset of > the card covering allocated space. > > Moved the card cleaning to refine_card. Having the cleaning in the > iterator seemed misplaced. Placing it in refine_card, after the card > trimming and type-based filtering also allows the fence needed for > the cleaning to serve double duty; in addition to ensuring processing > occurs after card cleaning (the original purpose), it now also > ensures processing follows the filtering. And this ensures the > necessary synchronization with allocation; we can't safely examine > some parts of the heap region object or the memory designated by the > card until after the card has been trimmed and filtered. Part of > this involved changing the storeload to a full fence, though for > supported platforms that makes no difference in the underlying > implementation. > > (This change to card cleaning would benefit from a store_fence > operation on some platforms, but that operation was phased out, and a > release_store_fence is stronger (and more expensive) than needed on > some platforms.) It would also be beneficial to make the fence conditional on is_gc_active(), but that may be another change as we previously did the storeload unconditionally too. > There is still a situation where processing can fail, namely an > in-progress humongous allocation that hasn't set the klass yet. We > continue to handle that as before. - I am not completely sure about whether this case is handled correctly.
I am mostly concerned that the information used before the fence may not be correct, but the checks expect it to be valid. Probably I am overlooking something critical somewhere. A: allocates humongous object C, sets region type, issues storestore, sets top pointers, writes the object, and then sets C.y = x to mark a card Refinement: gets card (and assuming we have no further synchronization around which is not true, e.g. the enqueuing) 592 if (!r->is_old_or_humongous()) { assume refinement thread has not received the "type" correctly yet, so must be Free. So the card will be filtered out incorrectly? That is contradictory to what I said in the other email about the comment discussion, but I only thoroughly looked at the comment aspect there. :) I think at this point in general we can't do anything but !is_young(), as we can't ignore cards in "Free" regions - they may be cards for humongous ones where the thread did not receive top and/or the type yet? - assuming this works due to other synchronization, I have another similar concern with later trimming: 653 } else { 654 // Non-humongous objects are only allocated in the old-gen during 655 // GC, so if region is old then top is stable. Humongous object 656 // allocation sets top last; if top has not yet been set, then 657 // we'll end up with an empty intersection. 658 scan_limit = r->top(); 659 } 660 if (scan_limit <= start) { 661 // If the trimmed region is empty, the card must be stale. 662 return false; 663 } Assume that the current value of top for a humongous object has not been seen yet by the thread and we end up with an empty intersection. Now, didn't we potentially just drop a card to a humongous object in waiting to scan but did not re-enqueue it? (And we did not clear the card table value either?) We may do it after the fence though I think. Maybe I am completely wrong though, what do you think?
- another stale comment: 636 // a card beyond the heap. This is not safe without a perm 637 // gen at the upper end of the heap. Could everything after "without" be removed in this sentence? We haven't had a "perm gen" for a long time... Thanks, Thomas From markus.gronlund at oracle.com Tue Nov 8 14:14:12 2016 From: markus.gronlund at oracle.com (Markus Gronlund) Date: Tue, 8 Nov 2016 06:14:12 -0800 (PST) Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <2c9077d1-fa5b-42f6-a805-0eb343b5b22e@default> Hi Peter, Thanks for your offer to contribute this work to the OpenJDK. You will most likely need to follow the JDK Enhancement Proposal (JEP) process for this work: Please see the following link for the JEP process description: http://cr.openjdk.java.net/~mr/jep/jep-2.0-02.html Thanks Markus -----Original Message----- From: Peter Hofer [mailto:peter.hofer at jku.at] Sent: den 4 november 2016 11:01 To: hotspot-dev at openjdk.java.net Cc: David Gnedt; Andreas Schoergenhumer Subject: Contribution: Lock Contention Profiler for HotSpot Hello everyone, we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community. Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur. We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks.
Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1. A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14 > / http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hot > spot_jdk9%2b140/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk > _jdk-9%2b140/ With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc". More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/. Kind regards, Peter Hofer -- Peter Hofer Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software University of Linz [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. 
Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE'16), Delft, Netherlands, 2016. From andreas.schoergenhumer at jku.at Tue Nov 8 08:27:37 2016 From: andreas.schoergenhumer at jku.at (=?UTF-8?Q?Andreas_Sch=c3=b6rgenhumer?=) Date: Tue, 8 Nov 2016 09:27:37 +0100 Subject: Fwd: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: References: Message-ID: <551c23dc-c9d0-8a22-070c-c3668ad6d63d@jku.at> Hi, I am one of the authors of this work and I gladly support this contribution. Kind regards, Andreas Schörgenhumer -------- Forwarded Message -------- Subject: Contribution: Lock Contention Profiler for HotSpot Date: Fri, 4 Nov 2016 11:00:38 +0100 From: Peter Hofer > Hello everyone, we are researchers at the University of Linz and have worked on a lock contention profiler that is built into HotSpot. We would like to contribute this work to the OpenJDK community. Our profiler records an event when a thread fails to acquire a contended lock and also when a thread releases a contended lock. It further efficiently records the stack traces where these events occur. We devised a versatile visualization tool that analyzes the recorded events and determines when and where threads _cause_ contention by holding a contended lock. The visualization tool can show the contention by stack trace, by lock, by lock class, by thread, and by any combination of those aspects. We described our profiler in more detail in a research paper at ICPE 2016. [1] In our evaluation, we found that the overhead is typically below 10% for common multi-threaded Java benchmarks. Please find a free download of the paper on our website: > http://mevss.jku.at/lct/ I contribute this work on behalf of Dynatrace Austria (the sponsor of this research), my colleagues David Gnedt and Andreas Schoergenhumer, and myself. The necessary OCAs have already been submitted. We provide two patches: Patch 1.
A patch for OpenJDK 8u102-b14 with the profiler that we described and evaluated in our paper, plus minor improvements. It records events for Java intrinsic locks (monitors) and for java.util.concurrent locks (ReentrantLock and ReentrantReadWriteLock). We support only Linux on 64-bit x86 hardware. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_hotspot_jdk8u102b14/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_jdk_jdk8u102b14/ Patch 2. A patch for OpenJDK 9+140 with a profiler for VM-internal native locks only. We consider this to be useful for HotSpot developers to find locking bottlenecks in HotSpot itself. We tested this patch only on Linux on 64-bit x86 hardware, but it should require few changes for other platforms. > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_hotspot_jdk9%2b140/ > http://cr.openjdk.java.net/~tschatzl/phofer/webrev_nativelocksonly_jdk_jdk-9%2b140/ With both patches, the profiler is enabled with -XX:+EnableEventTracing. By default, an uncompressed event trace is written to file "output.trc". More detailed usage information and a download of the corresponding visualization tool is available on our website, http://mevss.jku.at/lct/. Kind regards, Peter Hofer -- Peter Hofer Christian Doppler Laboratory on Monitoring and Evolution of Very-Large-Scale Software Systems / Institute for System Software University of Linz [1] Peter Hofer, David Gnedt, Andreas Schoergenhumer, Hanspeter Moessenboeck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE'16), Delft, Netherlands, 2016.
These seem to be caused by void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { if (_cb != NULL) { if (Interpreter::contains(pc())) { Method* m = this->interpreter_frame_method(); if (m != NULL) { m->name_and_sig_as_C_string(buf, buflen); st->print("j %s", buf); st->print("+%d", this->interpreter_frame_bci()); ModuleEntry* module = m->method_holder()->module(); if (module->is_named()) { module->name()->as_C_string(buf, buflen); st->print(" %s", buf); module->version()->as_C_string(buf, buflen); where module->version() returns NULL. Is this expected? Andrew. From Alan.Bateman at oracle.com Wed Nov 9 11:03:41 2016 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 9 Nov 2016 12:03:41 +0100 Subject: Segfaults in error traces caused by modules In-Reply-To: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> References: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> Message-ID: On 09/11/2016 11:42, Andrew Haley wrote: > I'm seeing repeated segfaults in error traces. These seem to be caused by > > void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { > if (_cb != NULL) { > if (Interpreter::contains(pc())) { > Method* m = this->interpreter_frame_method(); > if (m != NULL) { > m->name_and_sig_as_C_string(buf, buflen); > st->print("j %s", buf); > st->print("+%d", this->interpreter_frame_bci()); > ModuleEntry* module = m->method_holder()->module(); > if (module->is_named()) { > module->name()->as_C_string(buf, buflen); > st->print(" %s", buf); > module->version()->as_C_string(buf, buflen); > > where module->version() returns NULL. > > The version is optional and so modules may be defined to the VM with a version string of NULL. It may be that this code has only been tested with images builds, where the platform modules have version ("9" or "9-internal" ...). However with an exploded build then the platform modules don't have a version string and I assume this is where you hit this. 
-Alan From aph at redhat.com Wed Nov 9 11:15:51 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 9 Nov 2016 11:15:51 +0000 Subject: Segfaults in error traces caused by modules In-Reply-To: References: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> Message-ID: On 09/11/16 11:03, Alan Bateman wrote: > On 09/11/2016 11:42, Andrew Haley wrote: > >> I'm seeing repeated segfaults in error traces. These seem to be caused by >> >> void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { >> if (_cb != NULL) { >> if (Interpreter::contains(pc())) { >> Method* m = this->interpreter_frame_method(); >> if (m != NULL) { >> m->name_and_sig_as_C_string(buf, buflen); >> st->print("j %s", buf); >> st->print("+%d", this->interpreter_frame_bci()); >> ModuleEntry* module = m->method_holder()->module(); >> if (module->is_named()) { >> module->name()->as_C_string(buf, buflen); >> st->print(" %s", buf); >> module->version()->as_C_string(buf, buflen); >> >> where module->version() returns NULL. >> > The version is optional and so modules may be defined to the VM with a > version string of NULL. It may be that this code has only been tested > with images builds, where the platform modules have version ("9" or > "9-internal" ...). However with an exploded build then the platform > modules don't have a version string and I assume this is where you hit this. Yes. OK, so it's a bug. Thanks. Andrew. From shafi.s.ahmad at oracle.com Thu Nov 10 06:42:02 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 9 Nov 2016 22:42:02 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses Message-ID: <77e0b348-2b95-4097-ba95-906257d8893c@default> Hi, Please review the backport of following dependent backports. jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 Conflict in file src/share/vm/opto/memnode.cpp due to 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. 
Manual merge is not done as the corresponding code is not there in jdk8u-dev. Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual merge is done. webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 Conflict in file src/share/vm/opto/library_call.cpp due to 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 [JDK-8140309]. Manual merge is not done as the corresponding code is not there in jdk8u-dev. webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 Clean merge webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 Conflict in file src/share/vm/opto/library_call.cpp due to 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 [JDK-8160360] - Resolved 2. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 [JDK-8148146] - Manual merge is not done as the corresponding code is not there in jdk8u-dev. webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 Testing: jprt and jtreg Regards, Shafi > -----Original Message----- > From: Shafi Ahmad > Sent: Thursday, October 20, 2016 10:08 AM > To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Thanks Vladimir. > > I will create dependent backport of > 1. https://bugs.openjdk.java.net/browse/JDK-8136473 > 2. https://bugs.openjdk.java.net/browse/JDK-8155781 > 3. 
https://bugs.openjdk.java.net/browse/JDK-8162101 > > Regards, > Shafi > > > -----Original Message----- > > From: Vladimir Kozlov > > Sent: Wednesday, October 19, 2016 8:27 AM > > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > > produces mismatched unsafe accesses > > > > Hi Shafi, > > > > You should also consider backporting following related fixes: > > > > https://bugs.openjdk.java.net/browse/JDK-8155781 > > https://bugs.openjdk.java.net/browse/JDK-8162101 > > > > Otherwise you may hit asserts added by 8134918 changes. > > > > Thanks, > > Vladimir > > > > On 10/17/16 3:12 AM, Shafi Ahmad wrote: > > > Hi All, > > > > > > Please review the backport of JDK-8134918 - C2: Type speculation > > > produces > > mismatched unsafe accesses to jdk8u-dev. > > > > > > Please note that backport is not clean and the conflict is due to: > > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > > > 65 > > > > > > Getting debug build failure because of: > > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > > > 55 > > > > > > The above changes are done under bug# 'JDK-8136473: failed: no > > mismatched stores, except on raw memory: StoreB StoreI' which is not > > back ported to jdk8u and the current backport is on top of above change. > > > > > > Please note that I am not sure if there is any dependency between > > > these > > two changesets. 
> > > > > > open webrev: > http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > > > > testing: Passes JPRT, jtreg not completed > > > > > > Regards, > > > Shafi > > > From shafi.s.ahmad at oracle.com Thu Nov 10 07:10:20 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 9 Nov 2016 23:10:20 -0800 (PST) Subject: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 ciObjectFactory::create_new_metadata In-Reply-To: References: <2e1de7f0-cc65-47f7-9f97-cb0e56dacfe1@default> Message-ID: Hi All, May I get the second review for this backport. Regards, Shafi > -----Original Message----- > From: Shafi Ahmad > Sent: Tuesday, October 25, 2016 9:09 AM > To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > Cc: Vladimir Ivanov > Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 > ciObjectFactory::create_new_metadata > > May I get the second review for this backport. > > Regards, > Shafi > > > -----Original Message----- > > From: Shafi Ahmad > > Sent: Thursday, October 20, 2016 9:55 AM > > To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > > Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with > > jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > > > > Thank you Vladimir for the review. > > > > Please find the updated webrev link. > > http://cr.openjdk.java.net/~shshahma/8134389/webrev.01/ > > > > All, > > > > May I get 2nd review for this. 
> > > > Regards, > > Shafi > > > > > -----Original Message----- > > > From: Vladimir Kozlov > > > Sent: Wednesday, October 19, 2016 10:14 PM > > > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > > > Cc: Vladimir Ivanov; Jamsheed C M > > > Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with > > > jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > > > > > > In ciMethod.hpp you duplicated comment line: > > > > > > + // Given a certain calling environment, find the monomorphic > > > + target > > > // Given a certain calling environment, find the monomorphic > > > target > > > > > > Otherwise looks good. > > > > > > Thanks, > > > Vladimir K > > > > > > On 10/19/16 12:53 AM, Shafi Ahmad wrote: > > > > Hi All, > > > > > > > > Please review the backport of 'JDK-8134389: Crash in HotSpot with > > > jvm.dll+0x42b48 ciObjectFactory::create_new_metadata' to jdk8u-dev. > > > > > > > > Please note that backport is not clean as I was getting build failure due > to: > > > > Formal parameter 'ignore_return' in method > > > > GraphBuilder::method_return > > > is added in the fix of https://bugs.openjdk.java.net/browse/JDK- > 8164122. > > > > The current code change is done on top of aforesaid bug fix and > > > > this formal > > > parameter is referenced in this code change. > > > > * if (x != NULL && !ignore_return) { * > > > > > > > > Author of this code change suggested me, we can safely remove this > > > addition conditional expression ' && !ignore_return'. 
> > > > > > > > open webrev: > > http://cr.openjdk.java.net/~shshahma/8134389/webrev.00/ > > > > jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8134389 > > > > jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs- > > > comp/hotspot/rev/4191b33b3629 > > > > > > > > testing: Passes JPRT, jtreg on Linux [amd64] and newly added test > > > > case > > > > > > > > Regards, > > > > Shafi > > > > From harold.seigel at oracle.com Thu Nov 10 14:54:30 2016 From: harold.seigel at oracle.com (harold seigel) Date: Thu, 10 Nov 2016 09:54:30 -0500 Subject: Segfaults in error traces caused by modules In-Reply-To: References: <638c6cea-bde6-6b06-4c92-aa859671ba89@redhat.com> Message-ID: <66f938eb-b319-98dd-97a1-2ffef7d58d18@oracle.com> Thanks for letting us know about this. I entered https://bugs.openjdk.java.net/browse/JDK-8169551 for this issue. Harold On 11/9/2016 6:15 AM, Andrew Haley wrote: > On 09/11/16 11:03, Alan Bateman wrote: >> On 09/11/2016 11:42, Andrew Haley wrote: >> >>> I'm seeing repeated segfaults in error traces. These seem to be caused by >>> >>> void frame::print_on_error(outputStream* st, char* buf, int buflen, bool verbose) const { >>> if (_cb != NULL) { >>> if (Interpreter::contains(pc())) { >>> Method* m = this->interpreter_frame_method(); >>> if (m != NULL) { >>> m->name_and_sig_as_C_string(buf, buflen); >>> st->print("j %s", buf); >>> st->print("+%d", this->interpreter_frame_bci()); >>> ModuleEntry* module = m->method_holder()->module(); >>> if (module->is_named()) { >>> module->name()->as_C_string(buf, buflen); >>> st->print(" %s", buf); >>> module->version()->as_C_string(buf, buflen); >>> >>> where module->version() returns NULL. >>> >> The version is optional and so modules may be defined to the VM with a >> version string of NULL. It may be that this code has only been tested >> with images builds, where the platform modules have version ("9" or >> "9-internal" ...). 
However with an exploded build then the platform >> modules don't have a version string and I assume this is where you hit this. > Yes. OK, so it's a bug. Thanks. > > Andrew. > From kim.barrett at oracle.com Thu Nov 10 17:42:34 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 10 Nov 2016 12:42:34 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1478599313.2689.44.camel@oracle.com> References: <1478516225.2646.19.camel@oracle.com> <867FA0FF-C699-4CB3-B34A-E754D9C13F15@oracle.com> <1478599313.2689.44.camel@oracle.com> Message-ID: <1D73FB14-127D-4508-A9CA-F9F88F12EACD@oracle.com> > On Nov 8, 2016, at 5:01 AM, Thomas Schatzl wrote: > On Mon, 2016-11-07 at 14:07 -0500, Kim Barrett wrote: >>> >>> On Nov 7, 2016, at 1:36 PM, Kim Barrett >>> wrote: >>>> >>>>> >>>>> > [...] >>>> One could replace "dirtied" by something less specific here to >>>> make it >>>> right again. >>> Good idea. How about this rewording (using "set to a value") >>> >>> // The region could be young. Cards for young regions are set to >>> a >>> // value that allows the post-barrier to filter them >>> out. However, >>> // that card setting is performed concurrently. A write to a >>> young >>> // object could occur before the card has been set, slipping past >>> // the filter. >> Oops, no, that isn't right. (It's been a couple of weeks since I >> looked at this, and forgot part of the problem.) >> >> Part of what's wrong with the comment is that we can no longer get to >> that point with a young region. A young region's cards will be either >> g1_young_gen or clean, never dirty. Hence the filtering out of non- > > Why?
I think the reason for this comment has been that the following > could happen: > > A: allocate new young region X, allocate object, storestore, stops at > the beginning of the dirty_young_block() method > > B: allocate new object B in X, set B.y = something-outside, making the > card "Dirty" since thread A did not actually start doing > dirty_young_block() yet. > > Refinement: scans the card; since R does not seem to synchronize with A > either, you may get a "dirty" card in a young (or free, depending on > whether the setting of the region flag in X has already been observed - > but it must be either one) region here in this case? > > A: does the work in dirty_young_block() > > (The previous is_young() check has indeed been wrong, and > is_old_or_humongous() is better) You are correct. Hopefully I've refreshed my understanding sufficiently that I won't keep making similar mistakes in this discussion. > I think the comment is good after all. I would even emphasize the act > of setting it to "g1_young_gen" by writing something like: > > // The region could be young. Cards for young regions are set to > // "g1_young_gen" so the post-barrier will filter them out. However, > // that dirtying is performed concurrently. A write to a young object > // could occur in the same region before the cards have been set to > // that value, slipping past the filter. > > Because then, if somebody removes g1_young_gen, he will hopefully find > this place again by searching for "g1_young_gen" and think about this > situation again. (in theory :)) Yes, that's better. I'll make that change.
From kim.barrett at oracle.com Thu Nov 10 18:20:41 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 10 Nov 2016 13:20:41 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1478609549.2689.71.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> Message-ID: <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl wrote: > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: >> There is still a situation where processing can fail, namely an >> in-progress humongous allocation that hasn't set the klass yet. We >> continue to handle that as before. > > - I am not completely sure about whether this case is handled > correctly. I am mostly concerned that the information used before the > fence may not be the correct ones, but the checks expect them to be > valid. > > Probably I am overlooking something critical somewhere. > > A: allocates humongous object C, sets region type, issues storestore, > sets top pointers, writes the object, and then sets C.y = x to mark a > card > > Refinement: gets card (and assuming we have no further synchronization > around which is not true, e.g. the enqueuing) > > 592 if (!r->is_old_or_humongous()) { > > assume refinement thread has not received the "type" correctly yet, so > must be Free. So the card will be filtered out incorrectly? > > That is contradictory to what I said in the other email about the > comment discussion, but I only thoroughly looked at the comment aspect > there. :) > > I think at this point in general we can't do anything but !is_young(), > as we can't ignore cards in "Free" regions - they may be for cards for > humongous ones where the thread did not receive top and/or the type > yet? > > - assuming this works due to other synchronization, This is the critical point. There *is* synchronization there. 
In the scenario described, the card that was marked and enqueued after the object was created will pass through some synchronization barriers (full locks, perhaps someday lock-free but with appropriate memory barriers) along the way to refinement. This is the "easy" case. If only it were that simple... The additional checks are to deal with the possibility of stale cards. > [...] I have another > similar concern with later trimming: > > 653 } else { > 654 // Non-humongous objects are only allocated in the old-gen during > 655 // GC, so if region is old then top is stable. Humongous object > 656 // allocation sets top last; if top has not yet been set, then > 657 // we'll end up with an empty intersection. > 658 scan_limit = r->top(); > 659 } > 660 if (scan_limit <= start) { > 661 // If the trimmed region is empty, the card must be stale. > 662 return false; > 663 } > > Assume that the current value of top for a humongous object has not > been seen yet by the thread and we end up with an empty intersection. > > Now, didn't we potentially just drop a card to a humongous object in > waiting to scan but did not re-enqueue it? (And we did not clear the > card table value either?) > > We may do it after the fence though I think. > > Maybe I am completely wrong though, what do you think? If we see the old (zero) value of top in conjunction with a humongous region type, it is because this is a stale card. If this were a non-stale card, the synchronization between enqueuing the card and reaching refinement would have ensured we see an up-to-date top (as well as an up-to-date type). Card table entries for a free region are cleaned before the region can be allocated (and there are locks in the allocation path that provide the needed ordering).
Since this is a stale card and regions are allocated with clean card table entries, the dirty card table entry check having passed implies there is another (non-stale and not-yet-processed) card making its way to refinement through the usual channels, including the needed synchronization barriers. > - another stale comment: > > 636 // a card beyond the heap. This is not safe without a perm > 637 // gen at the upper end of the heap. > > Could everything after "without" be removed in this sentence? We > haven't had a "perm gen" for a long time? Yes. I'll make that change. From vladimir.kozlov at oracle.com Thu Nov 10 19:55:45 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 10 Nov 2016 11:55:45 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <77e0b348-2b95-4097-ba95-906257d8893c@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> Message-ID: On 11/9/16 10:42 PM, Shafi Ahmad wrote: > Hi, > > Please review the backport of following dependent backports. > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > Conflict in file src/share/vm/opto/memnode.cpp due to > 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. Manual merge is not done as the corresponding code is not there in jdk8u-dev. > Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual merge is done. > webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ The unaligned unsafe access methods were added in jdk 9 only. In your changes the unaligned argument is always false. You can simplify the changes. 
Also you should base changes on JDK-8140309 (original 8136473 changes were backout by 8140267): On 11/4/15 10:21 PM, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > > Same as 8136473 with only the following change: > > diff --git a/src/share/vm/opto/library_call.cpp b/src/share/vm/opto/library_call.cpp > --- a/src/share/vm/opto/library_call.cpp > +++ b/src/share/vm/opto/library_call.cpp > @@ -2527,7 +2527,7 @@ > // of safe & unsafe memory. > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > > - assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || alias_type->adr_type() == TypeOopPtr::BOTTOM || > alias_type->field() != NULL || alias_type->element() != NULL, "field, array element or unknown"); > bool mismatched = false; > if (alias_type->element() != NULL || alias_type->field() != NULL) { > > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the is_native_ptr case and the case where the unsafe method is called with a null object. > jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > Conflict in file src/share/vm/opto/library_call.cpp due to > 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 [JDK-8140309]. Manual merge is not done as the corresponding code is not there in jdk8u-dev. I explained situation with this line above. > webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ This webrev is not incremental for your 8136473 changes - library_call.cpp has part from 8136473 changes. > jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 > Clean merge > webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ Thanks seems fine. 
> jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 > Conflict in file src/share/vm/opto/library_call.cpp due to > 1. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 [JDK-8160360] - Resolved > 2. http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 [JDK-8148146] - Manual merge is not done as the corresponding code is not there in jdk8u-dev. > webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ This webrev is not incremental in library_call.cpp. Difficult to see this part of changes. Thanks, Vladimir > jdk9 changeset: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > > Testing: jprt and jtreg > > Regards, > Shafi > >> -----Original Message----- >> From: Shafi Ahmad >> Sent: Thursday, October 20, 2016 10:08 AM >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> Thanks Vladimir. >> >> I will create dependent backport of >> 1. https://bugs.openjdk.java.net/browse/JDK-8136473 >> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 >> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 >> >> Regards, >> Shafi >> >>> -----Original Message----- >>> From: Vladimir Kozlov >>> Sent: Wednesday, October 19, 2016 8:27 AM >>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>> produces mismatched unsafe accesses >>> >>> Hi Shafi, >>> >>> You should also consider backporting following related fixes: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>> Otherwise you may hit asserts added by 8134918 changes. 
>>> >>> Thanks, >>> Vladimir >>> >>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>>> Hi All, >>>> >>>> Please review the backport of JDK-8134918 - C2: Type speculation >>>> produces >>> mismatched unsafe accesses to jdk8u-dev. >>>> >>>> Please note that backport is not clean and the conflict is due to: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>> 65 >>>> >>>> Getting debug build failure because of: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>> 55 >>>> >>>> The above changes are done under bug# 'JDK-8136473: failed: no >>> mismatched stores, except on raw memory: StoreB StoreI' which is not >>> back ported to jdk8u and the current backport is on top of above change. >>>> >>>> Please note that I am not sure if there is any dependency between >>>> these >>> two changesets. >>>> >>>> open webrev: >> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 >>>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>> >>>> testing: Passes JPRT, jtreg not completed >>>> >>>> Regards, >>>> Shafi >>>> From jesper.wilhelmsson at oracle.com Fri Nov 11 16:15:27 2016 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 11 Nov 2016 17:15:27 +0100 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved Message-ID: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> Hi, Please review this minor change to quarantine a new test that is triggering an old bug. The bug is being worked on and the test will be enabled again once the bug is fixed. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ Thanks, /Jesper From erik.gahlin at oracle.com Fri Nov 11 17:24:40 2016 From: erik.gahlin at oracle.com (Erik Gahlin) Date: Fri, 11 Nov 2016 18:24:40 +0100 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved In-Reply-To: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> References: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> Message-ID: <5825FED8.3080503@oracle.com> Looks good. Erik > Hi, > > Please review this minor change to quarantine a new test that is > triggering an old bug. The bug is being worked on and the test will be > enabled again once the bug is fixed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 > Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ > > Thanks, > /Jesper From george.triantafillou at oracle.com Fri Nov 11 19:07:03 2016 From: george.triantafillou at oracle.com (George Triantafillou) Date: Fri, 11 Nov 2016 14:07:03 -0500 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved In-Reply-To: <5825FED8.3080503@oracle.com> References: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> <5825FED8.3080503@oracle.com> Message-ID: +1 -George On 11/11/2016 12:24 PM, Erik Gahlin wrote: > Looks good. > > Erik > >> Hi, >> >> Please review this minor change to quarantine a new test that is >> triggering an old bug. The bug is being worked on and the test will >> be enabled again once the bug is fixed. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ >> >> Thanks, >> /Jesper > From jesper.wilhelmsson at oracle.com Fri Nov 11 19:40:01 2016 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 11 Nov 2016 20:40:01 +0100 Subject: RFR(xs): 8169597: Quarantine TestCpoolForInvokeDynamic.java until JDK-8169232 is solved In-Reply-To: References: <2c4260b4-65c2-daa9-4966-924f33879e05@oracle.com> <5825FED8.3080503@oracle.com> Message-ID: Thanks Erik and George! /Jesper Den 11/11/16 kl. 20:07, skrev George Triantafillou: > +1 > > -George > > On 11/11/2016 12:24 PM, Erik Gahlin wrote: >> Looks good. >> >> Erik >> >>> Hi, >>> >>> Please review this minor change to quarantine a new test that is triggering >>> an old bug. The bug is being worked on and the test will be enabled again >>> once the bug is fixed. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8169597 >>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8169597/webrev.00/ >>> >>> Thanks, >>> /Jesper >> > From shafi.s.ahmad at oracle.com Mon Nov 14 09:03:17 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Mon, 14 Nov 2016 01:03:17 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: References: <77e0b348-2b95-4097-ba95-906257d8893c@default> Message-ID: <137be921-c1ef-48d8-b85a-301d597109c0@default> Hi Vladimir, Thanks for the review. Please find updated webrevs. All webrevs are with respect to the base changes on JDK-8140309. 
http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ Regards, Shafi > -----Original Message----- > From: Vladimir Kozlov > Sent: Friday, November 11, 2016 1:26 AM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > On 11/9/16 10:42 PM, Shafi Ahmad wrote: > > Hi, > > > > Please review the backport of following dependent backports. > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > > Conflict in file src/share/vm/opto/memnode.cpp due to 1. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK- > 8080289]. Manual merge is not done as the corresponding code is not there > in jdk8u-dev. > > Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual > merge is done. > > webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > > unaligned unsafe access methods were added in jdk 9 only. In your changes > unaligned argument is always false. You can simplify changes. > > Also you should base changes on JDK-8140309 (original 8136473 changes > were backout by 8140267): > > On 11/4/15 10:21 PM, Roland Westrelin wrote: > > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > > > > Same as 8136473 with only the following change: > > > > diff --git a/src/share/vm/opto/library_call.cpp > b/src/share/vm/opto/library_call.cpp > > --- a/src/share/vm/opto/library_call.cpp > > +++ b/src/share/vm/opto/library_call.cpp > > @@ -2527,7 +2527,7 @@ > > // of safe & unsafe memory. 
> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > > > > - assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM > || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || > alias_type->adr_type() == TypeOopPtr::BOTTOM || > > alias_type->field() != NULL || alias_type->element() != > NULL, "field, array element or unknown"); > > bool mismatched = false; > > if (alias_type->element() != NULL || alias_type->field() != NULL) { > > > > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the is_native_ptr > case and the case where the unsafe method is called with a null object. > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > > Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > [JDK-8140309]. Manual merge is not done as the corresponding code is not > there in jdk8u-dev. > > I explained situation with this line above. > > > webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > This webrev is not incremental for your 8136473 changes - library_call.cpp has > part from 8136473 changes. > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 > > Clean merge > > webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > > Thanks seems fine. > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > > > > jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 > > Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > > [JDK-8160360] - Resolved 2. 
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 > [JDK-8148146] - Manual merge is not done as the corresponding code is not > there in jdk8u-dev. > > webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > > This webrev is not incremental in library_call.cpp. Difficult to see this part of > changes. > > Thanks, > Vladimir > > > jdk9 changeset: > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > > > > Testing: jprt and jtreg > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Shafi Ahmad > >> Sent: Thursday, October 20, 2016 10:08 AM > >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> Thanks Vladimir. > >> > >> I will create dependent backport of > >> 1. https://bugs.openjdk.java.net/browse/JDK-8136473 > >> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 > >> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 > >> > >> Regards, > >> Shafi > >> > >>> -----Original Message----- > >>> From: Vladimir Kozlov > >>> Sent: Wednesday, October 19, 2016 8:27 AM > >>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>> produces mismatched unsafe accesses > >>> > >>> Hi Shafi, > >>> > >>> You should also consider backporting following related fixes: > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>> Otherwise you may hit asserts added by 8134918 changes. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>>> Hi All, > >>>> > >>>> Please review the backport of JDK-8134918 - C2: Type speculation > >>>> produces > >>> mismatched unsafe accesses to jdk8u-dev. 
> >>>> > >>>> Please note that backport is not clean and the conflict is due to: > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > >>>> 65 > >>>> > >>>> Getting debug build failure because of: > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 > >>>> 55 > >>>> > >>>> The above changes are done under bug# 'JDK-8136473: failed: no > >>> mismatched stores, except on raw memory: StoreB StoreI' which is not > >>> back ported to jdk8u and the current backport is on top of above > change. > >>>> > >>>> Please note that I am not sure if there is any dependency between > >>>> these > >>> two changesets. > >>>> > >>>> open webrev: > >> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > >>>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>> > >>>> testing: Passes JPRT, jtreg not completed > >>>> > >>>> Regards, > >>>> Shafi > >>>> From volker.simonis at gmail.com Mon Nov 14 10:09:46 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Nov 2016 11:09:46 +0100 Subject: RFR(XS): 8169625: Libjsig build doesn't set flags for ppc64/s390 builds Message-ID: Hi, can I please have a review and sponsor for the following small change which only affects ppc64/s390x but touches a shared make file: http://cr.openjdk.java.net/~simonis/webrevs/2016/8169625/ https://bugs.openjdk.java.net/browse/JDK-8169625 It is unfortunate that the build of the libjsig library (see make/lib/CompileLibjsig.gmk) doesn't reuse the generic compiler flags used by the hotspot build (i.e. the ones specified in JVM_CFLAGS). Instead, CompileLibjsig.gmk defines its own compiler flags in LIBJSIG_CPU_FLAGS but not for ppc64 and s390x. This leads to problems if the compiler on these platforms uses other default settings as configured for the OpenJDK build. 
Thank you and best regards, Volker From erik.joelsson at oracle.com Mon Nov 14 10:14:00 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 14 Nov 2016 11:14:00 +0100 Subject: RFR(XS): 8169625: Libjsig build doesn't set flags for ppc64/s390 builds In-Reply-To: References: Message-ID: Looks good. I will push it. /Erik On 2016-11-14 11:09, Volker Simonis wrote: > Hi, > > can I please have a review and sponsor for the following small change > which only affects ppc64/s390x but touches a shared make file: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8169625/ > https://bugs.openjdk.java.net/browse/JDK-8169625 > > It is unfortunate that the build of the libjsig library (see > make/lib/CompileLibjsig.gmk) doesn't reuse the generic compiler flags > used by the hotspot build (i.e. the ones specified in JVM_CFLAGS). > Instead, CompileLibjsig.gmk defines its own compiler flags in > LIBJSIG_CPU_FLAGS but not for ppc64 and s390x. This leads to problems > if the compiler on these platforms uses other default settings as > configured for the OpenJDK build. > > Thank you and best regards, > Volker From volker.simonis at gmail.com Mon Nov 14 10:15:09 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Nov 2016 11:15:09 +0100 Subject: RFR(XS): 8169625: Libjsig build doesn't set flags for ppc64/s390 builds In-Reply-To: References: Message-ID: Thanks a lot Erik! Volker On Mon, Nov 14, 2016 at 11:14 AM, Erik Joelsson wrote: > Looks good. I will push it. 
> > /Erik > > > > On 2016-11-14 11:09, Volker Simonis wrote: >> >> Hi, >> >> can I please have a review and sponsor for the following small change >> which only affects ppc64/s390x but touches a shared make file: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8169625/ >> https://bugs.openjdk.java.net/browse/JDK-8169625 >> >> It is unfortunate that the build of the libjsig library (see >> make/lib/CompileLibjsig.gmk) doesn't reuse the generic compiler flags >> used by the hotspot build (i.e. the ones specified in JVM_CFLAGS). >> Instead, CompileLibjsig.gmk defines its own compiler flags in >> LIBJSIG_CPU_FLAGS but not for ppc64 and s390x. This leads to problems >> if the compiler on these platforms uses other default settings as >> configured for the OpenJDK build. >> >> Thank you and best regards, >> Volker > > From vladimir.kozlov at oracle.com Mon Nov 14 17:50:01 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Nov 2016 09:50:01 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <137be921-c1ef-48d8-b85a-301d597109c0@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> Message-ID: <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > Hi Vladimir, > > Thanks for the review. > > Please find updated webrevs. > > All webrevs are with respect to the base changes on JDK-8140309. > http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ Why did you keep the unaligned parameter in the changes? The test TestUnsafeUnalignedMismatchedAccesses.java will not work, since the Unsafe class in jdk8 does not have unaligned methods. How did you run it? > http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ Good. Did you run the new UnsafeAccess.java test? > http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ Good. 
Thanks, Vladimir > > Regards, > Shafi > > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Friday, November 11, 2016 1:26 AM >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: >>> Hi, >>> >>> Please review the backport of following dependent backports. >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK- >> 8080289]. Manual merge is not done as the corresponding code is not there >> in jdk8u-dev. >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and manual >> merge is done. >>> webrev link: http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ >> >> unaligned unsafe access methods were added in jdk 9 only. In your changes >> unaligned argument is always false. You can simplify changes. >> >> Also you should base changes on JDK-8140309 (original 8136473 changes >> were backout by 8140267): >> >> On 11/4/15 10:21 PM, Roland Westrelin wrote: >> > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ >> > >> > Same as 8136473 with only the following change: >> > >> > diff --git a/src/share/vm/opto/library_call.cpp >> b/src/share/vm/opto/library_call.cpp >> > --- a/src/share/vm/opto/library_call.cpp >> > +++ b/src/share/vm/opto/library_call.cpp >> > @@ -2527,7 +2527,7 @@ >> > // of safe & unsafe memory. 
>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); >> > >> > - assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM >> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || >> alias_type->adr_type() == TypeOopPtr::BOTTOM || >> > alias_type->field() != NULL || alias_type->element() != >> NULL, "field, array element or unknown"); >> > bool mismatched = false; >> > if (alias_type->element() != NULL || alias_type->field() != NULL) { >> > >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the is_native_ptr >> case and the case where the unsafe method is called with a null object. >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 >> [JDK-8140309]. Manual merge is not done as the corresponding code is not >> there in jdk8u-dev. >> >> I explained situation with this line above. >> >>> webrev link: http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >> >> This webrev is not incremental for your 8136473 changes - library_call.cpp has >> part from 8136473 changes. >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 >>> Clean merge >>> webrev link: http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ >> >> Thanks seems fine. >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 >>> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 >>> [JDK-8160360] - Resolved 2. 
>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 >> [JDK-8148146] - Manual merge is not done as the corresponding code is not >> there in jdk8u-dev. >>> webrev link: http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ >> >> This webrev is not incremental in library_call.cpp. Difficult to see this part of >> changes. >> >> Thanks, >> Vladimir >> >>> jdk9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 >>> >>> Testing: jprt and jtreg >>> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: Shafi Ahmad >>>> Sent: Thursday, October 20, 2016 10:08 AM >>>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation produces >>>> mismatched unsafe accesses >>>> >>>> Thanks Vladimir. >>>> >>>> I will create dependent backport of >>>> 1. https://bugs.openjdk.java.net/browse/JDK-8136473 >>>> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 >>>> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 >>>> >>>> Regards, >>>> Shafi >>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov >>>>> Sent: Wednesday, October 19, 2016 8:27 AM >>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>> produces mismatched unsafe accesses >>>>> >>>>> Hi Shafi, >>>>> >>>>> You should also consider backporting following related fixes: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>> Otherwise you may hit asserts added by 8134918 changes. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>>>>> Hi All, >>>>>> >>>>>> Please review the backport of JDK-8134918 - C2: Type speculation >>>>>> produces >>>>> mismatched unsafe accesses to jdk8u-dev. 
>>>>>> >>>>>> Please note that backport is not clean and the conflict is due to: >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>>>> 65 >>>>>> >>>>>> Getting debug build failure because of: >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.1 >>>>>> 55 >>>>>> >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no >>>>> mismatched stores, except on raw memory: StoreB StoreI' which is not >>>>> back ported to jdk8u and the current backport is on top of above >> change. >>>>>> >>>>>> Please note that I am not sure if there is any dependency between >>>>>> these >>>>> two changesets. >>>>>> >>>>>> open webrev: >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 >>>>>> jdk9 changeset: >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>>>> >>>>>> testing: Passes JPRT, jtreg not completed >>>>>> >>>>>> Regards, >>>>>> Shafi >>>>>> From kumar.x.srinivasan at oracle.com Mon Nov 14 14:36:43 2016 From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan) Date: Mon, 14 Nov 2016 06:36:43 -0800 Subject: Note: JDK-8168010: Deprecate obsolete launcher -d32/-d64 options Message-ID: <5829CBFB.2020000@oracle.com> Hello community, This is to inform you that the -d32 and -d64 options are obsolete and are destined to be removed in JDK10, see [1] and [2], this will be Release noted for JDK9. Please make every effort to inspect your java start-up scripts and purge these options. Thanks Kumar Srinivasan [1] https://bugs.openjdk.java.net/browse/JDK-8168010 [2] https://bugs.openjdk.java.net/browse/JDK-8169646 From david.lloyd at redhat.com Mon Nov 14 18:11:00 2016 From: david.lloyd at redhat.com (David M. 
Lloyd) Date: Mon, 14 Nov 2016 12:11:00 -0600 Subject: Sporadic NPEs in compiled code Message-ID: <92cf69d3-3655-0688-9d24-54e33e2beed4@redhat.com> We observed a problem where java.net.NetworkInterface appeared to be throwing an NPE originating at a line of code corresponding to its return instruction: Caused by: java.lang.NullPointerException at java.net.NetworkInterface.(NetworkInterface.java:80) at java.net.NetworkInterface.getAll(Native Method) at java.net.NetworkInterface.getNetworkInterfaces(NetworkInterface.java:343) java.net.NetworkInterface(); Code: 0: aload_0 1: invokespecial #3 // Method java/lang/Object."":()V 4: aload_0 5: aconst_null 6: putfield #4 // Field parent:Ljava/net/NetworkInterface; 9: aload_0 10: iconst_0 11: putfield #5 // Field virtual:Z 14: return LineNumberTable: line 79: 0 line 50: 4 line 51: 9 line 80: 14 I assumed that the problem was possibly JNI-related, because of the previous stack frame, however we've begun seeing the problem in other bits of code as well, areas like this: 0: aload_0 1: invokestatic #10 // Method doInject:(Lorg/jboss/msc/service/ValueInjection;)V 4: return or this constructor: 87: aload_0 88: aload 7 90: putfield #17 // Field extensionModuleName:Ljava/lang/String; We've started testing with -XX:TieredStopAtLevel=1 and so far it seems the problems have disappeared, however, it's not clear to my hotspot-amateur mind at all whether it's C2 that is causing this or whether there is a more general timing-related race condition that is hidden by limiting the compiler in this way. The OpenJDK version is: openjdk version "1.8.0_111" OpenJDK Runtime Environment (build 1.8.0_111-b16) OpenJDK 64-Bit Server VM (build 25.111-b16, mixed mode) It's coming out of a Fedora 24 distribution. 
-- - DML From chf at redhat.com Mon Nov 14 18:25:40 2016 From: chf at redhat.com (Christine Flood) Date: Mon, 14 Nov 2016 13:25:40 -0500 (EST) Subject: JEP 189: Shenandoah: An Ultra-Low-Pause-Time Garbage Collector In-Reply-To: <432402741.14209755.1479147790128.JavaMail.zimbra@redhat.com> Message-ID: <1444338101.14210351.1479147940505.JavaMail.zimbra@redhat.com> Hi We've addressed the issues with the JEP that were brought up last summer. We've been meeting our performance goals. What do we need to do to get Shenandoah approved for OpenJDK10? Christine From vladimir.kozlov at oracle.com Mon Nov 14 19:18:37 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Nov 2016 11:18:37 -0800 Subject: Sporadic NPEs in compiled code In-Reply-To: <92cf69d3-3655-0688-9d24-54e33e2beed4@redhat.com> References: <92cf69d3-3655-0688-9d24-54e33e2beed4@redhat.com> Message-ID: <109eaf6c-9951-aa40-d393-236144fba268@oracle.com> Could be https://bugs.openjdk.java.net/browse/JDK-8038348 It may also be related to EA issues. First try to run with C2 only: -XX:-TieredCompilation Then try switching off EA as a whole: -XX:-DoEscapeAnalysis Or just a subset of EA: -XX:-OptimizePtrCompare It could also be incorrect memory instruction scheduling (above a NULL check). You can generate an hs_err file to see recent events (deoptimizations, compilations, uncommon traps): -XX:+UnlockDiagnosticVMOptions -XX:AbortVMOnException=java.lang.NullPointerException Also build a fastdebug version of the JDK and run with it to see if it hits some asserts. Thanks, Vladimir On 11/14/16 10:11 AM, David M. 
Lloyd wrote: > We observed a problem where java.net.NetworkInterface appeared to be > throwing an NPE originating at a line of code corresponding to its > return instruction: > > Caused by: java.lang.NullPointerException > at java.net.NetworkInterface.(NetworkInterface.java:80) > at java.net.NetworkInterface.getAll(Native Method) > at > java.net.NetworkInterface.getNetworkInterfaces(NetworkInterface.java:343) > > java.net.NetworkInterface(); > Code: > 0: aload_0 > 1: invokespecial #3 // Method > java/lang/Object."":()V > 4: aload_0 > 5: aconst_null > 6: putfield #4 // Field > parent:Ljava/net/NetworkInterface; > 9: aload_0 > 10: iconst_0 > 11: putfield #5 // Field virtual:Z > 14: return > LineNumberTable: > line 79: 0 > line 50: 4 > line 51: 9 > line 80: 14 > > I assumed that the problem was possibly JNI-related, because of the > previous stack frame, however we've begun seeing the problem in other > bits of code as well, areas like this: > > 0: aload_0 > 1: invokestatic #10 // Method > doInject:(Lorg/jboss/msc/service/ValueInjection;)V > 4: return > > or this constructor: > > 87: aload_0 > 88: aload 7 > 90: putfield #17 // Field > extensionModuleName:Ljava/lang/String; > > We've started testing with -XX:TieredStopAtLevel=1 and so far it seems > the problems have disappeared, however, it's not clear to my > hotspot-amateur mind at all whether it's C2 that is causing this or > whether there is a more general timing-related race condition that is > hidden by limiting the compiler in this way. > > The OpenJDK version is: > > openjdk version "1.8.0_111" > OpenJDK Runtime Environment (build 1.8.0_111-b16) > OpenJDK 64-Bit Server VM (build 25.111-b16, mixed mode) > > It's coming out of a Fedora 24 distribution. 
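[Editor's note] The flag bisection suggested in Vladimir's reply above can be scripted so that each suspect optimization is toggled in turn. This is a minimal sketch, not from the original thread: the `JAVA` and `APP` variables and the `Hello` workload are placeholders for whatever command reproduces the sporadic NPE.

```shell
#!/bin/sh
# Sketch: run the same workload under progressively narrower optimization
# settings to see which compiler phase makes the sporadic NPE come or go.
# JAVA/APP and "Hello" are assumed placeholders -- substitute the real
# launcher line for the failing application.
JAVA="${JAVA:-java}"
APP="${APP:-Hello}"

for FLAGS in \
    "-XX:-TieredCompilation" \
    "-XX:-DoEscapeAnalysis" \
    "-XX:-OptimizePtrCompare" \
    "-XX:+UnlockDiagnosticVMOptions -XX:AbortVMOnException=java.lang.NullPointerException"
do
    echo "=== $JAVA $FLAGS $APP ==="
    # Uncomment to actually run the workload under each setting:
    # $JAVA $FLAGS $APP
done
```

Since `-XX:-TieredCompilation` forces C2-only compilation, a failure that persists there but disappears with `-XX:-DoEscapeAnalysis` would point at an EA-related C2 optimization; the `AbortVMOnException` run makes the VM abort at the NPE and produce an hs_err file recording recent deoptimizations, compilations, and uncommon traps.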
From shafi.s.ahmad at oracle.com Tue Nov 15 06:34:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Mon, 14 Nov 2016 22:34:42 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> Message-ID: <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> Hi Vladimir, Thanks for the review. > -----Original Message----- > From: Vladimir Kozlov > Sent: Monday, November 14, 2016 11:20 PM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > On 11/14/16 1:03 AM, Shafi Ahmad wrote: > > Hi Vladimir, > > > > Thanks for the review. > > > > Please find updated webrevs. > > > > All webrevs are with respect to the base changes on JDK-8140309. > > http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > > Why did you keep the unaligned parameter in the changes? The fix of JDK-8136473 caused many problems after integration (see JDK-8140267). The fix was backed out and re-implemented with JDK-8140309 by slightly changing the assert: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-November/019696.html The code change for JDK-8140309 is the JDK-8136473 change with one assert slightly modified. The jdk9 original changeset is http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c As this is a backport, I kept the changes as they are. > > The test TestUnsafeUnalignedMismatchedAccesses.java will not work > since the Unsafe class in jdk8 does not have the unaligned methods. > How did you run it? I am sorry, it looks like there is some issue with my testing. I ran the jtreg tests after merging the changes, but somehow the test did not run and I only checked the list of failing tests in the jtreg results.
When I run the test case separately it fails, as you already pointed out. $java -jar ~/Tools/jtreg/lib/jtreg.jar -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java Test results: failed: 1 Report written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTreport/html/report.html Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork Error: /scratch/shshahma/Java/jdk8u-dev-8140309_01/hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: cannot find symbol UNSAFE.putIntUnaligned(array, UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); Not sure if we should push without the test case. > > > http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > > Good. Did you run the new UnsafeAccess.java test? Due to the same process issue this test was not run either, and when I run it separately it fails. It passes after making the changes below: 1. Added /othervm 2. Replaced the import statement 'import jdk.internal.misc.Unsafe;' with 'import sun.misc.Unsafe;' Updated webrev: http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > > > http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ I am getting a similar compilation error as above for the added test case. Not sure if we can push without the test case. Regards, Shafi > > Good. > > Thanks, > Vladimir > > > > > Regards, > > Shafi > > > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Friday, November 11, 2016 1:26 AM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >>> Hi, > >>> > >>> Please review the backport of following dependent backports.
> >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. Manual merge is not done as the corresponding code is not > >> there in jdk8u-dev. > >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and > >>> manual > >> merge is done. > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >> > >> unaligned unsafe access methods were added in jdk 9 only. In your > >> changes unaligned argument is always false. You can simplify changes. > >> > >> Also you should base changes on JDK-8140309 (original 8136473 changes > >> were backout by 8140267): > >> > >> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >> > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >> > > >> > Same as 8136473 with only the following change: > >> >
> >> > diff --git a/src/share/vm/opto/library_call.cpp b/src/share/vm/opto/library_call.cpp
> >> > --- a/src/share/vm/opto/library_call.cpp
> >> > +++ b/src/share/vm/opto/library_call.cpp
> >> > @@ -2527,7 +2527,7 @@
> >> >    // of safe & unsafe memory.
> >> >    if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder);
> >> >
> >> > -  assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> > +  assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> >           alias_type->field() != NULL || alias_type->element() != NULL, "field, array element or unknown");
> >> >    bool mismatched = false;
> >> >    if (alias_type->element() != NULL || alias_type->field() != NULL) {
> >> >
> >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >> is_native_ptr case and the case where the unsafe method is called with a null object.
> >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > >> [JDK-8140309]. Manual merge is not done as the corresponding code is > >> not there in jdk8u-dev. > >> > >> I explained situation with this line above. > >> > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >> > >> This webrev is not incremental for your 8136473 changes - > >> library_call.cpp has part from 8136473 changes. > >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> Clean merge > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >> > >> Thanks seems fine. > >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >>> > >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >>> [JDK-8160360] - Resolved 2. > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 > >> [JDK-8148146] - Manual merge is not done as the corresponding code is > >> not there in jdk8u-dev. > >>> webrev link: > http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >> > >> This webrev is not incremental in library_call.cpp. Difficult to see > >> this part of changes. 
> >> > >> Thanks, > >> Vladimir > >> > >>> jdk9 changeset: > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >>> > >>> Testing: jprt and jtreg > >>> > >>> Regards, > >>> Shafi > >>> > >>>> -----Original Message----- > >>>> From: Shafi Ahmad > >>>> Sent: Thursday, October 20, 2016 10:08 AM > >>>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>> produces mismatched unsafe accesses > >>>> > >>>> Thanks Vladimir. > >>>> > >>>> I will create dependent backport of 1. > >>>> https://bugs.openjdk.java.net/browse/JDK-8136473 > >>>> 2. https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>> 3. https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>> > >>>> Regards, > >>>> Shafi > >>>> > >>>>> -----Original Message----- > >>>>> From: Vladimir Kozlov > >>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>> produces mismatched unsafe accesses > >>>>> > >>>>> Hi Shafi, > >>>>> > >>>>> You should also consider backporting following related fixes: > >>>>> > >>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>>> > >>>>> Otherwise you may hit asserts added by 8134918 changes. > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>>>>> Hi All, > >>>>>> > >>>>>> Please review the backport of JDK-8134918 - C2: Type speculation > >>>>>> produces > >>>>> mismatched unsafe accesses to jdk8u-dev. > >>>>>> > >>>>>> Please note that backport is not clean and the conflict is due to: > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5.
> >>>>>> 1 > >>>>>> 65 > >>>>>> > >>>>>> Getting debug build failure because of: > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>>>>> 1 > >>>>>> 55 > >>>>>> > >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no > >>>>> mismatched stores, except on raw memory: StoreB StoreI' which is > >>>>> not back ported to jdk8u and the current backport is on top of > >>>>> above > >> change. > >>>>>> > >>>>>> Please note that I am not sure if there is any dependency > >>>>>> between these > >>>>> two changesets. > >>>>>> > >>>>>> open webrev: > >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>>>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8134918 > >>>>>> jdk9 changeset: > >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>>>> > >>>>>> testing: Passes JPRT, jtreg not completed > >>>>>> > >>>>>> Regards, > >>>>>> Shafi > >>>>>> From thomas.schatzl at oracle.com Tue Nov 15 10:21:04 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 15 Nov 2016 11:21:04 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> Message-ID: <1479205264.3251.13.camel@oracle.com> Hi Kim, On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: > > > > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl > om> wrote: > > On Tue, 2016-10-25 at 19:11 -0400, Kim 
Barrett wrote: > > > > > > > > > > > > > > > On Oct 21, 2016, at 9:54 PM, Kim Barrett > > > > wrote: > > > > > > > > > > > > > > > > > > > On Oct 21, 2016, at 8:46 PM, Kim Barrett > > > > > wrote: > > > > > In the humongous case, if it bails because klass_or_null == > > > > > NULL, > > > > > we must re-enqueue > > > > > the card? > > > This update (webrev.02) reverts part of the previous change. > > > > > > In the original RFR I said: > > > > > > As a result of the changes in oops_on_card_seq_iterate_careful, > > > we > > > now almost never fail to process the card. The only place > > > where > > > that can occur is a stale card in a humongous region with an > > > in-progress allocation, where we can just ignore it. So the > > > only > > > caller, refine_card, no longer needs to examine the result of > > > the > > > call and enqueue the card for later reconsideration. > > > > > > Ignoring such a stale card is incorrect at the point where it was > > > being done. At that point we've already cleaned the card, so we > > > must > > > either process the designated object(s) or, if we can't do the > > > processing because of in-progress allocation (klass_or_null > > > returned > > > NULL), then re-queue the card for later reconsideration. > > > > > > So the change to refine_card to eliminate that behavior, and the > > > associated changes to oops_on_card_seq_iterate_careful, were a > > > mistake, and are being reverted by this new version. As a > > > result, > > > refine_card is no longer changed at all. > > Thanks for catching this. > > > > Maybe it would be cleaner to call a method in the barrier set > > instead of inlining the dirtying + enqueuing in lines 685 to 691? > > Maybe as an additional RFE.
> We could use _ct_bs->invalidate(dirtyRegion). That's rather > overgeneralized and inefficient for this situation, but this > situation should occur *very* rarely; it requires a stale card get > processed just as a humongous object is in the midst of being > allocated in the same region. I kind of think for these reasons we should use _ct_bs->invalidate() as it seems clearer to me. There is the mentioned drawback of having no other more efficient way, so I will let you decide about this. > > > Additionally, in the original RFR I also said: > > > > > > Note that [...] At present the only source of stale cards in > > > the concurrent case seems to be HCC eviction. [...] Doing HCC > > > cleanup when freeing regions might remove the need for > > > klass_or_null checking in the humongous case for concurrent > > > refinement, so might be worth looking into later. > > > > > > That was also incorrect; there are other sources of stale cards. > > Can you elaborate on that? > Here's a scenario that I've observed while running a jtreg test (I > think it was hotspot/test/gc/TestHumongousReferenceObject). > > We have humongous object H, referring to young object Y. This > induces a remembered set entry for card C in region R (allocated for > H). > > H becomes unreachable. > Start concurrent collection cycle. > Pause Initial Mark scan_rs pushes &H->Y onto mark stack. > Pause Initial Mark evac processes &H->Y, copying Y, updating &H->Y, > and adding C to g1h_dcqs in update_rs. > Pause Initial Mark redirty_logged_cards dirties g1h_dcqs entries, > including C. > Pause Initial Mark merges g1h_dcqs into java_dcqs, adding dirty C to > java_dcqs. > Concurrent Mark determines H is dead. > Pause Cleanup frees regions for H, including R. > Concurrent Refinement finally comes across stale C in now (possibly) > free R.
> > A similar situation can arise if instead of H we have old O in region > R and all objects in R are unreachable before starting concurrent > collection, so that Pause Cleanup frees R. Okay, thanks, understood. Thomas From thomas.schatzl at oracle.com Tue Nov 15 10:26:48 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 15 Nov 2016 11:26:48 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> Message-ID: <1479205608.3251.18.camel@oracle.com> Hi, On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: > > > > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl > > wrote: > > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > > > > > > There is still a situation where processing can fail, namely an > > > in-progress humongous allocation that hasn't set the klass > > > yet. We continue to handle that as before. > > - I am not completely sure about whether this case is handled > > correctly. I am mostly concerned that the information used before > > the fence may not be the correct ones, but the checks expect them > > to be valid. > > > > Probably I am overlooking something critical somewhere. > > > > A: allocates humongous object C, sets region type, issues > > storestore, sets top pointers, writes the object, and then sets C.y > > = x to mark a card > > > > Refinement: gets card (and assuming we have no further > > synchronization around which is not true, e.g. the enqueuing) > > > > 592   if (!r->is_old_or_humongous()) { > > > > assume refinement thread has not received the "type" correctly yet, > > so must be Free. So the card will be filtered out incorrectly? > > > > That is contradictory to what I said in the other email about the > > comment discussion, but I only thoroughly looked at the comment > > aspect there.
:) > > I think at this point in general we can't do anything but > > !is_young(), as we can't ignore cards in "Free" regions - they may > > be for cards for humongous ones where the thread did not receive > > top and/or the type yet? > > > > - assuming this works due to other synchronization, > This is the critical point. There *is* synchronization there. Okay, thanks. I just wanted to make sure that we are aware of that we are using this other synchronization here. > In the scenario described, the card that was marked and enqueued > after the object was created will pass through some synchronization > barriers (full locks, perhaps someday lock-free but with appropriate > memory barriers) along the way to refinement. > > This is the "easy" case. If only it were that simple... > > The additional checks are to deal with the possibility of stale > cards. > > > > > [...] I have another > > similar concern with later trimming: > >
> > 653 } else {
> > 654   // Non-humongous objects are only allocated in the old-gen during
> > 655   // GC, so if region is old then top is stable.  Humongous object
> > 656   // allocation sets top last; if top has not yet been set, then
> > 657   // we'll end up with an empty intersection.
> > 658   scan_limit = r->top();
> > 659 }
> > 660 if (scan_limit <= start) {
> > 661   // If the trimmed region is empty, the card must be stale.
> > 662   return false;
> > 663 }
> >
> > Assume that the current value of top for a humongous object has not > > been seen yet by the thread and we end up with an empty > > intersection. > > > > Now, didn't we potentially just drop a card to a humongous object > > in waiting to scan but did not re-enqueue it? (And we did not clear > > the card table value either?) > > > > We may do it after the fence though I think. > > > > Maybe I am completely wrong though, what do you think?
> If we see the old (zero) value of top in conjunction with a humongous > region type, it is because this is a stale card. If this were a > non-stale card, the synchronization between enqueuing the card and > reaching refinement would have ensured we see an up-to-date top (as > well as an up-to-date type). Card table entries for a free region > are cleaned before the region can be allocated (and there are locks > in the allocation path that provide the needed ordering). Since this > is a stale card and regions are allocated with clean card table > entries, the dirty card table entry check having passed implies there > is another (non-stale and not-yet-processed) card making its way to > refinement through the usual channels, including the needed > synchronization barriers. Thanks. Again I was mostly worried about noting this reliance on previous synchronization down somewhere, even if it is only the mailing list. It may be useful to note this in the code too. This would save the next one working on this code looking through old mailing list threads. Maybe I am a bit overly concerned about making sure that these thoughts are provided in the proper place though. Or maybe everyone thinks that everything is clear :) > > > > - another stale comment: > >
> 636   // a card beyond the heap.  This is not safe without a perm
> 637   // gen at the upper end of the heap.
> > > > Could everything after "without" be removed in this sentence? We > haven't had a "perm gen" for a long time? > Yes. I'll make that change. > Thanks. Thanks, Thomas From trevor.d.watson at oracle.com Tue Nov 15 11:57:50 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Tue, 15 Nov 2016 11:57:50 +0000 Subject: RFR: 8162865 Implementation of SPARC lzcnt Message-ID: I have implemented the code to use the lzcnt instruction for both integer and long countLeadingZeros() methods on SPARC platforms supporting the vis3 instruction set.
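The semantics such an intrinsic has to preserve are easy to state in plain Java. The sketch below is not Trevor's actual jtreg test (the class and method names are invented); it cross-checks a loop-based count of leading zeros against the library methods the lzcnt instruction would intrinsify:

```java
// Hypothetical reference implementation; Clz, clz32 and clz64 are
// invented names, not part of the patch under review.
public class Clz {
    // Count zero bits from bit 31 down to the first set bit.
    static int clz32(int x) {
        int n = 0;
        while (n < 32 && (x & (0x80000000 >>> n)) == 0) n++;
        return n;
    }

    // Same idea for 64-bit values.
    static int clz64(long x) {
        int n = 0;
        while (n < 64 && (x & (0x8000000000000000L >>> n)) == 0) n++;
        return n;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int s : samples) {
            if (clz32(s) != Integer.numberOfLeadingZeros(s))
                throw new AssertionError("clz32 mismatch for " + s);
        }
        long[] lsamples = {0L, 1L, -1L, 1L << 40, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long s : lsamples) {
            if (clz64(s) != Long.numberOfLeadingZeros(s))
                throw new AssertionError("clz64 mismatch for " + s);
        }
        System.out.println("ok"); // prints "ok"
    }
}
```

Note the edge case: an input of zero must return the full width (32 or 64), which is exactly the kind of boundary a correctness test for the intrinsic needs to cover.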
Current "bmi" tests for the above are updated so that they run on both SPARC and x86 platforms. I've also implemented a test to ensure that Integer.countLeadingZeros() and Long.countLeadingZeros() return the correct values when C2 runs. This test is currently under the intrinsics "bmi" tests for want of somewhere better (they do apply to both SPARC and x86 though). http://cr.openjdk.java.net/~alanbur/8162865/ Thanks, Trevor From vladimir.kozlov at oracle.com Tue Nov 15 19:29:51 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Nov 2016 11:29:51 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> Message-ID: <582B622F.7030909@oracle.com> Hi Shafi, You should not backport tests which use only new JDK 9 APIs, like the TestUnsafeUnalignedMismatchedAccesses.java test. But it is perfectly fine to modify a backport by removing the parts of the changes which use a new API - for example, the 8162101 changes in the OpaqueAccesses.java test which use the getIntUnaligned() method. It is unfortunate that the 8140309 changes also include code which processes the new Unsafe unaligned intrinsics from JDK 9. It should not be backported, but it will simplify this and the following backports. So I agree with the changes you did for the 8140309 backport. Thanks, Vladimir On 11/14/16 10:34 PM, Shafi Ahmad wrote: > Hi Vladimir, > > Thanks for the review.
> >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Monday, November 14, 2016 11:20 PM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > >> > Hi Vladimir, > >> > > >> > Thanks for the review. > >> > > >> > Please find updated webrevs. > >> > > >> > All webrevs are with respect to the base changes on JDK-8140309. > >> >http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > >> > >> Why did you keep the unaligned parameter in the changes? > > The fix of JDK-8136473 caused many problems after integration (see JDK-8140267). > > The fix was backed out and re-implemented with JDK-8140309 by slightly changing the assert: > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-November/019696.html > > The code change for JDK-8140309 is the JDK-8136473 change with one assert slightly modified. > > The jdk9 original changeset is http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > > As this is a backport, I kept the changes as they are. > >> > >> The test TestUnsafeUnalignedMismatchedAccesses.java will not work > >> since the Unsafe class in jdk8 does not have the unaligned methods. > >> How did you run it? > > I am sorry, it looks like there is some issue with my testing. > > I ran the jtreg tests after merging the changes, but somehow the test did not run and I only checked the list of failing tests in the jtreg results.
> > $java -jar ~/Tools/jtreg/lib/jtreg.jar -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java > > Test results: failed: 1 > > Report written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTreport/html/report.html > > Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > > Error: > > /scratch/shshahma/Java/jdk8u-dev-8140309_01/hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: cannot find symbol > > UNSAFE.putIntUnaligned(array, UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > > Not sure if we should push without the test case. > >> > >> >http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > >> > >> Good. Did you run the new UnsafeAccess.java test? > > Due to the same process issue this test was not run either, and when I run it separately it fails. > > It passes after making the changes below: > > 1. Added /othervm > > 2. Replaced the import statement 'import jdk.internal.misc.Unsafe;' with 'import sun.misc.Unsafe;' > > Updated webrev: http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > >> > >> >http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > > I am getting a similar compilation error as above for the added test case. Not sure if we can push without the test case. > > Regards, > > Shafi > >> > >> Good. > >> > >> Thanks, > >> Vladimir > >> > >> > > >> > Regards, > >> > Shafi > >> > > >> > > >> > > >> >> -----Original Message----- > >> >> From: Vladimir Kozlov > >> >> Sent: Friday, November 11, 2016 1:26 AM > >> >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> >> mismatched unsafe accesses > >> >> > >> >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >> >>> Hi, > >> >>> > >> >>> Please review the backport of following dependent backports.
> >> >>> > >> >>> jdk9 bug link: https://bugs.openjdk.java.net/browse/JDK-8136473 > >> >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >> >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 [JDK-8080289]. Manual merge is not done as the corresponding code is not > >> >> there in jdk8u-dev. > >> >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and > >> >>> manual > >> >> merge is done. > >> >>> webrev link: > >> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >> >> > >> >> unaligned unsafe access methods were added in jdk 9 only. In your > >> >> changes unaligned argument is always false. You can simplify changes. > >> >> > >> >> Also you should base changes on JDK-8140309 (original 8136473 changes > >> >> were backout by 8140267): > >> >> > >> >> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >> >> > http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >> >> > > >> >> > Same as 8136473 with only the following change: > >> >> >
> >> >> > diff --git a/src/share/vm/opto/library_call.cpp b/src/share/vm/opto/library_call.cpp
> >> >> > --- a/src/share/vm/opto/library_call.cpp
> >> >> > +++ b/src/share/vm/opto/library_call.cpp
> >> >> > @@ -2527,7 +2527,7 @@
> >> >> >    // of safe & unsafe memory.
> >> >> >    if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder);
> >> >> >
> >> >> > -  assert(is_native_ptr || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> >> > +  assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || alias_type->adr_type() == TypeOopPtr::BOTTOM ||
> >> >> >           alias_type->field() != NULL || alias_type->element() != NULL, "field, array element or unknown");
> >> >> >    bool mismatched = false;
> >> >> >    if (alias_type->element() != NULL || alias_type->field() != NULL) {
> >> >> >
> >> >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >> >> is_native_ptr case and the case where the unsafe method is called with a > >> null object.
> >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >> >>> > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >> >>> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > >> >> [JDK-8140309]. Manual merge is not done as the corresponding code is > >> >> not there in jdk8u-dev. > >> >> > >> >> I explained situation with this line above. > >> >> > >> >>> webrev link: > >>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >> >> > >> >> This webrev is not incremental for your 8136473 changes - > >> >> library_call.cpp has part from 8136473 changes. > >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >> >>> > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 > >> >>> Clean merge > >> >>> webrev link: > >>http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >> >> > >> >> Thanks seems fine. > >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >> >>> > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >> >>> [JDK-8160360] - Resolved 2. > >> >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.273 > >> >> [JDK-8148146] - Manual merge is not done as the corresponding code is > >> >> not there in jdk8u-dev. > >> >>> webrev link: > >>http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >> >> > >> >> This webrev is not incremental in library_call.cpp. Difficult to see > >> >> this part of changes. 
> >> >> > >> >> Thanks, > >> >> Vladimir > >> >> > >> >>> jdk9 changeset: > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >> >>> > >> >>> Testing: jprt and jtreg > >> >>> > >> >>> Regards, > >> >>> Shafi > >> >>> > >> >>>> -----Original Message----- > >> >>>> From: Shafi Ahmad > >> >>>> Sent: Thursday, October 20, 2016 10:08 AM > >> >>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >> >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >> >>>> produces mismatched unsafe accesses > >> >>>> > >> >>>> Thanks Vladimir. > >> >>>> > >> >>>> I will create dependent backport of 1. > >> >>>>https://bugs.openjdk.java.net/browse/JDK-8136473 > >> >>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > >> >>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > >> >>>> > >> >>>> Regards, > >> >>>> Shafi > >> >>>> > >> >>>>> -----Original Message----- > >> >>>>> From: Vladimir Kozlov > >> >>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >> >>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >> >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >> >>>>> produces mismatched unsafe accesses > >> >>>>> > >> >>>>> Hi Shafi, > >> >>>>> > >> >>>>> You should also consider backporting following related fixes: > >> >>>>> > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8155781 > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8162101 > >> >>>>> > >> >>>>> Otherwise you may hit asserts added by 8134918 changes. > >> >>>>> > >> >>>>> Thanks, > >> >>>>> Vladimir > >> >>>>> > >> >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >> >>>>>> Hi All, > >> >>>>>> > >> >>>>>> Please review the backport of JDK-8134918 - C2: Type speculation > >> >>>>>> produces > >> >>>>> mismatched unsafe accesses to jdk8u-dev. > >> >>>>>> > >> >>>>>> Please note that backport is not clean and the conflict is due to: > >> >>>>>> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. 
> >> >>>>>> 1 > >> >>>>>> 65 > >> >>>>>> > >> >>>>>> Getting debug build failure because of: > >> >>>>>> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >> >>>>>> 1 > >> >>>>>> 55 > >> >>>>>> > >> >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no > >> >>>>> mismatched stores, except on raw memory: StoreB StoreI' which is > >> >>>>> not back ported to jdk8u and the current backport is on top of > >> >>>>> above > >> >> change. > >> >>>>>> > >> >>>>>> Please note that I am not sure if there is any dependency > >> >>>>>> between these > >> >>>>> two changesets. > >> >>>>>> > >> >>>>>> open webrev: > >> >>>>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >> >>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >> >>>>>> jdk9 changeset: > >> >>>>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >> >>>>>> > >> >>>>>> testing: Passes JPRT, jtreg not completed > >> >>>>>> > >> >>>>>> Regards, > >> >>>>>> Shafi > >> >>>>>> > From kim.barrett at oracle.com Tue Nov 15 23:58:24 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 15 Nov 2016 18:58:24 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479205264.3251.13.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> Message-ID: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> > On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: 
> > Hi Kim, > > On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>> >>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >> om> wrote: >>> Maybe it would be cleaner to call a method in the barrier set >>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>> Maybe as an additional RFE. >> We could use _ct_bs->invalidate(dirtyRegion). That's rather >> overgeneralized and inefficient for this situation, but this >> situation should occur *very* rarely; it requires a stale card get >> processed just as a humongous object is in the midst of being >> allocated in the same region. > > I kind of think for these reasons we should use _ct_bs->invalidate() as > it seems clearer to me. There is the mentioned drawback of having no > other more efficient way, so I will let you decide about this. I've made the change to call invalidate, and also updated some comments. CR: https://bugs.openjdk.java.net/browse/JDK-8166607 Webrevs: full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ Also, see RFR: 8166811, where I've included a webrev combining the latest changes for 8166607 and 8166811, since they are rather intertwined. I think I'll do as Erik suggested and push the two together. 
From kim.barrett at oracle.com Wed Nov 16 00:00:02 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 15 Nov 2016 19:00:02 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1479205608.3251.18.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> Message-ID: <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> > On Nov 15, 2016, at 5:26 AM, Thomas Schatzl wrote: > > Hi, > > On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: >>> >>> On Nov 8, 2016, at 7:52 AM, Thomas Schatzl >> om> wrote: >>> On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: >>> - assuming this works due to other synchronization, >> This is the critical point. There *is* synchronization there. > > Okay, thanks. I just wanted to make sure that we are aware of that we > are using this other synchronization here. > > Thanks. Again I was mostly worried about noting this reliance on > previous synchronization down somewhere, even if it is only the mailing > list. > > It may be useful to note this in the code too. This would save the next > one working on this code looking through old mailing list threads. > > Maybe I am a bit overly concerned about making sure that these thoughts > are provided in the proper place though. Or maybe everyone thinks that > everything is clear :) I've updated some comments to mention that external synchronization. CR: https://bugs.openjdk.java.net/browse/JDK-8166811 Webrevs: full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ Also, since this set of changes is rather intertwined with the changes for 8166607, here is a combined webrev for both: http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ I think I'll do as Erik suggested and push the two together. 
From thomas.schatzl at oracle.com Wed Nov 16 09:06:54 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 16 Nov 2016 10:06:54 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> Message-ID: <1479287214.2466.35.camel@oracle.com> Hi Kim, On Tue, 2016-11-15 at 19:00 -0500, Kim Barrett wrote: > > > > On Nov 15, 2016, at 5:26 AM, Thomas Schatzl > com> wrote: > > > > Hi, > > > > On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: > > > > > > > > > > > > > > > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl > > > le.c > > > > om> wrote: > > > > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > > > > - assuming this works due to other synchronization, > > > This is the critical point.  There *is* synchronization there. > > Okay, thanks. I just wanted to make sure that we are aware of that > > we > > are using this other synchronization here. > > > > Thanks. Again I was mostly worried about noting this reliance on > > previous synchronization down somewhere, even if it is only the > > mailing > > list. > > > > It may be useful to note this in the code too. This would save the > > next > > one working on this code looking through old mailing list threads. > > > > Maybe I am a bit overly concerned about making sure that these > > thoughts are provided in the proper place though. Or maybe everyone > > thinks that everything is clear :) > I've updated some comments to mention that external synchronization.
581   // The region could be young.  Cards for young regions are set to
582   // g1_young_gen, so the post-barrier will filter them out.  However,
583   // that marking is performed concurrently.  A write to a young
584   // object could occur before the card has been marked young, slipping
585   // past the filter.

I would prefer if the text would not change terminology for the same thing mid-paragraph, from "setting" to "marking". The advantage of it reading better seems to be smaller than the potential confusion. Everything else looks very nice. Thanks for considering my comments. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166811 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ > incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ > > Also, since this set of changes is rather intertwined with the > changes > for 8166607, here is a combined webrev for both: > http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ > > I think I'll do as Erik suggested and push the two together. Just fyi, you can push two commits at once, or one commit having two CR-number lines. I think it is sufficient to commit these two changes in a single push job, but I do not see a need for making it a single commit. Either way is fine with me. Thanks,
Thomas From thomas.schatzl at oracle.com Wed Nov 16 09:21:27 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 16 Nov 2016 10:21:27 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: <1479288087.2466.36.camel@oracle.com> Hi Kim, On Tue, 2016-11-15 at 18:58 -0500, Kim Barrett wrote: > > > > On Nov 15, 2016, at 5:21 AM, Thomas Schatzl > com> wrote: > > > > Hi Kim, > > > > On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: > > > > > > > > > > > > > > > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl > > > le.c > > > > om> wrote: > > > > Maybe it would be cleaner to call a method in the barrier set > > > > instead of inlining the dirtying + enqueuing in lines 685 to > > > > 691? > > > > Maybe as an additional RFE. > > > We could use _ct_bs->invalidate(dirtyRegion).  That's rather > > > overgeneralized and inefficient for this situation, but this > > > situation should occur *very* rarely; it requires a stale card > > > get > > > processed just as a humongous object is in the midst of being > > > allocated in the same region. > > I kind of think for these reasons we should use _ct_bs- > > >invalidate() as > > it seems clearer to me.
There is the mentioned drawback of having > > no > > other more efficient way, so I will let you decide about this. > I've made the change to call invalidate, and also updated some > comments. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > thanks, looks good. Thomas From shafi.s.ahmad at oracle.com Wed Nov 16 12:52:24 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 16 Nov 2016 04:52:24 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <582B622F.7030909@oracle.com> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> Message-ID: <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Hi Vladimir, Thank you for the review and feedback. Please find updated webrevs: http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed the test case as it use only jdk9 APIs. http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed test methods testFixedOffsetHeaderArray17() and testFixedOffsetHeader17() which referenced jdk9 API UNSAFE.getIntUnaligned. Regards, Shafi > -----Original Message----- > From: Vladimir Kozlov > Sent: Wednesday, November 16, 2016 1:00 AM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Hi Shafi > > You should not backport tests which use only new JDK 9 APIs. Like > TestUnsafeUnalignedMismatchedAccesses.java test. > > But it is perfectly fine to modify backport by removing part of changes which > use a new API. For example, 8162101 changes in OpaqueAccesses.java test > which use getIntUnaligned() method.
> > It is unfortunate that 8140309 changes include also code which process new > Unsafe Unaligned intrinsics from JDK 9. It should not be backported but it will > simplify this and following backports. So I agree with changes you did for > 8140309 backport. > > Thanks, > Vladimir > > On 11/14/16 10:34 PM, Shafi Ahmad wrote: > > Hi Vladimir, > > > > Thanks for the review. > > > >> -----Original Message----- > > > >> From: Vladimir Kozlov > > > >> Sent: Monday, November 14, 2016 11:20 PM > > > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > > > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > > > >> mismatched unsafe accesses > > > >> > > > >> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > > > >> > Hi Vladimir, > > > >> > > > > >> > Thanks for the review. > > > >> > > > > >> > Please find updated webrevs. > > > >> > > > > >> > All webrevs are with respect to the base changes on JDK-8140309. > > > >> >http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > > > >> > > > >> Why you kept unaligned parameter in changes? > > > > The fix of JDK-8136473 caused many problems after integration (see JDK- > 8140267). > > > > The fix was backed out and re-implemented with JDK-8140309 by slightly > changing the assert: > > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- > Novem > > ber/019696.html > > > > The code change for the fix of JDK-8140309 is code changes for JDK-8136473 > by slightly changing one assert. > > > > jdk9 original changeset is > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > > > > As this is a backport so I keep the changes as it is. > > > >> > > > >> The test TestUnsafeUnalignedMismatchedAccesses.java will not work > >> since > > > >> since Unsafe class in jdk8 does not have unaligned methods. > > > >> Hot did you run it? > > > > I am sorry, looks there is some issue with my testing. 
> > > > I have run jtreg test after merging the changes but somehow the test does > not run and I verified only the failing list of jtreg result. > > > > When I run the test case separately it is failing as you already pointed out > the same. > > > > $java -jar ~/Tools/jtreg/lib/jtreg.jar > > -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ > > > hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedA > > ccesses.java > > > > Test results: failed: 1 > > > > Report written to > > /scratch/shshahma/Java/jdk8u-dev- > 8140309_01/JTreport/html/report.html > > > > Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > > > > Error: > > > > /scratch/shshahma/Java/jdk8u-dev- > 8140309_01/hotspot/test/compiler/intr > > insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: > > cannot find symbol > > > > UNSAFE.putIntUnaligned(array, > > UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > > > > Not sure if we should push without the test case. > > > >> > > > >> >http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > > > >> > > > >> Good. Did you run new UnsafeAccess.java test? > > > > Due to same process issue the test case is not run and when I run it > separately it fails. > > > > It passes after doing below changes: > > > > 1. Added /othervm > > > > 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by 'import > sun.misc.Unsafe;' > > > > Updated webrev: > > http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > > > >> > > > >> >http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > > > > I am getting the similar compilation error as above for added test case. Not > sure if we can push without the test case. > > > > Regards, > > > > Shafi > > > >> > > > >> Good. 
> > > >> > > > >> Thanks, > > > >> Vladimir > > > >> > > > >> > > > > >> > Regards, > > > >> > Shafi > > > >> > > > > >> > > > > >> > > > > >> >> -----Original Message----- > > > >> >> From: Vladimir Kozlov > > > >> >> Sent: Friday, November 11, 2016 1:26 AM > > > >> >> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >> >> > > > >> >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >> >> produces > > > >> >> mismatched unsafe accesses > > > >> >> > > > >> >> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > > > >> >>> Hi, > > > >> >>> > > > >> >>> Please review the backport of following dependent backports. > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 > > > >> >>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 > >> >>>[JDK- > > > >> >> 8080289]. Manual merge is not done as the corresponding code is > >> >> not > > > >> >> there in jdk8u-dev. > > > >> >>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and > > > >> >>> manual > > > >> >> merge is done. > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > > > >> >> > > > >> >> unaligned unsafe access methods were added in jdk 9 only. In your > > > >> >> changes unaligned argument is always false. You can simplify changes. 
> > > >> >> > > > >> >> Also you should base changes on JDK-8140309 (original 8136473 > >> >> changes > > > >> >> were backout by 8140267): > > > >> >> > > > >> >> On 11/4/15 10:21 PM, Roland Westrelin wrote: > > > >> >> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > > > >> >> > > > > >> >> > Same as 8136473 with only the following change: > > > >> >> > > > > >> >> > diff --git a/src/share/vm/opto/library_call.cpp > > > >> >> b/src/share/vm/opto/library_call.cpp > > > >> >> > --- a/src/share/vm/opto/library_call.cpp > > > >> >> > +++ b/src/share/vm/opto/library_call.cpp > > > >> >> > @@ -2527,7 +2527,7 @@ > > > >> >> > // of safe & unsafe memory. > > > >> >> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > > > >> >> > > > > >> >> > - assert(is_native_ptr || alias_type->adr_type() == > > > >> >> TypeOopPtr::BOTTOM > > > >> >> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || > > > >> >> alias_type->adr_type() == TypeOopPtr::BOTTOM || > > > >> >> > alias_type->field() != NULL || alias_type->element() != > > > >> >> NULL, "field, array element or unknown"); > > > >> >> > bool mismatched = false; > > > >> >> > if (alias_type->element() != NULL || alias_type->field() != NULL) { > > > >> >> > > > > >> >> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > > > >> >> is_native_ptr case and the case where the unsafe method is called > >> >> with a > > > >> null object. > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > > > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > > >> >>> > > > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 > > > >> >> [JDK-8140309]. Manual merge is not done as the corresponding code > >> >> is > > > >> >> not there in jdk8u-dev. > > > >> >> > > > >> >> I explained situation with this line above. 
> > > >> >> > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > > >> >> > > > >> >> This webrev is not incremental for your 8136473 changes - > > > >> >> library_call.cpp has part from 8136473 changes. > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 > > > >> >>> Clean merge > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > > > >> >> > > > >> >> Thanks seems fine. > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > > > >> >>> > > > >> >>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 > > > >> >>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > > > >> > >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > > > >> >>> [JDK-8160360] - Resolved 2. > > > >> > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 > >> >>73 > > > >> >> [JDK-8148146] - Manual merge is not done as the corresponding code > >> >> is > > > >> >> not there in jdk8u-dev. > > > >> >>> webrev link: > > > >>http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > > > >> >> > > > >> >> This webrev is not incremental in library_call.cpp. Difficult to > >> >> see > > > >> >> this part of changes. 
> > > >> >> > > > >> >> Thanks, > > > >> >> Vladimir > > > >> >> > > > >> >>> jdk9 changeset: > > > >> >>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > > > >> >>> > > > >> >>> Testing: jprt and jtreg > > > >> >>> > > > >> >>> Regards, > > > >> >>> Shafi > > > >> >>> > > > >> >>>> -----Original Message----- > > > >> >>>> From: Shafi Ahmad > > > >> >>>> Sent: Thursday, October 20, 2016 10:08 AM > > > >> >>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >> >>>> > > > >> >>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > > > >> >>>> produces mismatched unsafe accesses > > > >> >>>> > > > >> >>>> Thanks Vladimir. > > > >> >>>> > > > >> >>>> I will create dependent backport of 1. > > > >> >>>>https://bugs.openjdk.java.net/browse/JDK-8136473 > > > >> >>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > > > >> >>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > > > >> >>>> > > > >> >>>> Regards, > > > >> >>>> Shafi > > > >> >>>> > > > >> >>>>> -----Original Message----- > > > >> >>>>> From: Vladimir Kozlov > > > >> >>>>> Sent: Wednesday, October 19, 2016 8:27 AM > > > >> >>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >> >>>>> > > > >> >>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > > > >> >>>>> produces mismatched unsafe accesses > > > >> >>>>> > > > >> >>>>> Hi Shafi, > > > >> >>>>> > > > >> >>>>> You should also consider backporting following related fixes: > > > >> >>>>> > > > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8155781 > > > >> >>>>>https://bugs.openjdk.java.net/browse/JDK-8162101 > > > >> >>>>> > > > >> >>>>> Otherwise you may hit asserts added by 8134918 changes. 
> > > >> >>>>> > > > >> >>>>> Thanks, > > > >> >>>>> Vladimir > > > >> >>>>> > > > >> >>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > > > >> >>>>>> Hi All, > > > >> >>>>>> > > > >> >>>>>> Please review the backport of JDK-8134918 - C2: Type > >> >>>>>> speculation > > > >> >>>>>> produces > > > >> >>>>> mismatched unsafe accesses to jdk8u-dev. > > > >> >>>>>> > > > >> >>>>>> Please note that backport is not clean and the conflict is due to: > > > >> >>>>>> > > > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > > > >> >>>>>> 1 > > > >> >>>>>> 65 > > > >> >>>>>> > > > >> >>>>>> Getting debug build failure because of: > > > >> >>>>>> > > > >>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > > > >> >>>>>> 1 > > > >> >>>>>> 55 > > > >> >>>>>> > > > >> >>>>>> The above changes are done under bug# 'JDK-8136473: failed: no > > > >> >>>>> mismatched stores, except on raw memory: StoreB StoreI' which > >> >>>>> is > > > >> >>>>> not back ported to jdk8u and the current backport is on top of > > > >> >>>>> above > > > >> >> change. > > > >> >>>>>> > > > >> >>>>>> Please note that I am not sure if there is any dependency > > > >> >>>>>> between these > > > >> >>>>> two changesets. 
> > > >> >>>>>> > > > >> >>>>>> open webrev: > > > >> >>>>http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > > > >> >>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > > > >> >>>>>> jdk9 changeset: > > > >> >>>>>http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > > > >> >>>>>> > > > >> >>>>>> testing: Passes JPRT, jtreg not completed > > > >> >>>>>> > > > >> >>>>>> Regards, > > > >> >>>>>> Shafi > > > >> >>>>>> > > From kevin.walls at oracle.com Wed Nov 16 15:57:10 2016 From: kevin.walls at oracle.com (Kevin Walls) Date: Wed, 16 Nov 2016 15:57:10 +0000 Subject: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 ciObjectFactory::create_new_metadata In-Reply-To: References: <2e1de7f0-cc65-47f7-9f97-cb0e56dacfe1@default> Message-ID: <970a44a7-ebbc-e04c-5891-875c93c0aa58@oracle.com> Hi Shafi - yes, backport looks good, Regards Kevin On 10/11/2016 07:10, Shafi Ahmad wrote: > Hi All, > > May I get the second review for this backport. > > Regards, > Shafi > >> -----Original Message----- >> From: Shafi Ahmad >> Sent: Tuesday, October 25, 2016 9:09 AM >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >> Cc: Vladimir Ivanov >> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 >> ciObjectFactory::create_new_metadata >> >> May I get the second review for this backport. >> >> Regards, >> Shafi >> >>> -----Original Message----- >>> From: Shafi Ahmad >>> Sent: Thursday, October 20, 2016 9:55 AM >>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net >>> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with >>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata >>> >>> Thank you Vladimir for the review. >>> >>> Please find the updated webrev link. >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.01/ >>> >>> All, >>> >>> May I get 2nd review for this. 
>>> >>> Regards, >>> Shafi >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Wednesday, October 19, 2016 10:14 PM >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>> Cc: Vladimir Ivanov; Jamsheed C M >>>> Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata >>>> >>>> In ciMethod.hpp you duplicated comment line: >>>> >>>> + // Given a certain calling environment, find the monomorphic >>>> + target >>>> // Given a certain calling environment, find the monomorphic >>>> target >>>> >>>> Otherwise looks good. >>>> >>>> Thanks, >>>> Vladimir K >>>> >>>> On 10/19/16 12:53 AM, Shafi Ahmad wrote: >>>>> Hi All, >>>>> >>>>> Please review the backport of 'JDK-8134389: Crash in HotSpot with >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata' to jdk8u-dev. >>>>> Please note that backport is not clean as I was getting build failure due >> to: >>>>> Formal parameter 'ignore_return' in method >>>>> GraphBuilder::method_return >>>> is added in the fix of https://bugs.openjdk.java.net/browse/JDK- >> 8164122. >>>>> The current code change is done on top of aforesaid bug fix and >>>>> this formal >>>> parameter is referenced in this code change. >>>>> * if (x != NULL && !ignore_return) { * >>>>> >>>>> Author of this code change suggested me, we can safely remove this >>>> addition conditional expression ' && !ignore_return'. 
>>>>> open webrev: >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.00/ >>>>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8134389 >>>>> jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs- >>>> comp/hotspot/rev/4191b33b3629 >>>>> testing: Passes JPRT, jtreg on Linux [amd64] and newly added test >>>>> case >>>>> >>>>> Regards, >>>>> Shafi >>>>> From shafi.s.ahmad at oracle.com Wed Nov 16 16:32:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 16 Nov 2016 08:32:42 -0800 (PST) Subject: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 ciObjectFactory::create_new_metadata In-Reply-To: <970a44a7-ebbc-e04c-5891-875c93c0aa58@oracle.com> References: <2e1de7f0-cc65-47f7-9f97-cb0e56dacfe1@default> <970a44a7-ebbc-e04c-5891-875c93c0aa58@oracle.com> Message-ID: <5e62d234-df62-44b5-826a-c041a002e548@default> Thank you Kevin for the review. Regards, Shafi > -----Original Message----- > From: Kevin Walls > Sent: Wednesday, November 16, 2016 9:27 PM > To: Shafi Ahmad; Vladimir Kozlov; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with jvm.dll+0x42b48 > ciObjectFactory::create_new_metadata > > > Hi Shafi - yes, backport looks good, > > Regards > Kevin > > On 10/11/2016 07:10, Shafi Ahmad wrote: > > Hi All, > > > > May I get the second review for this backport. > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Shafi Ahmad > >> Sent: Tuesday, October 25, 2016 9:09 AM > >> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >> Cc: Vladimir Ivanov > >> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with > >> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > >> > >> May I get the second review for this backport. 
> >> > >> Regards, > >> Shafi > >> > >>> -----Original Message----- > >>> From: Shafi Ahmad > >>> Sent: Thursday, October 20, 2016 9:55 AM > >>> To: Vladimir Kozlov; hotspot-dev at openjdk.java.net > >>> Subject: RE: [8u] RFR for JDK-8134389: Crash in HotSpot with > >>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > >>> > >>> Thank you Vladimir for the review. > >>> > >>> Please find the updated webrev link. > >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.01/ > >>> > >>> All, > >>> > >>> May I get 2nd review for this. > >>> > >>> Regards, > >>> Shafi > >>> > >>>> -----Original Message----- > >>>> From: Vladimir Kozlov > >>>> Sent: Wednesday, October 19, 2016 10:14 PM > >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>> Cc: Vladimir Ivanov; Jamsheed C M > >>>> Subject: Re: [8u] RFR for JDK-8134389: Crash in HotSpot with > >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata > >>>> > >>>> In ciMethod.hpp you duplicated comment line: > >>>> > >>>> + // Given a certain calling environment, find the monomorphic > >>>> + target > >>>> // Given a certain calling environment, find the monomorphic > >>>> target > >>>> > >>>> Otherwise looks good. > >>>> > >>>> Thanks, > >>>> Vladimir K > >>>> > >>>> On 10/19/16 12:53 AM, Shafi Ahmad wrote: > >>>>> Hi All, > >>>>> > >>>>> Please review the backport of 'JDK-8134389: Crash in HotSpot with > >>>> jvm.dll+0x42b48 ciObjectFactory::create_new_metadata' to jdk8u-dev. > >>>>> Please note that backport is not clean as I was getting build > >>>>> failure due > >> to: > >>>>> Formal parameter 'ignore_return' in method > >>>>> GraphBuilder::method_return > >>>> is added in the fix of https://bugs.openjdk.java.net/browse/JDK- > >> 8164122. > >>>>> The current code change is done on top of aforesaid bug fix and > >>>>> this formal > >>>> parameter is referenced in this code change. 
> >>>>> * if (x != NULL && !ignore_return) { * > >>>>> > >>>>> Author of this code change suggested me, we can safely remove this > >>>> addition conditional expression ' && !ignore_return'. > >>>>> open webrev: > >>> http://cr.openjdk.java.net/~shshahma/8134389/webrev.00/ > >>>>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8134389 > >>>>> jdk9 changeset: http://hg.openjdk.java.net/jdk9/hs- > >>>> comp/hotspot/rev/4191b33b3629 > >>>>> testing: Passes JPRT, jtreg on Linux [amd64] and newly added test > >>>>> case > >>>>> > >>>>> Regards, > >>>>> Shafi > >>>>> > From david.buck at oracle.com Wed Nov 16 16:44:03 2016 From: david.buck at oracle.com (david buck) Date: Thu, 17 Nov 2016 01:44:03 +0900 Subject: RFR(S)[8u]: 8158639: C2 compilation fails with SIGSEGV In-Reply-To: References: Message-ID: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> (moving to hotspot-dev for more exposure.) Jamsheed, thanks once again reviewing my backport! Any reviewers out there willing to chime in? Cheers, -Buck -------- Forwarded Message -------- Subject: Re: RFR[8u]: 8158639: C2 compilation fails with SIGSEGV Date: Wed, 16 Nov 2016 21:48:10 +0530 From: Jamsheed C m Organization: Oracle Corporation To: david buck , hotspot-compiler-dev at openjdk.java.net Thanks for fixing. new webrev looks good to me (not a reviewer). Best Regards, Jamsheed On 11/16/2016 4:31 PM, david buck wrote: > Hi Jamsheed! > > Thank you for catching the mistake! I have modified the backport to > include the relevant change from 8072008 [0]. Here is an updated webrev: > > http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_02/ > > In the new chunk of code added, the only difference from the code in > JDK 9 is I had to add a call to err_msg() as JDK 8 does not have > variadic macro version of assert() [1]. > > I have reran all tests (both JPRT and manual) with no issues. 
> > Cheers, > -Buck > > [0] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9988b390777b > > [1] https://bugs.openjdk.java.net/browse/JDK-8080775 > > On 2016/11/16 16:29, Jamsheed C m wrote: >> Hi David, >> >> this change is missing >> >> JVMState* VirtualCallGenerator::generate(JVMState* jvms) { >> >> ... >> >> if (kit.gvn().type(receiver)->higher_equal(TypePtr::NULL_PTR)) { >> assert(Bytecodes::is_invoke(kit.java_bc()), "%d: %s", kit.java_bc(), >> Bytecodes::name(kit.java_bc())); >> ciMethod* declared_method = >> kit.method()->get_method_at_bci(kit.bci()); >> int arg_size = >> declared_method->signature()->arg_size_for_bc(kit.java_bc()); >> kit.inc_sp(arg_size); // restore arguments >> kit.uncommon_trap(Deoptimization::Reason_null_check, >> Deoptimization::Action_none, >> NULL, "null receiver"); >> >> >> Best Regards, >> >> Jamsheed >> >> >> On 11/15/2016 8:55 PM, david buck wrote: >>> Hi! >>> >>> Please review the backported changes of JDK-8158639 to 8u: >>> >>> It is a very straightforward backport. The only two differences are: >>> >>> - I added a convenience macro, get_method_at_bci(), from the change >>> for 8072008 to make the backport cleaner. >>> >>> - I had to modify (remove) the package used for the testcase. 
>>> >>> Bug Report: >>> [ 8158639: C2 compilation fails with SIGSEGV ] >>> https://bugs.openjdk.java.net/browse/JDK-8158639 >>> >>> JDK 9 changeset: >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/119a2a3cc29b >>> >>> 8u-dev Webrev: >>> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_01/ >>> >>> Testing: >>> Manual verification and JPRT (default and hotspot testsets) >>> >>> Cheers, >>> -Buck >> From vladimir.kozlov at oracle.com Wed Nov 16 16:51:09 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 16 Nov 2016 08:51:09 -0800 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <4332d26a-0efa-4582-9068-f28fb7ebd109@default> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Message-ID: Looks good. I would suggest to run all jtreg tests (or even RBT) when you apply all changes before pushing this. Thanks, Vladimir On 11/16/16 4:52 AM, Shafi Ahmad wrote: > Hi Vladimir, > > Thank you for the review and feedback. > > Please find updated webrevs: > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed the test case as it use only jdk9 APIs. > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed test methods testFixedOffsetHeaderArray17() and testFixedOffsetHeader17() which referenced jdk9 API UNSAFE.getIntUnaligned. > > > Regards, > Shafi > > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Wednesday, November 16, 2016 1:00 AM >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> Hi Shafi >> >> You should not backport tests which use only new JDK 9 APIs. Like >> TestUnsafeUnalignedMismatchedAccesses.java test. 
>> >> But it is perfectly fine to modify backport by removing part of changes which >> use a new API. For example, 8162101 changes in OpaqueAccesses.java test >> which use getIntUnaligned() method. >> >> It is unfortunate that 8140309 changes include also code which process new >> Unsafe Unaligned intrinsics from JDK 9. It should not be backported but it will >> simplify this and following backports. So I agree with changes you did for >> 8140309 backport. >> >> Thanks, >> Vladimir >> >> On 11/14/16 10:34 PM, Shafi Ahmad wrote: >>> Hi Vladimir, >>> >>> Thanks for the review. >>> >>>> -----Original Message----- >>> >>>> From: Vladimir Kozlov >>> >>>> Sent: Monday, November 14, 2016 11:20 PM >>> >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>> >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >>> >>>> mismatched unsafe accesses >>> >>>> >>> >>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: >>> >>>>> Hi Vladimir, >>> >>>>> >>> >>>>> Thanks for the review. >>> >>>>> >>> >>>>> Please find updated webrevs. >>> >>>>> >>> >>>>> All webrevs are with respect to the base changes on JDK-8140309. >>> >>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ >>> >>>> >>> >>>> Why you kept unaligned parameter in changes? >>> >>> The fix of JDK-8136473 caused many problems after integration (see JDK- >> 8140267). >>> >>> The fix was backed out and re-implemented with JDK-8140309 by slightly >> changing the assert: >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- >> Novem >>> ber/019696.html >>> >>> The code change for the fix of JDK-8140309 is code changes for JDK-8136473 >> by slightly changing one assert. >>> >>> jdk9 original changeset is >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c >>> >>> As this is a backport so I keep the changes as it is. 
>>> >>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not work >>> >>>> since Unsafe class in jdk8 does not have unaligned methods. >>> >>>> How did you run it? >>> >>> I am sorry, looks there is some issue with my testing. >>> >>> I have run jtreg test after merging the changes but somehow the test does >> not run and I verified only the failing list of jtreg result. >>> >>> When I run the test case separately it is failing as you already pointed out >> the same. >>> >>> $java -jar ~/Tools/jtreg/lib/jtreg.jar >>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ >>> >> hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatchedA >>> ccesses.java >>> >>> Test results: failed: 1 >>> >>> Report written to >>> /scratch/shshahma/Java/jdk8u-dev- >> 8140309_01/JTreport/html/report.html >>> >>> Results written to /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork >>> >>> Error: >>> >>> /scratch/shshahma/Java/jdk8u-dev- >> 8140309_01/hotspot/test/compiler/intr >>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: >>> cannot find symbol >>> >>> UNSAFE.putIntUnaligned(array, >>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); >>> >>> Not sure if we should push without the test case. >>> >>>> >>> >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ >>> >>>> >>> >>>> Good. Did you run new UnsafeAccess.java test? >>> >>> Due to same process issue the test case is not run and when I run it >> separately it fails. >>> >>> It passes after doing below changes: >>> >>> 1. Added /othervm >>> >>> 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by 'import >> sun.misc.Unsafe;' >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ >>> >>>> >>> >>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ >>> >>> I am getting a similar compilation error as above for the added test case. Not >> sure if we can push without the test case.
>>> >>> Regards, >>> >>> Shafi >>> >>>> >>> >>>> Good. >>> >>>> >>> >>>> Thanks, >>> >>>> Vladimir >>> >>>> >>> >>>>> >>> >>>>> Regards, >>> >>>>> Shafi >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>>> -----Original Message----- >>> >>>>>> From: Vladimir Kozlov >>> >>>>>> Sent: Friday, November 11, 2016 1:26 AM >>> >>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>> >>> >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>>> produces >>> >>>>>> mismatched unsafe accesses >>> >>>>>> >>> >>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: >>> >>>>>>> Hi, >>> >>>>>>> >>> >>>>>>> Please review the backport of following dependent backports. >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 >>> >>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 >>>>>>> [JDK- >>> >>>>>> 8080289]. Manual merge is not done as the corresponding code is >>>>>> not >>> >>>>>> there in jdk8u-dev. >>> >>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp and >>> >>>>>>> manual >>> >>>>>> merge is done. >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ >>> >>>>>> >>> >>>>>> unaligned unsafe access methods were added in jdk 9 only. In your >>> >>>>>> changes unaligned argument is always false. You can simplify changes. 
>>> >>>>>> >>> >>>>>> Also you should base changes on JDK-8140309 (original 8136473 >>>>>> changes >>> >>>>>> were backout by 8140267): >>> >>>>>> >>> >>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: >>> >>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ >>> >>>>>> > >>> >>>>>> > Same as 8136473 with only the following change: >>> >>>>>> > >>> >>>>>> > diff --git a/src/share/vm/opto/library_call.cpp >>> >>>>>> b/src/share/vm/opto/library_call.cpp >>> >>>>>> > --- a/src/share/vm/opto/library_call.cpp >>> >>>>>> > +++ b/src/share/vm/opto/library_call.cpp >>> >>>>>> > @@ -2527,7 +2527,7 @@ >>> >>>>>> > // of safe & unsafe memory. >>> >>>>>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); >>> >>>>>> > >>> >>>>>> > - assert(is_native_ptr || alias_type->adr_type() == >>> >>>>>> TypeOopPtr::BOTTOM >>> >>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || >>> >>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || >>> >>>>>> > alias_type->field() != NULL || alias_type->element() != >>> >>>>>> NULL, "field, array element or unknown"); >>> >>>>>> > bool mismatched = false; >>> >>>>>> > if (alias_type->element() != NULL || alias_type->field() != NULL) { >>> >>>>>> > >>> >>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the >>> >>>>>> is_native_ptr case and the case where the unsafe method is called >>>>>> with a >>> >>>> null object. >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>> >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> >>>>>>> >>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.165 >>> >>>>>> [JDK-8140309]. Manual merge is not done as the corresponding code >>>>>> is >>> >>>>>> not there in jdk8u-dev. >>> >>>>>> >>> >>>>>> I explained situation with this line above. 
>>> >>>>>> >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>> >>>>>> >>> >>>>>> This webrev is not incremental for your 8136473 changes - >>> >>>>>> library_call.cpp has part from 8136473 changes. >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 >>> >>>>>>> Clean merge >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ >>> >>>>>> >>> >>>>>> Thanks seems fine. >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 >>> >>>>>>> >>> >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>> >>>> >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 >>> >>>>>>> [JDK-8160360] - Resolved 2. >>> >>>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 >>>>>> 73 >>> >>>>>> [JDK-8148146] - Manual merge is not done as the corresponding code >>>>>> is >>> >>>>>> not there in jdk8u-dev. >>> >>>>>>> webrev link: >>> >>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ >>> >>>>>> >>> >>>>>> This webrev is not incremental in library_call.cpp. Difficult to >>>>>> see >>> >>>>>> this part of changes. 
>>> >>>>>> >>> >>>>>> Thanks, >>> >>>>>> Vladimir >>> >>>>>> >>> >>>>>>> jdk9 changeset: >>> >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 >>> >>>>>>> >>> >>>>>>> Testing: jprt and jtreg >>> >>>>>>> >>> >>>>>>> Regards, >>> >>>>>>> Shafi >>> >>>>>>> >>> >>>>>>>> -----Original Message----- >>> >>>>>>>> From: Shafi Ahmad >>> >>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM >>> >>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net >>>>>>>> >>> >>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation >>> >>>>>>>> produces mismatched unsafe accesses >>> >>>>>>>> >>> >>>>>>>> Thanks Vladimir. >>> >>>>>>>> >>> >>>>>>>> I will create dependent backport of 1. >>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 >>> >>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 >>> >>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>>>>>>> >>> >>>>>>>> Regards, >>> >>>>>>>> Shafi >>> >>>>>>>> >>> >>>>>>>>> -----Original Message----- >>> >>>>>>>>> From: Vladimir Kozlov >>> >>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM >>> >>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>>>>> >>> >>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>> >>>>>>>>> produces mismatched unsafe accesses >>> >>>>>>>>> >>> >>>>>>>>> Hi Shafi, >>> >>>>>>>>> >>> >>>>>>>>> You should also consider backporting following related fixes: >>> >>>>>>>>> >>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>> >>>>>>>>> >>> >>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. >>> >>>>>>>>> >>> >>>>>>>>> Thanks, >>> >>>>>>>>> Vladimir >>> >>>>>>>>> >>> >>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>> >>>>>>>>>> Hi All, >>> >>>>>>>>>> >>> >>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type >>>>>>>>>> speculation >>> >>>>>>>>>> produces >>> >>>>>>>>> mismatched unsafe accesses to jdk8u-dev. 
>>> >>>>>>>>>> >>> >>>>>>>>>> Please note that backport is not clean and the conflict is due to: >>> >>>>>>>>>> >>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>> >>>>>>>>>> 1 >>> >>>>>>>>>> 65 >>> >>>>>>>>>> >>> >>>>>>>>>> Getting debug build failure because of: >>> >>>>>>>>>> >>> >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>> >>>>>>>>>> 1 >>> >>>>>>>>>> 55 >>> >>>>>>>>>> >>> >>>>>>>>>> The above changes are done under bug# 'JDK-8136473: failed: no >>> >>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' which >>>>>>>>> is >>> >>>>>>>>> not back ported to jdk8u and the current backport is on top of >>> >>>>>>>>> above >>> >>>>>> change. >>> >>>>>>>>>> >>> >>>>>>>>>> Please note that I am not sure if there is any dependency >>> >>>>>>>>>> between these >>> >>>>>>>>> two changesets. >>> >>>>>>>>>> >>> >>>>>>>>>> open webrev: >>> >>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>> >>>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>> >>>>>>>>>> jdk9 changeset: >>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>> >>>>>>>>>> >>> >>>>>>>>>> testing: Passes JPRT, jtreg not completed >>> >>>>>>>>>> >>> >>>>>>>>>> Regards, >>> >>>>>>>>>> Shafi >>> >>>>>>>>>> >>> From kim.barrett at oracle.com Wed Nov 16 17:28:09 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Nov 2016 12:28:09 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479288087.2466.36.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> 
<36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> <1479288087.2466.36.camel@oracle.com> Message-ID: <47742160-0A06-48C9-BDBD-76F453C33A68@oracle.com> > On Nov 16, 2016, at 4:21 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Tue, 2016-11-15 at 18:58 -0500, Kim Barrett wrote: >>> >>> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl >> com> wrote: >>> >>> Hi Kim, >>> >>> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>> >>>>> >>>>> >>>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>>> le.c >>>>> om> wrote: >>>>> Maybe it would be cleaner to call a method in the barrier set >>>>> instead of inlining the dirtying + enqueuing in lines 685 to >>>>> 691? >>>>> Maybe as an additional RFE. >>>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>>> overgeneralized and inefficient for this situation, but this >>>> situation should occur *very* rarely; it requires a stale card >>>> get >>>> processed just as a humongous object is in the midst of being >>>> allocated in the same region. >>> I kind of think for these reasons we should use _ct_bs- >>>> invalidate() as >>> it seems clearer to me. There is the mentioned drawback of having >>> no >>> other more efficient way, so I will let you decide about this. >> I've made the change to call invalidate, and also updated some >> comments. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166607 >> >> Webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ >> incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ >> > > thanks, looks good. > > Thomas Thanks. 
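The klass_or_null_acquire pattern reviewed in this thread is an instance of the general safe-publication idiom: an object's body is initialized with plain stores, then its klass is published with a release store, so a concurrent reader (here, the refinement thread processing a possibly stale card) must acquire-load the klass and treat null as "not yet published". A minimal, standalone Java sketch of that idiom follows; all names are illustrative and this is not HotSpot code.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Standalone analogue of the klass_or_null_acquire pattern: the
// "allocator" initializes the object body with plain stores, then
// publishes it by release-storing the klass; the "refiner" acquire-loads
// the klass and skips the object (stale card) while it is still null.
public class KlassPublishSketch {
    static int payload;    // object body, written with an ordinary store
    static Object klass;   // publication point, accessed via VarHandle

    static final VarHandle KLASS;
    static {
        try {
            KLASS = MethodHandles.lookup()
                    .findStaticVarHandle(KlassPublishSketch.class, "klass", Object.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void allocate() {
        payload = 42;                  // plain store: initialize the body
        KLASS.setRelease("DemoKlass"); // release store: publish the object
    }

    static Object refine() {
        Object k = KLASS.getAcquire(); // acquire load pairs with the release store
        if (k == null) {
            return null;               // stale card: object not published yet
        }
        // Once the klass is visible, acquire/release ordering guarantees
        // the body stores are visible too.
        if (payload != 42) throw new AssertionError("init not visible");
        return k;
    }

    public static void main(String[] args) {
        if (refine() != null) throw new AssertionError("published too early");
        allocate();
        if (!"DemoKlass".equals(refine())) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The sketch runs single-threaded for determinism; the comments describe the concurrent intent that the acquire/release pair exists to serve.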
From kim.barrett at oracle.com Wed Nov 16 18:02:07 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 16 Nov 2016 13:02:07 -0500 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1479287214.2466.35.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> <1479287214.2466.35.camel@oracle.com> Message-ID: <1A54AB8B-C2A8-4F17-BC98-E76FE815A009@oracle.com> > On Nov 16, 2016, at 4:06 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Tue, 2016-11-15 at 19:00 -0500, Kim Barrett wrote: >> I've updated some comments to mention that external synchronization. > > 581 // The region could be young. Cards for young regions are set > to > 582 // g1_young_gen, so the post-barrier will filter them > out. However, > 583 // that marking is performed concurrently. A write to a young > 584 // object could occur before the card has been marked young, > slipping > 585 // past the filter. > > I would prefer if the text would not change terminology for the same > thing mid-paragraph, from "setting" to "marking". The advantage of it > reading better seems to be smaller than the potential confusion. // The region could be young. Cards for young regions are // distinctly marked (set to g1_young_gen), so the post-barrier will // filter them out. However, that marking is performed // concurrently. A write to a young object could occur before the // card has been marked young, slipping past the filter. Better? > > Everything else looks very nice. > > Thanks for considering my comments. Thanks, and thank you for reviewing so carefully. 
>> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166811 >> >> Webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ >> incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ >> >> Also, since this set of changes is rather intertwined with the >> changes >> for 8166607, here is a combined webrev for both: >> http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ >> >> I think I'll do as Erik suggested and push the two together. > > Just fyi, you can push two commits at once, or one commit having two > CR-number lines. > I think it is sufficient to commit these two changes in a single push > job, but I do not see a need for making it a single commit. > > Either way is fine with me. Perhaps I sowed confusion with the combined webrev. The purpose of that was to make it easy to see the combined effect of the two changes. I'm planning to do one push of two change sets. > > Thanks, > Thomas From gromero at linux.vnet.ibm.com Thu Nov 17 01:45:50 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 16 Nov 2016 23:45:50 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation Message-ID: <582D0BCE.2030209@linux.vnet.ibm.com> Hi, Currently, optimization for building fdlibm is disabled, except for the "solaris" OS target [1].
As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, sin(), cos(), and tan() perform very poorly in comparison to the same methods in Math class [2]: Math StrictMath ========= ========== sin 0m29.984s 1m41.184s cos 0m30.031s 1m41.200s tan 0m31.772s 1m46.976s asin 0m4.577s 0m4.543s acos 0m4.539s 0m4.525s atan 0m12.929s 0m12.896s exp 0m1.071s 0m4.570s log 0m3.272s 0m14.239s log10 0m4.362s 0m20.236s sqrt 0m0.913s 0m0.981s cbrt 0m10.786s 0m10.808s sinh 0m4.438s 0m4.433s cosh 0m4.496s 0m4.478s tanh 0m3.360s 0m3.353s expm1 0m4.076s 0m4.094s log1p 0m13.518s 0m13.527s IEEEremainder 0m38.803s 0m38.909s atan2 0m20.100s 0m20.057s pow 0m14.096s 0m19.938s hypot 0m5.136s 0m5.122s Switching on the -O3 optimization can damage the precision of those methods; nonetheless it's possible to avoid that side effect and yet get huge benefits from the -O3 optimization on PPC64 if -fno-expensive-optimizations is passed in addition to the -O3 optimization flag. In that sense the following change is proposed to resolve the issue: diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 @@ -33,10 +33,16 @@ # libfdlibm is statically linked with libjava below and not delivered into the # product on its own.
-BUILD_LIBFDLIBM_OPTIMIZATION := HIGH +BUILD_LIBFDLIBM_OPTIMIZATION := NONE -ifneq ($(OPENJDK_TARGET_OS), solaris) - BUILD_LIBFDLIBM_OPTIMIZATION := NONE +ifeq ($(OPENJDK_TARGET_OS), solaris) + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH +endif + +ifeq ($(OPENJDK_TARGET_OS), linux) + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH + endif endif LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm @@ -51,6 +57,7 @@ CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ CFLAGS_windows_debug := -DLOGGING, \ CFLAGS_aix := -qfloat=nomaf, \ + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ DISABLED_WARNINGS_gcc := sign-compare, \ DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ ARFLAGS := $(ARFLAGS), \ diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100 +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500 @@ -569,16 +569,19 @@ $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES)) # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS. 
- $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) ifneq ($(DEBUG_LEVEL),release) # Pickup extra debug dependent variables for CFLAGS $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) else $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) endif # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS. After enabling the optimization it's possible to gain up to 3x in performance for the aforementioned methods without losing precision: StrictMath, original StrictMath, optimized ============================ ============================ sin 1.7136493465700542 1m41.184s 1.7136493465700542 0m33.895s cos 0.1709843554185943 1m41.200s 0.1709843554185943 0m33.884s tan -5.5500322522995315E7 1m46.976s -5.5500322522995315E7 0m36.461s asin NaN 0m4.543s NaN 0m3.175s acos NaN 0m4.525s NaN 0m3.211s atan 1.5707961389886132E8 0m12.896s 1.5707961389886132E8 0m7.100s exp Infinity 0m4.570s Infinity 0m3.187s log 1.7420680845245087E9 0m14.239s 1.7420680845245087E9 0m7.170s log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 0m9.610s sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 0m0.948s cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 0m10.786s sinh Infinity 0m4.433s Infinity 0m3.179s cosh Infinity 0m4.478s Infinity 0m3.174s tanh 9.999999971990079E7 0m3.353s 9.999999971990079E7 0m3.208s expm1 Infinity 0m4.094s Infinity 0m3.185s log1p
1.7420681029451895E9 0m13.527s 1.7420681029451895E9 0m8.756s IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 0m10.510s pow Infinity 0m19.938s Infinity 0m20.204s hypot 5.000000099033372E15 0m5.122s 5.000000099033372E15 0m5.130s I believe that, as FC has passed but FEC has not, the change can, after due scrutiny and review, be pushed if a special exception approval grants it. Once on 9, I'll request the downport to 8. Could I open a bug to address that issue? Thank you very much. Regards, Gustavo [1] http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39 [2] https://github.com/gromero/strictmath (comparison script used to get the results) From david.holmes at oracle.com Thu Nov 17 02:31:48 2016 From: david.holmes at oracle.com (David Holmes) Date: Thu, 17 Nov 2016 12:31:48 +1000 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <582D0BCE.2030209@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> Message-ID: <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> Adding in build-dev as they need to scrutinize all build changes. David On 17/11/2016 11:45 AM, Gustavo Romero wrote: > Hi, > > Currently, optimization for building fdlibm is disabled, except for the > "solaris" OS target [1].
> > As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, > sin(), cos(), and tan() perform verify poor in comparison to the same methods > in Math class [2]: > > Math StrictMath > ========= ========== > sin 0m29.984s 1m41.184s > cos 0m30.031s 1m41.200s > tan 0m31.772s 1m46.976s > asin 0m4.577s 0m4.543s > acos 0m4.539s 0m4.525s > atan 0m12.929s 0m12.896s > exp 0m1.071s 0m4.570s > log 0m3.272s 0m14.239s > log10 0m4.362s 0m20.236s > sqrt 0m0.913s 0m0.981s > cbrt 0m10.786s 0m10.808s > sinh 0m4.438s 0m4.433s > cosh 0m4.496s 0m4.478s > tanh 0m3.360s 0m3.353s > expm1 0m4.076s 0m4.094s > log1p 0m13.518s 0m13.527s > IEEEremainder 0m38.803s 0m38.909s > atan2 0m20.100s 0m20.057s > pow 0m14.096s 0m19.938s > hypot 0m5.136s 0m5.122s > > > Switching on the O3 optimization can damage precision of those methods, > nonetheless it's possible to avoid that side effect and yet get huge benefits of > the -O3 optimization on PPC64 if -fno-expensive-optimizations is passed in > addition to the -O3 optimization flag. > > In that sense the following change is proposed to resolve the issue: > > diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk > --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 > +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 > @@ -33,10 +33,16 @@ > # libfdlibm is statically linked with libjava below and not delivered into the > # product on its own. 
> > -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +BUILD_LIBFDLIBM_OPTIMIZATION := NONE > > -ifneq ($(OPENJDK_TARGET_OS), solaris) > - BUILD_LIBFDLIBM_OPTIMIZATION := NONE > +ifeq ($(OPENJDK_TARGET_OS), solaris) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +endif > + > +ifeq ($(OPENJDK_TARGET_OS), linux) > + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > + endif > endif > > LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm > @@ -51,6 +57,7 @@ > CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ > CFLAGS_windows_debug := -DLOGGING, \ > CFLAGS_aix := -qfloat=nomaf, \ > + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ > DISABLED_WARNINGS_gcc := sign-compare, \ > DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ > ARFLAGS := $(ARFLAGS), \ > > > diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk > --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100 > +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500 > @@ -569,16 +569,19 @@ > $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES)) > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS. 
> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) > + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ > + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) > ifneq ($(DEBUG_LEVEL),release) > # Pickup extra debug dependent variables for CFLAGS > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) > else > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) > endif > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS. > > > After enabling the optimization it's possible to again up to 3x on performance > regarding the aforementioned methods without losing precision: > > StrictMath, original StrictMath, optimized > ============================ ============================ > sin 1.7136493465700542 1m41.184s 1.7136493465700542 0m33.895s > cos 0.1709843554185943 1m41.200s 0.1709843554185943 0m33.884s > tan -5.5500322522995315E7 1m46.976s -5.5500322522995315E7 0m36.461s > asin NaN 0m4.543s NaN 0m3.175s > acos NaN 0m4.525s NaN 0m3.211s > atan 1.5707961389886132E8 0m12.896s 1.5707961389886132E8 0m7.100s > exp Infinity 0m4.570s Infinity 0m3.187s > log 1.7420680845245087E9 0m14.239s 1.7420680845245087E9 0m7.170s > log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 0m9.610s > sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 0m0.948s > cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 0m10.786s > sinh Infinity 0m4.433s Infinity 0m3.179s > cosh Infinity 0m4.478s Infinity 0m3.174s > tanh 9.999999971990079E7 0m3.353s 
9.999999971990079E7 0m3.208s > expm1 Infinity 0m4.094s Infinity 0m3.185s > log1p 1.7420681029451895E9 0m13.527s 1.7420681029451895E9 0m8.756s > IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s > atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 0m10.510s > pow Infinity 0m19.938s Infinity 0m20.204s > hypot 5.000000099033372E15 0m5.122s 5.000000099033372E15 0m5.130s > > > I believe that as the FC is passed but FEC is not the change can, after the due > scrutiny and review, be pushed if a special exception approval grants it. Once > on 9, I'll request the downport to 8. > > Could I open a bug to address that issue? > > Thank you very much. > > > Regards, > Gustavo > > [1] http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39 > [2] https://github.com/gromero/strictmath (comparison script used to get the results) > From erik.joelsson at oracle.com Thu Nov 17 09:17:33 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Thu, 17 Nov 2016 10:17:33 +0100 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> Message-ID: <9dea2dbf-4413-c03e-1cd6-8aceb0e263a0@oracle.com> Hello, Overall this looks reasonable to me. However, if we want to introduce a new possible tuple for specifying compilation flags to SetupNativeCompilation, we (the build team) would prefer if we used OPENJDK_TARGET_CPU instead of OPENJDK_TARGET_CPU_ARCH. /Erik On 2016-11-17 03:31, David Holmes wrote: > Adding in build-dev as they need to scrutinize all build changes. > > David > > On 17/11/2016 11:45 AM, Gustavo Romero wrote: >> Hi, >> >> Currently, optimization for building fdlibm is disabled, except for the >> "solaris" OS target [1]. 
>> >> As a consequence on PPC64 (Linux) StrictMath methods like, but not >> limited to, >> sin(), cos(), and tan() perform verify poor in comparison to the same >> methods >> in Math class [2]: >> >> Math StrictMath >> ========= ========== >> sin 0m29.984s 1m41.184s >> cos 0m30.031s 1m41.200s >> tan 0m31.772s 1m46.976s >> asin 0m4.577s 0m4.543s >> acos 0m4.539s 0m4.525s >> atan 0m12.929s 0m12.896s >> exp 0m1.071s 0m4.570s >> log 0m3.272s 0m14.239s >> log10 0m4.362s 0m20.236s >> sqrt 0m0.913s 0m0.981s >> cbrt 0m10.786s 0m10.808s >> sinh 0m4.438s 0m4.433s >> cosh 0m4.496s 0m4.478s >> tanh 0m3.360s 0m3.353s >> expm1 0m4.076s 0m4.094s >> log1p 0m13.518s 0m13.527s >> IEEEremainder 0m38.803s 0m38.909s >> atan2 0m20.100s 0m20.057s >> pow 0m14.096s 0m19.938s >> hypot 0m5.136s 0m5.122s >> >> >> Switching on the O3 optimization can damage precision of those methods, >> nonetheless it's possible to avoid that side effect and yet get huge >> benefits of >> the -O3 optimization on PPC64 if -fno-expensive-optimizations is >> passed in >> addition to the -O3 optimization flag. >> >> In that sense the following change is proposed to resolve the issue: >> >> diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk >> --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 >> +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 >> @@ -33,10 +33,16 @@ >> # libfdlibm is statically linked with libjava below and not >> delivered into the >> # product on its own. 
>> >> -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH >> +BUILD_LIBFDLIBM_OPTIMIZATION := NONE >> >> -ifneq ($(OPENJDK_TARGET_OS), solaris) >> - BUILD_LIBFDLIBM_OPTIMIZATION := NONE >> +ifeq ($(OPENJDK_TARGET_OS), solaris) >> + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH >> +endif >> + >> +ifeq ($(OPENJDK_TARGET_OS), linux) >> + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) >> + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH >> + endif >> endif >> >> LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm >> @@ -51,6 +57,7 @@ >> CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ >> CFLAGS_windows_debug := -DLOGGING, \ >> CFLAGS_aix := -qfloat=nomaf, \ >> + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ >> DISABLED_WARNINGS_gcc := sign-compare, \ >> DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ >> ARFLAGS := $(ARFLAGS), \ >> >> >> diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk >> --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 >> +0100 >> +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 >> -0500 >> @@ -569,16 +569,19 @@ >> $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) >> $$($1_EXTRA_OBJECT_FILES)) >> >> # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS >> dependent variables for CFLAGS. 
>> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) >> $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) >> + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) >> $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ >> + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) >> ifneq ($(DEBUG_LEVEL),release) >> # Pickup extra debug dependent variables for CFLAGS >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) >> + >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) >> else >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) >> + >> $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) >> endif >> >> # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS >> dependent variables for CXXFLAGS. 
>> >> >> After enabling the optimization it's possible to gain up to 3x in >> performance >> for the aforementioned methods without losing precision: >> >> StrictMath, original StrictMath, optimized >> ============================ >> ============================ >> sin 1.7136493465700542 1m41.184s 1.7136493465700542 >> 0m33.895s >> cos 0.1709843554185943 1m41.200s 0.1709843554185943 >> 0m33.884s >> tan -5.5500322522995315E7 1m46.976s >> -5.5500322522995315E7 0m36.461s >> asin NaN 0m4.543s >> NaN 0m3.175s >> acos NaN 0m4.525s >> NaN 0m3.211s >> atan 1.5707961389886132E8 0m12.896s >> 1.5707961389886132E8 0m7.100s >> exp Infinity 0m4.570s Infinity 0m3.187s >> log 1.7420680845245087E9 0m14.239s >> 1.7420680845245087E9 0m7.170s >> log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 >> 0m9.610s >> sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 >> 0m0.948s >> cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 >> 0m10.786s >> sinh Infinity 0m4.433s Infinity 0m3.179s >> cosh Infinity 0m4.478s Infinity 0m3.174s >> tanh 9.999999971990079E7 0m3.353s 9.999999971990079E7 >> 0m3.208s >> expm1 Infinity 0m4.094s Infinity 0m3.185s >> log1p 1.7420681029451895E9 0m13.527s >> 1.7420681029451895E9 0m8.756s >> IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s >> atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 >> 0m10.510s >> pow Infinity 0m19.938s Infinity 0m20.204s >> hypot 5.000000099033372E15 0m5.122s >> 5.000000099033372E15 0m5.130s >> >> >> I believe that, as the FC is passed but the FEC is not, the change can, >> after the due >> scrutiny and review, be pushed if a special exception approval grants >> it. Once >> on 9, I'll request the downport to 8. >> >> Could I open a bug to address that issue? >> >> Thank you very much. 
>> >> Regards, >> Gustavo >> >> [1] >> http://hg.openjdk.java.net/jdk9/hs/jdk/file/81eb4bd34611/make/lib/CoreLibraries.gmk#l39 >> [2] https://github.com/gromero/strictmath (comparison script used to >> get the results) >> From thomas.schatzl at oracle.com Thu Nov 17 11:28:06 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Nov 2016 12:28:06 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> Message-ID: <1479382086.2891.24.camel@oracle.com> Hi Kim, while unconsciously dwelling on the issue I think there is one unanswered question: On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: > > > > On Nov 8, 2016, at 7:52 AM, Thomas Schatzl > om> wrote: > > On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: > > > > > > There is still a situation where processing can fail, namely an > > > in-progress humongous allocation that hasn't set the klass > > > yet. We > > > continue to handle that as before. > > - I am not completely sure about whether this case is handled > > correctly. I am mostly concerned that the information used before > > the > > fence may not be the correct ones, but the checks expect them to be > > valid. > > > > Probably I am overlooking something critical somewhere. > > > > A: allocates humongous object C, sets region type, issues > > storestore, sets top pointers, writes the object, and then sets C.y > > = x to mark a > > card > > > > Refinement: gets card (and assuming we have no further > > synchronization > > around which is not true, e.g. the enqueuing) > > > > 592   if (!r->is_old_or_humongous()) { > > > > assume refinement thread has not received the "type" correctly yet, > > so must be Free. So the card will be filtered out incorrectly? 
> > > > That is contradictory to what I said in the other email about the > > comment discussion, but I only thoroughly looked at the comment > > aspect there. :) > > > > I think at this point in general we can't do anything but > > !is_young(), as we can't ignore cards in "Free" regions - they may > > be for cards for humongous ones where the thread did not receive > > top and/or the type yet? Here, combined with the scenario described in the other thread (I will repeat it for clarity): " A: allocate new young region X, allocate object, storestore, stops at the beginning of the dirty_young_block() method B: allocate new object B in X, set B.y = something-outside, making the card "Dirty" since thread A did not actually start doing dirty_young_block() yet. Refinement: scans the card; since R does not seem to synchronize with A either, you may get a "dirty" card in a young (or free, depending on whether the setting of the region flag in X has already been observed - but it must be either one) region here in this case? A: does the work in dirty_young_block()" Since thread A allocated the region X, the top and region type of region X are set by A. Now, in this scenario, refinement gets the dirty card from thread B first (because eg. it happens that thread B's queue just got full), and A is still busy marking the card table. The region type change (caused by A) for region X may not have been observed by the refinement yet, so it may still be Free? So the check in g1RemSet.cpp ?597 ? if (!r->is_old_or_humongous()) { may filter the card out wrongly when processing the card from thread B as far as I can see. That's why I remarked about only being able to filter out using is_young() here. For the refinement thread, "top" is current (after the fence), but the region type not (may still be "Free" until the refinement "synchronizes" with thread A in some way), doesn't it? 
The change to "top" must have been observed already after the fence (in line 684) though and is safe to use (the allocation of the TLAB for thread B sets top using appropriate barriers, and the refinement will synchronize with whatever thread B set). Probably I am overlooking something about how the type of region X set by thread A can be visible to refinement if it only "synchronizes" with thread B (that did not write the type of region X). Thanks, ? Thomas From thomas.schatzl at oracle.com Thu Nov 17 11:31:38 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Nov 2016 12:31:38 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1A54AB8B-C2A8-4F17-BC98-E76FE815A009@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> <1479287214.2466.35.camel@oracle.com> <1A54AB8B-C2A8-4F17-BC98-E76FE815A009@oracle.com> Message-ID: <1479382298.2891.25.camel@oracle.com> Hi Kim, On Wed, 2016-11-16 at 13:02 -0500, Kim Barrett wrote: > > > > On Nov 16, 2016, at 4:06 AM, Thomas Schatzl > com> wrote: > > > > Hi Kim, > > > > On Tue, 2016-11-15 at 19:00 -0500, Kim Barrett wrote: > > > > > > I've updated some comments to mention that external > > > synchronization. > > ?581???// The region could be young.??Cards for young regions are > > set > > to > > ?582???// g1_young_gen, so the post-barrier will filter them > > out.??However, > > ?583???// that marking is performed concurrently.??A write to a > > young > > ?584???// object could occur before the card has been marked young, > > slipping > > ?585???// past the filter. > > > > I would prefer if the text would not change terminology for the > > same > > thing mid-paragraph, from "setting" to "marking". The advantage of > > it > > reading better seems to be smaller than the potential confusion. > ? 
// The region could be young.??Cards for young regions are > ? // distinctly marked (set to g1_young_gen), so the post-barrier > will > ? // filter them out.??However, that marking is performed > ? // concurrently.??A write to a young object could occur before the > ? // card has been marked young, slipping past the filter. > > Better? ? better :) Thanks, ? Thomas From tobias.hartmann at oracle.com Thu Nov 17 12:42:26 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 17 Nov 2016 13:42:26 +0100 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled Message-ID: <582DA5B2.4020307@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8169711 http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. 
In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). I fixed this by always creating the interpreter method entries for intrinsified methods but replace them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. Tested with regression test, JPRT and RBT (running). Thanks, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8169867 From erik.helin at oracle.com Thu Nov 17 14:13:14 2016 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 17 Nov 2016 15:13:14 +0100 Subject: JEP 189: Shenandoah: An Ultra-Low-Pause-Time Garbage Collector In-Reply-To: <1444338101.14210351.1479147940505.JavaMail.zimbra@redhat.com> References: <1444338101.14210351.1479147940505.JavaMail.zimbra@redhat.com> Message-ID: <76e1a719-c786-5c84-7287-4053d4e96021@oracle.com> On 11/14/2016 07:25 PM, Christine Flood wrote: > Hi > > We've addressed the issues with the JEP that were brought up last summer. > We've been meeting our performance goals. > > What do we need to do to get Shenandoah approved for OpenJDK10? Hi Christine, I read through the JEP, thanks for making the suggested changes. One thing I'm missing though are the operating systems you intend to support? 
The JEP mentions that Red Hat will support Shenandoah for the arm64 and amd64 CPU architectures, but doesn't mention any operating systems. I would strongly prefer that the JEP suggested by Roman, "GC Interface: Better isolation of GC implementations" [0], is integrated before this JEP is submitted in order to ensure that the code can co-exist side-by-side with the existing GC algorithms (and be maintained effectively by another contributor). Would you mind adding a dependency in the Shenandoah JEP on Roman's "GCInterface" JEP? As for the JEP process, please see http://openjdk.java.net/jeps/1 and http://cr.openjdk.java.net/~mr/jep/jep-2.0-02.html. Thanks, Erik [0]: https://bugs.openjdk.java.net/browse/JDK-8163329 > Christine > From vladimir.x.ivanov at oracle.com Thu Nov 17 14:34:35 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 17 Nov 2016 17:34:35 +0300 Subject: RFR(S)[8u]: 8158639: C2 compilation fails with SIGSEGV In-Reply-To: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> References: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> Message-ID: <5bc4a55e-e2d1-4bcf-ccf4-382e65239d82@oracle.com> Looks good (not a 8u Reviewer). Best regards, Vladimir Ivanov On 11/16/16 7:44 PM, david buck wrote: > (moving to hotspot-dev for more exposure.) > > Jamsheed, thanks once again reviewing my backport! > > Any reviewers out there willing to chime in? > > Cheers, > -Buck > > > -------- Forwarded Message -------- > Subject: Re: RFR[8u]: 8158639: C2 compilation fails with SIGSEGV > Date: Wed, 16 Nov 2016 21:48:10 +0530 > From: Jamsheed C m > Organization: Oracle Corporation > To: david buck , > hotspot-compiler-dev at openjdk.java.net > > > Thanks for fixing. new webrev looks good to me (not a reviewer). > > Best Regards, > Jamsheed > On 11/16/2016 4:31 PM, david buck wrote: >> Hi Jamsheed! >> >> Thank you for catching the mistake! I have modified the backport to >> include the relevant change from 8072008 [0]. 
Here is an updated webrev: >> >> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_02/ >> >> In the new chunk of code added, the only difference from the code in >> JDK 9 is I had to add a call to err_msg() as JDK 8 does not have >> variadic macro version of assert() [1]. >> >> I have reran all tests (both JPRT and manual) with no issues. >> >> Cheers, >> -Buck >> >> [0] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9988b390777b >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8080775 >> >> On 2016/11/16 16:29, Jamsheed C m wrote: >>> Hi David, >>> >>> this change is missing >>> >>> JVMState* VirtualCallGenerator::generate(JVMState* jvms) { >>> >>> ... >>> >>> if (kit.gvn().type(receiver)->higher_equal(TypePtr::NULL_PTR)) { >>> assert(Bytecodes::is_invoke(kit.java_bc()), "%d: %s", kit.java_bc(), >>> Bytecodes::name(kit.java_bc())); >>> ciMethod* declared_method = >>> kit.method()->get_method_at_bci(kit.bci()); >>> int arg_size = >>> declared_method->signature()->arg_size_for_bc(kit.java_bc()); >>> kit.inc_sp(arg_size); // restore arguments >>> kit.uncommon_trap(Deoptimization::Reason_null_check, >>> Deoptimization::Action_none, >>> NULL, "null receiver"); >>> >>> >>> Best Regards, >>> >>> Jamsheed >>> >>> >>> On 11/15/2016 8:55 PM, david buck wrote: >>>> Hi! >>>> >>>> Please review the backported changes of JDK-8158639 to 8u: >>>> >>>> It is a very straightforward backport. The only two differences are: >>>> >>>> - I added a convenience macro, get_method_at_bci(), from the change >>>> for 8072008 to make the backport cleaner. >>>> >>>> - I had to modify (remove) the package used for the testcase. 
>>>> >>>> Bug Report: >>>> [ 8158639: C2 compilation fails with SIGSEGV ] >>>> https://bugs.openjdk.java.net/browse/JDK-8158639 >>>> >>>> JDK 9 changeset: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/119a2a3cc29b >>>> >>>> 8u-dev Webrev: >>>> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_01/ >>>> >>>> Testing: >>>> Manual verification and JPRT (default and hotspot testsets) >>>> >>>> Cheers, >>>> -Buck >>> > From coleen.phillimore at oracle.com Thu Nov 17 15:42:25 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Thu, 17 Nov 2016 10:42:25 -0500 Subject: RFR(S)[8u]: 8158639: C2 compilation fails with SIGSEGV In-Reply-To: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> References: <8cedca6d-0111-61c8-8327-04a7f9fe4e47@oracle.com> Message-ID: This looks like a good backport of the original bug fix. Reviewed. Coleen On 11/16/16 11:44 AM, david buck wrote: > (moving to hotspot-dev for more exposure.) > > Jamsheed, thanks once again reviewing my backport! > > Any reviewers out there willing to chime in? > > Cheers, > -Buck > > > -------- Forwarded Message -------- > Subject: Re: RFR[8u]: 8158639: C2 compilation fails with SIGSEGV > Date: Wed, 16 Nov 2016 21:48:10 +0530 > From: Jamsheed C m > Organization: Oracle Corporation > To: david buck , > hotspot-compiler-dev at openjdk.java.net > > > Thanks for fixing. new webrev looks good to me (not a reviewer). > > Best Regards, > Jamsheed > On 11/16/2016 4:31 PM, david buck wrote: >> Hi Jamsheed! >> >> Thank you for catching the mistake! I have modified the backport to >> include the relevant change from 8072008 [0]. Here is an updated webrev: >> >> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_02/ >> >> In the new chunk of code added, the only difference from the code in >> JDK 9 is I had to add a call to err_msg() as JDK 8 does not have >> variadic macro version of assert() [1]. >> >> I have reran all tests (both JPRT and manual) with no issues. 
>> >> Cheers, >> -Buck >> >> [0] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9988b390777b >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8080775 >> >> On 2016/11/16 16:29, Jamsheed C m wrote: >>> Hi David, >>> >>> this change is missing >>> >>> JVMState* VirtualCallGenerator::generate(JVMState* jvms) { >>> >>> ... >>> >>> if (kit.gvn().type(receiver)->higher_equal(TypePtr::NULL_PTR)) { >>> assert(Bytecodes::is_invoke(kit.java_bc()), "%d: %s", >>> kit.java_bc(), >>> Bytecodes::name(kit.java_bc())); >>> ciMethod* declared_method = >>> kit.method()->get_method_at_bci(kit.bci()); >>> int arg_size = >>> declared_method->signature()->arg_size_for_bc(kit.java_bc()); >>> kit.inc_sp(arg_size); // restore arguments >>> kit.uncommon_trap(Deoptimization::Reason_null_check, >>> Deoptimization::Action_none, >>> NULL, "null receiver"); >>> >>> >>> Best Regards, >>> >>> Jamsheed >>> >>> >>> On 11/15/2016 8:55 PM, david buck wrote: >>>> Hi! >>>> >>>> Please review the backported changes of JDK-8158639 to 8u: >>>> >>>> It is a very straightforward backport. The only two differences are: >>>> >>>> - I added a convenience macro, get_method_at_bci(), from the change >>>> for 8072008 to make the backport cleaner. >>>> >>>> - I had to modify (remove) the package used for the testcase. 
>>>> >>>> Bug Report: >>>> [ 8158639: C2 compilation fails with SIGSEGV ] >>>> https://bugs.openjdk.java.net/browse/JDK-8158639 >>>> >>>> JDK 9 changeset: >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/119a2a3cc29b >>>> >>>> 8u-dev Webrev: >>>> http://cr.openjdk.java.net/~dbuck/jdk8158639_8u_01/ >>>> >>>> Testing: >>>> Manual verification and JPRT (default and hotspot testsets) >>>> >>>> Cheers, >>>> -Buck >>> > From thomas.schatzl at oracle.com Thu Nov 17 17:07:41 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 17 Nov 2016 18:07:41 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <1479382086.2891.24.camel@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479382086.2891.24.camel@oracle.com> Message-ID: <1479402461.2522.21.camel@oracle.com> Hi Kim, On Thu, 2016-11-17 at 12:28 +0100, Thomas Schatzl wrote: > Hi Kim, > > [...] > So the check in g1RemSet.cpp > > ?597 ? if (!r->is_old_or_humongous()) { > > may filter the card out wrongly when processing the card from thread > B > as far as I can see. > > That's why I remarked about only being able to filter out using > is_young() here. For the refinement thread, "top" is current (after > the > fence), but the region type not (may still be "Free" until the > refinement "synchronizes" with thread A in some way), doesn't it? > > The change to "top" must have been observed already after the fence > (in > line 684) though and is safe to use (the allocation of the TLAB for > thread B sets top using appropriate barriers, and the refinement will > synchronize with whatever thread B set). > > Probably I am overlooking something about how the type of region X > set by thread A can be visible to refinement if it only > "synchronizes" with thread B (that did not write the type of region > X). ? I think it is good. Erik gave me the hint (and probably you already mentioned it somewhere). 
That case can only happen for young regions, and we can ignore them. We only allocate into humongous regions once. Thanks, ? Thomas From trevor.d.watson at oracle.com Thu Nov 17 17:29:14 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Thu, 17 Nov 2016 17:29:14 +0000 Subject: Unsafe compareAnd* Message-ID: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com> I'm working on an implementation of the C2 code for compareAndExchangeShort on SPARC. I've only implemented this function so far, and no compareAndSwapShort equivalent. When I run the test in hotspot/test/compiler/unsafe/JdkInternalMiscUnsafeAccessTestShort.java it fails because Unsafe.compareAndSwapShort() returns an incorrect value. This test passes without my implementation of compareAndExchangeShort. If I comment out the Unsafe.compareAndSwapShort() tests, the Unsafe.compareAndExchangeShort tests run successfully but the Unsafe.weakCompareAndSwapShort() tests subsequently fail. Can anyone tell me why it might be that an implementation for CompareAndExchangeS would trigger a failure in Unsafe.compareAndSwapShort()? Thanks, Trevor From erik.helin at oracle.com Thu Nov 17 17:28:28 2016 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 17 Nov 2016 18:28:28 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> Message-ID: On 11/16/2016 01:00 AM, Kim Barrett wrote: >> On Nov 15, 2016, at 5:26 AM, Thomas Schatzl wrote: >> >> Hi, >> >> On Thu, 2016-11-10 at 13:20 -0500, Kim Barrett wrote: >>>> >>>> On Nov 8, 2016, at 7:52 AM, Thomas Schatzl >>> om> wrote: >>>> On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote: >>>> - assuming this works due to other synchronization, >>> This is the critical point. 
There *is* synchronization there. >> >> Okay, thanks. I just wanted to make sure that we are aware of that we >> are using this other synchronization here. >> >> Thanks. Again I was mostly worried about noting this reliance on >> previous synchronization down somewhere, even if it is only the mailing >> list. >> >> It may be useful to note this in the code too. This would save the next >> one working on this code looking through old mailing list threads. >> >> Maybe I am a bit overly concerned about making sure that these thoughts >> are provided in the proper place though. Or maybe everyone thinks that >> everything is clear :) > > I've updated some comments to mention that external synchronization. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166811 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01/ First of all, thanks for doing this tricky work. One initial comment: 659 // Iterate over the objects overlapping the card designated by 660 // card_ptr, applying cl to all references in the region. This 661 // is a helper for G1RemSet::refine_card, and is tightly coupled 662 // with it. In the first sentence you mention the now removed argument card_ptr. Maybe just reword this to "Iterate over the objects covered by the memory region, applying cl to all references in the region"? I will have to sleep on this review, the synchronization to make all of this hold together seems to be all over place :) (not your fault, pre-existing). I will continue this review tomorrow morning with a fresh brain. Thanks, Erik > incr: http://cr.openjdk.java.net/~kbarrett/8166811/webrev.01.inc/ > > Also, since this set of changes is rather intertwined with the changes > for 8166607, here is a combined webrev for both: > http://cr.openjdk.java.net/~kbarrett/8166811/combined.01/ > > I think I'll do as Erik suggested and push the two together. 
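[Editorial note, not part of the original mail.] The allocation-then-publish ordering debated in this thread (initialize the region's type and top, issue a storestore barrier, then let other threads observe the region) has a direct analogue in Java's release/acquire publication. The sketch below is a generic, hypothetical illustration of that pattern using VarHandle; it is plain Java, not HotSpot code, and the Region fields merely stand in for the region metadata:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class Publish {
    static final class Region {
        int type; // stands in for the region type
        int top;  // stands in for the top pointer
    }

    static Region region; // the published reference
    static final VarHandle REGION;

    static {
        try {
            REGION = MethodHandles.lookup()
                    .findStaticVarHandle(Publish.class, "region", Region.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void writer() {
        Region r = new Region();
        r.type = 1;           // plain writes to the fields...
        r.top = 42;
        REGION.setRelease(r); // ...then a release store publishes them
    }

    static Region reader() {
        // Acquire load: if it observes the reference, it also observes
        // the field writes that happened before the release store.
        return (Region) REGION.getAcquire();
    }

    public static void main(String[] args) {
        writer();
        Region r = reader();
        System.out.println(r.type + " " + r.top); // prints "1 42"
    }
}
```

Without the release/acquire pair, a concurrent reader could in principle observe the published reference before the field writes become visible, which is the analogue of the refinement thread seeing a stale region type.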
> From joe.darcy at oracle.com Thu Nov 17 17:35:05 2016 From: joe.darcy at oracle.com (joe darcy) Date: Thu, 17 Nov 2016 09:35:05 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <582D0BCE.2030209@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> Message-ID: Hello, On 11/16/2016 5:45 PM, Gustavo Romero wrote: > Hi, > > Currently, optimization for building fdlibm is disabled, except for the > "solaris" OS target [1]. The reason for that is that historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. > > As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, > sin(), cos(), and tan() perform very poorly in comparison to the same methods > in Math class [2]: If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I intend to port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9. Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases. 
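[Editorial note, not part of the original mail.] The "different algorithm" caveat matters for result checking too: two implementations can agree to many decimal digits and still differ in the final bit, and StrictMath conformance is defined bit-for-bit per input. A small illustrative sketch (hypothetical, not JDK test code) of the exact-bits comparison such checking requires:

```java
// Illustrative sketch: StrictMath conformance is bit-for-bit, so results
// must be compared via their exact bit patterns, not numeric closeness.
public class BitwiseCheck {
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        double r = StrictMath.sin(1.0);
        double oneUlpOff = Math.nextUp(r); // differs from r by a single ulp

        // Numerically the two are almost indistinguishable...
        System.out.println(Math.abs(oneUlpOff - r)); // ~1e-16

        // ...but a bitwise check still tells them apart.
        System.out.println(sameBits(r, oneUlpOff));  // prints "false"

        // It also distinguishes values that '==' conflates:
        System.out.println(0.0 == -0.0);             // prints "true"
        System.out.println(sameBits(0.0, -0.0));     // prints "false"
    }
}
```

A tolerance- or sum-based comparison cannot see a one-ulp discrepancy, but `Double.doubleToLongBits` can, and it also distinguishes values such as 0.0 and -0.0 that `==` conflates.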
> > Math StrictMath > ========= ========== > sin 0m29.984s 1m41.184s > cos 0m30.031s 1m41.200s > tan 0m31.772s 1m46.976s > asin 0m4.577s 0m4.543s > acos 0m4.539s 0m4.525s > atan 0m12.929s 0m12.896s > exp 0m1.071s 0m4.570s > log 0m3.272s 0m14.239s > log10 0m4.362s 0m20.236s > sqrt 0m0.913s 0m0.981s > cbrt 0m10.786s 0m10.808s > sinh 0m4.438s 0m4.433s > cosh 0m4.496s 0m4.478s > tanh 0m3.360s 0m3.353s > expm1 0m4.076s 0m4.094s > log1p 0m13.518s 0m13.527s > IEEEremainder 0m38.803s 0m38.909s > atan2 0m20.100s 0m20.057s > pow 0m14.096s 0m19.938s > hypot 0m5.136s 0m5.122s > > > Switching on the O3 optimization can damage precision of those methods, > nonetheless it's possible to avoid that side effect and yet get huge benefits of > the -O3 optimization on PPC64 if -fno-expensive-optimizations is passed in > addition to the -O3 optimization flag. > > In that sense the following change is proposed to resolve the issue: > > diff -r 81eb4bd34611 make/lib/CoreLibraries.gmk > --- a/make/lib/CoreLibraries.gmk Wed Nov 09 13:37:19 2016 +0100 > +++ b/make/lib/CoreLibraries.gmk Wed Nov 16 19:11:11 2016 -0500 > @@ -33,10 +33,16 @@ > # libfdlibm is statically linked with libjava below and not delivered into the > # product on its own. 
> > -BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +BUILD_LIBFDLIBM_OPTIMIZATION := NONE > > -ifneq ($(OPENJDK_TARGET_OS), solaris) > - BUILD_LIBFDLIBM_OPTIMIZATION := NONE > +ifeq ($(OPENJDK_TARGET_OS), solaris) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > +endif > + > +ifeq ($(OPENJDK_TARGET_OS), linux) > + ifeq ($(OPENJDK_TARGET_CPU_ARCH), ppc) > + BUILD_LIBFDLIBM_OPTIMIZATION := HIGH > + endif > endif > > LIBFDLIBM_SRC := $(JDK_TOPDIR)/src/java.base/share/native/libfdlibm > @@ -51,6 +57,7 @@ > CFLAGS := $(CFLAGS_JDKLIB) $(LIBFDLIBM_CFLAGS), \ > CFLAGS_windows_debug := -DLOGGING, \ > CFLAGS_aix := -qfloat=nomaf, \ > + CFLAGS_linux_ppc := -fno-expensive-optimizations, \ > DISABLED_WARNINGS_gcc := sign-compare, \ > DISABLED_WARNINGS_microsoft := 4146 4244 4018, \ > ARFLAGS := $(ARFLAGS), \ > > > diff -r 2a1f97c0ad3d make/common/NativeCompilation.gmk > --- a/make/common/NativeCompilation.gmk Wed Nov 09 15:32:39 2016 +0100 > +++ b/make/common/NativeCompilation.gmk Wed Nov 16 19:08:06 2016 -0500 > @@ -569,16 +569,19 @@ > $1_ALL_OBJS := $$(sort $$($1_EXPECTED_OBJS) $$($1_EXTRA_OBJECT_FILES)) > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CFLAGS. 
> - $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) > + $1_EXTRA_CFLAGS:=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)) $$($1_CFLAGS_$(OPENJDK_TARGET_OS)) \ > + $$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)) > ifneq ($(DEBUG_LEVEL),release) > # Pickup extra debug dependent variables for CFLAGS > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_debug) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_debug) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_debug) > else > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS_TYPE)_release) > $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_release) > + $1_EXTRA_CFLAGS+=$$($1_CFLAGS_$(OPENJDK_TARGET_OS)_$(OPENJDK_TARGET_CPU_ARCH)_release) > endif > > # Pickup extra OPENJDK_TARGET_OS_TYPE and/or OPENJDK_TARGET_OS dependent variables for CXXFLAGS. > > > After enabling the optimization it's possible to again up to 3x on performance > regarding the aforementioned methods without losing precision: > > StrictMath, original StrictMath, optimized > ============================ ============================ > sin 1.7136493465700542 1m41.184s 1.7136493465700542 0m33.895s > cos 0.1709843554185943 1m41.200s 0.1709843554185943 0m33.884s > tan -5.5500322522995315E7 1m46.976s -5.5500322522995315E7 0m36.461s > asin NaN 0m4.543s NaN 0m3.175s > acos NaN 0m4.525s NaN 0m3.211s > atan 1.5707961389886132E8 0m12.896s 1.5707961389886132E8 0m7.100s > exp Infinity 0m4.570s Infinity 0m3.187s > log 1.7420680845245087E9 0m14.239s 1.7420680845245087E9 0m7.170s > log10 7.565705562087342E8 0m20.236s 7.565705562087342E8 0m9.610s > sqrt 6.66666671666567E11 0m0.981s 6.66666671666567E11 0m0.948s > cbrt 3.481191648389617E10 0m10.808s 3.481191648389617E10 0m10.786s > sinh Infinity 0m4.433s Infinity 0m3.179s > cosh Infinity 0m4.478s Infinity 0m3.174s > tanh 9.999999971990079E7 0m3.353s 
9.999999971990079E7 0m3.208s > expm1 Infinity 0m4.094s Infinity 0m3.185s > log1p 1.7420681029451895E9 0m13.527s 1.7420681029451895E9 0m8.756s > IEEEremainder 502000.0 0m38.909s 502000.0 0m14.055s > atan2 1.570453905253704E8 0m20.057s 1.570453905253704E8 0m10.510s > pow Infinity 0m19.938s Infinity 0m20.204s > hypot 5.000000099033372E15 0m5.122s 5.000000099033372E15 0m5.130s > > > I believe that as the FC is passed but FEC is not the change can, after the due > scrutiny and review, be pushed if a special exception approval grants it. Once > on 9, I'll request the downport to 8. Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking to see if the optimized versions are indeed equivalent to the non-optimized ones. The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ. Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area. Cheers, -Joe From gromero at linux.vnet.ibm.com Thu Nov 17 17:45:59 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Thu, 17 Nov 2016 15:45:59 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> Message-ID: <582DECD7.4020901@linux.vnet.ibm.com> Hi David, On 17-11-2016 00:31, David Holmes wrote: > Adding in build-dev as they need to scrutinize all build changes. Thanks a lot. 
Regards, Gustavo From gromero at linux.vnet.ibm.com Thu Nov 17 17:47:40 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Thu, 17 Nov 2016 15:47:40 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <9dea2dbf-4413-c03e-1cd6-8aceb0e263a0@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <37b58c35-72b2-cc19-f175-6d1cff410213@oracle.com> <9dea2dbf-4413-c03e-1cd6-8aceb0e263a0@oracle.com> Message-ID: <582DED3C.5030507@linux.vnet.ibm.com> Hi Erik, On 17-11-2016 07:17, Erik Joelsson wrote: > Overall this looks reasonable to me. However, if we want to introduce a new possible tuple for specifying compilation flags to SetupNativeCompilation, we (the build team) would prefer if we used > OPENJDK_TARGET_CPU instead of OPENJDK_TARGET_CPU_ARCH. Got it. Thanks a lot for that info. I'll take that into account. Regards, Gustavo From gromero at linux.vnet.ibm.com Thu Nov 17 18:31:00 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Thu, 17 Nov 2016 16:31:00 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <582D0BCE.2030209@linux.vnet.ibm.com> Message-ID: <582DF764.70504@linux.vnet.ibm.com> Hi Joe, Thanks a lot for your valuable comments. On 17-11-2016 15:35, joe darcy wrote: >> Currently, optimization for building fdlibm is disabled, except for the >> "solaris" OS target [1]. > > The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the > Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm optimization is off even for x86_x64 as it, AFAICS regarding gcc 5 only, does not affect the precision, even if setting -O3 does not improve the performance as much as on PPC64. 
>> As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to,
>> sin(), cos(), and tan() perform very poorly in comparison to the same methods
>> in Math class [2]:

> If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I have intentions to
> port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9.

Yes, I'm doing my work against 9. So is there any problem if I proceed with my
change? I understand that there is no conflict as JDK-8134780 progresses and
replaces the StrictMath methods by their counterparts in Java. Please advise.

Is it intended to downport JDK-8134780 to 8?

> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases.

I agree. It's just that the issue on StrictMath methods was first noted due to
that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64.

> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking whether the optimized versions are indeed equivalent to the non-optimized ones.
> The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ.

That's a really good point, thanks for letting me know about that. I'll re-test my
change under that perspective.

> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area.

Got it. By "the JDK math library regression tests" you mean exactly which test
suite? the jtreg tests?

For testing against JCK/TCK I'll need some help on that.

Thank you very much.
Regards,
Gustavo

From paul.sandoz at oracle.com Thu Nov 17 18:48:50 2016
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Thu, 17 Nov 2016 10:48:50 -0800
Subject: Unsafe compareAnd*
In-Reply-To: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com>
References: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com>
Message-ID: <77841DFC-7C6D-48E9-B036-0B2373905EBF@oracle.com>

Hi Trevor,

The compareAndSwapShort (non-intrinsic) implementation defers to the compareAndExchangeShortVolatile implementation, and the weakCompareAndSwapShortVolatile implementation defers to (the stronger) compareAndSwapShort implementation [*]:

i.e. weakCompareAndSwapShortVolatile -> compareAndSwapShort -> compareAndExchangeShortVolatile

@HotSpotIntrinsicCandidate
public final short compareAndExchangeShortVolatile(Object o, long offset,
                                                   short expected,
                                                   short x) {
    if ((offset & 3) == 3) {
        throw new IllegalArgumentException("Update spans the word, not supported");
    }
    ...
}

@HotSpotIntrinsicCandidate
public final boolean compareAndSwapShort(Object o, long offset,
                                         short expected,
                                         short x) {
    return compareAndExchangeShortVolatile(o, offset, expected, x) == expected;
}

@HotSpotIntrinsicCandidate
public final boolean weakCompareAndSwapShortVolatile(Object o, long offset,
                                                     short expected,
                                                     short x) {
    return compareAndSwapShort(o, offset, expected, x);
}

I think that explains why you are observing failing Unsafe.weakCompareAndSwapShort() tests.

Paul.

[*] Note, we really need to change the names here to be consistent with the schema on VarHandles

> On 17 Nov 2016, at 09:29, Trevor Watson wrote:
>
> I'm working on an implementation of the C2 code for compareAndExchangeShort on SPARC.
>
> I've only implemented this function so far, and no compareAndSwapShort equivalent.
>
> When I run the test in hotspot/test/compiler/unsafe/JdkInternalMiscUnsafeAccessTestShort.java it fails because Unsafe.compareAndSwapShort() returns an incorrect value. This test passes without my implementation of compareAndExchangeShort.
> > If I comment out the Unsafe.compareAndSwapShort() tests, the Unsafe.compareAndExchangeShort tests run successfully but the Unsafe.weakCompareAndSwapShort() tests subsequently fail.
> >
> > Can anyone tell me why it might be that an implementation for CompareAndExchangeS would trigger a failure in Unsafe.compareAndSwapShort()?
> >
> > Thanks,
> > Trevor

From vladimir.kozlov at oracle.com Thu Nov 17 19:34:29 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 17 Nov 2016 11:34:29 -0800
Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled
In-Reply-To: <582DA5B2.4020307@oracle.com>
References: <582DA5B2.4020307@oracle.com>
Message-ID: 

Hi Tobias,

It is a little inconsistent: the CRC32 intrinsics check their flag in the generate_CRC32* methods. Maybe we should do the same for FMA and, instead of the assert in generate_math_entry(), return NULL if the flag is false.

Thanks,
Vladimir

On 11/17/16 4:42 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8169711
> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/
>
> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()).
>
> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code.
> > The problem is that if an intrinsic is enabled during dumping but disabled when re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test).
> >
> > I fixed this by always creating the interpreter method entries for intrinsified methods but replacing them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. This way, we patch the trampoline destination address even if the intrinsic is disabled, but just execute the Java bytecodes instead of the stub.
> >
> > While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because
> > 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked, and
> > 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized.
> >
> > I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals.
> >
> > Tested with regression test, JPRT and RBT (running).
> > Thanks,
> > Tobias
> >
> > [1] https://bugs.openjdk.java.net/browse/JDK-8169867

From kim.barrett at oracle.com Thu Nov 17 21:06:19 2016
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 17 Nov 2016 16:06:19 -0500
Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement
In-Reply-To: 
References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com>
Message-ID: <4459FFB6-3866-414E-B511-B28591DB5A6C@oracle.com>

> On Nov 17, 2016, at 12:28 PM, Erik Helin wrote:
>
> First of all, thanks for doing this tricky work. One initial comment:
>
> 659 // Iterate over the objects overlapping the card designated by
> 660 // card_ptr, applying cl to all references in the region. This
> 661 // is a helper for G1RemSet::refine_card, and is tightly coupled
> 662 // with it.
>
> In the first sentence you mention the now removed argument card_ptr. Maybe just reword this to "Iterate over the objects covered by the memory region, applying cl to all references in the region"?

You're right, I missed updating the comment when the signature was changed. Changing to:

// Iterate over the objects overlapping part of a card, applying cl
// to all references in the region. This is a helper for
// G1RemSet::refine_card, and is tightly coupled with it.

which is still immediately followed by:

// mr: the memory region covered by the card, trimmed to the
// allocated space for this region. Must not be empty.
From kim.barrett at oracle.com Thu Nov 17 21:20:35 2016
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 17 Nov 2016 16:20:35 -0500
Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement
In-Reply-To: <1479402461.2522.21.camel@oracle.com>
References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479382086.2891.24.camel@oracle.com> <1479402461.2522.21.camel@oracle.com>
Message-ID: 

> On Nov 17, 2016, at 12:07 PM, Thomas Schatzl wrote:
>
> Hi Kim,
>
> On Thu, 2016-11-17 at 12:28 +0100, Thomas Schatzl wrote:
>> Hi Kim,
>>
>> [...]
>>
>> So the check in g1RemSet.cpp
>>
>> 597 if (!r->is_old_or_humongous()) {
>>
>> may filter the card out wrongly when processing the card from thread B
>> as far as I can see.
>>
>> That's why I remarked about only being able to filter out using
>> is_young() here. For the refinement thread, "top" is current (after the
>> fence), but the region type is not (it may still be "Free" until the
>> refinement "synchronizes" with thread A in some way), isn't it?
>>
>> The change to "top" must have been observed already after the fence (in
>> line 684) though and is safe to use (the allocation of the TLAB for
>> thread B sets top using appropriate barriers, and the refinement will
>> synchronize with whatever thread B set).
>>
>> Probably I am overlooking something about how the type of region X
>> set by thread A can be visible to refinement if it only
>> "synchronizes" with thread B (that did not write the type of region X).
>
> I think it is good. Erik gave me the hint (and probably you already
> mentioned it somewhere). That case can only happen for young regions,
> and we can ignore them.
>
> We only allocate into humongous regions once.
>
> Thanks,
> Thomas

Kudos to Erik for helping you answer your question. I was still struggling to understand the scenario you were trying to describe.
From ioi.lam at oracle.com Thu Nov 17 21:31:39 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 17 Nov 2016 13:31:39 -0800 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: References: <582DA5B2.4020307@oracle.com> Message-ID: <582E21BB.1060704@oracle.com> Hi Tobias, The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. Thanks - Ioi On 11/17/16 11:34 AM, Vladimir Kozlov wrote: > Hi Tobias, > > It is a little inconsistent. CRC32 instrinsics check their flag in > generate_CRC32* methods. > May be we should do the same for FMA instead of assert in > generate_math_entry() return NULL if flag is false. > > Thanks, > Vladimir > > On 11/17/16 4:42 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8169711 >> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >> >> When dumping metadata with class data sharing (CDS), >> Method::unlink_method() takes care of removing all entry points of >> methods that will be shared. The _i2i and _from_interpreted entries >> are set to the corresponding address in the _cds_entry_table (see >> AbstractInterpreter::entry_for_cds_method()). This address points to >> a trampoline in shared space that jumps to the actual (unshared) >> interpreter method entry at runtime (see >> AbstractInterpreter::update_cds_entry_table()). >> >> Intrinsic methods may have a special interpreter entry (for example, >> 'Interpreter::java_lang_math_fmaF') and if they are shared, their >> entry points are set to such a trampoline that is patched at runtime >> to jump to the interpreter stub containing the intrinsic code. >> >> The problem is that if an intrinsic is enabled during dumping but >> disabled during re-using the shared archive, the trampoline is not >> patched and therefore still refers to the old stub address that was >> only valid during dumping. 
In debug, we hit the "should be correctly >> set during dump time" assert in Method::link_method() because the >> method entries are inconsistent. In product, we crash because we jump >> to an invalid address through the unpatched trampoline. This problem >> exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >> >> I fixed this by always creating the interpreter method entries for >> intrinsified methods but replace them with vanilla entries in >> TemplateInterpreterGenerator::generate_method_entry() if the >> intrinsic is disabled at runtime. Like this, we patch the trampoline >> destination address even if the intrinsic is disabled but just >> execute the Java bytecodes instead of the stub. >> >> While testing, I noticed that the assert in Method::link_method() is >> not always triggered (sometimes we just crash). This is because >> 1) the "_from_compiled_entry == NULL" check in >> Method::restore_unshareable_info() is always false and therefore >> link_method() is not invoked and >> 2) in Method::link_method() we only execute the check if the adapter >> (which is shared) was not yet initialized. >> >> I filed JDK-8169867 [1] for this because I'm not too familiar with >> the CDS internals. >> >> Tested with regression test, JPRT and RBT (running). >> >> Thanks, >> Tobias >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >> From joe.darcy at oracle.com Thu Nov 17 21:33:48 2016 From: joe.darcy at oracle.com (joe darcy) Date: Thu, 17 Nov 2016 13:33:48 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <582DF764.70504@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> Message-ID: Hi Gustavo, On 11/17/2016 10:31 AM, Gustavo Romero wrote: > Hi Joe, > > Thanks a lot for your valuable comments. > > On 17-11-2016 15:35, joe darcy wrote: >>> Currently, optimization for building fdlibm is disabled, except for the >>> "solaris" OS target [1]. 
>> The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the >> Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. > oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm > optimization is off even for x86_x64 as it, AFAICS regarding gcc 5 only, does > not affect the precision, even if setting -O3 does not improve the performance > as much as on PPC64. The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul of these fdlibm coding practices. >>> As a consequence on PPC64 (Linux) StrictMath methods like, but not limited to, >>> sin(), cos(), and tan() perform verify poor in comparison to the same methods >>> in Math class [2]: >> If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I have intentions to >> port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9. > Yes, I'm doing my work against 9. So is there any problem if I proceed with my > change? I understand that there is no conflict as JDK-8134780 progresses and > replaces the StrictMath methods by their counterparts in Java. Please, advice. If I manage to finish the fdlibm C -> Java port in JDK 9, the changes you are proposing would eventually be removed as unneeded since the C code wouldn't be there to get compiled anymore. > > Is it intended to downport JDK-8134780 to 8? 
Such a backport would be technically possible, but we at Oracle don't currently plan to do so.

>> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases.
> I agree. It's just that the issue on StrictMath methods was first noted due to
> that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64.

Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. In particular, if Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that uses a platform-specific feature (say, fused multiply-add instructions), then just compiling fdlibm more aggressively wouldn't necessarily make up that gap.

To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior for all those methods, but such an approach was much less technically tractable 20+ years ago, without the benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/.

>> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking whether the optimized versions are indeed equivalent to the non-optimized ones.
>> The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ.
> That's a really good point, thanks for letting me know about that. I'll re-test my
> change under that perspective.
> > >> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area.
> Got it. By "the JDK math library regression tests" you mean exactly which test
> suite? the jtreg tests?

Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where they are offhand.

A note on methodologies: when I've been writing tests for my port, I've tried to include test cases that exercise all the branch points in the code. Due to the large input space (~2^64 for a single-argument method), random sampling alone is an inefficient way to try to find differences in behavior.

> For testing against JCK/TCK I'll need some help on that.

I believe the JCK/TCK does have additional testcases relevant here.

HTH; thanks,

-Joe

From chris.plummer at oracle.com Thu Nov 17 21:48:36 2016
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 17 Nov 2016 13:48:36 -0800
Subject: PPC64: Poor StrictMath performance due to non-optimized compilation
In-Reply-To: 
References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com>
Message-ID: <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com>

On 11/17/16 1:33 PM, joe darcy wrote:
> Hi Gustavo,
>
> On 11/17/2016 10:31 AM, Gustavo Romero wrote:
>> Hi Joe,
>>
>> Thanks a lot for your valuable comments.
>>
>> On 17-11-2016 15:35, joe darcy wrote:
>>>> Currently, optimization for building fdlibm is disabled, except for
>>>> the "solaris" OS target [1].
>>> The reason for that is because historically the Solaris compilers
>>> have had sufficient discipline and control regarding floating-point
>>> semantics and compiler optimizations to still implement the
>>> Java-mandated results when optimization was enabled. The gcc family
>>> of compilers, for example, has lacked such discipline.
>> oh, I see. Thanks for clarifying that.
I was exactly wondering why >> fdlibm >> optimization is off even for x86_x64 as it, AFAICS regarding gcc 5 >> only, does >> not affect the precision, even if setting -O3 does not improve the >> performance >> as much as on PPC64. > > The fdlibm code relies on aliasing a two-element array of int with a > double to do bit-level reads and writes of floating-point values. As I > understand it, the C spec allows compilers to assume values of > different types don't overlap in memory. The compilation environment > has to be configured in such a way that the C compiler disables code > generation and optimization techniques that would run afoul of these > fdlibm coding practices. This is the strict aliasing issue right? It's a long standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more than 12 years since I last dealt with fdlibm and compiler aliasing issues. Chris > >>>> As a consequence on PPC64 (Linux) StrictMath methods like, but not >>>> limited to, >>>> sin(), cos(), and tan() perform verify poor in comparison to the >>>> same methods >>>> in Math class [2]: >>> If you are doing your work against JDK 9, note that the pow, hypot, >>> and cbrt fdlibm methods required by StrictMath have been ported to >>> Java (JDK-8134780: Port fdlibm to Java). I have intentions to >>> port the remaining methods to Java, but it is unclear whether or not >>> this will occur for JDK 9. >> Yes, I'm doing my work against 9. So is there any problem if I >> proceed with my >> change? I understand that there is no conflict as JDK-8134780 >> progresses and >> replaces the StrictMath methods by their counterparts in Java. >> Please, advice. > > If I manage to finish the fdlibm C -> Java port in JDK 9, the changes > you are proposing would eventually be removed as unneeded since the C > code wouldn't be there to get compiled anymore. > >> >> Is it intended to downport JDK-8134780 to 8? 
> > Such a backport would be technically possible, but we at Oracle don't > currently plan to do so. > >> >> >>> Methods in the Math class, such as pow, are often intrinsified and >>> use a different algorithm so a straight performance comparison may >>> not be as fair or meaningful in those cases. >> I agree. It's just that the issue on StrictMath methods was first >> noted due to >> that huge gap (Math vs StrictMath) on PPC64, which is not prominent >> on x64. > > Depending on how Math.{sin, cos} is implemented on PPC64, compiling > the fdlibm sin/cos with more aggressive optimizations should not be > expected to close the performance gap. In particular, if Math.{sin, > cos} is an intrinsic on PPC64 (I haven't checked the sources) that > used platform-specific feature (say fused multiply add instructions) > then just compiling fdlibm more aggressively wouldn't necessarily make > up that gap. > > To allow cross-platform and cross-release reproducibility, StrictMath > is specified to use the particular fdlibm algorithms, which precludes > using better algorithms developed more recently. If we were to start > with a clean slate today, to get such reproducibility we would specify > correctly-rounded behavior of all those methods, but such an approach > was much less tractable technical 20+ years ago without benefit of the > research that was been done in the interim, such as the work of Prof. > Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. > >> >> >>> Accumulating the the results of the functions and comparisons the >>> sums is not a sufficiently robust way of checking to see if the >>> optimized versions are indeed equivalent to the non-optimized ones. >>> The specification of StrictMath requires a particular result for >>> each set of floating-point arguments and sums get round-away >>> low-order bits that differ. >> That's really good point, thanks for letting me know about that. I'll >> re-test my >> change under that perspective. 
>> >> >>> Running the JDK math library regression tests and corresponding JCK >>> tests is recommended for work in this area. >> Got it. By "the JDK math library regression tests" you mean exactly >> which test >> suite? the jtreg tests? > > Specifically, the regression tests under test/java/lang/Math and > test/java/lang/StrictMath in the jdk repository. There are some other > math library tests in the hotspot repo, but I don't know where they > are offhand. > > A note on methodologies, when I've been writing test for my port I've > tried to include test cases that exercise all the branches point in > the code. Due to the large input space (~2^64 for a single-argument > method), random sampling alone is an inefficient way to try to find > differences in behavior. >> For testing against JCK/TCK I'll need some help on that. >> > > I believe the JCK/TCK does have additional testcases relevant here. > > HTH; thanks, > > -Joe From Derek.White at cavium.com Thu Nov 17 22:47:58 2016 From: Derek.White at cavium.com (White, Derek) Date: Thu, 17 Nov 2016 22:47:58 +0000 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> Message-ID: Hi Joe, Although neither a floating point expert (as I think I've proven to you over the years), or a gcc expert, I checked with our in-house gcc expert and got this following answer: "Yes using -fno-strict-aliasing fixes the issues. Also there are many forks of fdlibm which has this fixed including the code inside glibc. 
" FWIW, - Derek -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of Chris Plummer Sent: Thursday, November 17, 2016 4:49 PM To: joe darcy ; Gustavo Romero ; ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net Cc: build-dev Subject: Re: PPC64: Poor StrictMath performance due to non-optimized compilation On 11/17/16 1:33 PM, joe darcy wrote: > Hi Gustavo, > > > On 11/17/2016 10:31 AM, Gustavo Romero wrote: >> Hi Joe, >> >> Thanks a lot for your valuable comments. >> >> On 17-11-2016 15:35, joe darcy wrote: >>>> Currently, optimization for building fdlibm is disabled, except for >>>> the "solaris" OS target [1]. >>> The reason for that is because historically the Solaris compilers >>> have had sufficient discipline and control regarding floating-point >>> semantics and compiler optimizations to still implement the >>> Java-mandated results when optimization was enabled. The gcc family >>> of compilers, for example, has lacked such discipline. >> oh, I see. Thanks for clarifying that. I was exactly wondering why >> fdlibm optimization is off even for x86_x64 as it, AFAICS regarding >> gcc 5 only, does not affect the precision, even if setting -O3 does >> not improve the performance as much as on PPC64. > > The fdlibm code relies on aliasing a two-element array of int with a > double to do bit-level reads and writes of floating-point values. As I > understand it, the C spec allows compilers to assume values of > different types don't overlap in memory. The compilation environment > has to be configured in such a way that the C compiler disables code > generation and optimization techniques that would run afoul of these > fdlibm coding practices. This is the strict aliasing issue right? It's a long standing problem with fdlibm that kept getting worse as gcc got smarter. 
IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more than 12 years since I last dealt with fdlibm and compiler aliasing issues. Chris > >>>> As a consequence on PPC64 (Linux) StrictMath methods like, but not >>>> limited to, sin(), cos(), and tan() perform verify poor in >>>> comparison to the same methods in Math class [2]: >>> If you are doing your work against JDK 9, note that the pow, hypot, >>> and cbrt fdlibm methods required by StrictMath have been ported to >>> Java (JDK-8134780: Port fdlibm to Java). I have intentions to port >>> the remaining methods to Java, but it is unclear whether or not this >>> will occur for JDK 9. >> Yes, I'm doing my work against 9. So is there any problem if I >> proceed with my change? I understand that there is no conflict as >> JDK-8134780 progresses and replaces the StrictMath methods by their >> counterparts in Java. >> Please, advice. > > If I manage to finish the fdlibm C -> Java port in JDK 9, the changes > you are proposing would eventually be removed as unneeded since the C > code wouldn't be there to get compiled anymore. > >> >> Is it intended to downport JDK-8134780 to 8? > > Such a backport would be technically possible, but we at Oracle don't > currently plan to do so. > >> >> >>> Methods in the Math class, such as pow, are often intrinsified and >>> use a different algorithm so a straight performance comparison may >>> not be as fair or meaningful in those cases. >> I agree. It's just that the issue on StrictMath methods was first >> noted due to that huge gap (Math vs StrictMath) on PPC64, which is >> not prominent on x64. > > Depending on how Math.{sin, cos} is implemented on PPC64, compiling > the fdlibm sin/cos with more aggressive optimizations should not be > expected to close the performance gap. 
In particular, if Math.{sin, > cos} is an intrinsic on PPC64 (I haven't checked the sources) that > used platform-specific feature (say fused multiply add instructions) > then just compiling fdlibm more aggressively wouldn't necessarily make > up that gap. > > To allow cross-platform and cross-release reproducibility, StrictMath > is specified to use the particular fdlibm algorithms, which precludes > using better algorithms developed more recently. If we were to start > with a clean slate today, to get such reproducibility we would specify > correctly-rounded behavior of all those methods, but such an approach > was much less tractable technical 20+ years ago without benefit of the > research that was been done in the interim, such as the work of Prof. > Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. > >> >> >>> Accumulating the the results of the functions and comparisons the >>> sums is not a sufficiently robust way of checking to see if the >>> optimized versions are indeed equivalent to the non-optimized ones. >>> The specification of StrictMath requires a particular result for >>> each set of floating-point arguments and sums get round-away >>> low-order bits that differ. >> That's really good point, thanks for letting me know about that. I'll >> re-test my change under that perspective. >> >> >>> Running the JDK math library regression tests and corresponding JCK >>> tests is recommended for work in this area. >> Got it. By "the JDK math library regression tests" you mean exactly >> which test >> suite? the jtreg tests? > > Specifically, the regression tests under test/java/lang/Math and > test/java/lang/StrictMath in the jdk repository. There are some other > math library tests in the hotspot repo, but I don't know where they > are offhand. > > A note on methodologies, when I've been writing test for my port I've > tried to include test cases that exercise all the branches point in > the code. 
Due to the large input space (~2^64 for a single-argument > method), random sampling alone is an inefficient way to try to find > differences in behavior. >> For testing against JCK/TCK I'll need some help on that. >> > > I believe the JCK/TCK does have additional testcases relevant here. > > HTH; thanks, > > -Joe From trevor.d.watson at oracle.com Fri Nov 18 07:58:58 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Fri, 18 Nov 2016 07:58:58 +0000 Subject: Unsafe compareAnd* In-Reply-To: <77841DFC-7C6D-48E9-B036-0B2373905EBF@oracle.com> References: <499e02ce-1bb5-0441-d647-005d77feaa4c@oracle.com> <77841DFC-7C6D-48E9-B036-0B2373905EBF@oracle.com> Message-ID: <0fc2f433-3ee9-d24f-2081-ecf174ec5b75@oracle.com> Thanks for the explanation, Paul. On 17/11/16 18:48, Paul Sandoz wrote: > Hi Trevor, > > The compareAndSwapShort (non-intrinsic) implementation defers to the compareAndExchangeShortVolatile implementation, and the weakCompareAndSwapShortVolatile implementation defers to (the stronger) compareAndSwapShort implementation [*]: > > i.e. weakCompareAndSwapShortVolatile -> compareAndSwapShort -> compareAndExchangeShortVolatile > > @HotSpotIntrinsicCandidate > public final short compareAndExchangeShortVolatile(Object o, long offset, > short expected, > short x) { > if ((offset & 3) == 3) { > throw new IllegalArgumentException("Update spans the word, not supported"); > } > > ... > } > > @HotSpotIntrinsicCandidate > public final boolean compareAndSwapShort(Object o, long offset, > short expected, > short x) { > return compareAndExchangeShortVolatile(o, offset, expected, x) == expected; > } > > @HotSpotIntrinsicCandidate > public final boolean weakCompareAndSwapShortVolatile(Object o, long offset, > short expected, > short x) { > return compareAndSwapShort(o, offset, expected, x); > } > > I think that explains why you are observing failing Unsafe.weakCompareAndSwapShort() tests. > > Paul.
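The delegation chain Paul describes can be modeled in plain Java to show why a single bad intrinsic surfaces in every layer built on top of it. The sketch below is illustrative only (no real Unsafe, all names made up): if the exchange primitive returns a wrong witness value, both the strong and the weak swap built on it misreport the result of a CAS that actually succeeded.

```java
// Toy model of the delegation chain; not JDK internals.
public class DelegationModel {
    static short slot = 42;

    // Stand-in for a (possibly buggy) compareAndExchange intrinsic.
    // When buggy, the store still happens but the witness is wrong.
    static short exchange(short expected, short x, boolean buggy) {
        short witness = slot;
        if (witness == expected) slot = x;
        return buggy ? (short) (witness + 1) : witness;
    }

    // Models compareAndSwapShort: built on exchange.
    static boolean swap(short expected, short x, boolean buggy) {
        return exchange(expected, x, buggy) == expected;
    }

    // Models weakCompareAndSwapShortVolatile: built on swap.
    static boolean weakSwap(short expected, short x, boolean buggy) {
        return swap(expected, x, buggy);
    }

    public static void main(String[] args) {
        slot = 42;
        System.out.println("correct exchange -> swap: " + swap((short) 42, (short) 7, false));
        slot = 42;
        System.out.println("buggy exchange -> weakSwap: " + weakSwap((short) 42, (short) 7, true));
    }
}
```

This matches the failure pattern Trevor reports: a wrong compareAndExchange result makes the compareAndSwap and weakCompareAndSwap tests fail even though the exchange-only tests may pass.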
> > [*] Note, we really need to change the names here to be consistent with the schema on VarHandles > > >> On 17 Nov 2016, at 09:29, Trevor Watson wrote: >> >> I'm working on an implementation of the C2 code for compareAndExchangeShort on SPARC. >> >> I've only implemented this function so far, and no compareAndSwapShort equivalent. >> >> When I run the test in hotspot/test/compiler/unsafe/JdkInternalMiscUnsafeAccessTestShort.java it fails because Unsafe.compareAndSwapShort() returns an incorrect value. This test passes without my implementation of compareAndExchangeShort. >> >> If I comment out the Unsafe.compareAndSwapShort() tests, the Unsafe.compareAndExchangeShort tests run successfully but the Unsafe.weakCompareAndSwapShort() tests subsequently fail. >> >> Can anyone tell me why it might be that an implementation for CompareAndExchangeS would trigger a failure in Unsafe.compareAndSwapShort()? >> >> Thanks, >> Trevor > From tobias.hartmann at oracle.com Fri Nov 18 08:33:36 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 18 Nov 2016 09:33:36 +0100 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: <582E21BB.1060704@oracle.com> References: <582DA5B2.4020307@oracle.com> <582E21BB.1060704@oracle.com> Message-ID: <582EBCE0.7090506@oracle.com> Thanks for the reviews, Vladimir and Ioi! As Vladimir suggested, I moved the UseFMA check into TemplateInterpreterGenerator::generate_math_entry(): http://cr.openjdk.java.net/~thartmann/8169711/webrev.01/ Best regards, Tobias On 17.11.2016 22:31, Ioi Lam wrote: > Hi Tobias, > > The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. > > Thanks > - Ioi > > On 11/17/16 11:34 AM, Vladimir Kozlov wrote: >> Hi Tobias, >> >> It is a little inconsistent. CRC32 intrinsics check their flag in generate_CRC32* methods.
>> Maybe we should do the same for FMA: instead of the assert in generate_math_entry(), return NULL if the flag is false. >> >> Thanks, >> Vladimir >> >> On 11/17/16 4:42 AM, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch: >>> https://bugs.openjdk.java.net/browse/JDK-8169711 >>> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >>> >>> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). >>> >>> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. >>> >>> The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >>> >>> I fixed this by always creating the interpreter method entries for intrinsified methods but replacing them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime.
Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. >>> >>> While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because >>> 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and >>> 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. >>> >>> I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. >>> >>> Tested with regression test, JPRT and RBT (running). >>> >>> Thanks, >>> Tobias >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >>> > From erik.helin at oracle.com Fri Nov 18 12:59:12 2016 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 18 Nov 2016 13:59:12 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: On 11/16/2016 12:58 AM, Kim Barrett wrote: >> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: >> >> Hi Kim, >> >> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>> >>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>> om> wrote: >>>> Maybe it would 
be cleaner to call a method in the barrier set >>>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>>> Maybe as an additional RFE. >>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>> overgeneralized and inefficient for this situation, but this >>> situation should occur *very* rarely; it requires a stale card get >>> processed just as a humongous object is in the midst of being >>> allocated in the same region. >> >> I kind of think for these reasons we should use _ct_bs->invalidate() as >> it seems clearer to me. There is the mentioned drawback of having no >> other more efficient way, so I will let you decide about this. > > I've made the change to call invalidate, and also updated some comments. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ Again, thanks for all your hard work on this patch series! I've been over this patch (and the other one) many times now, and I think this is good. At least I can't come up with any reason why it wouldn't work (this is one of the trickiest parts of G1). Thanks, Erik > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > > Also, see RFR: 8166811, where I've included a webrev combining the > latest changes for 8166607 and 8166811, since they are rather > intertwined. I think I'll do as Erik suggested and push the two > together.
> > From erik.helin at oracle.com Fri Nov 18 13:00:33 2016 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 18 Nov 2016 14:00:33 +0100 Subject: RFR: 8166811: Missing memory fences between memory allocation and refinement In-Reply-To: <4459FFB6-3866-414E-B511-B28591DB5A6C@oracle.com> References: <1478609549.2689.71.camel@oracle.com> <671AC8AB-23FE-4674-9552-DF9C63D8E719@oracle.com> <1479205608.3251.18.camel@oracle.com> <894A84DA-BDDA-4B32-891A-7266912A279D@oracle.com> <4459FFB6-3866-414E-B511-B28591DB5A6C@oracle.com> Message-ID: <635c9aa8-62d6-dda8-6fee-741e213686ce@oracle.com> On 11/17/2016 10:06 PM, Kim Barrett wrote: >> On Nov 17, 2016, at 12:28 PM, Erik Helin wrote: >> >> First of all, thanks for doing this tricky work. One initial comment: >> >> 659 // Iterate over the objects overlapping the card designated by >> 660 // card_ptr, applying cl to all references in the region. This >> 661 // is a helper for G1RemSet::refine_card, and is tightly coupled >> 662 // with it. >> >> In the first sentence you mention the now removed argument card_ptr. Maybe just reword this to "Iterate over the objects covered by the memory region, applying cl to all references in the region"? > > You're right, I missed updating the comment when the signature was changed. > Changing to: > > // Iterate over the objects overlapping part of a card, applying cl > // to all references in the region. This is a helper for > // G1RemSet::refine_card, and is tightly coupled with it. > > which is still immediately followed by: > > // mr: the memory region covered by the card, trimmed to the > // allocated space for this region. Must not be empty. Ok, looks good. I think this patch is good to go now, thanks for taking this on, appreciate it.
Erik From kim.barrett at oracle.com Fri Nov 18 14:03:50 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 18 Nov 2016 09:03:50 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: > On Nov 15, 2016, at 6:58 PM, Kim Barrett wrote: > >> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: >> >> Hi Kim, >> >> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>> >>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>> om> wrote: >>>> Maybe it would be cleaner to call a method in the barrier set >>>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>>> Maybe as an additional RFE. >>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>> overgeneralized and inefficient for this situation, but this >>> situation should occur *very* rarely; it requires a stale card get >>> processed just as a humongous object is in the midst of being >>> allocated in the same region. >> >> I kind of think for these reasons we should use _ct_bs->invalidate() as >> it seems clearer to me. There is the mentioned drawback of having no >> other more efficient way, so I will let you decide about this. > > I've made the change to call invalidate, and also updated some comments. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8166607 > > Webrevs: > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > > Also, see RFR: 8166811, where I've included a webrev combining the > latest changes for 8166607 and 8166811, since they are rather > intertwined. I think I'll do as Erik suggested and push the two > together. Sorry folks, but I want to revert this part and go back to the old code where it locked the shared queue and enqueued there. If the executing invocation of refine_card is from a Java thread, e.g. this is the "mutator helps with refinement" case, calling invalidate would enqueue to the current thread's buffer. But that is effectively a reentrant call to enqueue, and the Java thread case of enqueue is not reentrant-safe. Only enqueue to the shared queue is reentrant-safe. I think that scenario presently can't happen, since the mutator helps case is dealt with by the mutator processing its own buffer. In that situation, all the cards in the buffer came from writes by this thread to an object this thread either allocated or has access to, so the klass must be there. But that's getting uncomfortably subtle in what is already difficult-to-analyze code. Also, we've talked about changing the mutator helps case to not immediately process its own buffer but instead add its buffer to the pending buffer list and process the next (FIFO ordered) buffer, in order to let its buffer age. (I have a change for that in my post-JDK 9 collection of pending changes. The mutator-invoked enqueue might be reentrant-safe in that change, but I don't think I want to make that guarantee.)
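The reentrancy hazard Kim describes can be illustrated with a tiny plain-Java model (this is not HotSpot's dirty-card queue code; all names are made up): a thread-local buffer whose enqueue drains the buffer in place when it fills is not safe to re-enter, because a nested enqueue during the drain mutates the very state the drain is iterating over.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a non-reentrant thread-local enqueue (illustrative names only).
public class ReentrantEnqueueDemo {
    static final int CAP = 2;
    static final int[] buf = new int[CAP];
    static int n = 0;
    static boolean draining = false;
    static final List<Integer> refined = new ArrayList<>();

    static void enqueue(int card) {
        buf[n++] = card;
        if (n == CAP) {          // buffer full: drain it in place
            draining = true;
            n = 0;               // unsafe if refine() re-enters enqueue()
            for (int i = 0; i < CAP; i++) {
                refine(buf[i]);
            }
            draining = false;
        }
    }

    static void refine(int card) {
        refined.add(card);
        if (draining && card == 1) {
            // Models refinement itself enqueuing more cards mid-drain.
            enqueue(100);
            enqueue(101);
        }
    }

    public static void main(String[] args) {
        enqueue(1);
        enqueue(2);
        // Card 2 is never refined and 101 is refined twice:
        System.out.println(refined);
    }
}
```

A lock-protected shared queue avoids this because a nested enqueue appends under the lock rather than rewinding an index mid-drain, which matches Kim's point that only the shared-queue path is reentrant-safe.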
From thomas.schatzl at oracle.com Fri Nov 18 14:28:27 2016 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 18 Nov 2016 15:28:27 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: <1479479307.2483.5.camel@oracle.com> Hi Kim, On Fri, 2016-11-18 at 09:03 -0500, Kim Barrett wrote: > > > > On Nov 15, 2016, at 6:58 PM, Kim Barrett > > wrote: > > > > > > > > On Nov 15, 2016, at 5:21 AM, Thomas Schatzl > > e.com> wrote: > > > > > > Hi Kim, > > > > > > On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: > > > > > > > > > > > > > > > > > > > On Nov 7, 2016, at 5:53 AM, Thomas Schatzl > > > > acle.c > > > > > om> wrote: > > > > > Maybe it would be cleaner to call a method in the barrier set > > > > > instead of inlining the dirtying + enqueuing in lines 685 to > > > > > 691? > > > > > Maybe as an additional RFE. > > > > We could use _ct_bs->invalidate(dirtyRegion). That's rather > > > > overgeneralized and inefficient for this situation, but this > > > > situation should occur *very* rarely; it requires a stale card > > > > get > > > > processed just as a humongous object is in the midst of being > > > > allocated in the same region. > > > I kind of think for these reasons we should use _ct_bs- > > > >invalidate() as > > > it seems clearer to me.
There is the mentioned drawback of having > > > no > > > other more efficient way, so I will let you decide about this. > > I've made the change to call invalidate, and also updated some > > comments. > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8166607 > > > > Webrevs: > > full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ > > incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ > > > > Also, see RFR: 8166811, where I've included a webrev combining the > > latest changes for 8166607 and 8166811, since they are rather > > intertwined. I think I'll do as Erik suggested and push the two > > together. > Sorry folks, but I want to revert this part and go back to the old > code where it locked the shared queue and enqueued there. > You mean the invalidate() call? If you think this is better, it has > only been a suggestion. > If the executing invocation of refine_card is from a Java thread, > e.g. this is the "mutator helps with refinement" case, calling > invalidate would enqueue to the current thread's buffer. But that is > effectively a reentrant call to enqueue, and the Java thread case of > enqueue is not reentrant-safe. Only enqueue to the shared queue is > reentrant-safe. Yes, that's bad. One could extract this code out and put into the barrier set though - as another CR. Thanks,
Thomas From kim.barrett at oracle.com Fri Nov 18 14:32:33 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 18 Nov 2016 09:32:33 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479479307.2483.5.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> <1479479307.2483.5.camel@oracle.com> Message-ID: > On Nov 18, 2016, at 9:28 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Fri, 2016-11-18 at 09:03 -0500, Kim Barrett wrote: >>> >>> On Nov 15, 2016, at 6:58 PM, Kim Barrett >>> wrote: >>> >>>> >>>> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl >>> e.com> wrote: >>>> >>>> Hi Kim, >>>> >>>> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>>> >>>>>> >>>>>> >>>>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>>>> acle.c >>>>>> om> wrote: >>>>>> Maybe it would be cleaner to call a method in the barrier set >>>>>> instead of inlining the dirtying + enqueuing in lines 685 to >>>>>> 691? >>>>>> Maybe as an additional RFE. >>>>> We could use _ct_bs->invalidate(dirtyRegion). That's rather >>>>> overgeneralized and inefficient for this situation, but this >>>>> situation should occur *very* rarely; it requires a stale card >>>>> get >>>>> processed just as a humongous object is in the midst of being >>>>> allocated in the same region. 
>>>> I kind of think for these reasons we should use _ct_bs- >>>>> invalidate() as >>>> it seems clearer to me. There is the mentioned drawback of having >>>> no >>>> other more efficient way, so I will let you decide about this. >>> I've made the change to call invalidate, and also updated some >>> comments. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8166607 >>> >>> Webrevs: >>> full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ >>> incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ >>> >>> Also, see RFR: 8166811, where I've included a webrev combining the >>> latest changes for 8166607 and 8166811, since they are rather >>> intertwined. I think I'll do as Erik suggested and push the two >>> together. >> Sorry folks, but I want to revert this part and go back to the old >> code where it locked the shared queue and enqueued there. >> > > You mean the invalidate() call? If you think this is better, it has > only been a suggestion. Yes. It seemed like a good idea at the time... From erik.joelsson at oracle.com Fri Nov 18 15:30:20 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 18 Nov 2016 16:30:20 +0100 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images Message-ID: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Hello, Please review this change which removes the $ARCH sub directory in the lib directory of the runtime images, which is an outstanding issue from the new runtime images. Most of the changes are in the build, but there are some in hotspot and launcher source. I have verified -testset hotspot and default in JPRT as well as tried to run as many jtreg tests as possible locally. I could only really find two tests that needed to be adjusted.
Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 /Erik From vladimir.kozlov at oracle.com Fri Nov 18 16:09:01 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Nov 2016 08:09:01 -0800 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: <582EBCE0.7090506@oracle.com> References: <582DA5B2.4020307@oracle.com> <582E21BB.1060704@oracle.com> <582EBCE0.7090506@oracle.com> Message-ID: <3bc033fc-a2cb-ee72-1d88-9f6b422620f7@oracle.com> Looks good. Thanks, Vladimir On 11/18/16 12:33 AM, Tobias Hartmann wrote: > Thanks for the reviews, Vladimir and Ioi! > > As Vladimir suggested, I moved the UseFMA check into TemplateInterpreterGenerator::generate_math_entry(): > http://cr.openjdk.java.net/~thartmann/8169711/webrev.01/ > > Best regards, > Tobias > > On 17.11.2016 22:31, Ioi Lam wrote: >> Hi Tobias, >> >> The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. >> >> Thanks >> - Ioi >> >> On 11/17/16 11:34 AM, Vladimir Kozlov wrote: >>> Hi Tobias, >>> >>> It is a little inconsistent. CRC32 intrinsics check their flag in generate_CRC32* methods. >>> Maybe we should do the same for FMA instead of assert in generate_math_entry() return NULL if flag is false. >>> >>> Thanks, >>> Vladimir >>> >>> On 11/17/16 4:42 AM, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch: >>>> https://bugs.openjdk.java.net/browse/JDK-8169711 >>>> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >>>> >>>> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared. The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()).
This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). >>>> >>>> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. >>>> >>>> The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >>>> >>>> I fixed this by always creating the interpreter method entries for intrinsified methods but replace them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. >>>> >>>> While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because >>>> 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and >>>> 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. >>>> >>>> I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. >>>> >>>> Tested with regression test, JPRT and RBT (running). 
>>>> >>>> Thanks, >>>> Tobias >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >>>> >> From tim.bell at oracle.com Fri Nov 18 16:34:03 2016 From: tim.bell at oracle.com (Tim Bell) Date: Fri, 18 Nov 2016 08:34:03 -0800 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Message-ID: <9846f58b-0c80-b8ce-674f-cf3fd239b01f@oracle.com> Erik: > Please review this change which removes the $ARCH sub directory in the > lib directory of the runtime images, which is an outstanding issue from > the new runtime images. Most of the changes are in the build, but there > are some in hotspot and launcher source. I have verified -testset > hotspot and default in JPRT as well as tried to run as many jtreg tests > as possible locally. I could only really find two tests that needed to > be adjusted. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 hotspot/test/runtime/ThreadSignalMask/exeThreadSignalMask.c jdk/make/copy/Copy-java.desktop.gmk jdk/src/java.base/unix/classes/java/lang/ProcessImpl.java These legal notices need to be updated for 2016. No need to redo the webrev if this is all the feedback you get. Looks fine otherwise. Tim From vladimir.kozlov at oracle.com Fri Nov 18 16:41:29 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Nov 2016 08:41:29 -0800 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Message-ID: <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Finally! :) Hotspot changes look fine to me. But you missed hotspot/make/hotspot.script file. Our colleagues at RH and SAP should test these changes on their platforms.
Next step would be removal of client/server sub-directories on platforms where we have only Server JVM (64-bit JDK has only Server JVM). Thanks, Vladimir On 11/18/16 7:30 AM, Erik Joelsson wrote: > Hello, > > Please review this change which removes the $ARCH sub directory in the > lib directory of the runtime images, which is an outstanding issue from > the new runtime images. Most of the changes are in the build, but there > are some in hotspot and launcher source. I have verified -testset > hotspot and default in JPRT as well as tried to run as many jtreg tests > as possible locally. I could only really find two tests that needed to > be adjusted. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > > /Erik > From magnus.ihse.bursie at oracle.com Fri Nov 18 20:40:58 2016 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 18 Nov 2016 21:40:58 +0100 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> Message-ID: On 2016-11-18 16:30, Erik Joelsson wrote: > Hello, > > Please review this change which removes the $ARCH sub directory in the > lib directory of the runtime images, which is an outstanding issue > from the new runtime images. Most of the changes are in the build, but > there are some in hotspot and launcher source. I have verified > -testset hotspot and default in JPRT as well as tried to run as many > jtreg tests as possible locally. I could only really find two tests > that needed to be adjusted. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 Looks good to me. If anything, the switch statement in ProcessImpl.java seems superfluous now, and you could possibly prune that bit even harder. Nice to see this go. 
:) /Magnus From kim.barrett at oracle.com Fri Nov 18 20:53:45 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 18 Nov 2016 15:53:45 -0500 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: <1479479307.2483.5.camel@oracle.com> References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> <1479479307.2483.5.camel@oracle.com> Message-ID: <6740CCD9-671C-485B-8E39-5F1DEF249830@oracle.com> > On Nov 18, 2016, at 9:28 AM, Thomas Schatzl wrote: > On Fri, 2016-11-18 at 09:03 -0500, Kim Barrett wrote: >> If the executing invocation of refine_card is from a Java thread, >> e.g. this is the "mutator helps with refinement" case, calling >> invalidate would enqueue to the current thread's buffer. But that is >> effectively a reentrant call to enqueue, and the Java thread case of >> enqueue is not reentrant-safe. Only enqueue to the shared queue is >> reentrant-safe. > > Yes, that's bad. One could extract this code out and put into the > barrier set though - as another CR. 
https://bugs.openjdk.java.net/browse/JDK-8170020 From tobias.hartmann at oracle.com Mon Nov 21 06:05:19 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 21 Nov 2016 07:05:19 +0100 Subject: [9] RFR(S): 8169711: CDS does not patch entry trampoline if intrinsic method is disabled In-Reply-To: <3bc033fc-a2cb-ee72-1d88-9f6b422620f7@oracle.com> References: <582DA5B2.4020307@oracle.com> <582E21BB.1060704@oracle.com> <582EBCE0.7090506@oracle.com> <3bc033fc-a2cb-ee72-1d88-9f6b422620f7@oracle.com> Message-ID: <58328E9F.5010303@oracle.com> Thanks again, Vladimir! Best regards, Tobias On 18.11.2016 17:09, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 11/18/16 12:33 AM, Tobias Hartmann wrote: >> Thanks for the reviews, Vladimir and Ioi! >> >> As Vladimir suggested, I moved the UseFMA check into TemplateInterpreterGenerator::generate_math_entry(): >> http://cr.openjdk.java.net/~thartmann/8169711/webrev.01/ >> >> Best regards, >> Tobias >> >> On 17.11.2016 22:31, Ioi Lam wrote: >>> Hi Tobias, >>> >>> The interpreter changes look OK to me. I'll defer to Vladimir on his opinion on the asserts. >>> >>> Thanks >>> - Ioi >>> >>> On 11/17/16 11:34 AM, Vladimir Kozlov wrote: >>>> Hi Tobias, >>>> >>>> It is a little inconsistent. CRC32 intrinsics check their flag in generate_CRC32* methods. >>>> Maybe we should do the same for FMA instead of assert in generate_math_entry() return NULL if flag is false. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 11/17/16 4:42 AM, Tobias Hartmann wrote: >>>>> Hi, >>>>> >>>>> please review the following patch: >>>>> https://bugs.openjdk.java.net/browse/JDK-8169711 >>>>> http://cr.openjdk.java.net/~thartmann/8169711/webrev.00/ >>>>> >>>>> When dumping metadata with class data sharing (CDS), Method::unlink_method() takes care of removing all entry points of methods that will be shared.
The _i2i and _from_interpreted entries are set to the corresponding address in the _cds_entry_table (see AbstractInterpreter::entry_for_cds_method()). This address points to a trampoline in shared space that jumps to the actual (unshared) interpreter method entry at runtime (see AbstractInterpreter::update_cds_entry_table()). >>>>> >>>>> Intrinsic methods may have a special interpreter entry (for example, 'Interpreter::java_lang_math_fmaF') and if they are shared, their entry points are set to such a trampoline that is patched at runtime to jump to the interpreter stub containing the intrinsic code. >>>>> >>>>> The problem is that if an intrinsic is enabled during dumping but disabled during re-using the shared archive, the trampoline is not patched and therefore still refers to the old stub address that was only valid during dumping. In debug, we hit the "should be correctly set during dump time" assert in Method::link_method() because the method entries are inconsistent. In product, we crash because we jump to an invalid address through the unpatched trampoline. This problem exists with the FMA, CRC32 and CRC32C intrinsics (see regression test). >>>>> >>>>> I fixed this by always creating the interpreter method entries for intrinsified methods but replace them with vanilla entries in TemplateInterpreterGenerator::generate_method_entry() if the intrinsic is disabled at runtime. Like this, we patch the trampoline destination address even if the intrinsic is disabled but just execute the Java bytecodes instead of the stub. >>>>> >>>>> While testing, I noticed that the assert in Method::link_method() is not always triggered (sometimes we just crash). This is because >>>>> 1) the "_from_compiled_entry == NULL" check in Method::restore_unshareable_info() is always false and therefore link_method() is not invoked and >>>>> 2) in Method::link_method() we only execute the check if the adapter (which is shared) was not yet initialized. 
>>>>> >>>>> I filed JDK-8169867 [1] for this because I'm not too familiar with the CDS internals. >>>>> >>>>> Tested with regression test, JPRT and RBT (running). >>>>> >>>>> Thanks, >>>>> Tobias >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8169867 >>>>> >>> From shafi.s.ahmad at oracle.com Mon Nov 21 06:29:42 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Sun, 20 Nov 2016 22:29:42 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Message-ID: Hi All, May I get the second review on this. I am putting together all the webrevs to make it simple for reviewer. http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ Please note that I tested with jprt, all jtreg and rbt tests. Regards, Shafi > -----Original Message----- > From: Vladimir Kozlov > Sent: Wednesday, November 16, 2016 10:21 PM > To: Shafi Ahmad; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Looks good. > > I would suggest to run all jtreg tests (or even RBT) when you apply all > changes before pushing this. > > Thanks, > Vladimir > > On 11/16/16 4:52 AM, Shafi Ahmad wrote: > > Hi Vladimir, > > > > Thank you for the review and feedback. > > > > Please find updated webrevs: > > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed > the test case as it use only jdk9 APIs. 
> > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed > test methods testFixedOffsetHeaderArray17() and > testFixedOffsetHeader17() which referenced jdk9 API > UNSAFE.getIntUnaligned. > > > > > > Regards, > > Shafi > > > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Wednesday, November 16, 2016 1:00 AM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> Hi Shafi > >> > >> You should not backport tests which use only new JDK 9 APIs. Like > >> TestUnsafeUnalignedMismatchedAccesses.java test. > >> > >> But it is perfectly fine to modify backport by removing part of > >> changes which use a new API. For example, 8162101 changes in > >> OpaqueAccesses.java test which use getIntUnaligned() method. > >> > >> It is unfortunate that 8140309 changes include also code which > >> process new Unsafe Unaligned intrinsics from JDK 9. It should not be > >> backported but it will simplify this and following backports. So I > >> agree with changes you did for > >> 8140309 backport. > >> > >> Thanks, > >> Vladimir > >> > >> On 11/14/16 10:34 PM, Shafi Ahmad wrote: > >>> Hi Vladimir, > >>> > >>> Thanks for the review. > >>> > >>>> -----Original Message----- > >>> > >>>> From: Vladimir Kozlov > >>> > >>>> Sent: Monday, November 14, 2016 11:20 PM > >>> > >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>> > >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>> produces > >>> > >>>> mismatched unsafe accesses > >>> > >>>> > >>> > >>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > >>> > >>>>> Hi Vladimir, > >>> > >>>>> > >>> > >>>>> Thanks for the review. > >>> > >>>>> > >>> > >>>>> Please find updated webrevs. > >>> > >>>>> > >>> > >>>>> All webrevs are with respect to the base changes on JDK-8140309. 
> >>> > >>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > >>> > >>>> > >>> > >>>> Why you kept unaligned parameter in changes? > >>> > >>> The fix of JDK-8136473 caused many problems after integration (see > >>> JDK- > >> 8140267). > >>> > >>> The fix was backed out and re-implemented with JDK-8140309 by > >>> slightly > >> changing the assert: > >>> > >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- > >> Novem > >>> ber/019696.html > >>> > >>> The code change for the fix of JDK-8140309 is code changes for > >>> JDK-8136473 > >> by slightly changing one assert. > >>> > >>> jdk9 original changeset is > >>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > >>> > >>> As this is a backport so I keep the changes as it is. > >>> > >>>> > >>> > >>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not work > >>>> since > >>> > >>>> since Unsafe class in jdk8 does not have unaligned methods. > >>> > >>>> Hot did you run it? > >>> > >>> I am sorry, looks there is some issue with my testing. > >>> > >>> I have run jtreg test after merging the changes but somehow the test > >>> does > >> not run and I verified only the failing list of jtreg result. > >>> > >>> When I run the test case separately it is failing as you already > >>> pointed out > >> the same. 
> >>> > >>> $java -jar ~/Tools/jtreg/lib/jtreg.jar > >>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ > >>> > >> > hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatched > >> A > >>> ccesses.java > >>> > >>> Test results: failed: 1 > >>> > >>> Report written to > >>> /scratch/shshahma/Java/jdk8u-dev- > >> 8140309_01/JTreport/html/report.html > >>> > >>> Results written to > >>> /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > >>> > >>> Error: > >>> > >>> /scratch/shshahma/Java/jdk8u-dev- > >> 8140309_01/hotspot/test/compiler/intr > >>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: > >>> cannot find symbol > >>> > >>> UNSAFE.putIntUnaligned(array, > >>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > >>> > >>> Not sure if we should push without the test case. > >>> > >>>> > >>> > >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > >>> > >>>> > >>> > >>>> Good. Did you run new UnsafeAccess.java test? > >>> > >>> Due to same process issue the test case is not run and when I run it > >> separately it fails. > >>> > >>> It passes after doing below changes: > >>> > >>> 1. Added /othervm > >>> > >>> 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by > >>> 'import > >> sun.misc.Unsafe;' > >>> > >>> Updated webrev: > >>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > >>> > >>>> > >>> > >>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > >>> > >>> I am getting the similar compilation error as above for added test > >>> case. Not > >> sure if we can push without the test case. > >>> > >>> Regards, > >>> > >>> Shafi > >>> > >>>> > >>> > >>>> Good. 
> >>> > >>>> > >>> > >>>> Thanks, > >>> > >>>> Vladimir > >>> > >>>> > >>> > >>>>> > >>> > >>>>> Regards, > >>> > >>>>> Shafi > >>> > >>>>> > >>> > >>>>> > >>> > >>>>> > >>> > >>>>>> -----Original Message----- > >>> > >>>>>> From: Vladimir Kozlov > >>> > >>>>>> Sent: Friday, November 11, 2016 1:26 AM > >>> > >>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>> > >>> > >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>>> produces > >>> > >>>>>> mismatched unsafe accesses > >>> > >>>>>> > >>> > >>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >>> > >>>>>>> Hi, > >>> > >>>>>>> > >>> > >>>>>>> Please review the backport of following dependent backports. > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 > >>> > >>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 > >>>>>>> [JDK- > >>> > >>>>>> 8080289]. Manual merge is not done as the corresponding code is > >>>>>> not > >>> > >>>>>> there in jdk8u-dev. > >>> > >>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp > >>>>>>> and > >>> > >>>>>>> manual > >>> > >>>>>> merge is done. > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> unaligned unsafe access methods were added in jdk 9 only. In your > >>> > >>>>>> changes unaligned argument is always false. You can simplify > changes. 
> >>> > >>>>>> > >>> > >>>>>> Also you should base changes on JDK-8140309 (original 8136473 > >>>>>> changes > >>> > >>>>>> were backout by 8140267): > >>> > >>>>>> > >>> > >>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >>> > >>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >>> > >>>>>> > > >>> > >>>>>> > Same as 8136473 with only the following change: > >>> > >>>>>> > > >>> > >>>>>> > diff --git a/src/share/vm/opto/library_call.cpp > >>> > >>>>>> b/src/share/vm/opto/library_call.cpp > >>> > >>>>>> > --- a/src/share/vm/opto/library_call.cpp > >>> > >>>>>> > +++ b/src/share/vm/opto/library_call.cpp > >>> > >>>>>> > @@ -2527,7 +2527,7 @@ > >>> > >>>>>> > // of safe & unsafe memory. > >>> > >>>>>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); > >>> > >>>>>> > > >>> > >>>>>> > - assert(is_native_ptr || alias_type->adr_type() == > >>> > >>>>>> TypeOopPtr::BOTTOM > >>> > >>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || > >>> > >>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || > >>> > >>>>>> > alias_type->field() != NULL || alias_type->element() != > >>> > >>>>>> NULL, "field, array element or unknown"); > >>> > >>>>>> > bool mismatched = false; > >>> > >>>>>> > if (alias_type->element() != NULL || alias_type->field() != NULL) > { > >>> > >>>>>> > > >>> > >>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >>> > >>>>>> is_native_ptr case and the case where the unsafe method is called > >>>>>> with a > >>> > >>>> null object. > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >>> > >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> > >>>>>>> > >>> > >>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.16 > >>>> 5 > >>> > >>>>>> [JDK-8140309]. 
Manual merge is not done as the corresponding code > >>>>>> is > >>> > >>>>>> not there in jdk8u-dev. > >>> > >>>>>> > >>> > >>>>>> I explained situation with this line above. > >>> > >>>>>> > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> This webrev is not incremental for your 8136473 changes - > >>> > >>>>>> library_call.cpp has part from 8136473 changes. > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> > >>>>>>> Clean merge > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> Thanks seems fine. > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >>> > >>>>>>> > >>> > >>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>> > >>>> > >>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >>> > >>>>>>> [JDK-8160360] - Resolved 2. > >>> > >>>> > >>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 > >>>>>> 73 > >>> > >>>>>> [JDK-8148146] - Manual merge is not done as the corresponding > >>>>>> code is > >>> > >>>>>> not there in jdk8u-dev. > >>> > >>>>>>> webrev link: > >>> > >>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >>> > >>>>>> > >>> > >>>>>> This webrev is not incremental in library_call.cpp. Difficult to > >>>>>> see > >>> > >>>>>> this part of changes. 
> >>> > >>>>>> > >>> > >>>>>> Thanks, > >>> > >>>>>> Vladimir > >>> > >>>>>> > >>> > >>>>>>> jdk9 changeset: > >>> > >>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >>> > >>>>>>> > >>> > >>>>>>> Testing: jprt and jtreg > >>> > >>>>>>> > >>> > >>>>>>> Regards, > >>> > >>>>>>> Shafi > >>> > >>>>>>> > >>> > >>>>>>>> -----Original Message----- > >>> > >>>>>>>> From: Shafi Ahmad > >>> > >>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM > >>> > >>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >>>>>>>> > >>> > >>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >>> > >>>>>>>> produces mismatched unsafe accesses > >>> > >>>>>>>> > >>> > >>>>>>>> Thanks Vladimir. > >>> > >>>>>>>> > >>> > >>>>>>>> I will create dependent backport of 1. > >>> > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 > >>> > >>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> > >>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>>>>>>> > >>> > >>>>>>>> Regards, > >>> > >>>>>>>> Shafi > >>> > >>>>>>>> > >>> > >>>>>>>>> -----Original Message----- > >>> > >>>>>>>>> From: Vladimir Kozlov > >>> > >>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >>> > >>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>>>>> > >>> > >>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>> > >>>>>>>>> produces mismatched unsafe accesses > >>> > >>>>>>>>> > >>> > >>>>>>>>> Hi Shafi, > >>> > >>>>>>>>> > >>> > >>>>>>>>> You should also consider backporting following related fixes: > >>> > >>>>>>>>> > >>> > >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>> > >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>> > >>>>>>>>> > >>> > >>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. 
> >>> > >>>>>>>>> > >>> > >>>>>>>>> Thanks, > >>> > >>>>>>>>> Vladimir > >>> > >>>>>>>>> > >>> > >>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>> > >>>>>>>>>> Hi All, > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type > >>>>>>>>>> speculation > >>> > >>>>>>>>>> produces > >>> > >>>>>>>>> mismatched unsafe accesses to jdk8u-dev. > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Please note that backport is not clean and the conflict is due to: > >>> > >>>>>>>>>> > >>> > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>> > >>>>>>>>>> 1 > >>> > >>>>>>>>>> 65 > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Getting debug build failure because of: > >>> > >>>>>>>>>> > >>> > >>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>> > >>>>>>>>>> 1 > >>> > >>>>>>>>>> 55 > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> The above changes are done under bug# 'JDK-8136473: failed: > >>>>>>>>>> no > >>> > >>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' which > >>>>>>>>> is > >>> > >>>>>>>>> not back ported to jdk8u and the current backport is on top of > >>> > >>>>>>>>> above > >>> > >>>>>> change. > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Please note that I am not sure if there is any dependency > >>> > >>>>>>>>>> between these > >>> > >>>>>>>>> two changesets. 
> >>> > >>>>>>>>>> > >>> > >>>>>>>>>> open webrev: > >>> > >>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>> > >>>>>>>>>> jdk9 bug > >>>>>>>>>> link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >>> > >>>>>>>>>> jdk9 changeset: > >>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> testing: Passes JPRT, jtreg not completed > >>> > >>>>>>>>>> > >>> > >>>>>>>>>> Regards, > >>> > >>>>>>>>>> Shafi > >>> > >>>>>>>>>> > >>> From ioi.lam at oracle.com Mon Nov 21 06:58:13 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 20 Nov 2016 22:58:13 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method Message-ID: <58329B05.6070602@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8169867 http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ Thanks to Tobias for finding the bug. I have done the following: + integrated Tobias' suggested fix + fixed Method::restore_unshareable_info to call Method::link_method + added comments and a diagram to illustrate how the CDS method entry trampolines work. BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. It's basically an extra level of indirection to get to the adapter. However, the word "trampoline" is usually used for an extra jump in executable code, so it may be a little confusing when we use it for a data pointer here. Any suggestions for a better name? Testing: [1] I have tested Tobias' TestInterpreterMethodEntries.java class and now it produces the correct assertion. I won't check in this test, though, since it won't assert anymore after Tobias fixes 8169711. 
# after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: # should be correctly set during dump time [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist All tests passed. Thanks - Ioi From erik.helin at oracle.com Mon Nov 21 07:21:05 2016 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 21 Nov 2016 08:21:05 +0100 Subject: RFR: 8166607: G1 needs klass_or_null_acquire In-Reply-To: References: <98788D01-D3B4-45C4-B133-204B36A6799D@oracle.com> <20161010135436.GB11489@ehelin.jrpg.bea.com> <20161013101149.GL19291@ehelin.jrpg.bea.com> <01DA7656-1AFD-4DF7-9FE8-743092ADEDBA@oracle.com> <328E1EF5-9DD6-41A9-900A-56E66DF245F3@oracle.com> <20161018085351.GA19291@ehelin.jrpg.bea.com> <7A66BDA6-87B3-417A-8BF5-61E8EFFB8F29@oracle.com> <299b4c9a-6090-cf9a-ad01-8a010146e647@oracle.com> <36E3C19E-F2B7-4EA3-BBF5-2A09540EDE16@oracle.com> <788C53B9-1CB1-45C9-BFA9-58D4DAE94C3E@oracle.com> <1478516014.2646.16.camel@oracle.com> <98A51B1C-BEBC-4B02-AA17-C279D8E2E058@oracle.com> <1479205264.3251.13.camel@oracle.com> <05986D2C-3901-410E-A8B5-DAA96C7D193C@oracle.com> Message-ID: <9976738f-1dc6-f3c0-3ec3-6229741b7db0@oracle.com> On 11/18/2016 03:03 PM, Kim Barrett wrote: >> On Nov 15, 2016, at 6:58 PM, Kim Barrett wrote: >> >>> On Nov 15, 2016, at 5:21 AM, Thomas Schatzl wrote: >>> >>> Hi Kim, >>> >>> On Mon, 2016-11-07 at 14:38 -0500, Kim Barrett wrote: >>>>> >>>>> On Nov 7, 2016, at 5:53 AM, Thomas Schatzl >>>> om> wrote: >>>>> Maybe it would be cleaner to call a method in the barrier set >>>>> instead of inlining the dirtying + enqueuing in lines 685 to 691? >>>>> Maybe as an additional RFE. >>>> We could use _ct_bs->invalidate(dirtyRegion). 
That's rather >>>> overgeneralized and inefficient for this situation, but this >>>> situation should occur *very* rarely; it requires a stale card get >>>> processed just as a humongous object is in the midst of being >>>> allocated in the same region. >>> >>> I kind of think for these reasons we should use _ct_bs->invalidate() as >>> it seems clearer to me. There is the mentioned drawback of having no >>> other more efficient way, so I will let you decide about this. >> >> I've made the change to call invalidate, and also updated some comments. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8166607 >> >> Webrevs: >> full: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03/ >> incr: http://cr.openjdk.java.net/~kbarrett/8166607/webrev.03.inc/ >> >> Also, see RFR: 8166811, where I've included a webrev combining the >> latest changes for 8166607 and 8166811, since they are rather >> intertwined. I think I'll do as Erik suggested and push the two >> together. > > Sorry folks, but I want to revert this part and go back to the old > code where it locked the shared queue and enqueued there. > > If the executing invocation of refine_card is from a Java thread, > e.g. this is the "mutator helps with refinement" case, calling > invalidate would enqueue to the current thread's buffer. But that is > effectively a reentrant call to enqueue, and the Java thread case of > enqueue is not reentrant-safe. Only enqueue to the shared queue is > reentrant-safe. > > I think that scenario presently can't happen, since the mutator helps > case is dealt with by the mutator processing it's own buffer. In that > situation, all the cards in the buffer came from writes by this thread > to an object this thread either allocated or has access to, so the > klass must be there. But that's getting uncomfortably subtle in what > is already difficult to analyze code. Agree, lets revert to the old code. Thanks for being so careful about this change. 
> Also, we've talked about changing the mutator helps case to not > immediately process it's own buffer but instead add its buffer to the > pending buffer list and process the next (FIFO ordered) buffer, in > order to let its buffer age. (I have a change for that in my post-JDK > 9 collection of pending changes. The mutator-invoked enqueue might be > reentrant-safe in that change, but I don't think I want to make that > guarantee.) It is hard as it is to keep track of all the synchronization and guarantees spread out in the code to make the card refinement work, so I would prefer to keep it simple as just revert back to the existing code. From tobias.hartmann at oracle.com Mon Nov 21 07:53:32 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 21 Nov 2016 08:53:32 +0100 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <58329B05.6070602@oracle.com> References: <58329B05.6070602@oracle.com> Message-ID: <5832A7FC.8030505@oracle.com> Hi Ioi, this looks good to me, the detailed description including the diagram is very nice and helps to understand the complex implementation! For the record: the test mentioned in [1] is part of my fix for JDK-8169711. Best regards, Tobias On 21.11.2016 07:58, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8169867 > http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ > > Thanks to Tobias for finding the bug. I have done the following > > + integrated Tobias' suggested fix > + fixed Method::restore_unshareable_info to call Method::link_method > + added comments and a diagram to illustrate how the CDS method entry > trampolines work. > > BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. > It's basically an extra level of indirection to get to the adapter. However. 
> The word "trampoline" usually is used for and extra jump in executable code, > so it may be a little confusing when we use it for a data pointer here. > > Any suggest for a better name? > > > Testing: > [1] I have tested Tobias' TestInterpreterMethodEntries.java class and > now it produces the correct assertion. I won't check in this test, though, > since it won't assert anymore after Tobias fixes 8169711. > > # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 > # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: > # should be correctly set during dump time > > [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist > All tests passed. > > Thanks > - Ioi > From goetz.lindenmaier at sap.com Mon Nov 21 11:35:20 2016 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 21 Nov 2016 11:35:20 +0000 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: <1156e3bb13944547977e904fd8d79ccb@DEROTE13DE08.global.corp.sap> Hi, we appreciate this change a lot, and also if /server would go away. I built and tested it on linuxppcle, aixppc and linuxs390. There is still a place that refers to a removed variables and breaks the build: jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME You can probably just replace LIBARCHNAME by ARCH which is set to the same value. I would propose to remove VM_CPU from hotspot/test/test_env.sh after you removed the last place where it is used. (VM_BITS is dead, too.) Best regards, Goetz. 
> -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of Vladimir Kozlov > Sent: Freitag, 18. November 2016 17:41 > To: Erik Joelsson ; build-dev dev at openjdk.java.net>; core-libs-dev ; > hotspot-dev developers > Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > and Solaris images > > Finally! :) > > Hotspot changes looks fine to me. But you missed > hotspot/make/hotspot.script file. > > Our colleges in RH and SAP should test these changes on their platforms. > > Next step would be removal of client/server sub-directories on platforms > where we have only Server JVM (64-bit JDK has only Server JVM). > > Thanks, > Vladimir > > On 11/18/16 7:30 AM, Erik Joelsson wrote: > > Hello, > > > > Please review this change which removes the $ARCH sub directory in the > > lib directory of the runtime images, which is an outstanding issue from > > the new runtime images. Most of the changes are in the build, but there > > are some in hotspot and launcher source. I have verified -testset > > hotspot and default in JPRT as well as tried to run as many jtreg tests > > as possible locally. I could only really find two tests that needed to > > be adjusted. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > > > > /Erik > > From goetz.lindenmaier at sap.com Mon Nov 21 13:10:15 2016 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 21 Nov 2016 13:10:15 +0000 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: Hi, linuxx86_64 has the same issue. 
I tested it with the jdk9/hs repo: jdk/src/java.base/unix/native/libjli/ergo_i586.c: In function ServerClassMachineImpl: jdk/src/java.base/unix/native/libjli/ergo_i586.c:196:30: error: expected ) before LIBARCHNAME Best regards, Goetz > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Montag, 21. November 2016 12:35 > To: 'Vladimir Kozlov' ; Erik Joelsson > ; build-dev ; > core-libs-dev ; hotspot-dev developers > > Subject: RE: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > and Solaris images > > Hi, > > we appreciate this change a lot, and also if /server would go away. > > I built and tested it on linuxppcle, aixppc and linuxs390. > > There is still a place that refers to a removed variables > and breaks the build: > jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME > You can probably just replace LIBARCHNAME by ARCH which is set to > the same value. > > I would propose to remove VM_CPU from hotspot/test/test_env.sh after > you > removed the last place where it is used. (VM_BITS is dead, too.) > > Best regards, > Goetz. > > > -----Original Message----- > > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > > Behalf Of Vladimir Kozlov > > Sent: Freitag, 18. November 2016 17:41 > > To: Erik Joelsson ; build-dev > dev at openjdk.java.net>; core-libs-dev ; > > hotspot-dev developers > > Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > > and Solaris images > > > > Finally! :) > > > > Hotspot changes looks fine to me. But you missed > > hotspot/make/hotspot.script file. > > > > Our colleges in RH and SAP should test these changes on their platforms. > > > > Next step would be removal of client/server sub-directories on platforms > > where we have only Server JVM (64-bit JDK has only Server JVM). 
> > > > Thanks, > > Vladimir > > > > On 11/18/16 7:30 AM, Erik Joelsson wrote: > > > Hello, > > > > > > Please review this change which removes the $ARCH sub directory in the > > > lib directory of the runtime images, which is an outstanding issue from > > > the new runtime images. Most of the changes are in the build, but there > > > are some in hotspot and launcher source. I have verified -testset > > > hotspot and default in JPRT as well as tried to run as many jtreg tests > > > as possible locally. I could only really find two tests that needed to > > > be adjusted. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > > > > > > Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > > > > > > /Erik > > > From erik.joelsson at oracle.com Mon Nov 21 13:26:54 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 21 Nov 2016 14:26:54 +0100 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: Hello Goetz, Thanks for trying this out. Note that the ergo* files were removed in JDK-8169001 which is currently in jdk9/dev but not yet in hs. /Erik On 2016-11-21 14:10, Lindenmaier, Goetz wrote: > Hi, > > linuxx86_64 has the same issue. I tested it with the jdk9/hs repo: > > jdk/src/java.base/unix/native/libjli/ergo_i586.c: In function ServerClassMachineImpl: > jdk/src/java.base/unix/native/libjli/ergo_i586.c:196:30: error: expected ) before LIBARCHNAME > > Best regards, > Goetz > >> -----Original Message----- >> From: Lindenmaier, Goetz >> Sent: Montag, 21. November 2016 12:35 >> To: 'Vladimir Kozlov' ; Erik Joelsson >> ; build-dev ; >> core-libs-dev ; hotspot-dev developers >> >> Subject: RE: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux >> and Solaris images >> >> Hi, >> >> we appreciate this change a lot, and also if /server would go away. 
>> >> I built and tested it on linuxppcle, aixppc and linuxs390. >> >> There is still a place that refers to a removed variable >> and breaks the build: >> jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME >> You can probably just replace LIBARCHNAME by ARCH which is set to >> the same value. >> >> I would propose to remove VM_CPU from hotspot/test/test_env.sh after >> you >> removed the last place where it is used. (VM_BITS is dead, too.) >> >> Best regards, >> Goetz. >> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >>> Behalf Of Vladimir Kozlov >>> Sent: Freitag, 18. November 2016 17:41 >>> To: Erik Joelsson ; build-dev >> dev at openjdk.java.net>; core-libs-dev ; >>> hotspot-dev developers >>> Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux >>> and Solaris images >>> >>> Finally! :) >>> >>> Hotspot changes look fine to me. But you missed >>> hotspot/make/hotspot.script file. >>> >>> Our colleagues at RH and SAP should test these changes on their platforms. >>> >>> Next step would be removal of client/server sub-directories on platforms >>> where we have only Server JVM (64-bit JDK has only Server JVM). >>> >>> Thanks, >>> Vladimir >>> >>> On 11/18/16 7:30 AM, Erik Joelsson wrote: >>>> Hello, >>>> >>>> Please review this change which removes the $ARCH sub directory in the >>>> lib directory of the runtime images, which is an outstanding issue from >>>> the new runtime images. Most of the changes are in the build, but there >>>> are some in hotspot and launcher source. I have verified -testset >>>> hotspot and default in JPRT as well as tried to run as many jtreg tests >>>> as possible locally. I could only really find two tests that needed to >>>> be adjusted. 
>>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 >>>> >>>> Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 >>>> >>>> /Erik >>>> From goetz.lindenmaier at sap.com Mon Nov 21 13:43:21 2016 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Mon, 21 Nov 2016 13:43:21 +0000 Subject: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux and Solaris images In-Reply-To: References: <9bdf96a7-c15f-cc73-06ef-38e02428aba1@oracle.com> <1b0e979f-bcba-a912-66d8-c1282075461e@oracle.com> Message-ID: Ah, ok, so this is fine. Best regards, Goetz. > -----Original Message----- > From: Erik Joelsson [mailto:erik.joelsson at oracle.com] > Sent: Montag, 21. November 2016 14:27 > To: Lindenmaier, Goetz ; Vladimir Kozlov > ; build-dev ; > core-libs-dev ; hotspot-dev developers > > Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from Linux > and Solaris images > > Hello Goetz, > > Thanks for trying this out. Note that the ergo* files were removed in > JDK-8169001 which is currently in jdk9/dev but not yet in hs. > > /Erik > > > On 2016-11-21 14:10, Lindenmaier, Goetz wrote: > > Hi, > > > > linuxx86_64 has the same issue. I tested it with the jdk9/hs repo: > > > > jdk/src/java.base/unix/native/libjli/ergo_i586.c: In function > ServerClassMachineImpl: > > jdk/src/java.base/unix/native/libjli/ergo_i586.c:196:30: error: expected ) > before LIBARCHNAME > > > > Best regards, > > Goetz > > > >> -----Original Message----- > >> From: Lindenmaier, Goetz > >> Sent: Montag, 21. November 2016 12:35 > >> To: 'Vladimir Kozlov' ; Erik Joelsson > >> ; build-dev ; > >> core-libs-dev ; hotspot-dev > developers > >> > >> Subject: RE: RFR: JDK-8066474: Remove the lib/$ARCH directory from > Linux > >> and Solaris images > >> > >> Hi, > >> > >> we appreciate this change a lot, and also if /server would go away. > >> > >> I built and tested it on linuxppcle, aixppc and linuxs390. 
> >> > >> There is still a place that refers to a removed variable > >> and breaks the build: > >> jdk/src/java.base/unix/native/libjli/ergo.c:94 LIBARCHNAME > >> You can probably just replace LIBARCHNAME by ARCH which is set to > >> the same value. > >> > >> I would propose to remove VM_CPU from hotspot/test/test_env.sh after > >> you > >> removed the last place where it is used. (VM_BITS is dead, too.) > >> > >> Best regards, > >> Goetz. > >> > >>> -----Original Message----- > >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > >>> Behalf Of Vladimir Kozlov > >>> Sent: Freitag, 18. November 2016 17:41 > >>> To: Erik Joelsson ; build-dev >>> dev at openjdk.java.net>; core-libs-dev dev at openjdk.java.net>; > >>> hotspot-dev developers > >>> Subject: Re: RFR: JDK-8066474: Remove the lib/$ARCH directory from > Linux > >>> and Solaris images > >>> > >>> Finally! :) > >>> > >>> Hotspot changes look fine to me. But you missed > >>> hotspot/make/hotspot.script file. > >>> > >>> Our colleagues at RH and SAP should test these changes on their platforms. > >>> > >>> Next step would be removal of client/server sub-directories on > platforms > >>> where we have only Server JVM (64-bit JDK has only Server JVM). > >>> > >>> Thanks, > >>> Vladimir > >>> > >>> On 11/18/16 7:30 AM, Erik Joelsson wrote: > >>>> Hello, > >>>> > >>>> Please review this change which removes the $ARCH sub directory in > the > >>>> lib directory of the runtime images, which is an outstanding issue from > >>>> the new runtime images. Most of the changes are in the build, but > there > >>>> are some in hotspot and launcher source. I have verified -testset > >>>> hotspot and default in JPRT as well as tried to run as many jtreg tests > >>>> as possible locally. I could only really find two tests that needed to > >>>> be adjusted. 
> >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8066474 > >>>> > >>>> Webrev: http://cr.openjdk.java.net/~erikj/8066474/webrev.01 > >>>> > >>>> /Erik > >>>> From aph at redhat.com Mon Nov 21 15:15:58 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:15:58 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled Message-ID: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> JavaThread::interp_only_mode is a 32-bit sized field. In the assembly code we read it as a 64-bit xword, causing false positives. This means that as soon as we attach a JVMTI debugger everything runs very slowly. http://cr.openjdk.java.net/~aph/8170098/ From aph at redhat.com Mon Nov 21 15:18:13 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:18:13 +0000 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References Message-ID: In the entry of TemplateInterpreterGenerator::generate_Reference_get_entry the sender's SP is saved in r13, a call-clobbered register. We need to save it in a register which is not call-clobbered when we call g1_write_barrier_pre(). It would be better to convert all usages of r13 as senderSP to r19, but this is less risky. I'll do it in JDK 10. http://cr.openjdk.java.net/~aph/8170100/ Andrew. From adinn at redhat.com Mon Nov 21 15:21:08 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 21 Nov 2016 15:21:08 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: <57a5fe39-1f0b-3c0c-c2ee-409769a32c0d@redhat.com> On 21/11/16 15:15, Andrew Haley wrote: > JavaThread::interp_only_mode is a 32-bit sized field. In the assembly > code we read it as a 64-bit xword, causing false positives. This > means that as soon as we attach a JVMTI debugger everything runs very > slowly. 
> > http://cr.openjdk.java.net/~aph/8170098/ Looks good (not an official review). regards, Andrew Dinn ----------- From aph at redhat.com Mon Nov 21 15:22:16 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:22:16 +0000 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues Message-ID: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> JVMCI nearly works, but there are multiple minor bugs which make it non-functional. It's not possible to separate these into multiple issues, so this is a composite patch. The handling of some relocs is wrong. Narrow klasses and OOPs have only partial support, returning Unimplemented() Register numbering for float registers is wrong Scratch registers r8 and r9 aren't marked as non-allocatable. http://cr.openjdk.java.net/~aph/8170106 Andrew. From adinn at redhat.com Mon Nov 21 15:23:29 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 21 Nov 2016 15:23:29 +0000 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References In-Reply-To: References: Message-ID: On 21/11/16 15:18, Andrew Haley wrote: > In the entry of > TemplateInterpreterGenerator::generate_Reference_get_entry the > sender's SP is saved in r13, a call-clobbered register. We need to > save it in a register which is not call-clobbered when we call > g1_write_barrier_pre(). > > It would be better to convert all usages of r13 as senderSP to r19, > but this is less risky. I'll do it in JDK 10. > > http://cr.openjdk.java.net/~aph/8170100/ Looks good to me (not an official review). I agree that postponing the full change to JDK10 is wiser. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Mon Nov 21 15:34:07 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:34:07 +0000 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References In-Reply-To: References: Message-ID: <569efbb2-9292-e789-8738-c55ad521ab04@redhat.com> On 21/11/16 15:23, Andrew Dinn wrote: > Looks good to me (not an official review). Shouldn't you be a JDK9 reviewer by now? IMO you have enough experience. I'll propose you if you like. Andrew. From adinn at redhat.com Mon Nov 21 15:50:37 2016 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 21 Nov 2016 15:50:37 +0000 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues In-Reply-To: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> References: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> Message-ID: <8456c3d4-bbc4-e6d9-d98b-2f185a630bf3@redhat.com> On 21/11/16 15:22, Andrew Haley wrote: > JVMCI nearly works, but there are multiple minor bugs which make it > non-functional. It's not possible to separate these into multiple > issues, so this is a composite patch. > > The handling of some relocs is wrong. > Narrow klasses and OOPs have only partial support, returning Unimplemented() > Register numbering for float registers is wrong > Scratch registers r8 and r9 aren't marked as non-allocatable. > > http://cr.openjdk.java.net/~aph/8170106 I guess I probably ought to review this. All the code changes look sensible and appear correct by eyeball. Whether they are really needed or, indeed, are /all/ that is needed is far from obvious. I could at least build the tree and test that it runs ok. Are you able to provide any special instructions needed to achieve that? (esp the latter). regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Mon Nov 21 15:52:00 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 15:52:00 +0000 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues In-Reply-To: <8456c3d4-bbc4-e6d9-d98b-2f185a630bf3@redhat.com> References: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> <8456c3d4-bbc4-e6d9-d98b-2f185a630bf3@redhat.com> Message-ID: <4121dae2-d69f-abc4-f425-973cc4515e52@redhat.com> On 21/11/16 15:50, Andrew Dinn wrote: > I guess I probably ought to review this. All the code changes look > sensible and appear correct by eyeball. Whether they are really needed > or, indeed, are /all/ that is needed is far from obvious. I could at > least build the tree and test that it runs ok. Are you able to provide > any special instructions needed to achieve that? (esp the latter). It'll be tricky without the Graal patches which are needed to make things run. Andrew. From doug.simon at oracle.com Mon Nov 21 15:54:27 2016 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 21 Nov 2016 16:54:27 +0100 Subject: RFR: 8170106: AArch64: Multiple JVMCI issues In-Reply-To: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> References: <406b6c54-397d-054b-5b5c-a82d12b87517@redhat.com> Message-ID: <29BBCC34-9BF6-41C4-BE14-6ED90639A7E6@oracle.com> (including hotspot-compiler-dev) > On 21 Nov 2016, at 16:22, Andrew Haley wrote: > > JVMCI nearly works, but there are multiple minor bugs which make it > non-functional. It's not possible to separate these into multiple > issues, so this is a composite patch. > > The handling of some relocs is wrong. > Narrow klasses and OOPs have only partial support, returning Unimplemented() > Register numbering for float registers is wrong > Scratch registers r8 and r9 aren't marked as non-allocatable. > > http://cr.openjdk.java.net/~aph/8170106 > > Andrew. 
From kirill.zhaldybin at oracle.com Mon Nov 21 16:38:37 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 21 Nov 2016 19:38:37 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> Message-ID: <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> Marcus, Thank you for reviewing the fix! >> WebRev: >> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ > > ISO8601 says the decimal point can be either '.' or ',' so the test > should accept either. You could let sscanf read out the decimal point > as a character and just verify that it is one of the two. > > In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that > we won't accept "Z" suffixed strings. Please revert that. I agree that ISO8601 allows adding "Z" to the time (and as far as I understand date/time without delimiters is legal too) but these are unit tests. Hence they cover the existing code and should pass only if the result corresponds to the existing code, and fail otherwise. The current code in os::iso8601_time formats the date/time string as %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d, so we should not consider any other format as valid. Could you please let me know your opinion? Thank you. Regards, Kirill > > Thanks, > Marcus > >> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >> >> Thank you. >> >> Regards, Kirill > From rwestrel at redhat.com Mon Nov 21 17:24:47 2016 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 21 Nov 2016 18:24:47 +0100 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: > http://cr.openjdk.java.net/~aph/8170098/ Looks good to me. Roland. 
From rwestrel at redhat.com Mon Nov 21 17:25:56 2016 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 21 Nov 2016 18:25:56 +0100 Subject: RFR: 8170100: AArch64: Crash in C1-compiled code accessing References In-Reply-To: References: Message-ID: > http://cr.openjdk.java.net/~aph/8170100/ That looks good to me. Roland. From vladimir.kozlov at oracle.com Mon Nov 21 17:48:39 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 21 Nov 2016 09:48:39 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <58329B05.6070602@oracle.com> References: <58329B05.6070602@oracle.com> Message-ID: On 11/20/16 10:58 PM, Ioi Lam wrote: > https://bugs.openjdk.java.net/browse/JDK-8169867 > http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ Looks good to me. > > > Thanks to Tobias for finding the bug. I have done the following > > + integrated Tobias' suggested fix > + fixed Method::restore_unshareable_info to call Method::link_method > + added comments and a diagram to illustrate how the CDS method entry > trampolines work. > > BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. > It's basically an extra level of indirection to get to the adapter. > However, > the word "trampoline" is usually used for an extra jump in executable > code, > so it may be a little confusing when we use it for a data pointer here. > > Any suggestion for a better name? _adapter_cds_entry ? Thanks, Vladimir > > > Testing: > [1] I have tested Tobias' TestInterpreterMethodEntries.java class and > now it produces the correct assertion. I won't check in this test, > though, > since it won't assert anymore after Tobias fixes 8169711. 
> > # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error > (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), > pid=16840, tid=16843 > # assert(entry != __null && entry == _i2i_entry && entry == > _from_interpreted_entry) failed: > # should be correctly set during dump time > > [2] Ran RBT in fastdebug build for > hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist > All tests passed. > > Thanks > - Ioi > From dmitry.samersoff at oracle.com Mon Nov 21 18:46:53 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 21 Nov 2016 21:46:53 +0300 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: Andrew, Should the code in MethodHandles::jump_from_method_handle() be changed as well? -Dmitry On 2016-11-21 18:15, Andrew Haley wrote: > JavaThread::interp_only_mode is a 32-bit sized field. In the assembly code we read it as a 64-bit xword, causing false positives. This means that as soon as we attach a JVMTI debugger everything runs very slowly. > > http://cr.openjdk.java.net/~aph/8170098/ > -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From aph at redhat.com Mon Nov 21 19:00:37 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 21 Nov 2016 19:00:37 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> Message-ID: <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> Hi, On 21/11/16 18:46, Dmitry Samersoff wrote: > Should the code in MethodHandles::jump_from_method_handle() be changed > as well? I can't see where. 
We don't seem to be calling a native function in there. Can you tell me more about the code path you have in mind? Thanks, Andrew. From dmitry.samersoff at oracle.com Mon Nov 21 19:17:10 2016 From: dmitry.samersoff at oracle.com (Dmitry Samersoff) Date: Mon, 21 Nov 2016 22:17:10 +0300 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> Message-ID: <42c88db7-2bc6-f0e4-5090-a56e53c28eab@oracle.com> On 2016-11-21 22:00, Andrew Haley wrote: > Hi, > > On 21/11/16 18:46, Dmitry Samersoff wrote: > >> Should the code in MethodHandles::jump_from_method_handle() be changed >> as well? > > I can't see where. We don't seem to be calling a native function in > there. > Can you tell me more about the code path you have in mind? methodHandles_aarch64.cpp:106 __ ldrb(rscratch1, Address(rthread, JavaThread::interp_only_mode_offset())); __ cbnz(rscratch1, run_compiled_code); -Dmitry -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources. From gromero at linux.vnet.ibm.com Tue Nov 22 00:27:10 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:27:10 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> Message-ID: <583390DE.5050406@linux.vnet.ibm.com> Hi Joe, On 17-11-2016 19:33, joe darcy wrote: >>>> Currently, optimization for building fdlibm is disabled, except for the >>>> "solaris" OS target [1]. 
>>> The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the >>> Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. >> oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm >> optimization is off even for x86_64 as it, AFAICS regarding gcc 5 only, does >> not affect the precision, even if setting -O3 does not improve the performance >> as much as on PPC64. > The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values > of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul > of these fdlibm coding practices. On discussing with the Power toolchain folks we narrowed down the issue on PPC64 to the FMA. -fno-strict-aliasing has no effect and when used with an aggressive optimization does not solve the issue on precision. Thus -ffp-contract=off is the best option we have for now to optimize the fdlibm on PPC64. >>> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases. >> I agree. It's just that the issue on StrictMath methods was first noted due to >> that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64. > Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. 
In particular, if > Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that uses platform-specific features (say, fused multiply-add instructions) then just compiling fdlibm more aggressively wouldn't > necessarily make up that gap. In our case (PPC64) it does close the gap. Non-optimized code will suffer a lot, for instance, from load-hit-store issues. Contrary to what happens on PPC64, the gap on x64 seems to be quite small as you said. > > To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were > to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior of all those methods, but such an approach was much less tractable technically 20+ years ago > without benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. > >> >> >>> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking to see if the optimized versions are indeed equivalent to the non-optimized ones. >>> The specification of StrictMath requires a particular result for each set of floating-point arguments and sums round away low-order bits that differ. >> That's a really good point, thanks for letting me know about that. I'll re-test my >> change under that perspective. >> >> >>> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area. >> Got it. By "the JDK math library regression tests" you mean exactly which test >> suite? the jtreg tests? > Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where 
> > A note on methodologies, when I've been writing test for my port I've tried to include test cases that exercise all the branches point in the code. Due to the large input space (~2^64 for a > single-argument method), random sampling alone is an inefficient way to try to find differences in behavior. >> For testing against JCK/TCK I'll need some help on that. >> > > I believe the JCK/TCK does have additional testcases relevant here. > > HTH; thanks, > > -Joe > Thank you very much for the valuable comments. I'll send a webrev accordingly for review. I filed a bug: https://bugs.openjdk.java.net/browse/JDK-8170153 Best regards, Gustavo From gromero at linux.vnet.ibm.com Tue Nov 22 00:34:37 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:34:37 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> Message-ID: <5833929D.9000602@linux.vnet.ibm.com> Hi Chris, On 17-11-2016 19:48, Chris Plummer wrote: >> The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume >> values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would >> run afoul of these fdlibm coding practices. > This is the strict aliasing issue right? It's a long standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more > than 12 years since I last dealt with fdlibm and compiler aliasing issues. 
I've tested with -O3 and -fno-strict-aliasing as you suggested but it did not fix the fp precision issue on PPC64. After finding that -fno-expensive-optimizations solved the problem, we narrowed down the problem to the FMA: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 Thank you. Regards, Gustavo From gromero at linux.vnet.ibm.com Tue Nov 22 00:41:34 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:41:34 -0200 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> Message-ID: <5833943E.9010807@linux.vnet.ibm.com> Hi Derek, On 17-11-2016 20:47, White, Derek wrote: > Hi Joe, > > Although neither a floating point expert (as I think I've proven to you over the years), nor a gcc expert, I checked with our in-house gcc expert and got the following answer: > > "Yes using -fno-strict-aliasing fixes the issues. Also there are many forks of fdlibm which has this fixed including the code inside glibc. " I've tried -O3 and -fno-strict-aliasing on PPC64 but it didn't work. Disabling the FMA fixed the issue, though. Do you know if the gap between Math and StrictMath is also huge on aarch64? Thank you. 
Regards, Gustavo From joe.darcy at oracle.com Tue Nov 22 00:42:10 2016 From: joe.darcy at oracle.com (joe darcy) Date: Mon, 21 Nov 2016 16:42:10 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5833929D.9000602@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <5a4a0e96-71b5-247b-60ea-5eb518d2162b@oracle.com> <5833929D.9000602@linux.vnet.ibm.com> Message-ID: <8d21cafc-a4f5-0ed7-1f8f-4c40ccf4ecbe@oracle.com> Hello, On 11/21/2016 4:34 PM, Gustavo Romero wrote: > Hi Chris, > > On 17-11-2016 19:48, Chris Plummer wrote: >>> The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume >>> values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would >>> run afoul of these fdlibm coding practices. >> This is the strict aliasing issue, right? It's a long-standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more >> than 12 years since I last dealt with fdlibm and compiler aliasing issues. > I've tested with -O3 and -fno-strict-aliasing as you suggested but it did not > fix the fp precision issue on PPC64. > > After finding that -fno-expensive-optimizations solved the problem, we narrowed > down the problem to the FMA: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > That makes sense; an FMA will by its nature provide different results than separate (unfused) multiply and add operations. While the polynomials used in fdlibm would benefit performance-wise from implicit replacement with FMA, such a replacement would violate the StrictMath contract. 
Therefore, if FDLIBM is left in C sources, it must be compiled in such a way that FMA is *not* substituted for multiply and add. Thanks, -Joe From gromero at linux.vnet.ibm.com Tue Nov 22 00:43:49 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 21 Nov 2016 22:43:49 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation Message-ID: <583394C5.3030206@linux.vnet.ibm.com> Hi, Could the following change be reviewed, please? webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ bug: https://bugs.openjdk.java.net/browse/JDK-8170153 It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds up the StrictMath methods (in some cases up to 3x) on that platform. On PPC64 fdlibm optimization can be done without precision issues if floating-point expression contraction is disabled, i.e. if the compiler does not use floating-point multiply-add (FMA). For further details please refer to gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 No regression was observed on Math and StrictMath tests: Passed: java/lang/Math/AbsPositiveZero.java Passed: java/lang/Math/Atan2Tests.java Passed: java/lang/Math/CeilAndFloorTests.java Passed: java/lang/Math/CubeRootTests.java Passed: java/lang/Math/DivModTests.java Passed: java/lang/Math/ExactArithTests.java Passed: java/lang/Math/Expm1Tests.java Passed: java/lang/Math/FusedMultiplyAddTests.java Passed: java/lang/Math/HyperbolicTests.java Passed: java/lang/Math/HypotTests.java Passed: java/lang/Math/IeeeRecommendedTests.java Passed: java/lang/Math/Log10Tests.java Passed: java/lang/Math/Log1pTests.java Passed: java/lang/Math/MinMax.java Passed: java/lang/Math/MultiplicationTests.java Passed: java/lang/Math/PowTests.java Passed: java/lang/Math/Rint.java Passed: java/lang/Math/RoundTests.java Passed: java/lang/Math/SinCosCornerCasesTests.java Passed: java/lang/Math/TanTests.java Passed: 
java/lang/Math/WorstCaseTests.java Test results: passed: 21 Passed: java/lang/StrictMath/CubeRootTests.java Passed: java/lang/StrictMath/ExactArithTests.java Passed: java/lang/StrictMath/Expm1Tests.java Passed: java/lang/StrictMath/HyperbolicTests.java Passed: java/lang/StrictMath/HypotTests.java Passed: java/lang/StrictMath/Log10Tests.java Passed: java/lang/StrictMath/Log1pTests.java Passed: java/lang/StrictMath/PowTests.java Test results: passed: 8 and also on the following hotspot tests: Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java Passed: compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java Passed: compiler/intrinsics/mathexact/AddExactICondTest.java Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java Passed: compiler/intrinsics/mathexact/CompareTest.java Passed: compiler/intrinsics/mathexact/DecExactITest.java Passed: 
compiler/intrinsics/mathexact/DecExactLTest.java Passed: compiler/intrinsics/mathexact/GVNTest.java Passed: compiler/intrinsics/mathexact/IncExactITest.java Passed: compiler/intrinsics/mathexact/IncExactLTest.java Passed: compiler/intrinsics/mathexact/MulExactICondTest.java Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactILoadTest.java Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java Passed: compiler/intrinsics/mathexact/SubExactICondTest.java Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java Test results: passed: 50 Thank you. 
Regards, Gustavo From chris.plummer at oracle.com Tue Nov 22 01:33:08 2016 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 21 Nov 2016 17:33:08 -0800 Subject: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583390DE.5050406@linux.vnet.ibm.com> References: <582D0BCE.2030209@linux.vnet.ibm.com> <582DF764.70504@linux.vnet.ibm.com> <583390DE.5050406@linux.vnet.ibm.com> Message-ID: On 11/21/16 4:27 PM, Gustavo Romero wrote: > Hi Joe, > > On 17-11-2016 19:33, joe darcy wrote: >>>>> Currently, optimization for building fdlibm is disabled, except for the >>>>> "solaris" OS target [1]. >>>> The reason for that is because historically the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the >>>> Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline. >>> oh, I see. Thanks for clarifying that. I was exactly wondering why fdlibm >>> optimization is off even for x86_64 as it, AFAICS regarding gcc 5 only, does >>> not affect the precision, even if setting -O3 does not improve the performance >>> as much as on PPC64. >> The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values >> of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul >> of these fdlibm coding practices. > On discussing with the Power toolchain folks we narrowed down the issue on PPC64 > to the FMA. -fno-strict-aliasing has no effect and when used with an aggressive > optimization does not solve the issue on precision. Thus -ffp-contract=off is > the best option we have for now to optimize the fdlibm on PPC64. Ah! 
I should have thought of this. I dealt with fdlibm FMA issues on ppc about 15 years ago. At the time -mno-fused-madd was the solution. I don't think -ffp-contract=off existed back then. Chris > > >>>> Methods in the Math class, such as pow, are often intrinsified and use a different algorithm so a straight performance comparison may not be as fair or meaningful in those cases. >>> I agree. It's just that the issue on StrictMath methods was first noted due to >>> that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64. >> Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. In particular, if >> Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that used platform-specific features (say fused multiply add instructions) then just compiling fdlibm more aggressively wouldn't >> necessarily make up that gap. > In our case (PPC64) it does close the gap. Non-optimized code will suffer a lot, > for instance, from load-hit-store issues. Contrary to what happens on PPC64, the > gap on x64 seems to be quite small as you said. > > >> To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were >> to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior of all those methods, but such an approach was much less tractable technically 20+ years ago >> without the benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/. >> >>> >>>> Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking to see if the optimized versions are indeed equivalent to the non-optimized ones. 
>>>> The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums round away low-order bits that differ. >>> That's a really good point, thanks for letting me know about that. I'll re-test my >>> change under that perspective. >>> >>> >>>> Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area. >>> Got it. By "the JDK math library regression tests" you mean exactly which test >>> suite? The jtreg tests? >> Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where >> they are offhand. >> >> A note on methodology: when I've been writing tests for my port I've tried to include test cases that exercise all the branch points in the code. Due to the large input space (~2^64 for a >> single-argument method), random sampling alone is an inefficient way to try to find differences in behavior. >>> For testing against JCK/TCK I'll need some help on that. >>> >> I believe the JCK/TCK does have additional testcases relevant here. >> >> HTH; thanks, >> >> -Joe >> > Thank you very much for the valuable comments. > > I'll send a webrev accordingly for review. > > I filed a bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > > Best regards, > Gustavo > From jiangli.zhou at Oracle.COM Tue Nov 22 04:33:02 2016 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Mon, 21 Nov 2016 20:33:02 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <58329B05.6070602@oracle.com> References: <58329B05.6070602@oracle.com> Message-ID: <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> Hi Ioi, Looks good. I also have one suggestion. To make it a little easier to read, could you please fold the 'else if (_i2i_entry != NULL)' block starting at line 1039 and 'if (!is_shared())' 
block starting at line 1047 into one block? 1032 if (is_shared()) { 1033 address entry = Interpreter::entry_for_cds_method(h_method); 1034 assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, 1035 "should be correctly set during dump time"); 1036 if (adapter() != NULL) { 1037 return; 1038 } 1039 } else if (_i2i_entry != NULL) { 1040 return; 1041 } 1042 assert( _code == NULL, "nothing compiled yet" ); 1043 1044 // Setup interpreter entrypoint 1045 assert(this == h_method(), "wrong h_method()" ); 1046 1047 if (!is_shared()) { 1048 assert(adapter() == NULL, "init'd to NULL"); 1049 address entry = Interpreter::entry_for_method(h_method); 1050 assert(entry != NULL, "interpreter entry must be non-null"); 1051 // Sets both _i2i_entry and _from_interpreted_entry 1052 set_interpreter_entry(entry); 1053 } Thanks, Jiangli > On Nov 20, 2016, at 10:58 PM, Ioi Lam wrote: > > https://bugs.openjdk.java.net/browse/JDK-8169867 > http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ > > Thanks to Tobias for finding the bug. I have done the following > > + integrated Tobias' suggested fix > + fixed Method::restore_unshareable_info to call Method::link_method > + added comments and a diagram to illustrate how the CDS method entry > trampolines work. > > BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. > It's basically an extra level of indirection to get to the adapter. However, > the word "trampoline" is usually used for an extra jump in executable code, > so it may be a little confusing when we use it for a data pointer here. > > Any suggestions for a better name? > > > Testing: > [1] I have tested Tobias' TestInterpreterMethodEntries.java class and > now it produces the correct assertion. I won't check in this test, though, > since it won't assert anymore after Tobias fixes 8169711. 
> > # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 > # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: > # should be correctly set during dump time > > [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist > All tests passed. > > Thanks > - Ioi > From ioi.lam at oracle.com Tue Nov 22 07:05:42 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 21 Nov 2016 23:05:42 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> References: <58329B05.6070602@oracle.com> <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> Message-ID: <5833EE46.8020304@oracle.com> On 11/21/16 8:33 PM, Jiangli Zhou wrote: > Hi Ioi, > > Looks good. > > I also have one suggestion. To make it a little easier to read, > could you please fold the 'else if (_i2i_entry != NULL)' block > starting at line 1039 and 'if (!is_shared())' block starting at line > 1047 into one block? 
> 1032 if (is_shared()) { > 1033 address entry = Interpreter::entry_for_cds_method(h_method); > 1034 assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, > 1035 "should be correctly set during dump time"); > 1036 if (adapter() != NULL) { > 1037 return; > 1038 } > 1039 } else if (_i2i_entry != NULL) { > 1040 return; > 1041 } > 1042 assert( _code == NULL, "nothing compiled yet" ); > 1043 > 1044 // Setup interpreter entrypoint > 1045 assert(this == h_method(), "wrong h_method()" ); > 1046 > 1047 if (!is_shared()) { > 1048 assert(adapter() == NULL, "init'd to NULL"); > 1049 address entry = Interpreter::entry_for_method(h_method); > 1050 assert(entry != NULL, "interpreter entry must be non-null"); > 1051 // Sets both _i2i_entry and _from_interpreted_entry > 1052 set_interpreter_entry(entry); > 1053 } Hi Jiangli, The line assert( _code == NULL, "nothing compiled yet" ); is necessary before we call set_interpreter_entry(entry); That's because the _from_interpreted_entry would be different if the method has been compiled. So this means I cannot simply move the block starting at #1047 to above #1042. Thanks - Ioi > Thanks, > Jiangli > >> On Nov 20, 2016, at 10:58 PM, Ioi Lam wrote: >> >> https://bugs.openjdk.java.net/browse/JDK-8169867 >> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >> >> Thanks to Tobias for finding the bug. I have done the following >> >> + integrated Tobias' suggested fix >> + fixed Method::restore_unshareable_info to call Method::link_method >> + added comments and a diagram to illustrate how the CDS method entry >> trampolines work. >> >> BTW, I am a little unhappy about the name >> ConstMethod::_adapter_trampoline. >> It's basically an extra level of indirection to get to the adapter. >> However, >> the word "trampoline" is usually used for an extra jump in >> executable code, >> so it may be a little confusing when we use it for a data pointer here. 
>> >> Any suggestions for a better name? >> >> >> Testing: >> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >> now it produces the correct assertion. I won't check in this test, >> though, >> since it won't assert anymore after Tobias fixes 8169711. >> >> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error >> (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), >> pid=16840, tid=16843 >> # assert(entry != __null && entry == _i2i_entry && entry == >> _from_interpreted_entry) failed: >> # should be correctly set during dump time >> >> [2] Ran RBT in fastdebug build for >> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >> All tests passed. >> >> Thanks >> - Ioi >> > From aph at redhat.com Tue Nov 22 10:08:39 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 22 Nov 2016 10:08:39 +0000 Subject: RFR: 8170098: AArch64: VM is extremely slow with JVMTI debugging enabled In-Reply-To: <42c88db7-2bc6-f0e4-5090-a56e53c28eab@oracle.com> References: <4a1f1fd6-5e21-940f-450c-e4ac68abe469@redhat.com> <44a8064c-f247-bca1-f887-75260bb8458f@redhat.com> <42c88db7-2bc6-f0e4-5090-a56e53c28eab@oracle.com> Message-ID: On 21/11/16 19:17, Dmitry Samersoff wrote: > On 2016-11-21 22:00, Andrew Haley wrote: >> >> On 21/11/16 18:46, Dmitry Samersoff wrote: >> >>> Should the code in MethodHandles::jump_from_method_handle() be changed >>> as well? >> >> I can't see where. We don't seem to be calling a native function in >> there. >> Can you tell me more about the code path you have in mind? > > methodHandles_aarch64.cpp:106 > > __ ldrb(rscratch1, Address(rthread, JavaThread::interp_only_mode_offset())); > > __ cbnz(rscratch1, run_compiled_code); Oh, I see. I guess it would have been a good idea for me to change this, but unless we see a big-endian ARM it doesn't matter. Thanks, Andrew. 
From trevor.d.watson at oracle.com Tue Nov 22 10:25:26 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Tue, 22 Nov 2016 10:25:26 +0000 Subject: Ping: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: References: Message-ID: <42f837bb-bb59-1dab-14fc-578cd95de101@oracle.com> On 15/11/16 11:57, Trevor Watson wrote: > I have implemented the code to use the lzcnt instruction for both > integer and long countLeadingZeros() methods on SPARC platforms > supporting the vis3 instruction set. > > Current "bmi" tests for the above are updated so that they run on both > SPARC and x86 platforms. > > I've also implemented a test to ensure that Integer.countLeadingZeros() > and Long.countLeadingZeros() return the correct values when C2 runs. > This test is currently under the intrinsics "bmi" tests for want of > somewhere better (they do apply to both SPARC and x86 though). > > http://cr.openjdk.java.net/~alanbur/8162865/ > > Thanks, > Trevor From marcus.larsson at oracle.com Tue Nov 22 12:32:21 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Tue, 22 Nov 2016 13:32:21 +0100 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> Message-ID: Hi, On 2016-11-21 17:38, Kirill Zhaldybin wrote: > Marcus, > > Thank you for reviewing the fix! >>> WebRev: >>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >> >> ISO8601 says the decimal point can be either '.' or ',' so the test >> should accept either. You could let sscanf read out the decimal point >> as a character and just verify that it is one of the two. >> >> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that >> we won't accept "Z" suffixed strings. Please revert that. 
> I agree that ISO8601 could add "Z" to time (and as far as I understand > date/time without delimiters is legal too) but these are the unit tests. > Hence they cover the existing code and they should pass only if the > result corresponds to existing code and fail otherwise. > The current code from os::iso8601_time format date/time string > %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not consider > any other format as valid. > > Could you please let me know your opinion? I think the test should verify the intended behavior, not the implementation. If we refactor or change something in iso8601_time() we shouldn't be failing the test if it still conforms to ISO8601, IMO. Thanks, Marcus > > Thank you. > > Regards, Kirill > >> >> Thanks, >> Marcus >> >>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>> >>> Thank you. >>> >>> Regards, Kirill >> > From kirill.zhaldybin at oracle.com Tue Nov 22 13:24:07 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Tue, 22 Nov 2016 16:24:07 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> Message-ID: <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> Marcus, Thank you for prompt reply! Could you please read comments inline? I'm looking forward to your reply. Thank you. Regards, Kirill On 22.11.2016 15:32, Marcus Larsson wrote: > Hi, > > > On 2016-11-21 17:38, Kirill Zhaldybin wrote: >> Marcus, >> >> Thank you for reviewing the fix! >>>> WebRev: >>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>> >>> ISO8601 says the decimal point can be either '.' or ',' so the test >>> should accept either. You could let sscanf read out the decimal >>> point as a character and just verify that it is one of the two. 
>>> >>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means that >>> we won't accept "Z" suffixed strings. Please revert that. >> I agree that ISO8601 could add "Z" to time (and as far as I >> understand date/time without delimiters is legal too) but these are >> the unit tests. >> Hence they cover the existing code and they should pass only if the >> result corresponds to existing code and fail otherwise. >> The current code from os::iso8601_time format date/time string >> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >> consider any other format as valid. >> >> Could you please let me know your opinion? > > I think the test should verify the intended behavior, not the > implementation. If we refactor or change something in iso8601_time() > we shouldn't be failing the test if it still conforms to ISO8601, IMO. I would agree with you if we were talking about a functional test. But since it is an unit test I think we should keep it as close to implementation as possible. If the implementation is changed unintentionally the test fails and signals us that something is broken. If it is an intentional change the test must be updated correspondingly. > > Thanks, > Marcus > >> >> Thank you. >> >> Regards, Kirill >> >>> >>> Thanks, >>> Marcus >>> >>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>> >> > From stefan.karlsson at oracle.com Tue Nov 22 14:54:55 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 15:54:55 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist Message-ID: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Hi all, Please, review this patch to fix a bug in ChunkManager::list_index(): http://cr.openjdk.java.net/~stefank/8169931/webrev.01 There's a great description of the bug in the bug report: https://bugs.openjdk.java.net/browse/JDK-8169931 There are two conceptual parts of the metaspace. 
The _class_ metaspace, and the _non-class_ metaspace. They have different chunk sizes, and while querying for the list index of a humongous chunk in the class metaspace, the code accidentally matched the size against the MediumChunk size of the non-class metaspace. I've changed the code to not query against the global ChunkSizes enum, but rather the values stored inside the ChunkManager instances. Therefore, the list_index() function was changed into an instance method. I've written a unit test that provoked the bug. It's a simplified test with vm asserts instead of gtest asserts. The reason is that the ChunkManager class is currently located in metaspace.cpp, and is not accessible from the gtest unit tests. Testing: jprt, Kitchensink, parallel class loading tests Thanks, StefanK From mikael.gerdin at oracle.com Tue Nov 22 16:08:24 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 22 Nov 2016 17:08:24 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: <41dc02b9-edee-0060-07c4-2fd8220cbda7@oracle.com> Hi Stefan, On 2016-11-22 15:54, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 The change looks good to me. /Mikael > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ metaspace, > and the _non-class_ metaspace. They have different chunk sizes, and > while querying for the list index of a humongous chunk in the class > metaspace, the code accidentally matched the size against the > MediumChunk size of the non-class metaspace. 
> > I've changed the code to not query against the global ChunkSizes enum, > but rather the values stored inside the ChunkManager instances. > Therefore, the list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK From thomas.stuefe at gmail.com Tue Nov 22 17:09:12 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 22 Nov 2016 18:09:12 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: Hi Stefan, this change looks good! Small nitpick: there already exists a function returning a pointer to the free list by chunk index (ChunkManager::free_chunks(index)). You could have implemented ChunkManager::list_chunk_size() using this function (return free_chunks(index)->size()) and add your assert to ChunkManager::free_chunks(index) instead. Or, alternatively, just use free_chunks(index)->size() directly instead of adding list_chunk_size(). Kind Regards, Thomas On Tue, Nov 22, 2016 at 3:54 PM, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ metaspace, > and the _non-class_ metaspace. 
They have different chunk sizes, and while > querying for the list index of a humongous chunk in the class metaspace, > the code accidentally matched the size against the MediumChunk size of the > non-class metaspace. > > I've changed the code to not query against the global ChunkSizes enum, but > rather the values stored inside the ChunkManager instances. Therefore, the > list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK > From jiangli.zhou at oracle.com Tue Nov 22 17:55:51 2016 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Tue, 22 Nov 2016 09:55:51 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <5833EE46.8020304@oracle.com> References: <58329B05.6070602@oracle.com> <02D3833E-5A5A-4E7E-8E87-0F537FB253CB@oracle.com> <5833EE46.8020304@oracle.com> Message-ID: <70FCC942-9363-489D-8CDC-6441CB246953@oracle.com> Hi Ioi, > On Nov 21, 2016, at 11:05 PM, Ioi Lam wrote: > > > > On 11/21/16 8:33 PM, Jiangli Zhou wrote: >> Hi Ioi, >> >> Looks good. >> >> I also have one suggestion. To make it a little easier to read, could you please fold the 'else if (_i2i_entry != NULL)' block starting at line 1039 and 'if (!is_shared())' block starting at line 1047 into one block? 
>> 1032 if (is_shared()) { >> 1033 address entry = Interpreter::entry_for_cds_method(h_method); >> 1034 assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, >> 1035 "should be correctly set during dump time"); >> 1036 if (adapter() != NULL) { >> 1037 return; >> 1038 } >> 1039 } else if (_i2i_entry != NULL) { >> 1040 return; >> 1041 } >> 1042 assert( _code == NULL, "nothing compiled yet" ); >> 1043 >> 1044 // Setup interpreter entrypoint >> 1045 assert(this == h_method(), "wrong h_method()" ); >> 1046 >> 1047 if (!is_shared()) { >> 1048 assert(adapter() == NULL, "init'd to NULL"); >> 1049 address entry = Interpreter::entry_for_method(h_method); >> 1050 assert(entry != NULL, "interpreter entry must be non-null"); >> 1051 // Sets both _i2i_entry and _from_interpreted_entry >> 1052 set_interpreter_entry(entry); >> 1053 } > > Hi Jiangli, > > The line > > assert( _code == NULL, "nothing compiled yet" ); > > is necessary before we call > > set_interpreter_entry(entry); > > That's because the _from_interpreted_entry would be different if the method has been compiled. > > So this means I cannot simply move the block starting at #1047 to above #1042. Ok. Thanks, Jiangli > > Thanks > - Ioi > >> Thanks, >> Jiangli >> >>> On Nov 20, 2016, at 10:58 PM, Ioi Lam > wrote: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>> >>> Thanks to Tobias for finding the bug. I have done the following >>> >>> + integrated Tobias' suggested fix >>> + fixed Method::restore_unshareable_info to call Method::link_method >>> + added comments and a diagram to illustrate how the CDS method entry >>> trampolines work. >>> >>> BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. >>> It's basically an extra level of indirection to get to the adapter. However. 
>>> The word "trampoline" is usually used for an extra jump in executable code, >>> so it may be a little confusing when we use it for a data pointer here. >>> >>> Any suggestions for a better name? >>> >>> >>> Testing: >>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>> now it produces the correct assertion. I won't check in this test, though, >>> since it won't assert anymore after Tobias fixes 8169711. >>> >>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 >>> # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: >>> # should be correctly set during dump time >>> >>> [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>> All tests passed. >>> >>> Thanks >>> - Ioi >>> >> > From coleen.phillimore at oracle.com Tue Nov 22 19:30:24 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 22 Nov 2016 14:30:24 -0500 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: Can you put this test at the end of the file with // Unit Tests and an explanation why this is here so people don't try to port the whole thing to gtest? I was looking for uses of list_index and found this code, which looks wrong: assert((word_size <= chunk->word_size()) || list_index(chunk->word_size() == HumongousIndex), "Non-humongous variable sized chunk"); This change looks good though. 
Thanks, Coleen On 11/22/16 9:54 AM, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different chunk > sizes, and while querying for the list index of a humongous chunk in > the class metaspace, the code accidentally matched the size against > the MediumChunk size of the non-class metaspace. > > I've changed the code to not query against the global ChunkSizes enum, > but rather the values stored inside the ChunkManager instances. > Therefore, the list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK From vladimir.kozlov at oracle.com Tue Nov 22 20:04:09 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 22 Nov 2016 12:04:09 -0800 Subject: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: References: Message-ID: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> Hi Trevor, Do you have performance numbers? UseVIS is too broad a flag to control only the generation of these instructions. To be consistent with the x86 code, please add a UseCountLeadingZerosInstruction flag to globals_sparc.hpp and its setting in vm_version_sparc.cpp (based on has_vis3()), similar to what is done for x86. Maybe name the new instructions *ZerosIvis instead of *ZerosI1 to make it clear that VIS is used. Indentation in the new test is all over the place. Please fix. 
Thanks, Vladimir On 11/15/16 3:57 AM, Trevor Watson wrote: > I have implemented the code to use the lzcnt instruction for both > integer and long countLeadingZeros() methods on SPARC platforms > supporting the vis3 instruction set. > > Current "bmi" tests for the above are updated so that they run on both > SPARC and x86 platforms. > > I've also implemented a test to ensure that Integer.countLeadingZeros() > and Long.countLeadingZeros() return the correct values when C2 runs. > This test is currently under the intrinsics "bmi" tests for want of > somewhere better (they do apply to both SPARC and x86 though). > > http://cr.openjdk.java.net/~alanbur/8162865/ > > Thanks, > Trevor From stefan.karlsson at oracle.com Tue Nov 22 21:05:10 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:05:10 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: <9f67e983-a058-ad27-e0ce-2300e6bb8a56@oracle.com> Hi Coleen, On 2016-11-22 20:30, Coleen Phillimore wrote: > > Can you put this test at the end of the file with // Unit Tests and an > explanation why this is here so people don't try to port the whole > thing to gtest? Sure. > > I was looking for uses of list_index and found this code, which looks > wrong: > > assert((word_size <= chunk->word_size()) || > list_index(chunk->word_size() == HumongousIndex), > "Non-humongous variable sized chunk"); I'll fix that assert. > > > This change looks good though. Thanks, I'll send out a new patch. StefanK > > Thanks, > Coleen > > On 11/22/16 9:54 AM, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. 
The _class_ >> metaspace, and the _non-class_ metaspace. They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes >> enum, but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified >> test with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > From stefan.karlsson at oracle.com Tue Nov 22 21:06:09 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:06:09 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: Hi Thomas, On 2016-11-22 18:09, Thomas Stüfe wrote: > Hi Stefan, > > this change looks good! Thanks! > > Small nitpick, there already exists a function returning a pointer to > the free list by chunk index (ChunkManager::free_chunks(index)). You > could have implemented ChunkManager::list_chunk_size() using this > function (return free_chunks(index)->size()) and add your assert to > ChunkManager::free_chunks(index) instead. Or, alternatively, just use > free_chunks(index)->size() directly instead of adding list_chunk_size(). Sure. I'll send out a new patch including your suggestion. 
Thanks, StefanK > > Kind Regards, Thomas > > > On Tue, Nov 22, 2016 at 3:54 PM, Stefan Karlsson > > wrote: > > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different > chunk sizes, and while querying for the list index of a humongous > chunk in the class metaspace, the code accidentally matched the > size against the MediumChunk size of the non-class metaspace. > > I've changed the code to not query against the global ChunkSizes > enum, but rather the values stored inside the ChunkManager > instances. Therefore, the list_index() function was changed into > an instance method. > > I've written a unit test that provoked the bug. It's a simplified > test with vm asserts instead of gtest asserts. The reason is that > the ChunkManager class is currently located in metaspace.cpp, and > is not accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK > > From stefan.karlsson at oracle.com Tue Nov 22 21:06:29 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:06:29 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <41dc02b9-edee-0060-07c4-2fd8220cbda7@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <41dc02b9-edee-0060-07c4-2fd8220cbda7@oracle.com> Message-ID: <8a47ecc7-8195-8df8-df70-a2657cd683f8@oracle.com> Thanks, Mikael. 
StefanK On 2016-11-22 17:08, Mikael Gerdin wrote: > Hi Stefan, > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > The change looks good to me. > > /Mikael > >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ metaspace, >> and the _non-class_ metaspace. They have different chunk sizes, and >> while querying for the list index of a humongous chunk in the class >> metaspace, the code accidentally matched the size against the >> MediumChunk size of the non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified test >> with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. 
>> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Tue Nov 22 21:37:51 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 22 Nov 2016 22:37:51 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> Message-ID: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Hi all, Here is the updated patch, with the changes suggested by Coleen and Thomas: http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta http://cr.openjdk.java.net/~stefank/8169931/webrev.02 Changes to the previous patch: * Removed list_chunk_size and instead used free_chunks(index)->size() * Removed the const qualifier from list_index, since free_chunks isn't declared const. Fixing this would have been too large a change for this bug fix. * Moved ChunkManager_test_list_index into the unit test section of metaspace.cpp * Fixed a broken assert Thanks, StefanK On 2016-11-22 15:54, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix a bug in ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different chunk > sizes, and while querying for the list index of a humongous chunk in > the class metaspace, the code accidentally matched the size against > the MediumChunk size of the non-class metaspace. > > I've changed the code to not query against the global ChunkSizes enum, > but rather the values stored inside the ChunkManager instances. > Therefore, the list_index() function was changed into an instance method. > > I've written a unit test that provoked the bug. 
It's a simplified test > with vm asserts instead of gtest asserts. The reason is that the > ChunkManager class is currently located in metaspace.cpp, and is not > accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK From coleen.phillimore at oracle.com Tue Nov 22 22:48:30 2016 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Tue, 22 Nov 2016 17:48:30 -0500 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: <6b4f5c80-b74f-eda2-7f3a-1f6e4610bcba@oracle.com> Looks good! Thanks, Coleen On 11/22/16 4:37 PM, Stefan Karlsson wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for > this bug fix. > * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ >> metaspace, and the _non-class_ metaspace. 
They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes >> enum, but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified >> test with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > > From david.holmes at oracle.com Wed Nov 23 05:08:28 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 23 Nov 2016 15:08:28 +1000 Subject: Presentation: Understanding OrderAccess Message-ID: This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. 
http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf Cheers, David From erik.helin at oracle.com Wed Nov 23 07:09:28 2016 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 23 Nov 2016 08:09:28 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: <00efb9dd-6477-3ec4-590e-a1732d5af82f@oracle.com> On 11/22/2016 10:37 PM, Stefan Karlsson wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 Hey StefanK, thanks for taking care of this! The patch looks good to me, Reviewed. Thanks, Erik > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for this > bug fix. > * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ >> metaspace, and the _non-class_ metaspace. They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. 
>> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance >> method. >> >> I've written a unit test that provoked the bug. It's a simplified >> test with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > > From thomas.stuefe at gmail.com Wed Nov 23 07:42:12 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 23 Nov 2016 08:42:12 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: Hi Stefan, this looks fine! Thanks, Thomas On Tue, Nov 22, 2016 at 10:37 PM, Stefan Karlsson < stefan.karlsson at oracle.com> wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for this bug > fix. 
> * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: > >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ metaspace, >> and the _non-class_ metaspace. They have different chunk sizes, and while >> querying for the list index of a humongous chunk in the class metaspace, >> the code accidentally matched the size against the MediumChunk size of the >> non-class metaspace. >> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. Therefore, >> the list_index() function was changed into an instance method. >> >> I've written a unit test that provoked the bug. It's a simplified test >> with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. 
>> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK >> > > > From mikael.gerdin at oracle.com Wed Nov 23 09:42:26 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 23 Nov 2016 10:42:26 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: Hi Stefan, On 2016-11-22 22:37, Stefan Karlsson wrote: > Hi all, > > Here are the update patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 Updated webrev looks good to me as well. /Mikael > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks isn't > declared const. Fixing this would have been a too large change for this > bug fix. > * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix a bug in ChunkManager::list_index(): >> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >> >> There's a great description of the bug in the bug report: >> https://bugs.openjdk.java.net/browse/JDK-8169931 >> >> There are two conceptual parts of the metaspace. The _class_ >> metaspace, and the _non-class_ metaspace. They have different chunk >> sizes, and while querying for the list index of a humongous chunk in >> the class metaspace, the code accidentally matched the size against >> the MediumChunk size of the non-class metaspace. 
>> >> I've changed the code to not query against the global ChunkSizes enum, >> but rather the values stored inside the ChunkManager instances. >> Therefore, the list_index() function was changed into an instance method. >> >> I've written a unit test that provoked the bug. It's a simplified test >> with vm asserts instead of gtest asserts. The reason is that the >> ChunkManager class is currently located in metaspace.cpp, and is not >> accessible from the gtest unit tests. >> >> Testing: jprt, Kitchensink, parallel class loading tests >> >> Thanks, >> StefanK > > From aph at redhat.com Wed Nov 23 10:40:49 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 23 Nov 2016 10:40:49 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: On 23/11/16 05:08, David Holmes wrote: > This is a presentation I recently gave internally to the runtime and > serviceability teams that may be of more general interest to hotspot > developers. > > http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf That's pretty cool; nicely done. I'd quibble about a couple of minor things: In Data Race Example: Using Barriers, the use of a naked StoreStore is rather terrifying. In real-world code it'd be better to use StoreStore|LoadStore or release unless the author really knows what they're doing. The use of "fence" to mean a full barrier is rather idiosyncratic; it confused me the first time I saw it in HotSpot source, and from time to time it still does. But, as I said, these are minor criticisms. Andrew. 
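[Editor's note] Andrew's preference for a release barrier over a naked StoreStore can be illustrated with a short sketch. This is a hypothetical publisher/consumer written with portable C++11 atomics, not code taken from the presentation or from HotSpot's OrderAccess; the names `payload`, `published`, `publisher`, and `consumer` are invented for illustration:

```cpp
#include <atomic>
#include <cassert>

static int payload = 0;                    // plain data being published
static std::atomic<bool> published{false}; // publication flag

void publisher() {
  payload = 42;                                     // ordinary store
  published.store(true, std::memory_order_release); // release: the payload
                                                    // store is ordered before
                                                    // the flag store
}

int consumer() {
  if (published.load(std::memory_order_acquire)) {  // acquire pairs with the
    return payload;                                 // release above, so a true
  }                                                 // flag implies payload==42
  return -1;                                        // not yet published
}
```

Run on a single thread, publisher() followed by consumer() returns 42. Across threads, the release/acquire pair is what guarantees that a consumer that observes the flag also observes the payload; a bare StoreStore on the writer side orders the two stores but, without a matching read-side barrier, promises the reader nothing.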
From tobias.hartmann at oracle.com Wed Nov 23 11:42:00 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 23 Nov 2016 12:42:00 +0100 Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> Message-ID: <58358088.1090709@oracle.com> Hi Shafi, On 21.11.2016 07:29, Shafi Ahmad wrote: > Hi All, > > May I get the second review on this. > > I am putting together all the webrevs to make it simple for reviewer. > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ > http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ This looks good to me (not an 8u reviewer). Best regards, Tobias > > Please note that I tested with jprt, all jtreg and rbt tests. > > Regards, > Shafi > >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Wednesday, November 16, 2016 10:21 PM >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >> mismatched unsafe accesses >> >> Looks good. >> >> I would suggest to run all jtreg tests (or even RBT) when you apply all >> changes before pushing this. >> >> Thanks, >> Vladimir >> >> On 11/16/16 4:52 AM, Shafi Ahmad wrote: >>> Hi Vladimir, >>> >>> Thank you for the review and feedback. >>> >>> Please find updated webrevs: >>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => Removed >> the test case as it use only jdk9 APIs. 
>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => Removed >> test methods testFixedOffsetHeaderArray17() and >> testFixedOffsetHeader17() which referenced jdk9 API >> UNSAFE.getIntUnaligned. >>> >>> >>> Regards, >>> Shafi >>> >>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov >>>> Sent: Wednesday, November 16, 2016 1:00 AM >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces >>>> mismatched unsafe accesses >>>> >>>> Hi Shafi >>>> >>>> You should not backport tests which use only new JDK 9 APIs. Like >>>> TestUnsafeUnalignedMismatchedAccesses.java test. >>>> >>>> But it is perfectly fine to modify backport by removing part of >>>> changes which use a new API. For example, 8162101 changes in >>>> OpaqueAccesses.java test which use getIntUnaligned() method. >>>> >>>> It is unfortunate that 8140309 changes include also code which >>>> process new Unsafe Unaligned intrinsics from JDK 9. It should not be >>>> backported but it will simplify this and following backports. So I >>>> agree with changes you did for >>>> 8140309 backport. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 11/14/16 10:34 PM, Shafi Ahmad wrote: >>>>> Hi Vladimir, >>>>> >>>>> Thanks for the review. >>>>> >>>>>> -----Original Message----- >>>>> >>>>>> From: Vladimir Kozlov >>>>> >>>>>> Sent: Monday, November 14, 2016 11:20 PM >>>>> >>>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net >>>>> >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>>> produces >>>>> >>>>>> mismatched unsafe accesses >>>>> >>>>>> >>>>> >>>>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: >>>>> >>>>>>> Hi Vladimir, >>>>> >>>>>>> >>>>> >>>>>>> Thanks for the review. >>>>> >>>>>>> >>>>> >>>>>>> Please find updated webrevs. >>>>> >>>>>>> >>>>> >>>>>>> All webrevs are with respect to the base changes on JDK-8140309. 
>>>>> >>>>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ >>>>> >>>>>> >>>>> >>>>>> Why you kept unaligned parameter in changes? >>>>> >>>>> The fix of JDK-8136473 caused many problems after integration (see >>>>> JDK- >>>> 8140267). >>>>> >>>>> The fix was backed out and re-implemented with JDK-8140309 by >>>>> slightly >>>> changing the assert: >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- >>>> Novem >>>>> ber/019696.html >>>>> >>>>> The code change for the fix of JDK-8140309 is code changes for >>>>> JDK-8136473 >>>> by slightly changing one assert. >>>>> >>>>> jdk9 original changeset is >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c >>>>> >>>>> As this is a backport so I keep the changes as it is. >>>>> >>>>>> >>>>> >>>>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not work >>>>>> since >>>>> >>>>>> since Unsafe class in jdk8 does not have unaligned methods. >>>>> >>>>>> Hot did you run it? >>>>> >>>>> I am sorry, looks there is some issue with my testing. >>>>> >>>>> I have run jtreg test after merging the changes but somehow the test >>>>> does >>>> not run and I verified only the failing list of jtreg result. >>>>> >>>>> When I run the test case separately it is failing as you already >>>>> pointed out >>>> the same. 
>>>>> >>>>> $java -jar ~/Tools/jtreg/lib/jtreg.jar >>>>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ >>>>> >>>> >> hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatched >>>> A >>>>> ccesses.java >>>>> >>>>> Test results: failed: 1 >>>>> >>>>> Report written to >>>>> /scratch/shshahma/Java/jdk8u-dev- >>>> 8140309_01/JTreport/html/report.html >>>>> >>>>> Results written to >>>>> /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork >>>>> >>>>> Error: >>>>> >>>>> /scratch/shshahma/Java/jdk8u-dev- >>>> 8140309_01/hotspot/test/compiler/intr >>>>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: error: >>>>> cannot find symbol >>>>> >>>>> UNSAFE.putIntUnaligned(array, >>>>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); >>>>> >>>>> Not sure if we should push without the test case. >>>>> >>>>>> >>>>> >>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ >>>>> >>>>>> >>>>> >>>>>> Good. Did you run new UnsafeAccess.java test? >>>>> >>>>> Due to same process issue the test case is not run and when I run it >>>> separately it fails. >>>>> >>>>> It passes after doing below changes: >>>>> >>>>> 1. Added /othervm >>>>> >>>>> 2. replaced import statement 'import jdk.internal.misc.Unsafe;' by >>>>> 'import >>>> sun.misc.Unsafe;' >>>>> >>>>> Updated webrev: >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ >>>>> >>>>>> >>>>> >>>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ >>>>> >>>>> I am getting the similar compilation error as above for added test >>>>> case. Not >>>> sure if we can push without the test case. >>>>> >>>>> Regards, >>>>> >>>>> Shafi >>>>> >>>>>> >>>>> >>>>>> Good. 
>>>>> >>>>>> >>>>> >>>>>> Thanks, >>>>> >>>>>> Vladimir >>>>> >>>>>> >>>>> >>>>>>> >>>>> >>>>>>> Regards, >>>>> >>>>>>> Shafi >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>>> -----Original Message----- >>>>> >>>>>>>> From: Vladimir Kozlov >>>>> >>>>>>>> Sent: Friday, November 11, 2016 1:26 AM >>>>> >>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>>>> >>>>> >>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>>>>> produces >>>>> >>>>>>>> mismatched unsafe accesses >>>>> >>>>>>>> >>>>> >>>>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: >>>>> >>>>>>>>> Hi, >>>>> >>>>>>>>> >>>>> >>>>>>>>> Please review the backport of following dependent backports. >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8136473 >>>>> >>>>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 >>>>>>>>> [JDK- >>>>> >>>>>>>> 8080289]. Manual merge is not done as the corresponding code is >>>>>>>> not >>>>> >>>>>>>> there in jdk8u-dev. >>>>> >>>>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp >>>>>>>>> and >>>>> >>>>>>>>> manual >>>>> >>>>>>>> merge is done. >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> unaligned unsafe access methods were added in jdk 9 only. In your >>>>> >>>>>>>> changes unaligned argument is always false. You can simplify >> changes. 
>>>>> >>>>>>>> >>>>> >>>>>>>> Also you should base changes on JDK-8140309 (original 8136473 >>>>>>>> changes >>>>> >>>>>>>> were backout by 8140267): >>>>> >>>>>>>> >>>>> >>>>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: >>>>> >>>>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ >>>>> >>>>>>>> > >>>>> >>>>>>>> > Same as 8136473 with only the following change: >>>>> >>>>>>>> > >>>>> >>>>>>>> > diff --git a/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> b/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> > --- a/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> > +++ b/src/share/vm/opto/library_call.cpp >>>>> >>>>>>>> > @@ -2527,7 +2527,7 @@ >>>>> >>>>>>>> > // of safe & unsafe memory. >>>>> >>>>>>>> > if (need_mem_bar) insert_mem_bar(Op_MemBarCPUOrder); >>>>> >>>>>>>> > >>>>> >>>>>>>> > - assert(is_native_ptr || alias_type->adr_type() == >>>>> >>>>>>>> TypeOopPtr::BOTTOM >>>>> >>>>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM || >>>>> >>>>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || >>>>> >>>>>>>> > alias_type->field() != NULL || alias_type->element() != >>>>> >>>>>>>> NULL, "field, array element or unknown"); >>>>> >>>>>>>> > bool mismatched = false; >>>>> >>>>>>>> > if (alias_type->element() != NULL || alias_type->field() != NULL) >> { >>>>> >>>>>>>> > >>>>> >>>>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the >>>>> >>>>>>>> is_native_ptr case and the case where the unsafe method is called >>>>>>>> with a >>>>> >>>>>> null object. >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>>>> >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>>>> >>>>>>>>> >>>>> >>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.16 >>>>>> 5 >>>>> >>>>>>>> [JDK-8140309]. 
Manual merge is not done as the corresponding code >>>>>>>> is >>>>> >>>>>>>> not there in jdk8u-dev. >>>>> >>>>>>>> >>>>> >>>>>>>> I explained situation with this line above. >>>>> >>>>>>>> >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> This webrev is not incremental for your 8136473 changes - >>>>> >>>>>>>> library_call.cpp has part from 8136473 changes. >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> >>>>>>>>> Clean merge >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> Thanks seems fine. >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 >>>>> >>>>>>>>> >>>>> >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. >>>>> >>>>>> >>>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 >>>>> >>>>>>>>> [JDK-8160360] - Resolved 2. >>>>> >>>>>> >>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 >>>>>>>> 73 >>>>> >>>>>>>> [JDK-8148146] - Manual merge is not done as the corresponding >>>>>>>> code is >>>>> >>>>>>>> not there in jdk8u-dev. >>>>> >>>>>>>>> webrev link: >>>>> >>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ >>>>> >>>>>>>> >>>>> >>>>>>>> This webrev is not incremental in library_call.cpp. Difficult to >>>>>>>> see >>>>> >>>>>>>> this part of changes. 
>>>>> >>>>>>>> >>>>> >>>>>>>> Thanks, >>>>> >>>>>>>> Vladimir >>>>> >>>>>>>> >>>>> >>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 >>>>> >>>>>>>>> >>>>> >>>>>>>>> Testing: jprt and jtreg >>>>> >>>>>>>>> >>>>> >>>>>>>>> Regards, >>>>> >>>>>>>>> Shafi >>>>> >>>>>>>>> >>>>> >>>>>>>>>> -----Original Message----- >>>>> >>>>>>>>>> From: Shafi Ahmad >>>>> >>>>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM >>>>> >>>>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net >>>>>>>>>> >>>>> >>>>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>> >>>>>>>>>> produces mismatched unsafe accesses >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Thanks Vladimir. >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> I will create dependent backport of 1. >>>>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 >>>>> >>>>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> >>>>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>>>>>>> >>>>> >>>>>>>>>> Regards, >>>>> >>>>>>>>>> Shafi >>>>> >>>>>>>>>> >>>>> >>>>>>>>>>> -----Original Message----- >>>>> >>>>>>>>>>> From: Vladimir Kozlov >>>>> >>>>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM >>>>> >>>>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net >>>>>>>>>>> >>>>> >>>>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation >>>>> >>>>>>>>>>> produces mismatched unsafe accesses >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Hi Shafi, >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> You should also consider backporting following related fixes: >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 >>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. 
>>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> Thanks, >>>>> >>>>>>>>>>> Vladimir >>>>> >>>>>>>>>>> >>>>> >>>>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: >>>>> >>>>>>>>>>>> Hi All, >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type >>>>>>>>>>>> speculation >>>>> >>>>>>>>>>>> produces >>>>> >>>>>>>>>>> mismatched unsafe accesses to jdk8u-dev. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Please note that backport is not clean and the conflict is due to: >>>>> >>>>>>>>>>>> >>>>> >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>>>> >>>>>>>>>>>> 1 >>>>> >>>>>>>>>>>> 65 >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Getting debug build failure because of: >>>>> >>>>>>>>>>>> >>>>> >>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. >>>>> >>>>>>>>>>>> 1 >>>>> >>>>>>>>>>>> 55 >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> The above changes are done under bug# 'JDK-8136473: failed: >>>>>>>>>>>> no >>>>> >>>>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' which >>>>>>>>>>> is >>>>> >>>>>>>>>>> not back ported to jdk8u and the current backport is on top of >>>>> >>>>>>>>>>> above >>>>> >>>>>>>> change. >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Please note that I am not sure if there is any dependency >>>>> >>>>>>>>>>>> between these >>>>> >>>>>>>>>>> two changesets. 
>>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> open webrev: >>>>> >>>>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ >>>>> >>>>>>>>>>>> jdk9 bug >>>>>>>>>>>> link:https://bugs.openjdk.java.net/browse/JDK-8134918 >>>>> >>>>>>>>>>>> jdk9 changeset: >>>>> >>>>>>>>>>> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> testing: Passes JPRT, jtreg not completed >>>>> >>>>>>>>>>>> >>>>> >>>>>>>>>>>> Regards, >>>>> >>>>>>>>>>>> Shafi >>>>> >>>>>>>>>>>> >>>>> From shafi.s.ahmad at oracle.com Wed Nov 23 11:47:34 2016 From: shafi.s.ahmad at oracle.com (Shafi Ahmad) Date: Wed, 23 Nov 2016 03:47:34 -0800 (PST) Subject: [8u] RFR for JDK-8134918 - C2: Type speculation produces mismatched unsafe accesses In-Reply-To: <58358088.1090709@oracle.com> References: <77e0b348-2b95-4097-ba95-906257d8893c@default> <137be921-c1ef-48d8-b85a-301d597109c0@default> <4c4f408d-7d1b-dfaf-7a04-f63322e0d560@oracle.com> <769e91f9-b0ad-421d-a8c2-ef6fedac4693@default> <582B622F.7030909@oracle.com> <4332d26a-0efa-4582-9068-f28fb7ebd109@default> <58358088.1090709@oracle.com> Message-ID: <341e37fe-0e73-4f20-afbd-33cdbe42ffba@default> Thank you very much Vladimir and Tobias for reviewing it. Regards, Shafi > -----Original Message----- > From: Tobias Hartmann > Sent: Wednesday, November 23, 2016 5:12 PM > To: Shafi Ahmad; Vladimir Kozlov; hotspot-dev at openjdk.java.net > Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > mismatched unsafe accesses > > Hi Shafi, > > On 21.11.2016 07:29, Shafi Ahmad wrote: > > Hi All, > > > > May I get the second review on this. > > > > I am putting together all the webrevs to make it simple for reviewer. > > http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ > > http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ > > http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > > http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > > This looks good to me (not a 8u reviewer). 
> > Best regards, > Tobias > > > > > Please note that I tested with jprt, all jtreg and rbt tests. > > > > Regards, > > Shafi > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Wednesday, November 16, 2016 10:21 PM > >> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation produces > >> mismatched unsafe accesses > >> > >> Looks good. > >> > >> I would suggest to run all jtreg tests (or even RBT) when you apply > >> all changes before pushing this. > >> > >> Thanks, > >> Vladimir > >> > >> On 11/16/16 4:52 AM, Shafi Ahmad wrote: > >>> Hi Vladimir, > >>> > >>> Thank you for the review and feedback. > >>> > >>> Please find updated webrevs: > >>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.01/ => > Removed > >> the test case as it use only jdk9 APIs. > >>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.02/ => > Removed > >> test methods testFixedOffsetHeaderArray17() and > >> testFixedOffsetHeader17() which referenced jdk9 API > >> UNSAFE.getIntUnaligned. > >>> > >>> > >>> Regards, > >>> Shafi > >>> > >>> > >>>> -----Original Message----- > >>>> From: Vladimir Kozlov > >>>> Sent: Wednesday, November 16, 2016 1:00 AM > >>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>> produces mismatched unsafe accesses > >>>> > >>>> Hi Shafi > >>>> > >>>> You should not backport tests which use only new JDK 9 APIs. Like > >>>> TestUnsafeUnalignedMismatchedAccesses.java test. > >>>> > >>>> But it is perfectly fine to modify backport by removing part of > >>>> changes which use a new API. For example, 8162101 changes in > >>>> OpaqueAccesses.java test which use getIntUnaligned() method. > >>>> > >>>> It is unfortunate that 8140309 changes include also code which > >>>> process new Unsafe Unaligned intrinsics from JDK 9. It should not > >>>> be backported but it will simplify this and following backports. 
So > >>>> I agree with changes you did for > >>>> 8140309 backport. > >>>> > >>>> Thanks, > >>>> Vladimir > >>>> > >>>> On 11/14/16 10:34 PM, Shafi Ahmad wrote: > >>>>> Hi Vladimir, > >>>>> > >>>>> Thanks for the review. > >>>>> > >>>>>> -----Original Message----- > >>>>> > >>>>>> From: Vladimir Kozlov > >>>>> > >>>>>> Sent: Monday, November 14, 2016 11:20 PM > >>>>> > >>>>>> To: Shafi Ahmad; hotspot-dev at openjdk.java.net > >>>>> > >>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>>> produces > >>>>> > >>>>>> mismatched unsafe accesses > >>>>> > >>>>>> > >>>>> > >>>>>> On 11/14/16 1:03 AM, Shafi Ahmad wrote: > >>>>> > >>>>>>> Hi Vladimir, > >>>>> > >>>>>>> > >>>>> > >>>>>>> Thanks for the review. > >>>>> > >>>>>>> > >>>>> > >>>>>>> Please find updated webrevs. > >>>>> > >>>>>>> > >>>>> > >>>>>>> All webrevs are with respect to the base changes on JDK-8140309. > >>>>> > >>>>>>> http://cr.openjdk.java.net/~shshahma/8140309/webrev.00/ > >>>>> > >>>>>> > >>>>> > >>>>>> Why you kept unaligned parameter in changes? > >>>>> > >>>>> The fix of JDK-8136473 caused many problems after integration (see > >>>>> JDK- > >>>> 8140267). > >>>>> > >>>>> The fix was backed out and re-implemented with JDK-8140309 by > >>>>> slightly > >>>> changing the assert: > >>>>> > >>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015- > >>>> Novem > >>>>> ber/019696.html > >>>>> > >>>>> The code change for the fix of JDK-8140309 is code changes for > >>>>> JDK-8136473 > >>>> by slightly changing one assert. > >>>>> > >>>>> jdk9 original changeset is > >>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c > >>>>> > >>>>> As this is a backport so I keep the changes as it is. > >>>>> > >>>>>> > >>>>> > >>>>>> The test TestUnsafeUnalignedMismatchedAccesses.java will not > work > >>>>>> since > >>>>> > >>>>>> since Unsafe class in jdk8 does not have unaligned methods. > >>>>> > >>>>>> How did you run it? 
> >>>>> > >>>>> I am sorry, looks there is some issue with my testing. > >>>>> > >>>>> I have run jtreg test after merging the changes but somehow the > >>>>> test does > >>>> not run and I verified only the failing list of jtreg result. > >>>>> > >>>>> When I run the test case separately it is failing as you already > >>>>> pointed out > >>>> the same. > >>>>> > >>>>> $java -jar ~/Tools/jtreg/lib/jtreg.jar > >>>>> -jdk:build/linux-x86_64-normal-server-slowdebug/jdk/ > >>>>> > >>>> > >> > hotspot/test/compiler/intrinsics/unsafe/TestUnsafeUnalignedMismatched > >>>> A > >>>>> ccesses.java > >>>>> > >>>>> Test results: failed: 1 > >>>>> > >>>>> Report written to > >>>>> /scratch/shshahma/Java/jdk8u-dev- > >>>> 8140309_01/JTreport/html/report.html > >>>>> > >>>>> Results written to > >>>>> /scratch/shshahma/Java/jdk8u-dev-8140309_01/JTwork > >>>>> > >>>>> Error: > >>>>> > >>>>> /scratch/shshahma/Java/jdk8u-dev- > >>>> 8140309_01/hotspot/test/compiler/intr > >>>>> insics/unsafe/TestUnsafeUnalignedMismatchedAccesses.java:92: > error: > >>>>> cannot find symbol > >>>>> > >>>>> UNSAFE.putIntUnaligned(array, > >>>>> UNSAFE.ARRAY_BYTE_BASE_OFFSET+1, -1); > >>>>> > >>>>> Not sure if we should push without the test case. > >>>>> > >>>>>> > >>>>> > >>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.01/ > >>>>> > >>>>>> > >>>>> > >>>>>> Good. Did you run new UnsafeAccess.java test? > >>>>> > >>>>> Due to same process issue the test case is not run and when I run > >>>>> it > >>>> separately it fails. > >>>>> > >>>>> It passes after doing below changes: > >>>>> > >>>>> 1. Added /othervm > >>>>> > >>>>> 2. 
replaced import statement 'import jdk.internal.misc.Unsafe;' > >>>>> by 'import > >>>> sun.misc.Unsafe;' > >>>>> > >>>>> Updated webrev: > >>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.02/ > >>>>> > >>>>>> > >>>>> > >>>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.01/ > >>>>> > >>>>> I am getting the similar compilation error as above for added test > >>>>> case. Not > >>>> sure if we can push without the test case. > >>>>> > >>>>> Regards, > >>>>> > >>>>> Shafi > >>>>> > >>>>>> > >>>>> > >>>>>> Good. > >>>>> > >>>>>> > >>>>> > >>>>>> Thanks, > >>>>> > >>>>>> Vladimir > >>>>> > >>>>>> > >>>>> > >>>>>>> > >>>>> > >>>>>>> Regards, > >>>>> > >>>>>>> Shafi > >>>>> > >>>>>>> > >>>>> > >>>>>>> > >>>>> > >>>>>>> > >>>>> > >>>>>>>> -----Original Message----- > >>>>> > >>>>>>>> From: Vladimir Kozlov > >>>>> > >>>>>>>> Sent: Friday, November 11, 2016 1:26 AM > >>>>> > >>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>>>> > >>>>> > >>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>>>>> produces > >>>>> > >>>>>>>> mismatched unsafe accesses > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> On 11/9/16 10:42 PM, Shafi Ahmad wrote: > >>>>> > >>>>>>>>> Hi, > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> Please review the backport of following dependent backports. > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8136473 > >>>>> > >>>>>>>>> Conflict in file src/share/vm/opto/memnode.cpp due to 1. > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/fe311de64c61 > >>>>>>>>> [JDK- > >>>>> > >>>>>>>> 8080289]. Manual merge is not done as the corresponding code is > >>>>>>>> not > >>>>> > >>>>>>>> there in jdk8u-dev. > >>>>> > >>>>>>>>> Multiple conflicts in file src/share/vm/opto/library_call.cpp > >>>>>>>>> and > >>>>> > >>>>>>>>> manual > >>>>> > >>>>>>>> merge is done. 
> >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8136473/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> unaligned unsafe access methods were added in jdk 9 only. In > >>>>>>>> your > >>>>> > >>>>>>>> changes unaligned argument is always false. You can simplify > >> changes. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> Also you should base changes on JDK-8140309 (original 8136473 > >>>>>>>> changes > >>>>> > >>>>>>>> were backout by 8140267): > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> On 11/4/15 10:21 PM, Roland Westrelin wrote: > >>>>> > >>>>>>>> >http://cr.openjdk.java.net/~roland/8140309/webrev.00/ > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > Same as 8136473 with only the following change: > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > diff --git a/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> b/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> > --- a/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> > +++ b/src/share/vm/opto/library_call.cpp > >>>>> > >>>>>>>> > @@ -2527,7 +2527,7 @@ > >>>>> > >>>>>>>> > // of safe & unsafe memory. > >>>>> > >>>>>>>> > if (need_mem_bar) > insert_mem_bar(Op_MemBarCPUOrder); > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > - assert(is_native_ptr || alias_type->adr_type() == > >>>>> > >>>>>>>> TypeOopPtr::BOTTOM > >>>>> > >>>>>>>> || > + assert(alias_type->adr_type() == TypeRawPtr::BOTTOM > || > >>>>> > >>>>>>>> alias_type->adr_type() == TypeOopPtr::BOTTOM || > >>>>> > >>>>>>>> > alias_type->field() != NULL || alias_type->element() != > >>>>> > >>>>>>>> NULL, "field, array element or unknown"); > >>>>> > >>>>>>>> > bool mismatched = false; > >>>>> > >>>>>>>> > if (alias_type->element() != NULL || alias_type->field() != > NULL) > >> { > >>>>> > >>>>>>>> > > >>>>> > >>>>>>>> > alias_type->adr_type() == TypeRawPtr::BOTTOM covers the > >>>>> > >>>>>>>> is_native_ptr case and the case where the unsafe method is > >>>>>>>> called with a > >>>>> > >>>>>> null object. 
> >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4 > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8134918 > >>>>> > >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>>>> > >>>>>>>>> > >>>>> > >>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4bee38ba018c#l5.16 > >>>>>> 5 > >>>>> > >>>>>>>> [JDK-8140309]. Manual merge is not done as the corresponding > >>>>>>>> code is > >>>>> > >>>>>>>> not there in jdk8u-dev. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> I explained situation with this line above. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> This webrev is not incremental for your 8136473 changes - > >>>>> > >>>>>>>> library_call.cpp has part from 8136473 changes. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8155781 > >>>>> > >>>>>>>>> Clean merge > >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8155781/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> Thanks seems fine. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/cde17b3e2e70 > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> jdk9 bug link:https://bugs.openjdk.java.net/browse/JDK- > 8162101 > >>>>> > >>>>>>>>> Conflict in file src/share/vm/opto/library_call.cpp due to 1. > >>>>> > >>>>>> > >>>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/4be0cada20ad#l1.7 > >>>>> > >>>>>>>>> [JDK-8160360] - Resolved 2. 
> >>>>> > >>>>>> > >>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/8b9fdaeb8c57#l10.2 > >>>>>>>> 73 > >>>>> > >>>>>>>> [JDK-8148146] - Manual merge is not done as the corresponding > >>>>>>>> code is > >>>>> > >>>>>>>> not there in jdk8u-dev. > >>>>> > >>>>>>>>> webrev link: > >>>>> > >>>>>> http://cr.openjdk.java.net/~shshahma/8162101/webrev.00/ > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> This webrev is not incremental in library_call.cpp. Difficult > >>>>>>>> to see > >>>>> > >>>>>>>> this part of changes. > >>>>> > >>>>>>>> > >>>>> > >>>>>>>> Thanks, > >>>>> > >>>>>>>> Vladimir > >>>>> > >>>>>>>> > >>>>> > >>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/10dad1d40843 > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> Testing: jprt and jtreg > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>> Regards, > >>>>> > >>>>>>>>> Shafi > >>>>> > >>>>>>>>> > >>>>> > >>>>>>>>>> -----Original Message----- > >>>>> > >>>>>>>>>> From: Shafi Ahmad > >>>>> > >>>>>>>>>> Sent: Thursday, October 20, 2016 10:08 AM > >>>>> > >>>>>>>>>> To: Vladimir Kozlov;hotspot-dev at openjdk.java.net > >>>>>>>>>> > >>>>> > >>>>>>>>>> Subject: RE: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>> > >>>>>>>>>> produces mismatched unsafe accesses > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>> Thanks Vladimir. > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>> I will create dependent backport of 1. 
> >>>>> > >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8136473 > >>>>> > >>>>>>>>>> 2.https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>>> > >>>>>>>>>> 3.https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>> Regards, > >>>>> > >>>>>>>>>> Shafi > >>>>> > >>>>>>>>>> > >>>>> > >>>>>>>>>>> -----Original Message----- > >>>>> > >>>>>>>>>>> From: Vladimir Kozlov > >>>>> > >>>>>>>>>>> Sent: Wednesday, October 19, 2016 8:27 AM > >>>>> > >>>>>>>>>>> To: Shafi Ahmad;hotspot-dev at openjdk.java.net > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Subject: Re: [8u] RFR for JDK-8134918 - C2: Type speculation > >>>>> > >>>>>>>>>>> produces mismatched unsafe accesses > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Hi Shafi, > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> You should also consider backporting following related fixes: > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8155781 > >>>>> > >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8162101 > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Otherwise you may hit asserts added by 8134918 changes. > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> Thanks, > >>>>> > >>>>>>>>>>> Vladimir > >>>>> > >>>>>>>>>>> > >>>>> > >>>>>>>>>>> On 10/17/16 3:12 AM, Shafi Ahmad wrote: > >>>>> > >>>>>>>>>>>> Hi All, > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Please review the backport of JDK-8134918 - C2: Type > >>>>>>>>>>>> speculation > >>>>> > >>>>>>>>>>>> produces > >>>>> > >>>>>>>>>>> mismatched unsafe accesses to jdk8u-dev. > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Please note that backport is not clean and the conflict is due > to: > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. 
> >>>>> > >>>>>>>>>>>> 1 > >>>>> > >>>>>>>>>>>> 65 > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Getting debug build failure because of: > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/9108fab781a4#l5. > >>>>> > >>>>>>>>>>>> 1 > >>>>> > >>>>>>>>>>>> 55 > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> The above changes are done under bug# 'JDK-8136473: > failed: > >>>>>>>>>>>> no > >>>>> > >>>>>>>>>>> mismatched stores, except on raw memory: StoreB StoreI' > >>>>>>>>>>> which is > >>>>> > >>>>>>>>>>> not back ported to jdk8u and the current backport is on top > >>>>>>>>>>> of > >>>>> > >>>>>>>>>>> above > >>>>> > >>>>>>>> change. > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Please note that I am not sure if there is any dependency > >>>>> > >>>>>>>>>>>> between these > >>>>> > >>>>>>>>>>> two changesets. > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> open webrev: > >>>>> > >>>>>>>>>> http://cr.openjdk.java.net/~shshahma/8134918/webrev.00/ > >>>>> > >>>>>>>>>>>> jdk9 bug > >>>>>>>>>>>> link:https://bugs.openjdk.java.net/browse/JDK-8134918 > >>>>> > >>>>>>>>>>>> jdk9 changeset: > >>>>> > >>>>>>>>>>> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/79dae2cd00ef > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> testing: Passes JPRT, jtreg not completed > >>>>> > >>>>>>>>>>>> > >>>>> > >>>>>>>>>>>> Regards, > >>>>> > >>>>>>>>>>>> Shafi > >>>>> > >>>>>>>>>>>> > >>>>> From stefan.karlsson at oracle.com Wed Nov 23 11:53:12 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:53:12 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <6b4f5c80-b74f-eda2-7f3a-1f6e4610bcba@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> <6b4f5c80-b74f-eda2-7f3a-1f6e4610bcba@oracle.com> Message-ID: Thanks, Coleen! StefanK On 2016-11-22 23:48, Coleen Phillimore wrote: > Looks good! 
> Thanks, > Coleen > > On 11/22/16 4:37 PM, Stefan Karlsson wrote: >> Hi all, >> >> Here is the updated patch, with changes suggested by Coleen and Thomas: >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02 >> >> Changes to the previous patch: >> * Removed list_chunk_size and instead used free_chunks(index)->size() >> * Removed the const qualifier from list_index, since free_chunks isn't >> declared const. Fixing this would have been a too large change for >> this bug fix. >> * Moved ChunkManager_test_list_index into the unit test section of >> metaspace.cpp >> * Fixed a broken assert >> >> Thanks, >> StefanK >> >> >> On 2016-11-22 15:54, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix a bug in ChunkManager::list_index(): >>> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >>> >>> There's a great description of the bug in the bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8169931 >>> >>> There are two conceptual parts of the metaspace. The _class_ >>> metaspace, and the _non-class_ metaspace. They have different chunk >>> sizes, and while querying for the list index of a humongous chunk in >>> the class metaspace, the code accidentally matched the size against >>> the MediumChunk size of the non-class metaspace. >>> >>> I've changed the code to not query against the global ChunkSizes >>> enum, but rather the values stored inside the ChunkManager instances. >>> Therefore, the list_index() function was changed into an instance >>> method. >>> >>> I've written a unit test that provoked the bug. It's a simplified >>> test with vm asserts instead of gtest asserts. The reason is that the >>> ChunkManager class is currently located in metaspace.cpp, and is not >>> accessible from the gtest unit tests. 
>>> >>> Testing: jprt, Kitchensink, parallel class loading tests >>> >>> Thanks, >>> StefanK >> >> > From stefan.karlsson at oracle.com Wed Nov 23 11:54:01 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:54:01 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: <00efb9dd-6477-3ec4-590e-a1732d5af82f@oracle.com> References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> <00efb9dd-6477-3ec4-590e-a1732d5af82f@oracle.com> Message-ID: Thanks, Erik. StefanK On 2016-11-23 08:09, Erik Helin wrote: > On 11/22/2016 10:37 PM, Stefan Karlsson wrote: >> Hi all, >> >> Here are the update patch, with changes suggested by Coleen and Thomas: >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Hey StefanK, thanks for taking care of this! The patch looks good to me, > Reviewed. > > Thanks, > Erik > >> Changes to the previous patch: >> * Removed list_chunk_size and instead used free_chunks(index)->size() >> * Removed the const qualifier from list_index, since free_chunks isn't >> declared const. Fixing this would have been a too large change for this >> bug fix. >> * Moved ChunkManager_test_list_index into the unit test section of >> metaspace.cpp >> * Fixed a broken assert >> >> Thanks, >> StefanK >> >> >> On 2016-11-22 15:54, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix a bug in ChunkManager::list_index(): >>> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >>> >>> There's a great description of the bug in the bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8169931 >>> >>> There are two conceptual parts of the metaspace. The _class_ >>> metaspace, and the _non-class_ metaspace. 
They have different chunk >>> sizes, and while querying for the list index of a humongous chunk in >>> the class metaspace, the code accidentally matched the size against >>> the MediumChunk size of the non-class metaspace. >>> >>> I've changed the code to not query against the global ChunkSizes enum, >>> but rather the values stored inside the ChunkManager instances. >>> Therefore, the list_index() function was changed into an instance >>> method. >>> >>> I've written a unit test that provoked the bug. It's a simplified test >>> with vm asserts instead of gtest asserts. The reason is that the >>> ChunkManager class is currently located in metaspace.cpp, and is not >>> accessible from the gtest unit tests. >>> >>> Testing: jprt, Kitchensink, parallel class loading tests >>> >>> Thanks, >>> StefanK >> >> From stefan.karlsson at oracle.com Wed Nov 23 11:54:14 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:54:14 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: <218f8b02-3138-70a1-26d4-cbb6ebc4e243@oracle.com> Thanks, Thomas. StefanK On 2016-11-23 08:42, Thomas Stüfe wrote: > Hi Stefan, > > this looks fine! > > Thanks, > Thomas > > On Tue, Nov 22, 2016 at 10:37 PM, Stefan Karlsson > > wrote: > > Hi all, > > Here is the updated patch, with changes suggested by Coleen and Thomas: > http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta > > http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > > Changes to the previous patch: > * Removed list_chunk_size and instead used free_chunks(index)->size() > * Removed the const qualifier from list_index, since free_chunks > isn't declared const. Fixing this would have been a too large change > for this bug fix. 
> * Moved ChunkManager_test_list_index into the unit test section of > metaspace.cpp > * Fixed a broken assert > > Thanks, > StefanK > > > On 2016-11-22 15:54, Stefan Karlsson wrote: > > Hi all, > > Please, review this patch to fix a bug in > ChunkManager::list_index(): > http://cr.openjdk.java.net/~stefank/8169931/webrev.01 > > > There's a great description of the bug in the bug report: > https://bugs.openjdk.java.net/browse/JDK-8169931 > > > There are two conceptual parts of the metaspace. The _class_ > metaspace, and the _non-class_ metaspace. They have different > chunk sizes, and while querying for the list index of a > humongous chunk in the class metaspace, the code accidentally > matched the size against the MediumChunk size of the non-class > metaspace. > > I've changed the code to not query against the global ChunkSizes > enum, but rather the values stored inside the ChunkManager > instances. Therefore, the list_index() function was changed into > an instance method. > > I've written a unit test that provoked the bug. It's a > simplified test with vm asserts instead of gtest asserts. The > reason is that the ChunkManager class is currently located in > metaspace.cpp, and is not accessible from the gtest unit tests. > > Testing: jprt, Kitchensink, parallel class loading tests > > Thanks, > StefanK > > > > From stefan.karlsson at oracle.com Wed Nov 23 11:54:28 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 23 Nov 2016 12:54:28 +0100 Subject: RFR: 8169931: 8k class metaspace chunks misallocated from 4k chunk freelist In-Reply-To: References: <90dd2eee-7c48-b526-adb6-6e8a2ddff3e5@oracle.com> <8ce35dae-02db-8363-5d02-6d07f021fdfe@oracle.com> Message-ID: Thanks, Mikael. 
StefanK On 2016-11-23 10:42, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-22 22:37, Stefan Karlsson wrote: >> Hi all, >> >> Here are the update patch, with changes suggested by Coleen and Thomas: >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02.delta >> http://cr.openjdk.java.net/~stefank/8169931/webrev.02 > > Updated webrev looks good to me as well. > /Mikael > >> >> Changes to the previous patch: >> * Removed list_chunk_size and instead used free_chunks(index)->size() >> * Removed the const qualifier from list_index, since free_chunks isn't >> declared const. Fixing this would have been a too large change for this >> bug fix. >> * Moved ChunkManager_test_list_index into the unit test section of >> metaspace.cpp >> * Fixed a broken assert >> >> Thanks, >> StefanK >> >> >> On 2016-11-22 15:54, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix a bug in ChunkManager::list_index(): >>> http://cr.openjdk.java.net/~stefank/8169931/webrev.01 >>> >>> There's a great description of the bug in the bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8169931 >>> >>> There are two conceptual parts of the metaspace. The _class_ >>> metaspace, and the _non-class_ metaspace. They have different chunk >>> sizes, and while querying for the list index of a humongous chunk in >>> the class metaspace, the code accidentally matched the size against >>> the MediumChunk size of the non-class metaspace. >>> >>> I've changed the code to not query against the global ChunkSizes enum, >>> but rather the values stored inside the ChunkManager instances. >>> Therefore, the list_index() function was changed into an instance >>> method. >>> >>> I've written a unit test that provoked the bug. It's a simplified test >>> with vm asserts instead of gtest asserts. The reason is that the >>> ChunkManager class is currently located in metaspace.cpp, and is not >>> accessible from the gtest unit tests. 
>>> >>> Testing: jprt, Kitchensink, parallel class loading tests >>> >>> Thanks, >>> StefanK >> >> From igor.ignatyev at oracle.com Wed Nov 23 12:46:15 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 23 Nov 2016 15:46:15 +0300 Subject: RFR(XXS) : register closed @requires property setter Message-ID: Hi all, could you please review the changeset which registers closed vm property setter (for @requires expressions)? this setter is registered as optional, so test execution won't fail if the file doesn't exist. webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 Thanks, -- Igor From dmitry.fazunenko at oracle.com Wed Nov 23 13:24:48 2016 From: dmitry.fazunenko at oracle.com (Dmitry Fazunenenko) Date: Wed, 23 Nov 2016 16:24:48 +0300 Subject: RFR(XXS) : register closed @requires property setter In-Reply-To: References: Message-ID: Hi Igor, The change itself looks good to me. Would you provide a bit more information in the CR. "register closed @requires property setter" doesn't provide enough information to understand the reasons why it's necessary. Thanks, Dima On 23.11.2016 15:46, Igor Ignatyev wrote: > Hi all, > > could you please review the changeset which registers closed vm property setter (for @requires expressions)? > this setter is registered as optional, so test execution won't fail if the file doesn't exist. > > webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ > webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ > JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 > > Thanks, > --
Igor From volker.simonis at gmail.com Wed Nov 23 14:05:33 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 23 Nov 2016 15:05:33 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583394C5.3030206@linux.vnet.ibm.com> References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: Hi Gustavo, thanks a lot for tracking this down! The change looks good and I can sponsor it once you get another review from the build group and the FC Extension Request is approved. In general I'd advise to sign the OCTLA [1] to get access to the Java SE TCK [2] as this contains quite a lot of additional conformance tests which can be quite valuable for changes like this. Regards, Volker [1] http://openjdk.java.net/legal/octla-java-se-8.pdf [2] http://openjdk.java.net/groups/conformance/JckAccess/ On Tue, Nov 22, 2016 at 1:43 AM, Gustavo Romero wrote: > Hi, > > Could the following change be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds up the > StrictMath methods (in some cases up to 3x) on that platform. > > On PPC64 fdlibm optimization can be done without precision issues if > floating-point expression contraction is disabled, i.e. if the compiler does not > use floating-point multiply-add (FMA). 
For further details please refer to gcc > bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > No regression was observed on Math and StrictMath tests: > > Passed: java/lang/Math/AbsPositiveZero.java > Passed: java/lang/Math/Atan2Tests.java > Passed: java/lang/Math/CeilAndFloorTests.java > Passed: java/lang/Math/CubeRootTests.java > Passed: java/lang/Math/DivModTests.java > Passed: java/lang/Math/ExactArithTests.java > Passed: java/lang/Math/Expm1Tests.java > Passed: java/lang/Math/FusedMultiplyAddTests.java > Passed: java/lang/Math/HyperbolicTests.java > Passed: java/lang/Math/HypotTests.java > Passed: java/lang/Math/IeeeRecommendedTests.java > Passed: java/lang/Math/Log10Tests.java > Passed: java/lang/Math/Log1pTests.java > Passed: java/lang/Math/MinMax.java > Passed: java/lang/Math/MultiplicationTests.java > Passed: java/lang/Math/PowTests.java > Passed: java/lang/Math/Rint.java > Passed: java/lang/Math/RoundTests.java > Passed: java/lang/Math/SinCosCornerCasesTests.java > Passed: java/lang/Math/TanTests.java > Passed: java/lang/Math/WorstCaseTests.java > Test results: passed: 21 > > Passed: java/lang/StrictMath/CubeRootTests.java > Passed: java/lang/StrictMath/ExactArithTests.java > Passed: java/lang/StrictMath/Expm1Tests.java > Passed: java/lang/StrictMath/HyperbolicTests.java > Passed: java/lang/StrictMath/HypotTests.java > Passed: java/lang/StrictMath/Log10Tests.java > Passed: java/lang/StrictMath/Log1pTests.java > Passed: java/lang/StrictMath/PowTests.java > Test results: passed: 8 > > and also on the following hotspot tests: > > Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java > Passed: 
compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java > Passed: compiler/intrinsics/mathexact/AddExactICondTest.java > Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/CompareTest.java > Passed: compiler/intrinsics/mathexact/DecExactITest.java > Passed: compiler/intrinsics/mathexact/DecExactLTest.java > Passed: compiler/intrinsics/mathexact/GVNTest.java > Passed: compiler/intrinsics/mathexact/IncExactITest.java > Passed: compiler/intrinsics/mathexact/IncExactLTest.java > Passed: compiler/intrinsics/mathexact/MulExactICondTest.java > Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java > Passed: 
compiler/intrinsics/mathexact/NegExactILoadTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java > Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java > Passed: compiler/intrinsics/mathexact/SubExactICondTest.java > Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java > Test results: passed: 50 > > Thank you. > > > Regards, > Gustavo > From erik.joelsson at oracle.com Wed Nov 23 14:29:52 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Wed, 23 Nov 2016 15:29:52 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583394C5.3030206@linux.vnet.ibm.com> References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> Build changes look ok. In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. /Erik On 2016-11-22 01:43, Gustavo Romero wrote: > Hi, > > Could the following change be reviewed, please? 
> > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds up the > StrictMath methods (in some cases up to 3x) on that platform. > > On PPC64 fdlibm optimization can be done without precision issues if > floating-point expression contraction is disable, i.e. if the compiler does not > use floating-point multiply-add (FMA). For further details please refer to gcc > bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > No regression was observed on Math and StrictMath tests: > > Passed: java/lang/Math/AbsPositiveZero.java > Passed: java/lang/Math/Atan2Tests.java > Passed: java/lang/Math/CeilAndFloorTests.java > Passed: java/lang/Math/CubeRootTests.java > Passed: java/lang/Math/DivModTests.java > Passed: java/lang/Math/ExactArithTests.java > Passed: java/lang/Math/Expm1Tests.java > Passed: java/lang/Math/FusedMultiplyAddTests.java > Passed: java/lang/Math/HyperbolicTests.java > Passed: java/lang/Math/HypotTests.java > Passed: java/lang/Math/IeeeRecommendedTests.java > Passed: java/lang/Math/Log10Tests.java > Passed: java/lang/Math/Log1pTests.java > Passed: java/lang/Math/MinMax.java > Passed: java/lang/Math/MultiplicationTests.java > Passed: java/lang/Math/PowTests.java > Passed: java/lang/Math/Rint.java > Passed: java/lang/Math/RoundTests.java > Passed: java/lang/Math/SinCosCornerCasesTests.java > Passed: java/lang/Math/TanTests.java > Passed: java/lang/Math/WorstCaseTests.java > Test results: passed: 21 > > Passed: java/lang/StrictMath/CubeRootTests.java > Passed: java/lang/StrictMath/ExactArithTests.java > Passed: java/lang/StrictMath/Expm1Tests.java > Passed: java/lang/StrictMath/HyperbolicTests.java > Passed: java/lang/StrictMath/HypotTests.java > Passed: java/lang/StrictMath/Log10Tests.java > Passed: java/lang/StrictMath/Log1pTests.java > Passed: 
java/lang/StrictMath/PowTests.java > Test results: passed: 8 > > and also on the following hotspot tests: > > Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java > Passed: compiler/intrinsics/mathexact/AddExactICondTest.java > Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/CompareTest.java > Passed: compiler/intrinsics/mathexact/DecExactITest.java > Passed: compiler/intrinsics/mathexact/DecExactLTest.java > Passed: compiler/intrinsics/mathexact/GVNTest.java > Passed: compiler/intrinsics/mathexact/IncExactITest.java > Passed: compiler/intrinsics/mathexact/IncExactLTest.java > Passed: compiler/intrinsics/mathexact/MulExactICondTest.java > Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java > 
Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoadTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java > Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java > Passed: compiler/intrinsics/mathexact/SubExactICondTest.java > Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java > Test results: passed: 50 > > Thank you. > > > Regards, > Gustavo > From martin.doerr at sap.com Wed Nov 23 14:38:09 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 23 Nov 2016 14:38:09 +0000 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: <8d52c2bcc0c0473e8e79d3f794ca81f3@dewdfe13de06.global.corp.sap> Hi Gustavo, thanks for providing the webrevs. 
I have run the StrictMath jck tests which fail when building with -O3 and without -ffp-contract=off: FailedTests: api/java_lang/StrictMath/desc.html#acos javasoft.sqe.tests.api.java.lang.StrictMath.acos_test api/java_lang/StrictMath/desc.html#asin javasoft.sqe.tests.api.java.lang.StrictMath.asin_test api/java_lang/StrictMath/desc.html#atan javasoft.sqe.tests.api.java.lang.StrictMath.atan_test api/java_lang/StrictMath/desc.html#atan2 javasoft.sqe.tests.api.java.lang.StrictMath.atan2_test api/java_lang/StrictMath/desc.html#cos javasoft.sqe.tests.api.java.lang.StrictMath.cos_test api/java_lang/StrictMath/desc.html#exp javasoft.sqe.tests.api.java.lang.StrictMath.exp_test api/java_lang/StrictMath/desc.html#log javasoft.sqe.tests.api.java.lang.StrictMath.log_test api/java_lang/StrictMath/desc.html#sin javasoft.sqe.tests.api.java.lang.StrictMath.sin_test api/java_lang/StrictMath/desc.html#tan javasoft.sqe.tests.api.java.lang.StrictMath.tan_test api/java_lang/StrictMath/index.html#expm1 javasoft.sqe.tests.api.java.lang.StrictMath.expm1Tests -TestCaseID ALL api/java_lang/StrictMath/index.html#log10 javasoft.sqe.tests.api.java.lang.StrictMath.log10Tests -TestCaseID ALL api/java_lang/StrictMath/index.html#log1p javasoft.sqe.tests.api.java.lang.StrictMath.log1pTests -TestCaseID ALL All of them have passed when building with -O3 and -ffp-contract=off (on linuxppc64le). So thumbs up from my side. Thanks and best regards, Martin -----Original Message----- From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Volker Simonis Sent: Mittwoch, 23. November 2016 15:06 To: Gustavo Romero Cc: build-dev ; ppc-aix-port-dev at openjdk.java.net; Java Core Libs ; hotspot-dev at openjdk.java.net Subject: Re: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation Hi Gustavo, thanks a lot for tracking this down! 
The change looks good and I can sponsor it once you get another review from the build group and the FC Extension Request is approved. In general I'd advise to sign the OCTLA [1] to get access to the Java SE TCK [2] as this contains quite a lot of additional conformance tests which can be quite valuable for changes like this. Regards, Volker [1] http://openjdk.java.net/legal/octla-java-se-8.pdf [2] http://openjdk.java.net/groups/conformance/JckAccess/ On Tue, Nov 22, 2016 at 1:43 AM, Gustavo Romero wrote: > Hi, > > Could the following change be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > It enables fdlibm optimization on Linux PPC64 LE & BE and hence speeds > up the StrictMath methods (in some cases up to 3x) on that platform. > > On PPC64 fdlibm optimization can be done without precision issues if > floating-point expression contraction is disabled, i.e. if the compiler > does not use floating-point multiply-add (FMA). 
For further details > please refer to gcc > bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78386 > > No regression was observed on Math and StrictMath tests: > > Passed: java/lang/Math/AbsPositiveZero.java > Passed: java/lang/Math/Atan2Tests.java > Passed: java/lang/Math/CeilAndFloorTests.java > Passed: java/lang/Math/CubeRootTests.java > Passed: java/lang/Math/DivModTests.java > Passed: java/lang/Math/ExactArithTests.java > Passed: java/lang/Math/Expm1Tests.java > Passed: java/lang/Math/FusedMultiplyAddTests.java > Passed: java/lang/Math/HyperbolicTests.java > Passed: java/lang/Math/HypotTests.java > Passed: java/lang/Math/IeeeRecommendedTests.java > Passed: java/lang/Math/Log10Tests.java > Passed: java/lang/Math/Log1pTests.java > Passed: java/lang/Math/MinMax.java > Passed: java/lang/Math/MultiplicationTests.java > Passed: java/lang/Math/PowTests.java > Passed: java/lang/Math/Rint.java > Passed: java/lang/Math/RoundTests.java > Passed: java/lang/Math/SinCosCornerCasesTests.java > Passed: java/lang/Math/TanTests.java > Passed: java/lang/Math/WorstCaseTests.java > Test results: passed: 21 > > Passed: java/lang/StrictMath/CubeRootTests.java > Passed: java/lang/StrictMath/ExactArithTests.java > Passed: java/lang/StrictMath/Expm1Tests.java > Passed: java/lang/StrictMath/HyperbolicTests.java > Passed: java/lang/StrictMath/HypotTests.java > Passed: java/lang/StrictMath/Log10Tests.java > Passed: java/lang/StrictMath/Log1pTests.java > Passed: java/lang/StrictMath/PowTests.java > Test results: passed: 8 > > and also on the following hotspot tests: > > Passed: compiler/intrinsics/mathexact/sanity/AddExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/AddExactLongTest.java > Passed: > compiler/intrinsics/mathexact/sanity/DecrementExactIntTest.java > Passed: > compiler/intrinsics/mathexact/sanity/DecrementExactLongTest.java > Passed: > compiler/intrinsics/mathexact/sanity/IncrementExactIntTest.java > Passed: > 
compiler/intrinsics/mathexact/sanity/IncrementExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/MultiplyExactIntTest.java > Passed: > compiler/intrinsics/mathexact/sanity/MultiplyExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactIntTest.java > Passed: compiler/intrinsics/mathexact/sanity/NegateExactLongTest.java > Passed: compiler/intrinsics/mathexact/sanity/SubtractExactIntTest.java > Passed: > compiler/intrinsics/mathexact/sanity/SubtractExactLongTest.java > Passed: compiler/intrinsics/mathexact/AddExactICondTest.java > Passed: compiler/intrinsics/mathexact/AddExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoadTest.java > Passed: compiler/intrinsics/mathexact/AddExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/AddExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/AddExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/AddExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/CompareTest.java > Passed: compiler/intrinsics/mathexact/DecExactITest.java > Passed: compiler/intrinsics/mathexact/DecExactLTest.java > Passed: compiler/intrinsics/mathexact/GVNTest.java > Passed: compiler/intrinsics/mathexact/IncExactITest.java > Passed: compiler/intrinsics/mathexact/IncExactLTest.java > Passed: compiler/intrinsics/mathexact/MulExactICondTest.java > Passed: compiler/intrinsics/mathexact/MulExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoadTest.java > Passed: compiler/intrinsics/mathexact/MulExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/MulExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/MulExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/MulExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactIConstantTest.java > 
Passed: compiler/intrinsics/mathexact/NegExactILoadTest.java > Passed: compiler/intrinsics/mathexact/NegExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/NegExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/NegExactLNonConstantTest.java > Passed: compiler/intrinsics/mathexact/NestedMathExactTest.java > Passed: compiler/intrinsics/mathexact/SplitThruPhiTest.java > Passed: compiler/intrinsics/mathexact/SubExactICondTest.java > Passed: compiler/intrinsics/mathexact/SubExactIConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoadTest.java > Passed: compiler/intrinsics/mathexact/SubExactILoopDependentTest.java > Passed: compiler/intrinsics/mathexact/SubExactINonConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactIRepeatTest.java > Passed: compiler/intrinsics/mathexact/SubExactLConstantTest.java > Passed: compiler/intrinsics/mathexact/SubExactLNonConstantTest.java > Test results: passed: 50 > > Thank you. > > > Regards, > Gustavo > From gromero at linux.vnet.ibm.com Wed Nov 23 15:28:05 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 23 Nov 2016 13:28:05 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> Message-ID: <5835B585.5040807@linux.vnet.ibm.com> Hi Volker, On 23-11-2016 12:05, Volker Simonis wrote: > thanks a lot for tracking this down! Happy to contribute :) > The change looks good and I a can sponsor it once you get another > review from the build group and the FC Extension Request was approved. Thanks a lot for sponsoring it! > In general I'd advise to sign the OCTLA [1] to get access to the Java > SE TCK [2] as this contains quite a lot of additional conformance > tests which can be quite valuable for changes like this. 
> > Regards, > Volker > > [1] http://openjdk.java.net/legal/octla-java-se-8.pdf > [2] http://openjdk.java.net/groups/conformance/JckAccess/ Right. I'll check the documentation and find a way to get access to the TCK. Best regards, Gustavo From gromero at linux.vnet.ibm.com Wed Nov 23 15:29:51 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 23 Nov 2016 13:29:51 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <8d52c2bcc0c0473e8e79d3f794ca81f3@dewdfe13de06.global.corp.sap> References: <583394C5.3030206@linux.vnet.ibm.com> <8d52c2bcc0c0473e8e79d3f794ca81f3@dewdfe13de06.global.corp.sap> Message-ID: <5835B5EF.7070006@linux.vnet.ibm.com> Hi Martin, On 23-11-2016 12:38, Doerr, Martin wrote: > Hi Gustavo, > > thanks for providing the webrevs. > > I have ran the StrictMath jck tests which fail when building with -O3 and without -ffp-contract=off: > FailedTests: > api/java_lang/StrictMath/desc.html#acos javasoft.sqe.tests.api.java.lang.StrictMath.acos_test > api/java_lang/StrictMath/desc.html#asin javasoft.sqe.tests.api.java.lang.StrictMath.asin_test > api/java_lang/StrictMath/desc.html#atan javasoft.sqe.tests.api.java.lang.StrictMath.atan_test > api/java_lang/StrictMath/desc.html#atan2 javasoft.sqe.tests.api.java.lang.StrictMath.atan2_test > api/java_lang/StrictMath/desc.html#cos javasoft.sqe.tests.api.java.lang.StrictMath.cos_test > api/java_lang/StrictMath/desc.html#exp javasoft.sqe.tests.api.java.lang.StrictMath.exp_test > api/java_lang/StrictMath/desc.html#log javasoft.sqe.tests.api.java.lang.StrictMath.log_test > api/java_lang/StrictMath/desc.html#sin javasoft.sqe.tests.api.java.lang.StrictMath.sin_test > api/java_lang/StrictMath/desc.html#tan javasoft.sqe.tests.api.java.lang.StrictMath.tan_test > api/java_lang/StrictMath/index.html#expm1 javasoft.sqe.tests.api.java.lang.StrictMath.expm1Tests -TestCaseID ALL > api/java_lang/StrictMath/index.html#log10 
javasoft.sqe.tests.api.java.lang.StrictMath.log10Tests -TestCaseID ALL > api/java_lang/StrictMath/index.html#log1p javasoft.sqe.tests.api.java.lang.StrictMath.log1pTests -TestCaseID ALL > > All of them have passed when building with -O3 and -ffp-contract=off (on linuxppc64le). Thank you very much for running the additional StrictMath jck tests against the change! Best regards, Gustavo From gromero at linux.vnet.ibm.com Wed Nov 23 15:33:43 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Wed, 23 Nov 2016 13:33:43 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> Message-ID: <5835B6D7.4020101@linux.vnet.ibm.com> Hi Erik, On 23-11-2016 12:29, Erik Joelsson wrote: > Build changes look ok. > > In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. Thanks a lot for reviewing the change. Regards, Gustavo From martin.doerr at sap.com Wed Nov 23 16:20:32 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 23 Nov 2016 16:20:32 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> Hi David, thank you very much for the presentation. I think it provides a good guideline for hotspot development. Would you like to add something about multi-copy atomicity? E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). It is needed in the following scenario: - Different threads write 2 variables. - Readers of these 2 variables expect a globally consistent order of the write accesses. 
In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". (While taking a look at it, the condition "#if !(defined SPARC || defined IA32 || defined AMD64)" is not accurate and should be improved. E.g. s390 is multi-copy atomic.) I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64. Thanks and best regards, Martin -----Original Message----- From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of David Holmes Sent: Mittwoch, 23. November 2016 06:08 To: hotspot-dev developers Subject: Presentation: Understanding OrderAccess This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf Cheers, David From adinn at redhat.com Wed Nov 23 16:30:29 2016 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 23 Nov 2016 16:30:29 +0000 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <1effc60c-94ce-f42c-8756-310737969799@jku.at> References: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> <1effc60c-94ce-f42c-8756-310737969799@jku.at> Message-ID: <3263b517-d397-22f3-8351-cb36b9fe539a@redhat.com> On 23/11/16 16:06, Peter Hofer wrote: > I finally got around to measuring the change in execution times between > disabling the profiler in a patched OpenJDK and an entirely unmodified > OpenJDK. I did this for the benchmarks of the DaCapo and scalabench suites. > > For many benchmarks, there is some difference even when the profiler is > not enabled. Still, the disabled case was not something that we > optimized for. I think that most, if not all of that cost can be shaved > off by revisiting changes to frequent code paths and to the object layouts. . . . 
Thanks very much for doing this! Am I safe to assume the y axis measures execution time? The differences never appear to be very great but a few of the tests show a couple of percent points which is maybe a little troubling. It would probably help if you could improve on that. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From peter.hofer at jku.at Wed Nov 23 16:06:59 2016 From: peter.hofer at jku.at (Peter Hofer) Date: Wed, 23 Nov 2016 17:06:59 +0100 Subject: Contribution: Lock Contention Profiler for HotSpot In-Reply-To: <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> References: <8a2536a3-8d16-1777-55c7-95b10000465b@jku.at> <204272cd-b606-2be7-9359-ea05d0922515@redhat.com> Message-ID: <1effc60c-94ce-f42c-8756-310737969799@jku.at> Hi Andrew, I finally got around to measuring the change in execution times between disabling the profiler in a patched OpenJDK and an entirely unmodified OpenJDK. I did this for the benchmarks of the DaCapo and scalabench suites. For many benchmarks, there is some difference even when the profiler is not enabled. Still, the disabled case was not something that we optimized for. I think that most, if not all of that cost can be shaved off by revisiting changes to frequent code paths and to the object layouts. Here are the results for the JDK 8u patch: > http://ssw.jku.at/General/Staff/PH/lct/unmodified-vs-disabled/jdk8u.pdf For the JDK 9 patch (tracing only native locks): > http://ssw.jku.at/General/Staff/PH/lct/unmodified-vs-disabled/jdk9-nativeonly.pdf I measured this on a openSUSE 13.2 system with a single Intel Core i7-4790K processor, using a fixed Java heap size of 8 GB. Cheers, Peter On 11/04/2016 03:21 PM, Andrew Dinn wrote: > On 04/11/16 12:04, Peter Hofer wrote: > . . . 
>>> Have you measured the overhead this change produces when running with >>> contention detection disabled? (i.e. do we pay to have this feature even >>> when we don't use it). >> >> We measured only the overhead relative to an unmodified OpenJDK build. >> >> Our profiler observes only lock contention, which is generally handled >> via slow paths in the VM code, so this is where we added the code to >> record events. I don't expect this code to cause much overhead when >> disabled. However, we added fields to several data structures, which >> might make a difference. > > Yes, increased footprint (in code as well as object space) would be as > much a concern as increased execution time. > >> I'll run some more benchmarks and report my findings. > > Thanks very much. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From igor.ignatyev at oracle.com Thu Nov 24 12:14:31 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Thu, 24 Nov 2016 15:14:31 +0300 Subject: RFR(XXS) : 8170228 : register closed @requires property setter In-Reply-To: References: Message-ID: <9D707B36-466F-4739-994B-2695E020D030@oracle.com> Dima, thanks for the review. I've added more detail to the bug report. I hope it'll be enough for descendants to understand why it was needed. Thanks, -- Igor > On Nov 23, 2016, at 4:24 PM, Dmitry Fazunenenko wrote: > > Hi Igor, > > The change itself looks good to me. > > Would you provide a bit more information in the CR? > "register closed @requires property setter" doesn't provide enough information to understand the reasons why it's necessary. > > Thanks, > Dima > > On 23.11.2016 15:46, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the changeset which registers closed vm property setter (for @requires expressions)? 
>> this setter is registered as optional, so test execution won't fail if the file doesn't exist. >> >> webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ >> webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ >> JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 >> >> Thanks, >> -- Igor > From marcus.larsson at oracle.com Thu Nov 24 14:35:37 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Thu, 24 Nov 2016 15:35:37 +0100 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> Message-ID: Hi, On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: > Marcus, > > Thank you for the prompt reply! > > Could you please read comments inline? > I'm looking forward to your reply. > > Thank you. > > Regards, Kirill > > On 22.11.2016 15:32, Marcus Larsson wrote: >> Hi, >> >> >> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>> Marcus, >>> >>> Thank you for reviewing the fix! >>>>> WebRev: >>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>> >>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>> should accept either. You could let sscanf read out the decimal >>>> point as a character and just verify that it is one of the two. >>>> >>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>> that we won't accept "Z" suffixed strings. Please revert that. >>> I agree that ISO8601 could add "Z" to time (and as far as I >>> understand date/time without delimiters is legal too) but these are >>> the unit tests. >>> Hence they cover the existing code and they should pass only if the >>> result corresponds to existing code and fail otherwise. 
>>> The current code from os::iso8601_time formats the date/time string as >>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>> consider any other format as valid. >>> >>> Could you please let me know your opinion? >> >> I think the test should verify the intended behavior, not the >> implementation. If we refactor or change something in iso8601_time() >> we shouldn't be failing the test if it still conforms to ISO8601, IMO. > I would agree with you if we were talking about a functional test. But > since it is a unit test I think we should keep it as close to > the implementation as possible. > If the implementation is changed unintentionally the test fails and > signals us that something is broken. > If it is an intentional change the test must be updated correspondingly. I still think it's unnecessary noise, but if you insist I'm fine with it. If we're not going to accept anything other than the current implementation then you should also remove the if-case for the Z suffix, since the test will fail for that anyway. Thanks, Marcus > >> >> Thanks, >> Marcus >> >>> >>> Thank you. >>> >>> Regards, Kirill >>> >>>> >>>> Thanks, >>>> Marcus >>>> >>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>> >>>>> Thank you. >>>>> >>>>> Regards, Kirill >>>> >>> >> > From vladimir.x.ivanov at oracle.com Thu Nov 24 20:26:45 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 24 Nov 2016 23:26:45 +0300 Subject: RFR(XXS) : register closed @requires property setter In-Reply-To: References: Message-ID: Reviewed. Best regards, Vladimir Ivanov On 11/23/16 3:46 PM, Igor Ignatyev wrote: > Hi all, > > could you please review the changeset which registers closed vm property setter (for @requires expressions)? > this setter is registered as optional, so test execution won't fail if the file doesn't exist. 
> > webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ > webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ > JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 > > Thanks, > ? Igor > From david.holmes at oracle.com Fri Nov 25 10:38:39 2016 From: david.holmes at oracle.com (David Holmes) Date: Fri, 25 Nov 2016 20:38:39 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 The bug is not public unfortunately for non-technical reasons - but see my eval below. Background: if you load the JVM from the primordial thread of a process (not done by the java launcher since JDK 6), there is an artificial stack limit imposed on the initial thread (by sticking the guard page at the limit position of the actual stack) of the minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is ignored for the main thread even if the true stack is, say, 8M. This limitation dates back 10-15 years and is no longer relevant today and should be removed (see below). I've also added additional explanatory notes. webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ Testing was manually done by modifying the launcher to not run the VM in a new thread, and checking the resulting stack size used. This change will only affect hosted JVMs launched with a -Xss value > 2M. Thanks, David ----- Bug eval: JDK-4441425 limits the stack to 8M as a safeguard against an unlimited value from getrlimit in 1.3.1, but further constrained that to 2M in 1.4.0 due to JDK-4466587. 
By 1.4.2 we have the basic form of the current problematic code:

#ifndef IA64
  if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K;
#else
  // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small
  if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K;
#endif

  _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1);

  if (max_size && _initial_thread_stack_size > max_size) {
    _initial_thread_stack_size = max_size;
  }

This was added by JDK-4678676 to allow the stack of the main thread to
be _reduced_ below the default 2M/4M if the -Xss value was smaller than
that.** There was no intent to allow the stack size to follow -Xss
arbitrarily due to the operational constraints imposed by the OS/glibc
at the time when dealing with the primordial process thread.

** It could not actually change the actual stack size of course, but
set the guard pages to limit use to the expected stack size.

In JDK 6, under JDK-6316197, the launcher was changed to create the JVM
in a new thread, so that it was not limited by the idiosyncrasies of the
OS or thread library primordial thread handling. However, the stack size
limitations remained in place in case the VM was launched from the
primordial thread of a user application via the JNI invocation API.

I believe it should be safe to remove the 2M limitation now.
From volker.simonis at gmail.com Fri Nov 25 13:32:37 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 25 Nov 2016 14:32:37 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <5835B6D7.4020101@linux.vnet.ibm.com> References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> <5835B6D7.4020101@linux.vnet.ibm.com> Message-ID: Hi Gustavo, we've realized that we have exactly the same problem on Linux/s390 so I hope you don't mind that I've updated the bug and the webrev to also include the fix for Linux/s390: http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.top/ http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.jdk/ https://bugs.openjdk.java.net/browse/JDK-8170153 The top-level change stays the same (I've only added the current reviewers) and for the jdk change I've just added Linux/s390 as another platform which can compile fdlibm with HIGH optimization. Thanks, Volker On Wed, Nov 23, 2016 at 4:33 PM, Gustavo Romero wrote: > Hi Erik, > > On 23-11-2016 12:29, Erik Joelsson wrote: >> Build changes look ok. >> >> In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. > > Thanks a lot for reviewing the change. > > > Regards, > Gustavo > From erik.joelsson at oracle.com Fri Nov 25 14:06:27 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 25 Nov 2016 15:06:27 +0100 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> <5835B6D7.4020101@linux.vnet.ibm.com> Message-ID: <98b7942d-837e-0166-93de-9ea256bb1ecf@oracle.com> Looks good. 
/Erik On 2016-11-25 14:32, Volker Simonis wrote: > Hi Gustavo, > > we've realized that we have exactly the same problem on Linux/s390 so > I hope you don't mind that I've updated the bug and the webrev to also > include the fix for Linux/s390: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.top/ > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.jdk/ > https://bugs.openjdk.java.net/browse/JDK-8170153 > > The top-level change stays the same (I've only added the current > reviewers) and for the jdk change I've just added Linux/s390 as > another platform which can compile fdlibm with HIGH optimization. > > Thanks, > Volker > > On Wed, Nov 23, 2016 at 4:33 PM, Gustavo Romero > wrote: >> Hi Erik, >> >> On 23-11-2016 12:29, Erik Joelsson wrote: >>> Build changes look ok. >>> >>> In CoreLibraries.gmk, I think it would have been ok to keep the conditional checking (OPENJDK_TARGET_CPU_ARCH, ppc), but this certainly works too. >> Thanks a lot for reviewing the change. >> >> >> Regards, >> Gustavo >> From kirill.zhaldybin at oracle.com Fri Nov 25 17:23:52 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Fri, 25 Nov 2016 20:23:52 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> Message-ID: <583873A8.8000106@oracle.com> Marcus, Here are a new webrev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ I addressed your comment about "if-case for the Z suffix". Could you please let me know your opinion? Thank you. Regards, Kirill On 24.11.2016 17:35, Marcus Larsson wrote: > Hi, > > > On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >> Marcus, >> >> Thank you for prompt reply! >> >> Could you please read comments inline? 
>> I'm looking forward to your reply. >> >> Thank you. >> >> Regards, Kirill >> >> On 22.11.2016 15:32, Marcus Larsson wrote: >>> Hi, >>> >>> >>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>> Marcus, >>>> >>>> Thank you for reviewing the fix! >>>>>> WebRev: >>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>> >>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>> should accept either. You could let sscanf read out the decimal >>>>> point as a character and just verify that it is one of the two. >>>>> >>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>> understand date/time without delimiters is legal too) but these are >>>> the unit tests. >>>> Hence they cover the existing code and they should pass only if the >>>> result corresponds to existing code and fail otherwise. >>>> The current code from os::iso8601_time format date/time string >>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>> consider any other format as valid. >>>> >>>> Could you please let me know your opinion? >>> >>> I think the test should verify the intended behavior, not the >>> implementation. If we refactor or change something in iso8601_time() >>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >> I would agree with you if we were talking about a functional test. But >> since it is an unit test I think we should keep it as close to >> implementation as possible. >> If the implementation is changed unintentionally the test fails and >> signals us that something is broken. >> If it is an intentional change the test must be updated correspondingly. > > I still think it's unnecessary noise, but if you insist I'm fine with it. 
> > If we're not going to accept anything else than the current > implementation then you should also remove the if-case for the Z suffix, > since the test will fail for that anyway. > > Thanks, > Marcus > >> >>> >>> Thanks, >>> Marcus >>> >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>>> >>>>> >>>>> Thanks, >>>>> Marcus >>>>> >>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, Kirill >>>>> >>>> >>> >> > From igor.ignatyev at oracle.com Fri Nov 25 20:01:49 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 25 Nov 2016 23:01:49 +0300 Subject: RFR(XXS) : register closed @requires property setter In-Reply-To: References: Message-ID: <688B8C65-700C-4DAD-B959-BE5429688ACF@oracle.com> Vladimir, thanks a lot for your Review. ? Igor > On Nov 24, 2016, at 11:26 PM, Vladimir Ivanov wrote: > > Reviewed. > > Best regards, > Vladimir Ivanov > > On 11/23/16 3:46 PM, Igor Ignatyev wrote: >> Hi all, >> >> could you please review the changeset which registers closed vm property setter (for @requires expressions)? >> this setter is register as optional, so test execution won?t fail if the file doesn?t exist. >> >> webrev.top : http://cr.openjdk.java.net/~iignatyev/8170228/top/webrev.00/ >> webrev.hotspot : http://cr.openjdk.java.net/~iignatyev/8170228/hotspot/webrev.00/ >> JBS : https://bugs.openjdk.java.net/browse/JDK-8170228 >> >> Thanks, >> ? Igor >> From ioi.lam at oracle.com Mon Nov 28 03:58:19 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Sun, 27 Nov 2016 19:58:19 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <5832A7FC.8030505@oracle.com> References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> Message-ID: <583BAB5B.4020404@oracle.com> I found a problem in my previous patch. 
Here's the fix (on top of the previous patch):

diff -r 3404f61c7081 src/share/vm/oops/method.cpp
--- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800
+++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800
@@ -1031,11 +1031,13 @@
   // leftover methods that weren't linked.
   if (is_shared()) {
     address entry = Interpreter::entry_for_cds_method(h_method);
-    assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry,
+    assert(entry != NULL && entry == _i2i_entry,
            "should be correctly set during dump time");
     if (adapter() != NULL) {
       return;
     }
+    assert(entry == _from_interpreted_entry,
+           "should be correctly set during dump time");
   } else if (_i2i_entry != NULL) {
     return;
   }

The problem is: if the method has been compiled, then a shared method's
_from_interpreted_entry would be different than _i2i_entry (see
Method::set_code()).

I am not sure if Method::link_method() would ever be called after
it's been compiled, but I think it's safer to make the asserts no
stronger than before this patch.

Thanks
- Ioi

On 11/20/16 11:53 PM, Tobias Hartmann wrote:
> Hi Ioi,
>
> this looks good to me, the detailed description including the diagram is very nice and helps to understand the complex implementation!
>
> For the record: the test mentioned in [1] is part of my fix for JDK-8169711.
>
> Best regards,
> Tobias
>
> On 21.11.2016 07:58, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8169867
>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/
>>
>> Thanks to Tobias for finding the bug. I have done the following
>>
>> + integrated Tobias' suggested fix
>> + fixed Method::restore_unshareable_info to call Method::link_method
>> + added comments and a diagram to illustrate how the CDS method entry
>> trampolines work.
>>
>> BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline.
>> It's basically an extra level of indirection to get to the adapter. However.
>> The word "trampoline" usually is used for and extra jump in executable code, >> so it may be a little confusing when we use it for a data pointer here. >> >> Any suggest for a better name? >> >> >> Testing: >> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >> now it produces the correct assertion. I won't check in this test, though, >> since it won't assert anymore after Tobias fixes 8169711. >> >> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 >> # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: >> # should be correctly set during dump time >> >> [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >> All tests passed. >> >> Thanks >> - Ioi >> From david.holmes at oracle.com Mon Nov 28 05:55:35 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Nov 2016 15:55:35 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> References: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> Message-ID: Hi Martin On 24/11/2016 2:20 AM, Doerr, Martin wrote: > Hi David, > > thank you very much for the presentation. I think it provides a good guideline for hotspot development. Thanks. > > Would you like to add something about multi-copy atomicity? Not really. :) > E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). > > It is needed in the following scenario: > - Different threads write 2 variables. > - Readers of these 2 variables expect a globally consistent order of the write accesses. 
> > In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > (While taking a look at it, the condition "#if !(defined SPARC || defined IA32 || defined AMD64)" is not accurate and should better get improved. E.g. s390 is multi-copy atomic.) > > > I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :( Cheers, David > > Thanks and best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On Behalf Of David Holmes > Sent: Mittwoch, 23. November 2016 06:08 > To: hotspot-dev developers > Subject: Presentation: Understanding OrderAccess > > This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. > > http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf > > Cheers, > David > From david.holmes at oracle.com Mon Nov 28 06:08:34 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Nov 2016 16:08:34 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: <0b9c05c9-2d56-d448-550e-1c83d1ed7aec@oracle.com> On 23/11/2016 8:40 PM, Andrew Haley wrote: > On 23/11/16 05:08, David Holmes wrote: >> This is a presentation I recently gave internally to the runtime and >> serviceability teams that may be of more general interest to hotspot >> developers. 
>> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf > > That's pretty cool; nicely done. Thanks Andrew. > I'd quibble about a couple of minor things: > > In Data Race Example: Using Barriers, the use of a naked StoreStore is > rather terrifying. In real-world code it'd be better to use > StoreStore|LoadStore or release unless the author really knows what > they're doing. It would all depend on the exact code of course. The simple flag+data example doesn't require it. > The use of "fence" to mean a full barrier is rather idiosyncratic; it > confused me the first time I saw it in HotSpot source, and from time > to time it still does. Yeah not sure the detailed history there - possibly related to x86 mfence. Cheers, David > But, as I said, these are minor criticisms. > > Andrew. > From marcus.larsson at oracle.com Mon Nov 28 10:06:27 2016 From: marcus.larsson at oracle.com (Marcus Larsson) Date: Mon, 28 Nov 2016 11:06:27 +0100 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: <583873A8.8000106@oracle.com> References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: Hi, On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: > Marcus, > > Here are a new webrev: > http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ Looks ok. Thanks, Marcus > > I addressed your comment about "if-case for the Z suffix". > > Could you please let me know your opinion? > > Thank you. > > Regards, Kirill > > On 24.11.2016 17:35, Marcus Larsson wrote: >> Hi, >> >> >> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>> Marcus, >>> >>> Thank you for prompt reply! >>> >>> Could you please read comments inline? >>> I'm looking forward to your reply. >>> >>> Thank you. 
>>> >>> Regards, Kirill >>> >>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>> Hi, >>>> >>>> >>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>> Marcus, >>>>> >>>>> Thank you for reviewing the fix! >>>>>>> WebRev: >>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>> >>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>> should accept either. You could let sscanf read out the decimal >>>>>> point as a character and just verify that it is one of the two. >>>>>> >>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>> understand date/time without delimiters is legal too) but these are >>>>> the unit tests. >>>>> Hence they cover the existing code and they should pass only if the >>>>> result corresponds to existing code and fail otherwise. >>>>> The current code from os::iso8601_time format date/time string >>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>> consider any other format as valid. >>>>> >>>>> Could you please let me know your opinion? >>>> >>>> I think the test should verify the intended behavior, not the >>>> implementation. If we refactor or change something in iso8601_time() >>>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >>> I would agree with you if we were talking about a functional test. But >>> since it is an unit test I think we should keep it as close to >>> implementation as possible. >>> If the implementation is changed unintentionally the test fails and >>> signals us that something is broken. >>> If it is an intentional change the test must be updated >>> correspondingly. >> >> I still think it's unnecessary noise, but if you insist I'm fine with >> it. 
>> >> If we're not going to accept anything else than the current >> implementation then you should also remove the if-case for the Z suffix, >> since the test will fail for that anyway. >> >> Thanks, >> Marcus >> >>> >>>> >>>> Thanks, >>>> Marcus >>>> >>>>> >>>>> Thank you. >>>>> >>>>> Regards, Kirill >>>>> >>>>>> >>>>>> Thanks, >>>>>> Marcus >>>>>> >>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Regards, Kirill >>>>>> >>>>> >>>> >>> >> > From martin.doerr at sap.com Mon Nov 28 10:43:22 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 28 Nov 2016 10:43:22 +0000 Subject: Presentation: Understanding OrderAccess Message-ID: Hi David, I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). The term "multiple-copy atomicity" is described as "... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. Since you have asked about C++11, there's an example implementation for PPC [3]. Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. 
But I guess the Java memory model is beyond the scope of your presentation. Best regards, Martin [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf [2] http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030212.html [3] http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 06:56 To: Doerr, Martin ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess Hi Martin On 24/11/2016 2:20 AM, Doerr, Martin wrote: > Hi David, > > thank you very much for the presentation. I think it provides a good guideline for hotspot development. Thanks. > > Would you like to add something about multi-copy atomicity? Not really. :) > E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). > > It is needed in the following scenario: > - Different threads write 2 variables. > - Readers of these 2 variables expect a globally consistent order of the write accesses. > > In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > (While taking a look at it, the condition "#if !(defined SPARC || > defined IA32 || defined AMD64)" is not accurate and should better get > improved. E.g. s390 is multi-copy atomic.) > > > I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? 
:( Cheers, David > > Thanks and best regards, > Martin > > > -----Original Message----- > From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On > Behalf Of David Holmes > Sent: Mittwoch, 23. November 2016 06:08 > To: hotspot-dev developers > Subject: Presentation: Understanding OrderAccess > > This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. > > http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderA > ccess-v1.1.pdf > > Cheers, > David > From aph at redhat.com Mon Nov 28 10:50:55 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 28 Nov 2016 10:50:55 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <0b9c05c9-2d56-d448-550e-1c83d1ed7aec@oracle.com> References: <0b9c05c9-2d56-d448-550e-1c83d1ed7aec@oracle.com> Message-ID: <8f4d6742-3592-7539-b176-028522ac2d32@redhat.com> On 28/11/16 06:08, David Holmes wrote: > On 23/11/2016 8:40 PM, Andrew Haley wrote: >> On 23/11/16 05:08, David Holmes wrote: > >> I'd quibble about a couple of minor things: >> >> In Data Race Example: Using Barriers, the use of a naked StoreStore is >> rather terrifying. In real-world code it'd be better to use >> StoreStore|LoadStore or release unless the author really knows what >> they're doing. > > It would all depend on the exact code of course. The simple flag+data > example doesn't require it. Ya, but it's a rare case: it's a bit like teaching someone to use a chainsaw before they've learned to use a knife and fork. :-) Andrew. 
From aph at redhat.com Mon Nov 28 10:59:09 2016 From: aph at redhat.com (Andrew Haley) Date: Mon, 28 Nov 2016 10:59:09 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: <5575cc0d6c7843b988c896b29caaf124@dewdfe13de06.global.corp.sap> Message-ID: <681454af-691b-268b-8328-a636b67a8afa@redhat.com> On 28/11/16 05:55, David Holmes wrote: > I still can't get my head around the C++11 terminology for this and how > you are expected to use it - what does it mean for an individual > operation to be "sequentially consistent" ? :( It means that a set of atomic::seq_cst loads and stores form a sequentially consistent order. So, if your program uses *only* atomic::seq_cst operations, "... the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program." Andrew. Leslie Lamport, "How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs", IEEE Trans. Comput. C-28,9 (Sept. 1979), 690-691. From igor.ignatyev at oracle.com Mon Nov 28 12:19:05 2016 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 28 Nov 2016 15:19:05 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: Hi Kirill, looks good to me, thanks for fixing that. Cheers, ? Igor > On Nov 28, 2016, at 1:06 PM, Marcus Larsson wrote: > > Hi, > > > On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: >> Marcus, >> >> Here are a new webrev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ > > Looks ok. 
> > Thanks, > Marcus > >> >> I addressed your comment about "if-case for the Z suffix". >> >> Could you please let me know your opinion? >> >> Thank you. >> >> Regards, Kirill >> >> On 24.11.2016 17:35, Marcus Larsson wrote: >>> Hi, >>> >>> >>> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>>> Marcus, >>>> >>>> Thank you for prompt reply! >>>> >>>> Could you please read comments inline? >>>> I'm looking forward to your reply. >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>>> >>>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>>> Hi, >>>>> >>>>> >>>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>>> Marcus, >>>>>> >>>>>> Thank you for reviewing the fix! >>>>>>>> WebRev: >>>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>>> >>>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>>> should accept either. You could let sscanf read out the decimal >>>>>>> point as a character and just verify that it is one of the two. >>>>>>> >>>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>>> understand date/time without delimiters is legal too) but these are >>>>>> the unit tests. >>>>>> Hence they cover the existing code and they should pass only if the >>>>>> result corresponds to existing code and fail otherwise. >>>>>> The current code from os::iso8601_time format date/time string >>>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>>> consider any other format as valid. >>>>>> >>>>>> Could you please let me know your opinion? >>>>> >>>>> I think the test should verify the intended behavior, not the >>>>> implementation. If we refactor or change something in iso8601_time() >>>>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >>>> I would agree with you if we were talking about a functional test. 
But >>>> since it is an unit test I think we should keep it as close to >>>> implementation as possible. >>>> If the implementation is changed unintentionally the test fails and >>>> signals us that something is broken. >>>> If it is an intentional change the test must be updated correspondingly. >>> >>> I still think it's unnecessary noise, but if you insist I'm fine with it. >>> >>> If we're not going to accept anything else than the current >>> implementation then you should also remove the if-case for the Z suffix, >>> since the test will fail for that anyway. >>> >>> Thanks, >>> Marcus >>> >>>> >>>>> >>>>> Thanks, >>>>> Marcus >>>>> >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, Kirill >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Marcus >>>>>>> >>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>>> >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Regards, Kirill >>>>>>> >>>>>> >>>>> >>>> >>> >> > From tobias.hartmann at oracle.com Mon Nov 28 12:40:33 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 28 Nov 2016 13:40:33 +0100 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <583BAB5B.4020404@oracle.com> References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> <583BAB5B.4020404@oracle.com> Message-ID: <583C25C1.1000605@oracle.com> Hi Ioi, On 28.11.2016 04:58, Ioi Lam wrote: > I found a problem in my previous patch. Here's the fix (on top of he previous patch): > > diff -r 3404f61c7081 src/share/vm/oops/method.cpp > --- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800 > +++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800 > @@ -1031,11 +1031,13 @@ > // leftover methods that weren't linked. 
> if (is_shared()) { > address entry = Interpreter::entry_for_cds_method(h_method); > - assert(entry != NULL && entry == _i2i_entry && entry == _from_interpreted_entry, > + assert(entry != NULL && entry == _i2i_entry, > "should be correctly set during dump time"); > if (adapter() != NULL) { > return; > } > + assert(entry == _from_interpreted_entry, > + "should be correctly set during dump time"); > } else if (_i2i_entry != NULL) { > return; > } > > The problem is: if the method has been compiled, then a shared method's > _from_interpreted_entry would be different than _i2i_entry (see > Method::set_code()). > > I am not sure if Method::link_method() would ever be called after > it's been compiled, but I think it's safer to make the asserts no > stronger than before this patch. That looks reasonable to me! Thanks, Tobias > Thanks > - Ioi > > > On 11/20/16 11:53 PM, Tobias Hartmann wrote: >> Hi Ioi, >> >> this looks good to me, the detailed description including the diagram is very nice and helps to understand the complex implementation! >> >> For the record: the test mentioned in [1] is part of my fix for JDK-8169711. >> >> Best regards, >> Tobias >> >> On 21.11.2016 07:58, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>> >>> Thanks to Tobias for finding the bug. I have done the following >>> >>> + integrated Tobias' suggested fix >>> + fixed Method::restore_unshareable_info to call Method::link_method >>> + added comments and a diagram to illustrate how the CDS method entry >>> trampolines work. >>> >>> BTW, I am a little unhappy about the name ConstMethod::_adapter_trampoline. >>> It's basically an extra level of indirection to get to the adapter. However. >>> The word "trampoline" usually is used for and extra jump in executable code, >>> so it may be a little confusing when we use it for a data pointer here. >>> >>> Any suggest for a better name? 
>>> >>> >>> Testing: >>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>> now it produces the correct assertion. I won't check in this test, though, >>> since it won't assert anymore after Tobias fixes 8169711. >>> >>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), pid=16840, tid=16843 >>> # assert(entry != __null && entry == _i2i_entry && entry == _from_interpreted_entry) failed: >>> # should be correctly set during dump time >>> >>> [2] Ran RBT in fastdebug build for hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>> All tests passed. >>> >>> Thanks >>> - Ioi >>> > From david.holmes at oracle.com Mon Nov 28 12:55:51 2016 From: david.holmes at oracle.com (David Holmes) Date: Mon, 28 Nov 2016 22:55:51 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: Hi Martin, On 28/11/2016 8:43 PM, Doerr, Martin wrote: > Hi David, > > I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. > I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). > > The term "multiple-copy atomicity" is described as > "... in a machine which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". > > I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. > The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". Thanks for the reminder of that discussion. :) > A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. 
Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose. > Since you have asked about C++11, there's an example implementation for PPC [3]. > Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" loads observe writes in a globally consistent order. 
:) > >> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >> >> It is needed in the following scenario: >> - Different threads write 2 variables. >> - Readers of these 2 variables expect a globally consistent order of the write accesses. >> >> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". > > Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > >> (While taking a look at it, the condition "#if !(defined SPARC || >> defined IA32 || defined AMD64)" is not accurate and should better get >> improved. E.g. s390 is multi-copy atomic.) >> >> >> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. > > I still can't get my head around the C++11 terminology for this and how you are expected to use it - what does it mean for an individual operation to be "sequentially consistent" ? :( > > Cheers, > David > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Mittwoch, 23. November 2016 06:08 >> To: hotspot-dev developers >> Subject: Presentation: Understanding OrderAccess >> >> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. 
>> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderA >> ccess-v1.1.pdf >> >> Cheers, >> David >> From kirill.zhaldybin at oracle.com Mon Nov 28 13:01:25 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 28 Nov 2016 16:01:25 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: Markus, Thank you for review! Regards, Kirill On 28.11.2016 13:06, Marcus Larsson wrote: > Hi, > > > On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: >> Marcus, >> >> Here are a new webrev: >> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ > > Looks ok. > > Thanks, > Marcus > >> >> I addressed your comment about "if-case for the Z suffix". >> >> Could you please let me know your opinion? >> >> Thank you. >> >> Regards, Kirill >> >> On 24.11.2016 17:35, Marcus Larsson wrote: >>> Hi, >>> >>> >>> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>>> Marcus, >>>> >>>> Thank you for prompt reply! >>>> >>>> Could you please read comments inline? >>>> I'm looking forward to your reply. >>>> >>>> Thank you. >>>> >>>> Regards, Kirill >>>> >>>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>>> Hi, >>>>> >>>>> >>>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>>> Marcus, >>>>>> >>>>>> Thank you for reviewing the fix! >>>>>>>> WebRev: >>>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>>>> >>>>>>> >>>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>>> should accept either. You could let sscanf read out the decimal >>>>>>> point as a character and just verify that it is one of the two. 
>>>>>>> >>>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>>> understand date/time without delimiters is legal too) but these are >>>>>> the unit tests. >>>>>> Hence they cover the existing code and they should pass only if the >>>>>> result corresponds to existing code and fail otherwise. >>>>>> The current code from os::iso8601_time format date/time string >>>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>>> consider any other format as valid. >>>>>> >>>>>> Could you please let me know your opinion? >>>>> >>>>> I think the test should verify the intended behavior, not the >>>>> implementation. If we refactor or change something in iso8601_time() >>>>> we shouldn't be failing the test if it still conforms to ISO8601, >>>>> IMO. >>>> I would agree with you if we were talking about a functional test. But >>>> since it is an unit test I think we should keep it as close to >>>> implementation as possible. >>>> If the implementation is changed unintentionally the test fails and >>>> signals us that something is broken. >>>> If it is an intentional change the test must be updated >>>> correspondingly. >>> >>> I still think it's unnecessary noise, but if you insist I'm fine >>> with it. >>> >>> If we're not going to accept anything else than the current >>> implementation then you should also remove the if-case for the Z >>> suffix, >>> since the test will fail for that anyway. >>> >>> Thanks, >>> Marcus >>> >>>> >>>>> >>>>> Thanks, >>>>> Marcus >>>>> >>>>>> >>>>>> Thank you. >>>>>> >>>>>> Regards, Kirill >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Marcus >>>>>>> >>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>>> >>>>>>>> Thank you. 
>>>>>>>> >>>>>>>> Regards, Kirill >>>>>>> >>>>>> >>>>> >>>> >>> >> > From kirill.zhaldybin at oracle.com Mon Nov 28 13:01:53 2016 From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin) Date: Mon, 28 Nov 2016 16:01:53 +0300 Subject: RFR(XS): 8169003: LogDecorations.iso8601_utctime_test fails if numeric locale uses ", " as separator between integer and fraction part In-Reply-To: References: <60ea633b-23b7-ee2f-27c6-9f0c754a7ec6@oracle.com> <85e51138-f3ce-7392-2cc3-ce7840aa3747@oracle.com> <23fcd50a-eed0-52a5-8817-244ecf75bb2a@oracle.com> <583873A8.8000106@oracle.com> Message-ID: <0b1cf3ff-5317-cd1c-e5e3-e43ef215bdb7@oracle.com> Igor, Thank you for review! Regards, Kirill On 28.11.2016 15:19, Igor Ignatyev wrote: > Hi Kirill, > > looks good to me, thanks for fixing that. > > Cheers, > ? Igor > >> On Nov 28, 2016, at 1:06 PM, Marcus Larsson wrote: >> >> Hi, >> >> >> On 11/25/2016 06:23 PM, Kirill Zhaldybin wrote: >>> Marcus, >>> >>> Here are a new webrev: http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.01/ >> Looks ok. >> >> Thanks, >> Marcus >> >>> I addressed your comment about "if-case for the Z suffix". >>> >>> Could you please let me know your opinion? >>> >>> Thank you. >>> >>> Regards, Kirill >>> >>> On 24.11.2016 17:35, Marcus Larsson wrote: >>>> Hi, >>>> >>>> >>>> On 11/22/2016 02:24 PM, Kirill Zhaldybin wrote: >>>>> Marcus, >>>>> >>>>> Thank you for prompt reply! >>>>> >>>>> Could you please read comments inline? >>>>> I'm looking forward to your reply. >>>>> >>>>> Thank you. >>>>> >>>>> Regards, Kirill >>>>> >>>>> On 22.11.2016 15:32, Marcus Larsson wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> On 2016-11-21 17:38, Kirill Zhaldybin wrote: >>>>>>> Marcus, >>>>>>> >>>>>>> Thank you for reviewing the fix! >>>>>>>>> WebRev: >>>>>>>>> http://cr.openjdk.java.net/~kzhaldyb/webrevs/JDK-8169003/webrev.00/ >>>>>>>> ISO8601 says the decimal point can be either '.' or ',' so the test >>>>>>>> should accept either. 
You could let sscanf read out the decimal >>>>>>>> point as a character and just verify that it is one of the two. >>>>>>>> >>>>>>>> In the UTC test you changed ASSERT_GT to ASSERT_EQ, which means >>>>>>>> that we won't accept "Z" suffixed strings. Please revert that. >>>>>>> I agree that ISO8601 could add "Z" to time (and as far as I >>>>>>> understand date/time without delimiters is legal too) but these are >>>>>>> the unit tests. >>>>>>> Hence they cover the existing code and they should pass only if the >>>>>>> result corresponds to existing code and fail otherwise. >>>>>>> The current code from os::iso8601_time format date/time string >>>>>>> %04d-%02d-%02dT%02d:%02d:%02d.%03d%c%02d%02d so we should not >>>>>>> consider any other format as valid. >>>>>>> >>>>>>> Could you please let me know your opinion? >>>>>> I think the test should verify the intended behavior, not the >>>>>> implementation. If we refactor or change something in iso8601_time() >>>>>> we shouldn't be failing the test if it still conforms to ISO8601, IMO. >>>>> I would agree with you if we were talking about a functional test. But >>>>> since it is an unit test I think we should keep it as close to >>>>> implementation as possible. >>>>> If the implementation is changed unintentionally the test fails and >>>>> signals us that something is broken. >>>>> If it is an intentional change the test must be updated correspondingly. >>>> I still think it's unnecessary noise, but if you insist I'm fine with it. >>>> >>>> If we're not going to accept anything else than the current >>>> implementation then you should also remove the if-case for the Z suffix, >>>> since the test will fail for that anyway. >>>> >>>> Thanks, >>>> Marcus >>>> >>>>>> Thanks, >>>>>> Marcus >>>>>> >>>>>>> Thank you. >>>>>>> >>>>>>> Regards, Kirill >>>>>>> >>>>>>>> Thanks, >>>>>>>> Marcus >>>>>>>> >>>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8169003 >>>>>>>>> >>>>>>>>> Thank you. 
>>>>>>>>> >>>>>>>>> Regards, Kirill From gromero at linux.vnet.ibm.com Mon Nov 28 13:24:40 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 28 Nov 2016 11:24:40 -0200 Subject: RFR(s) 8170153: PPC64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583394C5.3030206@linux.vnet.ibm.com> <9327e543-f35b-b88f-2831-a51e265b6a30@oracle.com> <5835B6D7.4020101@linux.vnet.ibm.com> Message-ID: <583C3018.5080109@linux.vnet.ibm.com> Hi Volker, Sorry for not replying earlier, it was day-off on Friday here... On 25-11-2016 11:32, Volker Simonis wrote: > Hi Gustavo, > > we've realized that we have exactly the same problem on Linux/s390 so > I hope you don't mind that I've updated the bug and the webrev to also > include the fix for Linux/s390: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.top/ > http://cr.openjdk.java.net/~simonis/webrevs/2016/8170153.jdk/ > https://bugs.openjdk.java.net/browse/JDK-8170153 > > The top-level change stays the same (I've only added the current > reviewers) and for the jdk change I've just added Linux/s390 as > another platform which can compile fdlibm with HIGH optimization. Actually, it's really cool to know that an analysis on PPC64 contributed also to the s390 arch! :) Thanks for providing the updated webrevs. Regards, Gustavo From stefan.karlsson at oracle.com Mon Nov 28 13:52:20 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 14:52:20 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist Message-ID: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Hi all, Please, review this patch to fix metaspace initialization. 
http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8170395 The fix for JDK-8169931 introduced a new assert to ensure that we always try to allocate chunks that are any of the three fixed sizes (specialized, small, medium) or a humongous chunk (if it is larger than the medium chunk size). During metaspace initialization an initial metaspace chunk is allocated. The size of some of the metaspace instances can be specified on the command line. For example: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version If this size is smaller than the medium chunk size and at the same time doesn't match the specialized or small chunk size, then we end up hitting the assert mentioned above: # # Internal Error (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, tid=31646 # assert(size > free_chunks(MediumIndex)->size()) failed: Not a humongous chunk # ======================================================================== The most important part of the fix is this line: + // Adjust to one of the fixed chunk sizes (unless humongous) + const size_t adjusted = adjust_initial_chunk_size(requested); which ensures that we always request one of the specialized, small, medium, or humongous chunk sizes, even if the requested size is none of these. Most of the other code is refactoring to unify the non-class metaspace and the class metaspace code paths to get rid of some of the existing code duplication, bring the chunk size calculation nearer to the actual chunk allocation, and make it easier to write a unit test for the new adjust_initial_chunk_size function. ======================================================================== The patch for JDK-8169931 was backed out with JDK-8170355 and will be reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
Testing: jprt, unit test, parts of PIT testing (including CDS tests), failing test Thanks, StefanK From michail.chernov at oracle.com Mon Nov 28 13:57:23 2016 From: michail.chernov at oracle.com (Michail Chernov) Date: Mon, 28 Nov 2016 16:57:23 +0300 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Stefan, Could you please add simple regression test for this case? Thanks, Michail On 28.11.2016 16:52, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we > always try to allocate chunks that are any of the three fixed sizes > (specialized, small, medium) or a humongous chunk (if it is larger > then the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. 
For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same > time doesn't match the specialized or small chunk size, then we end up > hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither > of these. > > Most of the other code is refactoring to unify the non-class metaspace > and the class metaspace code paths to get rid of some of the existing > code duplication, bring the chunk size calculation nearer to the the > actual chunk allocation, and make it easier to write a unit test for > the new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. > > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK From martin.doerr at sap.com Mon Nov 28 14:37:16 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 28 Nov 2016 14:37:16 +0000 Subject: [JUNK] Re: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: <0acb8779574543ff80607e460a81061f@dewdfe13de06.global.corp.sap> Hi David, > Problem there, I think, is that fence() is really not special in that regard. 
You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try: I think the comment in orderAccess.hpp is not bad: // Finally, we define a "fence" operation, as a bidirectional barrier. // It guarantees that any memory access preceding the fence is not // reordered w.r.t. any memory accesses subsequent to the fence in program // order. One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. The same is valid for B. Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... > but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. 
Best regards, Martin [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering [5] http://g.oswego.edu/dl/jmm/cookbook.html -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 13:56 To: Doerr, Martin ; hotspot-dev developers Subject: [JUNK] Re: Presentation: Understanding OrderAccess Hi Martin, On 28/11/2016 8:43 PM, Doerr, Martin wrote: > Hi David, > > I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. > I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). > > The term "multiple-copy atomicity" is described as "... in a machine > which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". > > I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. > The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". Thanks for the reminder of that discussion. :) > A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomicarchitectures to use just for this purpose. > Since you have asked about C++11, there's an example implementation for PPC [3]. > Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. 
Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. > But I guess the Java memory model is beyond the scope of your presentation. Oh yes way out of scope! :) Cheers, David > Best regards, > Martin > > > [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf > [2] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030 > 212.html [3] > http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Montag, 28. November 2016 06:56 > To: Doerr, Martin ; hotspot-dev developers > > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin > > On 24/11/2016 2:20 AM, Doerr, Martin wrote: >> Hi David, >> >> thank you very much for the presentation. I think it provides a good guideline for hotspot development. > > Thanks. > >> >> Would you like to add something about multi-copy atomicity? > > Not really. :) > >> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >> >> It is needed in the following scenario: >> - Different threads write 2 variables. >> - Readers of these 2 variables expect a globally consistent order of the write accesses. >> >> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". > > Hmmm ... I know this code was discussed at length a couple of years ago ... 
and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > >> (While taking a look at it, the condition "#if !(defined SPARC || >> defined IA32 || defined AMD64)" is not accurate and should better get >> improved. E.g. s390 is multi-copy atomic.) >> >> >> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. > > I still can't get my head around the C++11 terminology for this and > how you are expected to use it - what does it mean for an individual > operation to be "sequentially consistent" ? :( > > Cheers, > David > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Mittwoch, 23. November 2016 06:08 >> To: hotspot-dev developers >> Subject: Presentation: Understanding OrderAccess >> >> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. >> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-Order >> A >> ccess-v1.1.pdf >> >> Cheers, >> David >> From gromero at linux.vnet.ibm.com Mon Nov 28 16:28:00 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 28 Nov 2016 14:28:00 -0200 Subject: RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation Message-ID: <583C5B10.8040204@linux.vnet.ibm.com> Hi all, I'm re-sending due to JDK title update to include s390x and aarch64 archs. Could the following webrev be reviewed, please? webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/v2/ webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/v2/jdk/ bug: https://bugs.openjdk.java.net/browse/JDK-8170153 Thank you. 
Regards, Gustavo From martin.doerr at sap.com Mon Nov 28 16:29:28 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 28 Nov 2016 16:29:28 +0000 Subject: Presentation: Understanding OrderAccess Message-ID: Hi David, sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that. > Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try: I think the comment in orderAccess.hpp is not bad: // Finally, we define a "fence" operation, as a bidirectional barrier. // It guarantees that any memory access preceding the fence is not // reordered w.r.t. any memory accesses subsequent to the fence in program // order. One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... > but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? 
"Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. Best regards, Martin [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering [5] http://g.oswego.edu/dl/jmm/cookbook.html -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 13:56 To: Doerr, Martin ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess Hi Martin, On 28/11/2016 8:43 PM, Doerr, Martin wrote: > Hi David, > > I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. > I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). > > The term "multiple-copy atomicity" is described as "... in a machine > which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". > > I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. > The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". Thanks for the reminder of that discussion. :) > A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. Problem there, I think, is that fence() is really not special in that regard. 
You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomicarchitectures to use just for this purpose. > Since you have asked about C++11, there's an example implementation for PPC [3]. > Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. > But I guess the Java memory model is beyond the scope of your presentation. Oh yes way out of scope! :) Cheers, David > Best regards, > Martin > > > [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf > [2] > http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030 > 212.html [3] > http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Montag, 28. November 2016 06:56 > To: Doerr, Martin ; hotspot-dev developers > > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin > > On 24/11/2016 2:20 AM, Doerr, Martin wrote: >> Hi David, >> >> thank you very much for the presentation. I think it provides a good guideline for hotspot development. > > Thanks. > >> >> Would you like to add something about multi-copy atomicity? > > Not really. :) > >> E.g. 
there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >> >> It is needed in the following scenario: >> - Different threads write 2 variables. >> - Readers of these 2 variables expect a globally consistent order of the write accesses. >> >> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". > > Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... > >> (While taking a look at it, the condition "#if !(defined SPARC || >> defined IA32 || defined AMD64)" is not accurate and should better get >> improved. E.g. s390 is multi-copy atomic.) >> >> >> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. > > I still can't get my head around the C++11 terminology for this and > how you are expected to use it - what does it mean for an individual > operation to be "sequentially consistent" ? :( > > Cheers, > David > >> >> Thanks and best regards, >> Martin >> >> >> -----Original Message----- >> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >> Behalf Of David Holmes >> Sent: Mittwoch, 23. November 2016 06:08 >> To: hotspot-dev developers >> Subject: Presentation: Understanding OrderAccess >> >> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. 
>> >> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf >> >> Cheers, >> David >> From mikael.gerdin at oracle.com Mon Nov 28 16:45:08 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Mon, 28 Nov 2016 17:45:08 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Hi Stefan, On 2016-11-28 14:52, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ Overall I think this change looks good. One thing I noticed is that the first parameter to VirtualSpaceList::get_new_chunk is actually ignored so you might want to just get rid of it, it's just confusing to see it. If you decide to do something about get_new_chunk I think it wouldn't hurt to have the names of the parameters changed as well, "grow_chunks_by_words" is actually "requested_chunk_size" and "medium_chunk_bunch" could be something like "suggested_commit_granularity". You might want to make the "const size_t" constants you moved out of the enum to either be "static" (which would be static in the C-sense) or add them in an anonymous namespace since otherwise they will pollute the global symbol namespace (more so than an enum which is strictly file scoped). The rest of the change looks good to me. /Mikael > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we always > try to allocate chunks that are any of the three fixed sizes > (specialized, small, medium) or a humongous chunk (if it is larger than > the medium chunk size). > > During metaspace initialization an initial metaspace chunk is allocated. 
> The size of some of the metaspace instances can be specified on the > command line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same time > doesn't match the specialized or small chunk size, then we end up > hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither > of these. > > Most of the other code is refactoring to unify the non-class metaspace > and the class metaspace code paths to get rid of some of the existing > code duplication, bring the chunk size calculation nearer to the > actual chunk allocation, and make it easier to write a unit test for the > new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK From thomas.stuefe at gmail.com Mon Nov 28 16:48:11 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 28 Nov 2016 17:48:11 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Stefan, This looks good. Some small remarks: - Metaspace::verify_initialized () : could be made static. For clarity I'd also either rename it to something like "verify_global_initialization" or to just roll the code out into its only caller, Metaspace::initialize(). - I never liked the ChunkSizes enum names, because they do not indicate they are sizes, and now that the encompassing enum name "ChunkSizes" is gone they are even less clear. Would it be possible to rename the former enum values to "...Size" for better code clarity, e.g. "MediumChunkSize" instead of "MediumChunk"? - Metaspace::get_space_manager(MetadataType mdtype) - asserting for mdType==Class||NonClassType instead of != MetadaTypeCount could be a bit clearer. - SpaceManager::adjust_initial_chunk_size () - could we rename this to a more generic name like "::next_larger_chunksize" or similar? I also wonder whether this could be combined somehow with SpaceManager::calc_chunk_size(), which wants to do something similar (calculate a fitting chunk size for a given smaller allocation size) Kind Regards, Thomas On Mon, Nov 28, 2016 at 2:52 PM, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. 
> > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we always > try to allocate chunks that are any of the three fixed sizes (specialized, > small, medium) or a humongous chunk (if it is larger then the medium chunk > size). > > During metaspace initialization an initial metaspace chunk is allocated. > The size of some of the metaspace instances can be specified on the command > line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same time > doesn't match the specialized or small chunk size, then we end up hitting > the assert mentioned above: > # > # Internal Error (/scratch/opt/jprt/T/P1/142848 > .erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, > tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a humongous > chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither of > these. > > Most of the other code is refactoring to unify the non-class metaspace and > the class metaspace code paths to get rid of some of the existing code > duplication, bring the chunk size calculation nearer to the the actual > chunk allocation, and make it easier to write a unit test for the new > adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK > From erik.joelsson at oracle.com Mon Nov 28 16:55:00 2016 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Mon, 28 Nov 2016 17:55:00 +0100 Subject: RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583C5B10.8040204@linux.vnet.ibm.com> References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: <1b332dd2-aa9f-e24b-faaf-b95eacd11dac@oracle.com> Looks good. /Erik On 2016-11-28 17:28, Gustavo Romero wrote: > Hi all, > > I'm re-sending due to JDK title update to include s390x and aarch64 archs. > > Could the following webrev be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/v2/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/v2/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > Thank you. > > > Regards, > Gustavo > From thomas.stuefe at gmail.com Mon Nov 28 16:58:37 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 28 Nov 2016 17:58:37 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Message-ID: On Mon, Nov 28, 2016 at 5:45 PM, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-28 14:52, Stefan Karlsson wrote: > >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> > > Overall I think this change looks good. > One thing I noticed is that the first parameter to > VirtualSpaceList::get_new_chunk > is actually ignored so you might want to just get rid of it, it's just > confusing to see it. 
If you decide to do something about get_new_chunk I > think it wouldn't hurt to have the names of the parameters changed as well, > "grow_chunks_by_words" is actually "requested_chunk_size" and > "medium_chunk_bunch" could be something like "suggested_commit_granularity" > > +1 to that, this would make the code quite a bit clearer. I also had a hard time understanding the "make_current" flag in SpaceManager::add_chunk() until I (hope I) understood that it only matters for humongous chunks where we differentiate between (a) preallocating a still-unused humongous chunk for future allocations (initial chunk) or (b) allocating a humongous chunk for immediate consumption by a larger-than-medium-chunk memory request. I never saw (b) in real life, however, the only humongous chunks I ever see are the initial chunks. Does this ever happen? > You might want to make the "const size_t" constants you moved out of the > enum to either be "static" (which would be static in the C-sense) or add > them in an anonymous namespace since otherwise they will pollute the global > symbol namespace (more so than an enum which is strictly file scoped). > > The rest of the change looks good to me. > > /Mikael > > > https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we always >> try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger then >> the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is allocated. >> The size of some of the metaspace instances can be specified on the >> command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same time >> doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/ >> memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for the >> new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK >> > From stefan.karlsson at oracle.com Mon Nov 28 18:44:20 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 19:44:20 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <8142bf8c-0fda-5f15-4747-452ac09b578e@oracle.com> Hi Thomas, On 2016-11-28 17:48, Thomas Stüfe wrote: > Hi Stefan, > > This looks good. Thanks. > Some small remarks: > > - Metaspace::verify_initialized () : could be made static. For clarity > I'd also either rename it to something like > "verify_global_initialization" or to just roll the code out into its > only caller, Metaspace::initialize(). I'll rename the function and make it static. Personally, I want verbose verification and debugging code to get out of the way of the other code. That's why I moved it to a separate function. > > - I never liked the ChunkSizes enum names, because they do not > indicate they are sizes, and now that the encompassing enum name > "ChunkSizes" is gone they are even less clear. Would it be possible to > rename the former enum values to "...Size" for better code clarity, > e.g. "MediumChunkSize" instead of "MediumChunk"? I sort of agree, but changing it will affect large parts of metaspace.cpp, which makes it hard to see the other changes in this patch. I'd rather revert back to the enum, and maybe deal with that cleanup as a separate enhancement. > > - Metaspace::get_space_manager(MetadataType mdtype) - asserting for > mdType==Class||NonClassType instead of != MetadataTypeCount could be a > bit clearer. The assert is copied from the other getters in the file, so I'd like to keep it for consistency. 
Maybe we should get rid of MetadataTypeCount and that assert, and let the code that converts back and forth between MetadataType and integers do the assert check? That would need to be handled as a separate enhancement. > > - SpaceManager::adjust_initial_chunk_size () - could we rename this to > a more generic name like "::next_larger_chunksize" or similar? I chose the name because it is a helper for a specific use-case and call site. I also considered giving it a more generic name, but I couldn't immediately come up with a name that accurately described the function. The proposed next_larger_chunksize isn't describing the function correctly, since adjust_initial_chunk_size(SmallChunk) returns SmallChunk and not MediumChunk. If we can figure out a spot-on name, I'd be happy to change it. > I also wonder whether this could be combined somehow with > SpaceManager::calc_chunk_size(), which wants to do something similar > (calculate a fitting chunk size for a given smaller allocation size) I briefly thought about that as well, but then skipped that thought because of the heuristics involved in calc_chunk_size(). Thanks for reviewing, StefanK > > > Kind Regards, Thomas > > > On Mon, Nov 28, 2016 at 2:52 PM, Stefan Karlsson > > wrote: > > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > > https://bugs.openjdk.java.net/browse/JDK-8170395 > > > The fix for JDK-8169931 introduced a new assert to ensure that we > always try to allocate chunks that are any of the three fixed > sizes (specialized, small, medium) or a humongous chunk (if it is > larger than the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. 
For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same > time doesn't match the specialized or small chunk size, then we > end up hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, > small, medium, or humongous chunk size, even if the requested size > is neither of these. > > Most of the other code is refactoring to unify the non-class > metaspace and the class metaspace code paths to get rid of some of > the existing code duplication, bring the chunk size calculation > nearer to the the actual chunk allocation, and make it easier to > write a unit test for the new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will > be reintroduced as JDK-8170358 when this patch has been reviewed > and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS > tests), failing test > > Thanks, > StefanK > > From kim.barrett at oracle.com Mon Nov 28 18:49:30 2016 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 28 Nov 2016 13:49:30 -0500 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Message-ID: <2C5018D2-204D-4C81-96C5-E003941DA731@oracle.com> > On Nov 28, 2016, at 11:45 AM, Mikael Gerdin wrote: > You might want to make the "const size_t" constants you moved out of the enum to either be "static" (which would be static in the C-sense) or add them in an anonymous namespace since otherwise they will pollute the global symbol namespace (more so than an enum which is strictly file scoped). C++ const declarations at namespace scope have internal linkage unless explicitly declared to have external linkage. From stefan.karlsson at oracle.com Mon Nov 28 19:23:32 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 20:23:32 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <2044737c-c2d2-e51b-b333-d65b300b1dc1@oracle.com> Message-ID: Hi Mikael, On 2016-11-28 17:45, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-28 14:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > > Overall I think this change looks good. > One thing I noticed is that the first parameter to > VirtualSpaceList::get_new_chunk > is actually ignored so you might want to just get rid of it, it's just > confusing to see it. 
If you decide to do something about get_new_chunk > I think it wouldn't hurt to have the names of the parameters changed > as well, "grow_chunks_by_words" is actually "requested_chunk_size" and > "medium_chunk_bunch" could be something like > "suggested_commit_granularity" I'll fix this and the surrounding code. > > You might want to make the "const size_t" constants you moved out of > the enum to either be "static" (which would be static in the C-sense) > or add them in an anonymous namespace since otherwise they will > pollute the global symbol namespace (more so than an enum which is > strictly file scoped). I'm going to revert back to using an enum, for now. > > The rest of the change looks good to me. Thanks, StefanK > > /Mikael > >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we always >> try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger then >> the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is allocated. >> The size of some of the metaspace instances can be specified on the >> command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same time >> doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for the >> new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and >> pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Mon Nov 28 19:29:29 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 20:29:29 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <98082fd3-37d2-e2d4-842f-26e5ea38dcbc@oracle.com> Hi Michail, On 2016-11-28 14:57, Michail Chernov wrote: > Hi Stefan, > > > Could you please add simple regression test for this case? The failure below was found with one of the test cases in: runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java Is this enough or do you want an explicit regression test that simply invokes: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version ? Thanks, StefanK > > > Thanks, > > Michail > > > On 28.11.2016 16:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we >> always try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger >> then the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is >> allocated. The size of some of the metaspace instances can be >> specified on the command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same >> time doesn't match the specialized or small chunk size, then we end >> up hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is >> neither of these. >> >> Most of the other code is refactoring to unify the non-class >> metaspace and the class metaspace code paths to get rid of some of >> the existing code duplication, bring the chunk size calculation >> nearer to the the actual chunk allocation, and make it easier to >> write a unit test for the new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and >> pushed. >> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK > From michail.chernov at oracle.com Mon Nov 28 19:47:45 2016 From: michail.chernov at oracle.com (Michail Chernov) Date: Mon, 28 Nov 2016 11:47:45 -0800 (PST) Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist Message-ID: Hi Stefan, Since the bug was caught in existing test, I don't see any reason to make additional test for this case. 
Thanks for explanation! Michail ----- Original Message ----- From: stefan.karlsson at oracle.com To: michail.chernov at oracle.com, hotspot-dev at openjdk.java.net Sent: Monday, 28 November 2016 22:29:33 GMT +03:00 Subject: Re: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist Hi Michail, On 2016-11-28 14:57, Michail Chernov wrote: Hi Stefan, Could you please add simple regression test for this case? The failure below was found with one of the test cases in: runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java Is this enough or do you want an explicit regression test that simply invokes: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version ? Thanks, StefanK Thanks, Michail On 28.11.2016 16:52, Stefan Karlsson wrote: Hi all, Please, review this patch to fix metaspace initialization. http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8170395 The fix for JDK-8169931 introduced a new assert to ensure that we always try to allocate chunks that are any of the three fixed sizes (specialized, small, medium) or a humongous chunk (if it is larger than the medium chunk size). During metaspace initialization an initial metaspace chunk is allocated. The size of some of the metaspace instances can be specified on the command line. 
For example: java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version If this size is smaller than the medium chunk size and at the same time doesn't match the specialized or small chunk size, then we end up hitting the assert mentioned above: # # Internal Error (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, tid=31646 # assert(size > free_chunks(MediumIndex)->size()) failed: Not a humongous chunk # ======================================================================== The most important part of the fix is this line: + // Adjust to one of the fixed chunk sizes (unless humongous) + const size_t adjusted = adjust_initial_chunk_size(requested); which ensures that we always request either of a specialized, small, medium, or humongous chunk size, even if the requested size is neither of these. Most of the other code is refactoring to unify the non-class metaspace and the class metaspace code paths to get rid of some of the existing code duplication, bring the chunk size calculation nearer to the the actual chunk allocation, and make it easier to write a unit test for the new adjust_initial_chunk_size function. ======================================================================== The patch for JDK-8169931 was backed out with JDK-8170355 and will be reintroduced as JDK-8170358 when this patch has been reviewed and pushed. Testing: jprt, unit test, parts of PIT testing (including CDS tests), failing test Thanks, StefanK From stefan.karlsson at oracle.com Mon Nov 28 19:49:26 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 20:49:26 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: Message-ID: <688aff6d-5bca-fd30-282b-91a8ae31e7f9@oracle.com> Thanks, Michail. 
StefanK On 2016-11-28 20:47, Michail Chernov wrote: > Hi Stefan, > > Since the bug was caught in existing test, I don't see any reason to > make additional test for this case. Thanks for explanation! > > Michail > > ----- Original Message ----- > From: stefan.karlsson at oracle.com > To: michail.chernov at oracle.com, hotspot-dev at openjdk.java.net > Sent: Monday, 28 November 2016 22:29:33 GMT +03:00 > Subject: Re: RFR: 8170395: Metaspace initialization queries the wrong > chunk freelist > > Hi Michail, > > On 2016-11-28 14:57, Michail Chernov wrote: > > Hi Stefan, > > > Could you please add simple regression test for this case? > > The failure below was found with one of the test cases in: > runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java > > Is this enough or do you want an explicit regression test that simply > invokes: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version ? > > Thanks, > StefanK > > > > Thanks, > > Michail > > > On 28.11.2016 16:52, Stefan Karlsson wrote: > > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that > we always try to allocate chunks that are any of the three > fixed sizes (specialized, small, medium) or a humongous chunk > (if it is larger than the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. 
For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the > same time doesn't match the specialized or small chunk size, > then we end up hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not > a humongous chunk > # > > ======================================================================== > > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, > small, medium, or humongous chunk size, even if the requested > size is neither of these. > > Most of the other code is refactoring to unify the non-class > metaspace and the class metaspace code paths to get rid of > some of the existing code duplication, bring the chunk size > calculation nearer to the the actual chunk allocation, and > make it easier to write a unit test for the new > adjust_initial_chunk_size function. > > ======================================================================== > > > The patch for JDK-8169931 was backed out with JDK-8170355 and > will be reintroduced as JDK-8170358 when this patch has been > reviewed and pushed. 
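For readers skimming the thread, the adjustment described in the quoted fix can be sketched roughly as follows. This is an illustrative sketch with placeholder chunk-size constants, not the actual metaspace.cpp code:

```cpp
// Sketch of the idea behind adjust_initial_chunk_size(): round a
// requested initial chunk size up to the next fixed chunk size, and
// leave anything above the medium size alone as a humongous request.
// The constants below are placeholders, not HotSpot's real values.
#include <cstddef>
#include <cassert>

const size_t SpecializedChunk = 128;   // words (illustrative)
const size_t SmallChunk       = 512;   // words (illustrative)
const size_t MediumChunk      = 8192;  // words (illustrative)

size_t adjust_initial_chunk_size(size_t requested) {
  if (requested <= SpecializedChunk) return SpecializedChunk;
  if (requested <= SmallChunk)       return SmallChunk;
  if (requested <= MediumChunk)      return MediumChunk;
  return requested;  // humongous: larger than the medium chunk size
}
```

With such an adjustment, a command-line value that falls between two of the fixed sizes ends up requesting a size the chunk freelists actually know about, so the "Not a humongous chunk" assert can no longer fire during initialization.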
> > Testing: jprt, unit test, parts of PIT testing (including CDS > tests), failing test > > Thanks, > StefanK > > > From dean.long at oracle.com Mon Nov 28 20:01:44 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 28 Nov 2016 12:01:44 -0800 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: Hi David, On 11/25/16 2:38 AM, David Holmes wrote: > However, the stack size limitations remained in place in case the VM > was launched from the primordial thread of a user application via the > JNI invocation API. why is the JNI invocation API no longer a problem? Does it create a new thread like the launcher? dl From vladimir.kozlov at oracle.com Mon Nov 28 20:18:05 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 28 Nov 2016 12:18:05 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: <583BAB5B.4020404@oracle.com> References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> <583BAB5B.4020404@oracle.com> Message-ID: Hi Ioi, Did you have updated webrev? And you did not comment on my suggestion: >> Any suggest for a better name? > > _adapter_cds_entry ? Thanks, Vladimir On 11/27/16 7:58 PM, Ioi Lam wrote: > I found a problem in my previous patch. Here's the fix (on top of the > previous patch): > > diff -r 3404f61c7081 src/share/vm/oops/method.cpp > --- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800 > +++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800 > @@ -1031,11 +1031,13 @@ > // leftover methods that weren't linked.
> if (is_shared()) { > address entry = Interpreter::entry_for_cds_method(h_method); > - assert(entry != NULL && entry == _i2i_entry && entry == > _from_interpreted_entry, > + assert(entry != NULL && entry == _i2i_entry, > "should be correctly set during dump time"); > if (adapter() != NULL) { > return; > } > + assert(entry == _from_interpreted_entry, > + "should be correctly set during dump time"); > } else if (_i2i_entry != NULL) { > return; > } > > The problem is: if the method has been compiled, then a shared method's > _from_interpreted_entry would be different than _i2i_entry (see > Method::set_code()). > > I am not sure if Method::link_method() would ever be called after > it's been compiled, but I think it's safer to make the asserts no > stronger than before this patch. > > Thanks > - Ioi > > > On 11/20/16 11:53 PM, Tobias Hartmann wrote: >> Hi Ioi, >> >> this looks good to me, the detailed description including the diagram >> is very nice and helps to understand the complex implementation! >> >> For the record: the test mentioned in [1] is part of my fix for >> JDK-8169711. >> >> Best regards, >> Tobias >> >> On 21.11.2016 07:58, Ioi Lam wrote: >>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>> >>> >>> Thanks to Tobias for finding the bug. I have done the following >>> >>> + integrated Tobias' suggested fix >>> + fixed Method::restore_unshareable_info to call Method::link_method >>> + added comments and a diagram to illustrate how the CDS method entry >>> trampolines work. >>> >>> BTW, I am a little unhappy about the name >>> ConstMethod::_adapter_trampoline. >>> It's basically an extra level of indirection to get to the adapter. >>> However. >>> The word "trampoline" usually is used for an extra jump in >>> executable code, >>> so it may be a little confusing when we use it for a data pointer here. >>> >>> Any suggest for a better name?
>>> >>> >>> Testing: >>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>> now it produces the correct assertion. I won't check in this >>> test, though, >>> since it won't assert anymore after Tobias fixes 8169711. >>> >>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>> # >>> # A fatal error has been detected by the Java Runtime Environment: >>> # >>> # Internal Error >>> (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), >>> pid=16840, tid=16843 >>> # assert(entry != __null && entry == _i2i_entry && entry == >>> _from_interpreted_entry) failed: >>> # should be correctly set during dump time >>> >>> [2] Ran RBT in fastdebug build for >>> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>> All tests passed. >>> >>> Thanks >>> - Ioi >>> > From stefan.karlsson at oracle.com Mon Nov 28 21:06:08 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 28 Nov 2016 22:06:08 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi again, This set of patches resolve some of the comments given by Mikael and Thomas: Entire patch: http://cr.openjdk.java.net/~stefank/8170395/webrev.02 Delta patches: http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter I consider pushing the last patch as a separate changeset. This is the entire patch without the unused_parameter patch: http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 Thanks, StefanK On 2016-11-28 14:52, Stefan Karlsson wrote: > Hi all, > > Please, review this patch to fix metaspace initialization. 
> > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8170395 > > The fix for JDK-8169931 introduced a new assert to ensure that we > always try to allocate chunks that are any of the three fixed sizes > (specialized, small, medium) or a humongous chunk (if it is larger > then the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the same > time doesn't match the specialized or small chunk size, then we end up > hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, small, > medium, or humongous chunk size, even if the requested size is neither > of these. > > Most of the other code is refactoring to unify the non-class metaspace > and the class metaspace code paths to get rid of some of the existing > code duplication, bring the chunk size calculation nearer to the the > actual chunk allocation, and make it easier to write a unit test for > the new adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and will be > reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
> > Testing: jprt, unit test, parts of PIT testing (including CDS tests), > failing test > > Thanks, > StefanK From david.holmes at oracle.com Mon Nov 28 21:22:29 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 07:22:29 +1000 Subject: Presentation: Understanding OrderAccess In-Reply-To: References: Message-ID: Hi Martin, I've added Erik explicitly to the cc as he and I have been discussing fences and "visibility", and of course he most recently revised the descriptions in orderAccess.hpp On 29/11/2016 2:29 AM, Doerr, Martin wrote: > Hi David, > > sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that. > >> Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? > > This is really hard to explain. Maybe there are better explanations out there, but I'll give it a try: > > I think the comment in orderAccess.hpp is not bad: > // Finally, we define a "fence" operation, as a bidirectional barrier. > // It guarantees that any memory access preceding the fence is not // reordered w.r.t. any memory accesses subsequent to the fence in program // order. > > One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. > If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. > Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. But again that attribution of global properties is not something I think is necessarily implied or intended by OrderAccess. Or maybe it is, but as it is only an issue on non-multicopy-atomic systems, it has never been called out explicitly. ?? 
And those global properties must also be a part of the other barriers (as the fence is just the combination of them all) - but I don't know how you would describe the effects of the other barriers (like loadload) in "global" terms. David ----- > >> Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... >> but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > > "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] > > So acquire+release orders wrt. all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. > > > Best regards, > Martin > > [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering > [5] http://g.oswego.edu/dl/jmm/cookbook.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Monday, 28 November 2016 13:56 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin, > > On 28/11/2016 8:43 PM, Doerr, Martin wrote: >> Hi David, >> >> I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. >> I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). >> >> The term "multiple-copy atomicity" is described as "...
in a machine >> which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". >> >> I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. >> The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". > > Thanks for the reminder of that discussion. :) > >> A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. > > Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose. > >> Since you have asked about C++11, there's an example implementation for PPC [3]. >> Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" observe writes in a globally consistent order. > > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > >> Btw.: We have implemented the Java volatile accesses very similar to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. >> But I guess the Java memory model is beyond the scope of your presentation. > > Oh yes way out of scope!
:) > > Cheers, > David > >> Best regards, >> Martin >> >> >> [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf >> [2] >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030 >> 212.html [3] >> http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Montag, 28. November 2016 06:56 >> To: Doerr, Martin ; hotspot-dev developers >> >> Subject: Re: Presentation: Understanding OrderAccess >> >> Hi Martin >> >> On 24/11/2016 2:20 AM, Doerr, Martin wrote: >>> Hi David, >>> >>> thank you very much for the presentation. I think it provides a good guideline for hotspot development. >> >> Thanks. >> >>> >>> Would you like to add something about multi-copy atomicity? >> >> Not really. :) >> >>> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >>> >>> It is needed in the following scenario: >>> - Different threads write 2 variables. >>> - Readers of these 2 variables expect a globally consistent order of the write accesses. >>> >>> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". >> >> Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... >> >>> (While taking a look at it, the condition "#if !(defined SPARC || >>> defined IA32 || defined AMD64)" is not accurate and should better get >>> improved. E.g. s390 is multi-copy atomic.) >>> >>> >>> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservative than C++' seq_cst on PPC64. 
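The IRIW ("Independent Reads of Independent Writes") scenario that this thread keeps coming back to can be sketched with C++11 atomics. This is an illustrative sketch, not HotSpot code: two writers store to independent locations, two readers read them in opposite orders, and under `memory_order_seq_cst` all four threads must agree on a single total order of the stores.

```cpp
// IRIW sketch with seq_cst atomics. The outcome r1==1,r2==0 together
// with r3==1,r4==0 would mean the two readers saw the two writes in
// opposite orders; memory_order_seq_cst forbids that outcome.
#include <atomic>
#include <thread>
#include <cassert>

std::atomic<int> x(0), y(0);

// Runs one IRIW round; returns true if the seq_cst-forbidden outcome
// was (as guaranteed) not observed.
bool iriw_once() {
  x.store(0, std::memory_order_seq_cst);
  y.store(0, std::memory_order_seq_cst);
  int r1 = 0, r2 = 0, r3 = 0, r4 = 0;
  std::thread w1([] { x.store(1, std::memory_order_seq_cst); });
  std::thread w2([] { y.store(1, std::memory_order_seq_cst); });
  std::thread rd1([&] {
    r1 = x.load(std::memory_order_seq_cst);
    r2 = y.load(std::memory_order_seq_cst);
  });
  std::thread rd2([&] {
    r3 = y.load(std::memory_order_seq_cst);
    r4 = x.load(std::memory_order_seq_cst);
  });
  w1.join(); w2.join(); rd1.join(); rd2.join();
  // Readers disagreeing on the order of the two independent writes
  // is the multi-copy-atomicity violation discussed above.
  return !(r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0);
}
```

On a non-multi-copy-atomic machine such as PPC, relaxed loads here could legally observe the forbidden outcome, which is why the seq_cst mapping (or, in HotSpot terms, an OrderAccess::fence() between the reads) is needed.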
>> >> I still can't get my head around the C++11 terminology for this and >> how you are expected to use it - what does it mean for an individual >> operation to be "sequentially consistent" ? :( >> >> Cheers, >> David >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >>> Behalf Of David Holmes >>> Sent: Mittwoch, 23. November 2016 06:08 >>> To: hotspot-dev developers >>> Subject: Presentation: Understanding OrderAccess >>> >>> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers. >>> >>> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-Order >>> A >>> ccess-v1.1.pdf >>> >>> Cheers, >>> David >>> From david.holmes at oracle.com Mon Nov 28 21:25:50 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 07:25:50 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> Hi Dean, On 29/11/2016 6:01 AM, dean.long at oracle.com wrote: > Hi David, > > > On 11/25/16 2:38 AM, David Holmes wrote: >> However, the stack size limitations remained in place in case the VM >> was launched from the primordial thread of a user application via the >> JNI invocation API. > > why is the JNI invocation API no longer a problem? Does it create a new > thread like the launcher? No, the JNI invocation API is unchanged. What has changed now are the conditions that required the 2MB limit due to the behaviour of the thread library (this goes back to LinuxThreads and the IA64 port). 
Thanks, David > dl From ioi.lam at oracle.com Mon Nov 28 23:03:50 2016 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 28 Nov 2016 15:03:50 -0800 Subject: RFR (S) 8169867 Method::restore_unshareable_info does not invoke Method::link_method In-Reply-To: References: <58329B05.6070602@oracle.com> <5832A7FC.8030505@oracle.com> <583BAB5B.4020404@oracle.com> Message-ID: <583CB7D6.2020207@oracle.com> On 11/28/16 12:18 PM, Vladimir Kozlov wrote: > Hi Ioi, > > Did you have updated webrev? > I didn't update the webrev. The only change from the previous webrev is the diff below > And you did not comment on my suggestion: > > >> Any suggest for a better name? > > > > _adapter_cds_entry ? > Thanks for the suggestion. I think "entry" may be confusing with other use of the word, such as _i2i_entry -- in this case this pointer doesn't point to the entry point of executable code. I think I'll just leave the names as is for now, and maybe file an RFE to rename it in JDK10. Thanks - Ioi > Thanks, > Vladimir > > On 11/27/16 7:58 PM, Ioi Lam wrote: >> I found a problem in my previous patch. Here's the fix (on top of he >> previous patch): >> >> diff -r 3404f61c7081 src/share/vm/oops/method.cpp >> --- a/src/share/vm/oops/method.cpp Sun Nov 27 19:44:44 2016 -0800 >> +++ b/src/share/vm/oops/method.cpp Sun Nov 27 19:50:35 2016 -0800 >> @@ -1031,11 +1031,13 @@ >> // leftover methods that weren't linked. 
>> if (is_shared()) { >> address entry = Interpreter::entry_for_cds_method(h_method); >> - assert(entry != NULL && entry == _i2i_entry && entry == >> _from_interpreted_entry, >> + assert(entry != NULL && entry == _i2i_entry, >> "should be correctly set during dump time"); >> if (adapter() != NULL) { >> return; >> } >> + assert(entry == _from_interpreted_entry, >> + "should be correctly set during dump time"); >> } else if (_i2i_entry != NULL) { >> return; >> } >> >> The problem is: if the method has been compiled, then a shared method's >> _from_interpreted_entry would be different than _i2i_entry (see >> Method::set_code()). >> >> I am not sure if Method::link_method() would ever be called after >> it's been compiled, but I think it's safer to make the asserts no >> stronger than before this patch. >> >> Thanks >> - Ioi >> >> >> On 11/20/16 11:53 PM, Tobias Hartmann wrote: >>> Hi Ioi, >>> >>> this looks good to me, the detailed description including the diagram >>> is very nice and helps to understand the complex implementation! >>> >>> For the record: the test mentioned in [1] is part of my fix for >>> JDK-8169711. >>> >>> Best regards, >>> Tobias >>> >>> On 21.11.2016 07:58, Ioi Lam wrote: >>>> https://bugs.openjdk.java.net/browse/JDK-8169867 >>>> http://cr.openjdk.java.net/~iklam/jdk9/8169867_cds_not_calling_link_method.v01/ >>>> >>>> >>>> >>>> Thanks to Tobias for finding the bug. I have done the following >>>> >>>> + integrated Tobias' suggested fix >>>> + fixed Method::restore_unshareable_info to call Method::link_method >>>> + added comments and a diagram to illustrate how the CDS method entry >>>> trampolines work. >>>> >>>> BTW, I am a little unhappy about the name >>>> ConstMethod::_adapter_trampoline. >>>> It's basically an extra level of indirection to get to the adapter. >>>> However. >>>> The word "trampoline" usually is used for and extra jump in >>>> executable code, >>>> so it may be a little confusing when we use it for a data pointer >>>> here. 
>>>> >>>> Any suggest for a better name? >>>> >>>> >>>> Testing: >>>> [1] I have tested Tobias' TestInterpreterMethodEntries.java class and >>>> now it produces the correct assertion. I won't check in this >>>> test, though, >>>> since it won't assert anymore after Tobias fixes 8169711. >>>> >>>> # after -XX: or in .hotspotrc: SuppressErrorAt=/method.cpp:1035 >>>> # >>>> # A fatal error has been detected by the Java Runtime Environment: >>>> # >>>> # Internal Error >>>> (/home/iklam/jdk/ul/hotspot/src/share/vm/oops/method.cpp:1035), >>>> pid=16840, tid=16843 >>>> # assert(entry != __null && entry == _i2i_entry && entry == >>>> _from_interpreted_entry) failed: >>>> # should be correctly set during dump time >>>> >>>> [2] Ran RBT in fastdebug build for >>>> hotspot/test/:hotspot_all,vm.parallel_class_loading,vm.runtime.testlist >>>> >>>> All tests passed. >>>> >>>> Thanks >>>> - Ioi >>>> >> From davidcholmes at aapt.net.au Tue Nov 29 00:16:58 2016 From: davidcholmes at aapt.net.au (David Holmes) Date: Tue, 29 Nov 2016 10:16:58 +1000 Subject: TEST - please ignore Message-ID: <01dd01d249d5$e6a52510$b3ef6f30$@aapt.net.au> From dean.long at oracle.com Tue Nov 29 06:59:27 2016 From: dean.long at oracle.com (dean.long at oracle.com) Date: Mon, 28 Nov 2016 22:59:27 -0800 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> References: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> Message-ID: <4e08fd2c-476c-31fe-2c19-f1b2962199ad@oracle.com> On 11/28/16 1:25 PM, David Holmes wrote: > Hi Dean, > > On 29/11/2016 6:01 AM, dean.long at oracle.com wrote: >> Hi David, >> >> >> On 11/25/16 2:38 AM, David Holmes wrote: >>> However, the stack size limitations remained in place in case the VM >>> was launched from the primordial thread of a user application via the >>> JNI invocation API. >> >> why is the JNI invocation API no longer a problem? Does it create a new >> thread like the launcher? 
> > No, the JNI invocation API is unchanged. What has changed now are the > conditions that required the 2MB limit due to the behaviour of the > thread library (this goes back to LinuxThreads and the IA64 port). > Let me see if I have it straight. The stack size limit was needed for the primordial thread on LinuxThreads (I remember those days!). We can still start the JVM on the primordial thread if we use a custom launcher or the JNI invocation API, but we no longer need the 2MB limit because we no longer support LinuxThreads. Based on the comment in os::Linux::capture_initial_stack, I'd also like to know if pthread_getattr_np() is now reliable on the primordial thread. If so, we could remove a lot of ugly code. dl > Thanks, > David > >> dl From thomas.stuefe at gmail.com Tue Nov 29 07:13:01 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 29 Nov 2016 08:13:01 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Stefan, looks fine. There is a trailing ; after the "smallest_chunk_size" method (no need to do a webrev for that). Thanks for taking my suggestions. Best regards, Thomas On Mon, Nov 28, 2016 at 10:06 PM, Stefan Karlsson < stefan.karlsson at oracle.com> wrote: > Hi again, > > This set of patches resolve some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.ve > rify_global_initialization > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > I consider pushing the last patch as a separate changeset. 
> > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: > >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we always >> try to allocate chunks that are any of the three fixed sizes (specialized, >> small, medium) or a humongous chunk (if it is larger then the medium chunk >> size). >> >> During metaspace initialization an initial metaspace chunk is allocated. >> The size of some of the metaspace instances can be specified on the command >> line. For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same time >> doesn't match the specialized or small chunk size, then we end up hitting >> the assert mentioned above: >> # >> # Internal Error (/scratch/opt/jprt/T/P1/142848 >> .erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), pid=31643, >> tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither of >> these. 
>> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing code >> duplication, bring the chunk size calculation nearer to the the actual >> chunk allocation, and make it easier to write a unit test for the new >> adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. >> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK >> > > > From david.holmes at oracle.com Tue Nov 29 09:20:26 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 19:20:26 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <4e08fd2c-476c-31fe-2c19-f1b2962199ad@oracle.com> References: <53c38b45-22d5-082d-1244-1d1025822fea@oracle.com> <4e08fd2c-476c-31fe-2c19-f1b2962199ad@oracle.com> Message-ID: On 29/11/2016 4:59 PM, dean.long at oracle.com wrote: > On 11/28/16 1:25 PM, David Holmes wrote: > >> Hi Dean, >> >> On 29/11/2016 6:01 AM, dean.long at oracle.com wrote: >>> Hi David, >>> >>> >>> On 11/25/16 2:38 AM, David Holmes wrote: >>>> However, the stack size limitations remained in place in case the VM >>>> was launched from the primordial thread of a user application via the >>>> JNI invocation API. >>> >>> why is the JNI invocation API no longer a problem? Does it create a new >>> thread like the launcher? >> >> No, the JNI invocation API is unchanged. What has changed now are the >> conditions that required the 2MB limit due to the behaviour of the >> thread library (this goes back to LinuxThreads and the IA64 port). >> > > Let me see if I have it straight. The stack size limit was needed for > the primordial thread on LinuxThreads (I remember those days!). 
We can > still start the JVM on the primordial thread if we use a custom launcher > or the JNI invocation API, but we no longer need the 2MB limit because > we no longer support LinuxThreads. Yes. There were some other reasons why the 2MB limit was needed but those no longer exist either (ie ia64 port, alt-stack usage) > Based on the comment in os::Linux::capture_initial_stack, I'd also like > to know if pthread_getattr_np() is now reliable on the primordial > thread. If so, we could remove a lot of ugly code. I expect that it would be, but that would require a lot more extensive testing of different Linuxes. This can be done as part of planned future cleanup work in 10. Thanks, David > dl > > >> Thanks, >> David >> >>> dl > From stefan.karlsson at oracle.com Tue Nov 29 09:32:05 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Nov 2016 10:32:05 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: Hi Thomas, On 2016-11-29 08:13, Thomas Stüfe wrote: > Hi Stefan, > > looks fine. There is a trailing ; after the "smallest_chunk_size" method > (no need to do a webrev for that). Will fix. > > Thanks for taking my suggestions. Thanks for reviewing. StefanK > > Best regards, Thomas > > > > On Mon, Nov 28, 2016 at 10:06 PM, Stefan Karlsson > > wrote: > > Hi again, > > This set of patches resolve some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization > > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > > I consider pushing the last patch as a separate changeset.
> > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: > > Hi all, > > Please, review this patch to fix metaspace initialization. > > http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ > > https://bugs.openjdk.java.net/browse/JDK-8170395 > > > The fix for JDK-8169931 introduced a new assert to ensure that > we always try to allocate chunks that are any of the three fixed > sizes (specialized, small, medium) or a humongous chunk (if it > is larger then the medium chunk size). > > During metaspace initialization an initial metaspace chunk is > allocated. The size of some of the metaspace instances can be > specified on the command line. For example: > java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version > > If this size is smaller than the medium chunk size and at the > same time doesn't match the specialized or small chunk size, > then we end up hitting the assert mentioned above: > # > # Internal Error > (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), > pid=31643, tid=31646 > # assert(size > free_chunks(MediumIndex)->size()) failed: Not a > humongous chunk > # > > ======================================================================== > > The most important part of the fix is this line: > + // Adjust to one of the fixed chunk sizes (unless humongous) > + const size_t adjusted = adjust_initial_chunk_size(requested); > > which ensures that we always request either of a specialized, > small, medium, or humongous chunk size, even if the requested > size is neither of these. 
> > Most of the other code is refactoring to unify the non-class > metaspace and the class metaspace code paths to get rid of some > of the existing code duplication, bring the chunk size > calculation nearer to the actual chunk allocation, and make > it easier to write a unit test for the new > adjust_initial_chunk_size function. > > ======================================================================== > > The patch for JDK-8169931 was backed out with JDK-8170355 and > will be reintroduced as JDK-8170358 when this patch has been > reviewed and pushed. > > Testing: jprt, unit test, parts of PIT testing (including CDS > tests), failing test > > Thanks, > StefanK > > > > From volker.simonis at gmail.com Tue Nov 29 09:41:10 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 29 Nov 2016 10:41:10 +0100 Subject: RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583C5B10.8040204@linux.vnet.ibm.com> References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: Thanks Gustavo, the change looks good. So now we're just waiting for another review from somebody of the aarch64 folks. Once we have that and the fc-request is approved I'll push the changes. Regards, Volker On Mon, Nov 28, 2016 at 5:28 PM, Gustavo Romero wrote: > Hi all, > > I'm re-sending due to JDK title update to include s390x and aarch64 archs. > > Could the following webrev be reviewed, please? > > webrev 1/2: http://cr.openjdk.java.net/~gromero/8170153/v2/ > webrev 2/2: http://cr.openjdk.java.net/~gromero/8170153/v2/jdk/ > bug: https://bugs.openjdk.java.net/browse/JDK-8170153 > > Thank you. > > > Regards, > Gustavo > From thomas.stuefe at gmail.com Tue Nov 29 10:39:51 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 29 Nov 2016 11:39:51 +0100 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: Hi David, thanks for the good explanation.
Change looks good, I really like the comment in capture_initial_stack(). Question, with -Xss given and being smaller than current thread stack size, guard pages may appear in the middle of the invoking thread stack? I always thought this is a bit dangerous. If your model is to have the VM created from the main thread, which then goes off to do different things, and have other threads then attach and run java code, main thread later may crash in unrelated native code just because it reached the stack depth of the java threads? Or am I misunderstanding something? Thanks, Thomas On Fri, Nov 25, 2016 at 11:38 AM, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 > > The bug is not public unfortunately for non-technical reasons - but see my > eval below. > > Background: if you load the JVM from the primordial thread of a process > (not done by the java launcher since JDK 6), there is an artificial stack > limit imposed on the initial thread (by sticking the guard page at the > limit position of the actual stack) of the minimum of the -Xss setting and > 2M. So if you set -Xss to > 2M it is ignored for the main thread even if > the true stack is, say, 8M. This limitation dates back 10-15 years and is > no longer relevant today and should be removed (see below). I've also added > additional explanatory notes. > > webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > Testing was manually done by modifying the launcher to not run the VM in a > new thread, and checking the resulting stack size used. > > This change will only affect hosted JVMs launched with a -Xss value > 2M. > > Thanks, > David > ----- > > Bug eval: > > JDK-4441425 limits the stack to 8M as a safeguard against an unlimited > value from getrlimit in 1.3.1, but further constrained that to 2M in 1.4.0 > due to JDK-4466587.
> > By 1.4.2 we have the basic form of the current problematic code: > > #ifndef IA64 > if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; > #else > // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small > if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; > #endif > > _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); > > if (max_size && _initial_thread_stack_size > max_size) { > _initial_thread_stack_size = max_size; > } > > This was added by JDK-4678676 to allow the stack of the main thread to be > _reduced_ below the default 2M/4M if the -Xss value was smaller than > that.** There was no intent to allow the stack size to follow -Xss > arbitrarily due to the operational constraints imposed by the OS/glibc at > the time when dealing with the primordial process thread. > > ** It could not actually change the actual stack size of course, but set > the guard pages to limit use to the expected stack size. > > In JDK 6, under JDK-6316197, the launcher was changed to create the JVM in > a new thread, so that it was not limited by the idiosyncrasies of the OS or > thread library primordial thread handling. However, the stack size > limitations remained in place in case the VM was launched from the > primordial thread of a user application via the JNI invocation API. > > I believe it should be safe to remove the 2M limitation now. > From mikael.gerdin at oracle.com Tue Nov 29 10:53:04 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 29 Nov 2016 11:53:04 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <7f8d3b14-bc33-28ea-d630-d189c68d5d00@oracle.com> Hi Stefan, On 2016-11-28 22:06, Stefan Karlsson wrote: > Hi again, > > This set of patches resolves some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 Looks good!
/Mikael > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization > > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > I consider pushing the last patch as a separate changeset. > > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we >> always try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger >> then the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is >> allocated. The size of some of the metaspace instances can be >> specified on the command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same >> time doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for >> the new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK > > From per.liden at oracle.com Tue Nov 29 11:11:52 2016 From: per.liden at oracle.com (Per Liden) Date: Tue, 29 Nov 2016 12:11:52 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> Message-ID: <6eae82f0-b73e-0596-dcca-9ff7a1efcd23@oracle.com> Hi Stefan, On 2016-11-28 22:06, Stefan Karlsson wrote: > Hi again, > > This set of patches resolves some of the comments given by Mikael and > Thomas: > > Entire patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02 Looks good, just a few comments. metaspace.cpp ------------- 751 static size_t specialized_chunk_size(bool is_class) { return (size_t) is_class ? ClassSpecializedChunk : SpecializedChunk; } 752 static size_t small_chunk_size(bool is_class) { return (size_t) is_class ? ClassSmallChunk : SmallChunk; } 753 static size_t medium_chunk_size(bool is_class) { return (size_t) is_class ? ClassMediumChunk : MediumChunk; } The size_t casts above bind to is_class and not to the result of ?: so you probably want to do: return is_class ? (size_t)A : (size_t)B; ... or perhaps just skip the casts. 760 size_t specialized_chunk_size() { return specialized_chunk_size(is_class()); } 761 size_t small_chunk_size() { return small_chunk_size(is_class()); } 762 size_t medium_chunk_size() { return medium_chunk_size(is_class()); } 763 764 size_t smallest_chunk_size() { return smallest_chunk_size(is_class()); } 765 766 size_t medium_chunk_bunch() { return medium_chunk_size() * MediumChunkMultiple; } More of a style thing, but it looks like these functions could also be const, no? I don't need to see a new webrev.
cheers, Per > > Delta patches: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization > > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter > > I consider pushing the last patch as a separate changeset. > > This is the entire patch without the unused_parameter patch: > http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 > > Thanks, > StefanK > > On 2016-11-28 14:52, Stefan Karlsson wrote: >> Hi all, >> >> Please, review this patch to fix metaspace initialization. >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8170395 >> >> The fix for JDK-8169931 introduced a new assert to ensure that we >> always try to allocate chunks that are any of the three fixed sizes >> (specialized, small, medium) or a humongous chunk (if it is larger >> then the medium chunk size). >> >> During metaspace initialization an initial metaspace chunk is >> allocated. The size of some of the metaspace instances can be >> specified on the command line. 
For example: >> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >> >> If this size is smaller than the medium chunk size and at the same >> time doesn't match the specialized or small chunk size, then we end up >> hitting the assert mentioned above: >> # >> # Internal Error >> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >> pid=31643, tid=31646 >> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >> humongous chunk >> # >> >> ======================================================================== >> >> The most important part of the fix is this line: >> + // Adjust to one of the fixed chunk sizes (unless humongous) >> + const size_t adjusted = adjust_initial_chunk_size(requested); >> >> which ensures that we always request either of a specialized, small, >> medium, or humongous chunk size, even if the requested size is neither >> of these. >> >> Most of the other code is refactoring to unify the non-class metaspace >> and the class metaspace code paths to get rid of some of the existing >> code duplication, bring the chunk size calculation nearer to the the >> actual chunk allocation, and make it easier to write a unit test for >> the new adjust_initial_chunk_size function. >> >> ======================================================================== >> >> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >> reintroduced as JDK-8170358 when this patch has been reviewed and pushed. 
>> >> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >> failing test >> >> Thanks, >> StefanK > > From stefan.karlsson at oracle.com Tue Nov 29 11:47:26 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Nov 2016 12:47:26 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <6eae82f0-b73e-0596-dcca-9ff7a1efcd23@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <6eae82f0-b73e-0596-dcca-9ff7a1efcd23@oracle.com> Message-ID: Hi Per, On 2016-11-29 12:11, Per Liden wrote: > Hi Stefan, > > On 2016-11-28 22:06, Stefan Karlsson wrote: >> Hi again, >> >> This set of patches resolve some of the comments given by Mikael and >> Thomas: >> >> Entire patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > Looks good, just a few comments. > > metaspace.cpp > ------------- > > 751 static size_t specialized_chunk_size(bool is_class) { return > (size_t) is_class ? ClassSpecializedChunk : SpecializedChunk; } > 752 static size_t small_chunk_size(bool is_class) { return > (size_t) is_class ? ClassSmallChunk : SmallChunk; } > 753 static size_t medium_chunk_size(bool is_class) { return > (size_t) is_class ? ClassMediumChunk : MediumChunk; } > > The size_t casts above binds to is_class and not the result from ?: so > you probably you want to do: > > return is_class ? (size_t)A : (size_t)B; > > ... or perhaps just skip the casts. > Sure. This cast existed before my changes, but I can remove it since it's obviously wrong. 
> > 760 size_t specialized_chunk_size() { return > specialized_chunk_size(is_class()); } > 761 size_t small_chunk_size() { return > small_chunk_size(is_class()); } > 762 size_t medium_chunk_size() { return > medium_chunk_size(is_class()); } > 763 > 764 size_t smallest_chunk_size() { return > smallest_chunk_size(is_class()); } > 765 > 766 size_t medium_chunk_bunch() { return medium_chunk_size() * > MediumChunkMultiple; } > > More of a style thing, but it looks like these functions could also be > const, no? Yes, and many other functions in that file. I'll update these since I changed them. > > I don't need to see a new webrev. Thanks for reviewing, StefanK > > cheers, > Per > >> >> Delta patches: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization >> >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter >> >> >> I consider pushing the last patch as a separate changeset. >> >> This is the entire patch without the unused_parameter patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 >> >> Thanks, >> StefanK >> >> On 2016-11-28 14:52, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix metaspace initialization. >>> >>> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8170395 >>> >>> The fix for JDK-8169931 introduced a new assert to ensure that we >>> always try to allocate chunks that are any of the three fixed sizes >>> (specialized, small, medium) or a humongous chunk (if it is larger >>> then the medium chunk size). >>> >>> During metaspace initialization an initial metaspace chunk is >>> allocated. The size of some of the metaspace instances can be >>> specified on the command line. 
For example: >>> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >>> >>> If this size is smaller than the medium chunk size and at the same >>> time doesn't match the specialized or small chunk size, then we end up >>> hitting the assert mentioned above: >>> # >>> # Internal Error >>> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >>> >>> pid=31643, tid=31646 >>> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >>> humongous chunk >>> # >>> >>> ======================================================================== >>> >>> The most important part of the fix is this line: >>> + // Adjust to one of the fixed chunk sizes (unless humongous) >>> + const size_t adjusted = adjust_initial_chunk_size(requested); >>> >>> which ensures that we always request either of a specialized, small, >>> medium, or humongous chunk size, even if the requested size is neither >>> of these. >>> >>> Most of the other code is refactoring to unify the non-class metaspace >>> and the class metaspace code paths to get rid of some of the existing >>> code duplication, bring the chunk size calculation nearer to the the >>> actual chunk allocation, and make it easier to write a unit test for >>> the new adjust_initial_chunk_size function. >>> >>> ======================================================================== >>> >>> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >>> reintroduced as JDK-8170358 when this patch has been reviewed and >>> pushed. 
>>> >>> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >>> failing test >>> >>> Thanks, >>> StefanK >> >> From stefan.karlsson at oracle.com Tue Nov 29 11:47:51 2016 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 29 Nov 2016 12:47:51 +0100 Subject: RFR: 8170395: Metaspace initialization queries the wrong chunk freelist In-Reply-To: <7f8d3b14-bc33-28ea-d630-d189c68d5d00@oracle.com> References: <791d0cf9-7fa6-2bd7-293d-730b1d8157b7@oracle.com> <7f8d3b14-bc33-28ea-d630-d189c68d5d00@oracle.com> Message-ID: <4235288e-c437-56a7-6de4-38f577c8fa7e@oracle.com> Thanks, Mikael! StefanK On 2016-11-29 11:53, Mikael Gerdin wrote: > Hi Stefan, > > On 2016-11-28 22:06, Stefan Karlsson wrote: >> Hi again, >> >> This set of patches resolve some of the comments given by Mikael and >> Thomas: >> >> Entire patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02 > > Looks good! > /Mikael > >> >> Delta patches: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01.verify_global_initialization >> >> >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.02.revert_enum >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.03.unused_parameter >> >> >> I consider pushing the last patch as a separate changeset. >> >> This is the entire patch without the unused_parameter patch: >> http://cr.openjdk.java.net/~stefank/8170395/webrev.02.01-02 >> >> Thanks, >> StefanK >> >> On 2016-11-28 14:52, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please, review this patch to fix metaspace initialization. >>> >>> http://cr.openjdk.java.net/~stefank/8170395/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8170395 >>> >>> The fix for JDK-8169931 introduced a new assert to ensure that we >>> always try to allocate chunks that are any of the three fixed sizes >>> (specialized, small, medium) or a humongous chunk (if it is larger >>> then the medium chunk size). >>> >>> During metaspace initialization an initial metaspace chunk is >>> allocated. 
The size of some of the metaspace instances can be >>> specified on the command line. For example: >>> java -XX:InitialBootClassLoaderMetaspaceSize=30720 -version >>> >>> If this size is smaller than the medium chunk size and at the same >>> time doesn't match the specialized or small chunk size, then we end up >>> hitting the assert mentioned above: >>> # >>> # Internal Error >>> (/scratch/opt/jprt/T/P1/142848.erik/s/hotspot/src/share/vm/memory/metaspace.cpp:2359), >>> >>> pid=31643, tid=31646 >>> # assert(size > free_chunks(MediumIndex)->size()) failed: Not a >>> humongous chunk >>> # >>> >>> ======================================================================== >>> >>> The most important part of the fix is this line: >>> + // Adjust to one of the fixed chunk sizes (unless humongous) >>> + const size_t adjusted = adjust_initial_chunk_size(requested); >>> >>> which ensures that we always request either of a specialized, small, >>> medium, or humongous chunk size, even if the requested size is neither >>> of these. >>> >>> Most of the other code is refactoring to unify the non-class metaspace >>> and the class metaspace code paths to get rid of some of the existing >>> code duplication, bring the chunk size calculation nearer to the the >>> actual chunk allocation, and make it easier to write a unit test for >>> the new adjust_initial_chunk_size function. >>> >>> ======================================================================== >>> >>> The patch for JDK-8169931 was backed out with JDK-8170355 and will be >>> reintroduced as JDK-8170358 when this patch has been reviewed and >>> pushed. 
>>> Testing: jprt, unit test, parts of PIT testing (including CDS tests), >>> failing test >>> >>> Thanks, >>> StefanK >> >> From david.holmes at oracle.com Tue Nov 29 11:59:44 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 21:59:44 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: Message-ID: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> Hi Thomas, On 29/11/2016 8:39 PM, Thomas Stüfe wrote: > Hi David, > > thanks for the good explanation. Change looks good, I really like the > comment in capture_initial_stack(). > > Question, with -Xss given and being smaller than current thread stack > size, guard pages may appear in the middle of the invoking thread stack? > I always thought this is a bit dangerous. If your model is to have the > VM created from the main thread, which then goes off to do different > things, and have other threads then attach and run java code, main > thread later may crash in unrelated native code just because it reached > the stack depth of the java threads? Or am I misunderstanding something? There is no change to the general behaviour other than allowing a primordial process thread that launches the VM, to now not have an effective stack limited at 2MB. The current logic will insert guard pages wherever -Xss states (as long as less than 2MB else 2MB), while with the fix the guard pages will be inserted above 2MB - as dictated by -Xss. David ----- > Thanks, Thomas > > > On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > wrote: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 > > > The bug is not public unfortunately for non-technical reasons - but > see my eval below.
> > Background: if you load the JVM from the primordial thread of a > process (not done by the java launcher since JDK 6), there is an > artificial stack limit imposed on the initial thread (by sticking > the guard page at the limit position of the actual stack) of the > minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is > ignored for the main thread even if the true stack is, say, 8M. This > limitation dates back 10-15 years and is no longer relevant today > and should be removed (see below). I've also added additional > explanatory notes. > > webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > > Testing was manually done by modifying the launcher to not run the > VM in a new thread, and checking the resulting stack size used. > > This change will only affect hosted JVMs launched with a -Xss value > > 2M. > > Thanks, > David > ----- > > Bug eval: > > JDK-4441425 limits the stack to 8M as a safeguard against an > unlimited value from getrlimit in 1.3.1, but further constrained > that to 2M in 1.4.0 due to JDK-4466587. > > By 1.4.2 we have the basic form of the current problematic code: > > #ifndef IA64 > if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; > #else > // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small > if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; > #endif > > _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); > > if (max_size && _initial_thread_stack_size > max_size) { > _initial_thread_stack_size = max_size; > } > > This was added by JDK-4678676 to allow the stack of the main thread > to be _reduced_ below the default 2M/4M if the -Xss value was > smaller than that.** There was no intent to allow the stack size to > follow -Xss arbitrarily due to the operational constraints imposed > by the OS/glibc at the time when dealing with the primordial process > thread. 
> > ** It could not actually change the actual stack size of course, but > set the guard pages to limit use to the expected stack size. > > In JDK 6, under JDK-6316197, the launcher was changed to create the > JVM in a new thread, so that it was not limited by the > idiosyncrasies of the OS or thread library primordial thread > handling. However, the stack size > limitations remained in place in > case the VM was launched from the > primordial thread of a user > application via the JNI invocation API. > > I believe it should be safe to remove the 2M limitation now. > > From david.holmes at oracle.com Tue Nov 29 12:25:45 2016 From: david.holmes at oracle.com (David Holmes) Date: Tue, 29 Nov 2016 22:25:45 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> Message-ID: <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> I just realized I overlooked the case where ThreadStackSize=0 and the stack is unlimited. In that case it isn't clear where the guard pages will get inserted - I do know that I don't get a stackoverflow error. This needs further investigation. David On 29/11/2016 9:59 PM, David Holmes wrote: > Hi Thomas, > > On 29/11/2016 8:39 PM, Thomas Stüfe wrote: >> Hi David, >> >> thanks for the good explanation. Change looks good, I really like the >> comment in capture_initial_stack(). >> >> Question, with -Xss given and being smaller than current thread stack >> size, guard pages may appear in the middle of the invoking thread stack? >> I always thought this is a bit dangerous. If your model is to have the >> VM created from the main thread, which then goes off to do different >> things, and have other threads then attach and run java code, main >> thread later may crash in unrelated native code just because it reached >> the stack depth of the java threads? Or am I misunderstanding something?
> > There is no change to the general behaviour other than allowing a > primordial process thread that launches the VM, to now not have an > effective stack limited at 2MB. The current logic will insert guard > pages where ever -Xss states (as long as less than 2MB else 2MB), while > with the fix the guard pages will be inserted above 2MB - as dictated by > -Xss. > > David > ----- > >> Thanks, Thomas >> >> >> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > wrote: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >> >> >> The bug is not public unfortunately for non-technical reasons - but >> see my eval below. >> >> Background: if you load the JVM from the primordial thread of a >> process (not done by the java launcher since JDK 6), there is an >> artificial stack limit imposed on the initial thread (by sticking >> the guard page at the limit position of the actual stack) of the >> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >> ignored for the main thread even if the true stack is, say, 8M. This >> limitation dates back 10-15 years and is no longer relevant today >> and should be removed (see below). I've also added additional >> explanatory notes. >> >> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >> >> >> Testing was manually done by modifying the launcher to not run the >> VM in a new thread, and checking the resulting stack size used. >> >> This change will only affect hosted JVMs launched with a -Xss value >> > 2M. >> >> Thanks, >> David >> ----- >> >> Bug eval: >> >> JDK-4441425 limits the stack to 8M as a safeguard against an >> unlimited value from getrlimit in 1.3.1, but further constrained >> that to 2M in 1.4.0 due to JDK-4466587. 
>> >> By 1.4.2 we have the basic form of the current problematic code: >> >> #ifndef IA64 >> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >> #else >> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >> small >> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >> #endif >> >> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >> >> if (max_size && _initial_thread_stack_size > max_size) { >> _initial_thread_stack_size = max_size; >> } >> >> This was added by JDK-4678676 to allow the stack of the main thread >> to be _reduced_ below the default 2M/4M if the -Xss value was >> smaller than that.** There was no intent to allow the stack size to >> follow -Xss arbitrarily due to the operational constraints imposed >> by the OS/glibc at the time when dealing with the primordial process >> thread. >> >> ** It could not actually change the actual stack size of course, but >> set the guard pages to limit use to the expected stack size. >> >> In JDK 6, under JDK-6316197, the launcher was changed to create the >> JVM in a new thread, so that it was not limited by the >> idiosyncracies of the OS or thread library primordial thread >> handling. However, the stack size limitations remained in place in >> case the VM was launched from the primordial thread of a user >> application via the JNI invocation API. >> >> I believe it should be safe to remove the 2M limitation now. >> >> From martin.doerr at sap.com Tue Nov 29 13:08:16 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 29 Nov 2016 13:08:16 +0000 Subject: Presentation: Understanding OrderAccess Message-ID: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> Hi David and Erik, > But again that attribution of global properties is not something I think is necessarily implied or intended by OrderAccess. > Or maybe it is, but as it is only an issue on non-multicopy-atomic systems, it has never been called out explicitly. ?? 
And > those global properties must also be a part of the other barriers (as the fence is just the combination of them all) - but I > don't know how you would describe the affects of the other barriers (like loadload) in "global" terms. I think the global properties are implicitly assumed on multicopy-atomic systems and most people don't think about them. But they are important as soon as more than 2 threads are involved, especially on PPC64 and Aarch64. That's why I'd appreciate if they could be added to hotspot documentations or presentations. Also storeStore barriers are expected to be transitive or "cumulative" as the property is called in PPC64 documentation. If one thread releases something which is based on something else which was written by another thread, a third thread which acquires it, is expected to see that in a consistent way. Do you agree? loadStore and loadLoad barriers are much simpler as they basically require the following accesses to occur late enough without any global synchronization requirements. Best regards, Martin -----Original Message----- From: David Holmes [mailto:david.holmes at oracle.com] Sent: Montag, 28. November 2016 22:22 To: Doerr, Martin ; hotspot-dev developers Cc: ERIK.OSTERLUND Subject: Re: Presentation: Understanding OrderAccess Hi Martin, I've added Erik explicitly to the cc as he and I have been discussing fences and "visibility", and of course he most recently revised the descriptions in orderAccess.hpp On 29/11/2016 2:29 AM, Doerr, Martin wrote: > Hi David, > > sending the email again with corrected subject + removed confusing statement. My spam filter had added "[JUNK]". I have no clue what it didn't like. Sorry for that. > >> Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? > > This is really hard to explain. 
Maybe there are better explanations out there, but I'll give it a try: > > I think the comment in orderAccess.hpp is not bad:
> // Finally, we define a "fence" operation, as a bidirectional barrier.
> // It guarantees that any memory access preceding the fence is not
> // reordered w.r.t. any memory accesses subsequent to the fence in program
> // order.
> > One can consider a fence as a global operation which separates a set of accesses A from a set of accesses B. > If A contains a load, one has to include the corresponding store which may have been performed by another thread into A. > Especially the storeLoad part of the barrier must include stores performed by other processors but observed by this one. But again that attribution of global properties is not something I think is necessarily implied or intended by OrderAccess. Or maybe it is, but as it is only an issue on non-multicopy-atomic systems, it has never been called out explicitly. ?? And those global properties must also be a part of the other barriers (as the fence is just the combination of them all) - but I don't know how you would describe the effects of the other barriers (like loadload) in "global" terms. David ----- > >> Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... >> but does that somehow bind all memory accesses not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > > "Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged." [4] > > So acquire+release orders wrt.
all memory accesses while the total modification order only applies to "atomic operations that are so tagged". This is pretty much like volatile vs. non-volatile in Java [5]. > > > Best regards, > Martin > > [4] http://en.cppreference.com/w/cpp/atomic/memory_order#Sequentially-consistent_ordering > [5] http://g.oswego.edu/dl/jmm/cookbook.html > > > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Monday, 28. November 2016 13:56 > To: Doerr, Martin ; hotspot-dev developers > Subject: Re: Presentation: Understanding OrderAccess > > Hi Martin, > > On 28/11/2016 8:43 PM, Doerr, Martin wrote: >> Hi David, >> >> I know, multi-copy atomicity is hard to understand. It is relevant for complex scenarios in which more than 2 threads are involved. >> I think a good explanation is given in the paper [1] which we had discussed some time ago (email thread [2]). >> >> The term "multiple-copy atomicity" is described as "... in a machine >> which is not multiple-copy atomic, even if a write instruction is access-atomic, the write may become visible to different threads at different times ...". >> >> I think "IRIW" (in [1] "6.1 Extending SB to more threads: IRIW and RWC") is the most comprehensible example. >> The key property of the architectures is that "... writes can be propagated to different threads in different orders ...". > > Thanks for the reminder of that discussion. :) > >> A globally consistent order can be enforced by adding - in hotspot terms - OrderAccess::fence() between the read accesses. > > Problem there, I think, is that fence() is really not special in that regard. You need to insert something between the two loads to force a globally consistent view of memory. But what part of fence() gives that guarantee? Maybe there is something we need to define for non-multi-copy-atomic architectures to use just for this purpose. > >> Since you have asked about C++11, there's an example implementation for PPC [3].
>> Load Seq Cst uses a heavy-weight sync instruction (OrderAccess::fence() in hotspot terms) before the load. Such "Load Seq Cst" loads observe writes in a globally consistent order. > > Yeah I've seen the mappings but it is the conceptual model that I have a problem with. Andrew's reply makes it somewhat clearer - if every atomic op is seq-cst then you get a seq-cst execution ... but does that somehow bind all memory accesses, not just those involved in the atomic ops? And how do non seq-cst atomic ops interact with seq-cst ones? > >> Btw.: We have implemented the Java volatile accesses very similarly to [3] for PPC64 even though the recent Java memory model does not strictly require this implementation. >> But I guess the Java memory model is beyond the scope of your presentation. > > Oh yes way out of scope! :) > > Cheers, > David > >> Best regards, >> Martin >> >> >> [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf >> [2] >> http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-December/030212.html >> [3] >> http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html >> >> >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Monday, 28. November 2016 06:56 >> To: Doerr, Martin ; hotspot-dev developers >> >> Subject: Re: Presentation: Understanding OrderAccess >> >> Hi Martin >> >> On 24/11/2016 2:20 AM, Doerr, Martin wrote: >>> Hi David, >>> >>> thank you very much for the presentation. I think it provides a good guideline for hotspot development. >> >> Thanks. >> >>> >>> Would you like to add something about multi-copy atomicity? >> >> Not really. :) >> >>> E.g. there's a usage of OrderAccess::fence() in GenericTaskQueue::pop_global which is only needed on platforms which don't provide this property (PPC and ARM). >>> >>> It is needed in the following scenario: >>> - Different threads write 2 variables. >>> - Readers of these 2 variables expect a globally consistent order of the write accesses.
>>> >>> In this case, the readers must use OrderAccess::fence() between the 2 load accesses on platforms without "multi-copy atomicity". >> >> Hmmm ... I know this code was discussed at length a couple of years ago ... and I know I've probably forgotten most of what was discussed ... so I'll have to revisit this because this seems wrong ... >> >>> (While taking a look at it, the condition "#if !(defined SPARC || >>> defined IA32 || defined AMD64)" is not accurate and should be improved. E.g. s390 is multi-copy atomic.) >>> >>> >>> I like that you have added our cmpxchg_memory_order definition. We implemented it even more conservatively than C++'s seq_cst on PPC64. >> >> I still can't get my head around the C++11 terminology for this and >> how you are expected to use it - what does it mean for an individual >> operation to be "sequentially consistent"? :( >> >> Cheers, >> David >> >>> >>> Thanks and best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net] On >>> Behalf Of David Holmes >>> Sent: Wednesday, 23. November 2016 06:08 >>> To: hotspot-dev developers >>> Subject: Presentation: Understanding OrderAccess >>> >>> This is a presentation I recently gave internally to the runtime and serviceability teams that may be of more general interest to hotspot developers.
>>> >>> http://cr.openjdk.java.net/~dholmes/presentations/Understanding-OrderAccess-v1.1.pdf >>> >>> Cheers, >>> David >>> From thomas.stuefe at gmail.com Tue Nov 29 13:43:53 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 29 Nov 2016 14:43:53 +0100 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> Message-ID: Hi David, On Tue, Nov 29, 2016 at 12:59 PM, David Holmes wrote: > Hi Thomas, > > On 29/11/2016 8:39 PM, Thomas Stüfe wrote: > >> Hi David, >> >> thanks for the good explanation. Change looks good, I really like the >> comment in capture_initial_stack(). >> >> Question, with -Xss given and being smaller than the current thread stack >> size, guard pages may appear in the middle of the invoking thread stack? >> I always thought this is a bit dangerous. If your model is to have the >> VM created from the main thread, which then goes off to do different >> things, and have other threads then attach and run Java code, the main >> thread later may crash in unrelated native code just because it reached >> the stack depth of the Java threads? Or am I misunderstanding something? >> > > There is no change to the general behaviour other than allowing a > primordial process thread that launches the VM to now not have an > effective stack limited at 2MB. The current logic will insert guard pages > wherever -Xss states (as long as less than 2MB, else 2MB), while with the > fix the guard pages will be inserted above 2MB - as dictated by -Xss. > > Thank you for this answer. I know my question was outside the scope of your patch. Thomas > David > ----- > > Thanks, Thomas >> >> >> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > wrote: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >> >> >> The bug is not public unfortunately for non-technical reasons - but >> see my eval below.
>> >> Background: if you load the JVM from the primordial thread of a >> process (not done by the java launcher since JDK 6), there is an >> artificial stack limit imposed on the initial thread (by sticking >> the guard page at the limit position of the actual stack) of the >> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >> ignored for the main thread even if the true stack is, say, 8M. This >> limitation dates back 10-15 years and is no longer relevant today >> and should be removed (see below). I've also added additional >> explanatory notes. >> >> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >> >> >> Testing was manually done by modifying the launcher to not run the >> VM in a new thread, and checking the resulting stack size used. >> >> This change will only affect hosted JVMs launched with a -Xss value >> > 2M. >> >> Thanks, >> David >> ----- >> >> Bug eval: >> >> JDK-4441425 limits the stack to 8M as a safeguard against an >> unlimited value from getrlimit in 1.3.1, but further constrained >> that to 2M in 1.4.0 due to JDK-4466587. >> >> By 1.4.2 we have the basic form of the current problematic code: >> >> #ifndef IA64 >> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >> #else >> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >> small >> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >> #endif >> >> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >> >> if (max_size && _initial_thread_stack_size > max_size) { >> _initial_thread_stack_size = max_size; >> } >> >> This was added by JDK-4678676 to allow the stack of the main thread >> to be _reduced_ below the default 2M/4M if the -Xss value was >> smaller than that.** There was no intent to allow the stack size to >> follow -Xss arbitrarily due to the operational constraints imposed >> by the OS/glibc at the time when dealing with the primordial process >> thread. 
>> >> ** It could not actually change the actual stack size of course, but >> set the guard pages to limit use to the expected stack size. >> >> In JDK 6, under JDK-6316197, the launcher was changed to create the >> JVM in a new thread, so that it was not limited by the >> idiosyncrasies of the OS or thread library primordial thread >> handling. However, the stack size limitations remained in place in >> case the VM was launched from the primordial thread of a user >> application via the JNI invocation API. >> >> I believe it should be safe to remove the 2M limitation now. >> >> From aph at redhat.com Tue Nov 29 14:07:23 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 14:07:23 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> Message-ID: <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> On 29/11/16 13:08, Doerr, Martin wrote: > Also storeStore barriers are expected to be transitive or "cumulative" as the property is called in PPC64 documentation. > If one thread releases something which is based on something else which was written by another thread, a third thread > which acquires it is expected to see that in a consistent way. Do you agree? It depends. What exactly do you mean by "is based on"? Andrew. From aph at redhat.com Tue Nov 29 15:35:16 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 15:35:16 +0000 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: On 29/11/16 09:41, Volker Simonis wrote: > Thanks Gustavo, > > the change looks good. > > So now we're just waiting for another review from somebody of the aarch64 folks. > Once we have that and the fc-request is approved I'll push the changes.
One thing I don't understand: cos 0.17098435541865692 1m7.433s 0.1709843554185943 0m56.678s sin 1.7136493465700289 1m10.654s 1.7136493465700542 0m57.114s Do you know what causes the lower digits to be different? Is it that Math and StrictMath use different algorithms, not just different optimization levels? Andrew. From gromero at linux.vnet.ibm.com Tue Nov 29 16:31:58 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Tue, 29 Nov 2016 14:31:58 -0200 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583C5B10.8040204@linux.vnet.ibm.com> Message-ID: <583DAD7E.7020807@linux.vnet.ibm.com> Hi Andrew, On 29-11-2016 13:35, Andrew Haley wrote: > On 29/11/16 09:41, Volker Simonis wrote: >> Thanks Gustavo, >> >> the change looks good. >> >> So now we're just waiting for another review from somebody of the aarch64 folks. >> Once we have that and the fc-request is approved I'll push the changes. > > One thing I don't understand: > > cos 0.17098435541865692 1m7.433s 0.1709843554185943 0m56.678s > sin 1.7136493465700289 1m10.654s 1.7136493465700542 0m57.114s > > Do you know what causes the lower digits to be different? Is > it that Math and StrictMath use different algorithms, not just > different optimization levels? I don't know exactly what's the root cause for that difference (in the result). The difference is not present on x64, however on PPC64 even with -O0 (as it is by now) that difference exists. Math methods are intrisified, but StricMath are not. But I understand that Math and StrictMath share the fdlibm code since I already changed some code in fdlibm that reflected both on Math and StrictMath, so it's not clear to me where the Math relaxation occurs on PPC64 (given that such a relaxation is allowed [1]). For sure others much more experienced than I can comment about difference. 
Regards, Gustavo [1] https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html From martin.doerr at sap.com Tue Nov 29 17:50:32 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 29 Nov 2016 17:50:32 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> Message-ID: <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> Hi Andrew, I mean a scenario like the one in 5.1 "Cumulative Barriers for WRC" in [1]. Thread 1 reads a value from Thread 0, Thread 1 publishes something e.g. by a releasing store (which could be lwsync + store on PPC64) and Thread 2 acquires this value (or relies on address-dependency-based ordering). The barrier must order Thread 0's store wrt. Thread 1's store in this case. E.g. Thread 1 could have updated a data structure referencing stuff from Thread 0. I think we all rely on Thread 2 seeing at least the same changes from Thread 0 when accessing this data structure. So this "cumulative" property is relevant for hotspot's OrderAccess functions. Best regards, Martin [1] http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf -----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Tuesday, 29. November 2016 15:07 To: Doerr, Martin ; David Holmes ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess On 29/11/16 13:08, Doerr, Martin wrote: > Also storeStore barriers are expected to be transitive or "cumulative" as the property is called in PPC64 documentation. > If one thread releases something which is based on something else > which was written by another thread, a third thread which acquires it is expected to see that in a consistent way. Do you agree? It depends. What exactly do you mean by "is based on"? Andrew.
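Martin's WRC scenario can be sketched with C++11 atomics instead of OrderAccess (the function and variable names below are invented for illustration, not taken from the thread). With release/acquire on every link of the chain, happens-before is transitive, so the C++ memory model already guarantees the "cumulative" outcome Martin expects: Thread 2 must see Thread 0's store.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};

// One run of the write-to-read-causality (WRC) shape:
//   T0: x = 1 (release)
//   T1: reads x == 1 (acquire), then publishes y = 1 (release)
//   T2: reads y == 1 (acquire), then reads x
// Returns T2's view of x; the C++11 model requires the result to be 1.
int wrc_once() {
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);
    int r = -1;
    std::thread t0([] { x.store(1, std::memory_order_release); });
    std::thread t1([] {
        while (x.load(std::memory_order_acquire) != 1) { }  // sync with T0
        y.store(1, std::memory_order_release);              // "based on" x
    });
    std::thread t2([&r] {
        while (y.load(std::memory_order_acquire) != 1) { }  // sync with T1
        r = x.load(std::memory_order_relaxed);  // happens-after T0's store
    });
    t0.join(); t1.join(); t2.join();
    return r;
}
```

Bare address dependencies (or `memory_order_consume`, which Andrew warns about) would need separate treatment; the sketch deliberately sticks to acquire, which Andrew says is fine.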
From daniel.daugherty at oracle.com Tue Nov 29 17:57:50 2016 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Tue, 29 Nov 2016 10:57:50 -0700 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> Message-ID: Sorry for being late to this party! Seems like thread stack sizes are very much on folks' minds lately... > webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ src/os/linux/vm/os_linux.cpp L936: // a user-specified value known to be greater than the minimum needed. Perhaps: ... known to be at least the minimum needed. As enforced by this code in os::Posix::set_minimum_stack_sizes(): _java_thread_min_stack_allowed = MAX2(_java_thread_min_stack_allowed, JavaThread::stack_guard_zone_size() + JavaThread::stack_shadow_zone_size() + (4 * BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K); _java_thread_min_stack_allowed = align_size_up(_java_thread_min_stack_allowed, vm_page_size()); size_t stack_size_in_bytes = ThreadStackSize * K; if (stack_size_in_bytes != 0 && stack_size_in_bytes < _java_thread_min_stack_allowed) { // The '-Xss' and '-XX:ThreadStackSize=N' options both set // ThreadStackSize so we go with "Java thread stack size" instead // of "ThreadStackSize" to be more friendly. tty->print_cr("\nThe Java thread stack size specified is too small. " "Specify at least " SIZE_FORMAT "k", _java_thread_min_stack_allowed / K); return JNI_ERR; } L939: // can not do anything to emulate a larger stack than what has been provided by Typo: 'can not' -> 'cannot' L943: // Mamimum stack size is the easy part, get it from RLIMIT_STACK Typo: 'Mamimum' -> 'Maximum' nit - please add a '.' to the end. Thumbs up! I don't need to see a new webrev if you decide to make the minor edits above.
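For reference, the two pieces of arithmetic in the check Dan quotes can be sketched as follows (the helper names here are invented for illustration, though the first mirrors the behaviour of HotSpot's align_size_up for power-of-two alignments):

```cpp
#include <cassert>
#include <cstddef>

// Round sz up to the next multiple of a power-of-two alignment, as the
// page-size rounding of _java_thread_min_stack_allowed does above.
size_t align_size_up_sketch(size_t sz, size_t alignment) {
    return (sz + alignment - 1) & ~(alignment - 1);
}

// The shape of the -Xss sanity check: a requested stack size of 0 means
// "use the default"; any other value below the computed minimum fails.
bool stack_size_ok(size_t requested, size_t min_allowed) {
    return requested == 0 || requested >= min_allowed;
}
```

The 228k minimum used in the usage check below is an arbitrary stand-in, not the value HotSpot computes, which depends on guard, shadow, and word sizes as shown in the quoted code.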
Dan On 11/29/16 5:25 AM, David Holmes wrote: > I just realized I overlooked the case where ThreadStackSize=0 and the > stack is unlimited. In that case it isn't clear where the guard pages > will get inserted - I do know that I don't get a stackoverflow error. > > This needs further investigation. > > David > > On 29/11/2016 9:59 PM, David Holmes wrote: >> Hi Thomas, >> >> On 29/11/2016 8:39 PM, Thomas Stüfe wrote: >>> Hi David, >>> >>> thanks for the good explanation. Change looks good, I really like the >>> comment in capture_initial_stack(). >>> >>> Question, with -Xss given and being smaller than current thread stack >>> size, guard pages may appear in the middle of the invoking thread >>> stack? >>> I always thought this is a bit dangerous. If your model is to have the >>> VM created from the main thread, which then goes off to do different >>> things, and have other threads then attach and run Java code, main >>> thread later may crash in unrelated native code just because it reached >>> the stack depth of the Java threads? Or am I misunderstanding >>> something? >> >> There is no change to the general behaviour other than allowing a >> primordial process thread that launches the VM, to now not have an >> effective stack limited at 2MB. The current logic will insert guard >> pages wherever -Xss states (as long as less than 2MB else 2MB), while >> with the fix the guard pages will be inserted above 2MB - as dictated by >> -Xss. >> >> David >> ----- >> >>> Thanks, Thomas >>> >>> >>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >> > wrote: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>> >>> >>> The bug is not public unfortunately for non-technical reasons - but >>> see my eval below.
>>> >>> Background: if you load the JVM from the primordial thread of a >>> process (not done by the java launcher since JDK 6), there is an >>> artificial stack limit imposed on the initial thread (by sticking >>> the guard page at the limit position of the actual stack) of the >>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M >>> it is >>> ignored for the main thread even if the true stack is, say, 8M. >>> This >>> limitation dates back 10-15 years and is no longer relevant today >>> and should be removed (see below). I've also added additional >>> explanatory notes. >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>> >>> >>> Testing was manually done by modifying the launcher to not run the >>> VM in a new thread, and checking the resulting stack size used. >>> >>> This change will only affect hosted JVMs launched with a -Xss value >>> > 2M. >>> >>> Thanks, >>> David >>> ----- >>> >>> Bug eval: >>> >>> JDK-4441425 limits the stack to 8M as a safeguard against an >>> unlimited value from getrlimit in 1.3.1, but further constrained >>> that to 2M in 1.4.0 due to JDK-4466587. 
>>> >>> By 1.4.2 we have the basic form of the current problematic code: >>> >>> #ifndef IA64 >>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>> #else >>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>> small >>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>> #endif >>> >>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>> >>> if (max_size && _initial_thread_stack_size > max_size) { >>> _initial_thread_stack_size = max_size; >>> } >>> >>> This was added by JDK-4678676 to allow the stack of the main thread >>> to be _reduced_ below the default 2M/4M if the -Xss value was >>> smaller than that.** There was no intent to allow the stack size to >>> follow -Xss arbitrarily due to the operational constraints imposed >>> by the OS/glibc at the time when dealing with the primordial >>> process >>> thread. >>> >>> ** It could not actually change the actual stack size of course, >>> but >>> set the guard pages to limit use to the expected stack size. >>> >>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>> JVM in a new thread, so that it was not limited by the >>> idiosyncrasies of the OS or thread library primordial thread >>> handling. However, the stack size limitations remained in place in >>> case the VM was launched from the primordial thread of a user >>> application via the JNI invocation API. >>> >>> I believe it should be safe to remove the 2M limitation now.
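The historical clamp in the quoted eval can be modelled against the POSIX getrlimit API. This is a simplified sketch of the pre-fix behaviour being removed, not the actual os_linux.cpp code (the function name is invented, and the IA64 branch is omitted):

```cpp
#include <cassert>
#include <sys/resource.h>
#include <unistd.h>

// Model of the old primordial-thread logic: take the soft RLIMIT_STACK,
// cap it at 2M (the non-IA64 branch quoted above), then round down to a
// page boundary as "rlim.rlim_cur & ~(page_size() - 1)" did.
size_t clamped_initial_stack_size() {
    struct rlimit rlim;
    if (getrlimit(RLIMIT_STACK, &rlim) != 0) return 0;
    const rlim_t two_m = 2 * 1024 * 1024;
    rlim_t cur = rlim.rlim_cur;
    if (cur == RLIM_INFINITY || cur > two_m) cur = two_m;  // the 2M cap
    size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    return static_cast<size_t>(cur) & ~(page - 1);         // page-align down
}
```

With the fix under review, the cap is gone and the guard pages follow -Xss instead; the sketch only models the legacy behaviour so the effect of removing it is concrete.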
>>> >>> From volker.simonis at gmail.com Tue Nov 29 18:06:05 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 29 Nov 2016 19:06:05 +0100 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <583DAD7E.7020807@linux.vnet.ibm.com> References: <583C5B10.8040204@linux.vnet.ibm.com> <583DAD7E.7020807@linux.vnet.ibm.com> Message-ID: On Tue, Nov 29, 2016 at 5:31 PM, Gustavo Romero wrote: > Hi Andrew, > > On 29-11-2016 13:35, Andrew Haley wrote: >> On 29/11/16 09:41, Volker Simonis wrote: >>> Thanks Gustavo, >>> >>> the change looks good. >>> >>> So now we're just waiting for another review from somebody of the aarch64 folks. >>> Once we have that and the fc-request is approved I'll push the changes. >> >> One thing I don't understand: >> >> cos 0.17098435541865692 1m7.433s 0.1709843554185943 0m56.678s >> sin 1.7136493465700289 1m10.654s 1.7136493465700542 0m57.114s >> >> Do you know what causes the lower digits to be different? Is >> it that Math and StrictMath use different algorithms, not just >> different optimization levels? > > I don't know exactly what's the root cause for that difference (in the result). > The difference is not present on x64, however on PPC64 even with -O0 (as it is > by now) that difference exists. > > Math methods are intrinsified, but StrictMath methods are not. But I understand that Math > and StrictMath share the fdlibm code since I already changed some code in fdlibm > that reflected both on Math and StrictMath, so it's not clear to me where the > Math relaxation occurs on PPC64 (given that such a relaxation is allowed [1]). > I think the difference is because Math functions can be intrinsified (and optimized) while StrictMath functions cannot. HotSpot has different ways of intrinsifying the Math functions. If the CPU supports the corresponding function, the VM generates special nodes for that.
Otherwise, if there exist special optimized assembler stubs for a function (e.g. see "StubRoutines::_dsin = generate_libmSin()" in stubGenerator_x86_64.cpp) the VM makes use of them. Otherwise it still uses leaf-calls into HotSpot's internal C++ implementation of the functions (e.g. SharedRuntime::dsin() in sharedRuntimeTrig.cpp) which are faster than doing a native call into the fdlibm version. The implementation in SharedRuntime doesn't have to be "strict" so it probably uses fused multiplication, and it is also built with full optimization without '-ffp-contract=off' (which is OK in this case). @Andrew: are you fine with Gustavo's latest version of the change? > For sure others much more experienced than I am can comment on the difference. > > > Regards, > Gustavo > > [1] https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html > From aph at redhat.com Tue Nov 29 18:15:09 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 18:15:09 +0000 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: References: <583C5B10.8040204@linux.vnet.ibm.com> <583DAD7E.7020807@linux.vnet.ibm.com> Message-ID: <3c3aa7f0-01c7-46ae-ce8d-414d43213e4a@redhat.com> On 29/11/16 18:06, Volker Simonis wrote: > @Andrew: are you fine with Gustavo's latest version of the change? Sure. The StrictMath versions all seem to give the same results. Andrew.
From gromero at linux.vnet.ibm.com Tue Nov 29 18:37:01 2016 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Tue, 29 Nov 2016 16:37:01 -0200 Subject: [aarch64-port-dev ] RFR(s) PPC64/s390x/aarch64: Poor StrictMath performance due to non-optimized compilation In-Reply-To: <3c3aa7f0-01c7-46ae-ce8d-414d43213e4a@redhat.com> References: <583C5B10.8040204@linux.vnet.ibm.com> <583DAD7E.7020807@linux.vnet.ibm.com> <3c3aa7f0-01c7-46ae-ce8d-414d43213e4a@redhat.com> Message-ID: <583DCACD.3090803@linux.vnet.ibm.com> Hi Erik, Volker, Andrew On 29-11-2016 16:15, Andrew Haley wrote: > On 29/11/16 18:06, Volker Simonis wrote: >> @Andrew: are you fine with Gustavos latest version of the change? > > Sure. The StrictMath versions all seem to give the same results. > > Andrew. > Thanks for reviewing the change! I changed the "FC Extension Request" status to "reviewed". Regards, Gustavo From aph at redhat.com Tue Nov 29 19:01:56 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 29 Nov 2016 19:01:56 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> Message-ID: <612df825-e963-2518-db9e-bf713dbf166a@redhat.com> On 29/11/16 17:50, Doerr, Martin wrote: > I mean a scenario like in 5.1 " Cumulative Barriers for WRC" in [1]. > > Thread 1 reads a value from Thread 0, Thread 1 publishes something > e.g. by a releasing store (which could be lwsync + store on PPC64) > and Thread 2 acquires this value (or relies on address dependency > based ordering). > > The barrier must order Thread 0's store wrt. Thread 1's store in this case. > > E.g. Thread 1 could have updated a data structure referencing stuff > from Thread 0. 
I think we all rely on Thread 2 seeing at least > the same changes from Thread 0 when accessing this data > structure. So this "cumulative" property is relevant for hotspot's > OrderAccess functions. You can't rely on address dependency ordering in a language like C++ unless you use something like memory_order_consume: the compiler is capable of optimizing your code so that it doesn't use the address you think it should be using. That example is only valid for assembly code. Acquire is fine. Andrew. From david.holmes at oracle.com Wed Nov 30 07:22:49 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Nov 2016 17:22:49 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> Message-ID: <0bf53099-6f87-419c-ca5c-af6437002929@oracle.com> Thanks for the review Dan. Unfortunately I overlooked one case - see my other emails. :) Cheers, David On 30/11/2016 3:57 AM, Daniel D. Daugherty wrote: > Sorry for being late to this party! Seems like thread stack sizes are > very much on folks' minds lately... > >> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > src/os/linux/vm/os_linux.cpp > L936: // a user-specified value known to be greater than the > minimum needed. > Perhaps: ... known to be at least the minimum needed.
> > As enforced by this code in os::Posix::set_minimum_stack_sizes(): > > _java_thread_min_stack_allowed = > MAX2(_java_thread_min_stack_allowed, > JavaThread::stack_guard_zone_size() + > JavaThread::stack_shadow_zone_size() + > (4 * BytesPerWord > COMPILER2_PRESENT(+ 2)) * 4 * K); > > _java_thread_min_stack_allowed = > align_size_up(_java_thread_min_stack_allowed, vm_page_size()); > > size_t stack_size_in_bytes = ThreadStackSize * K; > if (stack_size_in_bytes != 0 && > stack_size_in_bytes < _java_thread_min_stack_allowed) { > // The '-Xss' and '-XX:ThreadStackSize=N' options both set > // ThreadStackSize so we go with "Java thread stack size" instead > // of "ThreadStackSize" to be more friendly. > tty->print_cr("\nThe Java thread stack size specified is too > small. " > "Specify at least " SIZE_FORMAT "k", > _java_thread_min_stack_allowed / K); > return JNI_ERR; > } > > L939: // can not do anything to emulate a larger stack than what > has been provided by > Typo: 'can not' -> 'cannot' > > L943: // Mamimum stack size is the easy part, get it from > RLIMIT_STACK > Typo: 'Mamimum' -> 'Maximum' > nit - please add a '.' to the end. > > > Thumbs up! > > I don't need to see a new webrev if you decide to make the > minor edits above. > > Dan > > > > On 11/29/16 5:25 AM, David Holmes wrote: >> I just realized I overlooked the case where ThreadStackSize=0 and the >> stack is unlimited. In that case it isn't clear where the guard pages >> will get inserted - I do know that I don't get a stackoverflow error. >> >> This needs further investigation. >> >> David >> >> On 29/11/2016 9:59 PM, David Holmes wrote: >>> Hi Thomas, >>> >>> On 29/11/2016 8:39 PM, Thomas Stüfe wrote: >>>> Hi David, >>>> >>>> thanks for the good explanation. Change looks good, I really like the >>>> comment in capture_initial_stack(). >>>> >>>> Question, with -Xss given and being smaller than current thread stack >>>> size, guard pages may appear in the middle of the invoking thread >>>> stack?
>>>> I always thought this is a bit dangerous. If your model is to have the >>>> VM created from the main thread, which then goes off to do different >>>> things, and have other threads then attach and run Java code, main >>>> thread later may crash in unrelated native code just because it reached >>>> the stack depth of the Java threads? Or am I misunderstanding >>>> something? >>> >>> There is no change to the general behaviour other than allowing a >>> primordial process thread that launches the VM, to now not have an >>> effective stack limited at 2MB. The current logic will insert guard >>> pages wherever -Xss states (as long as less than 2MB else 2MB), while >>> with the fix the guard pages will be inserted above 2MB - as dictated by >>> -Xss. >>> >>> David >>> ----- >>> >>>> Thanks, Thomas >>>> >>>> >>>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >>>> > wrote: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>>> >>>> >>>> The bug is not public unfortunately for non-technical reasons - but >>>> see my eval below. >>>> >>>> Background: if you load the JVM from the primordial thread of a >>>> process (not done by the java launcher since JDK 6), there is an >>>> artificial stack limit imposed on the initial thread (by sticking >>>> the guard page at the limit position of the actual stack) of the >>>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M >>>> it is >>>> ignored for the main thread even if the true stack is, say, 8M. >>>> This >>>> limitation dates back 10-15 years and is no longer relevant today >>>> and should be removed (see below). I've also added additional >>>> explanatory notes. >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>>> >>>> >>>> Testing was manually done by modifying the launcher to not run the >>>> VM in a new thread, and checking the resulting stack size used. >>>> >>>> This change will only affect hosted JVMs launched with a -Xss value >>>> > 2M.
>>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> Bug eval: >>>> >>>> JDK-4441425 limits the stack to 8M as a safeguard against an >>>> unlimited value from getrlimit in 1.3.1, but further constrained >>>> that to 2M in 1.4.0 due to JDK-4466587. >>>> >>>> By 1.4.2 we have the basic form of the current problematic code: >>>> >>>> #ifndef IA64 >>>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>>> #else >>>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>>> small >>>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>>> #endif >>>> >>>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>>> >>>> if (max_size && _initial_thread_stack_size > max_size) { >>>> _initial_thread_stack_size = max_size; >>>> } >>>> >>>> This was added by JDK-4678676 to allow the stack of the main thread >>>> to be _reduced_ below the default 2M/4M if the -Xss value was >>>> smaller than that.** There was no intent to allow the stack size to >>>> follow -Xss arbitrarily due to the operational constraints imposed >>>> by the OS/glibc at the time when dealing with the primordial >>>> process >>>> thread. >>>> >>>> ** It could not actually change the actual stack size of course, >>>> but >>>> set the guard pages to limit use to the expected stack size. >>>> >>>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>>> JVM in a new thread, so that it was not limited by the >>>> idiosyncracies of the OS or thread library primordial thread >>>> handling. However, the stack size limitations remained in place in >>>> case the VM was launched from the primordial thread of a user >>>> application via the JNI invocation API. >>>> >>>> I believe it should be safe to remove the 2M limitation now. 
>>>> >>>> > From david.holmes at oracle.com Wed Nov 30 07:35:24 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Nov 2016 17:35:24 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> Message-ID: <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> On 29/11/2016 10:25 PM, David Holmes wrote: > I just realized I overlooked the case where ThreadStackSize=0 and the > stack is unlimited. In that case it isn't clear where the guard pages > will get inserted - I do know that I don't get a stackoverflow error. > > This needs further investigation. So what happens here is that the massive stack-size causes stack-bottom to be higher than stack-top! So we will set a guard-page goodness knows where, and we can consume the current stack until such time as we hit an unmapped or protected region at which point we are killed. I'm not sure what to do here. My gut feel is that in such a case we should not attempt to create a guard page in the initial thread. That would require using a sentinel value for the stack-size. Though it also presents a problem for stack-bottom - which is implicitly zero. It may also give false positives in the is_initial_thread() check! Thoughts? Suggestions? > David > > On 29/11/2016 9:59 PM, David Holmes wrote: >> Hi Thomas, >> >> On 29/11/2016 8:39 PM, Thomas St?fe wrote: >>> Hi David, >>> >>> thanks for the good explanation. Change looks good, I really like the >>> comment in capture_initial_stack(). >>> >>> Question, with -Xss given and being smaller than current thread stack >>> size, guard pages may appear in the middle of the invoking thread stack? >>> I always thought this is a bit dangerous. 
If your model is to have the >>> VM created from the main thread, which then goes off to do different >>> things, and have other threads then attach and run java code, main >>> thread later may crash in unrelated native code just because it reached >>> the stack depth of the hava threads? Or am I misunderstanding something? >> >> There is no change to the general behaviour other than allowing a >> primordial process thread that launches the VM, to now not have an >> effective stack limited at 2MB. The current logic will insert guard >> pages where ever -Xss states (as long as less than 2MB else 2MB), while >> with the fix the guard pages will be inserted above 2MB - as dictated by >> -Xss. >> >> David >> ----- >> >>> Thanks, Thomas >>> >>> >>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >> > wrote: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>> >>> >>> The bug is not public unfortunately for non-technical reasons - but >>> see my eval below. >>> >>> Background: if you load the JVM from the primordial thread of a >>> process (not done by the java launcher since JDK 6), there is an >>> artificial stack limit imposed on the initial thread (by sticking >>> the guard page at the limit position of the actual stack) of the >>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >>> ignored for the main thread even if the true stack is, say, 8M. This >>> limitation dates back 10-15 years and is no longer relevant today >>> and should be removed (see below). I've also added additional >>> explanatory notes. >>> >>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>> >>> >>> Testing was manually done by modifying the launcher to not run the >>> VM in a new thread, and checking the resulting stack size used. >>> >>> This change will only affect hosted JVMs launched with a -Xss value >>> > 2M. 
>>> >>> Thanks, >>> David >>> ----- >>> >>> Bug eval: >>> >>> JDK-4441425 limits the stack to 8M as a safeguard against an >>> unlimited value from getrlimit in 1.3.1, but further constrained >>> that to 2M in 1.4.0 due to JDK-4466587. >>> >>> By 1.4.2 we have the basic form of the current problematic code: >>> >>> #ifndef IA64 >>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>> #else >>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>> small >>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>> #endif >>> >>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>> >>> if (max_size && _initial_thread_stack_size > max_size) { >>> _initial_thread_stack_size = max_size; >>> } >>> >>> This was added by JDK-4678676 to allow the stack of the main thread >>> to be _reduced_ below the default 2M/4M if the -Xss value was >>> smaller than that.** There was no intent to allow the stack size to >>> follow -Xss arbitrarily due to the operational constraints imposed >>> by the OS/glibc at the time when dealing with the primordial process >>> thread. >>> >>> ** It could not actually change the actual stack size of course, but >>> set the guard pages to limit use to the expected stack size. >>> >>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>> JVM in a new thread, so that it was not limited by the >>> idiosyncracies of the OS or thread library primordial thread >>> handling. However, the stack size limitations remained in place in >>> case the VM was launched from the primordial thread of a user >>> application via the JNI invocation API. >>> >>> I believe it should be safe to remove the 2M limitation now. 
>>> >>> From thomas.stuefe at gmail.com Wed Nov 30 08:17:03 2016 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 30 Nov 2016 09:17:03 +0100 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> Message-ID: On Wed, Nov 30, 2016 at 8:35 AM, David Holmes wrote: > On 29/11/2016 10:25 PM, David Holmes wrote: > >> I just realized I overlooked the case where ThreadStackSize=0 and the >> stack is unlimited. In that case it isn't clear where the guard pages >> will get inserted - I do know that I don't get a stackoverflow error. >> >> This needs further investigation. >> > > So what happens here is that the massive stack-size causes stack-bottom to > be higher than stack-top! So we will set a guard-page goodness knows where, > and we can consume the current stack until such time as we hit an unmapped > or protected region at which point we are killed. > > I'm not sure what to do here. My gut feel is that in such a case we should > not attempt to create a guard page in the initial thread. That would > require using a sentinel value for the stack-size. Though it also presents > a problem for stack-bottom - which is implicitly zero. It may also give > false positives in the is_initial_thread() check! > > Thoughts? Suggestions? > > Maybe I am overlooking something, but should os::capture_initial_thread() not call pthread_getattr_np() first to handle the case where the VM was created on a pthread which is not the primordial thread and may have a different stack size than what getrlimit returns? And fall back to getrlimit only if pthread_getattr_np() fails? And then we also should handle RLIM_INFINITY. For that case, I also think not setting guard pages would be safest. 
We also may just refuse to run in that case, because the workaround for the user is easy - just set the limit before process start. Note that on AIX, we currently refuse to run on the primordial thread because it may have different page sizes than pthreads and it is impossible to get the exact stack locations. Thomas > > David >> >> On 29/11/2016 9:59 PM, David Holmes wrote: >> >>> Hi Thomas, >>> >>> On 29/11/2016 8:39 PM, Thomas St?fe wrote: >>> >>>> Hi David, >>>> >>>> thanks for the good explanation. Change looks good, I really like the >>>> comment in capture_initial_stack(). >>>> >>>> Question, with -Xss given and being smaller than current thread stack >>>> size, guard pages may appear in the middle of the invoking thread stack? >>>> I always thought this is a bit dangerous. If your model is to have the >>>> VM created from the main thread, which then goes off to do different >>>> things, and have other threads then attach and run java code, main >>>> thread later may crash in unrelated native code just because it reached >>>> the stack depth of the hava threads? Or am I misunderstanding something? >>>> >>> >>> There is no change to the general behaviour other than allowing a >>> primordial process thread that launches the VM, to now not have an >>> effective stack limited at 2MB. The current logic will insert guard >>> pages where ever -Xss states (as long as less than 2MB else 2MB), while >>> with the fix the guard pages will be inserted above 2MB - as dictated by >>> -Xss. >>> >>> David >>> ----- >>> >>> Thanks, Thomas >>>> >>>> >>>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >>> > wrote: >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>>> >>>> >>>> The bug is not public unfortunately for non-technical reasons - but >>>> see my eval below. 
>>>> >>>> Background: if you load the JVM from the primordial thread of a >>>> process (not done by the java launcher since JDK 6), there is an >>>> artificial stack limit imposed on the initial thread (by sticking >>>> the guard page at the limit position of the actual stack) of the >>>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M it is >>>> ignored for the main thread even if the true stack is, say, 8M. This >>>> limitation dates back 10-15 years and is no longer relevant today >>>> and should be removed (see below). I've also added additional >>>> explanatory notes. >>>> >>>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>>> >>>> >>>> Testing was manually done by modifying the launcher to not run the >>>> VM in a new thread, and checking the resulting stack size used. >>>> >>>> This change will only affect hosted JVMs launched with a -Xss value >>>> > 2M. >>>> >>>> Thanks, >>>> David >>>> ----- >>>> >>>> Bug eval: >>>> >>>> JDK-4441425 limits the stack to 8M as a safeguard against an >>>> unlimited value from getrlimit in 1.3.1, but further constrained >>>> that to 2M in 1.4.0 due to JDK-4466587. 
>>>> >>>> By 1.4.2 we have the basic form of the current problematic code: >>>> >>>> #ifndef IA64 >>>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>>> #else >>>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>>> small >>>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>>> #endif >>>> >>>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - 1); >>>> >>>> if (max_size && _initial_thread_stack_size > max_size) { >>>> _initial_thread_stack_size = max_size; >>>> } >>>> >>>> This was added by JDK-4678676 to allow the stack of the main thread >>>> to be _reduced_ below the default 2M/4M if the -Xss value was >>>> smaller than that.** There was no intent to allow the stack size to >>>> follow -Xss arbitrarily due to the operational constraints imposed >>>> by the OS/glibc at the time when dealing with the primordial process >>>> thread. >>>> >>>> ** It could not actually change the actual stack size of course, but >>>> set the guard pages to limit use to the expected stack size. >>>> >>>> In JDK 6, under JDK-6316197, the launcher was changed to create the >>>> JVM in a new thread, so that it was not limited by the >>>> idiosyncracies of the OS or thread library primordial thread >>>> handling. However, the stack size limitations remained in place in >>>> case the VM was launched from the primordial thread of a user >>>> application via the JNI invocation API. >>>> >>>> I believe it should be safe to remove the 2M limitation now. 
>>>> >>>> >>>> From martin.doerr at sap.com Wed Nov 30 08:36:21 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 30 Nov 2016 08:36:21 +0000 Subject: Presentation: Understanding OrderAccess In-Reply-To: <612df825-e963-2518-db9e-bf713dbf166a@redhat.com> References: <6fe68555c44b4ad6b2c88f8c3aef150a@dewdfe13de06.global.corp.sap> <593d72b9-e9e6-13c5-44e4-9626c22a68ce@redhat.com> <47aa81fbef654a38a0a2f50373729921@dewdfe13de06.global.corp.sap> <612df825-e963-2518-db9e-bf713dbf166a@redhat.com> Message-ID: <32c3e619ba3e4dcf9525e596b5c91312@dewdfe13de06.global.corp.sap>

Hi Andrew,

I know that. My point was the global effect of Thread 1's barrier.

Best regards, Martin

-----Original Message----- From: Andrew Haley [mailto:aph at redhat.com] Sent: Tuesday, 29 November 2016 20:02 To: Doerr, Martin ; David Holmes ; hotspot-dev developers Subject: Re: Presentation: Understanding OrderAccess

On 29/11/16 17:50, Doerr, Martin wrote: > I mean a scenario like in 5.1 "Cumulative Barriers for WRC" in [1]. > > Thread 1 reads a value from Thread 0, Thread 1 publishes something > e.g. by a releasing store (which could be lwsync + store on PPC64) and > Thread 2 acquires this value (or relies on address-dependency-based > ordering). > > The barrier must order Thread 0's store wrt. Thread 1's store in this case. > > E.g. Thread 1 could have updated a data structure referencing stuff > from Thread 0. I think we all rely on Thread 2 seeing at least the > same changes from Thread 0 when accessing this data structure. So this > "cumulative" property is relevant for hotspot's OrderAccess functions.

You can't rely on address dependency ordering in a language like C++ unless you use something like memory_order_consume: the compiler is capable of optimizing your code so that it doesn't use the address you think it should be using. That example is only valid for assembly code. Acquire is fine.

Andrew.
From david.holmes at oracle.com Wed Nov 30 08:46:47 2016 From: david.holmes at oracle.com (David Holmes) Date: Wed, 30 Nov 2016 18:46:47 +1000 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> <52c86d04-ff44-a720-f376-a2a34d091b02@oracle.com> Message-ID: On 30/11/2016 6:17 PM, Thomas St?fe wrote: > On Wed, Nov 30, 2016 at 8:35 AM, David Holmes > wrote: > > On 29/11/2016 10:25 PM, David Holmes wrote: > > I just realized I overlooked the case where ThreadStackSize=0 > and the > stack is unlimited. In that case it isn't clear where the guard > pages > will get inserted - I do know that I don't get a stackoverflow > error. > > This needs further investigation. > > > So what happens here is that the massive stack-size causes > stack-bottom to be higher than stack-top! So we will set a > guard-page goodness knows where, and we can consume the current > stack until such time as we hit an unmapped or protected region at > which point we are killed. > > I'm not sure what to do here. My gut feel is that in such a case we > should not attempt to create a guard page in the initial thread. > That would require using a sentinel value for the stack-size. Though > it also presents a problem for stack-bottom - which is implicitly > zero. It may also give false positives in the is_initial_thread() check! > > Thoughts? Suggestions? > > > Maybe I am overlooking something, but should > os::capture_initial_thread() not call pthread_getattr_np() first to > handle the case where the VM was created on a pthread which is not the > primordial thread and may have a different stack size than what > getrlimit returns? And fall back to getrlimit only if > pthread_getattr_np() fails? 
My understanding of the problem (which likely no longer exists) is that pthread_getattr_np didn't fail as such but returned bogus values - so the problem was not detectable and so we just had to not use pthread_getattr_np. > And then we also should handle > RLIM_INFINITY. For that case, I also think not setting guard pages would > be safest. > > We also may just refuse to run in that case, because the workaround for > the user is easy - just set the limit before process start. Note that on > AIX, we currently refuse to run on the primordial thread because it may > have different page sizes than pthreads and it is impossible to get the > exact stack locations. I was wondering why the AIX set up seemed so simple in comparison :) Thanks, David > > Thomas > > > > David > > On 29/11/2016 9:59 PM, David Holmes wrote: > > Hi Thomas, > > On 29/11/2016 8:39 PM, Thomas St?fe wrote: > > Hi David, > > thanks for the good explanation. Change looks good, I > really like the > comment in capture_initial_stack(). > > Question, with -Xss given and being smaller than current > thread stack > size, guard pages may appear in the middle of the > invoking thread stack? > I always thought this is a bit dangerous. If your model > is to have the > VM created from the main thread, which then goes off to > do different > things, and have other threads then attach and run java > code, main > thread later may crash in unrelated native code just > because it reached > the stack depth of the hava threads? Or am I > misunderstanding something? > > > There is no change to the general behaviour other than > allowing a > primordial process thread that launches the VM, to now not > have an > effective stack limited at 2MB. The current logic will > insert guard > pages where ever -Xss states (as long as less than 2MB else > 2MB), while > with the fix the guard pages will be inserted above 2MB - as > dictated by > -Xss. 
> > David > ----- > > Thanks, Thomas > > > On Fri, Nov 25, 2016 at 11:38 AM, David Holmes > > >> wrote: > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8170307 > > > > > The bug is not public unfortunately for > non-technical reasons - but > see my eval below. > > Background: if you load the JVM from the primordial > thread of a > process (not done by the java launcher since JDK 6), > there is an > artificial stack limit imposed on the initial thread > (by sticking > the guard page at the limit position of the actual > stack) of the > minimum of the -Xss setting and 2M. So if you set > -Xss to > 2M it is > ignored for the main thread even if the true stack > is, say, 8M. This > limitation dates back 10-15 years and is no longer > relevant today > and should be removed (see below). I've also added > additional > explanatory notes. > > webrev: > http://cr.openjdk.java.net/~dholmes/8170307/webrev/ > > > > > Testing was manually done by modifying the launcher > to not run the > VM in a new thread, and checking the resulting stack > size used. > > This change will only affect hosted JVMs launched > with a -Xss value > > 2M. > > Thanks, > David > ----- > > Bug eval: > > JDK-4441425 limits the stack to 8M as a safeguard > against an > unlimited value from getrlimit in 1.3.1, but further > constrained > that to 2M in 1.4.0 due to JDK-4466587. 
> > By 1.4.2 we have the basic form of the current > problematic code: > > #ifndef IA64 > if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * > K * K; > #else > // Problem still exists RH7.2 (IA64 anyway) but > 2MB is a little > small > if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * > K * K; > #endif > > _initial_thread_stack_size = rlim.rlim_cur & > ~(page_size() - 1); > > if (max_size && _initial_thread_stack_size > > max_size) { > _initial_thread_stack_size = max_size; > } > > This was added by JDK-4678676 to allow the stack of > the main thread > to be _reduced_ below the default 2M/4M if the -Xss > value was > smaller than that.** There was no intent to allow > the stack size to > follow -Xss arbitrarily due to the operational > constraints imposed > by the OS/glibc at the time when dealing with the > primordial process > thread. > > ** It could not actually change the actual stack > size of course, but > set the guard pages to limit use to the expected > stack size. > > In JDK 6, under JDK-6316197, the launcher was > changed to create the > JVM in a new thread, so that it was not limited by the > idiosyncracies of the OS or thread library > primordial thread > handling. However, the stack size limitations > remained in place in > case the VM was launched from the primordial thread > of a user > application via the JNI invocation API. > > I believe it should be safe to remove the 2M > limitation now. > > > From daniel.daugherty at oracle.com Wed Nov 30 15:10:14 2016 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Wed, 30 Nov 2016 08:10:14 -0700 Subject: RFR: 8170307: Stack size option -Xss is ignored In-Reply-To: <0bf53099-6f87-419c-ca5c-af6437002929@oracle.com> References: <1b894e26-011e-05e4-6e24-91bebd4d465c@oracle.com> <9daf1211-d1f9-7a1c-bee4-19612766a269@oracle.com> <0bf53099-6f87-419c-ca5c-af6437002929@oracle.com> Message-ID: <32d6ce82-3279-1e3b-b23e-aa37ec79a459@oracle.com> On 11/30/16 12:22 AM, David Holmes wrote: > Thanks for the review Dan. Unfortunately I overlooked one case - see > my other emails. :) Yup. I always read the entire review thread before posting my review (and sometimes update said review with "Update:" lines). I poked around a bit in the code, but couldn't come up with an "aha moment" on the -XX:ThreadStackSize=0 issue. It looked like the few comments I had might still be useful when you find your way out of the current quagmire... :-) Gotta love these thread stack size issues... :-( Dan > > Cheers, > David > > On 30/11/2016 3:57 AM, Daniel D. Daugherty wrote: >> Sorry for being late to this party! Seems like thread stack sizes are >> very much on folks minds lately... >> >>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >> >> src/os/linux/vm/os_linux.cpp >> L936: // a user-specified value known to be greater than the >> minimum needed. >> Perhaps: ... known to be at least the minimum needed. 
>> >> As enforced by this code in >> os::Posix::set_minimum_stack_sizes(): >> >> _java_thread_min_stack_allowed = >> MAX2(_java_thread_min_stack_allowed, >> JavaThread::stack_guard_zone_size() + >> JavaThread::stack_shadow_zone_size() + >> (4 * BytesPerWord >> COMPILER2_PRESENT(+ 2)) * 4 * K); >> >> _java_thread_min_stack_allowed = >> align_size_up(_java_thread_min_stack_allowed, vm_page_size()); >> >> size_t stack_size_in_bytes = ThreadStackSize * K; >> if (stack_size_in_bytes != 0 && >> stack_size_in_bytes < _java_thread_min_stack_allowed) { >> // The '-Xss' and '-XX:ThreadStackSize=N' options both set >> // ThreadStackSize so we go with "Java thread stack size" >> instead >> // of "ThreadStackSize" to be more friendly. >> tty->print_cr("\nThe Java thread stack size specified is too >> small. " >> "Specify at least " SIZE_FORMAT "k", >> _java_thread_min_stack_allowed / K); >> return JNI_ERR; >> } >> >> L939: // can not do anything to emulate a larger stack than what >> has been provided by >> Typo: 'can not' -> 'cannot' >> >> L943: // Mamimum stack size is the easy part, get it from >> RLIMIT_STACK >> Typo: 'Mamimum' -> 'Maximum' >> nit - please add a '.' to the end. >> >> >> Thumbs up! >> >> I don't need to see a new webrev if you decide to make the >> minor edits above. >> >> Dan >> >> >> >> On 11/29/16 5:25 AM, David Holmes wrote: >>> I just realized I overlooked the case where ThreadStackSize=0 and the >>> stack is unlimited. In that case it isn't clear where the guard pages >>> will get inserted - I do know that I don't get a stackoverflow error. >>> >>> This needs further investigation. >>> >>> David >>> >>> On 29/11/2016 9:59 PM, David Holmes wrote: >>>> Hi Thomas, >>>> >>>> On 29/11/2016 8:39 PM, Thomas St?fe wrote: >>>>> Hi David, >>>>> >>>>> thanks for the good explanation. Change looks good, I really like the >>>>> comment in capture_initial_stack(). 
>>>>> >>>>> Question, with -Xss given and being smaller than current thread stack >>>>> size, guard pages may appear in the middle of the invoking thread >>>>> stack? >>>>> I always thought this is a bit dangerous. If your model is to have >>>>> the >>>>> VM created from the main thread, which then goes off to do different >>>>> things, and have other threads then attach and run java code, main >>>>> thread later may crash in unrelated native code just because it >>>>> reached >>>>> the stack depth of the hava threads? Or am I misunderstanding >>>>> something? >>>> >>>> There is no change to the general behaviour other than allowing a >>>> primordial process thread that launches the VM, to now not have an >>>> effective stack limited at 2MB. The current logic will insert guard >>>> pages where ever -Xss states (as long as less than 2MB else 2MB), >>>> while >>>> with the fix the guard pages will be inserted above 2MB - as >>>> dictated by >>>> -Xss. >>>> >>>> David >>>> ----- >>>> >>>>> Thanks, Thomas >>>>> >>>>> >>>>> On Fri, Nov 25, 2016 at 11:38 AM, David Holmes >>>>> >>>> > wrote: >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8170307 >>>>> >>>>> >>>>> The bug is not public unfortunately for non-technical reasons >>>>> - but >>>>> see my eval below. >>>>> >>>>> Background: if you load the JVM from the primordial thread of a >>>>> process (not done by the java launcher since JDK 6), there is an >>>>> artificial stack limit imposed on the initial thread (by sticking >>>>> the guard page at the limit position of the actual stack) of the >>>>> minimum of the -Xss setting and 2M. So if you set -Xss to > 2M >>>>> it is >>>>> ignored for the main thread even if the true stack is, say, 8M. >>>>> This >>>>> limitation dates back 10-15 years and is no longer relevant today >>>>> and should be removed (see below). I've also added additional >>>>> explanatory notes. 
>>>>> >>>>> webrev: http://cr.openjdk.java.net/~dholmes/8170307/webrev/ >>>>> >>>>> >>>>> Testing was manually done by modifying the launcher to not run >>>>> the >>>>> VM in a new thread, and checking the resulting stack size used. >>>>> >>>>> This change will only affect hosted JVMs launched with a -Xss >>>>> value >>>>> > 2M. >>>>> >>>>> Thanks, >>>>> David >>>>> ----- >>>>> >>>>> Bug eval: >>>>> >>>>> JDK-4441425 limits the stack to 8M as a safeguard against an >>>>> unlimited value from getrlimit in 1.3.1, but further constrained >>>>> that to 2M in 1.4.0 due to JDK-4466587. >>>>> >>>>> By 1.4.2 we have the basic form of the current problematic code: >>>>> >>>>> #ifndef IA64 >>>>> if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur = 2 * K * K; >>>>> #else >>>>> // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little >>>>> small >>>>> if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur = 4 * K * K; >>>>> #endif >>>>> >>>>> _initial_thread_stack_size = rlim.rlim_cur & ~(page_size() - >>>>> 1); >>>>> >>>>> if (max_size && _initial_thread_stack_size > max_size) { >>>>> _initial_thread_stack_size = max_size; >>>>> } >>>>> >>>>> This was added by JDK-4678676 to allow the stack of the main >>>>> thread >>>>> to be _reduced_ below the default 2M/4M if the -Xss value was >>>>> smaller than that.** There was no intent to allow the stack >>>>> size to >>>>> follow -Xss arbitrarily due to the operational constraints >>>>> imposed >>>>> by the OS/glibc at the time when dealing with the primordial >>>>> process >>>>> thread. >>>>> >>>>> ** It could not actually change the actual stack size of course, >>>>> but >>>>> set the guard pages to limit use to the expected stack size. >>>>> >>>>> In JDK 6, under JDK-6316197, the launcher was changed to >>>>> create the >>>>> JVM in a new thread, so that it was not limited by the >>>>> idiosyncracies of the OS or thread library primordial thread >>>>> handling. 
However, the stack size limitations remained in >>>>> place in >>>>> case the VM was launched from the primordial thread of a user >>>>> application via the JNI invocation API. >>>>> >>>>> I believe it should be safe to remove the 2M limitation now. >>>>> >>>>> >> From trevor.d.watson at oracle.com Wed Nov 30 15:29:50 2016 From: trevor.d.watson at oracle.com (Trevor Watson) Date: Wed, 30 Nov 2016 15:29:50 +0000 Subject: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> References: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> Message-ID: <8e47a2d0-c823-4d74-89bf-831c08a8f10d@oracle.com> Hi Vladimir, Thanks for the review. Comments inline below... On 22/11/16 20:04, Vladimir Kozlov wrote: > Do you have performance numbers? I've spent a lot of time looking at performance and it's proving verify difficult to precisely quantify either on a T5 or an S7. However, overall, it would appear that using the native lzcnt instruction is around 10% quicker than the current implementation which uses POPC. > UseVIS is too wide flag to control only these instructions generation. > > To be consistent with x86 code please add > UseCountLeadingZerosInstruction flag to globals_sparc.hpp and its > setting in vm_version_sparc.cpp (based on has_vis3()) similar to what is > done for x86. I've done this and it actually proved useful in testing as I was able to turn off lzcnt and use popc and vice-versa :) > May be name new instructions *ZerosIvis instead of *ZerosI1 to be clear > that VIS is used. Done. > Indention in the new test is all over place. Please, fix. I've fixed it (I hope) and broken the test up into separate Integer and Long tests to be consistent with the rest of the BMI tests in that directory. I've run the jtreg bmi tests on Solaris 12 SPARC and x86 and am awaiting the results of a jprt (hotspot) run on all platforms. 
The code review is in the same place as before: >> http://cr.openjdk.java.net/~alanbur/8162865/ Thanks, Trevor From igor.nunes at eldorado.org.br Wed Nov 30 16:51:59 2016 From: igor.nunes at eldorado.org.br (Igor Henrique Soares Nunes) Date: Wed, 30 Nov 2016 16:51:59 +0000 Subject: [8u] request for approval: "8168318 : PPC64: Use cmpldi instead of li/cmpld" Message-ID: Hi all, Could you please approve the backport of the following ppc64-only improvement to jdk8u-dev: 8168318: PPC64: Use cmpldi instead of li/cmpld Bug: https://bugs.openjdk.java.net/browse/JDK-8168318 Webrev: https://igorsnunes.github.io/openjdk/webrev/8168318/ Review: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-October/024809.html URL: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/622d3fe587f2 Thank you and best regards, Igor Nunes From rob.mckenna at oracle.com Wed Nov 30 17:40:23 2016 From: rob.mckenna at oracle.com (Rob McKenna) Date: Wed, 30 Nov 2016 17:40:23 +0000 Subject: [8u] request for approval: "8168318 : PPC64: Use cmpldi instead of li/cmpld" In-Reply-To: References: Message-ID: <20161130174023.GA2448@vimes> Hi Igor, As this is an enhancement request, please follow the enhancement approval request process: http://openjdk.java.net/projects/jdk8u/enhancement-template.html http://openjdk.java.net/projects/jdk8u/groundrules.html -Rob On 30/11/16 04:51, Igor Henrique Soares Nunes wrote: > Hi all, > > Could you please approve the backport of the following ppc64-only improvement to jdk8u-dev: > > 8168318: PPC64: Use cmpldi instead of li/cmpld > > Bug: https://bugs.openjdk.java.net/browse/JDK-8168318 > Webrev: https://igorsnunes.github.io/openjdk/webrev/8168318/ > Review: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-October/024809.html > URL: http://hg.openjdk.java.net/jdk9/hs/hotspot/rev/622d3fe587f2 > > Thank you and best regards, > Igor Nunes > From jcowgill at debian.org Wed Nov 30 17:50:33 2016 From: jcowgill at debian.org (James Cowgill) Date: Wed, 
30 Nov 2016 17:50:33 +0000 Subject: JDK 9 fails to build on MIPS Message-ID: <53391318-5ee3-28d4-b7bd-a51037de6032@debian.org> Hi, Firstly I have never submitted anything to OpenJDK before, so apologies if I haven't done things the right way. I also have no bug number for this. OpenJDK 9 does not build on MIPS machines and hasn't for some time. This is due to code in hotspot which assumes NSIG <= 65, which is not the case on MIPS since MIPS has 127 signal numbers. I've attached an initial patch which converts the offending code in hotspot/src/os/linux/vm/jsig.c to use sigset_t instead of an array to store the used signals. I notice the AIX implementation of jsig.c already does this. Originally from: https://bugs.debian.org/841173 Thanks, James -------------- next part -------------- A non-text attachment was scrubbed... Name: mips-sigset-hotspot.diff Type: text/x-patch Size: 3570 bytes Desc: not available URL: From vladimir.kozlov at oracle.com Wed Nov 30 19:19:10 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 30 Nov 2016 11:19:10 -0800 Subject: RFR: 8162865 Implementation of SPARC lzcnt In-Reply-To: <8e47a2d0-c823-4d74-89bf-831c08a8f10d@oracle.com> References: <1f9581e5-3bed-dec3-ec4b-81b5e3e6d478@oracle.com> <8e47a2d0-c823-4d74-89bf-831c08a8f10d@oracle.com> Message-ID: <583F262E.9020604@oracle.com> Looks good. Only one small issue - the new test files should have only the 2016 year: * Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved. The changes will have to wait until the JDK 10 repo is open. This is an enhancement and we are done with new features in JDK 9 already. Thanks, Vladimir On 11/30/16 7:29 AM, Trevor Watson wrote: > Hi Vladimir, > > Thanks for the review. Comments inline below... > > On 22/11/16 20:04, Vladimir Kozlov wrote: >> Do you have performance numbers? > > I've spent a lot of time looking at performance and it's proving very difficult to precisely quantify either on a T5 or an S7.
However, overall, it would appear that using the native lzcnt > instruction is around 10% quicker than the current implementation which uses POPC. > >> UseVIS is too wide a flag to control only the generation of these instructions. >> >> To be consistent with the x86 code, please add the >> UseCountLeadingZerosInstruction flag to globals_sparc.hpp and its >> setting in vm_version_sparc.cpp (based on has_vis3()), similar to what is >> done for x86. > > I've done this and it actually proved useful in testing as I was able to turn off lzcnt and use popc and vice-versa :) > >> Maybe name the new instructions *ZerosIvis instead of *ZerosI1 to make clear >> that VIS is used. > > Done. > >> Indentation in the new test is all over the place. Please fix. > > I've fixed it (I hope) and broken the test up into separate Integer and Long tests to be consistent with the rest of the BMI tests in that directory. > > I've run the jtreg bmi tests on Solaris 12 SPARC and x86 and am awaiting the results of a jprt (hotspot) run on all platforms. > > The code review is in the same place as before: >>> http://cr.openjdk.java.net/~alanbur/8162865/ > > Thanks, > Trevor From thomas.stuefe at gmail.com Wed Nov 30 19:33:16 2016 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 30 Nov 2016 20:33:16 +0100 Subject: JDK 9 fails to build on MIPS In-Reply-To: <53391318-5ee3-28d4-b7bd-a51037de6032@debian.org> References: <53391318-5ee3-28d4-b7bd-a51037de6032@debian.org> Message-ID: Hi James, In general I like your patch - we used sigset_t in the AIX port instead of masks and this would be a good cleanup for the other platforms too. But in this case, is the problem not that the mips signal.h header fails to define NSIG? We have NSIG and _NSIG. _NSIG seems to be the platform-dependent maximum including real-time signals. NSIG excludes real-time signals, and seems to be 32 (SIGRTMIN) on all Linux platforms I checked.
I may have searched incorrectly (I used http://lxr.free-electrons.com/ident?v=3.2&i=NSIG), but I found that NSIG was missing from signal.h on some architectures, mips being among them. I do not know why, but I would like to understand the reason. Do you define NSIG to be _NSIG? The VM currently does not use real-time signals, so NSIG should be sufficient. If NSIG is really missing on mips, then maybe defining it locally as SIGRTMIN would be a less invasive change. If we were to change the hand-written bitmask to sigset_t, we probably should also take a look at the arrays of length NSIG (sigact, sigflags, pending_signals) and the associated checks. This would be a bigger cleanup. --- Apart from all that, I'd suggest moving the sigset initialization in os_linux.cpp from the "__attribute__((constructor))" function to os::signal_init_pd(). I'd suggest a similar move for jsig.c, but do not see a suitable initialization function there. Maybe someone else has an idea? Thanks & Kind Regards, Thomas On Wed, Nov 30, 2016 at 6:50 PM, James Cowgill wrote: > Hi, > > Firstly I have never submitted anything to OpenJDK before so apologies > if I haven't done things the right way. I also have no bug number for this. > > OpenJDK 9 does not build on MIPS machines and hasn't for some time. This > is due to code in hotspot which assumes NSIG <= 65 which is not the case > on MIPS since MIPS has 127 signal numbers. > > I've attached an initial patch which converts the offending code in > hotspot/src/os/linux/vm/jsig.c to use sigset_t instead of an array to > store the used signals. I notice the AIX implementation of jsig.c > already does this.
> > Originally from: https://bugs.debian.org/841173 > > Thanks, > James > > > From max.ockner at oracle.com Wed Nov 30 19:57:00 2016 From: max.ockner at oracle.com (Max Ockner) Date: Wed, 30 Nov 2016 14:57:00 -0500 Subject: RFR(s): 8169206: TemplateInterpreter::_continuation_entry is never referenced Message-ID: <583F2F0C.7050206@oracle.com> Hello everyone! Please review this small fix which removes some dead code from the interpreter. TemplateInterpreter::_continuation_entry table and its accessor are never called from anywhere else. Bug: https://bugs.openjdk.java.net/browse/JDK-8169206 Webrev: http://cr.openjdk.java.net/~mockner/8169206.01/ Tested with java -version. Thanks, Max From frederic.parain at oracle.com Wed Nov 30 20:51:31 2016 From: frederic.parain at oracle.com (Frederic Parain) Date: Wed, 30 Nov 2016 15:51:31 -0500 Subject: RFR(s): 8169206: TemplateInterpreter::_continuation_entry is never referenced In-Reply-To: <583F2F0C.7050206@oracle.com> References: <583F2F0C.7050206@oracle.com> Message-ID: <8ea4d1ca-ece1-9e09-42bd-fd4cad7f2658@oracle.com> Looks good to me. Fred On 11/30/2016 02:57 PM, Max Ockner wrote: > Hello everyone! > > Please review this small fix which removes some dead code from the > interpreter. TemplateInterpreter::_continuation_entry table and its > accessor are never called from anywhere else. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8169206 > Webrev: http://cr.openjdk.java.net/~mockner/8169206.01/ > > Tested with java -version. > > Thanks, > Max > > > > From chf at redhat.com Wed Nov 30 21:59:56 2016 From: chf at redhat.com (Christine Flood) Date: Wed, 30 Nov 2016 16:59:56 -0500 (EST) Subject: Java heap size defaults when running with CGroups in Linux. 
In-Reply-To: <162383910.709983.1479672832499.JavaMail.zimbra@redhat.com> Message-ID: <321822099.1218801.1480543196266.JavaMail.zimbra@redhat.com> The problem is that when running the JVM inside of a cgroup, such as docker, the JVM bases its default heap parameters on the size of the whole machine's memory, not on the memory available to the container. This causes errors as discussed in this blog entry: http://matthewkwilliams.com/index.php/2016/03/17/docker-cgroups-memory-constraints-and-java-cautionary-tale/ Basically the JVM dies in a non-obvious manner. The solution I propose is to add a parameter -XX:+UseCGroupLimits to the JVM which states that you should look to the CGroup when calculating default heap sizes. Webrev is here: http://cr.openjdk.java.net/~andrew/rh1390708/webrev.01/ Christine From mikael.vidstedt at oracle.com Wed Nov 30 22:53:41 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 30 Nov 2016 14:53:41 -0800 Subject: Java heap size defaults when running with CGroups in Linux. In-Reply-To: <321822099.1218801.1480543196266.JavaMail.zimbra@redhat.com> References: <321822099.1218801.1480543196266.JavaMail.zimbra@redhat.com> Message-ID: Out of curiosity, why wouldn't this be the default behavior? That is, in which cases is it not a good idea to use the cgroup information when sizing the JVM? Cheers, Mikael > On Nov 30, 2016, at 1:59 PM, Christine Flood wrote: > > > The problem is that when running the JVM inside of a cgroup, such as docker, the JVM bases its default heap parameters on the size of the whole machine's memory, not on the memory available to the container. This causes errors as discussed in this blog entry: http://matthewkwilliams.com/index.php/2016/03/17/docker-cgroups-memory-constraints-and-java-cautionary-tale/ > > Basically the JVM dies in a non-obvious manner.
> > The solution I propose is to add a parameter -XX:+UseCGroupLimits to the JVM which states that you should look to the CGroup when calculating default heap sizes. > > Webrev is here: http://cr.openjdk.java.net/~andrew/rh1390708/webrev.01/ > > > Christine