RFR: Parallelize safepoint cleanup
Roman Kennke
rkennke at redhat.com
Thu Jul 20 10:53:18 UTC 2017
Hi all,
Robbin found some more missing includes in jprt testing (thanks!!)
Differential:
http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/
<http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.18.diff/>
Full:
http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/
<http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.18/>
Am I breaking the record for most webrev revisions? :-P
According the Robbin, builds are now all clean.
Can I get final reviews and then a sponsor?
Thanks,
Roman
Am 16.07.2017 um 10:25 schrieb Robbin Ehn:
> Hi Roman,
>
> On 2017-07-12 15:32, Roman Kennke wrote:
>> Hi Robbin and all,
>>
>> I fixed the 32bit failures by using jlong in all relevant places:
>>
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/
>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.14.diff/>
>>
>> then Robbin found another problem. SafepointCleanupTest started to fail,
>> because "mark nmethods" is no longer printed. This made me think that
>> we're not measuring the conflated (and possibly parallelized)
>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with
>> "safepoint cleanup tasks" which measures the total duration of safepoint
>> cleanup. We can't reasonably measure a possibly parallel and conflated
>> pass standalone, but we can measure all and by subtrating all the other
>> subphases, get an idea how long deflation and nmethod marking take up.
>>
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.15.diff/>
>>
>> The full webrev is now:
>>
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.15/>
>>
>> Hope that's all ;-)
>
> With this changeset something always pop-ups.
>
> Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.
>
> /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++
> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS
> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE
> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions
> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16
> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070
> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN
> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef
> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS
> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86
> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64
> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED
> -DINCLUDE_AOT
> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm
> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22:
> error: variable has incomplete type 'StrongRootsScope'
> StrongRootsScope srs(num_cleanup_workers);
> ^
> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7:
> note: forward declaration of 'StrongRootsScope'
> class StrongRootsScope;
> ^
> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22:
> error: variable has incomplete type 'StrongRootsScope'
> StrongRootsScope srs(1);
> ^
> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7:
> note: forward declaration of 'StrongRootsScope'
> class StrongRootsScope;
> ^
> 2 errors generated.
> make[3]: ***
> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o]
> Error 1
> make[3]: *** Waiting for unfinished jobs....
> make[2]: *** [hotspot-server-libs] Error 2
>
> Send me the new webrev and I'll test it before the 16th round of
> review :)
>
> /Robbin
>
>>
>> Roman
>>
>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn:
>>> Hi, unfortunately the push failed on 32-bit.
>>>
>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty)
>>>
>>> I do not have anytime to look at this, so here is the error.
>>>
>>> /Robbin
>>>
>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make'
>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed
>>> In file included from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In
>>> member function 'long int nmethod::stack_traversal_mark()':
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108:
>>>
>>> error: call of overloaded 'load_acquire(volatile long int*)' is
>>> ambiguous
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108:
>>>
>>> note: candidates are:
>>> In file included from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17:
>>>
>>> note: static jint OrderAccess::load_acquire(const volatile jint*)
>>> <near match>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17:
>>>
>>> note: no known conversion for argument 1 from 'volatile long int*'
>>> to 'const volatile jint* {aka const volatile int*}'
>>> In file included from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17:
>>>
>>> note: static juint OrderAccess::load_acquire(const volatile juint*)
>>> <near match>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17:
>>>
>>> note: no known conversion for argument 1 from 'volatile long int*'
>>> to 'const volatile juint* {aka const volatile unsigned int*}'
>>> In file included from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In
>>> member function 'void nmethod::set_stack_traversal_mark(long int)':
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105:
>>>
>>> error: call of overloaded 'release_store(volatile long int*, long
>>> int&)' is ambiguous
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105:
>>>
>>> note: candidates are:
>>> In file included from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>
>>> from
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17:
>>>
>>> note: static void OrderAccess::release_store(volatile jint*, jint)
>>> <near match>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17:
>>>
>>> note: no known conversion for argument 1 from 'volatile long int*'
>>> to 'volatile jint* {aka volatile int*}'
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17:
>>>
>>> note: static void OrderAccess::release_store(volatile juint*, juint)
>>> <near match>
>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17:
>>>
>>> note: no known conversion for argument 1 from 'volatile long int*'
>>> to 'volatile juint* {aka volatile unsigned int*}'
>>>
>>> On 2017-07-10 20:50, Robbin Ehn wrote:
>>>> I'll start a push now.
>>>>
>>>> /Robbin
>>>>
>>>> On 2017-07-10 12:38, Roman Kennke wrote:
>>>>> Ok, so I guess I need a sponsor for this now:
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/
>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>
>>>>> Roman
>>>>>
>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov:
>>>>>>
>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn <robbin.ehn at oracle.com
>>>>>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>>>>>
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote:
>>>>>>>> Hi Robbin,
>>>>>>>>>
>>>>>>>>> Far down ->
>>>>>>>>>
>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'm not happy about this change:
>>>>>>>>>>>
>>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() {
>>>>>>>>>>> + // This is here to be consistent with sweeper.cpp
>>>>>>>>>>> NMethodSweeper::mark_active_nmethods().
>>>>>>>>>>> + // TODO: Is this really needed?
>>>>>>>>>>> + OrderAccess::storestore();
>>>>>>>>>>> + }
>>>>>>>>>>>
>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be
>>>>>>>>>>> consistent
>>>>>>>>>>> with an OrderAccess::storestore() that's not properly
>>>>>>>>>>> documented
>>>>>>>>>>> which is only increasing the technical debt.
>>>>>>>>>>>
>>>>>>>>>>> So a couple of things above don't make sense to me:
>>>>>>>>>>>
>>>>>>>>>>>> - sweeper thread runs outside safepoint
>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case
>>>>>>>>>>>> that
>>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the
>>>>>>>>>>>> sweeper)
>>>>>>>>>>>> is holding still.
>>>>>>>>>>>
>>>>>>>>>>> and:
>>>>>>>>>>>
>>>>>>>>>>>> There should be no need for a storestore() (at least in
>>>>>>>>>>>> sweeper.cpp...
>>>>>>>>>>
>>>>>>>>>> Either one or the other are running. Either the VMThread is
>>>>>>>>>> marking
>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running
>>>>>>>>>> (outside
>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed
>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no
>>>>>>>>>> storestore()
>>>>>>>>>> should be necessary.
>>>>>>>>>>
>>>>>>>>>> From Igor's comment I can see how it happened though:
>>>>>>>>>> Apparently
>>>>>>>>>> there
>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent
>>>>>>>>>> with
>>>>>>>>>> compiler threads, as far as I understand). And there's a call to
>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is
>>>>>>>>>> required
>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have
>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's
>>>>>>>>>> also put
>>>>>>>>>> a storestore() in the other places that call
>>>>>>>>>> mark_as_seen_on_stack(),
>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're
>>>>>>>>>> discussing. (why the storestore() hasn't been put right into
>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one
>>>>>>>>>> storestore()
>>>>>>>>>> really was necessary, the other looks like it has been put there
>>>>>>>>>> 'for
>>>>>>>>>> consistency' or just conservatively. But it shouldn't be
>>>>>>>>>> necessary in
>>>>>>>>>> the safepoint cleanup code that we're discussing.
>>>>>>>>>>
>>>>>>>>>> So what should we do? Remove the storestore() for good?
>>>>>>>>>> Refactor the
>>>>>>>>>> code so that both paths at least call the storestore() in the
>>>>>>>>>> same
>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and
>>>>>>>>>> call
>>>>>>>>>> storestore() in the dtor as proposed?)
>>>>>>>>>
>>>>>>>>> I took a quick look, maybe I'm missing some stuff but:
>>>>>>>>>
>>>>>>>>> So there is a slight optimization when not running sweeper to
>>>>>>>>> skip
>>>>>>>>> compiler barrier/fence in stw.
>>>>>>>>>
>>>>>>>>> Don't think that matter, so I propose something like:
>>>>>>>>> - long stack_traversal_mark() { return
>>>>>>>>> _stack_traversal_mark; }
>>>>>>>>> - void set_stack_traversal_mark(long l) {
>>>>>>>>> _stack_traversal_mark = l; }
>>>>>>>>> + long stack_traversal_mark() { return
>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); }
>>>>>>>>> + void set_stack_traversal_mark(long l) {
>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); }
>>>>>>>>>
>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking
>>>>>>>>> that
>>>>>>>>> it is concurrent accessed.
>>>>>>>>> And remove both storestore.
>>>>>>>>>
>>>>>>>>> "Also neither of these state variables are volatile in
>>>>>>>>> nmethod, so
>>>>>>>>> even the compiler may reorder the stores"
>>>>>>>>> Fortunately at least _state is volatile now.
>>>>>>>>>
>>>>>>>>> I think _state also should use la/rs semantics instead, but
>>>>>>>>> that's
>>>>>>>>> another story.
>>>>>>>> Like this?
>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/
>>>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>>
>>>>>>> Yes, exactly, I like this!
>>>>>>> Dan? Igor ? Tobias?
>>>>>>>
>>>>>>
>>>>>> That seems correct.
>>>>>>
>>>>>> igor
>>>>>>
>>>>>>> Thanks Roman!
>>>>>>>
>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow
>>>>>>> this
>>>>>>> thread/changeset to the end!
>>>>>>>
>>>>>>> /Robbin
>>>>>>>
>>>>>>>> Roman
>>>>>>
>>>>>
>>
More information about the hotspot-gc-dev
mailing list