RFR: Parallelize safepoint cleanup
Robbin Ehn
robbin.ehn at oracle.com
Sun Jul 16 08:25:14 UTC 2017
Hi Roman,
On 2017-07-12 15:32, Roman Kennke wrote:
> Hi Robbin and all,
>
> I fixed the 32bit failures by using jlong in all relevant places:
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.14.diff/>
>
> then Robbin found another problem. SafepointCleanupTest started to fail,
> because "mark nmethods" is no longer printed. This made me think that
> we're not measuring the conflated (and possibly parallelized)
> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with
> "safepoint cleanup tasks" which measures the total duration of safepoint
> cleanup. We can't reasonably measure a possibly parallel and conflated
> pass standalone, but we can measure all and by subtrating all the other
> subphases, get an idea how long deflation and nmethod marking take up.
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.15.diff/>
>
> The full webrev is now:
>
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.15/>
>
> Hope that's all ;-)
With this changeset something always pop-ups.
Failure reason: Targets failed. Target macosx_x64_10.9-fastdebug FAILED.
/opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16 -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070 -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef -Wunused-function -Wformat=2 -DASSERT
-DCHECK_UNHANDLED_OOPS -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86 -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64 -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED -DINCLUDE_AOT -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22: error: variable has incomplete type 'StrongRootsScope'
StrongRootsScope srs(num_cleanup_workers);
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: note: forward declaration of 'StrongRootsScope'
class StrongRootsScope;
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22: error: variable has incomplete type 'StrongRootsScope'
StrongRootsScope srs(1);
^
/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7: note: forward declaration of 'StrongRootsScope'
class StrongRootsScope;
^
2 errors generated.
make[3]: *** [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [hotspot-server-libs] Error 2
Send me the new webrev and I'll test it before the 16th round of review :)
/Robbin
>
> Roman
>
> Am 10.07.2017 um 21:22 schrieb Robbin Ehn:
>> Hi, unfortunately the push failed on 32-bit.
>>
>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty)
>>
>> I do not have anytime to look at this, so here is the error.
>>
>> /Robbin
>>
>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make'
>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed
>> In file included from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In
>> member function 'long int nmethod::stack_traversal_mark()':
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108:
>> error: call of overloaded 'load_acquire(volatile long int*)' is ambiguous
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108:
>> note: candidates are:
>> In file included from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17:
>> note: static jint OrderAccess::load_acquire(const volatile jint*)
>> <near match>
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17:
>> note: no known conversion for argument 1 from 'volatile long int*'
>> to 'const volatile jint* {aka const volatile int*}'
>> In file included from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17:
>> note: static juint OrderAccess::load_acquire(const volatile juint*)
>> <near match>
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17:
>> note: no known conversion for argument 1 from 'volatile long int*'
>> to 'const volatile juint* {aka const volatile unsigned int*}'
>> In file included from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In
>> member function 'void nmethod::set_stack_traversal_mark(long int)':
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105:
>> error: call of overloaded 'release_store(volatile long int*, long
>> int&)' is ambiguous
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105:
>> note: candidates are:
>> In file included from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>> from
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17:
>> note: static void OrderAccess::release_store(volatile jint*, jint)
>> <near match>
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17:
>> note: no known conversion for argument 1 from 'volatile long int*'
>> to 'volatile jint* {aka volatile int*}'
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17:
>> note: static void OrderAccess::release_store(volatile juint*, juint)
>> <near match>
>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17:
>> note: no known conversion for argument 1 from 'volatile long int*'
>> to 'volatile juint* {aka volatile unsigned int*}'
>>
>> On 2017-07-10 20:50, Robbin Ehn wrote:
>>> I'll start a push now.
>>>
>>> /Robbin
>>>
>>> On 2017-07-10 12:38, Roman Kennke wrote:
>>>> Ok, so I guess I need a sponsor for this now:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/
>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>
>>>> Roman
>>>>
>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov:
>>>>>
>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn <robbin.ehn at oracle.com
>>>>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>>>>
>>>>>> Hi Roman,
>>>>>>
>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote:
>>>>>>> Hi Robbin,
>>>>>>>>
>>>>>>>> Far down ->
>>>>>>>>
>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm not happy about this change:
>>>>>>>>>>
>>>>>>>>>> + ~ParallelSPCleanupThreadClosure() {
>>>>>>>>>> + // This is here to be consistent with sweeper.cpp
>>>>>>>>>> NMethodSweeper::mark_active_nmethods().
>>>>>>>>>> + // TODO: Is this really needed?
>>>>>>>>>> + OrderAccess::storestore();
>>>>>>>>>> + }
>>>>>>>>>>
>>>>>>>>>> because we're adding an OrderAccess::storestore() to be
>>>>>>>>>> consistent
>>>>>>>>>> with an OrderAccess::storestore() that's not properly documented
>>>>>>>>>> which is only increasing the technical debt.
>>>>>>>>>>
>>>>>>>>>> So a couple of things above don't make sense to me:
>>>>>>>>>>
>>>>>>>>>>> - sweeper thread runs outside safepoint
>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case that
>>>>>>>>>>> I'm looking at) runs while all other threads (incl. the
>>>>>>>>>>> sweeper)
>>>>>>>>>>> is holding still.
>>>>>>>>>>
>>>>>>>>>> and:
>>>>>>>>>>
>>>>>>>>>>> There should be no need for a storestore() (at least in
>>>>>>>>>>> sweeper.cpp...
>>>>>>>>>
>>>>>>>>> Either one or the other are running. Either the VMThread is
>>>>>>>>> marking
>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running
>>>>>>>>> (outside
>>>>>>>>> safepoint). Between the two phases, there is a guaranteed
>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no
>>>>>>>>> storestore()
>>>>>>>>> should be necessary.
>>>>>>>>>
>>>>>>>>> From Igor's comment I can see how it happened though: Apparently
>>>>>>>>> there
>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent
>>>>>>>>> with
>>>>>>>>> compiler threads, as far as I understand). And there's a call to
>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is
>>>>>>>>> required
>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have
>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's
>>>>>>>>> also put
>>>>>>>>> a storestore() in the other places that call
>>>>>>>>> mark_as_seen_on_stack(),
>>>>>>>>> one of which happens to be the safepoint cleanup code that we're
>>>>>>>>> discussing. (why the storestore() hasn't been put right into
>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one
>>>>>>>>> storestore()
>>>>>>>>> really was necessary, the other looks like it has been put there
>>>>>>>>> 'for
>>>>>>>>> consistency' or just conservatively. But it shouldn't be
>>>>>>>>> necessary in
>>>>>>>>> the safepoint cleanup code that we're discussing.
>>>>>>>>>
>>>>>>>>> So what should we do? Remove the storestore() for good?
>>>>>>>>> Refactor the
>>>>>>>>> code so that both paths at least call the storestore() in the same
>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and call
>>>>>>>>> storestore() in the dtor as proposed?)
>>>>>>>>
>>>>>>>> I took a quick look, maybe I'm missing some stuff but:
>>>>>>>>
>>>>>>>> So there is a slight optimization when not running sweeper to skip
>>>>>>>> compiler barrier/fence in stw.
>>>>>>>>
>>>>>>>> Don't think that matter, so I propose something like:
>>>>>>>> - long stack_traversal_mark() { return
>>>>>>>> _stack_traversal_mark; }
>>>>>>>> - void set_stack_traversal_mark(long l) {
>>>>>>>> _stack_traversal_mark = l; }
>>>>>>>> + long stack_traversal_mark() { return
>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); }
>>>>>>>> + void set_stack_traversal_mark(long l) {
>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); }
>>>>>>>>
>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking
>>>>>>>> that
>>>>>>>> it is concurrent accessed.
>>>>>>>> And remove both storestore.
>>>>>>>>
>>>>>>>> "Also neither of these state variables are volatile in nmethod, so
>>>>>>>> even the compiler may reorder the stores"
>>>>>>>> Fortunately at least _state is volatile now.
>>>>>>>>
>>>>>>>> I think _state also should use la/rs semantics instead, but that's
>>>>>>>> another story.
>>>>>>> Like this?
>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/
>>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>
>>>>>> Yes, exactly, I like this!
>>>>>> Dan? Igor ? Tobias?
>>>>>>
>>>>>
>>>>> That seems correct.
>>>>>
>>>>> igor
>>>>>
>>>>>> Thanks Roman!
>>>>>>
>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow this
>>>>>> thread/changeset to the end!
>>>>>>
>>>>>> /Robbin
>>>>>>
>>>>>>> Roman
>>>>>
>>>>
>
More information about the hotspot-gc-dev
mailing list