RFR: Parallelize safepoint cleanup

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Jul 20 16:43:34 UTC 2017


On 7/20/17 4:53 AM, Roman Kennke wrote:
> Hi all,
>
> Robbin found some more missing includes in jprt testing (thanks!!)
>
> Differential:
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18.diff/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.18.diff/>
> Full:
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.18/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.18/>
>
> Am I breaking the record for most webrev revisions? :-P
>
> According the Robbin, builds are now all clean.
>
> Can I get final reviews and then a sponsor?

src/share/vm/runtime/safepoint.cpp
     No comments.

Only reviewed the one file that changed since webrev.15.

Thumbs up!

Dan



>
> Thanks,
> Roman
>
> Am 16.07.2017 um 10:25 schrieb Robbin Ehn:
>> Hi Roman,
>>
>> On 2017-07-12 15:32, Roman Kennke wrote:
>>> Hi Robbin and all,
>>>
>>> I fixed the 32bit failures by using jlong in all relevant places:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.14.diff/
>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.14.diff/>
>>>
>>> then Robbin found another problem. SafepointCleanupTest started to fail,
>>> because "mark nmethods" is no longer printed. This made me think that
>>> we're not measuring the conflated (and possibly parallelized)
>>> deflate-idle-monitors+mark-nmethods pass. I added a TraceTime with
>>> "safepoint cleanup tasks" which measures the total duration of safepoint
>>> cleanup. We can't reasonably measure a possibly parallel and conflated
>>> pass standalone, but we can measure all and by subtrating all the other
>>> subphases, get an idea how long deflation and nmethod marking take up.
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15.diff/
>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.15.diff/>
>>>
>>> The full webrev is now:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.15/
>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.15/>
>>>
>>> Hope that's all ;-)
>> With this changeset something always pop-ups.
>>
>> Failure reason: Targets failed.  Target macosx_x64_10.9-fastdebug FAILED.
>>
>>   /opt/jprt/jib-data/install/jpg/infra/builddeps/devkit-macosx_x64/Xcode6.3-MacOSX10.9+1.0/devkit-macosx_x64-Xcode6.3-MacOSX10.9+1.0.tar.gz/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++
>> -m64 -fPIC -D_GNU_SOURCE -flimit-debug-info -D__STDC_FORMAT_MACROS
>> -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_ALLBSD_SOURCE
>> -D_DARWIN_C_SOURCE -D_XOPEN_SOURCE -fno-rtti -fno-exceptions
>> -fvisibility=hidden -mno-omit-leaf-frame-pointer -mstack-alignment=16
>> -pipe -fno-strict-aliasing -DMAC_OS_X_VERSION_MAX_ALLOWED=1070
>> -mmacosx-version-min=10.7.0 -fno-omit-frame-pointer -DVM_LITTLE_ENDIAN
>> -D_LP64=1 -Wno-deprecated -Wpointer-arith -Wsign-compare -Wundef
>> -Wunused-function -Wformat=2 -DASSERT -DCHECK_UNHANDLED_OOPS
>> -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_bsd -DINCLUDE_SUFFIX_CPU=_x86
>> -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DAMD64
>> -DHOTSPOT_LIB_ARCH='"amd64"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED
>> -DINCLUDE_AOT
>> -I/opt/jprt/T/P1/193338.rehn/s/hotspot/src/closed/share/vm
>> -I/opt/j/opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:654:22:
>> error: variable has incomplete type 'StrongRootsScope'
>>      StrongRootsScope srs(num_cleanup_workers);
>>                       ^
>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7:
>> note: forward declaration of 'StrongRootsScope'
>> class StrongRootsScope;
>>        ^
>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/runtime/safepoint.cpp:659:22:
>> error: variable has incomplete type 'StrongRootsScope'
>>      StrongRootsScope srs(1);
>>                       ^
>> /opt/jprt/T/P1/193338.rehn/s/hotspot/src/share/vm/gc/shared/genCollectedHeap.hpp:33:7:
>> note: forward declaration of 'StrongRootsScope'
>> class StrongRootsScope;
>>        ^
>> 2 errors generated.
>> make[3]: ***
>> [/opt/jprt/T/P1/193338.rehn/s/build/macosx-x64-debug/hotspot/variant-server/libjvm/objs/safepoint.o]
>> Error 1
>> make[3]: *** Waiting for unfinished jobs....
>> make[2]: *** [hotspot-server-libs] Error 2
>>
>> Send me the new webrev and I'll test it before the 16th round of
>> review :)
>>
>> /Robbin
>>
>>> Roman
>>>
>>> Am 10.07.2017 um 21:22 schrieb Robbin Ehn:
>>>> Hi, unfortunately the push failed on 32-bit.
>>>>
>>>> (looks like _stack_traversal_mark should be jlong, I feel a bit guilty)
>>>>
>>>> I do not have anytime to look at this, so here is the error.
>>>>
>>>> /Robbin
>>>>
>>>> make[3]: Leaving directory '/opt/jprt/T/P1/185117.rehn/s/hotspot/make'
>>>> make/Main.gmk:263: recipe for target 'hotspot-client-libs' failed
>>>> In file included from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In
>>>> member function 'long int nmethod::stack_traversal_mark()':
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108:
>>>>
>>>> error: call of overloaded 'load_acquire(volatile long int*)' is
>>>> ambiguous
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:399:108:
>>>>
>>>> note: candidates are:
>>>> In file included from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17:
>>>>
>>>> note: static jint OrderAccess::load_acquire(const volatile jint*)
>>>> <near match>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:57:17:
>>>>
>>>> note:   no known conversion for argument 1 from 'volatile long int*'
>>>> to 'const volatile jint* {aka const volatile int*}'
>>>> In file included from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17:
>>>>
>>>> note: static juint OrderAccess::load_acquire(const volatile juint*)
>>>> <near match>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:63:17:
>>>>
>>>> note:   no known conversion for argument 1 from 'volatile long int*'
>>>> to 'const volatile juint* {aka const volatile unsigned int*}'
>>>> In file included from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29:0,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp: In
>>>> member function 'void nmethod::set_stack_traversal_mark(long int)':
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105:
>>>>
>>>> error: call of overloaded 'release_store(volatile long int*, long
>>>> int&)' is ambiguous
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:400:105:
>>>>
>>>> note: candidates are:
>>>> In file included from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/typeArrayOop.hpp:30:0,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/constantPool.hpp:32,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/oops/method.hpp:34,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/frame.hpp:28,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/codeBlob.hpp:31,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/compiledMethod.hpp:28,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/code/nmethod.hpp:28,
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/safepoint.hpp:29,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/shared/collectedHeap.hpp:33,
>>>>
>>>>                   from
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/gc/cms/adaptiveFreeList.cpp:28:
>>>>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17:
>>>>
>>>> note: static void OrderAccess::release_store(volatile jint*, jint)
>>>> <near match>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:71:17:
>>>>
>>>> note:   no known conversion for argument 1 from 'volatile long int*'
>>>> to 'volatile jint* {aka volatile int*}'
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17:
>>>>
>>>> note: static void OrderAccess::release_store(volatile juint*, juint)
>>>> <near match>
>>>> /opt/jprt/T/P1/185117.rehn/s/hotspot/src/share/vm/runtime/orderAccess.inline.hpp:77:17:
>>>>
>>>> note:   no known conversion for argument 1 from 'volatile long int*'
>>>> to 'volatile juint* {aka volatile unsigned int*}'
>>>>
>>>> On 2017-07-10 20:50, Robbin Ehn wrote:
>>>>> I'll start a push now.
>>>>>
>>>>> /Robbin
>>>>>
>>>>> On 2017-07-10 12:38, Roman Kennke wrote:
>>>>>> Ok, so I guess I need a sponsor for this now:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/
>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>
>>>>>> Roman
>>>>>>
>>>>>> Am 07.07.2017 um 20:09 schrieb Igor Veresov:
>>>>>>>> On Jul 7, 2017, at 4:23 AM, Robbin Ehn <robbin.ehn at oracle.com
>>>>>>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> On 07/07/2017 12:51 PM, Roman Kennke wrote:
>>>>>>>>> Hi Robbin,
>>>>>>>>>> Far down ->
>>>>>>>>>>
>>>>>>>>>> On 07/06/2017 08:05 PM, Roman Kennke wrote:
>>>>>>>>>>>> I'm not happy about this change:
>>>>>>>>>>>>
>>>>>>>>>>>> +  ~ParallelSPCleanupThreadClosure() {
>>>>>>>>>>>> +    // This is here to be consistent with sweeper.cpp
>>>>>>>>>>>> NMethodSweeper::mark_active_nmethods().
>>>>>>>>>>>> +    // TODO: Is this really needed?
>>>>>>>>>>>> +    OrderAccess::storestore();
>>>>>>>>>>>> +  }
>>>>>>>>>>>>
>>>>>>>>>>>> because we're adding an OrderAccess::storestore() to be
>>>>>>>>>>>> consistent
>>>>>>>>>>>> with an OrderAccess::storestore() that's not properly
>>>>>>>>>>>> documented
>>>>>>>>>>>> which is only increasing the technical debt.
>>>>>>>>>>>>
>>>>>>>>>>>> So a couple of things above don't make sense to me:
>>>>>>>>>>>>
>>>>>>>>>>>>> - sweeper thread runs outside safepoint
>>>>>>>>>>>>> - VMThread (which is doing the nmethod marking in the case
>>>>>>>>>>>>> that
>>>>>>>>>>>>>     I'm looking at) runs while all other threads (incl. the
>>>>>>>>>>>>> sweeper)
>>>>>>>>>>>>>     is holding still.
>>>>>>>>>>>> and:
>>>>>>>>>>>>
>>>>>>>>>>>>> There should be no need for a storestore() (at least in
>>>>>>>>>>>>> sweeper.cpp...
>>>>>>>>>>> Either one or the other are running. Either the VMThread is
>>>>>>>>>>> marking
>>>>>>>>>>> nmethods (during safepoint) or the sweeper threads are running
>>>>>>>>>>> (outside
>>>>>>>>>>> safepoint). Between the two phases, there is a guaranteed
>>>>>>>>>>> OrderAccess::fence() (see safepoint.cpp). Therefore, no
>>>>>>>>>>> storestore()
>>>>>>>>>>> should be necessary.
>>>>>>>>>>>
>>>>>>>>>>>   From Igor's comment I can see how it happened though:
>>>>>>>>>>> Apparently
>>>>>>>>>>> there
>>>>>>>>>>> *is* a race in sweeper's own concurrent processing (concurrent
>>>>>>>>>>> with
>>>>>>>>>>> compiler threads, as far as I understand). And there's a call to
>>>>>>>>>>> nmethod::mark_as_seen_on_stack() after which a storestore() is
>>>>>>>>>>> required
>>>>>>>>>>> (as per Igor's explanation). So the logic probably was: we have
>>>>>>>>>>> mark_as_seen_on_stack() followed by storestore() here, so let's
>>>>>>>>>>> also put
>>>>>>>>>>> a storestore() in the other places that call
>>>>>>>>>>> mark_as_seen_on_stack(),
>>>>>>>>>>> one of which happens to be the safepoint cleanup code that we're
>>>>>>>>>>> discussing. (why the storestore() hasn't been put right into
>>>>>>>>>>> mark_as_seen_on_stack() I don't understand). In short, one
>>>>>>>>>>> storestore()
>>>>>>>>>>> really was necessary, the other looks like it has been put there
>>>>>>>>>>> 'for
>>>>>>>>>>> consistency' or just conservatively. But it shouldn't be
>>>>>>>>>>> necessary in
>>>>>>>>>>> the safepoint cleanup code that we're discussing.
>>>>>>>>>>>
>>>>>>>>>>> So what should we do? Remove the storestore() for good?
>>>>>>>>>>> Refactor the
>>>>>>>>>>> code so that both paths at least call the storestore() in the
>>>>>>>>>>> same
>>>>>>>>>>> place? (E.g. make mark_active_nmethods() use the closure and
>>>>>>>>>>> call
>>>>>>>>>>> storestore() in the dtor as proposed?)
>>>>>>>>>> I took a quick look, maybe I'm missing some stuff but:
>>>>>>>>>>
>>>>>>>>>> So there is a slight optimization when not running sweeper to
>>>>>>>>>> skip
>>>>>>>>>> compiler barrier/fence in stw.
>>>>>>>>>>
>>>>>>>>>> Don't think that matter, so I propose something like:
>>>>>>>>>> -  long  stack_traversal_mark()                    { return
>>>>>>>>>> _stack_traversal_mark; }
>>>>>>>>>> -  void  set_stack_traversal_mark(long l)          {
>>>>>>>>>> _stack_traversal_mark = l; }
>>>>>>>>>> +  long  stack_traversal_mark()                    { return
>>>>>>>>>> OrderAccess::load_acquire(&_stack_traversal_mark); }
>>>>>>>>>> +  void  set_stack_traversal_mark(long l)          {
>>>>>>>>>> OrderAccess::release_store(&_stack_traversal_mark, l); }
>>>>>>>>>>
>>>>>>>>>> Maybe make _stack_traversal_mark volatile also, just as a marking
>>>>>>>>>> that
>>>>>>>>>> it is concurrent accessed.
>>>>>>>>>> And remove both storestore.
>>>>>>>>>>
>>>>>>>>>> "Also neither of these state variables are volatile in
>>>>>>>>>> nmethod, so
>>>>>>>>>> even the compiler may reorder the stores"
>>>>>>>>>> Fortunately at least _state is volatile now.
>>>>>>>>>>
>>>>>>>>>> I think _state also should use la/rs semantics instead, but
>>>>>>>>>> that's
>>>>>>>>>> another story.
>>>>>>>>> Like this?
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.12/
>>>>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>>>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.12/>
>>>>>>>> Yes, exactly, I like this!
>>>>>>>> Dan? Igor ? Tobias?
>>>>>>>>
>>>>>>> That seems correct.
>>>>>>>
>>>>>>> igor
>>>>>>>
>>>>>>>> Thanks Roman!
>>>>>>>>
>>>>>>>> BTW I'm going on vacation (5w) in a few hours, but I will follow
>>>>>>>> this
>>>>>>>> thread/changeset to the end!
>>>>>>>>
>>>>>>>> /Robbin
>>>>>>>>
>>>>>>>>> Roman




More information about the hotspot-gc-dev mailing list