RFR: Parallelize safepoint cleanup

Robbin Ehn robbin.ehn at oracle.com
Mon May 29 12:53:22 UTC 2017


Hi Roman,

On 05/29/2017 10:23 AM, Roman Kennke wrote:
> Hi David,
> 
> the patch applied cleanly, i.e. was not affected by any hotspot changes.
> Most of your suggestions applied to the GC part that I excluded from the
> scope of this change, except for the following:
> 
>>> I introduced a dedicated worker thread pool in SafepointSynchronize.
>>> This is only initialized when -XX:+ParallelSafepointCleanup is enabled,
>>> and uses -XX:ParallelSafepointCleanupThreads=X threads, defaulting to 8
>>> threads (just a wild guess, open for discussion. With
>>> -XX:-ParallelSafepointCleanup turned off (the default) it will use the
>>> old serial safepoint cleanup processing (with optional GC hooks, see
>>> below).
>>
>> I think the default needs to be ergonomically set, as for other
>> "thread pools". Both flags should be experimental at this stage.
> 
> I implemented this using the following:
> 
>    if (ParallelSafepointCleanup &&
> FLAG_IS_DEFAULT(ParallelSafepointCleanupThreads)) {
>      // This will pick up ParallelGCThreads if set...
>      FLAG_SET_DEFAULT(ParallelSafepointCleanupThreads,
> Abstract_VM_Version::parallel_worker_threads());
>    }
> 
> As noted, it will pick up ParallelGCThreads, which might be a reasonable
> number. Otherwise we'd need to re-work the logic in vm_version.cpp to
> calculate the number of worker threads without considering
> ParallelGCThreads.
> 
> http://cr.openjdk.java.net/~rkennke/8180932/webrev.02/
> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.02/>
> 

This worries me because heuristics will be very hard to the average user.

1885   if (ParallelSafepointCleanup && FLAG_IS_DEFAULT(ParallelSafepointCleanupThreads)) {
1886     // This will pick up ParallelGCThreads if set...
1887     FLAG_SET_DEFAULT(ParallelSafepointCleanupThreads, Abstract_VM_Version::parallel_worker_threads());
1888   }

So on a 8 vcpu machine we could be running 8 threads cleanup and 8 workers threads might be marking?

"and I now think this is the best solution"

I don't agree that is the the best solution and I'm not in favor of a second thread pool.
If you want short safepoints it seem much better to use available cpu time finishing the cleanup.
Maybe steal or reserve worker threads... just keep 1 pool of thread which we can do decent sizing for.

"For example, in one popular benchmark on a server-class machine, I got
total average pause time down from ~80ms to ~30ms."

What's the numbers for a 4 core/8 threads machine? And other GCs?

Looked through the patch quick, I think you should do some abstraction.
Now this code is already lacking abstraction so I understand why you did it this way.

For example:
void ObjectSynchronizer::deflate_idle_monitors(bool deflate_thread_local_monitors)
...
   if (MonitorInUseLists) {
     if (deflate_thread_local_monitors) {

You have mandatory parameter which is ignore by a global flag.
Right now there are 4 different cases in the same code path.
Does ParallelSafepointCleanup work with -MonitorInUseLists ?
Please create some kind of abstraction for this and set it a startup.

I do not see any reason for duplicating the SafepointSynchronize::serial_cleanup(), just create proper abstraction so it can be used in both cases?

Thanks!

/Robbin

> What do you think?
> 
> Roman
> 


More information about the hotspot-runtime-dev mailing list