RFR: Parallelize safepoint cleanup

Mon May 29 13:39:14 UTC 2017

Hi,

How do we feel about reusing the Parallel GC Threads work gang during 
safepoint cleanup?
We know that the Parallel GC Threads worker gang is idle then. This 
saves creating another gazillion threads and another JVM flag to keep 
track of.

Thanks,
/Erik

On 2017-05-29 14:53, Robbin Ehn wrote:
> Hi Roman,
>
> On 05/29/2017 10:23 AM, Roman Kennke wrote:
>> Hi David,
>>
>> the patch applied cleanly, i.e. was not affected by any hotspot changes.
>> Most of your suggestions applied to the GC part that I excluded from the
>> scope of this change, except for the following:
>>
>>>> I introduced a dedicated worker thread pool in SafepointSynchronize.
>>>> This is only initialized when -XX:+ParallelSafepointCleanup is 
>>>> enabled,
>>>> and uses -XX:ParallelSafepointCleanupThreads=X threads, defaulting 
>>>> to 8
>>>> threads (just a wild guess, open for discussion. With
>>>> -XX:-ParallelSafepointCleanup turned off (the default) it will use the
>>>> old serial safepoint cleanup processing (with optional GC hooks, see
>>>> below).
>>>
>>> I think the default needs to be ergonomically set, as for other
>>> "thread pools". Both flags should be experimental at this stage.
>>
>> I implemented this using the following:
>>
>>    if (ParallelSafepointCleanup &&
>> FLAG_IS_DEFAULT(ParallelSafepointCleanupThreads)) {
>>      // This will pick up ParallelGCThreads if set...
>>      FLAG_SET_DEFAULT(ParallelSafepointCleanupThreads,
>> Abstract_VM_Version::parallel_worker_threads());
>>    }
>>
>> As noted, it will pick up ParallelGCThreads, which might be a reasonable
>> number. Otherwise we'd need to re-work the logic in vm_version.cpp to
>> calculate the number of worker threads without considering
>> ParallelGCThreads.
>>
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.02/
>> <http://cr.openjdk.java.net/%7Erkennke/8180932/webrev.02/>
>>
>
> This worries me because heuristics will be very hard to the average user.
>
> 1885   if (ParallelSafepointCleanup && 
> FLAG_IS_DEFAULT(ParallelSafepointCleanupThreads)) {
> 1886     // This will pick up ParallelGCThreads if set...
> 1887     FLAG_SET_DEFAULT(ParallelSafepointCleanupThreads, 
> Abstract_VM_Version::parallel_worker_threads());
> 1888   }
>
> So on a 8 vcpu machine we could be running 8 threads cleanup and 8 
> workers threads might be marking?
>
> "and I now think this is the best solution"
>
> I don't agree that is the the best solution and I'm not in favor of a 
> second thread pool.
> If you want short safepoints it seem much better to use available cpu 
> time finishing the cleanup.
> Maybe steal or reserve worker threads... just keep 1 pool of thread 
> which we can do decent sizing for.
>
> "For example, in one popular benchmark on a server-class machine, I got
> total average pause time down from ~80ms to ~30ms."
>
> What's the numbers for a 4 core/8 threads machine? And other GCs?
>
> Looked through the patch quick, I think you should do some abstraction.
> Now this code is already lacking abstraction so I understand why you 
> did it this way.
>
> For example:
> void ObjectSynchronizer::deflate_idle_monitors(bool 
> deflate_thread_local_monitors)
> ...
>   if (MonitorInUseLists) {
>     if (deflate_thread_local_monitors) {
>
> You have mandatory parameter which is ignore by a global flag.
> Right now there are 4 different cases in the same code path.
> Does ParallelSafepointCleanup work with -MonitorInUseLists ?
> Please create some kind of abstraction for this and set it a startup.
>
> I do not see any reason for duplicating the 
> SafepointSynchronize::serial_cleanup(), just create proper abstraction 
> so it can be used in both cases?
>
> Thanks!
>
> /Robbin
>
>> What do you think?
>>
>> Roman
>>