RFR: Parallelize safepoint cleanup
Roman Kennke
rkennke at redhat.com
Thu Jun 1 09:29:44 UTC 2017
Am 31.05.2017 um 22:06 schrieb Robbin Ehn:
> Hi Roman, I agree that is really needed, but:
>
> On 05/31/2017 10:27 AM, Roman Kennke wrote:
>> I realized that sharing workers with GC is not so easy.
>>
>> We need to be able to use the workers at a safepoint during concurrent
>> GC work (which also uses the same workers). This does not only require
>> that those workers be suspended, like e.g.
>> SuspendibleThreadSet::yield(), but they need to be idle, i.e. have
>> finished their tasks. This needs some careful handling to work without
>> races: it requires a SuspendibleThreadSetJoiner around the corresponding
>> run_task() call and also the tasks themselves need to join the STS and
>> handle requests for safepoints not by yielding, but by leaving the task.
>> This is far too peculiar for me to make the call to hook up GC workers
>> for safepoint cleanup, and I thus removed those parts. I left the API in
>> CollectedHeap in place. I think GC devs who know better about G1 and CMS
>> should make that call, or else just use a separate thread pool.
>>
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/
>>
>> Is it ok now?
>
> I still think you should put the "Parallel Safepoint Cleanup" workers
> inside Shenandoah,
> so the SafepointSynchronizer only calls get_safepoint_workers, e.g.:
>
> _cleanup_workers = heap->get_safepoint_workers();
> _num_cleanup_workers = _cleanup_workers != NULL ? _cleanup_workers->total_workers() : 1;
> ParallelSPCleanupTask cleanup(_cleanup_subtasks);
> StrongRootsScope srs(_num_cleanup_workers);
> if (_cleanup_workers != NULL) {
>   _cleanup_workers->run_task(&cleanup, _num_cleanup_workers);
> } else {
>   cleanup.work(0);
> }
>
> That way you don't even need your new flags, but it will be up to the
> other GCs to make their workers available or cheat with a separate workgang.
I can do that; I don't mind. The question is: do we want that?
I wouldn't call it 'cheating with a separate workgang' though. I see
that both G1 and CMS suspend their worker threads at a safepoint. However:
- Do they finish their work, stop, and then restart work after the
safepoint? Or do the workers simply call STS::yield() to suspend and
later resume their work where they left off? If they only call yield()
(or whatever the equivalent is in CMS), then this is not enough: the
workers need to be truly idle in order to be used by the safepoint
cleaners (see the sketch after this list).
- Parallel and serial GC don't have workgangs of their own.
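
For illustration, a minimal sketch (not actual G1/CMS code) of the two
behaviours meant in the first point; SuspendibleThreadSet and
SuspendibleThreadSetJoiner are the existing HotSpot classes, the
work-loop helpers are made up and the include is omitted:

  // Made-up stand-ins for the GC's actual work loop:
  bool has_more_work();
  void do_step();
  void record_progress_for_restart();

  // (a) Yielding worker: joins the STS, blocks inside yield() while the
  //     safepoint is active, then resumes its task. The task never finishes
  //     at the safepoint, so the workgang is NOT idle and cannot be reused
  //     for safepoint cleanup.
  void concurrent_work_yielding() {
    SuspendibleThreadSetJoiner sts;
    while (has_more_work()) {
      do_step();
      if (SuspendibleThreadSet::should_yield()) {
        SuspendibleThreadSet::yield();   // pause for the safepoint, then continue
      }
    }
  }

  // (b) Leaving worker: on a safepoint request it records its progress and
  //     returns, so the workgang really is idle at the safepoint and could
  //     run the cleanup task instead ("leaving the task" as described above).
  void concurrent_work_leaving() {
    SuspendibleThreadSetJoiner sts;
    while (has_more_work()) {
      do_step();
      if (SuspendibleThreadSet::should_yield()) {
        record_progress_for_restart();   // made-up helper
        return;                          // task is re-submitted after the safepoint
      }
    }
  }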
So, as far as I can tell, this means that parallel safepoint cleanup
would only be supported by GCs for which we explicitly implement it,
after having carefully checked if/how workgangs are suspended at
safepoints, or by providing GC-internal thread pools. Do we really want
that?
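
For what it's worth, a minimal sketch of what "explicitly implement it"
could look like, assuming the get_safepoint_workers() hook from Robbin's
snippet above (the ShenandoahHeap override and field name are only
illustrative, and the real class bodies are elided):

  class WorkGang;

  class CollectedHeap {
  public:
    // NULL means "no workers available, run the safepoint cleanup
    // single-threaded on the VM thread".
    virtual WorkGang* get_safepoint_workers() { return NULL; }
    // ...
  };

  class ShenandoahHeap : public CollectedHeap {
    WorkGang* _safepoint_workers;   // dedicated (or guaranteed-idle) workgang
  public:
    virtual WorkGang* get_safepoint_workers() { return _safepoint_workers; }
  };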
Roman