RFR: Parallelize safepoint cleanup

Roman Kennke rkennke at redhat.com
Fri Jun 2 09:41:47 UTC 2017


Hi David,
thanks for reviewing. I'll be on vacation the next two weeks too, with
only sporadic access to work stuff.
Yes, test exposure will not be as good as otherwise, but it's not totally
untested either: the serial code path is the same as the parallel one; the
only difference is that it's not actually called by multiple threads.
I think that's ok.

I found two more issues that I think should be addressed:
- There are some counters in deflate_idle_monitors() and I'm not sure I
handle them correctly in the split-up, multi-threaded thread-local/global
list deflation (see the first sketch below).
- nmethod marking seems to unconditionally poke true (or something like
that) into nmethod fields. This doesn't hurt correctness, but it's
probably worth checking whether the field is already true, especially when
doing this with multiple threads concurrently (see the second sketch below).
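
For the counters, a minimal sketch of what I have in mind (struct, field
and function names are made up here, not necessarily what the patch will
end up using): each worker counts locally while it deflates its share of
the thread-local/global lists, then publishes its counts once, atomically.

  #include "runtime/atomic.hpp"

  struct DeflateCounters {          // hypothetical aggregate
    volatile jint n_in_use;
    volatile jint n_in_circulation;
    volatile jint n_scavenged;
  };

  // Called once per worker after it has finished its share of the
  // deflation, so the hot loop itself needs no atomics.
  static void merge_counters(DeflateCounters* global, jint in_use,
                             jint in_circulation, jint scavenged) {
    Atomic::add(in_use,         &global->n_in_use);
    Atomic::add(in_circulation, &global->n_in_circulation);
    Atomic::add(scavenged,      &global->n_scavenged);
  }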
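For the nmethod marking, the check-before-write would be roughly this
(accessor and setter names are made up for illustration):

  // Only store when the value actually changes, so several threads
  // marking the same nmethod don't keep dirtying the same cache line.
  static void mark_nmethod(nmethod* nm) {
    if (!nm->is_marked_on_stack()) {   // hypothetical accessor
      nm->set_marked_on_stack(true);   // hypothetical setter
    }
  }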

I'll send an updated patch around later; I hope I can get to it today...

Roman

> Hi Roman,
>
> I am about to disappear on an extended vacation so will let others
> pursue this. IIUC this is no longer an opt-in by the user at runtime, but
> an opt-in by the particular GC developers. Okay. My only concern with
> that is that if Shenandoah is the only GC that currently opts in, then
> this code is not going to get much testing and will be more prone to
> incidental breakage.
>
> Cheers,
> David
>
> On 2/06/2017 2:21 AM, Roman Kennke wrote:
>> Am 01.06.2017 um 17:50 schrieb Roman Kennke:
>>> Am 01.06.2017 um 14:18 schrieb Robbin Ehn:
>>>> Hi Roman,
>>>>
>>>> On 06/01/2017 11:29 AM, Roman Kennke wrote:
>>>>> Am 31.05.2017 um 22:06 schrieb Robbin Ehn:
>>>>>> Hi Roman, I agree that is really needed but:
>>>>>>
>>>>>> On 05/31/2017 10:27 AM, Roman Kennke wrote:
>>>>>>> I realized that sharing workers with GC is not so easy.
>>>>>>>
>>>>>>> We need to be able to use the workers at a safepoint during concurrent
>>>>>>> GC work (which also uses the same workers). This requires not only
>>>>>>> that those workers be suspended, as with e.g.
>>>>>>> SuspendibleThreadSet::yield(), but that they be idle, i.e. have
>>>>>>> finished their tasks. This needs some careful handling to work
>>>>>>> without races: it requires a SuspendibleThreadSetJoiner around the
>>>>>>> corresponding run_task() call, and the tasks themselves also need to
>>>>>>> join the STS and handle requests for safepoints not by yielding, but
>>>>>>> by leaving the task.
>>>>>>> This is far too peculiar for me to make the call to hook up GC
>>>>>>> workers for safepoint cleanup, and I have thus removed those parts.
>>>>>>> I left the API in CollectedHeap in place. I think GC devs who know
>>>>>>> better about G1 and CMS should make that call, or else just use a
>>>>>>> separate thread pool.
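>>>>>>>
>>>>>>> Roughly, the handling would have to look like this (task body and
>>>>>>> helper names are made up for illustration; this is the part I
>>>>>>> decided not to hook up):
>>>>>>>
>>>>>>>   #include "gc/shared/suspendibleThreadSet.hpp"
>>>>>>>   #include "gc/shared/workgroup.hpp"
>>>>>>>
>>>>>>>   class ExampleConcTask : public AbstractGangTask {
>>>>>>>   public:
>>>>>>>     ExampleConcTask() : AbstractGangTask("Example Conc Task") {}
>>>>>>>     void work(uint worker_id) {
>>>>>>>       SuspendibleThreadSetJoiner sts_join;
>>>>>>>       while (claim_more_work()) {    // hypothetical helper
>>>>>>>         process_claimed_work();      // hypothetical helper
>>>>>>>         if (sts_join.should_yield()) {
>>>>>>>           return; // leave the task instead of yielding, so the
>>>>>>>                   // gang goes idle and the safepoint can use it
>>>>>>>         }
>>>>>>>       }
>>>>>>>     }
>>>>>>>   };
>>>>>>>
>>>>>>>   // Caller side: join the STS around the run_task() call itself.
>>>>>>>   {
>>>>>>>     SuspendibleThreadSetJoiner sts_join;
>>>>>>>     workers->run_task(&task);
>>>>>>>   }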
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.05/
>>>>>>>
>>>>>>> Is it ok now?
>>>>>> I still think you should put the "Parallel Safepoint Cleanup" workers
>>>>>> inside Shenandoah, so the SafepointSynchronizer only calls
>>>>>> get_safepoint_workers, e.g.:
>>>>>>
>>>>>> _cleanup_workers = heap->get_safepoint_workers();
>>>>>> _num_cleanup_workers = _cleanup_workers != NULL ?
>>>>>> _cleanup_workers->total_workers() : 1;
>>>>>> ParallelSPCleanupTask cleanup(_cleanup_subtasks);
>>>>>> StrongRootsScope srs(_num_cleanup_workers);
>>>>>> if (_cleanup_workers != NULL) {
>>>>>>     _cleanup_workers->run_task(&cleanup, _num_cleanup_workers);
>>>>>> } else {
>>>>>>     cleanup.work(0);
>>>>>> }
>>>>>>
>>>>>> That way you don't even need your new flags, but it will be up to the
>>>>>> other GCs to make their workers available, or cheat with a separate
>>>>>> workgang.
>>>>> I can do that, I don't mind. The question is, do we want that?
>>>> The problem is that we do not want to rush such a decision; we believe
>>>> there is a better solution, and I think you would want another solution
>>>> too. But it seems like such a solution, with one 'global' thread pool
>>>> owned either by the GC or by the VM itself, is quite an undertaking.
>>>> Since this probably will not be done any time soon, my suggestion, so
>>>> as not to hold you back (we also want this), is to just make the code
>>>> parallel and, as an intermediate step, ask the GC if it minds sharing
>>>> its threads.
>>>>
>>>> Once Shenandoah is merged, it's possible that e.g. G1 will share
>>>> the code for a separate thread pool, do something of its own, or
>>>> wait until the bigger question about thread pool(s) has been
>>>> resolved.
>>>>
>>>> By adding a thread pool, and flags for it, directly to the
>>>> SafepointSynchronizer, we might limit our future options.
>>>>
>>>>> I wouldn't call it 'cheating with a separate workgang' though. I see
>>>>> that both G1 and CMS suspend their worker threads at a safepoint.
>>>>> However:
>>>> Yes, it's not cheating, but I want decent heuristics between e.g. the
>>>> number of concurrent marking threads and the number of parallel
>>>> safepoint threads, since they compete for CPU time.
>>>> As the code looks now, I think that decision must be made by the GC.
>>> Ok, I see your point. I updated the proposed patch accordingly:
>>>
>>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.06/
>> Oops. Minor mistake there. Correction:
>> http://cr.openjdk.java.net/~rkennke/8180932/webrev.07/
>>
>> (Removed 'class WorkGang' from safepoint.hpp but forgot to add it to
>> collectedHeap.hpp, resulting in a build failure...)
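>>
>> For reference, the opt-in API in collectedHeap.hpp would look roughly
>> like this (the exact declaration may differ in the webrev):
>>
>>   class WorkGang;   // forward declaration, now needed here
>>
>>   class CollectedHeap : public CHeapObj<mtInternal> {
>>     // ...
>>     // GCs opt in by returning their worker gang; NULL means the
>>     // safepoint cleanup runs serially, as in the snippet above.
>>     virtual WorkGang* get_safepoint_workers() { return NULL; }
>>   };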
>>
>> Roman
>>


