RFR(XS): JDK-8212122: Allow ReferenceProcessor to always be MT processing

Mon Oct 15 23:33:48 UTC 2018

Let me answer to your recent 2 emails together.

On 10/15/18 6:15 AM, Roman Kennke wrote:
>
> Am 13.10.18 um 11:38 schrieb Roman Kennke:
>>>> Currently, in ReferenceProcessor, the mt_processing flag is adjusted for
>>>> each phase depending on the number of workers: if it's 1 worker, then
>>>> mt_processing is disabled, for more than 1 worker, mt_processing is
>>>> enabled. Depending on the flag, ref-processing is either done through
>>>> AbstractRefProcTaskExecutor, or directly in VMThread. However, we have a
>>>> situation where we always want to execute via the executor, and never
>>>> directly. Instead we decide in the executor whether or not we want to
>>>> execute directly in the VMThread.
>>>>
>>>> A simple change allows a subclass of ReferenceProcessor to change this
>>>> behavior.
>>>>
>>>> In case you're interested why we need to do this: Shenandoah requires
>>>> (for each worker) a scope around reference processing. In current
>>>> scheme, we'd need to put one scope around
>>>> RP::process_discovered_reference() to cover the 1-threaded case, and one
>>>> scope inside each worker to cover those. However, we cannot handle this
>>>> sort of weirdly nested scoping.
>>>>
>>>> (If it was me, I'd rework this whole execution scheme to always call the
>>>> MT config, and let RP figure out and execute directly in VMThread if
>>>> it's only 1 worker, and not put this burden on the caller to set up both
>>>> for 1-threaded and N-threaded execution. This is just an ugly mess.)
>>> I agree with you.
>>> In addition to Kim's comment which covered ParallelGC case,
>>> 1) For CMS, single vs. MT processing should be decided before calling
>>> RP  because of CMS specific implementation. For more detail, please
>>> refer comments at https://bugs.openjdk.java.net/browse/JDK-6938732
>>> 2) For G1, MT processing is set to true by default. But still caller has
>>> 2 paths as you described. I'm not sure whether single thread case is
>>> still needed there or not because RP is deciding to use single or
>>> multiple threads. Probably we can remove it unless an user really wants
>>> to force using single thread only. :)
>>
>> The problem is the (rather new) ergonomics code in RP that scales number
>> of threads to spin up based on the amount of work to do. And if it ends
>> up being only 1 thread, then it goes down the single-threaded-path.
I was thinking why it is a problem..
And now I understand your statement is something like 'why not obey an 
provided executor' rather than single-thread stuff. Because current code 
is run in single thread which is VMThread (RP::_processing_is_mt = false).

All existing GCs (when I worked on the code) have below setting,
RP::_processing_is_mt = (ParallelGCThreads > 1) && ParallelRefProcEnabled
which convinced me 1 worker case can be considered same as running on 
VMThread because ParallelGCThreads == 1 is not considered as mt processing.

If you are suggesting that we should always use an executor if it is 
provided, it would be okay for this case. But I'm not sure whether it is 
the way we want to go.

>> Otherwise I wouldn't actually need this change... (that might be another
>> approach to this problem: don't actually change _mt_processing flag at
>> all, and call the executor even with 1 thread. In Shenandoah, we do this
>> and it works fine, and we special-case the 1-thread-path by not spinning
>> up the workers and execute it in the calling thread (which is the VMThread).
> In other words, this is a fix that actually removes 'cruft', and seems
> less surprising API-wise: i.e. if caller asks for mt_processing, then
> don't change it, and leave it to caller to execute in VMThread if it wishes.
>
> Updated webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8212122/webrev.01/
>
> An example how G1 can be made to strictly do the current behaviour (i.e.
> execute in VMThread when ergo_workers==1):
>
> http://cr.openjdk.java.net/~rkennke/JDK-8212122/webrev.01.g1/
>
> ... altough I haven't tested it at all. It works well with Shenandoah,
> but I don't know if you'd actually do it/want it/ask me to do it ;-)
>
> What do you think? Which approach do you like best (except the huge
> parallel/CMS/WorkGang/ReferenceProcessor reworking)?
As this could need more discussion, let me postpone answering it. :)

Thanks,
Sangheon

>
> Thanks,
> Roman
>