RFR(XS): JDK-8212122: Allow ReferenceProcessor to always be MT processing

Roman Kennke rkennke at redhat.com
Tue Oct 16 21:07:21 UTC 2018


Hi Thomas,

>> The more I think about it, the more messy this occurs to me.
>>
>> Again, to begin with, I believe we already have what we want: GCs can
>> tell the RP if it wants mt_processing or not via the mt_processing
>> flag.
>> It used to be the case that RP always respected that. GCs set this to
>> false if num-gc-treads==1 or if it didn't want mt_processing for
>> other reasons (serial GC), otherwise to true, and RP would always use
>> the executor if mt_processing==true, otherwise go the single-threaded
>> path.
>>
>> With the change of using ergonomically determined num-workers, this
>> has changed to dynamically use one or the other path, which is not
>> good for GCs that actually always want mt_processing. I find this
>> surprising behaviour. I would argue that when GC tells RP to to 
> 
> I agree with that this may be surprising behavior. However you do not
> want all-MT for performance reasons - spinning up just one worker
> thread is significantly slower than just processing the few References
> inline. This particularly matters if you want very low-pause times.

Right. That's why we 'fixed' that by special-casing this in the executor.

> (This has been improved a bit with just not doing any work if the
> References are zero, which is a common case).
> 
> So you somehow need to alternate between in-thread processing and real
> MT processing for maximum throughput, particularly in this case where
> you spin up and shut down threads a lot potentially in rapid
> succession.
> 
> Now this can be fixed _partially_ with letting the executor deciding
> whether doing the work inline or not, with the design issue mentioned
> by Kim already ("isn't this the point of run_task()?"). The other issue
> is balancing the queues which is superfluous with only one thread - but
> we've had some very subtle issues with that.

Riiight. Very good point.

> (Yeah, yeah, I do have a mostly done JDK-8202328 around, but....)
> 
>> mt_processing, then then yes, please always go the executor/mt-
>> processing route. The executor is free to decide to not spin up
>> workers for 1-threaded-execution as optimization. Other than that, I 
> 
> Been there, done that, i.e. the programmer had been responsible for
> deciding whether to do the work inline or not. This had several issues
> (history lesson follows, just skip if not interested):
> 
> - people forgot to do that in some places, leading to massive
> performance issues in situations the author hadn't thought of yet (like
> running on a multi-TB heap with thousands of HW threads).
> 
> - people started optimizing for one or the other case, adding lots of
> code and so increasing maintenance costs a lot (e.g. some paths were
> basically never tested until the next crash, lots of duplicate and
> boilerplate code added etc) which made for lots of pain.
> 
> - WorkGang itself does not give any kind of guarantees on which threads
> a particular AbstractGangTask is run in the first place, and users must
> not rely on that. Relying on that again and again just caused once-
> during-a-full-moon issues in the past; another example for a misguided
> assumption is that there is no guaranteed 1:1 relation between worker
> id passed to the AbstractGangTask and the thread's id. I.e. a single
> thread may execute an AbstractGangTask multiple times (with different
> worker_id of course), and some workgang threads may not execute any at
> all. (I assume that Shenandoah handles that case just fine?)
> Assuming otherwise is just a wrong assumption of what WorkGang provides
> to you and any code assuming so should simply be considered broken.
> 
> These are the reasons why at some point (early JDK9 timeframe I think)
> we just removed the special casing and cleaned G1 AbstractGangTasks
> because we got weary of spending so much time on fixing these issues.
> Basically CMS' AbstractGangTasks are left to look over (which seems
> obviously broken for the RP case; but CMS'es removal keeps getting
> closer anyway).
> 
>> believe that 1-threaded and N-threaded execution should be
>> equivalent.
>>
>> For this reason, I still think that simply removing this surprising
>> adjustment of _mt_processing is the way to go, as proposed here:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8212122/webrev.01/
>>
>> If we really need to preserve the old (actually, rather new) 
> 
> This behavior is fully in-line with the guarantees of WorkGang
> (assuming that ideally you would pass a WorkGang and AbstractGangTask
> to reference processing if for a moment we ignore the Executor
> framework which is an additional wrapper for Parallel GC).
> 
>> behaviour to execute single-threaded using the specific single-
>> threaded setup, we can do as proposed earlier:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8212122/webrev.00/
>>
>> and GC can override this by passing a suitable RP subclass.
>>
>> As much as I'd like to fix this mess, and care about CMS and
>> ParallelGC, I currently don't have time for that (maybe later). What
>> I need is a way to avoid the single-threaded-path altogether, because
>> it breaks Shenandoah.
> 
> From what I have understood so far, Shenandoah code seems to have
> jumped into the same trap, i.e. seems to assume things about WorkGang
> that it should not (which is unfortunately probably not a quick fix
> either).
> 
> If there is really need for getting this fixed asap, please preserve
> old behavior.
> 
> The new behavior you propose may cause significant regressions (even if
> the Executor special-cases the 1-thread case) in reference processing
> as it seems to unconditionally enable queue balancing in the MT case -
> which is rather slow (and useless to actually attempt when there is
> only one thread to do reference processing).

Thanks for the detailed explanations! I have reconsidered our problem
and found another workaround, which does not require any RP changes and
does use the separate 1-threaded and N-threaded setups for now. In the
long run we'll probably figure out something significantly better, and
until then we're good I guess. I'm withdrawing this RFR.

Thanks,
Roman

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20181016/0ca87399/signature.asc>


More information about the hotspot-gc-dev mailing list