RFR: 8072498: Multi-thread JNI weak reference processing

Wed Aug 8 14:27:43 UTC 2018

> On Aug 8, 2018, at 10:11 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Kim,
> 
> On Tue, 2018-07-31 at 15:23 -0400, Kim Barrett wrote:
>> […]
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8072498
>> 
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8072498/open.00/
>> 
>> Testing:
>> mach5 tier1-3, hs-tier4-5.
>> 
>> Local and mach5 testing of TestGCBasherWithG1 and TestSystemGCWithG1,
>> with the tests modified to use a smaller non-default value of
>> -XX:ReferencesPerThread, to examine the logging output and verify
>> multi-threaded execution.
>> 
> 
>  - g1ConcurrentMark.cpp:1673: I assume that using the serial version
> of weak processing has been used here because of logging concerns. Is
> there an upcoming change that fixes that?
> One could create a local instance of a WeakProcessorPhaseTimes and just
> take the interesting total time taken from it and print it. Or just
> print it the same way as for the young gcs - I understand this is ugly,
> but as long as there is no other way...
> 
>  - same for g1FullCollector.cpp:217 and g1FullGCAdjustTask:101

These are using the parallel versions.  Maybe overloading the weak_oops_do
name with one that takes a workgang is too subtle?

>  - g1GCPhaseTimes.hpp:130: not sure if the removal of the previously
> unused _cur_ref_enq_time_ms member should be done here, but I don't
> really mind to remove it here. It would be a very trivial change where
> the overhead for doing so might be too high.

Yes, tiny trivial change rolled into this one, rather than doing separately.
Even making a CR for it seemed kind of high overhead.  I probably should
have mentioned it in the RFR though.

>  - weakProcessor.cpp:73: I am totally okay with the
> 
> size_t n_workers = 1 + (ref_count / ReferencesPerThread);
> 
> formula, but maybe "ref_count + ReferencesPerThread - 1) /
> ReferencesPerThread" which seems more common would be as fine.
> Note that you may ignore this comment :)

I think what I have is what we used in the j.l.r.Reference processing code too.
Precision is not needed; this is already heuristic.  That said, I don’t really
care either way, but think the two calculations ought to be consistent.  Maybe
there should be a helper function somewhere, but that seems like a lot of work
for this.

>  - in WeakProcessor::weak_oops_do() there is this optimization that if
> nworkers (which other GC code code always writes out as "num_workers"
> or "n_workers" :)) is one, it calls the method explicitly.
> 
> I would prefer if this optimization would be performed more globally in
> e.g. WorkGang::run_task(), or in a static method somewhere that is
> called.
> 
> We removed all that special-casing in the JDK 9 time frame (https://bug
> s.openjdk.java.net/browse/JDK-6979279 (for G1)) with the intent to do
> this in a more generic fashion as described above at some point; we are
> aware that this is slower. We just did not get to changing it yet. I
> still would kind of prefer that we do not start doing that again, but
> then again one could remove that at that point too.

OK, I’ll take that out.  I wondered why WorkGang::run_task() wasn’t already
doing that, but thought there might be code somewhere that really expects
to be executed by a worker thread.

>  - weakProcessor.inline.hpp, although a new file, has copyright year
> 2017.

Oops, will fix.

>  - we should at least in the gc team at some point come to a rule on
> how to format static consts :) Looking at the GC code all possible
> "reasonable" styles are used.
> 
> Otherwise looks good.

Thanks.

> 
> Thanks,
>  Thomas