RFR(M) 8043575: Dynamically parallelize reference processing work

Thomas Schatzl thomas.schatzl at oracle.com
Fri Jun 8 14:52:48 UTC 2018


Hi Sangheon,

On Thu, 2018-06-07 at 10:50 -0700, sangheon.kim at oracle.com wrote:
> Hi Thomas,
> 
> On 6/7/18 7:12 AM, Thomas Schatzl wrote:
> > Hi Sangheon,
> > 
> > > [...]
> > > As I noted in pre-review email thread, it showed better results
> > > if I avoid using more than active cpus.
> > > This is a kind of enhancement for better performance, so even
> > > though the command-line option was set to use more than cpus(via
> > > ParalelGCThreads), I think it is good to limit. This is different
> > > from command-line option processing which lets users decide even
> > > though the given options would results in worse performance.
> > > 
> > > Or if you are saying I should use "active_processor_count()"? If
> > > it is the case, I updated to use 'active_processor_count()'
> > > instead of 'initial_active_processor_count()'.
> > > 
> > > I would prefer to have this unless there's any reason of not
> > > having it.
> > 
> > First, always use active_processor_count() for current thread
> > decisions. Initial_active_processor_count() is only here to get a
> > consistent value of the processor count during initialization.
> > 
> > What has been your test to determine this? GCOld?
> 
> Micro-benchmark which can create 1~3digit K of references. (mostly
> 8k, 32k, 128k) Specjvm2008.Derby as it generates maximum of 12k
> references.
> 
> GCOld is my favorite to play with but it doesn't stress much
> references.

I will look into this some more.

> > 
> > I think this may be outdated as with recent changes scalability
> > (with G1) should have improved a lot.
> 
> I tested before your rebuildRset patch, so a bit old.
> But I'm not sure how recent changes would affect to this case. i.e.
> if users sets quite big number than actual cpus, and then use those
> threads on ref.proc. etc.. This patch is suggesting not to rely on
> current setting in worst case.
> 
> But I will not argue here, as you will the person to push! :)

There have been improvements to actual reference processing in G1 that
improved throughput for reference processing significantly.

[...]
> > 
> > There is a WIP webrev at
> > 
> > http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/
> > 
> > for the curious (I think accomodated the review comments so far
> > too).
> > Still needs at least some more testing.
> > 
> > And I need to remove the parallel gc support.
> 
> Okay.
> 
> > 
> > I am also not sure whether there has been a decision on how this
> > feature should be enabled, i.e. which switches need to be active to
> > use it.
> > I made it so that if ParallelRefProcEnabled is true, and
> > ReferencesPerThread > 0, we enable the dynamic number of reference
> > processing threads.
> > I.e. if ParallelRefProcEnabled is false, dynamic number of
> > reference processing threads is also turned off (because the user
> > requested to not do parallel ref processing at all).
> 
> That is how works on my patch, too.
> I should mention on my review request, thank you for bringing up
> this topic!!

I filed a preliminary CSR at https://bugs.openjdk.java.net/browse/JDK-8
204612 for this change.

I.e.
- make ParallelRefProcEnabled with dynamic selection of thread numbers
the default for G1.
- (I do not think we need the CSR for the ReferencesPerThread flag, but
I did anyway).

> Previously we discussed for changing ParallelRefProcEnabled option
> to on/off/auto(enabling ReferencesPerThread). Of course, the name
> also  should be changed too. It would be good to discuss under
> separate CR.
> 
> thomas.webrev.3 looks good to me.

I wasn't expecting a review already :)

> 
> Some minor comments:
> --------------------------
> - I guess you didn't merge TestPrintReferences.java yet? So 
> intentionally not included the file on webrev.3?

I had to fix it a bit for the latest version of webrev.3, which I think
is ready for review.

> --------------------------
> src/hotspot/share/gc/shared/referenceProcessor.hpp
> 215   bool        _adjust_no_of_processing_threads;
> 
> 556   // True, to allow ergonomically changing a number of
> processing 
> threads based on references.
> 557   bool _adjust_no_of_processing_threads;
> - There are 2 declaration of  _adjust_no_of_processing_threads. The 
> latter one exists on other class.
> 
> --------------------------
> (pre-existing from my webrev)
> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp
> 5128   assert(workers->active_workers() == ergo_workers,
> 5129          "Ergonomically chosen workers(%u) should be less than
> or 
> equal to active workers(%u)",
> 5130          ergo_workers, workers->active_workers());
> - We can remove the assert or change the statement to remove 'less
> than or'.
> 
> --------------------------
> src/hotspot/share/gc/cms/parNewGeneration.cpp
> 797   assert(workers->active_workers() == ergo_workers,
> 798          "Ergonomically chosen workers(%u) should be less than
> or 
> equal to active workers(%u)",
> 799          ergo_workers, workers->active_workers());
> - Same as above.
> 
> --------------------------
> - Same for pcTasks.cpp and psScavenge.cpp too if we decide not to 
> include parallel gc in this patch.
> 
> --------------------------
> src/hotspot/share/gc/shared/referenceProcessor.cpp
>   786   RefProcMTDegreeAdjuster a(this,
>   787                             RefPhase1,
>   788                             num_soft_refs);
> - We can put at the same line. Same for other cases.
> 
> Again thank you for taking this CR, Thomas!

I think I fixed all this.

Webrev is at http://cr.openjdk.java.net/~tschatzl/8043575/webrev.3/ .

This webrev is based on top of latest jdk/jdk and https://bugs.openjdk.
java.net/browse/JDK-8202845 .

If you want to test parallel gc, you also need the fixes for JDK-
8204617 and JDK-8204618 currently out for review.

Testing:
hs-tier1-4,jdk-tier1-3 with +/-ParallelRefProcEnabled

Thanks,
  Thomas




More information about the hotspot-gc-dev mailing list