RFR: 8232588: G1 concurrent System.gc can return early or late

Wed Nov 6 15:24:54 UTC 2019

Hi,

On 31.10.19 21:53, Kim Barrett wrote:
> RFR: 8232588: G1 concurrent System.gc can return early or late
> RFR: 8233279: G1: GCLocker GC with +GCLockerInvokesConcurrent spins while cycle in progress
> 
> Please review this refactoring and fixing of the state machine used by
> G1CollectedHeap::collect for handling requests for concurrent collections.
> 
> The handling of concurrent collection requests is now split out into a
> helper function for that purpose.  All of the state machine logic for
> checking for completion, waiting for completions, and performing retries is
> now in that new helper function, rather than being distributed between
> try_collect() and various parts of the VMOp.

[...]

> A change is that waiting by a user-requested GC for a concurrent marking
> cycle to complete used to be performed with the thread transitioned to
> native and without safepoint checks on the associated monitor lock and wait.
> This was noted as having been cribbed from CMS.  Coleen and I looked at this
> and could not come up with a reason for doing that for G1 (anymore, after
> the recent spate of locking improvements), so there's a new G1-specific
> monitor being used and the locking and waiting is now "normal".  (This makes
> the FullGCCount_lock monitor largely CMS-specific.)

I do not see a reason for the extra monitor (why it is better to have a 
separate monitor instead of reusing the existing one with almost the 
same name?), it does not seem to add information. It is still used in 
GenCollectedHeap/Serial, so can't be removed with CMS.

> For other concurrent GC requests, the only intentional change is for
> _gc_locker with GCLockerInvokesConcurrent.  Previously it would spin in
> try_collect while there was a concurrent marking cycle in progress, also
> blocking any callers of GCLocker::stall_until_clear() (JDK-8233279).  Now it
> returns in that situation, though it's not clear that's a great idea either.
> Indeed, even when that option was introduced (for CMS, as part of fixing a
> bad interaction between GCLocker GCs and +ExplicitGCInvokesConcurrent) it
> was not clear it was a good idea (see JDK-6919638).  Fortunately it's off by
> default. JDK-8233280 has been filed to remove this option.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8233279
> https://bugs.openjdk.java.net/browse/JDK-8232588
> 
> Webrev:
> https://cr.openjdk.java.net/~kbarrett/8232588/open.00/
> 
> Testing:
> mach5 tier1-6
> 
> Local (linux-x64) testing with a program that allocates some live data in
> the old gen, then has several threads all repeatedly looping on System.gc().
> Looked at output from new logging in try_collect_concurrently and verified
> the interleavings of GC start/end and new log messages were as expected.
> 

- Not sure, but maybe this logging should be moved to trace level? In 
either case I would suggest to improve it to cover the try_collect call 
too, i.e. with appropriate "attempt"/"complete"/"discard" messages.

Otherwise the logging looks a bit incomplete to me, i.e. only attempts 
to initiate a concurrent collection cauuse messages. I understand that 
this might generate too many messages for gc+debug, that is the reason 
to potentially move to gc+trace.

- feel free to ignore: in such log messages, if I add thread id for 
debugging purposes I tend to put the thread id value first so that if 
you have a log in front of you, and ask your viewer to highlight a 
particular id, all highlights are in the same column.

I.e.

.... stuff from unified logging] <thread-id> Message

instead of having the thread-id somewhere located in the message like in
this case where it is somewhere in the middle.

(I kind of also prefer the raw Thread::current() value instead of the 
pretty name, fwiw)

I do not feel strongly about this at all, just something that came to my 
mind.

- a comment why we need a separate gc_counter_less_than method instead 
of a < comparison would be nice (I know, because of roll-over of these 
counts).

- pre-existing:

g1VMOperations.cpp:124-127: maybe this could be changed to

} else if (g1h->should_upgrade....) {
}

Also please fix the indentation of the parameter list in the call to 
do_full_collection() in line 133.

Looks good otherwise.

Thanks,
   Thomas