RFR (S): 8240556: Abort concurrent mark after effective eager reclamation of humongous objects

Fri Mar 6 11:35:17 UTC 2020

Hi,

Thanks to Man for the accurate comments. I have made the changes:
http://cr.openjdk.java.net/~luchsh/g1hum/humongous.webrev.1/

Stefan's concern is reasonable: I have noticed that if there are not
enough GC workers, the additional pause time caused by clearing can be
considerable. concurrent_cycle_abort might not be easy to reuse because
it still clears the bitmap in the pause. I was thinking of letting the
concurrent mark thread continue and finish the last step,
"_cm->cleanup_for_next_mark()", although this could delay the next
initial mark. Anyway, I'm glad to give it a try so that you can compare
the two approaches and provide comments.
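
To make the dependence on the worker count concrete, here is a rough
back-of-the-envelope model, not code from the webrev. It assumes one mark
bit per 8 bytes of heap (the default object alignment) and an even split
of the bitmap across workers. The function names and the bandwidth
parameter are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one mark bit per 8 bytes of heap (default object alignment),
// so the bitmap takes heap_bytes / 64 bytes. Not taken from the patch.
inline std::uint64_t bitmap_bytes(std::uint64_t heap_bytes) {
    return heap_bytes / 64;
}

// Bytes each worker must clear when the bitmap is split evenly (rounded up).
inline std::uint64_t bytes_per_worker(std::uint64_t heap_bytes, unsigned workers) {
    assert(workers > 0);
    return (bitmap_bytes(heap_bytes) + workers - 1) / workers;
}

// Estimated clearing time in milliseconds, given a hypothetical
// per-worker clearing bandwidth in bytes per millisecond.
inline double clear_time_ms(std::uint64_t heap_bytes, unsigned workers,
                            double bytes_per_ms_per_worker) {
    assert(bytes_per_ms_per_worker > 0.0);
    return static_cast<double>(bytes_per_worker(heap_bytes, workers)) /
           bytes_per_ms_per_worker;
}
```

Under these assumptions a 20 GB heap has a 320 MiB bitmap: 32 MiB per
worker with 10 workers, but 160 MiB per worker with only 2, which is five
times the clearing time in this model.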

Thanks,
Liang

------------------------------------------------------------------
From:Stefan Johansson <stefan.johansson at oracle.com>
Send Time:2020 Mar. 6 (Fri.) 18:59
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; Thomas Schatzl <thomas.schatzl at oracle.com>; Man Cao <manc at google.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: RFR (S): 8240556: Abort concurrent mark after effective eager reclamation of humongous objects

Hi Liang,

Thanks for picking this up, really nice to see it progressing.

It would be nice if we could do the clearing concurrently to avoid
prolonging the pause. An alternative to aborting the way you do now would
be to let the concurrent cycle start, but have it abort itself directly.
This should be done by calling:
G1ConcurrentMark::concurrent_cycle_abort()

This would also reuse the abort mechanism already in place and if 
aborting needs updating in the future there is only one place to change. 
There might be some things that have to be altered to get this to work 
and I haven't explored this more than in theory. Would you consider 
trying this out?
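
The control flow of this idea can be sketched with a toy model. This is
only an illustration of the intent (the pause records a request, the
concurrent thread starts the cycle, aborts at once and does the clearing
outside the pause). All names below are hypothetical and none of them are
HotSpot APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model; none of these names are HotSpot APIs.
enum class Step { CycleStart, Mark, Abort, ClearBitmap, CycleEnd };

struct ToyCycle {
    bool abort_requested = false;  // set inside the pause: a cheap flag write
    bool bitmap_dirty = true;      // the pause has already marked roots
    std::vector<Step> trace;       // steps executed by the concurrent thread

    // Called in the pause when eager reclamation made marking unnecessary.
    void request_abort() { abort_requested = true; }

    // Single place that handles aborting, whatever the reason.
    void abort_cycle() { trace.push_back(Step::Abort); }

    // Runs on the concurrent thread, i.e. outside the pause.
    void run_concurrent_cycle() {
        assert(trace.empty());  // toy model: one cycle per object
        trace.push_back(Step::CycleStart);
        if (abort_requested) {
            abort_cycle();
        } else {
            trace.push_back(Step::Mark);
        }
        // In both cases the bitmap is cleared concurrently for the next cycle.
        trace.push_back(Step::ClearBitmap);
        bitmap_dirty = false;
        trace.push_back(Step::CycleEnd);
    }
};
```

In this model the pause pays only for setting a flag, and any future
change to abort handling is confined to abort_cycle().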

I'm thinking this should look something like this in the log:
GC(1) Pause Young (Concurrent Start) (G1 Evacuation Pause) 261M->262M(502M) 50.153ms
GC(2) Concurrent Cycle
GC(2) Concurrent Mark Abort
GC(2) Concurrent Cycle 12.345ms

We might want to call it something other than "Abort" in the logs to
distinguish it from an abort caused by a Full GC, but we can discuss the
details later on.

Thanks,
Stefan

On 2020-03-05 08:13, Liang Mao wrote:
> Hi All,
> 
> Now we have the bug id. I did more testing of the patch. There is
> one small concern with the patch: when we decide to cancel
> the concurrent cycle in the initial mark pause, we need to clear
> the next bitmap, which is supposed to be cleared concurrently.
> In my test with -Xmx20g -Xms20g -XX:ParallelGCThreads=10,
> the time spent on clearing the next bitmap was consistently less
> than 10ms. So I guess it could be acceptable.
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8240556
> Webrev:
> http://cr.openjdk.java.net/~luchsh/g1hum/humongous.webrev/
> 
> Thanks,
> Liang
> 
> ------------------------------------------------------------------
> From:MAO, Liang <maoliang.ml at alibaba-inc.com>
> Send Time:2020 Mar. 3 (Tue.) 19:14
> To:Thomas Schatzl <thomas.schatzl at oracle.com>; Man Cao <manc at google.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
> Subject:G1: Abort concurrent at initial mark pause
> 
> Hi All,
> 
> As previously discussed, there are several ideas for improving the
> handling of humongous objects. Our experiments show that canceling
> concurrent mark at the initial mark pause is effective in the scenario
> where frequent allocation of temporary humongous objects leads to
> frequent concurrent marking and high CPU usage. The sub-test
> scimark.fft.large in SPECjvm2008 is exactly this case, but it is not
> GC sensitive, so there is little difference in score.
> 
> The patch is small. Shall we create a bug id for it?
> http://cr.openjdk.java.net/~luchsh/g1hum/humongous.webrev/
> 
> Thanks,
> Liang
> 