FYI: G1 pauses may be extremely long with upcoming EA build JDK-17+18
Thomas Schatzl
thomas.schatzl at oracle.com
Mon Apr 19 20:30:22 UTC 2021
Hi all,
during perf testing we noticed that, due to a recent change
(JDK-8262068), GC pauses after a G1 full GC may be extremely long.
The problem has been fixed with JDK-8264987, which has already been
integrated. The fix will be available with the following EA build,
JDK-17+19.
More technical information:
After a full GC, young generation regions that are mostly (>95%) full at
the time of the GC do not have a data structure, the so-called Block
Offset Table (BOT), fully set up. This data structure is used to
efficiently find the start of the next Java object given an arbitrary
location in the heap. In some cases this operation is executed very
frequently during GC.
When G1 needs to perform that lookup within these regions, typically
when trying to find cross-generational references during garbage
collection, the BOT cannot be used and G1 needs to use a very slow
fallback.
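To illustrate why the missing BOT hurts, here is a minimal sketch of the
idea in Java. It is a simplified model, not HotSpot's actual BOT layout
or encoding: the class BotSketch, the card size CARD_WORDS and the
word-indexed "region" are all assumptions made for illustration. The
lookup is modeled as finding the start of the object that covers a given
word. With the table, the walk starts at most one card before the
address; without it, the walk starts at the beginning of the region.

```java
import java.util.Arrays;

final class BotSketch {
    static final int CARD_WORDS = 64;  // hypothetical card size, in words

    private final int[] sizeAt;  // sizeAt[w] = size of the object starting at word w, else 0
    private final int[] bot;     // bot[c] = start of the object covering the first word of card c
    long objectsWalked;          // number of objects stepped over, for comparison only

    // Lays out the objects back to back from word 0; they must fit into the region.
    BotSketch(int regionWords, int[] objectSizes) {
        sizeAt = new int[regionWords];
        bot = new int[(regionWords + CARD_WORDS - 1) / CARD_WORDS];
        int start = 0;
        for (int size : objectSizes) {
            sizeAt[start] = size;
            int end = start + size;
            // Record this object for every card whose first word it covers.
            for (int c = (start + CARD_WORDS - 1) / CARD_WORDS; c * CARD_WORDS < end; c++) {
                bot[c] = start;
            }
            start = end;
        }
    }

    // Lookup using the table: start walking close to the address.
    int blockStartWithBot(int addr) {
        return walk(bot[addr / CARD_WORDS], addr);
    }

    // Fallback without the table: walk all objects from the region start.
    int blockStartSlow(int addr) {
        return walk(0, addr);
    }

    // addr must lie inside an allocated object.
    private int walk(int start, int addr) {
        while (start + sizeAt[start] <= addr) {
            start += sizeAt[start];
            objectsWalked++;
        }
        return start;
    }

    public static void main(String[] args) {
        int[] sizes = new int[4096];
        Arrays.fill(sizes, 8);                       // 4096 small objects of 8 words each
        BotSketch region = new BotSketch(4096 * 8, sizes);
        int addr = 4096 * 8 - 1;                     // a location near the end of the region

        region.blockStartWithBot(addr);
        long withBot = region.objectsWalked;
        region.objectsWalked = 0;
        region.blockStartSlow(addr);
        System.out.println("objects walked with BOT: " + withBot
                + ", without: " + region.objectsWalked);
    }
}
```

In this model the cost of the fallback grows with the number of objects
in front of the address, which is why a nearly full region of small
objects is the bad case.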
This situation will not self-heal until the affected regions are
evacuated or compacted again.
I would expect that an application suffering from this problem would
need to do all of the following:
a) load or generate almost exclusively long-lived data (e.g. a
"database") with the intent to put it into the old generation
b) do a full GC (e.g. via System.gc()) at the end of this operation to
compact that data in the old generation before doing actual work
c) have this "database structure" reference the young generation a lot
during that time, because it is mutated a lot
d) perform young collection(s)
Most applications and benchmarks will not be affected because all of
these conditions need to be true.
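For illustration only, the pattern a) to d) could look roughly like the
following Java sketch. This is not the stress test mentioned in this
mail and I make no claim that it reproduces the regression; the class
name, the table layout and all sizes are hypothetical and would need to
be tuned to the heap under test.

```java
final class FullGcThenMutate {
    // a) build a large, long-lived structure (the "database")
    static Object[][] buildDatabase(int tables, int slotsPerTable) {
        Object[][] db = new Object[tables][slotsPerTable];
        for (Object[] table : db) {
            for (int i = 0; i < table.length; i++) {
                table[i] = new byte[64];
            }
        }
        return db;
    }

    // c) + d) keep storing freshly allocated objects into the old structure;
    // the allocations are what eventually trigger young collections.
    // Returns the number of reference stores performed.
    static long mutate(Object[][] db, int rounds) {
        long stores = 0;
        for (int r = 0; r < rounds; r++) {
            for (Object[] table : db) {
                for (int i = r % 16; i < table.length; i += 16) {
                    table[i] = new byte[64];  // old object now references a young object
                    stores++;
                }
            }
        }
        return stores;
    }

    public static void main(String[] args) {
        Object[][] db = buildDatabase(1_000, 2_000);  // hypothetical sizes
        System.gc();                                  // b) full GC before the actual work
        long begin = System.nanoTime();
        long stores = mutate(db, 200);
        long millis = (System.nanoTime() - begin) / 1_000_000;
        System.out.println(stores + " reference stores in " + millis + " ms");
    }
}
```

Running it with something like "java -XX:+UseG1GC -Xlog:gc
FullGcThenMutate.java" prints the individual pauses, so the young
collection times after the full GC can be compared between builds.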
In one stress test we are using, with a 20 GB heap, young collection
pause times went up from 200 ms to 30 s. ;)
So if you notice a regression with that build, please update your
sources and recompile or wait for JDK-17+19.
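If you are unsure which build a test machine is running, a small check
along the following lines may help. It is only a sketch: the class and
method names are made up, and it assumes that JDK-17+18 is the only
affected EA build, as the subject of this mail suggests. It uses the
standard Runtime.Version API.

```java
final class AffectedBuildCheck {
    // Assumption: only feature release 17, build 18 contains the regression.
    static boolean isAffected(Runtime.Version v) {
        return v.feature() == 17 && v.build().orElse(-1) == 18;
    }

    public static void main(String[] args) {
        Runtime.Version v = Runtime.version();
        System.out.println(v + (isAffected(v)
                ? " looks like the affected build"
                : " does not look like the affected build"));
    }
}
```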
Thanks,
Thomas