GC.heap_dump performance regression in Java 21

Hannes Greule hannesgreule at outlook.de
Mon Oct 2 18:49:05 UTC 2023


Recently, a performance regression of jcmd GC.heap_dump was brought to 
my attention. I investigated the regression and tracked down 
https://bugs.openjdk.org/browse/JDK-8292818 as the source of it.
For reproduction, I used the code at [1] and ran it with `java -Xmx2G 
In Java 17, jcmd CountPrimes GC.heap_dump -overwrite heap.hprof finishes 
in 2-3 seconds. In Java 21, it takes almost 20 seconds instead.

Further analysis showed that the functions in InstanceKlass that return 
the access flags of a field (identified by its index) now require 
iterating over the fields. As FieldStream from reflectionUtils.hpp 
accesses such data through the InstanceKlass with a given field index, 
this results in complexity quadratic in the number of fields for each 
object that gets dumped.
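
To illustrate the effect, here is a minimal Java sketch. It is not 
HotSpot code: `PackedFields`, `flagsAt` and `decodeAll` are hypothetical 
names, and the only assumption modelled is that entry i can be reached 
only by decoding entries 0..i first. It counts decode steps for 
index-based lookups versus a single forward pass.

```java
class FieldLookupCost {
    // Hypothetical stand-in for a packed field table: reaching entry i
    // requires decoding every entry before it.
    static final class PackedFields {
        private final int[] flags;
        long decodeSteps = 0;

        PackedFields(int[] flags) {
            this.flags = flags.clone();
        }

        int size() {
            return flags.length;
        }

        // Index-based lookup: walks from the start on every call.
        int flagsAt(int index) {
            int result = 0;
            for (int i = 0; i <= index; i++) {
                decodeSteps++;
                result = flags[i];
            }
            return result;
        }

        // Stream-style access: one forward pass over all entries.
        int[] decodeAll() {
            int[] out = new int[flags.length];
            for (int i = 0; i < flags.length; i++) {
                decodeSteps++;
                out[i] = flags[i];
            }
            return out;
        }
    }

    // Visit every field by index, as an index-driven stream would.
    static long stepsWithIndexLookup(int fieldCount) {
        PackedFields fields = new PackedFields(new int[fieldCount]);
        for (int i = 0; i < fields.size(); i++) {
            fields.flagsAt(i);
        }
        return fields.decodeSteps;
    }

    // Visit every field in one pass.
    static long stepsWithSinglePass(int fieldCount) {
        PackedFields fields = new PackedFields(new int[fieldCount]);
        fields.decodeAll();
        return fields.decodeSteps;
    }

    public static void main(String[] args) {
        System.out.println(stepsWithIndexLookup(100)); // prints 5050
        System.out.println(stepsWithSinglePass(100));  // prints 100
    }
}
```

For n fields, the index-based variant needs n(n+1)/2 decode steps per 
object, while the single pass needs n.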

I wrote a fix for this, with which it seems to finish even faster than 
before the regression.

Before opening a Pull Request for it, however, I would like to know if 
this change is even feasible.
Based on the implementation in fieldStreams, I built a class 
`HierarchicalFieldStream` to stream over fields of all InstanceKlasses 
in a hierarchy, similar to how `FieldStream` in reflectionUtils is 
implemented already.
The most significant difference is that the `FieldStream` from 
reflectionUtils iterates fields backwards, while the `JavaFieldStream` 
from fieldStreams iterates forwards. That means using the 
`JavaFieldStream` and my `HierarchicalFieldStream` directly results in 
different heap dumps as the fields are dumped in their encounter order. 
From what I've found, this order isn't specified. The order in which 
super types are visited remains the same.
Is this an acceptable change?
I decided against changing the implementation of `FieldStream` from 
reflectionUtils as it is used in JVMTI code too.
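
The ordering difference can be sketched with a small Java model. This 
is not the HotSpot implementation: `Klass` and `dumpOrder` are 
hypothetical names, and the sketch assumes that a class is visited 
before its super class in both variants. Only the per-class field 
order changes.

```java
import java.util.ArrayList;
import java.util.List;

class HierarchyOrderSketch {
    // Hypothetical model of a class: its name, its declared field names,
    // and its super class (null for the root of the hierarchy).
    record Klass(String name, List<String> fields, Klass superKlass) {}

    // Walks the hierarchy from 'start' up to the root and lists the
    // fields of each class either forwards or backwards.
    static List<String> dumpOrder(Klass start, boolean forward) {
        List<String> out = new ArrayList<>();
        for (Klass k = start; k != null; k = k.superKlass()) {
            List<String> f = k.fields();
            if (forward) {
                for (int i = 0; i < f.size(); i++) {
                    out.add(k.name() + "." + f.get(i));
                }
            } else {
                for (int i = f.size() - 1; i >= 0; i--) {
                    out.add(k.name() + "." + f.get(i));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Klass base = new Klass("Base", List.of("a", "b"), null);
        Klass derived = new Klass("Derived", List.of("x", "y"), base);
        // prints [Derived.x, Derived.y, Base.a, Base.b]
        System.out.println(dumpOrder(derived, true));
        // prints [Derived.y, Derived.x, Base.b, Base.a]
        System.out.println(dumpOrder(derived, false));
    }
}
```

Both variants emit the same set of fields and visit the classes in the 
same order, so a consumer that does not rely on the position of a field 
within its class would see equivalent content.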

You can find my suggested implementation at [2].

Please let me know what you think about it, and also let me know if 
there are any relevant tests that I should run that don't run in GHA.

If you agree with my changes, I will open a bug report and create a PR.


[1] https://gist.github.com/SirYwell/73d8e3d679e5aa49a11ebefc868b4404

