GC.heap_dump performance regression in Java 21

Hannes Greule hannesgreule at outlook.de
Fri Oct 6 21:02:06 UTC 2023


Hi Frederic, hi Alex,

thank you for your comments.

I think it is better to have a fresh implementation in fieldStreams.hpp. 
This way, other functionality can be addressed separately (and 
fieldStreams.hpp is a more appropriate place imo).

I agree that the FieldStream implementation should be removed then in 
future. Actually, from what it looks like, the reflectionUtils.hpp/cpp 
files could be fully removed (MethodStream seems to be unused, and 
KlassStream only seems to be used by MethodStream and FieldStream) in 
that case.

I filed a bug report (https://bugs.openjdk.org/browse/JDK-8317692) and 
opened the PR for it (https://github.com/openjdk/jdk/pull/16083).
Do you want me to also file a bug report for the future work on 
FieldStream/JVMTI?

As a side note, do you think there might be more cases where fields are 
accessed by index when streaming would be beneficial?

Thanks,
Hannes


On 06.10.2023 21:02, Frederic Parain wrote:
> Hi Hannes,
> 
> 
> Thank you for the analysis and the proposed solution.
> 
> The changes look reasonable to me, and I agree with Alex that we should 
> either fix or get rid of the old FieldStream implementation. If we keep 
> it, this kind of performance issue will happen again.
> 
> 
> Regards,
> 
> 
> Fred
> 
> 
> On 10/2/23 10:00 PM, Alex Menkov wrote:
>> Hi Hannes,
>>
>> The change looks very reasonable to me.
>> Field order is not important in the heap dump (the order just should 
>> be the same in class and instance subrecords).
>>
>> And I think it would be better to fix original FieldStream (or 
>> introduce new HierarchicalFieldStream and use in for heap dumping 
>> first and then switch JVMTI to use it). This should improve 
>> performance of JVMTI functions as well.
>>
>> AFAICS JVMTI uses FieldStream/FilteredFieldStream in 2 places: 
>> GetClassFields and heap walking functions.
>> GetClassFields needs fields in the order they occur in the class file 
>> and it has to reverse the order returned by FieldStream, so switch it 
>> to use forward field stream is straightforward.
>> For heap walking functions field index is calculated trickier (it 
>> includes count of fields in superclasses/interfaces).
>>
>> We don't have good test coverage for heap dumping, there are some 
>> basic tests in test/hotspot/jtreg/serviceability.
>>
>> regards,
>> Alex
>>
>> On 02/10/23 11:49, Hannes Greule wrote:
>>> Hi,
>>>
>>> recently, a performance regression of jcmd GC.heap_dump was brought 
>>> to my attention. I investigated the regression and tracked down 
>>> https://bugs.openjdk.org/browse/JDK-8292818 as the source of it.
>>> For reproduction, I used the code at [1] and ran it with `java -Xmx2G 
>>> CountPrimes`.
>>> In Java 17, jcmd CountPrimes GC.heap_dump -overwrite heap.hprof 
>>> finishes in 2-3 seconds. In Java 21, it almost takes 20 seconds instead.
>>>
>>> Further analysis showed that the functions in InstanceKlass to get 
>>> the access flags of a field (identified by its index) now requires an 
>>> iteration of the fields. As FieldStream from reflectionUtils.hpp 
>>> accesses such data through the InstanceKlass with a given field 
>>> index, this results in quadratic complexity for each object that gets 
>>> dumped.
>>>
>>> I wrote a fix for this, with which it seems to finish even faster 
>>> than before the regression.
>>>
>>> Before opening a Pull Request for it, however, I would like to know 
>>> if this change is even feasible.
>>> Based on the implementation in fieldStreams, I built a class 
>>> `HierarchicalFieldStream` to stream over fields of all 
>>> InstanceKlasses in a hiararchy, similar to how `FieldStream` in 
>>> reflectionUtils is implemented already.
>>> The most significant difference is that the `FieldStream` from 
>>> reflectionUtils iterates fields backwards, while the 
>>> `JavaFieldStream` from fieldStreams iterates forwards. That means 
>>> using the `JavaFieldStream` and my `HierarchicalFieldStream` directly 
>>> results in different heap dumps as the fields are dumped in their 
>>> encounter order.  From what I've found, this order isn't specified. 
>>> The order in which super types are visited remains the same.
>>> Is this an acceptable change?
>>> I decided against changing the implementation of `FieldStream` from 
>>> reflectionUtils as it is used in JVMTI code too.
>>>
>>> You can find my suggested implementation at [2].
>>>
>>> Please let me know what you think about it, and also let me know if 
>>> there are any relevant tests that I should run that don't run in GHA 
>>> already.
>>>
>>> If you agree with my changes, I will open a bug report and create a PR.
>>>
>>> Thanks,
>>> Hannes
>>>
>>> [1] https://gist.github.com/SirYwell/73d8e3d679e5aa49a11ebefc868b4404
>>> [2] 
>>> https://github.com/SirYwell/jdk/commit/9814ca2aea8ebd7400e256b7430d3961a3692a83


More information about the serviceability-dev mailing list