RFR: 8352075: Perf regression accessing fields [v4]

Fri May 9 07:38:54 UTC 2025

On Wed, 7 May 2025 19:18:46 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> Compressing the fields into unsigned5 and decoding them into streams was quite a complicated change but manageable because the interface to decode them is all one has write the FieldStream iterator. This is hard to review.

Not sure if I got you right, but I agree that the interface should not be changed (significantly). Here I am adding a `skip_fields_until` method for one place; it was a bit surprising to me that there was no focus on (sub-linear) lookup from the beginning.
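
To illustrate what I mean by sub-linear lookup over a variable-length encoded stream, here is a minimal standalone sketch. It is not the code in this PR and does not use HotSpot's UNSIGNED5 or the real FieldStream classes: `PackedFields`, `FieldCursor`, `kStride` and the LEB128-style encoding are all hypothetical stand-ins, and only the method name `skip_fields_until` is borrowed. The idea shown is a sparse table of byte offsets, so that seeking to field N decodes at most `kStride` entries instead of N.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compressed field stream: one variable-length
// integer per field (LEB128-style here, NOT HotSpot's UNSIGNED5).
struct PackedFields {
    static constexpr size_t kStride = 16;  // assumed checkpoint spacing

    std::vector<uint8_t> bytes;       // encoded values, back to back
    std::vector<size_t> checkpoints;  // byte offset of every kStride-th field
    size_t count = 0;                 // number of encoded fields

    void append(uint32_t value) {
        if (count % kStride == 0) checkpoints.push_back(bytes.size());
        while (value >= 0x80) {
            // Low 7 bits plus a continuation bit.
            bytes.push_back(static_cast<uint8_t>(value | 0x80));
            value >>= 7;
        }
        bytes.push_back(static_cast<uint8_t>(value));
        ++count;
    }
};

// Forward-only cursor, loosely modelled on a stream iterator.
class FieldCursor {
public:
    explicit FieldCursor(const PackedFields& fields) : fields_(fields) {}

    bool done() const { return index_ >= fields_.count; }
    size_t index() const { return index_; }
    size_t decoded() const { return decoded_; }  // entries decoded so far

    // Decodes the value at the current position and advances by one field.
    uint32_t next() {
        assert(!done());
        uint32_t value = 0;
        int shift = 0;
        uint8_t b;
        do {
            b = fields_.bytes[pos_++];
            value |= static_cast<uint32_t>(b & 0x7F) << shift;
            shift += 7;
        } while (b & 0x80);
        ++index_;
        ++decoded_;
        return value;
    }

    // Positions the cursor at field `target` by jumping to the nearest
    // checkpoint at or before it, then decoding the remaining few entries.
    void skip_fields_until(size_t target) {
        assert(target >= index_ && target < fields_.count);
        size_t cp = target / PackedFields::kStride;
        if (cp * PackedFields::kStride > index_) {
            pos_ = fields_.checkpoints[cp];
            index_ = cp * PackedFields::kStride;
        }
        while (index_ < target) next();
    }

private:
    const PackedFields& fields_;
    size_t pos_ = 0;      // byte position in fields_.bytes
    size_t index_ = 0;    // index of the field that next() would return
    size_t decoded_ = 0;
};
```

With 1000 fields, seeking to index 777 in this sketch jumps to the checkpoint for field 768 and decodes 9 entries rather than 777. The trade-off is one extra offset per `kStride` fields.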

> I'm wondering how much of a problem this is in real code, other than the case with 21k fields and if there's a way to programmatically work around this case, like decompress the fields into a hashtable or something (?) It would be interesting to see some histograms of some corpus Java code (maybe put this info in the associated bug).

The customer code that hit the regression looked to me like something generated, probably already working around some sizing limits, and the problem was in an initialization routine setting up thousands of descriptors. I am not saying that it could not be reworked for the better, but for someone this is an order-of-magnitude regression, and I understand that they demand a fix on the JDK side.
What kind of histogram would you imagine? I could count the number of fields in those 10 classes...

> Fred tells me that we already store the original field index so maybe above is moot.

Could you be more specific, please?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2865480638