RFR: 8278020: ~13% variation in Renaissance-Scrabble [v2]

Thomas Stuefe stuefe at openjdk.java.net
Wed Dec 15 05:49:59 UTC 2021


On Wed, 15 Dec 2021 04:19:35 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> We found that when CDS is enabled, there is a ~13% variation in the Renaissance-Scrabble benchmark between different builds of the JDK. In one example, only two core-lib classes, unrelated to the benchmark, changed between two builds, but one build is consistently faster than the other.
>> 
>> When CDS is disabled, we do not see such variations.
>> 
>> In the slow case, there seem to be frequent dcache misses when loading the `Klass::_vtable_len` field, which is at offset 24 from the beginning of the Klass (see [bug report](https://bugs.openjdk.java.net/browse/JDK-8278020) for details).
>> 
>> We suspect that the problem is with the layout of the CDS archive. Specifically, in CDS, Klass objects are intermixed with other metadata objects (such as Methods). In contrast, when CDS is disabled (on 64-bit platforms with compressed klass pointers), Klass objects are allocated in their own space, separated from other metadata objects.
>> 
>> My theory is: when CDS is enabled, perhaps the modification of an object that sits immediately above the Klass invalidates the cacheline that holds `Klass::_vtable_len`. In a different JDK build, the exact addresses of the metadata objects in the CDS archive may be slightly nudged so we don't see the cacheline effect anymore.
>> 
>> As an experiment, I swapped `Klass::_vtable_len` with `Klass::_modifier_flags` (which was at offset 164 before this patch), and the variation stopped. Both fields are 32 bits in size.
>> 
>> I have no concrete proof that my theory is correct, but this change seems to be harmless. @ericcaspole has run all the benchmarks in Oracle's CI and found consistent improvement with Renaissance-Scrabble, and no degradation in other benchmarks.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   added comments about the location of vtable_len

Hi Ioi,

The fix looks fine.

This is interesting to me because, in the context of Lilliput (https://github.com/openjdk/lilliput/pull/13), I was counting on CDS to intermix Klass and non-class metadata, since that way CDS uses the larger Klass alignment gaps. In fact, I have a wild idea to shape metaspace the same way, merging Klass and non-class metadata into one larger class space. It would be really good to understand these interactions better.
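
To make the cacheline theory from the quoted description concrete, here is a rough sketch of the address arithmetic. It is not HotSpot code: the 64-byte line size, the example base addresses, and the helper names (`cache_line_of`, `shares_line_with_predecessor`, `demo_offsets`) are assumptions for illustration only. The two offsets, 24 and 164, are the ones mentioned in the PR description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (common on x86-64, but not universal).
constexpr std::uintptr_t kCacheLineSize = 64;

// Index of the cache line that holds the byte at base + offset.
constexpr std::uintptr_t cache_line_of(std::uintptr_t base, std::size_t offset) {
  return (base + offset) / kCacheLineSize;
}

// True if a field at field_offset inside an object starting at base lies in
// the same cache line as the last byte of whatever is placed directly before
// the object (the byte at base - 1).
constexpr bool shares_line_with_predecessor(std::uintptr_t base,
                                            std::size_t field_offset) {
  return base > 0 &&
         cache_line_of(base, field_offset) == cache_line_of(base - 1, 0);
}

// Hypothetical base addresses; only the offsets 24 and 164 come from the PR.
inline void demo_offsets() {
  // Object starts 8 bytes into a line: offset 24 is still in that line,
  // together with the tail of the preceding object.
  assert(shares_line_with_predecessor(0x1008, 24));
  // Object starts exactly on a line boundary: no sharing with the predecessor.
  assert(!shares_line_with_predecessor(0x1000, 24));
  // Object starts 40 bytes into a line: offset 24 spills into the next line.
  assert(!shares_line_with_predecessor(0x1028, 24));
  // Offset 164 is more than one line away from the start of the object, so it
  // can never share a line with the predecessor.
  assert(!shares_line_with_predecessor(0x1008, 164));
}
```

Under these assumptions, and with 8-byte-aligned objects, a field at offset 24 shares a line with the preceding object whenever the object starts 8, 16, 24 or 32 bytes into a line, which would be consistent with the effect appearing or disappearing when addresses in the archive shift slightly between builds. It says nothing about whether the neighbouring object is actually written to at runtime, so it illustrates the theory rather than confirming it.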

What tool did you use to measure the dcache misses?

Cheers, Thomas

-------------

Marked as reviewed by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6838


More information about the hotspot-dev mailing list