RFR: 8278020: ~13% variation in Renaissance-Scrabble [v2]
Thomas Stuefe
stuefe at openjdk.java.net
Wed Dec 15 05:49:59 UTC 2021
On Wed, 15 Dec 2021 04:19:35 GMT, Ioi Lam <iklam at openjdk.org> wrote:
>> We found that when CDS is enabled, there is a ~13% variation in the Renaissance-Scrabble benchmark between different builds of the JDK. In one example, only two core-lib classes, unrelated to the benchmark, changed between two builds, but one build is consistently faster than the other.
>>
>> When CDS is disabled, we do not see such variations.
>>
>> In the slow case, there seem to be frequent dcache misses when loading the `Klass::_vtable_len` field, which is at offset 24 from the beginning of the Klass (see [bug report](https://bugs.openjdk.java.net/browse/JDK-8278020) for details).
>>
>> We suspect that the problem is with the layout of the CDS archive. Specifically, in CDS, Klass objects are intermixed with other metadata objects (such as Methods). In contrast, when CDS is disabled (on 64-bit platforms with compressed klass pointers), Klass objects are allocated in their own space, separated from other metadata objects.
>>
>> My theory is: when CDS is enabled, perhaps the modification of an object that sits immediately above the Klass invalidates the cacheline that holds `Klass::_vtable_len`. In a different JDK build, the exact addresses of the metadata objects in the CDS archive may be slightly nudged so we don't see the cacheline effect anymore.
>>
>> As an experiment, I swapped `Klass::_vtable_len` with `Klass::_modifier_flags` (which was at offset 164 before this patch), and the variation stopped. Both fields are 32 bits in size.
>>
>> I have no concrete proof that my theory is correct, but this change seems to be harmless. @ericcaspole has run all the benchmarks in Oracle's CI and found consistent improvement with Renaissance-Scrabble, and no degradation in other benchmarks.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
>
> added comments about the location of vtable_len
Hi Ioi,
The fix looks fine.
This is interesting to me, because in the context of Lilliput (https://github.com/openjdk/lilliput/pull/13) I was kind of counting on CDS to intermix Klass and non-class metadata, since that way CDS uses the larger Klass alignment gaps. In fact, I have this wild idea to shape metaspace in that form, merging Klass and non-class metadata into one larger class space. It would be really good to have a better idea of these interactions.
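To make the cacheline theory quoted above a bit more concrete, here is a minimal arithmetic sketch. It is not taken from the patch: the 64-byte line size, the helper names, and the addresses are all assumptions for illustration, and it assumes the modified neighbor object ends directly before the Klass base. It only checks whether a field at a given offset lands in the same cache line as the last byte of that neighbor.

```python
# Assumption: 64-byte cache lines (typical on x86-64); not stated in the thread.
CACHE_LINE_BYTES = 64


def cache_line(addr: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Return the index of the cache line containing byte address `addr`."""
    return addr // line_bytes


def shares_line_with_preceding_object(
    klass_base: int, field_offset: int, line_bytes: int = CACHE_LINE_BYTES
) -> bool:
    """True if the field at klass_base + field_offset sits in the same cache
    line as the last byte of whatever object ends right before klass_base."""
    field_line = cache_line(klass_base + field_offset, line_bytes)
    neighbor_line = cache_line(klass_base - 1, line_bytes)
    return field_line == neighbor_line


# Hypothetical archive addresses, chosen only to show the effect.
slow_base = 0x800001008    # Klass starts 8 bytes into a cache line
nudged_base = 0x800001030  # same Klass, shifted by 40 bytes in another build

print(shares_line_with_preceding_object(slow_base, 24))    # field at offset 24
print(shares_line_with_preceding_object(slow_base, 164))   # field at offset 164
print(shares_line_with_preceding_object(nudged_base, 24))  # offset 24, nudged base
```

Under these assumptions, a field at offset 24 shares a line with the preceding object only for some base addresses, while a field at offset 164 is always at least two lines away from it. That would be consistent with the variation showing up in some builds and not in others, but it does not prove the theory.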
What tool did you use to measure the dcache misses?
Cheers, Thomas
-------------
Marked as reviewed by stuefe (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/6838