RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter
Vladimir Ivanov
vlivanov at openjdk.org
Fri Jul 12 00:01:56 UTC 2024
On Tue, 2 Jul 2024 14:52:09 GMT, Andrew Haley <aph at openjdk.org> wrote:
> This patch expands the use of a hash table for secondary superclasses
> to the interpreter, C1, and runtime. It also adds a C2 implementation
> of hashed lookup in cases where the superclass isn't known at compile
> time.
>
> HotSpot shared runtime
> ----------------------
>
> Building hashed secondary tables is now unconditional. It takes very
> little time, and now that the shared runtime always has the tables, it
> might as well take advantage of them. The shared code is easier to
> follow now, I think.
>
> There might be a performance issue with x86-64 in that we build
> HotSpot for a default x86-64 target that does not support popcount.
> This means that HotSpot C++ runtime on x86 always uses a software
> emulation for popcount, even though the vast majority of machines made
> for the past 20 years can do popcount in a single instruction. It
> wouldn't be terribly hard to do something about that.
>
> Having said that, the software popcount is really not bad.
>
> x86
> ---
>
> x86 is rather tricky, because we still support
> `-XX:-UseSecondarySupersTable` and `-XX:+UseSecondarySupersCache`, as
> well as 32- and 64-bit ports. There's some further complication in
> that only `RCX` can be used as a shift count, so there's some register
> shuffling to do. All of this makes the logic in macroAssembler_x86.cpp
> rather gnarly, with multiple levels of conditionals at compile time
> and runtime.
>
> AArch64
> -------
>
> AArch64 is considerably more straightforward. We always have a
> popcount instruction and (thankfully) no 32-bit code to worry about.
>
> Generally
> ---------
>
> I would dearly love simply to rip out the "old" secondary supers cache
> support, but I've left it in just in case someone has a performance
> regression.
>
> The versions of `MacroAssembler::lookup_secondary_supers_table` that
> work with variable superclasses don't take a fixed set of temp
> registers, and neither do they call out to to a slow path subroutine.
> Instead, the slow patch is expanded inline.
>
> I don't think this is necessarily bad. Apart from the very rare cases
> where C2 can't determine the superclass to search for at compile time,
> this code is only used for generating stubs, and it seemed to me
> ridiculous to have stubs calling other stubs.
>
> I've followed the guidance from @iwanowww not to obsess too much about
> the performance of C1-compiled secondary supers lookups, and to prefer
> simplicity over absolute performance. Nonetheless, this is a
> complicated patch that touches many areas.
Looks very good, Andrew!
Some comments on minor things follow.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1433:
> 1431:
> 1432: // Don't check secondary_super_cache
> 1433: if (super_check_offset.is_register()
Do you see any effects from this particular change?
It adds a runtime check on the fast path for all subtype checks (irrespective of whether it checks primary or secondary super). Moreover, the very same check is performed after primary super slot is checked.
Unless `_secondary_super_cache` field is removed, unconditionally checking the slot at `super_check_offset` is benign.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 1040:
> 1038:
> 1039: // Secondary subtype checking
> 1040: void lookup_secondary_supers_table(Register sub_klass,
While browsing the code, I noticed that it's far from evident at call sites which overload is used (especially with so many arguments). Does it make sense to avoid method overloads here and use distinct method names instead?
src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 1981:
> 1979: __ load_klass(r19_klass, copied_oop);// query the object klass
> 1980:
> 1981: BLOCK_COMMENT("type_check:");
Why don't you move it inside `generate_type_check`?
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4781:
> 4779: Label* L_success,
> 4780: Label* L_failure) {
> 4781: if (! UseSecondarySupersTable) {
Any particular reason to keep the condition negated?
(Here and in general. There are multiple places where `!UseSecondarySupersTable` is used.)
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4810:
> 4808: Label* L_success,
> 4809: Label* L_failure) {
> 4810: // NB! Callers may assume that, when temp2_reg is a valid register,
Oh, that's a subtle point... Can we make it more evident at call sites?
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 5062:
> 5060:
> 5061: #ifdef DEBUG
> 5062: call_VM_leaf_base((address)&poo, /*number_of_arguments*/0);
A leftover from debugging?
src/hotspot/share/memory/universe.cpp line 443:
> 441:
> 442: {
> 443: Universe::_the_array_interfaces_bitmap = Klass::compute_secondary_supers_bitmap(_the_array_interfaces_array);
Cleanup idea: remove `Universe::` prefixes. The rest of the method don't use qualified names for class members.
src/hotspot/share/oops/instanceKlass.cpp line 1410:
> 1408: return nullptr;
> 1409: } else if (num_extra_slots == 0) {
> 1410: if (num_extra_slots == 0 && interfaces->length() <= 1) {
Since `secondary_supers` are hashed unconditionally now, is `interfaces->length() <= 1` check still needed?
src/hotspot/share/oops/instanceKlass.cpp line 3524:
> 3522:
> 3523: st->print(BULLET"secondary supers: "); secondary_supers()->print_value_on(st); st->cr();
> 3524: {
Any particular reason to keep brackets around `hash_slot` and `bitmap`?
src/hotspot/share/oops/klass.cpp line 175:
> 173: if (secondary_supers()->at(i) == k) {
> 174: if (UseSecondarySupersCache) {
> 175: ((Klass*)this)->set_secondary_super_cache(k);
Does it make sense to assert `UseSecondarySupersCache` in `Klass::set_secondary_super_cache()`?
src/hotspot/share/oops/klass.cpp line 284:
> 282: // which doesn't zero out the memory before calling the constructor.
> 283: Klass::Klass(KlassKind kind) : _kind(kind),
> 284: _bitmap(SECONDARY_SUPERS_BITMAP_FULL),
I like the idea, but what are the benefits of initializing `_bitmap` separately from `_secondary_supers`?
src/hotspot/share/oops/klass.cpp line 469:
> 467: #endif
> 468:
> 469: bitmap = hash_secondary_supers(secondary_supers, /*rewrite=*/true); // rewrites freshly allocated array
I like that hashing is performed unconditionally now.
Looks like you can remove `UseSecondarySupersTable`-specific CDS support (in `filemap.cpp`). CDS archive should unconditionally contain hashed tables.
src/hotspot/share/oops/klass.inline.hpp line 117:
> 115: }
> 116:
> 117: inline bool Klass::search_secondary_supers(Klass *k) const {
I see you moved `Klass::search_secondary_supers` in `klass.inline.hpp`, but I'm not sure how it interacts with `Klass::is_subtype_of` (the sole caller) being declared in `klass.hpp`.
Will the inlining still happen if `Klass::is_subtype_of()` callers include `klass.hpp`?
src/hotspot/share/oops/klass.inline.hpp line 122:
> 120: return true;
> 121:
> 122: bool result = lookup_secondary_supers_table(k);
Should `UseSecondarySupersTable` affect `Klass::search_secondary_supers` as well?
-------------
PR Review: https://git.openjdk.org/jdk/pull/19989#pullrequestreview-2161098896
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1674812600
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1674823519
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1674790160
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667194729
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1674832710
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667037608
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667194339
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1674792021
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667193916
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667190974
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667192207
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667192783
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1674806107
PR Review Comment: https://git.openjdk.org/jdk/pull/19989#discussion_r1667193323
More information about the core-libs-dev
mailing list