RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v9]
Vladimir Ivanov
vlivanov at openjdk.org
Sat Jul 27 00:03:33 UTC 2024
On Fri, 26 Jul 2024 15:13:06 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> This patch expands the use of a hash table for secondary superclasses
>> to the interpreter, C1, and runtime. It also adds a C2 implementation
>> of hashed lookup in cases where the superclass isn't known at compile
>> time.
>>
>> HotSpot shared runtime
>> ----------------------
>>
>> Building hashed secondary tables is now unconditional. It takes very
>> little time, and now that the shared runtime always has the tables, it
>> might as well take advantage of them. The shared code is easier to
>> follow now, I think.
>>
>> There might be a performance issue with x86-64 in that we build
>> HotSpot for a default x86-64 target that does not support popcount.
>> This means that HotSpot C++ runtime on x86 always uses a software
>> emulation for popcount, even though the vast majority of machines made
>> for the past 20 years can do popcount in a single instruction. It
>> wouldn't be terribly hard to do something about that.
>>
>> Having said that, the software popcount is really not bad.
>>
>> x86
>> ---
>>
>> x86 is rather tricky, because we still support
>> `-XX:-UseSecondarySupersTable` and `-XX:+UseSecondarySupersCache`, as
>> well as 32- and 64-bit ports. There's some further complication in
>> that only `RCX` can be used as a shift count, so there's some register
>> shuffling to do. All of this makes the logic in macroAssembler_x86.cpp
>> rather gnarly, with multiple levels of conditionals at compile time
>> and runtime.
>>
>> AArch64
>> -------
>>
>> AArch64 is considerably more straightforward. We always have a
>> popcount instruction and (thankfully) no 32-bit code to worry about.
>>
>> Generally
>> ---------
>>
>> I would dearly love simply to rip out the "old" secondary supers cache
>> support, but I've left it in just in case someone has a performance
>> regression.
>>
>> The versions of `MacroAssembler::lookup_secondary_supers_table` that
>> work with variable superclasses don't take a fixed set of temp
>> registers, and neither do they call out to to a slow path subroutine.
>> Instead, the slow patch is expanded inline.
>>
>> I don't think this is necessarily bad. Apart from the very rare cases
>> where C2 can't determine the superclass to search for at compile time,
>> this code is only used for generating stubs, and it seemed to me
>> ridiculous to have stubs calling other stubs.
>>
>> I've followed the guidance from @iwanowww not to obsess too much about
>> the performance of C1-compiled secondary supers lookups, and to prefer
>> simplicity over absolute performance. Nonetheless, this i...
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix test failure
BTW it makes sense to assert the invariant. Here's a patch (accompanied by a minor cleanup):
diff --git a/src/hotspot/share/oops/instanceKlass.cpp b/src/hotspot/share/oops/instanceKlass.cpp
index b050784dfc5..5413e29defb 100644
--- a/src/hotspot/share/oops/instanceKlass.cpp
+++ b/src/hotspot/share/oops/instanceKlass.cpp
@@ -647,7 +647,7 @@ void InstanceKlass::deallocate_contents(ClassLoaderData* loader_data) {
!secondary_supers()->is_shared()) {
MetadataFactory::free_array<Klass*>(loader_data, secondary_supers());
}
- set_secondary_supers(nullptr);
+ set_secondary_supers(nullptr, SECONDARY_SUPERS_BITMAP_EMPTY);
deallocate_interfaces(loader_data, super(), local_interfaces(), transitive_interfaces());
set_transitive_interfaces(nullptr);
diff --git a/src/hotspot/share/oops/klass.cpp b/src/hotspot/share/oops/klass.cpp
index 26ec25d1c80..b1012810be4 100644
--- a/src/hotspot/share/oops/klass.cpp
+++ b/src/hotspot/share/oops/klass.cpp
@@ -319,16 +319,16 @@ bool Klass::can_be_primary_super_slow() const {
return true;
}
-void Klass::set_secondary_supers(Array<Klass*>* secondaries) {
- assert(!UseSecondarySupersTable || secondaries == nullptr, "");
- set_secondary_supers(secondaries, SECONDARY_SUPERS_BITMAP_EMPTY);
-}
-
void Klass::set_secondary_supers(Array<Klass*>* secondaries, uintx bitmap) {
#ifdef ASSERT
if (secondaries != nullptr) {
uintx real_bitmap = compute_secondary_supers_bitmap(secondaries);
assert(bitmap == real_bitmap, "must be");
+ if (bitmap != SECONDARY_SUPERS_BITMAP_FULL) {
+ assert(((uint)secondaries->length() == population_count(bitmap)), "required");
+ }
+ } else {
+ assert(bitmap == SECONDARY_SUPERS_BITMAP_EMPTY, "");
}
#endif
_secondary_supers_bitmap = bitmap;
diff --git a/src/hotspot/share/oops/klass.hpp b/src/hotspot/share/oops/klass.hpp
index 2f733e11eef..a9e73e7bcbd 100644
--- a/src/hotspot/share/oops/klass.hpp
+++ b/src/hotspot/share/oops/klass.hpp
@@ -236,7 +236,6 @@ class Klass : public Metadata {
void set_secondary_super_cache(Klass* k) { _secondary_super_cache = k; }
Array<Klass*>* secondary_supers() const { return _secondary_supers; }
- void set_secondary_supers(Array<Klass*>* k);
void set_secondary_supers(Array<Klass*>* k, uintx bitmap);
uint8_t hash_slot() const { return _hash_slot; }
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19989#issuecomment-2253660006
More information about the core-libs-dev
mailing list