Redesign interface dispatch: progress report
Andrew Haley
aph-open at littlepinkcloud.com
Wed Nov 13 14:59:09 UTC 2024
On 11/13/24 01:11, dean.long at oracle.com wrote:
> This work sounds related to JDK-8221828, which says about interface
> dispatch:
>
> "The proposal for invokeinterface calls is to give each interface
> method a unique number called "selector", and create a cuckoo-style
> hash table mapping selectors to code pointers for the concrete
> implementations of the interface method. The table is embedded
> straight into the native class. This allows dispatching the majority
> of the time with one indirect branch and one direct branch (compared
> to two direct branches for dynamically monomorphic calls today)."
Thank you for that reference. I discussed this at some length with
Erik, the author of 8221828, at the time. The work I've been doing is
a follow-on from his.
I believe this result is more general than just testing one particular
way of hashed lookup. Exactly what kind of hashing you do, as long as
it isn't bad, doesn't matter much. Maybe it would be possible to shave
a nanosecond in the predictable case, but that's it. Misprediction
would be always slow. The more complex your hash function is, no
matter how theoretically good, the harder it is for branch prediction.
The current linear probe works remarkably well, probably because it is
very simple.
8221828's proposal is similar to what I did, and I believe it would
have similar performance. For this exercise, I used a small (16-way)
embedded hash table in the Klass itself, with collisions handled by
compressed hash tables. With a small number of interface methods we
get few collisions, so its performance is very similar to any good
hash table.
I did consider global numbering, but went for a 64-bit hash function
based on the full name of the method. Simply using the address of the
Method as a key would be the same as a global number in its runtime
complexity.
The problem with embedding the entire hash table in a Klass is that of
Klass size. I think it would be possible to compress code pointers to
32 bits, and also use 32-bit indexes, which would get that space back.
But there is more pressure today, because of Liliput, on Klass size.
Cuckoo hashing is a nice trick, but it has its own problems. To begin
with, there are two possible table entries for each target. Also,
Cuckoo needs multiple hash functions and fallbacks for when hashing
fails, which isn't ideal for a generic hash table like we need here.
What really matters is predictability, and that is more a function of
user code. If the target of a megamorphic call really is random,
that's going to hurt.
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
More information about the hotspot-dev
mailing list