initialization times for invokedynamic

Tue Sep 12 12:19:42 UTC 2023

Hi,

I wasn’t suggesting using, but to generate and archive some kind of lookup table when generating the pre-generated LambdaForm Holder classes, i.e., somewhere in GenerateJLIClassesPlugin. Depending on implementation choices this would add a little bit of footprint and an extra lookup step, so a little overhead when the speculation wins to reduce a larger cost on speculation failures - which might be a net win.

But yes, fixing this in the runtime code would be even better. Main issue is that the linkResolver code you point out is shared between resolveOrNull/resolveOrFail and bytecode linking - and we probably shouldn’t change semantics of the latter. If you can come up with a patch idea that solves this issue for resolveOrNull, such as throwing a pre-created NSME in case the caller is resolveOrFail, then I’d be happy to file an RFE and help get it through review.
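
As a rough Java-level analogy of the "pre-created NSME" idea (the real change would be in HotSpot's C++ code, which this does not model), the sketch below reuses one exception instance whose stack trace is never filled in. The names PreallocatedNoSuchMethodError and SpeculativeLookup, and the Map standing in for the resolver, are hypothetical and only for illustration.

```java
import java.util.Map;

// Hypothetical illustration only; not how HotSpot's linkResolver works.
final class PreallocatedNoSuchMethodError extends NoSuchMethodError {
    // One shared instance, created once and rethrown on every miss.
    static final PreallocatedNoSuchMethodError INSTANCE =
            new PreallocatedNoSuchMethodError();

    private PreallocatedNoSuchMethodError() {
        super("speculative lookup failed");
    }

    // Skip the stack walk, which is the costly part of creating a Throwable.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }
}

final class SpeculativeLookup {
    // Stand-in for a resolveOrFail-style call: throws on a miss.
    static String resolveOrFail(Map<String, String> known, String name) {
        String hit = known.get(name);
        if (hit == null) {
            throw PreallocatedNoSuchMethodError.INSTANCE;
        }
        return hit;
    }

    // Stand-in for a resolveOrNull-style call: the error is silently dropped.
    static String resolveOrNull(Map<String, String> known, String name) {
        try {
            return resolveOrFail(known, name);
        } catch (NoSuchMethodError e) {
            return null;
        }
    }
}
```

The trade-off is that a shared, trace-less exception carries no useful diagnostics, so it only suits callers that discard it.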

/Claes

On 11 Sep 2023, at 14:50, liangchenblue at gmail.com wrote:

Hi Claes,
After looking at the usages of resolveOrFail in VarHandle, I believe changing the runtime's THROW_MSG_NULL template occurrences in linkResolver would be a better approach, for resolveOrFail is currently the most efficient way for finding particular members; reflection, on the other hand, has to perform a search over the list of all methods to find one that's accessible.

On an unrelated note, VarForm can probably substitute failed resolution MemberName with dummy ones like that for Object.toString so we don't need to query resolveOrNull repeatedly.

/Chen

On Mon, Sep 11, 2023 at 5:28 PM Claes Redestad <claes.redestad at oracle.com> wrote:
Hi,

It’s been something I’ve wanted to get rid of, sure. An alternative that wouldn’t require changes to the runtime code would be to store a table of LFs that have actually been generated and skip the speculative VM call when the table has no entry. This can be done in a few different ways; it would add a little overhead on hits but remove the exception overhead (which clutters JFR recordings) on misses.
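
A minimal sketch of such a table, assuming the generated forms can be identified by a string key. The class PregeneratedFormTable, the key format, and the Function standing in for the speculative VM call are all hypothetical and not part of java.lang.invoke.

```java
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: consult a table of known-generated forms before
// making the (potentially exception-throwing) speculative lookup.
final class PregeneratedFormTable {
    private final Set<String> generatedNames;
    private final Function<String, Object> speculativeVmLookup;
    private int vmCalls;

    PregeneratedFormTable(Set<String> generatedNames,
                          Function<String, Object> speculativeVmLookup) {
        this.generatedNames = Set.copyOf(generatedNames);
        this.speculativeVmLookup = speculativeVmLookup;
    }

    // Returns the pre-generated form, or null if none was generated.
    Object lookup(String name) {
        if (!generatedNames.contains(name)) {
            return null; // miss: no VM call, so no exception is created
        }
        vmCalls++; // hit: pays for the table check plus the VM call
        return speculativeVmLookup.apply(name);
    }

    // Number of lookups that actually reached the stand-in VM call.
    int vmCalls() {
        return vmCalls;
    }
}
```

Hits pay for one extra set lookup, while misses never reach the expensive path.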

/Claes

On 11 Sep 2023, at 11:02, liangchenblue at gmail.com wrote:

Hello Jochen and Claes,
I have done a little debugging and found the cause: looking up a pre-generated LambdaForm (mentioned by Claes) causes the VM to initialize a NoSuchMethodError [1] that is later silently dropped [2]. By then the NoSuchMethodError constructor has already run and the stack trace has been filled in, causing significant overhead, as shown in this [3] JMC rendering of a JFR recording.

You can capture this NoSuchMethodError construction in an IDE debugger, even when running an empty main, on newer JDK versions. I tested with a breakpoint in the NoSuchMethodError(String) constructor and it is hit twice (because the instrumentation agent uses reflection, which has depended on method handles since JEP 416 in Java 18).

I think a resolution would be to modify linkResolver so that it can also resolve speculatively instead of always throwing exceptions, but this might be too invasive and I want to hear from other developers such as Claes, who authored the old resolveOrNull silent-dropping patch.

Looking forward to a solution,
Chen Liang

[1]: https://github.com/openjdk/jdk/blob/a04c6c1ac663a1eab7d45913940cb6ac0af2c11c/src/hotspot/share/interpreter/linkResolver.cpp#L773
[2]: https://github.com/openjdk/jdk/blob/a04c6c1ac663a1eab7d45913940cb6ac0af2c11c/src/hotspot/share/prims/methodHandles.cpp#L794-L796
[3]: https://cr.openjdk.org/~liach/mess/invokerbytecodegen-cache-miss.png

On Mon, Sep 11, 2023 at 4:41 PM Jochen Theodorou <blackdrag at gmx.org> wrote:
I changed my testing a bit to have more infrastructure types and test
with a fresh VM each time.

The scenario is still the same: call a method foo with argument 1. foo
does nothing but return 0. Implement the call.

indyDirect:
bootstrap method selects method and produces constant call-site

indyDoubleDispatch:
bootstrap selects a selector method and produces a mutable call-site.
selector then selects the target method and sets it in the call-site

reflective:
an inner class is used to select the method using reflection and directly
invoke it.

reflectiveCached:
same as reflective but caching the selected method

staticCallSite:
I have the call abstracted and replace what is called after method
selection. Here with a direct call to the method using normal Java

runtimeCallSite:
I have the call abstracted like staticCallSite, but instead of replacing
with a direct call I create a class at runtime, which does the direct
call for me.
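
For readers who want something concrete, the two indy variants above can be sketched roughly as follows. This is not the actual test code: the class IndySketch and all method names are assumptions, and the call sites are driven through dynamicInvoker() rather than real invokedynamic bytecode.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class IndySketch {
    static final MethodType FOO_TYPE = MethodType.methodType(int.class, int.class);

    // The method under test: ignores its argument and returns 0.
    static int foo(int x) { return 0; }

    // indyDirect: the bootstrap method selects the target and binds it for good.
    static CallSite bootstrapDirect(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        return new ConstantCallSite(lookup.findStatic(IndySketch.class, name, type));
    }

    // indyDoubleDispatch: the bootstrap method only installs a selector.
    static CallSite bootstrapDoubleDispatch(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        MutableCallSite site = new MutableCallSite(type);
        MethodType selectorType = MethodType.methodType(int.class,
                MutableCallSite.class, MethodHandles.Lookup.class, String.class, int.class);
        MethodHandle selector = lookup.findStatic(IndySketch.class, "select", selectorType);
        // Bind site, lookup and name so the selector has the call site's type.
        site.setTarget(MethodHandles.insertArguments(selector, 0, site, lookup, name));
        return site;
    }

    // Runs on the first call: picks the real target, installs it, then invokes it.
    static int select(MutableCallSite site, MethodHandles.Lookup lookup, String name, int arg)
            throws Throwable {
        MethodHandle target = lookup.findStatic(IndySketch.class, name, site.type());
        site.setTarget(target);
        return (int) target.invokeExact(arg);
    }

    static int callDirect(int arg) {
        try {
            CallSite site = bootstrapDirect(MethodHandles.lookup(), "foo", FOO_TYPE);
            return (int) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // First call goes through the selector, second call hits foo directly.
    static int[] callDoubleDispatchTwice(int arg) {
        try {
            CallSite site = bootstrapDoubleDispatch(MethodHandles.lookup(), "foo", FOO_TYPE);
            MethodHandle invoker = site.dynamicInvoker();
            int first = (int) invoker.invokeExact(arg);
            int second = (int) invoker.invokeExact(arg);
            return new int[] { first, second };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```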

My interest is in the performance of the first few calls. My experiments
show that after at most 5 calls there is no significant performance change
anymore for a long time. But long-term performance is secondary right now.

Out of these implementations it is no surprise that staticCallSite has
the least cost, but it is almost on par with the reflective variant.
That really surprised me. It seems reflection has come a long way since the
old times. There is probably still a lot of cost in the long term, but
well, I focus on the short term here right now.

The cached variant does not really differ much in code, but if reflection
gets a score of 41, then the cached variant is at 105. That is surprisingly
much for an additional if condition. But if you think of how many
instructions that involves, maybe it is not that surprising. indyDirect has
almost the same initial cost as the reflectiveCached. indyDoubleDispatch
follows with a score of 149... which looks very much like
reflective+indyDirect-"a small something". At 361 we find
runtimeCallSite, the slowest by far. The numbers used to be quite
different for this, but back then MagicAccessor was an option to reduce
cost.
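
To make the "additional if condition" concrete, here is a hedged sketch of what the reflective and reflectiveCached variants might look like. The class ReflectiveSketch and its method names are assumptions, not the actual test code, and the sketch makes no claim about the measured scores.

```java
import java.lang.reflect.Method;

final class ReflectiveSketch {
    // The method under test: ignores its argument and returns 0.
    static int foo(int x) { return 0; }

    // Cache slot used only by the cached variant.
    private static Method cached;

    // reflective: select the method on every call, then invoke it.
    static int callReflective(int arg) {
        try {
            Method m = ReflectiveSketch.class.getDeclaredMethod("foo", int.class);
            return (Integer) m.invoke(null, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // reflectiveCached: same, but the selected method is remembered.
    static int callReflectiveCached(int arg) {
        try {
            Method m = cached;
            if (m == null) { // the extra condition discussed above
                m = ReflectiveSketch.class.getDeclaredMethod("foo", int.class);
                cached = m;
            }
            return (Integer) m.invoke(null, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```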

My conclusion so far: call-site generation is a questionable option, not
only because of performance, but also because of the module system.
Though we have cases where we can use the static variant.

The next best is actually reflective. But how would you combine
reflective with something that has better long term performance? Even a
direct call with indy costs much more.

I think I have to change my tests. I should test a scenario in
which I have a quite big number - like 1 million - of one-time
call-sites to get really conclusive numbers... Of course that means 1
million direct method handles for indy.

Well, I will write again if I have more numbers.

bye Jochen
_______________________________________________
mlvm-dev mailing list
mlvm-dev at openjdk.org
https://mail.openjdk.org/mailman/listinfo/mlvm-dev