Lambda special inline treatment is desirable elsewhere
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Sep 26 17:18:30 UTC 2023
Hi Randall,
I don't fully understand what kind of change you experimented with. Do
you mind sharing the patch?
The compilers have special handling for *lambda forms*
(java.lang.invoke.LambdaForm), which are the crucial piece of a performant
invokedynamic and java.lang.invoke implementation. Lambda forms are
aggressively shared, and many distinct MethodHandles share LambdaForm
instances. Based on that knowledge, the JVM special-cases them in several
places. The check you refer to in InlineTree::try_to_inline() lifts
recursive inlining constraints from LambdaForms to MethodHandles, but
the constraint is still there.
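
As a rough illustration of the check described above, here is a simplified
Java model. It is not HotSpot's actual code: the class RecursiveInlineModel,
the Frame type and the allowsInline method are hypothetical names, and the
limit of 1 is the default MaxRecursiveInlineLevel quoted further down in this
thread. The model only shows that a repeated lambda-form method counts as
recursion when its argument 0 is the same object, while any other repeated
method always counts.

```java
import java.util.List;

final class RecursiveInlineModel {
    /** Hypothetical stand-in for one method on the inlining chain. */
    static final class Frame {
        final String method;      // identity of the method
        final Object argument0;   // "receiver" (argument 0) of the call
        final boolean lambdaForm; // true for a compiled lambda form

        Frame(String method, Object argument0, boolean lambdaForm) {
            this.method = method;
            this.argument0 = argument0;
            this.lambdaForm = lambdaForm;
        }
    }

    // Assumption: the default quoted in the thread.
    static final int MAX_RECURSIVE_INLINE_LEVEL = 1;

    /** Returns true if the callee may still be inlined below the given callers. */
    static boolean allowsInline(List<Frame> callers, Frame callee) {
        int level = 0;
        for (Frame caller : callers) {
            if (!caller.method.equals(callee.method)) {
                continue; // different method: not recursion
            }
            if (callee.lambdaForm) {
                // Shared lambda form: only the same argument 0 counts as recursion.
                if (caller.argument0 == callee.argument0) {
                    level++;
                }
            } else {
                // Ordinary method: every repetition counts.
                level++;
            }
        }
        return level <= MAX_RECURSIVE_INLINE_LEVEL;
    }
}
```

In this model, a chain that repeats the same ordinary method on distinct
receivers is rejected once the limit is exceeded, whereas the same chain of
lambda-form frames with distinct argument 0 values is accepted.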
Lambdas (the Java language feature) are implemented on top of invokedynamic,
and the JVM doesn't do anything in particular to optimize specifically for them.
It would be really helpful if you could share a benchmark demonstrating the
use case you care about.
Best regards,
Vladimir Ivanov
On 9/7/23 10:49, Randall Oveson wrote:
> I'm considering a patch to improve the performance of a common pattern
> in my (and plausibly others') application. The pattern relates to
> polymorphic processing of records or tuples, e.g. serializing or
> deserializing an Avro or CSV record, or evaluating a runtime-constructed
> expression tree.
>
> You have an immutable tree (often a mere list) of objects implementing a
> common interface. From a CHA perspective the interface is megamorphic,
> but it's always runtime-monomorphic at most call sites (anything within
> the immutable tree). The methods themselves are often cheap, sometimes
> as simple as reading a single byte from a stream or doing a single
> arithmetic operation, so it's imperative that they all be inlined. You
> might say these methods are "dominated by their composition with other
> methods".
>
> In practice it is not possible to tune C2's inlining acceptably for this
> pattern for a few reasons, but the major one is the recursive inlining
> detection. If your tuple processor has to deal with, say, 15 integer-typed
> values (so 15 of the methods in the immutable call tree happen to
> be the same method), you won't see any inlining happen because
> InlineTree::try_to_inline considers these calls recursive and the
> default MaxRecursiveInlineLevel is 1. Intuitively, these calls aren't
> really "recursive" in the classic sense; the number of calls to the same
> method is statically bounded, and there's nothing significant about them
> being the same call anyway; they could just as well have been different
> calls if the tuple types at those positions had been different.
>
> It seems this problem was well-observed with lambdas, because there's an
> exception carved out in try_to_inline for lambda-form methods. In those
> cases, we check to see if the argument 0 ("receiver") of the method is
> the same before considering it recursive.
>
> One patch I tested is extending that lambda-form detection of recursive
> inlining to all non-static methods. That solves my performance problem
> and doesn't appear to cause any new performance problems in my project,
> but I can imagine cases where it might be problematic. Still, I think
> it's worth considering as a solution if it hasn't been already.
>
> Another patch I've got is one that treats any non-static method that is
> also @ForceInline the same as lambda-form methods in the recursive
> inline check, along with a change to classFileParser.cpp to allow the
> use of @ForceInline outside of privileged code (the latter change I'd
> bet has been proposed before). This also solves my problem, but I doubt
> it would be acceptable upstream.
>
> I think my intuition about lambdas--which I'd hesitantly suggest is the
> popular intuition about lambdas--being merely "syntactic sugar" for
> ad-hoc abstract method implementations is at odds with the current state
> of C2. The more considerate and aggressive inlining behavior is
> extremely important for any immutable tree of compile-time-polymorphic,
> runtime-monomorphic calls. It's unfortunate that the only way to access
> that behavior is by using a different syntax, which may not be
> appropriate for other reasons.
>
> I'd appreciate any better ideas than the ones I've proposed here. I only
> started digging into this recently and it's my first time on the openjdk
> lists, so thanks in advance for your patience.
>
> Randall
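
For readers following the thread, the pattern described in the quoted message
might look roughly like the Java sketch below. Everything in it is
hypothetical (the Step interface, the IntColumn and End classes, and the
TupleChainSketch helper are not from Randall's application); it only
illustrates how 15 integer columns put the same method, IntColumn.apply, on
the call chain 15 times without any true recursion.

```java
final class TupleChainSketch {
    /** Hypothetical common interface; many implementations could exist program-wide. */
    interface Step {
        long apply(int[] row, int index, long acc);
    }

    /** Terminal node: ends the chain and returns the accumulated value. */
    static final class End implements Step {
        @Override
        public long apply(int[] row, int index, long acc) {
            return acc;
        }
    }

    /** Consumes one int column, then delegates to the next step in the chain. */
    static final class IntColumn implements Step {
        private final Step next; // fixed at construction; the chain is immutable

        IntColumn(Step next) {
            this.next = next;
        }

        @Override
        public long apply(int[] row, int index, long acc) {
            // Cheap body: one array read and one addition, then the next call.
            return next.apply(row, index + 1, acc + row[index]);
        }
    }

    /** Builds a chain of `columns` IntColumn nodes terminated by End. */
    static Step buildChain(int columns) {
        Step chain = new End();
        for (int i = 0; i < columns; i++) {
            chain = new IntColumn(chain);
        }
        return chain;
    }

    /** Runs the whole chain over one row. */
    static long sumRow(Step chain, int[] row) {
        return chain.apply(row, 0, 0L);
    }

    public static void main(String[] args) {
        int[] row = new int[15];
        for (int i = 0; i < row.length; i++) {
            row[i] = i + 1;
        }
        System.out.println(sumRow(buildChain(15), row)); // prints 120
    }
}
```

Each IntColumn.apply call has a different receiver and a statically bounded
depth, which is the situation the quoted message argues should not be treated
as recursion by the inliner.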
More information about the hotspot-compiler-dev
mailing list