[lworld] RFR: 8235914: [lworld] Profile acmp bytecode

John R Rose jrose at openjdk.java.net
Wed Sep 16 18:31:14 UTC 2020

On Wed, 16 Sep 2020 16:16:42 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> When the JIT speculates on an acmp profile, I suppose the independent type profiles will help to make some cases more
>> profitable to test:  If an inline type is common, you test for it first and inline the rest.
>> But I don't think a good result requires independent type profiles for the two operands.  I would think that the
>> relevant history would consist of (a) the number of acmp attempts, and (b) for each inline type *for which the operands
>> had that as their common type* the frequency of encountering that type.  That's really just a single klass profile
>> (with counters).  This other form would be somewhat preferable because it would use less footprint and require less
>> bookkeeping, and it would capture sharper information for the JIT, than two independent profiles.  The weakness of
>> independent profiles is you don't know how often the two operands end up with the same type.  Just a suggestion.
> @rose00 Thanks for the suggestion.
> With this patch, my goal is to improve the performance of acmp when there are no inline types involved but the compiler
> can't tell that from the static types of the acmp inputs, so that legacy code that makes no use of inline types is not
> affected by them. My understanding is that, first of all, we want the new acmp to not cause regressions in code that
> does not use inline types. How important is it to optimize acmp (or aaload/aastore) for cases where inline types hidden
> behind Object are compared (or flattened arrays hidden behind Object[])?
> Current logic for an acmp is:
> if (left == right) {
>  // equal
> } else {
>   if (left == null) {
>     // not equal
>   } else  if (left not inline type) {
>     // not equal
>   } else if (right == null) {
>     // not equal
>   } else if (left.klass != right.klass) {
>     // not equal
>   } else {
>     // substitutability test
>   }
> }
> Now if we have profiling for left or right that tells us one of them is always null, or one of them is never an inline
> type and never null, then we only need 2 comparisons, for instance:
> if (left == right) {
>   // equal
> } else {
>   if (left.klass != java.lang.Integer) { // implicit null check
>     trap;
>   }
>  // not equal
> }
> which is a pattern that's a lot friendlier to the compiler.

The simple comparison `left == right` is valid for `acmp` along all paths *except* when `left` and `right` are both
inlines, and *even then* the simple comparison is valid if the two inline types are *distinct*.  Therefore, the only
evidence the JIT needs that the substitutability test (S-test) was ever needed is an observation that the two operand
types are (a) inlines and (b) identical.  Your profile detects (a) but not (b), which means the JIT can't speculate on
(b).  If you speculate that (b) never holds, you can use `left == right` because any inlines that appear are irrelevant.

(Of course you need additional testing to verify the speculation.  That's something like `left.klass == right.klass &&
left.klass.is_inline` or the reverse, which detects (a) and (b).  It can go to an uncommon trap if the profile claims
it's rare, to avoid compiling in a potentially polymorphic S-test.)
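
A rough sketch of that guard in plain Java, as a model only and not HotSpot code.  `InlineMarker`, `UncommonTrap`,
`Point`, and `speculativeAcmp` are hypothetical names: an ordinary class with a marker interface stands in for an
inline type, and an exception stands in for the uncommon trap.

```java
// Hypothetical model of the speculation; none of these names exist in HotSpot or the JDK.
class AcmpSpeculationSketch {
    /** Stand-in for "klass.is_inline"; an assumption for illustration only. */
    interface InlineMarker {}

    /** Ordinary class used here to model an inline type. */
    static final class Point implements InlineMarker {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** Models an uncommon trap (deoptimization) as an exception. */
    static final class UncommonTrap extends RuntimeException {}

    /** acmp compiled under the speculation that (a) and (b) never hold together. */
    static boolean speculativeAcmp(Object left, Object right) {
        if (left == right) {
            return true;                 // pointer equality proves identity
        }
        // Guard: both operands have the same klass (b) and that klass is inline (a).
        if (left != null && right != null
                && left.getClass() == right.getClass()
                && left instanceof InlineMarker) {
            throw new UncommonTrap();    // speculation failed; no S-test compiled on this path
        }
        return false;                    // the `left != right` heuristic holds
    }
}
```

Only the case that would really need the S-test leaves the compiled path; nulls, non-inlines, and inlines of distinct
types all fall through to "not equal".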

My overall point here is that `left == right` is a good way to prove, in many cases, that two things are identical, and
`left != right` is a good heuristic (not proof) that two things are not identical.  The heuristic fails only if (a) and
(b) are true.  That heuristic seems to be a good target for speculation; perhaps you are aiming at other speculations
here and I'm barking up a different tree.  For this speculation, I think you would throw an uncommon trap if you saw (a)
and (b), so the complicated S-test path doesn't need to be compiled.

To answer your question, I think (but don't know for sure) that inlines masked by `Object` references will be an
important source of `acmp` slow paths.  If the profile indicates that either (a) inlines are not seen in the operands
or (b) equal inline types are not seen, then the slow S-test can be removed and replaced (at most) by an uncommon trap.
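
To make the profile side concrete, here is a small Java model of a single `acmp` profile that counts attempts and, per
klass, the cases where both operands are distinct references of the same inline type.  This is an illustrative sketch,
not HotSpot's profile layout; `AcmpProfileSketch`, `InlineMarker`, `Sample`, and `canElideSTest` are all made-up names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a single acmp profile; not HotSpot's actual data structure.
class AcmpProfileSketch {
    /** Stand-in for "is an inline type"; an assumption for illustration only. */
    interface InlineMarker {}

    /** Ordinary class used here to model an inline type. */
    static final class Sample implements InlineMarker {}

    private long attempts;                                               // every acmp execution
    private final Map<Class<?>, Long> sameInlineType = new HashMap<>();  // per-klass count of (a) and (b)

    /** Record one acmp execution, as an interpreter might. */
    void record(Object left, Object right) {
        attempts++;
        // Count only the cases that would actually reach the S-test.
        if (left != right && left != null && right != null
                && left.getClass() == right.getClass()
                && left instanceof InlineMarker) {
            sameInlineType.merge(left.getClass(), 1L, Long::sum);
        }
    }

    long attempts() { return attempts; }

    /** True if the S-test was never needed, so it can be replaced by an uncommon trap. */
    boolean canElideSTest() {
        return sameInlineType.isEmpty();
    }
}
```

A real JIT would presumably also compare the per-klass counts against `attempts` before deciding whether to inline a
monomorphic S-test for a common type; that policy is left out here.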

(Idea of the day:  Make an `acmp` entry point a hidden/injected virtual method on every object, with a distinct body
for each inline object, and `this==x` on `jl.Object`.  The body for any given inline class `V` is then `this==x || x
instanceof V && V.$MONOMORPHIC_S_TEST(this, (V)x)`.  Then refer the whole problem to our portfolio of devirtualization
optimizations.  That gives us bimorphic calls, etc., for `acmp`.  Maybe that's a better factoring than adding more and
more ad hoc `acmp` logic.  This idea pairs well with your profile proposal, since it basically treats `acmp` as a
hidden virtual call.)
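
A toy Java rendering of that idea, with ordinary classes standing in for the injected method.  `Ref` plays the role of
`jl.Object`, `V` is a made-up inline class with two fields, and `monomorphicSTest` is a field-wise comparison; all of
these names are hypothetical.

```java
// Hypothetical model of "acmp as a hidden virtual method"; names are illustrative only.
class AcmpVirtualSketch {
    /** Plays the role of jl.Object: the default acmp body is a pointer comparison. */
    static class Ref {
        boolean acmp(Object x) { return this == x; }
    }

    /** Models an inline class V with its own acmp body. */
    static final class V extends Ref {
        final int a, b;
        V(int a, int b) { this.a = a; this.b = b; }

        @Override
        boolean acmp(Object x) {
            return this == x || x instanceof V && monomorphicSTest(this, (V) x);
        }

        /** Field-wise substitutability test, monomorphic in V. */
        static boolean monomorphicSTest(V l, V r) {
            return l.a == r.a && l.b == r.b;
        }
    }

    /** acmp at a use site: virtual dispatch on the left operand, pointer compare otherwise. */
    static boolean acmp(Object left, Object right) {
        if (left instanceof Ref) {
            return ((Ref) left).acmp(right);
        }
        return left == right;   // covers a null left operand
    }
}
```

With this shape, a call site that only ever sees `Ref` or only ever sees `V` as the receiver is an ordinary
monomorphic virtual call, which is the case existing devirtualization already handles.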



PR: https://git.openjdk.java.net/valhalla/pull/185
