[lworld] RFR: 8235914: [lworld] Profile acmp bytecode

Wed Sep 16 16:21:09 UTC 2020

On Wed, 16 Sep 2020 02:39:54 GMT, John R Rose <jrose at openjdk.org> wrote:

>> This includes:
>> - a new ProfileData structure to profile both inputs to acmp
>> - profile collection at acmp in the interpreter on x86
>> - profile collection at acmp in c1 generated code on x86
>> - changes to c2's acmp implementation to leverage profiling (both existing profiling through type speculation and new
>>   profile data at acmp)
>> - small tweaks to the assembly code generated for acmp
>> - a change to the implementation of LIRGenerator::profile_null_free_array() so it doesn't use a branch (which is
>>   dangerous given the register allocator is not aware of branches added at the LIR level)
>> - new tests
>> 
>> Profile collection happens unconditionally. Leveraging profiling at acmp is under UseACmpProfile which is false by
>> default.
>
> When the JIT speculates on an acmp profile, I suppose the independent type profiles will help to make some cases more
> profitable to test:  If an inline type is common, you test for it first and inline the rest.
> But I don't think a good result requires independent type profiles for the two operands.  I would think that the
> relevant history would consist of (a) the number of acmp attempts, and (b) for each inline type *for which the operands
> had that as their common type* the frequency of encountering that type.  That's really just a single klass profile
> (with counters).  This other form would be somewhat preferable because it would use less footprint and require less
> bookkeeping, and it would capture sharper information for the JIT, than two independent profiles.  The weakness of
> independent profiles is you don't know how often the two operands end up with the same type.  Just a suggestion.

@rose00 Thanks for the suggestion.

With this patch, my goal is to improve the performance of acmp when no inline types are involved but the compiler
can't tell that from the static types of the acmp inputs, so that legacy code that makes no use of inline types is
not affected by them. My understanding is that, first of all, we want the new acmp to not cause regressions in code
that doesn't use inline types. How important is it to optimize acmp (or aaload/aastore) for cases where inline types
hidden behind Object are compared (or flattened arrays hidden behind Object[])?

Current logic for an acmp is:
if (left == right) {
  // equal
} else {
  if (left == null) {
    // not equal
  } else if (left not inline type) {
    // not equal
  } else if (right == null) {
    // not equal
  } else if (left.klass != right.klass) {
    // not equal
  } else {
    // substitutability test
  }
}

Now if we have profiling for left or right that tells us one of them is always null, or one of them is never an
inline type and never null, then we only need 2 comparisons, for instance:

if (left == right) {
  // equal
} else {
  if (left.klass != java.lang.Integer) { // implicit null check
    trap;
  }
  // not equal
}

which is a pattern that's a lot friendlier to the compiler.
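
To make the comparison of the two shapes concrete, here is a rough Java model of them. It is only a sketch, not the
actual C2 code: the InlineLike interface, the Point class, the UncommonTrap exception and all method names are
hypothetical stand-ins (plain Java has no inline types), and the fallback in acmp() merely imitates what
deoptimization would do after a trap.

```java
final class AcmpSketch {
    /** Hypothetical stand-in for "this object is an instance of an inline type". */
    interface InlineLike {
        /** Field-wise comparison; the caller guarantees other has the same class. */
        boolean substitutable(Object other);
    }

    /** Hypothetical inline-like type with two fields. */
    static final class Point implements InlineLike {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        public boolean substitutable(Object other) {
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }
    }

    /** Models an uncommon trap taken when the speculation fails. */
    static final class UncommonTrap extends RuntimeException {}

    /** The general shape: up to five checks before the substitutability test. */
    static boolean acmpGeneric(Object left, Object right) {
        if (left == right) return true;
        if (left == null) return false;
        if (!(left instanceof InlineLike)) return false;
        if (right == null) return false;
        if (left.getClass() != right.getClass()) return false;
        return ((InlineLike) left).substitutable(right);
    }

    /**
     * The speculative shape: the profile says left is never null and always of
     * class profiled, which is assumed not to be an inline type. Only two
     * comparisons remain. In compiled code the null check would be implicit in
     * the klass load; here it is spelled out.
     */
    static boolean acmpSpeculative(Object left, Object right, Class<?> profiled) {
        if (left == right) return true;
        if (left == null || left.getClass() != profiled) throw new UncommonTrap();
        return false;
    }

    /** Try the speculative shape, fall back to the general one on a trap. */
    static boolean acmp(Object left, Object right, Class<?> profiled) {
        try {
            return acmpSpeculative(left, right, profiled);
        } catch (UncommonTrap t) {
            return acmpGeneric(left, right);
        }
    }
}
```

For example, AcmpSketch.acmp(new AcmpSketch.Point(1, 2), new AcmpSketch.Point(1, 2), Integer.class) fails the
speculation and ends up in the substitutability test, while any call whose left operand really is an Integer never
leaves the two-comparison path.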

-------------

PR: https://git.openjdk.java.net/valhalla/pull/185