RFR(L): 8203197: C2: consider all paths in loop body for loop predication

Wed Jun 6 17:32:02 UTC 2018

Hi Roland,

On 6/4/18 7:05 AM, Roland Westrelin wrote:
> 
> Hi Vladimir,
> 
>> SpeculativeTrapData has only one cell for Method* which should be only
>> one cell, even in 32-bit VM. Change it to 2 cells is wrong I think.
>>
>> I think old assumption is invalid with your changes:
>>
>> // This code assumes an entry for a SpeculativeTrapData is 2 cells
>>
>> because you changed header size for 32 bit - it is now 2 cells.
> 
> The extra data area where the speculative traps are stored is allocated
> with:
> 
>    // Add some extra DataLayout cells (at least one) to track stray traps.
>    int extra_data_count = compute_extra_data_count(data_size, empty_bc_count, needs_speculative_traps);
>    object_size += extra_data_count * DataLayout::compute_size_in_bytes(0);
> 
>    // Add a cell to record information about modified arguments.
>    int arg_size = method->size_of_parameters();
>    object_size += DataLayout::compute_size_in_bytes(arg_size+1);
> 
> So
> 
> 1) extra_data_count * DataLayout::compute_size_in_bytes(0) guarantees
> the size of the space for traps is a multiple of the header size so a
> multiple of 2 cells on 32 bits
> 
> 2) space is reserved after the traps for entries with tag
> DataLayout::arg_info_data_tag. That stuff is still inside the extra data
> area.
> 
> When MethodData::bci_to_extra_data() allocates a new trap in the extra
> data area, it must make sure there's still space available. It checks
> that there's room by finding the last trap in the area, moving to the
> next entry and checking if:
> 
> - it's still within the extra data area
> - it's not an arg_info_data_tag, i.e. we're not overflowing in the
> argument area
> - if the trap is a speculative trap it also needs to check that there's
> room, not only for the header, but also for the Method*
> 
> Now if traps are either 2 (regular traps) or 3 (speculative traps)
> cells, that code cannot be left as is. Let say we have 6 cells for extra
> data. We allocate a regular trap and a speculative trap. Next entry is
> at cell number 5 (assuming they are numbered from 0). Let's say now we
> want to allocate a regular trap. We're still within the extra data area
> (but don't have sufficient room), we're half way on the first arg info
> but reading the tag can't be used to reliably tell us that.
> 
> Adding the padding allows us to keep the existing logic.

Got it. Please, add comment to that code explaining why it is needed in 
32-bit mode:

+#ifndef _LP64
+    speculative_trap_padding,
+#endif

Also may be name it simple 'padding'. Otherwise it is confusing. I first 
assumed that it is needed functionality for speculative traps and not 
simple memory allocation layout fix.

> 
>> It may actually cause issues (with 32-bit VM) in other places with
>> similar assumption.
>>
>> An other issue is that header can't be allocated/set/load atomically in
>> 32-bit VM as MethodData::bci_to_extra_data_helper() expecting.
> 
> The header contains:
> 
> bci/tag, flags and traps
> 
> bci/tag are constant. They are set once for all when the MethodData is
> built. There's an exception for traps as discussed above but then
> allocation happens under a lock and once a trap is allocated the bci/tag
> is constant.
> 
> So, the only things being updated are flags and traps. It's possible we
> loose an update because of concurrency but that was true before and the
> change doesn't make it worse. I don't see what other problem the lack of
> atomicity could cause.
> 
> The method data is laid out statically in a way that, AFAICT, is quite
> robust to a 2 cell header. Once it is built, updates mostly happen in
> fields of the individual profile data. So I don't really see what else
> could break.
> 
>> Then I don't understand your changes. I thought you move out checks (as
>> new predicates) which have profiling information which means you have an
>> other valid path branch from dominating check. Such check can't be moved
>> from loop as I remember. I don't think you can move out from loop loop
>> exit checks.
>>
>> Please, explain which checks you moved from loop as profiling predicates
>> without moving dominating checks.

Thank you for showing examples. I understand it now. As you said it is 
loop invariant checks and the only concerns is the order of predicates 
execution and more frequent deoptimizations.
I agree with separate "profiling" predicates to separate deoptimizations.

> 
> Say we have:
> 
> while() {
>    if (some_condition) { // both branches profiled taken
>      if (!null_check) unc(null_check);
>      if (!range_check) unc(range_check);
>      ...
>    }
> }
> 
> The existing loop predication wouldn't consider those predicates. My
> change does if there's a speculated pay off.
> 
> if (!null_check) unc(profile_predicate);
> if (!range_check) unc(profile_predicate);
> while() {
>    if (some_condition) { // both branches profiled taken
>      ...
>    }
> }
> 
> It's otherwise strictly identical to the existing loop predication:
> predicates are moved only if all inputs are loop independent.
> 
> Now:
> 
> while() {
>    if (!null_check1) unc(null_check);
>    if (!range_check1) unc(range_check);
>    ...
>    if (some_condition) { // both branches profiled taken
>      if (!null_check2) unc(null_check);
>      if (!range_check2) unc(range_check);
>      ...
>    }
> }
> 
> becomes:
> 
> if (!null_check1) unc(predicate);
> if (!range_check1) unc(predicate);
> 
> if (!null_check2) unc(profile_predicate);
> if (!range_check2) unc(profile_predicate);
> 
> while() {
>    ...
>    if (some_condition) { // both branches profiled taken
>      ...
>    }
> }
> 
> that's because loop predication is now more speculative, so it's more
> likely to fail and I don't want a failure with profile predicates to
> disable predication entirely and cause a regression.
> 
> First, a pass over the loop body moves regular predicates out of the
> loop and then a second pass moves other predicates as profile
> predicates.
> 
> There can be dependencies between predicates. Dependencies are preserved
> because predicates are kept out of the loop in the order they were moved
> out of loop.
> 
> With this:
> 
> while() {
>    if (!null_check1) unc(null_check);
>    if (!range_check1) unc(range_check);
>    ...
>    if (some_condition) { // both branches profiled taken
>      if (!null_check2) unc(null_check);
>      if (!range_check2) unc(range_check);
>      ...
>    }
>    if (!null_check3) unc(null_check);
>    if (!range_check3) unc(range_check);
> }
> 
> Let's say, the first time loop predication is applies, it finds
> null_check1/range_check1 and null_check2/range_check2 to hoist but not
> null_check3/range_check3 because they are not loop independent at that
> point.
> 
> if (!null_check1) unc(predicate);
> if (!range_check1) unc(predicate);
> 
> if (!null_check2) unc(profile_predicate);
> if (!range_check2) unc(profile_predicate);
> 
> while() {
>    ...
>    if (some_condition) { // both branches profiled taken
>      ...
>    }
>    if (!null_check3) unc(null_check);
>    if (!range_check3) unc(range_check);
> }
> 
> Now let's say a later loop predication pass finds
> it can hoist null_check3/range_check3. To guarantee correct ordering of
> predicates, they will be moved as profile_predicates even though they
> are not in a branch:
> 
> if (!null_check1) unc(predicate);
> if (!range_check1) unc(predicate);
> 
> if (!null_check2) unc(profile_predicate);
> if (!range_check2) unc(profile_predicate);
> if (!null_check3) unc(profile_predicate);
> if (!range_check3) unc(profile_predicate);
> 
> while() {
>    ...
>    if (some_condition) { // both branches profiled taken
>      ...
>    }
> }

In this example what will happen if *_check1 and *_check3 are moved from 
loop first (they are regular predicates). And then you found that 
"profiling" *_check2 can be moved. How you enforce order in such case?

> 
> When you say loop exit are you talking about something like?
> 
> while() {
>    if (some_condition) { // both branches profiled taken
>      break;
>    }
>    if (!null_check) unc(null_check);
>    if (!range_check) unc(range_check);
>    ...
> }

No I was asking about this case when you have "profiling" case:

while() {
   if (some_condition) { // both branches profiled taken
     if (!null_check) unc(null_check);
     if (!range_check) unc(range_check);
     break;
   }
   ...
}

Should you take special care in such case because such checks only 
executed on exit?

Thanks,
Vladimir

> 
> It would be optimized as:
> 
> if (!null_check) unc(profile_predicate);
> if (!range_check) unc(profile_predicate);
> while() {
>    if (some_condition) { // both branches profiled taken
>      break;
>    }
> }
> 
> As long as inputs for the tests are loop independent I don't see why
> that would be illegal. Worst case, they trigger a deopt and next
> compilation won't apply profile predication.

> 
> Roland.
>