[lworld+vector] RFR: 8319972: [lworld+vector] Enable intrinsification of Unsafe.finishPrivateBuffer.

Xiaohong Gong xgong at openjdk.org
Wed Nov 15 04:03:00 UTC 2023


On Tue, 14 Nov 2023 15:00:37 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> Re-enabling intrinsification of Unsafe.finishPrivateBuffer to fix performance degradation seen in Vector API JMH micro benchmarks. It's intrinsifation was disabled to fix incorrectness caused due to missing inductive InlineType Phi node in loops with Unsafe.put* operations. Since Unsafe.put* APIs returns a void hence JVM state is not altered and ciTypeFlow dataflow analysis does not encounter a inductive definition in the loop, due to this C2 parser misses creating an inductive phi node on
> encountering [loop header block](https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/opto/parse1.cpp#L696). 
> 
> In order to maintain IR compliance with ciTypeFlow analysis C2 should prevent scalarizing value object b/w make and finish private buffer calls.
> 
> New behavior of unsafe inline expanders
> 1) makePrivate: Receive InlineTypeNode input and return initialized buffer in larval state.
> 2) finishPrivateBuffer: Receive value object buffer input and return rematerialize InlineTypeNode in non-larval state.
> 
> This may result into creation of unbalanced phi node at control flow merges where one of the phi input may InlineTypeNode and other is a buffer of compatible value type, but the IR still remains valid.
> 
> In addition compiler is now preventing elimination of allocation in larval state.
> 
> Validation Status:-
>   - All the VectorAPI JTREG tests are now passing at various AVX level with / wo -XX:+DoptimizeALot.
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

Nice job @jatin-bhateja! Verified on NEON with `-XX:+DeoptimizeALot`! 

But I still have some questions that I don't understand very well, together with the reason that you disabled the intrinsify to `finishPrivateBuffer` before.
1. Here is what I pasted from the previous PR:
> Disable inline expansion of Unsafe.finishPrivateBuffer. This is fixing the incorrectness problems seen with masked operations loop e.g.
vt = newPayload();
vt = Unsafe.makePrivateBuffer(vt);
for (int i = 0; i < Double128.length; i++) {
     if (mask[i]) {
          Unsafe.putDouble(vt, apply(src[i]));
     }
}
Unsafe.putDouble() modifies memory state of an InlineTypeNode by storing a new value in payload buffer, it also re-materializes InlineTypeNode from modified memory state. This causes incorrectness if second mask value is false as entire payload is overwritten with default value and earlier lane updates are over-written. In order to be safe, disabling intrinsification of finishPrivateBuffer if incoming argument node is of VectorPayload* type, so that it always receive updated buffer argument.

You mean this causes incorrectness if second mask value is false because the entire payload is overwritten with default value. Do you mean it is optimized to the default value? But if not all its fields are `0`, how this happen? And I cannot see the direct relationship with intrinsifying of `finishPrivateBuffer`. Could you please clarify more?

2. In this PR, I understand the main fixing is avoiding the scalarization of `InlineType` buffer if it is in larval state, right? I guess this is the main cause of the jtreg regression with `-XX:+DeoptimizeALot`? But I still cannot understand the direct relationship with intrinsifying of `finishPrivateBuffer`. Do you mean with the elimination of scalarization for larval state buffer, the above issue in point 1 is gone?

Going through with this PR now. I still need time learning it. 

Thanks,
Xiaohong

src/hotspot/share/opto/library_call.cpp line 2634:

> 2632:     if (alloc != nullptr) {
> 2633:       assert(alloc->_larval, "InlineType instance must be in _larval state for unsafe put operation.\n");
> 2634:     }

Can we move this check after line2397 (https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/opto/library_call.cpp#L2397), and remove the same assertion between line2351-2359 (https://github.com/openjdk/valhalla/blob/lworld%2Bvector/src/hotspot/share/opto/library_call.cpp#L2351) ?

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/952#issuecomment-1811772053
PR Review Comment: https://git.openjdk.org/valhalla/pull/952#discussion_r1393600527



More information about the valhalla-dev mailing list