RFR: 8325520: Vector loads with offsets incorrectly compiled
Emanuel Peter
epeter at openjdk.org
Mon Apr 15 08:56:44 UTC 2024
On Mon, 18 Mar 2024 12:20:34 GMT, Damon Fenacci <dfenacci at openjdk.org> wrote:
> # Issue
> When loading multiple vectors using offsets or masks (e.g. `LongVector::fromArray(LongVector.SPECIES_256, storage, 0, offsets, 0` or `LongVector::fromArray(LongVector.SPECIES_256, storage, 0, longMask)`) there is an error in the C2 compiled code that makes different vectors be treated as equal even though they are not.
>
> # Causes
> On vector-capable platforms, vector loads with masks and offsets (for Long, Integer, Float and Double) create specific nodes in the ideal graph (i.e. `LoadVectorGather`, `LoadVectorMasked`, `LoadVectorGatherMasked`). Vector loads without mask or offsets are mapped as `LoadVector` nodes instead.
> While running GVN loops, we can get to the situation in which there are multiple loads from the same address. If successive loads are deemed to be identical to the first one, they might get folded, which is what happens in the problematic examples of this issue. The check for identity happens in
> https://github.com/openjdk/jdk/blob/481c866df87c693a90a1da20e131e5654b084ddd/src/hotspot/share/opto/memnode.cpp#L1253
> This version of `Identity` is defined in `LoadNode` but it is also the one used by the subclasses `LoadVectorGather`, `LoadVectorMasked` and `LoadVectorGatherMasked`. Although this definition of `Identity` is enough for `LoadVector` nodes it is not sufficient for `LoadVectorGather`, `LoadVectorMasked` and `LoadVectorGatherMasked` ones, as the value being loaded also depends on the offsets and mask (different offsets and/or masks load completely different values). This is the reason why these nodes get folded even if they shouldn't.
>
> # Solution
> `LoadVectorGather`, `LoadVectorMasked` and `LoadVectorGatherMasked` need their own version of the `Identity` method, which specialize `LoadVector::Identity` by restricting the results to nodes that also have the same offsets and masks.
>
> The same issue exists for _StoreVector_ nodes (i.e. `StoreVectorScatter`, `StoreVectorMasked` and `StoreVectorScatterMasked`). So, `Identity` has to be redefined there as well.
>
> Regression tests for all versions of `Load/StoreVectorGather/Masked` have been added too.
@dafedafe Nice work! I left a few comments ;)
I wonder if there cannot be some shared method now, since the code in both cases looks quite similar.
src/hotspot/share/opto/memnode.cpp line 1173:
> 1171: if (in_vt != out_vt) {
> 1172: return nullptr;
> 1173: }
I see there is a vector type check here. Do we not need that for the code in `StoreNode::Identity`? "Normal" stores like `StoreB` and `StoreI` have the type implicit, but for vector nodes, this type is hidden in the `vect_type()`, so I suspect you need to check it.
I imagine a scenario where we store a float-vector, and then read from the same address as int-vector. Is that ok, or would we need a ReinterpretCast node? I'm not sure, but it would be worth trying to create some tests to check that.
src/hotspot/share/opto/memnode.cpp line 2834:
> 2832: if (is_StoreVectorScatter()) {
> 2833: const Node* offsets = as_StoreVectorScatter()->in(StoreVectorScatterNode::Offsets);
> 2834: if (val->is_LoadVectorGather() && offsets->eqv_uncast(val->as_LoadVectorGather()->in(LoadVectorGatherNode::Offsets))) {
Suggestion:
const Node* offsets_store = as_StoreVectorScatter()->in(StoreVectorScatterNode::Offsets);
const Node* offsets_load = val->as_LoadVectorGather()->in(LoadVectorGatherNode::Offsets);
if (offsets_store->eqv_uncast(offsets_load)) {
As explained below, `val->as_Load()->store_Opcode() == Opcode()` (with the fix I described) would imply that `val->is_LoadVectorGather()`.
src/hotspot/share/opto/memnode.cpp line 2858:
> 2856: } else {
> 2857: result = mem;
> 2858: }
You already have the condition `val->as_Load()->store_Opcode() == Opcode()`.
So once you have `is_StoreVectorScatter()`, I think `val->is_LoadVectorGather()` is implied, no?
Oh wow. I think actually that this is another bug here: we only have
`virtual int store_Opcode() const { return Op_StoreVector; }` for `LoadVectorNode`, but not for all masked/gather/scatter vector nodes! I think that should be fixed.
That would also simplify your code.
-------------
Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18347#pullrequestreview-2000347459
PR Review Comment: https://git.openjdk.org/jdk/pull/18347#discussion_r1565402819
PR Review Comment: https://git.openjdk.org/jdk/pull/18347#discussion_r1565385954
PR Review Comment: https://git.openjdk.org/jdk/pull/18347#discussion_r1565382548
More information about the hotspot-compiler-dev
mailing list