RFR: 8325520: Vector loads with offsets incorrectly compiled [v4]
Emanuel Peter
epeter at openjdk.org
Thu Apr 25 12:24:30 UTC 2024
On Thu, 25 Apr 2024 08:01:00 GMT, Damon Fenacci <dfenacci at openjdk.org> wrote:
>> # Issue
>> When loading multiple vectors using offsets or masks (e.g. `LongVector::fromArray(LongVector.SPECIES_256, storage, 0, offsets, 0` or `LongVector::fromArray(LongVector.SPECIES_256, storage, 0, longMask)`) there is an error in the C2 compiled code that makes different vectors be treated as equal even though they are not.
>>
>> # Causes
>> On vector-capable platforms, vector loads with masks and offsets (for Long, Integer, Float and Double) create specific nodes in the ideal graph (i.e. `LoadVectorGather`, `LoadVectorMasked`, `LoadVectorGatherMasked`). Vector loads without mask or offsets are mapped as `LoadVector` nodes instead.
>> The same is true for `StoreVector`s.
>> When running GVN loops we can get to the situation where we check if a Load node is preceded by a Store of the same address to be able to replace the Load with the input of the Store (in `LoadNode::Identity`). Here we call
>>
>> https://github.com/openjdk/jdk/blob/87e864bf21d71daae4e001ec4edbb4ef1f60c36d/src/hotspot/share/opto/memnode.cpp#L1258
>>
>> where we do an extra check for types if we deal with vectors but we don’t make sure that neither masks nor offsets interfere.
>> Similarly, in `StoreNode::Identity` we first check if there is a Load and then a Store:
>>
>> https://github.com/openjdk/jdk/blob/87e864bf21d71daae4e001ec4edbb4ef1f60c36d/src/hotspot/share/opto/memnode.cpp#L3509-L3515
>>
>> but we don’t make sure that there are no masks or offsets.
>> A few lines below, we check if there are 2 stores for the same value in a row. We need to check for masks and offsets here too but in this case we can include these cases if the masks and offsets of the vector stores are equivalent.
>>
>> # Solution
>> To avoid folding `Load`- and `StoreVector`s with masks and offsets we add a specific `store_Opcode` method to `LoadVectorGatherNode`, `LoadVectorMaskedNode` and `LoadVectorGatherMaskedNode` that doesn’t return a store opcode but instead returns its own (to avoid ever being the same as a store node). In this way, the checks in `MemNode::can_see_stored_value`
>>
>> https://github.com/openjdk/jdk/blob/87e864bf21d71daae4e001ec4edbb4ef1f60c36d/src/hotspot/share/opto/memnode.cpp#L1164-L1166
>>
>> and `StoreNode::Identity`
>>
>> https://github.com/openjdk/jdk/blob/87e864bf21d71daae4e001ec4edbb4ef1f60c36d/src/hotspot/share/opto/memnode.cpp#L3509-L3515
>>
>> will fail if masks or offsets are used.
>> For 2 stores of the same value we instead check for mask and offset equality.
>>
>> Regression tests for...
>
> Damon Fenacci has updated the pull request incrementally with five additional commits since the last revision:
>
> - JDK-8325520: take advantage of store_Opcode to avoid more checks
> - JDK-8325520: remove check for offsets/mask equivalence if load->store or store->load
> - JDK-8325520: merge
> - JDK-8325520: fix missing semicolon
> - JDK-8325520: address PR comments
Nice, ah you are right, there can be issues with mask-only cases as well!
It would be great if you had tests that exactly exercise these "bad" examples, where it looks like we might optimize, but it would be wrong.
I'll look at your `store_Opcode` changes now...
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18347#issuecomment-2077053827
More information about the hotspot-compiler-dev
mailing list