RFR: 8335392: C2 MergeStores: enhanced pointer parsing
Emanuel Peter
epeter at openjdk.org
Fri Oct 18 08:06:50 UTC 2024
On Thu, 17 Oct 2024 21:42:33 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> **Background**
>> I am introducing the `MemPointer`, for enhanced pointer parsing. For now, it replaces the much more limited `ArrayPointer` in `MergeStores` (see https://github.com/openjdk/jdk/pull/16245), but eventually it is supposed to be used widely in optimizations for pointer analysis: adjacency, aliasing, etc. I also plan to refactor the `VPointer` from auto-vectorization with it, and unlock more pointer patterns that way - possibly including scatter/gather.
>>
>> **Details**
>>
>> The `MemPointer` decomposes a pointer into the form `pointer = con + sum_i(scale_i * variable_i)` - a linear form with a sum of variables and scale-coefficients, plus some constant offset.
>>
>> This form allows us to perform aliasing checks - basically we can check if two pointers are always at a constant offset. This allows us to answer many questions, including if two pointers are adjacent. `MergeStores` needs to know if two stores are adjacent, so that we can safely merge them.
>>
>> More details can be found in the description in `mempointer.hpp`. Please read them when reviewing!
>>
>> `MemPointer` is more powerful than the previous `ArrayPointer`: the latter only allows arrays, the former also allows native memory accesses, `Unsafe` and `MemorySegment`.
>>
>> **Dealing with Overflows**
>>
>> We have to be very careful with overflows when dealing with pointers. For this, I introduced a `NoOverflowInt`. It allows us to do "normal" int operations on it, and tracks if there was ever an overflow. This way, we can do all overflow checks implicitly, and do not clutter the code with overflow-checks or - God forbid - forget overflow-checks.
>
> For me it is confusing to call `pointer = con + sum_i(scale_i * variable_i)` a "pointer" unless it is an Unsafe address, which has the base address as a constant. It is missing the base address. All our pointer types correspond to an address of some object in the Java heap, out of heap, a VM object, or some native (C heap) VM object.
> This looks like `address_offset`, `displacement`, ...
@vnkozlov thanks for looking at this!
I added some explanations and examples in the code now. But essentially, any `base` is just another `variable`, with `scale = 1`. For adjacency alone, it does not matter whether a variable is an offset or a base address. Of course, other aliasing analyses may care whether the pointer refers to an array. We can add such detection later, if we need it.
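
To make the "base is just another variable" point concrete, here is a minimal standalone sketch of the linear form `pointer = con + sum_i(scale_i * variable_i)` together with an adjacency check. It is not the `MemPointer` or `NoOverflowInt` code from the PR: `CheckedInt`, `LinearPointer`, `constant_distance`, `is_adjacent` and the demo function are hypothetical names chosen for illustration, variables are identified by strings rather than IR nodes, and the 16-byte array header in the example is an assumption.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical overflow-tracking 32-bit int (illustrative only).
// Once an operation overflows, the result is marked and the mark propagates.
class CheckedInt {
    int32_t value_ = 0;
    bool overflowed_ = false;

    static CheckedInt from_wide(int64_t wide, bool poisoned) {
        CheckedInt r;
        if (poisoned || wide < std::numeric_limits<int32_t>::min() ||
            wide > std::numeric_limits<int32_t>::max()) {
            r.overflowed_ = true;
        } else {
            r.value_ = static_cast<int32_t>(wide);
        }
        return r;
    }

public:
    CheckedInt() = default;
    CheckedInt(int32_t v) : value_(v) {}

    bool overflowed() const { return overflowed_; }
    int32_t value() const { return value_; }  // only meaningful if !overflowed()

    friend CheckedInt operator+(CheckedInt a, CheckedInt b) {
        return CheckedInt::from_wide(static_cast<int64_t>(a.value_) + b.value_,
                                     a.overflowed_ || b.overflowed_);
    }
    friend CheckedInt operator-(CheckedInt a, CheckedInt b) {
        return CheckedInt::from_wide(static_cast<int64_t>(a.value_) - b.value_,
                                     a.overflowed_ || b.overflowed_);
    }
};

// Hypothetical linear form: con + sum_i(scale_i * variable_i).
struct LinearPointer {
    CheckedInt con;
    std::map<std::string, CheckedInt> scales;  // variable name -> scale

    LinearPointer& add(const std::string& variable, CheckedInt scale) {
        scales[variable] = scales[variable] + scale;
        return *this;
    }
};

// Writes (b - a) to *distance and returns true if the two forms differ only by
// a constant, i.e. every variable has the same scale in both. Returns false if
// some scale differs or if any arithmetic overflowed.
bool constant_distance(const LinearPointer& a, const LinearPointer& b,
                       int32_t* distance) {
    std::map<std::string, CheckedInt> diff = b.scales;
    for (const auto& entry : a.scales) {
        diff[entry.first] = diff[entry.first] - entry.second;
    }
    for (const auto& entry : diff) {
        if (entry.second.overflowed() || entry.second.value() != 0) return false;
    }
    CheckedInt d = b.con - a.con;
    if (d.overflowed()) return false;
    *distance = d.value();
    return true;
}

// b is adjacent to a if it starts exactly where a's access ends.
bool is_adjacent(const LinearPointer& a, int32_t a_size_in_bytes,
                 const LinearPointer& b) {
    int32_t d = 0;
    return constant_distance(a, b, &d) && d == a_size_in_bytes;
}

// Byte stores to a[i] and a[i + 1], assuming a 16-byte array header.
// The array base is simply a variable with scale 1.
bool demo_adjacent_byte_stores() {
    LinearPointer p0; p0.con = 16; p0.add("base", 1).add("i", 1);
    LinearPointer p1; p1.con = 17; p1.add("base", 1).add("i", 1);
    return is_adjacent(p0, 1, p1);
}
```

In this sketch the check never asks whether `base` is an array, a native address, or anything else: it cancels out because it appears with the same scale on both sides. Two pointers with different bases leave a non-zero scale behind, so no constant distance is reported.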
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19970#issuecomment-2421757827