RFR: 8300258: C2: vectorization fails on simple ByteBuffer loop [v2]

Thu Mar 2 10:37:10 UTC 2023

On Thu, 2 Mar 2023 09:53:35 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>>> Thanks for the comments @eme64 @vnkozlov Looking at the code again, if `vectors_should_be_aligned()` is true, if `create_pack` is false, the current code removes every memops and already created packset with `same_velt_type()` true: That can't be motivated by a correctness issue. So I suppose we want to preserve that behavior. Wouldn't we need the change of the last commit I pushed then?
>> 
>> I think the reason we used `same_velt_type` was that we were confused. Or maybe we did that before we had memory slices, and using `same_velt_type` was at least already an improvemnt? At any rate: it was confused and leads to Bugs in conjunction with `Unsafe`, as my example showed.
>> 
>> Keeping `same_velt_type` will probably not harm much, but be more restrictive than neccessary.
>> It will not harm much because `velt_type == memory_slice` as long as we are not using `Unsafe`.
>> And when we do use `Unsafe`, we probably do not use it in very wild ways.
>> 
>> One "wild" use might be something like this:
>> 
>> void test(int[] iarr, float[] farr) {
>>     // cyclic dependency -> not vectorized
>>     in v1 = (int)Unsafe.LoadF(iarr, i);  // assume this to be best
>>     Unsafe.StoreI(iarr, i + 1);
>>     // separate slice -> could be vectorized
>>     Unsafe.StoreI(farr, i) = Unsafe.LoadI(farr, i);   // on different slice as best, but have same velt_type -> rejected
>>     // We end up vectorizing nothing, even though we could vectorize the farr
>> }
>
>> Keeping `same_velt_type` will probably not harm much, but be more restrictive than neccessary.
> 
> It's quite possible that it's over conservative. What this change is trying to achieve is to relax checks so a pattern that's known to be used in the core libraries optimizes better. That pattern only optimizes for misaligned accesses. So it does seem wrong that those architectures that don't allow misaligned accesses are affected. Also, this code is complicated so it certainly feels safer to me to be on the safe side even if it feels too restrictive. Going forward refactoring this would be nice. I gave it a quick try and it was more complicated than I expected.
> 
> I also don't think we should spend too much time making sure every possible combinations of unsafe accesses optimize well or even  correctly if it's too much work. Once people start using unsafe, they are on their own. I think we should stick with whatever feels reasonable or is used in the core libraries (hopefully the second category is included in the first category).
> 
> What do you think @vnkozlov ?

@rwestrel that sounds ok to me.
BTW I am refactoring the code around your changes, because multiple bugs https://github.com/openjdk/jdk/pull/12350.
I think the code was so badly structured that more and more bugs snuck in.

-------------

PR: https://git.openjdk.org/jdk/pull/12440