RFR: 8260355: AArch64: deoptimization stub should save vector registers
Ningsheng Jian
njian at openjdk.java.net
Thu Jan 28 11:19:47 UTC 2021
On Thu, 28 Jan 2021 10:27:04 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:
>> This is an AArch64 port of the fix for JDK-8256056 "Deoptimization stub
>> doesn't save vector registers on x86". The problem is that a vector
>> produced by the Vector API may be stored in a register when the deopt
>> blob is called. Because the deopt blob only stores the lower half of
>> vector registers, the full vector object cannot be rematerialized during
>> deoptimization. So the following will crash on AArch64 with current JDK:
>>
>> make test TEST="jdk/incubator/vector" \
>> JTREG="VM_OPTIONS=-XX:+DeoptimizeALot -XX:DeoptimizeALotInterval=0"
>>
>> The fix is to store the full vector registers by passing
>> save_vectors=true to save_live_registers() in the deopt blob. Because
>> save_live_registers() places the integer registers above the floating
>> registers in the stack frame, RegisterSaver::r0_offset_in_bytes() needs
>> to calculate the SP offset based on whether full vectors were saved, and
>> whether those vectors were NEON or SVE, rather than using a static
>> offset as it does currently.
>>
>> The change to VectorSupport::allocate_vector_payload_helper() is
>> required because we only store the lowest VMReg slot in the oop map.
>> However unlike x86 the vector registers are always saved in a contiguous
>> region of memory, so we can calculate the address of each vector element
>> as an offset from the address of the first slot. X86 handles this in
>> RegisterMap::pd_location() but that won't work on AArch64 because with
>> SVE there isn't a unique VMReg corresponding to each four-byte physical
>> slot in the vector (there are always exactly eight logical VMRegs
>> regardless of the actual vector length).
>>
>> Tested hotspot_all_no_apps and jdk_core.
>
> src/hotspot/share/prims/vectorSupport.cpp line 138:
>
>> 136:
>> 137: for (int i = 0; i < num_elem; i++) {
>> 138: bool contiguous = X86_ONLY(false) NOT_X86(true);
>
> I don't like this change. It's not x86-specific, but SVE-specific code. What is broken here is `VMReg::next()` doesn't work properly for `VecA` registers. And, as a result, it makes `RegisterMap::location(VMReg)` unusable as well.
>
> So, a proper fix should address that instead. If there's no way to save `VMReg::next()` and `RegisterMap::location(VMReg)`, then new cross-platform API should be introduced and `VectorSupport::allocate_vector_payload_helper()` migrated to it.
For Arm NEON (and PPC) we don't set VMReg::next() in oopmap either, and their vector slots are contiguous, so that's x86-specific? But yes, NEON can also generate correct full oopmap as for fixed vector size. For SVE, I have no idea to have proper VMReg::next() support, so Nick's solution looks good to me. Regarding to introducing new cross-platform API, which API do you mean? If we could have some better api, that would be perfect. Currently, allocate_vector_payload_helper() is the only one I can see that is vector related for RegisterMap::location() call.
-------------
PR: https://git.openjdk.java.net/jdk/pull/2279
More information about the hotspot-dev
mailing list