RFR: 8294729: [s390] Implement nmethod entry barriers
Tyler Steele
tsteele at openjdk.org
Wed Oct 26 20:00:41 UTC 2022
On Wed, 19 Oct 2022 18:10:49 GMT, Tyler Steele <tsteele at openjdk.org> wrote:
>> src/hotspot/cpu/s390/stubGenerator_s390.cpp line 2876:
>>
>>> 2874: __ z_lay(Z_R1_scratch, -32, Z_R0, Z_R14); // R1 <- R14 - 32
>>> 2875: __ z_stg(Z_R1_scratch, _z_abi(carg_2), Z_R0, Z_SP); // SP[abi_carg2] <- R1
>>> 2876: __ z_la(Z_ARG1, _z_abi(carg_2), Z_R0, Z_SP); // R2 <- SP + abi_carg2
>>
>> Z_ARG1 should point to the address _z_abi16(return_pc) + Z_SP in the caller frame. (Don't generate a copy!) That matches _z_abi16(return_pc) + current frame size + Z_SP in the current frame at this point.
>> In addition, I'm missing save_volatile_gprs & restore_volatile_gprs for GP and FP regs. I think they should get saved directly before you use Z_ARG1 for the return pc address and restored after the call_VM_leaf + z_ltr(Z_RET, Z_RET), which needs to be moved before the restoration. Note that this will need extra stack space: (5 + 8) * BytesPerWord.
>> (See `MacroAssembler::verify_oop` for reference, but note that you don't need to include_flags which reduces complexity.)
>
>> Z_ARG1 should point to the address _z_abi16(return_pc) + Z_SP in the caller frame.
>
> This matches what the PPC implementation does, but when I do the same thing on s390 I get a CodeCache miss in nmethod_stub_entry_barrier (the VM call). It looked as though CodeCache::find_blob expects the address of the start of the compiled code, so I tried subtracting the size of the barrier from R14 (which currently points to the end of the barrier in the compiled frame). After doing this, I no longer saw the CodeCache miss.
>
>> I'm missing save_volatile_gprs & restore_volatile_gprs for GP and FP regs.
>
> I had been trying to get the volatile registers saved, but didn't have any luck. I tried it again today with your suggestions and it worked like a charm. I'm not sure what the difference was. Thanks for the pointers.
After a bit more investigation, I believe I see the reasoning behind using the suggested computation for the VM-Call's argument. It seems the CodeCache miss is the issue, not the argument, so I am focusing my efforts on understanding what is happening there.
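
For reference, here is a minimal sketch of the address arithmetic from the review comment. It is not the stub code itself. The names `kBytesPerWord`, `kReturnPcOffset`, `return_pc_slot`, and `volatile_save_area_bytes` are hypothetical stand-ins, and the offset of `return_pc` inside the ABI area is an assumption made only for illustration. The real values come from the s390 frame definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration only; the real constants are
// defined in HotSpot's s390 frame and globals headers.
constexpr std::size_t kBytesPerWord   = 8;  // 64-bit words
constexpr std::size_t kReturnPcOffset = 8;  // ASSUMED offset of return_pc in the ABI area

// Address of the return_pc slot in the caller frame, expressed relative to
// the current SP: current SP + current frame size gives the caller's SP,
// and the slot sits at the return_pc offset from there.
constexpr std::uintptr_t return_pc_slot(std::uintptr_t current_sp,
                                        std::size_t current_frame_size) {
  return current_sp + current_frame_size + kReturnPcOffset;
}

// Extra stack space quoted in the review for saving the volatile
// registers: (5 + 8) * BytesPerWord.
constexpr std::size_t volatile_save_area_bytes() {
  return (5 + 8) * kBytesPerWord;
}

// Small self-check with made-up numbers (SP and frame size are arbitrary).
inline void self_check() {
  assert(return_pc_slot(0x1000, 160) == 0x1000 + 160 + 8);
  assert(volatile_save_area_bytes() == 104);
}
```

The point of passing the slot's address rather than a copy of R14 is that the callee then sees the same memory location the caller frame uses for its return pc.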
-------------
PR: https://git.openjdk.org/jdk/pull/10558