RFR: 8256215: Shenandoah: re-organize saving/restoring machine state in assembler code
Roman Kennke
rkennke at openjdk.java.net
Wed Jan 27 12:54:47 UTC 2021
On Tue, 17 Nov 2020 21:42:52 GMT, Roman Kennke <rkennke at openjdk.org> wrote:
>> $ CONF=linux-x86-server-fastdebug make images run-test TEST=compiler/c1/Test6855215.java TEST_VM_OPTS="-XX:+UseShenandoahGC"
>>
>> # Internal Error (/home/shade/trunks/jdk/src/hotspot/cpu/x86/assembler_x86.cpp:3047), pid=1427307, tid=1427311
>> # Error: assert(VM_Version::supports_sse2()) failed
>>
>> V [libjvm.so+0x53f9e8] Assembler::movsd(Address, XMMRegisterImpl*)+0x168
>> V [libjvm.so+0x14647bd] save_xmm_registers(MacroAssembler*)+0x9d
>> V [libjvm.so+0x1465d8f] ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler*, RegisterImpl*, Address, ShenandoahBarrierSet::AccessKind)+0x91f
>>
>> This only affects x86_32, as x86_64 uses at least UseSSE >= 2 at all times.
>>
>> Additional testing:
>> - [ ] `tier1`, Linux x86_64 `-XX:+UseShenandoahGC`
>> - [ ] `tier1`, Linux x86_32 `-XX:+UseShenandoahGC`
>> - [ ] `tier1`, Linux x86_32 `-XX:+UseShenandoahGC -XX:UseSSE=0`
>> - [ ] `tier1`, Linux x86_32 `-XX:+UseShenandoahGC -XX:UseSSE=1`
>
> src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 127:
>
>> 125: __ get_thread(thread);
>> 126: #endif
>> 127: assert_different_registers(src, dst, count, thread);
>
> Take a look at methodHandles_x86.cpp, there is pushing/popping code like this:
> #ifdef _LP64
>   __ movdbl(Address(rsp, 0), xmm0);
> #else
>   if (UseSSE >= 2) {
>     __ movdbl(Address(rsp, 0), xmm0);
>   } else if (UseSSE == 1) {
>     __ movflt(Address(rsp, 0), xmm0);
>   } else {
>     __ fst_d(Address(rsp, 0));
>   }
> #endif // LP64
>
> IOW, it also has a branch for no SSE at all. :-)
>
> BTW, I am almost certain that we only ever need to save/restore xmm0. The relevant code is (afaict) only ever called by the interpreter (which only uses xmm0) and methodHandles (which also only seems to care about xmm0 - see the various code sequences there 'save FP result'). Grep for load_heap_oop(), that gives all relevant entries. But that is outside the scope of this patch.
To expand a little on that last comment: we introduced the xmm save/restore code when we still had read barriers (RBs) and write barriers (WBs), and we needed a WB in front of C1's CAS-obj. That WB would interfere with C1 keeping FP values in xmm registers across it. With the move to the load reference barrier (LRB), this is no longer necessary: all remaining calls into the asm LRB routines come from the interpreter and thus only need to save/restore xmm0 (which could reasonably be folded into the save/restore of the other registers, saving one add/sub each). The C1 LRB stub calls C1_MacroAssembler::save_live_registers_no_oop_map(), which is already smart about which registers to save in the context of C1.
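
For illustration, the branch structure of the methodHandles_x86.cpp snippet quoted above can be modelled as a small standalone function. This is only a sketch, not HotSpot code: FpSaveInsn, select_fp_save() and insn_name() are hypothetical names, and the is_64bit / use_sse parameters stand in for the _LP64 macro and the UseSSE flag.

```cpp
#include <cassert>
#include <string>

// Hypothetical model (not HotSpot code) of which store instruction the
// quoted methodHandles_x86.cpp snippet emits to save the FP result.
enum class FpSaveInsn { movdbl, movflt, fst_d };

// is_64bit stands in for _LP64, use_sse for the UseSSE flag.
FpSaveInsn select_fp_save(bool is_64bit, int use_sse) {
  assert(use_sse >= 0);
  if (is_64bit || use_sse >= 2) {
    return FpSaveInsn::movdbl;  // SSE2 available: store xmm0 as a double
  }
  if (use_sse == 1) {
    return FpSaveInsn::movflt;  // SSE1 only: store xmm0 as a float
  }
  return FpSaveInsn::fst_d;     // no SSE: store the x87 top-of-stack
}

// Human-readable name, convenient for logging or tests.
std::string insn_name(FpSaveInsn insn) {
  switch (insn) {
    case FpSaveInsn::movdbl: return "movdbl";
    case FpSaveInsn::movflt: return "movflt";
    case FpSaveInsn::fst_d:  return "fst_d";
  }
  return "unknown";
}
```

With this model, select_fp_save(false, 0) picks fst_d, which is the branch the failing save_xmm_registers() path lacks when it calls movsd unconditionally on x86_32.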
-------------
PR: https://git.openjdk.java.net/jdk/pull/1172