[9] RFR(S): 8148490: RegisterSaver::restore_live_registers() fails to restore xmm registers on 32 bit
Tobias Hartmann
tobias.hartmann at oracle.com
Tue Feb 2 18:00:06 UTC 2016
Thanks, Vladimir.
Best,
Tobias
On 02.02.2016 18:59, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 2/2/16 12:19 AM, Tobias Hartmann wrote:
>> Hi Michael,
>>
>> On 01.02.2016 19:50, Berg, Michael C wrote:
>>> Tobias, since it works everywhere now, let us proceed.
>>> The code is ok in its current form.
>>
>> Okay, here is the new webrev, including the fixed asserts:
>> http://cr.openjdk.java.net/~thartmann/8148490/webrev.01/
>>
>> On 29.01.2016 20:39, Vladimir Kozlov wrote:
>>> Originally we could have vectors even with only 64bit XMM registers. MaxVectorSize and UseAVX can be set on command line - what happens in such case? No vectorization?
>>
>> Yes, we hit the assert if MaxVectorSize < 64 is set. This is fixed with above's changes.
>>
>> Thanks,
>> Tobias
>>
>>>
>>> Thanks,
>>> Michael
>>>
>>> -----Original Message-----
>>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com]
>>> Sent: Monday, February 01, 2016 3:33 AM
>>> To: Berg, Michael C; Vladimir Kozlov
>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: [9] RFR(S): 8148490: RegisterSaver::restore_live_registers() fails to restore xmm registers on 32 bit
>>>
>>> Thanks Vladimir and Michael, for the reviews.
>>>
>>> Michael wrote (off-thread):
>>>> Tobias, I might take the restore code a different direction. Would it be ok if I handed you code back which has an offset based approach that makes the direction of restores no longer dependent upon the current stack?
>>>
>>> I will wait for his changes and then send out a new RFR.
>>>
>>> Best,
>>> Tobias
>>>
>>> On 29.01.2016 23:28, Berg, Michael C wrote:
>>>> Tobias/Vladimir:
>>>>
>>>> I would change the two asserts to in the 64bit code to make the check clear:
>>>>
>>>> assert(UseAVX > 0, "up to 512bit vectors are supported with EVEX");
>>>> assert(MaxVectorSize <= 64, "up to 512bit vectors are supported
>>>> now");
>>>>
>>>> As for testing with the patch applied to hotspot on a current jdk(01-29-16):
>>>>
>>>> Windows sde 32-bit: skx - pass, also ran and passed part of
>>>> specjvm2008 Windows 32-bit: hsw - pass, also ran and passed all of
>>>> specjvm2008 Windows sde 64-bit: skx - pass, also ran and passed part
>>>> of specjvm2008 Windows 64-bit: hsw -pass, also ran and passed all of
>>>> specjvm2008 : caveat Linux on skx: 32-bit - pass, also ran and passed
>>>> all of specjvm2008 Linux on skx:64-bit - pass, also ran and passed all
>>>> of specjvm2008
>>>>
>>>> We should proceed with checkin in the changelist after the usual testing.
>>>>
>>>> Note: The above tests were done with the asserts changed on windows
>>>> only. The 64bit changes are mostly cosmetic. It's the change to the additional_frame_bytes that makes it correct, we used equivalent constants in the stack adjustment beforehand, they had not been mapped to the movdqu for the non-vector case for a few iterations on the file. Early on I did have that code though.
>>>>
>>>> Caveat: xml.transform fails with the changelist and without, I checked this against a 12-21-15 built jdk which is 1 month old, so we have a new bug that is causing this app to fail as well (on windows for 64bit) on hsw.
>>>> I checked recent jbs traffic, the occurrence does not appear to be tracked at this time.
>>>>
>>>> -Michael
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Friday, January 29, 2016 11:40 AM
>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>> Cc: Berg, Michael C
>>>> Subject: Re: [9] RFR(S): 8148490:
>>>> RegisterSaver::restore_live_registers() fails to restore xmm registers
>>>> on 32 bit
>>>>
>>>> Tobias, please verify that 64-bit code works correctly.
>>>> About 32-bit code.
>>>>
>>>> Please verify correctness of next asserts:
>>>>
>>>> assert(UseAVX > 0, "512bit vectors are supported only with EVEX");
>>>> assert(MaxVectorSize == 64, "only 512bit vectors are supported
>>>> now");
>>>>
>>>> Originally we could have vectors even with only 64bit XMM registers.
>>>> MaxVectorSize and UseAVX can be set on command line
>>>> - what happens in such case? No vectorization?
>>>>
>>>> May be it is done because we save whole 128bit XMM always. Still MaxVectorSize == 64 condition is strange.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 1/29/16 6:16 AM, Tobias Hartmann wrote:
>>>>> Hi,
>>>>>
>>>>> please review the following patch:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8148490
>>>>> http://cr.openjdk.java.net/~thartmann/8148490/webrev.00/
>>>>>
>>>>> RegisterSaver::save_live_registers() and RegisterSaver::restore_live_registers() are used by the safepoint handling code to save and restore registers. The following code is emitted to save and restore XMM/YMM registers on 32 bit:
>>>>>
>>>>> Save:
>>>>> ...
>>>>> 0xf34ca12e: vmovdqu %xmm0,0xb0(%esp)
>>>>> 0xf34ca137: vmovdqu %xmm1,0xc0(%esp)
>>>>> ...
>>>>> 0xf34ca16d: vmovdqu %xmm7,0x120(%esp)
>>>>> 0xf34ca176: sub $0x80,%esp
>>>>> 0xf34ca17c: vextractf128 $0x1,%ymm0,(%esp)
>>>>> 0xf34ca183: vextractf128 $0x1,%ymm1,0x10(%esp)
>>>>> ...
>>>>> 0xf34ca1b3: vextractf128 $0x1,%ymm7,0x70(%esp)
>>>>> ...
>>>>>
>>>>> Restore:
>>>>> ...
>>>>> 0xf34ca202: vinsertf128 $0x1,(%esp),%ymm0,%ymm0
>>>>> 0xf34ca209: vinsertf128 $0x1,0x10(%esp),%ymm1,%ymm1
>>>>> ...
>>>>> 0xf34ca239: vinsertf128 $0x1,0x70(%esp),%ymm7,%ymm7
>>>>> 0xf34ca241: add $0x80,%esp
>>>>> 0xf34ca247: vmovdqu 0x130(%esp),%xmm0
>>>>> 0xf34ca250: vmovdqu 0x140(%esp),%xmm1
>>>>> ...
>>>>> 0xf34ca286: vmovdqu 0x1a0(%esp),%xmm7
>>>>> ...
>>>>>
>>>>> The stack offsets for the vmovdqu instructions are wrong, causing the XMM registers to contain random values after a safepoint. The problem is that "additional_frame_bytes" is added to the stack offset although the stack pointer is incremented just before:
>>>>>
>>>>> 283 __ addptr(rsp, additional_frame_bytes); // Save upper half of YMM registers
>>>>>
>>>>> The regression test fails with "Test failed: array[0] = 1973.0 but should be 10.000" because the vectorized loop returns a wrong result.
>>>>>
>>>>> I spotted and fixed the following other problems:
>>>>> - the vmovdqu instructions should be emitted before restoring YMM and
>>>>> ZMM because they zero the upper part of the XMM registers (i.e.
>>>>> YMM/ZMM)
>>>>> - if 'UseAVX > 2' is set/available, we save the ZMM registers as well
>>>>> but we do not increment 'additional_frame_words' accordingly (we need
>>>>> another 8*32 bytes of stack space)
>>>>>
>>>>> Unfortunately, I don't have access to a CPU with the AVX-512 instruction set to test the "UseAVX > 2" related changes. Michael, could you verify the changes?
>>>>>
>>>>> The problems were introduced by the fix for JDK-8142980.
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
More information about the hotspot-compiler-dev
mailing list