Howto replicate failure of 8254790?

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Oct 20 21:55:40 UTC 2020


Perfect, exactly the same as my local fix. I assigned the bug to you and will review your PR.

I am running tier1-3 testing and will let you know the results. Please wait before integrating.

Thanks,
Vladimir K

On 10/20/20 2:44 PM, Viswanathan, Sandhya wrote:
> Hi Vladimir,
> 
> I submitted a pull request an hour or so ago, as this was a P1 bug. Feel free to use it or ignore it.
> https://git.openjdk.java.net/jdk/pull/772
> 
> Best Regards,
> Sandhya
> 
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Tuesday, October 20, 2020 2:40 PM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Tatton, Jason <jptatton at amazon.com>; David Holmes <david.holmes at oracle.com>; hotspot-compiler-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; Hohensee, Paul <hohensee at amazon.com>
> Subject: Re: Howto replicate failure of 8254790?
> 
> Thank you, Sandhya
> 
> Very nice analysis.
> 
> I just finished running the dsig/GenerationTests.java test multiple times (to be sure) on our systems and confirmed your proposed fix:
> 
>      bsfl(ch, tmp);
> +  if (UseNewCode) {
> +    addptr(result, ch);
> +  } else {
>        addl(result, ch);
> +  }
> 
> It always fails with addl() and always passes with addptr(). I will assign the bug to myself and file a PR now.
> I will also fix the Unicode string index intrinsic code.
> 
> Thanks,
> Vladimir
> 
> 
> On 10/20/20 10:27 AM, Viswanathan, Sandhya wrote:
>> Hi Vladimir,
>>
>> I analyzed the instruction dump yesterday to find out where the issue is. I have attached it to the bug report as 8254790.asm:
>> https://bugs.openjdk.java.net/browse/JDK-8254790
>>
>> The crash is reported at:
>> 100:       450FB64C1810             movzx r9d, byte ptr [r8+rbx*1+0x10]
>>
>> This is just after the intrinsic and uses the rbx register (containing the index of the char from the intrinsic).
>>
>> RBX has the large value 0xfffffff900000008 instead of 8. The length of the string is 34 bytes. The match is found in the first 32 bytes at index 8.
>> After processing the first 32 bytes with the following instructions:
>> 6b:       C5FE6F13                 vmovdqu ymm2, ymmword ptr [rbx]
>> 6f:       C5ED74D1                 vpcmpeqb ymm2, ymm2, ymm1
>> 73:       C4E27D17C2               vptest ymm0, ymm2
>> 78:       0F8369000000             jnb 0xe7
>> control goes to 0xe7.
>>
>> The code snippet at 0xe7 is:
>> e7:       C5FDD7CA                 vpmovmskb ecx, ymm2
>> eb:       0FBCC1                   bsf eax, ecx
>> ee:       03D8                     add ebx, eax
>> f0:       482BDF                   sub rbx, rdi
>> f3:       0F1F4000                 nop dword ptr [rax], eax
>> f7:       413BDB                   cmp ebx, r11d
>> fa:       0F83DF290000             jnb 0x2adf
>> 100:       450FB64C1810             movzx r9d, byte ptr [r8+rbx*1+0x10]
>>
>> After vpmovmskb, the bit mask in ecx is 0x1100, showing the match at 8th and 9th byte.
>> At this point, register rbx must be holding the address of the base of the array, 0x00000007e41d2700, the same as rdi.
>> Bsf puts 8 in eax.
>> Then 8 is added to ebx instead of rbx using a 32-bit add, which zeroes the upper 32 bits, resulting in rbx = 0xe41d2708.
>> If the add had been a 64-bit add, everything would have worked.
>> Then sub rbx, rdi results in 0xe41d2708 - 0x00000007e41d2700 = 0xFFFFFFF900000008 being left in rbx.
>> This is the value we see at the crash.
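
The register arithmetic in this analysis can be replayed outside the JVM. The Python sketch below is only a model: the helper names (bsf, add32, add64, sub64) are made up for illustration, and the register values are the ones quoted in the analysis. It mimics the x86-64 rule that writing a 32-bit register zeroes the upper half of the 64-bit register.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def bsf(x):
    """Index of the lowest set bit, like x86 bsf (x must be nonzero)."""
    return (x & -x).bit_length() - 1

def add32(dst, src):
    """Model of `add r32, r32`: 32-bit result, zero-extended to 64 bits."""
    return (dst + src) & MASK32

def add64(dst, src):
    """Model of `add r64, r64`."""
    return (dst + src) & MASK64

def sub64(dst, src):
    """Model of `sub r64, r64` with 64-bit wraparound."""
    return (dst - src) & MASK64

rdi = 0x00000007E41D2700   # array base, as quoted in the analysis
rbx = rdi                  # rbx is assumed to still hold the base address
eax = bsf(0x1100)          # 0x1100 has bits 8 and 12 set; bsf returns the lowest

bad = sub64(add32(rbx, eax), rdi)    # add ebx, eax ; sub rbx, rdi
good = sub64(add64(rbx, eax), rdi)   # add rbx, rax ; sub rbx, rdi

print(hex(bad))       # 0xfffffff900000008
print(good)           # 8
print(bad & MASK32)   # 8
```

In this model the low 32 bits of the bad value are still 8, so a 32-bit compare such as the cmp ebx, r11d at f7 sees only that low half, while the movzx at 100 uses the full 64-bit rbx. That is consistent with the crash being reported at 100.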
>>
>> Best Regards,
>> Sandhya
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>> Sent: Tuesday, October 20, 2020 10:01 AM
>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Tatton,
>> Jason <jptatton at amazon.com>; David Holmes <david.holmes at oracle.com>;
>> hotspot-compiler-dev at openjdk.java.net; core-libs-dev at openjdk.java.net;
>> Hohensee, Paul <hohensee at amazon.com>
>> Subject: Re: Howto replicate failure of 8254790?
>>
>> Yes, I saw it too but I was not sure because we never hit the issue with the Unicode string index intrinsic.
>> Another thing is that we see the failure only on macOS.
>>
>> I would also like someone to decode the asm dump I provided in the bug to see the actual instructions where it happened.
>>
>> Vladimir K
>>
>> On 10/19/20 5:38 PM, Viswanathan, Sandhya wrote:
>>> Hi Jason,
>>>
>>> I think I found the problem looking at the error log from Vladimir Kozlov. In the stringL_indexof_char() function, the following snippet is the cause of the problem:
>>>
>>>   bind(FOUND_CHAR);
>>>   if (UseAVX >= 2) {
>>>     vpmovmskb(tmp, vec3);
>>>   } else {
>>>     pmovmskb(tmp, vec3);
>>>   }
>>>   bsfl(ch, tmp);
>>>   addl(result, ch);                         // <==== The problem is here
>>>
>>>   bind(FOUND_SEQ_CHAR);
>>>   subptr(result, str1);
>>>
>>> The line addl(result, ch) should have been addptr(result, ch).
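
To illustrate why the operand width matters here, a rough Python model of the two adds follows. The functions addl and addptr below are hypothetical stand-ins that only mimic a 32-bit and a pointer-width add on x86-64; they are not the HotSpot MacroAssembler methods, and low_base is a made-up address chosen for contrast. In this model the two variants agree when the pointer in result fits in 32 bits and diverge once it does not.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def addl(result, ch):
    # 32-bit add: the upper 32 bits of the destination come out as zero
    return (result + ch) & MASK32

def addptr(result, ch):
    # pointer-width (64-bit) add
    return (result + ch) & MASK64

def index_of(result, ch, str1, add):
    # FOUND_CHAR adds the bit index; FOUND_SEQ_CHAR subtracts the base
    return (add(result, ch) - str1) & MASK64

low_base = 0x00000000E41D2700    # hypothetical address below 4 GiB
high_base = 0x00000007E41D2700   # address above 4 GiB

for base in (low_base, high_base):
    print(hex(base),
          hex(index_of(base, 8, base, addl)),
          hex(index_of(base, 8, base, addptr)))
```

This is only arithmetic; it does not by itself establish why some systems never hit the failure.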
>>>
>>> The same problem exists in the Unicode string index of char intrinsic as well and needs to be fixed.
>>>
>>> Hope this helps.
>>>
>>> Best Regards,
>>> Sandhya
>>>
>>> -----Original Message-----
>>> From: hotspot-compiler-dev
>>> <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Vladimir
>>> Kozlov
>>> Sent: Thursday, October 15, 2020 3:59 PM
>>> To: Tatton, Jason <jptatton at amazon.com>; David Holmes
>>> <david.holmes at oracle.com>; hotspot-compiler-dev at openjdk.java.net;
>>> core-libs-dev at openjdk.java.net
>>> Subject: Re: Howto replicate failure of 8254790?
>>>
>>> Hi Jason,
>>>
>>> I added a dump of the surrounding instructions from the hs_err file we have so you can reconstruct the x86 assembly from it.
>>>
>>> If you look at si_addr: 0x00000000e41d2718, which caused the memory map
>>> failure, it looks like R8 = 0x00000007e41d2700 (which is an
>>> oop: [B) with the upper 32 bits zeroed. It seems the upper 32 bits of the address were cut.
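
The relation between si_addr and R8 can be checked numerically. This small Python sketch assumes the faulting access is the movzx r9d, byte ptr [r8+rbx*1+0x10] instruction and the rbx value 0xfffffff900000008 identified elsewhere in this thread; effective_address is an illustrative helper, not part of any tool.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def effective_address(base, index, scale, disp):
    """x86-64 address computation base + index*scale + disp, wrapped to 64 bits."""
    return (base + index * scale + disp) & MASK64

r8 = 0x00000007E41D2700    # R8 from the hs_err file
rbx = 0xFFFFFFF900000008   # corrupted index (assumed, taken from the later analysis)

addr = effective_address(r8, rbx, 1, 0x10)
print(hex(addr))                          # 0xe41d2718
print(addr == (r8 & MASK32) + 0x18)       # True
```

Under these assumptions the computed address equals the reported si_addr, which is R8 with its upper 32 bits cleared plus 0x18.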
>>>
>>> But I don't see how it can happen in stringL_indexof_char(). You correctly used the movptr() and addptr() instructions.
>>>
>>> Vladimir K
>>>
>>> On 10/15/20 2:10 PM, Tatton, Jason wrote:
>>>> Thanks, Vladimir and David. I have access to a new MacBook with an Intel i7-9750H (supports AVX2), so I will try on that.
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
>>>> Sent: 15 October 2020 20:25
>>>> To: David Holmes <david.holmes at oracle.com>; Tatton, Jason
>>>> <jptatton at amazon.com>; hotspot-compiler-dev at openjdk.java.net;
>>>> core-libs-dev at openjdk.java.net
>>>> Subject: RE: [EXTERNAL] Howto replicate failure of 8254790?
>>>>
>>>> Note, we have old Mac machines in our testing env:
>>>> cx8, cmov, fxsr, ht, mmx, 3dnowpref, sse, sse2, sse3, ssse3, sse4.1,
>>>> sse4.2, popcnt, lzcnt, tsc, tscinvbit, avx, avx2, aes, erms, clmul,
>>>> bmi1, bmi2, rtm, adx, fma, vzeroupper, clflush, clflushopt
>>>>
>>>> Use -XX:UseAVX=2
>>>>
>>>> But I was not able to reproduce the failure on my Skylake Linux machine even with -XX:UseAVX=2. Maybe there are other factors on macOS.
>>>>
>>>> Regards,
>>>> Vladimir K
>>>>
>>>> On 10/14/20 5:48 PM, David Holmes wrote:
>>>>> Hi Jason,
>>>>>
>>>>> On 15/10/2020 10:42 am, Tatton, Jason wrote:
>>>>>> Hi all,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am trying to replicate the failure of the tier2 test mentioned
>>>>>> in 8254790 <https://bugs.openjdk.java.net/browse/JDK-8254790> but I
>>>>>> am only seeing it pass on an x86 Linux machine. Are there any specific architectural constraints under which this test should be run in order to make it fail?
>>>>>
>>>>> It failed on a Mac, not Linux.
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>>
>>>>>>
>>>>>> I am running the test via: make test TEST="test/jdk/javax/xml/crypto/dsig/GenerationTests.java"
>>>>>>
>>>>>>
>>>>>>
>>>>>> Note that I am running the test against master without the commit:
>>>>>> "8254792: Disable intrinsic StringLatin1.indexOf until 8254790 is fixed" which disables the intrinsic that is causing the test to fail.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> --
>>>>>> Jason
>>>>>>

