[x86_64 AVX2] weird crash due to RAX in String.compareTo(Object)

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Thu Mar 12 09:30:15 UTC 2020


Hi Volker,

> The other strange thing is that we had about 6 different processes
> (running for up to 100 days and more) all crashing within the same
> minute in this very same code on the same host. The hs_err file Xin
> posted was the simplest one. The others all claim to crash at the same
> instruction within the 'compareTo' intrinsic, but looking at the
> registers, they don't even point into character arrays any more (see
> for example http://cr.openjdk.java.net/~simonis/tmp/hs_err.log).

Even though str1/str2 don't point to char[] anymore, the symptoms look
very similar:

str1 = RDI=0x00000000f5707164 is pointing into object: 0x00000000f5707150
com.amazon.alf.internal.google.common.collect.SingletonImmutableList

str2 = RSI=0x00000000f5706ff4 is pointing into object: 0x00000000f5706fb8
alfbus.config.BusNodeConfig
  - klass: 'alfbus/config/BusNodeConfig'

cnt1 = RCX=0x0000000000000008
cnt2 = RDX=0x0000000000000010

result = RAX=0xffffffff00000036

Top of Stack:
0x00007f1afddcc7d8:   0000000000000000

vs. the hs_err file Xin posted:

str1 = RDI=0x00000000fe711e44 is pointing into object: 0x00000000fe711dc0
[C
  - klass: {type array char}
  - length: 70

str2 = RSI=0x00000000fe71487c is pointing into object: 0x00000000fe7147f8
[C
  - klass: {type array char}
  - length: 70

cnt1 = RCX=0x0000000000000008
cnt2 = RDX=0x0000000000000010

result = RAX=0xffffffff00000036

Top of Stack:
0x00007ff9c0752588:   0000000000000000

So, it means str1/str2 are corrupted at some point as well and can't be 
trusted.
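
RAX, on the other hand, is identical in both dumps. A minimal Java sketch
of the address arithmetic (class and method names are mine and purely
illustrative; the register values are copied from Xin's hs_err file)
reproduces the faulting address from the original report and shows that
only the upper 32 bits of RAX look wrong:

```java
class FaultAddress {
    // Effective address of "vmovdqu ymm0, ymmword ptr [rdi + rax*2]".
    // Java long arithmetic wraps modulo 2^64, like the address computation.
    static long effectiveAddress(long base, long index) {
        return base + (index << 1);
    }

    public static void main(String[] args) {
        long rax = 0xffffffff00000036L; // result register from the hs_err file
        long rdi = 0x00000000fe711e44L; // str1 pointer from the hs_err file

        // Faulting address reported in the crash
        System.out.println(Long.toHexString(effectiveAddress(rdi, rax))); // fffffffefe711eb0

        // RAX read as a signed 64-bit value
        System.out.println(rax); // -4294967242

        // Low and high 32-bit halves of RAX
        System.out.println((int) rax);                    // 54
        System.out.println(Long.toHexString(rax >>> 32)); // ffffffff

        // Byte offset of RDI into its char[] object
        System.out.println(rdi - 0x00000000fe711dc0L); // 132
    }
}
```

The low half (0x36 = 54) would at least be in range for 70-character
strings; the high half is all ones.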

> I found one difference between 8u and jdk which was introduced in jdk9
> with "8154896: xml.transform fails intermittently on SKX"
> (https://bugs.openjdk.java.net/browse/JDK-8154896). It changed some
> short branches to normal ones (e.g. 'jccb' to 'jcc') but I must
> confess, that I don't really understand the explanation for the
> change:
> 
> "There is a guarantee of isBit(imm8) for jccb which can sometimes fail
> when upper bank marshaling is required for instructions without EVEX
> or conditionally EVEX support, as the side effect code can push us
> over the imm8 limit." I don't see a guarantee in jccb, just an
> assertion which checks for the correct length.
> 
> Do you know why that change was necessary and do you think it can be a
> reason why we see these errors?

As I understand the description, on EVEX-capable (AVX512) systems some
instructions became EVEX-encoded, which increased code size (4-byte
EVEX prefix vs 2-/3-byte VEX prefix), and some offsets became too large
to be encoded with short jumps.
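
To illustrate the arithmetic (a hypothetical sketch, not HotSpot code: the
helper names and the 120-byte starting displacement are made up), a short
branch only has a signed 8-bit displacement, so a few widened instructions
between the branch and its target are enough to overflow it:

```java
class ShortJumpRange {
    // A short conditional jump encodes its displacement in one signed byte.
    static boolean fitsInImm8(int displacement) {
        return displacement >= Byte.MIN_VALUE && displacement <= Byte.MAX_VALUE;
    }

    // Hypothetical model: each widened instruction between the branch and
    // its target adds extraBytesPerInsn bytes to the displacement.
    static int grownDisplacement(int displacement, int widenedInsns, int extraBytesPerInsn) {
        return displacement + widenedInsns * extraBytesPerInsn;
    }

    public static void main(String[] args) {
        int vexDisplacement = 120; // assumed distance with VEX-encoded instructions
        // Assume 4 instructions go from a 2-byte VEX prefix to a 4-byte EVEX prefix.
        int evexDisplacement = grownDisplacement(vexDisplacement, 4, 2);

        System.out.println(fitsInImm8(vexDisplacement));  // true
        System.out.println(evexDisplacement);             // 128
        System.out.println(fitsInImm8(evexDisplacement)); // false
    }
}
```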

The crash you are seeing may be explained by it, but Liu mentioned 
Broadwell which shouldn't be affected (no AVX512 present).

Also, it's surprising to see multiple crashes after 100 days of uptime.

Best regards,
Vladimir Ivanov

> 
> Thank you and best regards,
> Volker
> 
>> The only plausible explanation I have is there's patching of String
>> backing char array happening and it breaks the intrinsic which doesn't
>> expect any concurrent modifications (and there shouldn't be any).
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> RSI=0x00000000fe71487c is pointing into object: 0x00000000fe7147f8
>> [C
>>    - klass: {type array char}
>>    - length: 70
>> RDI=0x00000000fe711e44 is pointing into object: 0x00000000fe711dc0
>> [C
>>    - klass: {type array char}
>>    - length: 70
>>
>> (lldb) p 0x00000000fe71487c - 0x00000000fe7147f8
>> (unsigned int) $0 = 132
>> (lldb) p 0x00000000fe711e44 - 0x00000000fe711dc0
>> (unsigned int) $2 = 132
>>
>>> On 3/9/20, 1:00 AM, "hotspot-compiler-dev on behalf of Liu, Xin" <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of xxinliu at amazon.com> wrote:
>>>
>>>       Hi,
>>>
>>>       I got some crash reports of the C2-generated method String.compareTo(Object) on x86_64. This method is an intrinsic and is defined in MacroAssembler::string_compare (macroAssembler_x86.cpp).
>>>       Yes, one interesting fact is the problem only happens on the bridge method compareTo(Object), deriving from the interface Comparable<String>.
>>>
>>>       So far, I only see crashes in jdk8u because newer JDKs use AVX3 version by default, but I read the tip of jdk and AVX2 version is still the same. My concern is the bug is still there.  Have you seen this problem before?
>>>
>>>       I found they all crash at an AVX instruction "0x00007ffb0d830235 vmovdqu ymm0, ymmword ptr [rdi + rax*2]", where RAX=0xffffffff00000036, RDI=0x00000000fe711e44.
>>>       The JVM got SIGSEGV because of an access violation. The faulting address is 0xfffffffefe711eb0, which is exactly (rax*2 + rdi). It looks like result (rax) has overflowed: read as a signed 64-bit value, it is -4294967242.
>>>
>>>       AVX2 version comes from JDK-8005419. By changing the method signature a little bit in Test8005419.java, we can get String.compareTo(Object) AVX2 version as string_compare.S.
>>>       diff --git a/src/hotspot/test/compiler/8005419/Test8005419.java b/src/hotspot/test/compiler/8005419/Test8005419.java
>>>       index 201153e8a..1f8c57097 100644
>>>       --- a/src/hotspot/test/compiler/8005419/Test8005419.java
>>>       +++ b/src/hotspot/test/compiler/8005419/Test8005419.java
>>>       @@ -114,7 +114,7 @@ public class Test8005419 {
>>>                System.out.println("PASSED");
>>>            }
>>>
>>>       -    private static int test(String str1, String str2) {
>>>       +    private static int test(Comparable<String>str1, String str2) {
>>>                return str1.compareTo(str2);
>>>            }
>>>        }
>>>
>>>       Because it's an intrinsic, there's no code shape issue, right? I can't figure out how RAX becomes 0xffffffff00000036. I attached the original error message. According to RSI and RDI, the method was comparing two 70-length strings.
>>>       Test.java permutes all cases of two 70-length strings. Why can't I still hit this problem? Am I still missing something?
>>>
>>>       Thanks in advance.
>>>       --lx