RFR: 8202381: (Solaris) SIGBUS in # V [libjvm.so+0xcee494] jni_GetIntField+0x224

Daniel D. Daugherty daniel.daugherty at oracle.com
Mon Apr 30 15:36:14 UTC 2018


On 4/30/18 9:32 AM, Erik Österlund wrote:
> Hi,
>
> It looks like I broke jniFastGetField on SPARC with my changes to 
> modularize this feature (8200235). Sorry about that. And the backout 
> of that (8202386) didn't go in due to maintenance. So here is a patch 
> to fix the problem instead.
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8202381/webrev.00/

src/hotspot/cpu/aarch64/gc/shared/barrierSetAssembler_aarch64.cpp
src/hotspot/cpu/aarch64/gc/shared/barrierSetAssembler_aarch64.hpp
src/hotspot/cpu/aarch64/jniFastGetField_aarch64.cpp
     Addition of the placeholder 'jni_env'; you went to the trouble
     of changing 'robj' -> 'obj' on the other platforms, but not here
     which is inconsistent.

src/hotspot/cpu/sparc/gc/shared/barrierSetAssembler_sparc.cpp
src/hotspot/cpu/sparc/gc/shared/barrierSetAssembler_sparc.hpp
     Addition of the placeholder 'jni_env'; why rename 'robj' -> 'obj'?

src/hotspot/cpu/sparc/jniFastGetField_sparc.cpp
     Addition of the placeholder 'jni_env'; why rename 'robj' -> 'obj'?

     L74:   // Both O5 and G3 are clobbered by 
try_resolve_jobject_in_native.
         You are passing 'G3' as 'tmp'. I don't see where/how in
         try_resolve_jobject_in_native() that the 'tmp' parameter gets
         clobbered.

src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp
src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.hpp
     Addition of the placeholder 'jni_env'; why rename 'robj' -> 'obj'?

src/hotspot/cpu/x86/jniFastGetField_x86_64.cpp
     L50: static const Register rtmp          = r8;
         Nit: These seem to be sorted by register number and not by
              variable name.

     Addition of the placeholder 'jni_env'; why rename 'robj' -> 'obj'?


Thumbs up on the technical part of this change. My comments are
just about style and about the comments. You should make sure
that you get a review from either Kim or David H. since they
were on the original review team for JDK-8200235.

Dan


>
> Bug ID:
> https://bugs.openjdk.java.net/browse/JDK-8202381
>
> My (long version) analysis on the architectures (that I made changes 
> to) of what is going on:
>
> == SPARC ==
>
> This is where the SIGBUS happened. We used to get the jobject/jweak in 
> O1, clear the jweak tag in O1, and resolve the handle by loading from 
> O1 to O5. This was in fact already broken, since 8176100 "[REDO][REDO] 
> G1 Needs pre barrier on dereference of weak JNI handles". The 
> registers used to pass the arguments into the C function (O0, O1 and 
> O2) must not be touched throughout the call, despite being caller 
> saved, as we have not saved the register window. By clearing the jweak 
> in O1, the wrong access functions are used in the slow path if the 
> speculative load is invalidated, either due to the second safepoint 
> counter not being able to the first sampled safepoint counter, or the 
> load of the int popping in to the signal handler. I made this problem 
> worse in my 8200235 patch by not just clearing O1, but also resolving 
> into O1, which clobbered the register even further. This caused 
> intermittent failures in the following sequence of events: 1) the 
> safepoint counter is read, and we are not in a safepoint, 2) the 
> jobject/jweak is resolved, clobbering O1, 3) a safepoint is triggered, 
> 4) after resolving the jobject/jweak, the safepoint counter is sampled 
> again with a new number, causing a call to the slow path where O1 no 
> longer contains the jobject/jweak.
>
> In my proposed changes, this was remedied by moving O1 to O5 first, 
> then clearing the jweak value (in O5), then resolving (in O5) and then 
> performing the speculative load with all passed C arguments untouched 
> in their O1, O2 and O3 registers, as required for the fallback code.
>
> == x86_64 ==
>
> On x86_64, the passed in jobject/jweak is already moved to a separate 
> scratch register. Therefore, the problem of clearing the jweak tag in 
> the jobject does not apply here. However, rscratch1 was the register I 
> picked as a temporary register for resolving. After reading this code 
> more carefully, I realize this is the r10 register, which coincides 
> with the register used to remember the old counter. So using that 
> temporary register is not safe. I chose a new register instead: r8. 
> This is, like the other registers used in this code, a caller saved 
> register, that does not intersect with the input registers of the 
> function arguments. That makes it safe to use in this context. I added 
> DEBUG_ONLY code for clobbering this temporary register on x86_64, to 
> make sure we more easily catch such problems.
>
> == AArch64 ==
>
> On AArch64, the jobject/jweak is already in r3 when cleared (which 
> does not intersect with the registers used in the C calling 
> convention, and is caller saved). Therefore, the clearing of the jweak 
> tag there is correct, and so is the resolution. The temporary register 
> used is r8, which is used as a temporary register by surrounding code 
> in this context (and is caller saved).
>
> Apart from this, I also added a jni_env parameter to 
> try_resolve_jobject_in_native, which just passes c_rarg0 containing 
> the jni environment, without doing anything to it. We need to read 
> from this register in ZGC, and I would prefer not to hardcode in the 
> backend that c_rarg0 always refers to the jni environment in this 
> context.
>
> There are still remaining problems in ARM that the jweak tag is 
> cleared in r1, which will cause the slow path to resolve jweaks as 
> jobjects. However, I would like to not address that issue in this fix, 
> as my original changes did not touch that ARM code. I would like to 
> instead file a bug for that to be solved by another brave soul who 
> wants to change the jniFastGetField code in the ARM port.
>
> Testing:
> I ran the httpclient tests (that failed because of this) with 
> --test-repeat 100, and ran that 3 times. No problems discovered. I am 
> also finishing up hs-tier1-3 and jdk-tier1-3.
>
> Thanks,
> /Erik



More information about the hotspot-dev mailing list