UnsafeAtomicityTest crashes on SPARC

Fri Oct 30 13:02:00 UTC 2015

Hi Aleksey,

Exactly, using putInt(offset, 0xFFFFFFFF) improves the situation. It uses DEFINE_GETSETNATIVE, which calls set_doing_unsafe_access before the access, while the other variant uses DEFINE_GETSETOOP, which doesn't do that (see unsafe.cpp).

The JVM will usually not run much longer after the SIGBUS is caught, because set_pending_unsafe_access_error() gets called afterwards. This should eventually lead to a JVM exit with the asynchronous java.lang.InternalError exception (unless one catches it, which is rather uncommon).
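
For illustration, catching that error could look roughly like the sketch below. GuardedAccess and tryAccess are hypothetical names, not part of the JDK or jcstress, and the Runnable stands in for the actual Unsafe call. Since the InternalError is asynchronous, it may surface somewhat after the faulting access, so in practice the try block may need to cover more than the single call.

```java
// Hypothetical helper (not a JDK or jcstress API): runs a memory access
// and reports whether it ended with an InternalError.
final class GuardedAccess {

    /**
     * Runs the given access. Returns true if it completed normally and
     * false if an InternalError was thrown while it ran.
     */
    static boolean tryAccess(Runnable access) {
        try {
            access.run();
            return true;
        } catch (InternalError e) {
            // Assumption: this is the error raised for the faulting Unsafe access.
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulated accesses; a real test would call Unsafe inside the lambda.
        System.out.println(tryAccess(() -> { }));
        System.out.println(tryAccess(() -> { throw new InternalError("simulated"); }));
    }
}
```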

Best regards,
  Martin

-----Original Message-----
From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com] 
Sent: Friday, 30 October 2015 13:06
To: Doerr, Martin
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: UnsafeAtomicityTest crashes on SPARC

Hi Martin,

Thanks for the heads-up.

On 10/30/2015 02:52 PM, Doerr, Martin wrote:
> we have seen JVM crashes when running the following test on SPARC:
> org.openjdk.jcstress.tests.vjug.UnsafeAtomicityTest
> 
> Maybe it is not supposed to run on platforms which don't support
> unaligned accesses?

Yes, unaligned Unsafe access might crash on platforms that do not
support unaligned accesses. The test should have checked
Unsafe.unalignedAccess() and/or used Unsafe.putIntUnaligned. Neither API
is available in JDK 8, though.
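
On JDK 8, a test could approximate that check by hand. The following is a minimal sketch under stated assumptions: AlignedIntWriter is a hypothetical helper, and a heap ByteBuffer stands in for the raw memory, with the buffer offset playing the role of the address. It issues a single int store only when the offset is 4-byte aligned and otherwise falls back to four byte stores.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not a JDK or jcstress API). The ByteBuffer is a
// stand-in for raw memory; the offset plays the role of the address.
final class AlignedIntWriter {

    /** True if address is a multiple of size; size must be a power of two. */
    static boolean isAligned(long address, int size) {
        return (address & (size - 1)) == 0;
    }

    /** Stores value at offset, using one int store only when 4-byte aligned. */
    static void putInt(ByteBuffer mem, int offset, int value) {
        if (isAligned(offset, Integer.BYTES)) {
            mem.putInt(offset, value);
        } else {
            // Byte-wise fallback in big-endian order (the buffer's default).
            for (int i = 0; i < Integer.BYTES; i++) {
                mem.put(offset + i, (byte) (value >>> (24 - 8 * i)));
            }
        }
    }

    public static void main(String[] args) {
        ByteBuffer mem = ByteBuffer.allocate(16);
        putInt(mem, 3, 0xFFFFFFFF);  // unaligned offset, takes the byte-wise path
        System.out.println(Integer.toHexString(mem.getInt(3)));
    }
}
```

The byte-wise path is not atomic, so it only avoids the crash; it does not preserve the property an atomicity test is trying to observe.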

> I see 2 problems:
> 
> 1. The current implementation uses the version of
> UnsafeHolder.U.putInt(null, offset, 0xFFFFFFFF) (and getInt) which is
> designed to access object fields. Seems like the JVM is allowed to crash
> with SIGBUS if it is misused for unaligned accesses. The JVM is designed
> to catch SIGBUS only in the other version which only takes the address
> UnsafeHolder.U.putInt(offset, 0xFFFFFFFF).

This is an odd difference. So, nominally, making the test use
putInt(offset, 0xFFFFFFFF) avoids the issue?

> 2. The signal handler in os_solaris_sparc needs a fix to catch
> BUS_ADRALN as well. The part "&& info->si_code == BUS_OBJERR"  of the
> condition "if (sig == SIGBUS && info->si_code == BUS_OBJERR &&
> thread->doing_unsafe_access())" should get removed as it was done on
> other platforms.

> With the problems fixed, it may be possible to catch the asynchronous
> exception which may get generated by the Unsafe access. The following
> stand-alone test program below can do it.

Yes, I think runtime folks might consider bullet-proofing this. Although
I sometimes see the SIGBUS as a viable alternative to a creeping
performance problem, at least in testing. (IIRC, some kernels are known
to silently fix this up as well.)

Thanks,
-Aleksey
