Array accesses using sun.misc.Unsafe cause data corruption or SIGSEGV

Martijn Verburg martijnverburg at gmail.com
Sat Jul 18 06:43:39 UTC 2015


Fix works for me as well. Thanks for following up; I appreciate that this was an
obscure one in an officially unsupported API.

On Friday, 17 July 2015, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:

> It is in JDK 8u51, which was released a few days ago:
>
> http://www.oracle.com/technetwork/java/javase/8u51-relnotes-2587590.html
>
> Regards,
> Vladimir
>
> On 7/17/15 12:49 PM, Serkan Özal wrote:
>
>> Hi John,
>>
>> Yes, I have applied your fix and it works.
>> Thanks!
>>
>> Which JDK version will include this patch?
>>
>> Regards.
>>
>> On Fri, Jul 17, 2015 at 10:31 PM, John Rose <john.r.rose at oracle.com> wrote:
>>
>>     Thanks Serkan and Martijn for reporting and analyzing this.
>>
>>     We had a very similar bug reported internally, and we just
>>     integrated a fix:
>>     http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/3816de51b5e7
>>
>>     Would you mind checking if it fixes your problem also?
>>
>>     Best wishes,
>>     — John
>>
>>     On Jul 12, 2015, at 5:07 AM, Serkan Özal <serkan at hazelcast.com> wrote:
>>
>>>
>>>     Hi Martijn,
>>>
>>>     Thanks for your interest and for your comment, which helps keep
>>>     this thread active.
>>>
>>>
>>>     From my previous message
>>>     (
>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-June/018221.html
>>> ):
>>>
>>>         I added some additional logs to *"vm/c1/c1_Canonicalizer.cpp"*:
>>>
>>>
>>>         void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
>>>           if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>           tty->print_cr("Canonicalizer: do_UnsafeGetRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                         x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>>         }
>>>
>>>         void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
>>>           if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>           tty->print_cr("Canonicalizer: do_UnsafePutRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                         x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>>         }
>>>
>>>
>>>
>>>         So I ran the test, calculating the address as:
>>>
>>>         - *"int * long"* (the int is the index and the long is 8L)
>>>         - *"long * long"* (the first long is the index and the second
>>>           long is 8L)
>>>         - *"int * int"* (the first int is the index and the second int
>>>           is 8)
>>>
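For reference, the three address expressions above differ only in where Java widens the operands to long. Below is a minimal sketch of that language-level difference. It shows plain Java arithmetic only, not what the C1 compiler does with the expressions, and the class and method names are made up for illustration.

```java
class IndexScaling {
    // "int * long": i is widened to long first, then multiplied as a long.
    static long intTimesLong(int i) {
        return i * 8L;
    }

    // "long * long": both operands are already long.
    static long longTimesLong(long i) {
        return i * 8L;
    }

    // "int * int": the multiplication is done in 32-bit int arithmetic
    // and only the result is widened to long.
    static long intTimesInt(int i) {
        return i * 8;
    }

    public static void main(String[] args) {
        int i = 99_999; // last index used by the reproducer (count = 100000)
        System.out.println(intTimesLong(i));   // 799992
        System.out.println(longTimesLong(i));  // 799992
        System.out.println(intTimesInt(i));    // 799992

        // The forms only diverge once the int product overflows.
        int big = 300_000_000;
        System.out.println(intTimesLong(big)); // 2400000000
        System.out.println(intTimesInt(big));  // -1894967296 (int wrap-around)
    }
}
```

For the index range used in the reproducer, all three forms yield the same offsets, so the different outcomes reported in this thread would have to come from how the compiled code handles the expression rather than from Java's arithmetic itself.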
>>>         Here are the logs:
>>>
>>>
>>>         *int * long:*
>>>
>>>         Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>         Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 27, log2_scale = 3
>>>         Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 27, log2_scale = 3
>>>
>>>         *long * long:*
>>>
>>>         Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>         Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>>>         Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 3
>>>
>>>         *int * int:*
>>>
>>>         Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>         Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 29, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 29, log2_scale = 0
>>>         Canonicalizer: do_UnsafePutRaw id 19: base = id 8, index = id 15, log2_scale = 0
>>>         Canonicalizer: do_UnsafeGetRaw id 22: base = id 8, index = id 15, log2_scale = 0
>>>
>>>         As you can see, in the problematic runs (*"int * long"* and
>>>         *"long * long"*) scaling is applied twice: once for
>>>         *"Unsafe.put"* and once for *"Unsafe.get"*, and both
>>>         instructions point to the same *"base"* and *"index"*
>>>         instructions. This means that the address is scaled one extra
>>>         time, whereas it should be scaled only once.
>>>
>>>
>>>
>>>     With this fix (or fix attempt, since I am not 100% sure whether it
>>>     is the best way to do it), I prevent multiple scaling of the same
>>>     index instruction.
>>>
>>>     Also, one of my previous messages
>>>     (
>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-July/018383.html
>>> )
>>>     shows that the index is scaled multiple times, and once that
>>>     happens the resulting address can point anywhere in memory.
>>>
>>>     On Sun, Jul 12, 2015 at 2:54 PM, Martijn Verburg
>>>     <martijnverburg at gmail.com> wrote:
>>>
>>>         Non-reviewer here, but I'd add to the comment *why* you don't
>>>         want to scale again.
>>>
>>>         Cheers,
>>>         Martijn
>>>
>>>         On 12 July 2015 at 11:29, Serkan Özal <serkan at hazelcast.com> wrote:
>>>
>>>             Hi all,
>>>
>>>             I have created a webrev containing the patch for review and
>>>             made it publicly accessible here:
>>>             https://s3.amazonaws.com/jdk-8087134/webrev.00/index.html
>>>
>>>             Regards.
>>>
>>>             On Sat, Jul 4, 2015 at 9:06 PM, Serkan Özal
>>>             <serkan at hazelcast.com> wrote:
>>>
>>>                 Hi,
>>>
>>>                 I have added some logs to show that the problem is caused
>>>                 by double scaling of the offset (index).
>>>
>>>                 Here is my updated reproducer code (log messages added):
>>>
>>>
>>>                 int count = 100000;
>>>                 long size = count * 8L;
>>>                 long baseAddress = unsafe.allocateMemory(size);
>>>                 System.out.println("Start address: " + Long.toHexString(baseAddress) +
>>>                                    ", End address: " + Long.toHexString(baseAddress + size));
>>>
>>>                 for (int i = 0; i < count; i++) {
>>>                     long address = baseAddress + (i * 8L);
>>>                     System.out.println(
>>>                         "Normal: " + Long.toHexString(address) + ", " +
>>>                         "If double scaled: " + Long.toHexString(baseAddress + (i * 8L * 8L)));
>>>                     long expected = i;
>>>                     unsafe.putLong(address, expected);
>>>                     unsafe.getLong(address);
>>>                 }
>>>
>>>
>>>                 After some time it crashes with:
>>>
>>>
>>>                 ...
>>>                 Current thread (0x0000000002068800):  JavaThread "main" [_thread_in_Java, id=10412, stack(0x00000000023f0000,0x00000000024f0000)]
>>>
>>>                 siginfo: ExceptionCode=0xc0000005, reading address 0x0000000059061020
>>>                 ...
>>>                 ...
>>>
>>>
>>>                 And here is the output of the execution up to the crash:
>>>
>>>                 Start address: 58bbcfa0, End address: 58c804a0
>>>                 Normal: 58bbcfa0, If double scaled: 58bbcfa0
>>>                 Normal: 58bbcfa8, If double scaled: 58bbcfe0
>>>                 Normal: 58bbcfb0, If double scaled: 58bbd020
>>>                 ...
>>>                 ...
>>>                 Normal: 58c517b0, If double scaled: 59061020
>>>
>>>
>>>                 As the logs and the crash dump show, the double-scaled
>>>                 version of the target address (*If double scaled:
>>>                 59061020*) is the same as the faulting address
>>>                 (*siginfo: ExceptionCode=0xc0000005, reading address
>>>                 0x0000000059061020*) whose access causes the crash.
>>>
>>>                 So I think it is clear that the crash is caused by a
>>>                 wrong optimization of the index value: the index is
>>>                 scaled twice (for *Unsafe::put* and *Unsafe::get*)
>>>                 instead of only once. The double-scaled index then
>>>                 points to an invalid memory address.
>>>
>>>                 Regards.
>>>
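The arithmetic behind that observation can be checked directly from the numbers in the log. The sketch below is only a worked calculation using the start address, end address, and last "Normal:" address quoted above; the class and method names are hypothetical, and it models the suspected double scaling as plain arithmetic rather than reproducing anything inside the JIT.

```java
class DoubleScaledAddress {
    static final int ELEMENT_SIZE = 8; // bytes per long, as in the reproducer

    // Address the reproducer intends to access for element i.
    static long normal(long base, long i) {
        return base + i * ELEMENT_SIZE;
    }

    // Address reached if the index is scaled by 8 a second time.
    static long doubleScaled(long base, long i) {
        return base + i * ELEMENT_SIZE * ELEMENT_SIZE;
    }

    // Recover the element index from a "Normal:" address in the log.
    static long indexOf(long base, long address) {
        return (address - base) / ELEMENT_SIZE;
    }

    public static void main(String[] args) {
        long start = 0x58bbcfa0L;      // "Start address" from the log
        long end = 0x58c804a0L;        // "End address" from the log
        long lastNormal = 0x58c517b0L; // last "Normal:" line before the crash

        long i = indexOf(start, lastNormal);
        long bad = doubleScaled(start, i);

        System.out.println("index = " + i);                              // index = 76034
        System.out.println("double scaled = " + Long.toHexString(bad)); // double scaled = 59061020
        System.out.println("inside allocation = " + (bad >= start && bad < end)); // false
    }
}
```

The computed address matches the faulting address in the crash dump and lies beyond the end of the allocated block, which is consistent with the double-scaling explanation.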
>>>                 On Sun, Jun 14, 2015 at 2:39 PM, Serkan Özal
>>>                 <serkan at hazelcast.com> wrote:
>>>
>>>                     Hi all,
>>>
>>>                     I dug into the issue by going through the JDK HotSpot
>>>                     commits, and the issue arose after this commit:
>>>
>>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/a60a1309a03a
>>>
>>>                     Then I added some additional logs to
>>>                     *"vm/c1/c1_Canonicalizer.cpp"*:
>>>
>>>                         void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
>>>                           if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>                           tty->print_cr("Canonicalizer: do_UnsafeGetRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                                         x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>>                         }
>>>
>>>                         void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
>>>                           if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>                           tty->print_cr("Canonicalizer: do_UnsafePutRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                                         x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>>                         }
>>>
>>>                     So I ran the test, calculating the address as:
>>>
>>>                     - *"int * long"* (the int is the index and the long
>>>                       is 8L)
>>>                     - *"long * long"* (the first long is the index and
>>>                       the second long is 8L)
>>>                     - *"int * int"* (the first int is the index and the
>>>                       second int is 8)
>>>
>>>                     Here are the logs:
>>>
>>>                     *int * long:*
>>>
>>>                     Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>                     Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 27, log2_scale = 3
>>>                     Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 27, log2_scale = 3
>>>
>>>                     *long * long:*
>>>
>>>                     Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>                     Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>>>                     Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 3
>>>
>>>                     *int * int:*
>>>
>>>                     Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>                     Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 29, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 29, log2_scale = 0
>>>                     Canonicalizer: do_UnsafePutRaw id 19: base = id 8, index = id 15, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 22: base = id 8, index = id 15, log2_scale = 0
>>>
>>>                     As you can see, in the problematic runs (*"int *
>>>                     long"* and *"long * long"*) scaling is applied twice:
>>>                     once for *"Unsafe.put"* and once for *"Unsafe.get"*,
>>>                     and both instructions point to the same *"base"* and
>>>                     *"index"* instructions. This means that the address
>>>                     is scaled one extra time, whereas it should be scaled
>>>                     only once.
>>>
>>>                     When I debugged the non-problematic run (*"int *
>>>                     int"*), I saw that *"instr->as_ArithmeticOp();"*
>>>                     always returns *"null"*, so the
>>>                     *"match_index_and_scale"* method always returns
>>>                     *"false"*. So there is no scaling.
>>>
>>>                         static bool match_index_and_scale(Instruction* instr,
>>>                                                           Instruction** index,
>>>                                                           int* log2_scale) {
>>>                           ...
>>>                           ArithmeticOp* arith = instr->as_ArithmeticOp();
>>>                           if (arith != NULL) {
>>>                             ...
>>>                           }
>>>                           return false;
>>>                         }
>>>
>>>                     Then I added my fix attempt, which prevents multiple
>>>                     scaling for Unsafe instructions that point to the same
>>>                     index instruction, like this:
>>>
>>>                         void Canonicalizer::do_UnsafeRawOp(UnsafeRawOp* x) {
>>>                           Instruction* base = NULL;
>>>                           Instruction* index = NULL;
>>>                           int log2_scale;
>>>                           if (match(x, &base, &index, &log2_scale)) {
>>>                             x->set_base(base);
>>>                             x->set_index(index);
>>>                             // The fix attempt here
>>>                             // /////////////////////////////
>>>                             if (index != NULL) {
>>>                               if (index->is_pinned()) {
>>>                                 log2_scale = 0;
>>>                               } else {
>>>                                 if (log2_scale != 0) {
>>>                                   index->pin();
>>>                                 }
>>>                               }
>>>                             }
>>>                             // /////////////////////////////
>>>                             x->set_log2_scale(log2_scale);
>>>                             if (PrintUnsafeOptimization) {
>>>                               tty->print_cr("Canonicalizer: UnsafeRawOp id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                                             x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>>                             }
>>>                           }
>>>                         }
>>>
>>>                     In this fix attempt, if there is scaling for the
>>>                     Unsafe instruction, I pin the index instruction of
>>>                     that instruction. On later calls, if the index
>>>                     instruction is already pinned, I assume that it has
>>>                     already been scaled, so no further scaling is needed.
>>>
>>>                     After this fix, I reran the problematic test (*"int *
>>>                     long"*) and it works, with these logs:
>>>
>>>                     *int * long (after fix):*
>>>
>>>                     Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>>                     Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>>                     Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>>>                     Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 0
>>>                     Canonicalizer: do_UnsafePutRaw id 21: base = id 8, index = id 11, log2_scale = 3
>>>                     Canonicalizer: do_UnsafeGetRaw id 23: base = id 8, index = id 11, log2_scale = 0
>>>
>>>                     I am not sure whether my fix attempt is a real fix, or
>>>                     whether there are better fixes.
>>>
>>>                     Regards.
>>>                     -- Serkan ÖZAL
>>>
>>>                         Btw, (thanks to one of my colleagues), when the
>>>                         address calculation in the loop is converted to
>>>                         long address = baseAddress + (i * 8), the test
>>>                         passes. The only difference is that the next long
>>>                         pointer is calculated using integer 8 instead of
>>>                         long 8.
>>>
>>>                         ```
>>>                         for (int i = 0; i < count; i++) {
>>>                             // <--- here, integer 8 instead of long 8
>>>                             long address = baseAddress + (i * 8);
>>>                             long expected = i;
>>>                             unsafe.putLong(address, expected);
>>>                             long actual = unsafe.getLong(address);
>>>                             if (expected != actual) {
>>>                                 throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>>>                             }
>>>                         }
>>>                         ```
>>>
>>>                         On Tue, Jun 9, 2015 at 1:07 PM Mehmet Dogan
>>>                         <mehmet at hazelcast.com> wrote:
>>>
>>>                         > Hi all,
>>>                         >
>>>                         > While I was testing my app using java 8, I encountered the
>>>                         > previously reported sun.misc.Unsafe issue.
>>>                         >
>>>                         > https://bugs.openjdk.java.net/browse/JDK-8076445
>>>                         >
>>>                         > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017685.html
>>>                         >
>>>                         > Issue status says it's resolved with resolution "Cannot Reproduce".
>>>                         > But unfortunately it's still reproducible using "1.8.0_60-ea-b18"
>>>                         > and "1.9.0-ea-b67".
>>>                         >
>>>                         > Test is very simple:
>>>                         >
>>>                         > ```
>>>                         > public static void main(String[] args) throws Exception {
>>>                         >     Unsafe unsafe = findUnsafe();
>>>                         >     // 10000 pass
>>>                         >     // 100000 jvm crash
>>>                         >     // 1000000 fail
>>>                         >     int count = 100000;
>>>                         >     long size = count * 8L;
>>>                         >     long baseAddress = unsafe.allocateMemory(size);
>>>                         >
>>>                         >     try {
>>>                         >         for (int i = 0; i < count; i++) {
>>>                         >             long address = baseAddress + (i * 8L);
>>>                         >
>>>                         >             long expected = i;
>>>                         >             unsafe.putLong(address, expected);
>>>                         >
>>>                         >             long actual = unsafe.getLong(address);
>>>                         >
>>>                         >             if (expected != actual) {
>>>                         >                 throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>>>                         >             }
>>>                         >         }
>>>                         >     } finally {
>>>                         >         unsafe.freeMemory(baseAddress);
>>>                         >     }
>>>                         > }
>>>                         > ```
>>>                         >
>>>                         > It does not fail up to version 1.8.0.31; starting with 1.8.0.40
>>>                         > the test fails consistently.
>>>                         >
>>>                         > - With iteration count 10000, test is passing.
>>>                         > - With iteration count 100000, jvm is crashing with SIGSEGV.
>>>                         > - With iteration count 1000000, test is failing with AssertionError.
>>>                         >
>>>                         > When one of compilation (-Xint) or inlining (-XX:-Inline) or
>>>                         > on-stack-replacement (-XX:-UseOnStackReplacement) is disabled,
>>>                         > test is not failing at all.
>>>                         >
>>>                         > I tested on platforms:
>>>                         > - Centos-7/openjdk-1.8.0.45
>>>                         > - OSX/oraclejdk-1.8.0.40
>>>                         > - OSX/oraclejdk-1.8.0.45
>>>                         > - OSX/oraclejdk-1.8.0_60-ea-b18
>>>                         > - OSX/oraclejdk-1.9.0-ea-b67
>>>                         >
>>>                         > Previous issue comment
>>>                         > (https://bugs.openjdk.java.net/browse/JDK-8076445?focusedCommentId=13633043#comment-13633043)
>>>                         > says "Cannot reproduce based on the latest version". I hope that
>>>                         > "latest version" does not refer to '1.8.0_60-ea-b18' or
>>>                         > '1.9.0-ea-b67', because both are failing.
>>>                         >
>>>                         > I'm looking forward to hearing from you.
>>>                         >
>>>                         > Thanks,
>>>                         > -Mehmet Dogan-
>>>                         > --
>>>                         >
>>>                         > @mmdogan
>>>
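The original test calls findUnsafe() without showing it. Below is a minimal self-contained sketch of the same round-trip check, assuming the helper obtains the instance by reflection on the private static `theUnsafe` field, which is a common workaround but only a guess at what the original helper did. The class name `UnsafeRoundTrip` and the method `roundTrip` are hypothetical; the loop body follows the reproducer quoted above, except that it counts mismatches instead of throwing.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class UnsafeRoundTrip {
    // Hypothetical helper: the original post does not show findUnsafe().
    // Reading the private static "theUnsafe" field is assumed here.
    static Unsafe findUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Writes `count` longs to off-heap memory, reads each one back, and
    // returns the number of values that did not round-trip correctly.
    static int roundTrip(int count) {
        Unsafe unsafe = findUnsafe();
        long size = count * 8L;
        long baseAddress = unsafe.allocateMemory(size);
        int mismatches = 0;
        try {
            for (int i = 0; i < count; i++) {
                long address = baseAddress + (i * 8L); // "int * long" form
                long expected = i;
                unsafe.putLong(address, expected);
                long actual = unsafe.getLong(address);
                if (expected != actual) {
                    mismatches++;
                }
            }
        } finally {
            unsafe.freeMemory(baseAddress);
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + roundTrip(100000));
    }
}
```

On a JVM without the bug this should report zero mismatches. According to this thread, builds from 1.8.0_40 onward were affected and 8u51 contains the fix, so on an affected build the same loop may crash or report mismatches instead. Newer JDKs may also print deprecation warnings for these sun.misc.Unsafe methods.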
>>>
>>>                     --
>>>                     Serkan ÖZAL
>>>                     Remotest Software Engineer
>>>                     GSM: +90 542 680 39 18
>>>                     <tel:%2B90%20542%20680%2039%2018>
>>>                     Twitter: @serkan_ozal
>>>
>>>
>>
>>
>>
>

-- 
Cheers, Martijn (Sent from Gmail Mobile)
More information about the hotspot-compiler-dev mailing list