Array accesses using sun.misc.Unsafe cause data corruption or SIGSEGV
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jul 17 20:27:31 UTC 2015
It is in JDK 8u51, which was released a few days ago:
http://www.oracle.com/technetwork/java/javase/8u51-relnotes-2587590.html
Regards,
Vladimir
On 7/17/15 12:49 PM, Serkan Özal wrote:
> Hi John,
>
> Yes, I have applied your fix and it works.
> Thanks!
>
> Which JDK version will include this patch?
>
> Regards.
>
> On Fri, Jul 17, 2015 at 10:31 PM, John Rose <john.r.rose at oracle.com
> <mailto:john.r.rose at oracle.com>> wrote:
>
> Thanks Serkan and Martijn for reporting and analyzing this.
>
> We had a very similar bug reported internally, and we just
> integrated a fix:
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/3816de51b5e7
>
> Would you mind checking if it fixes your problem also?
>
> Best wishes,
> — John
>
> On Jul 12, 2015, at 5:07 AM, Serkan Özal <serkan at hazelcast.com
> <mailto:serkan at hazelcast.com>> wrote:
>>
>> Hi Martijn,
>>
>> Thanks for your interest and for your comment, which helps keep this
>> thread active.
>>
>>
>> From my previous message
>> (http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-June/018221.html):
>>
>> I added some additional logs to *"vm/c1/c1_Canonicalizer.cpp"*:
>>
>>
>> void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>   tty->print_cr("Canonicalizer: do_UnsafeGetRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>> }
>>
>> void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>   tty->print_cr("Canonicalizer: do_UnsafePutRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>> }
>>
>>
>> So I ran the test, calculating the address in three ways:
>>
>> - *"int * long"* (the int is the index and the long is 8L)
>> - *"long * long"* (the first long is the index and the second long is 8L)
>> - *"int * int"* (the first int is the index and the second int is 8)
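At the Java source level the three variants listed above differ only in the operand types of the multiplication. The following minimal sketch (the class and method names are hypothetical, and the base address is simply the start address printed elsewhere in this thread) checks that, for the index range used in the reproducer, all three expressions yield the same address. That suggests the differing behaviour comes from how the compiler treats the expressions rather than from the arithmetic itself.

```java
// Illustrative sketch only: class and method names are hypothetical.
final class AddressVariants {

    // "int * long": the int index is widened to long before multiplying.
    static long intTimesLong(long base, int i) {
        return base + (i * 8L);
    }

    // "long * long": both operands are already long.
    static long longTimesLong(long base, long i) {
        return base + (i * 8L);
    }

    // "int * int": the multiplication is done in int, then widened for the add.
    static long intTimesInt(long base, int i) {
        return base + (i * 8);
    }

    public static void main(String[] args) {
        long base = 0x58bbcfa0L; // start address taken from the log output in this thread
        int count = 100000;      // same iteration count as the reproducer

        for (int i = 0; i < count; i++) {
            long a = intTimesLong(base, i);
            long b = longTimesLong(base, i);
            long c = intTimesInt(base, i);
            if (a != b || a != c) {
                throw new AssertionError("Variants differ at i = " + i);
            }
        }
        System.out.println("All three variants agree for i in [0, " + count + ")");
    }
}
```

The int-only variant stays exact here because the largest product, 99999 * 8, is far below Integer.MAX_VALUE; with a much larger count it would overflow before being widened.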
>>
>> Here are the logs:
>>
>>
>> *int * long:*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 27, log2_scale = 3
>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 27, log2_scale = 3
>>
>> *long * long:*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>> Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 3
>>
>> *int * int:*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 29, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 29, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 19: base = id 8, index = id 15, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 8, index = id 15, log2_scale = 0
>>
>>
>> As you can see, in the problematic runs (*"int * long"* and
>> *"long * long"*) scaling is applied twice: once for *"Unsafe.put"*
>> and once for *"Unsafe.get"*, and both instructions point to the
>> same *"base"* and *"index"* instructions. This means that the
>> address is scaled one extra time, when it should be scaled only once.
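To make the log2_scale field in these logs concrete, here is a minimal sketch. It assumes (this is an assumption, not something stated in the thread) that the canonicalized form stands for address = base + (index << log2_scale), so a multiplication by 8 shows up as log2_scale = 3. The class and method names are hypothetical. The second computation shows what would happen under the double-scaling hypothesis, where the shift is applied to an index that has already been multiplied by 8.

```java
// Illustrative sketch only: models the assumed meaning of (base, index, log2_scale).
final class ScaledAddress {

    // Assumed model: address = base + (index << log2Scale).
    static long address(long base, long index, int log2Scale) {
        return base + (index << log2Scale);
    }

    public static void main(String[] args) {
        long base = 0x58bbcfa0L; // start address taken from the log output in this thread

        // A multiplier of 8 is 1 << 3, so its log2 scale is 3.
        int log2Scale = Long.numberOfTrailingZeros(8L);
        System.out.println("log2_scale for a multiplier of 8: " + log2Scale);

        long i = 2;
        // Intended: the raw index is shifted once.
        System.out.println("scaled once:  " + Long.toHexString(address(base, i, log2Scale)));
        // Hypothesised bug: the shift is applied to an index that already includes "* 8".
        System.out.println("scaled twice: " + Long.toHexString(address(base, i * 8L, log2Scale)));
    }
}
```

For i = 2 the two results are 58bbcfb0 and 58bbd020, which are the "Normal" and "If double scaled" values printed for the third iteration of the reproducer quoted further down in this thread.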
>>
>>
>>
>> With this fix (or fix attempt, since I am not 100% sure whether it is
>> the perfect/optimal way), I prevent multiple scaling on the
>> same index instruction.
>>
>> Also, one of my previous messages
>> (http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-July/018383.html)
>> shows that the index is scaled multiple times, so once it has been
>> scaled more than once, the address points to some arbitrary location in memory.
>>
>> On Sun, Jul 12, 2015 at 2:54 PM, Martijn Verburg
>> <martijnverburg at gmail.com <mailto:martijnverburg at gmail.com>> wrote:
>>
>> Non reviewer here, but I'd add to the comment *why* you don't
>> want to scale again.
>>
>> Cheers,
>> Martijn
>>
>> On 12 July 2015 at 11:29, Serkan Özal <serkan at hazelcast.com
>> <mailto:serkan at hazelcast.com>> wrote:
>>
>> Hi all,
>>
>> I have created a webrev containing the patch for review and
>> shared it publicly here:
>> https://s3.amazonaws.com/jdk-8087134/webrev.00/index.html
>>
>> Regards.
>>
>> On Sat, Jul 4, 2015 at 9:06 PM, Serkan Özal
>> <serkan at hazelcast.com <mailto:serkan at hazelcast.com>> wrote:
>>
>> Hi,
>>
>> I have added some logs to show that the problem is caused
>> by double scaling of the offset (index).
>>
>> Here is my updated reproducer code (log messages added):
>>
>>
>> int count = 100000;
>> long size = count * 8L;
>> long baseAddress = unsafe.allocateMemory(size);
>> System.out.println("Start address: " + Long.toHexString(baseAddress) +
>>                    ", End address: " + Long.toHexString(baseAddress + size));
>>
>> for (int i = 0; i < count; i++) {
>>     long address = baseAddress + (i * 8L);
>>     System.out.println(
>>         "Normal: " + Long.toHexString(address) + ", " +
>>         "If double scaled: " + Long.toHexString(baseAddress + (i * 8L * 8L)));
>>     long expected = i;
>>     unsafe.putLong(address, expected);
>>     unsafe.getLong(address);
>> }
>>
>>
>> After some time it crashes with:
>>
>>
>> ...
>> Current thread (0x0000000002068800): JavaThread
>> "main" [_thread_in_Java, id=10412,
>> stack(0x00000000023f0000,0x00000000024f0000)]
>>
>> siginfo: ExceptionCode=0xc0000005, reading address
>> 0x0000000059061020
>> ...
>> ...
>>
>>
>> And here is output of the execution until crash:
>>
>> Start address: 58bbcfa0, End address: 58c804a0
>> Normal: 58bbcfa0, If double scaled: 58bbcfa0
>> Normal: 58bbcfa8, If double scaled: 58bbcfe0
>> Normal: 58bbcfb0, If double scaled: 58bbd020
>> ...
>> ...
>> Normal: 58c517b0, If double scaled: 59061020
>>
>>
>> As seen from the logs and the crash dump, the double-scaled
>> version of the target address (*If double scaled:
>> 59061020*) is the same as the faulting address
>> (*siginfo: ExceptionCode=0xc0000005, reading address
>> 0x0000000059061020*), and accessing it causes the crash.
>>
>> So I think it is clear that the crash is caused by an
>> incorrect optimization of the index value: the index is
>> scaled twice (for *Unsafe::put* and *Unsafe::get*)
>> instead of only once. The double-scaled index then
>> points to an invalid memory address.
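The arithmetic behind this observation can be checked directly from the values in the output above. The sketch below is only an illustration (the class and method names are hypothetical). It recovers the loop index from the last "Normal" address, recomputes the double-scaled address, and compares it with the faulting address from the crash dump.

```java
// Illustrative sketch only: re-derives the faulting address from the logged values.
final class DoubleScaleCheck {

    // Address the reproducer intends to access: base + i * 8.
    static long normal(long base, long i) {
        return base + i * 8L;
    }

    // Address that results if the factor of 8 is applied twice.
    static long doubleScaled(long base, long i) {
        return base + i * 8L * 8L;
    }

    public static void main(String[] args) {
        long base = 0x58bbcfa0L;       // "Start address" from the output
        long end = 0x58c804a0L;        // "End address" from the output (exclusive)
        long lastNormal = 0x58c517b0L; // last "Normal" address before the crash
        long faulting = 0x59061020L;   // address reported in the crash dump

        // Recover the loop index of the last logged iteration.
        long i = (lastNormal - base) / 8L;
        System.out.println("index at crash: " + i);

        long computed = doubleScaled(base, i);
        System.out.println("double scaled:  " + Long.toHexString(computed));
        System.out.println("matches fault:  " + (computed == faulting));

        // Smallest index whose double-scaled address is at or beyond the end of the allocation.
        long firstOutside = (end - base + 63) / 64;
        System.out.println("first index outside the allocation if double scaled: " + firstOutside);
    }
}
```

Under these values the index at the crash is 76034, and a double-scaled access already leaves the 800000-byte allocation from index 12500 onward. An access outside the allocation does not necessarily fault immediately, which would be consistent with the crash appearing only later in the run.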
>>
>> Regards.
>>
>> On Sun, Jun 14, 2015 at 2:39 PM, Serkan Özal
>> <serkan at hazelcast.com <mailto:serkan at hazelcast.com>>
>> wrote:
>>
>> Hi all,
>>
>> I dug into the issue by going through the JDK HotSpot
>> commits, and the issue first appeared after this commit:
>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/a60a1309a03a
>>
>> Then I added some additional logs to
>> *"vm/c1/c1_Canonicalizer.cpp"*:
>>
>> void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>   tty->print_cr("Canonicalizer: do_UnsafeGetRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>> }
>>
>> void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>   tty->print_cr("Canonicalizer: do_UnsafePutRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>> }
>>
>> So I ran the test, calculating the address in three ways:
>>
>> - *"int * long"* (the int is the index and the long is 8L)
>> - *"long * long"* (the first long is the index and the second long is 8L)
>> - *"int * int"* (the first int is the index and the second int is 8)
>>
>> Here are the logs:
>>
>> *int * long:*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 27, log2_scale = 3
>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 27, log2_scale = 3
>>
>> *long * long:*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>> Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 3
>>
>> *int * int:*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 29, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 29, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 19: base = id 8, index = id 15, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 8, index = id 15, log2_scale = 0
>>
>> As you can see, in the problematic runs (*"int * long"* and
>> *"long * long"*) scaling is applied twice: once for
>> *"Unsafe.put"* and once for *"Unsafe.get"*, and both
>> instructions point to the same *"base"* and *"index"*
>> instructions. This means that the address is scaled one
>> extra time, when it should be scaled only once.
>>
>> When I debugged the non-problematic run (*"int * int"*),
>> I saw that *"instr->as_ArithmeticOp();"* always returns
>> *"null"*, so the *"match_index_and_scale"* method always
>> returns *"false"*. So there is no scaling.
>>
>> static bool match_index_and_scale(Instruction* instr,
>>                                   Instruction** index,
>>                                   int* log2_scale) {
>>   ...
>>   ArithmeticOp* arith = instr->as_ArithmeticOp();
>>   if (arith != NULL) {
>>     ...
>>   }
>>   return false;
>> }
>>
>> Then I added my fix attempt, which prevents multiple
>> scaling for Unsafe instructions that point to the same
>> index instruction, like this:
>>
>> void Canonicalizer::do_UnsafeRawOp(UnsafeRawOp* x) {
>>   Instruction* base = NULL;
>>   Instruction* index = NULL;
>>   int log2_scale;
>>   if (match(x, &base, &index, &log2_scale)) {
>>     x->set_base(base);
>>     x->set_index(index);
>>     // The fix attempt here
>>     // /////////////////////////////
>>     if (index != NULL) {
>>       if (index->is_pinned()) {
>>         log2_scale = 0;
>>       } else {
>>         if (log2_scale != 0) {
>>           index->pin();
>>         }
>>       }
>>     }
>>     // /////////////////////////////
>>     x->set_log2_scale(log2_scale);
>>     if (PrintUnsafeOptimization) {
>>       tty->print_cr("Canonicalizer: UnsafeRawOp id %d: base = id %d, index = id %d, log2_scale = %d",
>>                     x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>     }
>>   }
>> }
>>
>> In this fix attempt, if there is scaling for the Unsafe
>> instruction, I pin the index instruction of that
>> instruction. On later calls, if the index instruction is
>> already pinned, I assume that scaling has already been
>> applied, so there is no need for another scaling.
>>
>> After this fix, I reran the problematic test
>> (*"int * long"*) and it works, with these logs:
>>
>> *int * long (after fix):*
>>
>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>> Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 0
>> Canonicalizer: do_UnsafePutRaw id 21: base = id 8, index = id 11, log2_scale = 3
>> Canonicalizer: do_UnsafeGetRaw id 23: base = id 8, index = id 11, log2_scale = 0
>>
>> I am not sure whether my fix attempt is a real fix, or
>> whether there are better fixes.
>>
>> Regards.
>> -- Serkan ÖZAL
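The guard described in this fix attempt can be modelled outside HotSpot. The sketch below is a toy model in Java, not HotSpot code: the Instr class, the guardedScale method, and the use of a single boolean as the "pinned" marker are all assumptions made for illustration. It mirrors only the branch structure of the quoted do_UnsafeRawOp change: the first operation on an index keeps its matched scale and pins the index, and any later operation on the same index gets log2_scale = 0.

```java
// Toy model only: not HotSpot code. Instr and guardedScale are hypothetical names.
final class PinGuardModel {

    // Stand-in for a compiler instruction; "pinned" is used purely as the marker
    // the fix attempt relies on.
    static final class Instr {
        boolean pinned;
    }

    // Mirrors the guard in the quoted fix attempt: returns the log2 scale
    // that would be stored for this operation.
    static int guardedScale(Instr index, int matchedLog2Scale) {
        if (index == null) {
            return matchedLog2Scale;
        }
        if (index.pinned) {
            return 0;              // index was already used with a scale
        }
        if (matchedLog2Scale != 0) {
            index.pinned = true;   // remember that a scale has been applied
        }
        return matchedLog2Scale;
    }

    public static void main(String[] args) {
        Instr index = new Instr();
        int putScale = guardedScale(index, 3); // first use (the put)
        int getScale = guardedScale(index, 3); // second use of the same index (the get)
        System.out.println("put log2_scale = " + putScale);
        System.out.println("get log2_scale = " + getScale);
    }
}
```

In this model the put receives scale 3 and the following get receives scale 0, the same pattern as the "after fix" log lines for the put/get pairs that share an index.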
>>
>> Btw (thanks to one of my colleagues), when the address
>> calculation in the loop is changed to
>> long address = baseAddress + (i * 8), the test passes.
>> The only difference is that the next long pointer is
>> calculated using integer 8 instead of long 8.
>> ```
>> for (int i = 0; i < count; i++) {
>>     long address = baseAddress + (i * 8); // <--- here, integer 8 instead of long 8
>>     long expected = i;
>>     unsafe.putLong(address, expected);
>>     long actual = unsafe.getLong(address);
>>     if (expected != actual) {
>>         throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>>     }
>> }
>> ```
>>
>> On Tue, Jun 9, 2015 at 1:07 PM Mehmet Dogan
>> <mehmet at hazelcast.com> wrote:
>>
>> > Hi all,
>> >
>> > While I was testing my app using Java 8, I encountered the
>> > previously reported sun.misc.Unsafe issue.
>> >
>> > https://bugs.openjdk.java.net/browse/JDK-8076445
>> >
>> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017685.html
>> >
>> > The issue status says it's resolved with resolution "Cannot
>> > Reproduce". But unfortunately it's still reproducible using
>> > "1.8.0_60-ea-b18" and "1.9.0-ea-b67".
>> >
>> > The test is very simple:
>> >
>> > ```
>> > public static void main(String[] args) throws Exception {
>> >     Unsafe unsafe = findUnsafe();
>> >     // 10000 pass
>> >     // 100000 jvm crash
>> >     // 1000000 fail
>> >     int count = 100000;
>> >     long size = count * 8L;
>> >     long baseAddress = unsafe.allocateMemory(size);
>> >
>> >     try {
>> >         for (int i = 0; i < count; i++) {
>> >             long address = baseAddress + (i * 8L);
>> >
>> >             long expected = i;
>> >             unsafe.putLong(address, expected);
>> >
>> >             long actual = unsafe.getLong(address);
>> >
>> >             if (expected != actual) {
>> >                 throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>> >             }
>> >         }
>> >     } finally {
>> >         unsafe.freeMemory(baseAddress);
>> >     }
>> > }
>> > ```
>> >
>> > It does not fail up to version 1.8.0.31; starting with
>> > 1.8.0.40 the test fails consistently.
>> >
>> > - With iteration count 10000, the test passes.
>> > - With iteration count 100000, the JVM crashes with SIGSEGV.
>> > - With iteration count 1000000, the test fails with AssertionError.
>> >
>> > When any one of compilation (-Xint), inlining (-XX:-Inline) or
>> > on-stack replacement (-XX:-UseOnStackReplacement) is disabled,
>> > the test does not fail at all.
>> >
>> > I tested on these platforms:
>> > - Centos-7/openjdk-1.8.0.45
>> > - OSX/oraclejdk-1.8.0.40
>> > - OSX/oraclejdk-1.8.0.45
>> > - OSX/oraclejdk-1.8.0_60-ea-b18
>> > - OSX/oraclejdk-1.9.0-ea-b67
>> >
>> > The previous issue comment
>> > (https://bugs.openjdk.java.net/browse/JDK-8076445?focusedCommentId=13633043#comment-13633043)
>> > says "Cannot reproduce based on the latest version". I hope
>> > that "latest version" does not refer to '1.8.0_60-ea-b18' or
>> > '1.9.0-ea-b67', because both are failing.
>> >
>> > I'm looking forward to hearing from you.
>> >
>> > Thanks,
>> > -Mehmet Dogan-
>> > --
>> > @mmdogan
>>
>>
>> --
>> Serkan ÖZAL
>> Remotest Software Engineer
>> GSM: +90 542 680 39 18
>> <tel:%2B90%20542%20680%2039%2018>
>> Twitter: @serkan_ozal
>
>
>
>
> --
> Serkan ÖZAL
> Remotest Software Engineer
> GSM: +90 542 680 39 18
> Twitter: @serkan_ozal