Array accesses using sun.misc.Unsafe cause data corruption or SIGSEGV
Martijn Verburg
martijnverburg at gmail.com
Sat Jul 18 06:43:39 UTC 2015
Fix works for me as well - thanks for following up, appreciate this was an
obscure one in an officially unsupported API
On Friday, 17 July 2015, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> It is in JDK 8u51, which was released a few days ago:
>
> http://www.oracle.com/technetwork/java/javase/8u51-relnotes-2587590.html
>
> Regards,
> Vladimir
>
> On 7/17/15 12:49 PM, Serkan Özal wrote:
>
>> Hi John,
>>
>> Yes, I have applied your fix and it works.
>> Thanks!
>>
>> Which JDK version will include this patch?
>>
>> Regards.
>>
>> On Fri, Jul 17, 2015 at 10:31 PM, John Rose <john.r.rose at oracle.com> wrote:
>>
>> Thanks Serkan and Martijn for reporting and analyzing this.
>>
>> We had a very similar bug reported internally, and we just
>> integrated a fix:
>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/3816de51b5e7
>>
>> Would you mind checking if it fixes your problem also?
>>
>> Best wishes,
>> — John
>>
>> On Jul 12, 2015, at 5:07 AM, Serkan Özal <serkan at hazelcast.com> wrote:
>>
>>>
>>> Hi Martijn,
>>>
>>> Thanks for your interest and for your comment, which helps keep this
>>> thread active.
>>>
>>>
>>> From my previous message
>>> (
>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-June/018221.html
>>> ):
>>>
>>> I added some additional logs to *"vm/c1/c1_Canonicalizer.cpp"*:
>>>
>>>
>>> void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
>>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>   tty->print_cr("Canonicalizer: do_UnsafeGetRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>> }
>>>
>>> void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
>>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>   tty->print_cr("Canonicalizer: do_UnsafePutRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>> }
>>>
>>>
>>>
>>> So I ran the test, calculating the address as:
>>>
>>> - *"int * long"* (int is index and long is 8L)
>>>
>>> - *"long * long"* (the first long is index and the second long
>>> is 8L)
>>>
>>> - *"int * int"* (the first int is index and the second int is 8)
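To make the three variants concrete, here is a minimal Java sketch. It is not part of the original test: the class name, the `typeOf` helper and the base address are illustrative assumptions. It only shows how Java types each address expression. `int * long` and `long * long` are evaluated as `long`, while `int * int` is evaluated as `int` and widened only when it is added to the `long` base.

```java
class AddressExpressionTypes {
    // Overloads report the static type Java assigns to an expression.
    static String typeOf(int value) { return "int"; }
    static String typeOf(long value) { return "long"; }

    public static void main(String[] args) {
        int i = 5;                  // int loop index, as in the reproducer
        long j = 5L;                // long index for the "long * long" variant
        long baseAddress = 0x1000L; // arbitrary illustrative value, not a real allocation

        System.out.println("int * long  -> " + typeOf(i * 8L));
        System.out.println("long * long -> " + typeOf(j * 8L));
        System.out.println("int * int   -> " + typeOf(i * 8));

        // For small indexes all three expressions denote the same address.
        long a = baseAddress + (i * 8L);
        long b = baseAddress + (j * 8L);
        long c = baseAddress + (i * 8);
        System.out.println("same address: " + (a == b && b == c));
    }
}
```

At the source level the three forms are equivalent for the index range used in the test, so any difference in behaviour has to come from how the compiled code handles them.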
>>>
>>> Here are the logs:
>>>
>>>
>>> *int * long:*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 27, log2_scale = 3
>>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 27, log2_scale = 3
>>>
>>> *long * long:*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>>> Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 3
>>>
>>> *int * int:*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 29, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 29, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 19: base = id 8, index = id 15, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 8, index = id 15, log2_scale = 0
>>>
>>> As you can see, in the problematic runs (*"int * long"* and
>>> *"long * long"*) there are two scalings: one for *"Unsafe.put"* and
>>> one for *"Unsafe.get"*, and both instructions point to the same
>>> *"base"* and *"index"* instructions. This means that the address is
>>> scaled one time too many, because there should be only one scaling.
>>>
>>>
>>>
>>> With this fix (or fix attempt, since I am not 100% sure whether it
>>> is the best approach), I prevent multiple scaling of the same index
>>> instruction.
>>>
>>> Also, one of my previous messages
>>> (
>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-July/018383.html
>>> )
>>> shows that the index is scaled multiple times; once that happens,
>>> the resulting address can point anywhere in memory.
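The guard described here can be modelled outside the VM. The following is a toy Java model of the fix attempt quoted further down in this thread, not HotSpot code and not necessarily the changeset that was eventually integrated. The class, the `Index` type and the `guardedScale` method are hypothetical names, and the model assumes that "pinned" is set only by this guard, which may not hold in the real compiler.

```java
class ScaleGuardSketch {
    // Stand-in for a shared C1 index instruction; only the pinned flag is modelled.
    static final class Index {
        boolean pinned;
    }

    // Returns the log2 scale an Unsafe raw op should keep for the matched index.
    // The first op that scales an index pins it; later ops on the same pinned
    // index get scale 0, so the scale is applied only once.
    static int guardedScale(Index index, int matchedLog2Scale) {
        if (index == null) {
            return matchedLog2Scale;
        }
        if (index.pinned) {
            return 0;
        }
        if (matchedLog2Scale != 0) {
            index.pinned = true;
        }
        return matchedLog2Scale;
    }

    public static void main(String[] args) {
        Index shared = new Index();               // index shared by put and get
        int putScale = guardedScale(shared, 3);   // models Unsafe.put
        int getScale = guardedScale(shared, 3);   // models Unsafe.get
        System.out.println("put log2_scale = " + putScale);
        System.out.println("get log2_scale = " + getScale);
    }
}
```

In this model the put keeps `log2_scale = 3` and the get falls back to `0`, which is the same pattern as the "after fix" log lines reported in the quoted message.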
>>>
>>> On Sun, Jul 12, 2015 at 2:54 PM, Martijn Verburg
>>> <martijnverburg at gmail.com> wrote:
>>>
>>> Non reviewer here, but I'd add to the comment *why* you don't
>>> want to scale again.
>>>
>>> Cheers,
>>> Martijn
>>>
>>> On 12 July 2015 at 11:29, Serkan Özal <serkan at hazelcast.com> wrote:
>>>
>>> Hi all,
>>>
>>> I have created a webrev of the patch for review; it is publicly
>>> available here:
>>> https://s3.amazonaws.com/jdk-8087134/webrev.00/index.html
>>>
>>> Regards.
>>>
>>> On Sat, Jul 4, 2015 at 9:06 PM, Serkan Özal
>>> <serkan at hazelcast.com> wrote:
>>>
>>> Hi,
>>>
>>> I have added some logs to show that the problem is caused
>>> by double scaling of the offset (index).
>>>
>>> Here is my updated reproducer code (with log messages added):
>>>
>>>
>>> int count = 100000;
>>> long size = count * 8L;
>>> long baseAddress = unsafe.allocateMemory(size);
>>> System.out.println("Start address: " + Long.toHexString(baseAddress) +
>>>                    ", End address: " + Long.toHexString(baseAddress + size));
>>>
>>> for (int i = 0; i < count; i++) {
>>>     long address = baseAddress + (i * 8L);
>>>     System.out.println(
>>>         "Normal: " + Long.toHexString(address) + ", " +
>>>         "If double scaled: " + Long.toHexString(baseAddress + (i * 8L * 8L)));
>>>     long expected = i;
>>>     unsafe.putLong(address, expected);
>>>     unsafe.getLong(address);
>>> }
>>>
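The snippet above uses an `unsafe` instance without showing how it is obtained. Here is a self-contained sketch, assuming the common reflective lookup of the private `theUnsafe` field. The class name and the `roundTrip` method are hypothetical, and the `findUnsafe()` helper used in the original test may be implemented differently.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class UnsafeReproSketch {
    // Assumption: the Unsafe singleton is read reflectively from the
    // private static field "theUnsafe".
    static Unsafe findUnsafe() throws ReflectiveOperationException {
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        return (Unsafe) field.get(null);
    }

    // Writes i into the i-th 8-byte slot, reads it back, and returns the
    // number of slots whose value did not round-trip.
    static int roundTrip(Unsafe unsafe, int count) {
        long size = count * 8L;
        long baseAddress = unsafe.allocateMemory(size);
        int mismatches = 0;
        try {
            for (int i = 0; i < count; i++) {
                long address = baseAddress + (i * 8L);
                long expected = i;
                unsafe.putLong(address, expected);
                long actual = unsafe.getLong(address);
                if (expected != actual) {
                    mismatches++;
                }
            }
        } finally {
            unsafe.freeMemory(baseAddress);
        }
        return mismatches;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("mismatches: " + roundTrip(findUnsafe(), 100000));
    }
}
```

On a JVM that includes the fix this should report no mismatches; on the affected builds discussed in this thread it may instead report mismatches or crash, as described in the reports.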
>>>
>>> After some time it crashes with:
>>>
>>>
>>> ...
>>> Current thread (0x0000000002068800): JavaThread
>>> "main" [_thread_in_Java, id=10412,
>>> stack(0x00000000023f0000,0x00000000024f0000)]
>>>
>>> siginfo: ExceptionCode=0xc0000005, reading address
>>> 0x0000000059061020
>>> ...
>>> ...
>>>
>>>
>>> And here is output of the execution until crash:
>>>
>>> Start address: 58bbcfa0, End address: 58c804a0
>>> Normal: 58bbcfa0, If double scaled: 58bbcfa0
>>> Normal: 58bbcfa8, If double scaled: 58bbcfe0
>>> Normal: 58bbcfb0, If double scaled: 58bbd020
>>> ...
>>> ...
>>> Normal: 58c517b0, If double scaled: 59061020
>>>
>>>
>>> As the logs and the crash dump show, the double-scaled
>>> version of the target address (*If double scaled:
>>> 59061020*) is the same as the faulting address
>>> (*siginfo: ExceptionCode=0xc0000005, reading address
>>> 0x0000000059061020*) whose access causes the crash.
>>>
>>> So I think it is clear that the crash is caused by an
>>> incorrect optimization of the index value: the index is
>>> scaled twice (for *Unsafe::put* and *Unsafe::get*)
>>> instead of only once, and the double-scaled index then
>>> points to an invalid memory address.
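The arithmetic behind this conclusion can be checked from the logged values alone. The sketch below is only a cross-check: the three addresses are copied from the output and crash dump above, while the class and method names are illustrative assumptions. It recovers the loop index at which the fault occurred and confirms that the double-scaled address lies beyond the end of the allocation.

```java
class DoubleScaleCheck {
    static final long BASE  = 0x58bbcfa0L; // "Start address" from the log
    static final long END   = 0x58c804a0L; // "End address" from the log
    static final long FAULT = 0x59061020L; // faulting address from siginfo

    // Correctly scaled address of slot i (8 bytes per long).
    static long normal(long i) {
        return BASE + i * 8L;
    }

    // Address of slot i if the 8-byte scale is applied twice.
    static long doubleScaled(long i) {
        return BASE + i * 8L * 8L;
    }

    // Loop index whose double-scaled address equals the faulting address.
    static long faultingIndex() {
        return (FAULT - BASE) / 64L;
    }

    public static void main(String[] args) {
        long i = faultingIndex();
        System.out.println("index at fault:    " + i);
        System.out.println("normal address:    " + Long.toHexString(normal(i)));
        System.out.println("double scaled:     " + Long.toHexString(doubleScaled(i)));
        System.out.println("beyond allocation: " + (doubleScaled(i) >= END));
    }
}
```

The recovered index is 76034, its correctly scaled address is 58c517b0 (the last "Normal:" value printed before the crash), and its double-scaled address is 59061020, which is past the end address 58c804a0.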
>>>
>>> Regards.
>>>
>>> On Sun, Jun 14, 2015 at 2:39 PM, Serkan Özal
>>> <serkan at hazelcast.com> wrote:
>>>
>>> Hi all,
>>>
>>> I dug into the issue by going through the JDK HotSpot commits, and
>>> the issue arose after this commit:
>>>
>>> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/a60a1309a03a
>>>
>>> Then I added some additional logs to *"vm/c1/c1_Canonicalizer.cpp"*:
>>>
>>> void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
>>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>   tty->print_cr("Canonicalizer: do_UnsafeGetRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>> }
>>>
>>> void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
>>>   if (OptimizeUnsafes) do_UnsafeRawOp(x);
>>>   tty->print_cr("Canonicalizer: do_UnsafePutRaw id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                 x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>> }
>>>
>>> So I ran the test, calculating the address as:
>>>
>>> - *"int * long"* (int is index and long is 8L)
>>> - *"long * long"* (the first long is index and the second long is 8L)
>>> - *"int * int"* (the first int is index and the second int is 8)
>>>
>>> Here are the logs:
>>>
>>> *int * long:*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 27, log2_scale = 3
>>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 27, log2_scale = 3
>>>
>>> *long * long:*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>>> Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 3
>>>
>>> *int * int:*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 33: base = id 13, index = id 29, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 36: base = id 13, index = id 29, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 19: base = id 8, index = id 15, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 8, index = id 15, log2_scale = 0
>>>
>>> As you can see, in the problematic runs (*"int * long"* and
>>> *"long * long"*) there are two scalings: one for *"Unsafe.put"* and
>>> one for *"Unsafe.get"*, and both instructions point to the same
>>> *"base"* and *"index"* instructions. This means that the address is
>>> scaled one time too many, because there should be only one scaling.
>>>
>>> When I debugged the non-problematic run (*"int * int"*), I saw that
>>> *"instr->as_ArithmeticOp();"* always returns *"null"*, so the
>>> *"match_index_and_scale"* method always returns *"false"*. So there
>>> is no scaling.
>>>
>>> static bool match_index_and_scale(Instruction* instr,
>>>                                   Instruction** index,
>>>                                   int* log2_scale) {
>>>   ...
>>>   ArithmeticOp* arith = instr->as_ArithmeticOp();
>>>   if (arith != NULL) {
>>>     ...
>>>   }
>>>   return false;
>>> }
>>>
>>> Then I added my fix attempt, which prevents multiple scaling for
>>> Unsafe instructions that point to the same index instruction:
>>>
>>> void Canonicalizer::do_UnsafeRawOp(UnsafeRawOp* x) {
>>>   Instruction* base = NULL;
>>>   Instruction* index = NULL;
>>>   int log2_scale;
>>>
>>>   if (match(x, &base, &index, &log2_scale)) {
>>>     x->set_base(base);
>>>     x->set_index(index);
>>>     // The fix attempt here
>>>     // /////////////////////////////
>>>     if (index != NULL) {
>>>       if (index->is_pinned()) {
>>>         log2_scale = 0;
>>>       } else {
>>>         if (log2_scale != 0) {
>>>           index->pin();
>>>         }
>>>       }
>>>     }
>>>     // /////////////////////////////
>>>     x->set_log2_scale(log2_scale);
>>>     if (PrintUnsafeOptimization) {
>>>       tty->print_cr("Canonicalizer: UnsafeRawOp id %d: base = id %d, index = id %d, log2_scale = %d",
>>>                     x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
>>>     }
>>>   }
>>> }
>>>
>>> In this fix attempt, if the Unsafe instruction is scaled, I pin its
>>> index instruction; on later calls, if the index instruction is
>>> already pinned, I assume that scaling has already been applied, so
>>> no further scaling is needed.
>>>
>>> After this fix, I reran the problematic test (*"int * long"*) and it
>>> works, with these logs:
>>>
>>> *int * long (after fix):*
>>>
>>> Canonicalizer: do_UnsafeGetRaw id 18: base = id 16, index = id 17, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 20: base = id 16, index = id 19, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 22: base = id 16, index = id 21, log2_scale = 0
>>> Canonicalizer: do_UnsafeGetRaw id 24: base = id 16, index = id 23, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 35: base = id 13, index = id 14, log2_scale = 3
>>> Canonicalizer: do_UnsafeGetRaw id 37: base = id 13, index = id 14, log2_scale = 0
>>> Canonicalizer: do_UnsafePutRaw id 21: base = id 8, index = id 11, log2_scale = 3
>>> Canonicalizer: do_UnsafeGetRaw id 23: base = id 8, index = id 11, log2_scale = 0
>>>
>>> I am not sure whether my fix attempt is a real fix, or whether
>>> there are better fixes.
>>>
>>> Regards.
>>>
>>> --
>>> Serkan ÖZAL
>>>
>>> Btw (thanks to one of my colleagues), when the address calculation
>>> in the loop is changed to
>>>
>>> long address = baseAddress + (i * 8)
>>>
>>> the test passes. The only difference is that the next long pointer
>>> is calculated using integer 8 instead of long 8.
>>>
>>> ```
>>> for (int i = 0; i < count; i++) {
>>>     long address = baseAddress + (i * 8); // <--- here, integer 8 instead of long 8
>>>     long expected = i;
>>>     unsafe.putLong(address, expected);
>>>     long actual = unsafe.getLong(address);
>>>     if (expected != actual) {
>>>         throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>>>     }
>>> }
>>> ```
>>>
>>> On Tue, Jun 9, 2015 at 1:07 PM Mehmet Dogan <mehmet at hazelcast.com> wrote:
>>>
>>> > Hi all,
>>> >
>>> > While I was testing my app using java 8, I encountered the previously
>>> > reported sun.misc.Unsafe issue.
>>> >
>>> > https://bugs.openjdk.java.net/browse/JDK-8076445
>>> >
>>> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017685.html
>>> >
>>> > Issue status says it's resolved with resolution "Cannot Reproduce". But
>>> > unfortunately it's still reproducible using "1.8.0_60-ea-b18" and
>>> > "1.9.0-ea-b67".
>>> >
>>> > Test is very simple:
>>> >
>>> > ```
>>> > public static void main(String[] args) throws Exception {
>>> >     Unsafe unsafe = findUnsafe();
>>> >     // 10000 pass
>>> >     // 100000 jvm crash
>>> >     // 1000000 fail
>>> >     int count = 100000;
>>> >     long size = count * 8L;
>>> >     long baseAddress = unsafe.allocateMemory(size);
>>> >
>>> >     try {
>>> >         for (int i = 0; i < count; i++) {
>>> >             long address = baseAddress + (i * 8L);
>>> >
>>> >             long expected = i;
>>> >             unsafe.putLong(address, expected);
>>> >
>>> >             long actual = unsafe.getLong(address);
>>> >
>>> >             if (expected != actual) {
>>> >                 throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>>> >             }
>>> >         }
>>> >     } finally {
>>> >         unsafe.freeMemory(baseAddress);
>>> >     }
>>> > }
>>> > ```
>>> >
>>> > It's not failing up to version 1.8.0.31, by starting 1.8.0.40 test is
>>> > failing constantly.
>>> >
>>> > - With iteration count 10000, test is passing.
>>> > - With iteration count 100000, jvm is crashing with SIGSEGV.
>>> > - With iteration count 1000000, test is failing with AssertionError.
>>> >
>>> > When one of compilation (-Xint) or inlining (-XX:-Inline) or
>>> > on-stack-replacement (-XX:-UseOnStackReplacement) is disabled, test is not
>>> > failing at all.
>>> >
>>> > I tested on platforms:
>>> > - Centos-7/openjdk-1.8.0.45
>>> > - OSX/oraclejdk-1.8.0.40
>>> > - OSX/oraclejdk-1.8.0.45
>>> > - OSX/oraclejdk-1.8.0_60-ea-b18
>>> > - OSX/oraclejdk-1.9.0-ea-b67
>>> >
>>> > Previous issue comment (
>>> > https://bugs.openjdk.java.net/browse/JDK-8076445?focusedCommentId=13633043#comment-13633043
>>> > )
>>> > says "Cannot reproduce based on the latest version". I hope that latest
>>> > version is not mentioning to '1.8.0_60-ea-b18' or '1.9.0-ea-b67'. Because
>>> > both are failing.
>>> >
>>> > I'm looking forward to hearing from you.
>>> >
>>> > Thanks,
>>> > -Mehmet Dogan-
>>> > --
>>> > @mmdogan
>>>
>>>
>>> --
>>> Serkan ÖZAL
>>> Remotest Software Engineer
>>> GSM: +90 542 680 39 18
>>> Twitter: @serkan_ozal
>>>
>>
>>
>>
>>
>> --
>> Serkan ÖZAL
>> Remotest Software Engineer
>> GSM: +90 542 680 39 18
>> Twitter: @serkan_ozal
>>
>
--
Cheers, Martijn (Sent from Gmail Mobile)