[concurrency-interest] a volatile bug?
Aleksey Shipilev
aleksey.shipilev at gmail.com
Thu May 17 13:22:43 PDT 2012
Interesting but unrelated bug, is there a CR already?
Basically, if you run the test with safepoint tracing, you will see:
$ ~/Install/jdk7u4/bin/java -XX:+SafepointTimeout -server VirtualMachineLiveLock
...
temp = 34
temp = 35
temp = 36
temp = 37
temp = 38
temp = 39
# SafepointSynchronize::begin: Timeout detected:
# SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# "Thread-1" prio=10 tid=0x6bcff000 nid=0x117a runnable [0x00000000]
java.lang.Thread.State: RUNNABLE
# SafepointSynchronize::begin: (End of list)
-Aleksey.
On Thu, May 17, 2012 at 11:59 PM, Dr Heinz M. Kabutz
<heinz at javaspecialists.eu> wrote:
> That's not all.
>
> Using volatile fields, you can cause a beautiful unbreakable hard spin in
> JVM Server Hotspot since 1.6.0_14, basically livelocking the JVM. It won't
> react to any of the usual Java tools, like jconsole, jstack, etc. Only kill
> -9 to shut it down, plus of course the usual C tools.
>
> https://github.com/kabutz/javaspecialists/blob/master/src/main/java/eu/javaspecialists/tjsn/examples/issue188/VirtualMachineLiveLock.java
>
> and mentioned (briefly) at the end of
> http://www.javaspecialists.eu/archive/Issue188.html
>
> I discovered this in 2010 already, but have so far not managed to get it
> fixed.
>
> And this is a bug in the Server HotSpot JVM - so that is not necessarily a
> refuge where you can hide from volatile bugs!
>
> As we approach the limits with our clever concurrent code, I believe we will
> see issues like this crop up more and more.
>
> Regards
>
> Heinz
> --
> Dr Heinz M. Kabutz (PhD CompSci)
> Author of "The Java(tm) Specialists' Newsletter"
> Sun Java Champion
> IEEE Certified Software Development Professional
> http://www.javaspecialists.eu
> Tel: +30 69 75 595 262
> Skype: kabutz
>
>
>
> On 5/17/12 1:01 AM, Boehm, Hans wrote:
>
> Would it make sense to expand a test suite with a bunch of memory model
> tests, e.g.:
>
> - no CSE across volatile load
> - corresponding test for a volatile store:
>   r1 = x; v = ...; r2 = x; use r1 doesn't replace the use of r1 with r2
> - Dekker's example
> - No fusion of potentially infinite loops
> - Maybe IRIW/write atomicity
>
> ?
>
> My main concern here stems from the fact that this is perhaps the most basic
> test of volatiles that fails. Is there any reason to believe the
> harder-to-enforce properties of volatiles hold?
>
> Hans
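
A minimal sketch of what one such test could look like, taking Dekker's example from the list above. The class name, method name and iteration count are hypothetical and are not from any existing test suite. With both fields volatile, the JMM forbids the outcome where both threads read 0, so the counter should stay at zero on a correct JVM. Starting fresh threads per iteration makes the race window small, so this is only a smoke test, not a stress test.

```java
class DekkerVolatileSketch {
    // Both flags are volatile, so the accesses are sequentially consistent.
    static volatile int x, y;
    // Results are plain fields; Thread.join() makes them visible to the caller.
    static int r1, r2;

    /** Runs the litmus test and returns how often the forbidden (0, 0) result was seen. */
    static int run(int iterations) {
        int forbidden = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            Thread t1 = new Thread(() -> { x = 1; r1 = y; });
            Thread t2 = new Thread(() -> { y = 1; r2 = x; });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            // At least one thread must observe the other's store.
            if (r1 == 0 && r2 == 0) {
                forbidden++;
            }
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + run(10_000));
    }
}
```

A real suite would likely reuse long-lived threads and run each case under both -client and -server, since the failure discussed in this thread only shows up in compiled code.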
>
>
>
> -----Original Message-----
> From: Aleksey Shipilev [mailto:aleksey.shipilev at gmail.com]
> Sent: Wednesday, May 16, 2012 2:22 PM
> To: Vitaly Davidovich
> Cc: Boehm, Hans; hotspot compiler; concurrency-interest at cs.oswego.edu
> Subject: Re: [concurrency-interest] a volatile bug?
>
> Well, I do not want to sound alarming, but... if I understand the C1
> code correctly, then C1 GVN does not account for prior volatile reads at
> all. I cannot find anything in the C1 GVN code that actually prevents
> eliminating the second non-volatile read after a volatile one, even
> though JMM semantics require that read to be performed.
>
> I think I'll stop here. The impact of this issue is limited, given
> most of the guys run -server (even by default on most machines), so
> there is always the workaround of running with -server. Also, I would
> *speculate* that turning off GVN with -XX:-UseGlobalValueNumbering when
> running with -client is still a workaround, but a kind of insane one,
> since it can *severely* degrade performance.
>
> Words of wisdom: I'm using this command-line to print out GVN tracing:
> $ ~/Install/jdk7u4/fastdebug/bin/java -XX:+PrintCompilation
> -XX:+PrintDominators -XX:+PrintCompilation -XX:+PrintValueNumbering
> -Xbatch -XX:CompileOnly=Test.$1,Test -client Test 2>&1 | tee asm.log
>
> Can anyone more proficient in C1 code confirm this?
>
> -Aleksey.
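
To illustrate the rule being discussed, here is a toy model of value numbering over field loads. It is not C1's code and does not mirror its data structures; every name in it is made up for illustration. The only point is that a pass which reuses earlier loads must forget what it knows about memory when it crosses a volatile read, otherwise the second plain read of b is replaced by the first.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyValueNumbering {
    /** A field load in program order; isVolatile marks a volatile read. */
    static final class Load {
        final String field;
        final boolean isVolatile;
        Load(String field, boolean isVolatile) {
            this.field = field;
            this.isVolatile = isVolatile;
        }
    }

    /**
     * For each load, returns the index of the earlier load whose value it
     * reuses, or its own index if the load must actually be performed.
     */
    static int[] number(List<Load> loads, boolean killOnVolatile) {
        Map<String, Integer> available = new HashMap<>();
        int[] result = new int[loads.size()];
        for (int i = 0; i < loads.size(); i++) {
            Load load = loads.get(i);
            if (load.isVolatile) {
                // A volatile read invalidates every remembered plain load.
                if (killOnVolatile) {
                    available.clear();
                }
                result[i] = i; // volatile loads are never reused
                continue;
            }
            Integer earlier = available.get(load.field);
            if (earlier != null) {
                result[i] = earlier; // reuse: the load is eliminated
            } else {
                available.put(load.field, i);
                result[i] = i;
            }
        }
        return result;
    }

    /** The access pattern from this thread: read b, volatile read a, read b. */
    static List<Load> example() {
        return Arrays.asList(new Load("b", false), new Load("a", true), new Load("b", false));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(number(example(), false))); // [0, 1, 0]
        System.out.println(Arrays.toString(number(example(), true)));  // [0, 1, 2]
    }
}
```

With the kill disabled, the third load maps back to index 0, which corresponds to the cached $b seen in the miscompiled code; with the kill enabled, it is performed again.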
>
> On Thu, May 17, 2012 at 12:41 AM, Aleksey Shipilev
> <aleksey.shipilev at gmail.com> wrote:
>
>
> In my case, there are always two compiled versions for Test$1.run, one
> with cached $b, second one is with correct read for $b. I'd guess
> pastebin version had the second one.
>
> -Aleksey.
>
> On Thu, May 17, 2012 at 12:34 AM, Vitaly Davidovich
> <vitalyd at gmail.com> wrote:
>
> I looked at the assembly on SO again (the pastebin link) and it seems to be
> correct actually: after 'a' is cmp'ed against zero, 'b' is read from
> memory. But now someone is saying there that it sometimes generates the
> correct assembly and other times not - very strange.
>
> 0x025bd2b9: cmp $0x0,%edx
> 0x025bd2bc: je 0x025bd2a8 ;
> 0x025bd2be: mov $0x147062e8,%edx ; {oop('test/TestVolatile')}
> 0x025bd2c3: mov 0x1c4(%edx),%edx ;*getstatic b
>     ; - test.TestVolatile::run at 10 (line 17)
> 0x025bd2c9: cmp $0x0,%edx
>
> Sent from my phone
>
> On May 16, 2012 3:55 PM, "Aleksey Shipilev"
> <aleksey.shipilev at gmail.com> wrote:
>
> Update. GVN is clearly under suspicion: -XX:-UseGlobalValueNumbering
> mitigates the bug in my setup. Digging through the C1 codebase to see
> the rules for volatiles.
>
> -Aleksey.
>
> On Wed, May 16, 2012 at 11:23 PM, Aleksey Shipilev
> <aleksey.shipilev at gmail.com> wrote:
>
>
> All right, here's what is on the table.
>
> This bug is reproduced for me on Linux i686 with:
> java version "1.7.0_04"
> Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
> Java HotSpot(TM) Client VM (build 23.0-b21, mixed mode)
>
> It reproduces immediately only with -client.
> Both -server and -Xint do NOT reproduce the bug.
> The code is in the original SO post:
>
> http://stackoverflow.com/questions/10620680/why-volatile-in-java-5-doesnt-synchronize-cached-copies-of-variables-with-main
>
> C1 seems to miscompile run(), and indeed does CSE for local:
>
> # {method} 'run' '()V' in 'Test$1'
> [Verified Entry Point]
> 0xb4a91e80: mov %eax,-0x4000(%esp)
> 0xb4a91e87: push %ebp
> 0xb4a91e88: sub $0x18,%esp ;*invokestatic access$000
>     ; - Test$1::run at 0 (line 11)
> 0xb4a91e8b: mov $0xa09c4270,%edx ; {oop(a 'java/lang/Class' = 'Test')}
> 0xb4a91e90: mov 0x74(%edx),%edx ;*getstatic b <<<<<---- loads $b to %edx
>     ; - Test::access$000 at 0 (line 1)
>     ; - Test$1::run at 0 (line 11)
> 0xb4a91e93: jmp 0xb4a91e9e ; OopMap{off=40}
>     ;*goto
>     ; - Test$1::run at 10 (line 13)
> 0xb4a91e98: test %eax,0xb77a9100 ;*goto
>     ; - Test$1::run at 10 (line 13)
>     ; {poll}
> 0xb4a91e9e: mov $0xa09c4270,%ecx ; {oop(a 'java/lang/Class' = 'Test')}
> 0xb4a91ea3: mov 0x70(%ecx),%ecx ;*getstatic a <<<<< volatile read for $a
>     ; - Test::access$100 at 0 (line 1)
>     ; - Test$1::run at 4 (line 13)
> 0xb4a91ea6: cmp $0x0,%ecx // <---- $a is at %ecx
> 0xb4a91ea9: je 0xb4a91e98 ;*ifne
>     ; - Test$1::run at 7 (line 13)
> >>>> 0xb4a91eab: cmp $0x0,%edx // <<<<<<---- $b is cached in %edx here
> 0xb4a91eae: jne 0xb4a91ed8 ;*ifne
>     ; - Test$1::run at 16 (line 17)
> 0xb4a91eb4: nopl 0x0(%eax)
> 0xb4a91eb8: jmp 0xb4a91f0e ; {no_reloc}
> 0xb4a91ebd: xchg %ax,%ax
> 0xb4a91ec0: jmp 0xb4a91f28 ; implicit exception: dispatches to 0xb4a91f18
> 0xb4a91ec5: nop ;*getstatic out
>     ; - Test$1::run at 19 (line 18)
> 0xb4a91ec6: cmp (%ecx),%eax ; implicit exception: dispatches to 0xb4a91f32
> 0xb4a91ec8: mov $0xa09c6488,%edx ;*invokevirtual println
>     ; - Test$1::run at 24 (line 18)
>     ; {oop("error")}
>
>
>
> Thanks,
> Aleksey.
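
For readers without the SO post at hand, the shape of the test implied by the listing above can be sketched roughly as follows. This is a reconstruction from the bytecode comments in the listing (getstatic b, getstatic a, println of "error") and from the "int tt = b;" line mentioned elsewhere in this thread. It is not the original code: the class name, the helper method, the array used to report the result and the 100 ms delay are all assumptions.

```java
class VolatileVisibilitySketch {
    static int b;           // plain field, published via the volatile write to a
    static volatile int a;  // volatile flag

    /** Returns true if the reader saw a stale b after observing a != 0. */
    static boolean sawStaleB() {
        a = 0;
        b = 0;
        final boolean[] stale = new boolean[1];
        Thread reader = new Thread(new Runnable() {
            @Override
            public void run() {
                int tt = b;          // early plain read of b, before the loop
                while (a == 0) {
                    // spin on the volatile read of a
                }
                if (b == 0) {        // b must be re-read here, not taken from tt
                    stale[0] = true; // corresponds to printing "error"
                }
            }
        });
        reader.start();
        try {
            Thread.sleep(100);       // let the reader enter the loop (assumed delay)
            b = 1;                   // plain write ...
            a = 1;                   // ... published by the volatile write
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stale[0];
    }

    public static void main(String[] args) {
        System.out.println(sawStaleB() ? "error" : "ok");
    }
}
```

Under the JMM the volatile read of a that sees 1 happens after the write b = 1, so the final read of b must return 1. A single short run like this may well stay in the interpreter, so reproducing a compiler problem would need the method to be compiled first, for example by running it in a loop or with -Xbatch as in the tracing command quoted earlier.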
>
> On Wed, May 16, 2012 at 11:12 PM, Vitaly Davidovich
> <vitalyd at gmail.com> wrote:
>
> It can be a compiler (mis)optimization that causes this, and not x86 memory
> ordering.
>
> Someone posted the assembly output in the comments on SO and it does seem
> like there's a place that loads 'b' from the stack rather than memory.
> Hans' theory of CSE sounds plausible - can someone repro this without that
> "int tt = b;" line?
>
> Adding hotspot compiler guys in case they want to chime in.
>
> Sent from my phone
>
> On May 16, 2012 3:07 PM, "Aleksey Shipilev"
> <aleksey.shipilev at gmail.com> wrote:
>
> On Wed, May 16, 2012 at 10:40 PM, Boehm, Hans <hans.boehm at hp.com> wrote:
>
> A JDK bug AND a serious test suite omission?
>
> Stress tests would probably JIT-compile the code in question. See below.
>
> But is the problem real? Can it be reproduced on a mainstream JVM?
>
> Same question.
>
> Note that the example in the original posting also read b before the loop,
> so naïve common subexpression elimination would cause the bug. Hopefully
> nobody does CSE in cases like this.
>
> FWIW, the test case in SO would probably not hit any compilation
> threshold in HotSpot, so it could be executed in interpreter. Then,
> assuming the interpreter does not reorder Java code, and assuming
> original SO poster runs Windows, and hence x86, and hence has TSO,
> this bug seems very unlikely. I would be surprised if it actually
> *can* be reproduced. That makes the whole story rather interesting.
>
>
> -Aleksey.
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
>