Application failure with traversal

Roman Kennke rkennke at redhat.com
Fri Feb 9 21:40:17 UTC 2018


Alright, here's the catch: we cannot forgo the barriers on constants,
because without them we'd have to scan the whole code cache at
init-traversal, which precludes class unloading. That means we need to
figure out why your code trips on an NPE. I suspect we really need a
reproducer for that, or else an -XX:+PrintAssembly dump (with
hsdis-amd64.so on LD_LIBRARY_PATH).

If we are generating barriers on constants anyway, we can just as well
also implement concurrent code cache scanning in traversal mode. This
should reduce pause times significantly.

In the long run we may even do this without barriers on constants:
we'd need nmethod entry barriers (I believe we talked about this a
short while ago) that are activated whenever some code enters a
compiled nmethod, which would scan+evacuate all constants of that
nmethod.
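
To make the idea concrete, here is a toy, single-threaded model of such
an entry barrier. It is only a sketch under loud assumptions: Object,
ToyNMethod, scan_and_evacuate and nmethod_entry_barrier are names made
up for illustration, not HotSpot or Shenandoah code, and it ignores
concurrency, re-arming per GC cycle, and the real object layout.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for a heap object (not a HotSpot type).
struct Object {
    bool in_cset = false;         // lives in a region selected for evacuation
    bool marked = false;          // visited by the traversal
    Object* forwardee = nullptr;  // to-space copy, once evacuated
};

// Hypothetical stand-in for a compiled method with embedded constants.
struct ToyNMethod {
    std::vector<Object*> constants;  // oops embedded in the compiled code
    bool armed = true;               // barrier still has to run on entry
};

// Mark the object and, if it is in the collection set, evacuate it.
// Returns the address the constant should point to afterwards.
// std::deque keeps references to existing elements stable on emplace_back.
Object* scan_and_evacuate(Object* obj, std::deque<Object>& to_space) {
    if (obj == nullptr) return nullptr;
    if (obj->in_cset) {
        if (obj->forwardee == nullptr) {
            to_space.emplace_back();
            obj->forwardee = &to_space.back();
        }
        obj = obj->forwardee;
    }
    obj->marked = true;
    return obj;
}

// Runs when code enters the nmethod. Returns true if it did any work.
bool nmethod_entry_barrier(ToyNMethod& nm, std::deque<Object>& to_space) {
    if (!nm.armed) return false;  // fast path: constants already processed
    for (Object*& c : nm.constants) {
        c = scan_and_evacuate(c, to_space);
        // After the barrier no constant may point into the collection set.
        assert(c == nullptr || !c->in_cset);
    }
    nm.armed = false;  // disarm until the next cycle re-arms it
    return true;
}
```

The point of the model is that the compiled code itself never needs a
barrier on a constant: by the time any instruction of the nmethod runs,
every embedded constant has already been marked and, if necessary,
updated to its to-space copy.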

Also, I forgot to actually post the patch in my last email :-)

http://cr.openjdk.java.net/~rkennke/traversal-no-const-barriers.patch

Roman

On Fri, Feb 9, 2018 at 10:21 PM, Roman Kennke <rkennke at redhat.com> wrote:
> Ok, another attempt. :-)
> 1. Are you on the latest code from shenandoah/jdk10 ? If yes, I am
> wondering why my patchlet did not apply... Also, there may have been
> some changes that address the 'does not converge' assert lately.
> 2. Can you try the following patch and pass the additional VM option:
> -XX:ShenandoahUnloadClassesFrequency=0
>
> I haven't yet managed to trip the assert that you mentioned with this.
>
> Thanks, Roman
>
> On Fri, Feb 9, 2018 at 9:18 PM, Lennart Börjeson
> <lennart.borjeson at cinnober.com> wrote:
>> Thank you both, too! ...then I just take the dog for her walk...
>>
>> Best regards,
>>
>> /Lennart
>>
>>> 9 feb. 2018 kl. 21:16 skrev Roman Kennke <rkennke at redhat.com>:
>>>
>>> Ok, I see. This needs a little more work then. I think for now we have
>>> enough information to proceed. There are two things that need to be
>>> done:
>>>
>>> - Get traversal to work without barriers on constants. This will most
>>> likely fix your original NPE, and also provide a nice performance
>>> boost.
>>> - Get it to work *with* barriers on constants too. This will probably
>>> be needed for concurrent code cache scanning, which will reduce pause
>>> times. (Code cache scanning is currently our worst offender during the
>>> pauses.)
>>>
>>> Thank you for testing and testing and testing again, and all your
>>> patience! We will get back to you once we know more.
>>>
>>> Cheers, Roman
>>>
>>>
>>> On Fri, Feb 9, 2018 at 9:11 PM, Lennart Börjeson
>>> <lennart.borjeson at cinnober.com> wrote:
>>>>
>>>>> 9 feb. 2018 kl. 20:24 skrev Roman Kennke <rkennke at redhat.com>:
>>>>>
>>>>> diff --git a/src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp
>>>>> b/src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp
>>>>> --- a/src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp
>>>>> +++ b/src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp
>>>>> @@ -1308,7 +1308,6 @@
>>>>>    FLAG_SET_DEFAULT(ShenandoahStoreValEnqueueBarrier, true);
>>>>>    FLAG_SET_DEFAULT(ShenandoahKeepAliveBarrier,       false);
>>>>>    FLAG_SET_DEFAULT(ShenandoahAsmWB,                  true);
>>>>> -    FLAG_SET_DEFAULT(ShenandoahBarriersForConst,       true);
>>>>>    FLAG_SET_DEFAULT(ShenandoahWriteBarrierRB,         false);
>>>>>    FLAG_SET_DEFAULT(ShenandoahAllocImplicitLive,      false);
>>>>>  }
>>>>
>>>> I can't seem to get that patch to apply?
>>>>
>>>> patching file src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp
>>>> Hunk #1 FAILED at 1308.
>>>> 1 out of 1 hunk FAILED -- saving rejects to file src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp.rej
>>>>
>>>>
>>>> Anyway, I just commented out that line manually:
>>>>
>>>> diff -r 16f17f40cecf src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp
>>>> --- a/src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp     Thu Feb 08 13:24:03 2018 +0100
>>>> +++ b/src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp     Fri Feb 09 20:50:36 2018 +0100
>>>> @@ -1308,7 +1308,7 @@
>>>>     FLAG_SET_DEFAULT(ShenandoahStoreValEnqueueBarrier, true);
>>>>     FLAG_SET_DEFAULT(ShenandoahKeepAliveBarrier,       false);
>>>>     FLAG_SET_DEFAULT(ShenandoahAsmWB,                  true);
>>>> -    FLAG_SET_DEFAULT(ShenandoahBarriersForConst,       true);
>>>> +    //    FLAG_SET_DEFAULT(ShenandoahBarriersForConst,       true);
>>>>     FLAG_SET_DEFAULT(ShenandoahWriteBarrierRB,         false);
>>>>     FLAG_SET_DEFAULT(ShenandoahAllocImplicitLive,      false);
>>>>   }
>>>>
>>>>
>>>> Anyway, with this change I don't get an NPE, but the application JVM crashes with this:
>>>>
>>>>
>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>> #
>>>> #  Internal Error (/home/lennartb/shenandoah-jdk10/src/hotspot/share/opto/split_if.cpp:173), pid=39409, tid=39425
>>>> #  assert(u_ctrl != blk1 && u_ctrl != blk2) failed: won't converge
>>>> #
>>>> # JRE version: OpenJDK Runtime Environment (10.0) (fastdebug build 10-internal+0-adhoc.lennartb.shenandoah-jdk10)
>>>> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 10-internal+0-adhoc.lennartb.shenandoah-jdk10, mixed mode, tiered, compressed oops, Shenandoah gc, linux-amd64)
>>>> # Core dump will be written. Default location: /home/tetest/TE/system/cd1/core.39409
>>>> #
>>>> # If you would like to submit a bug report, please visit:
>>>> #   http://bugreport.java.com/bugreport/crash.jsp
>>>> #
>>>>
>>>> ---------------  S U M M A R Y ------------
>>>>
>>>> Command Line: -Xms2400M -Xmx4800M -XX:+PrintFlagsFinal -Xlog:gc*=info,safepoint*=info,vmoperation*=trace:stdout:uptime,uptimenanos,timenanos,level,tags -XX:+UnlockExperimentalVMOptions -XX:SyncKnobs=Verbose=1 -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=1 --add-modules=java.xml.bind --add-exports=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED --add-exports=java.base/sun.nio.ch=ALL-UNNAMED -XX:+UseShenandoahGC -XX:ConcGCThreads=8 -XX:ParallelGCThreads=4 -XX:MonitorBound=20000 -XX:-UseBiasedLocking -XX:+DoEscapeAnalysis -XX:+UseNUMA -XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCHeuristics=traversal -DRoundRobinPrio=0 -DHibernate3=true -DdumpConfig=VALUE -Djava.net.preferIPv4Stack=true -Djava.util.prefs.systemRoot=/home/tetest com.cinnober.framework.server.impl.FwStart --stdouttolog --stderrtolog -s CD1 -r http://frank-10g.cinnober.com:22780 -i TE -v com.cinnober.common.version.TeVersion
>>>>
>>>> Host: frank.cinnober.com, Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 48 cores, 503G, CentOS Linux release 7.3.1611 (Core)
>>>> Time: Fri Feb  9 21:08:28 2018 CET elapsed time: 41 seconds (0d 0h 0m 41s)
>>>>
>>>> ---------------  T H R E A D  ---------------
>>>>
>>>> Current thread (0x00007f7d6039e4a0):  JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=39425, stack(0x00007f7d114f6000,0x00007f7d115f7000)]
>>>>
>>>>
>>>> Current CompileTask:
>>>> C2:  41043 9019       4       com.mysql.jdbc.StringUtils::getInt (283 bytes)
>>>>
>>>> Stack: [0x00007f7d114f6000,0x00007f7d115f7000],  sp=0x00007f7d115f02b0,  free space=1000k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>> V  [libjvm.so+0x18af9ef]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x15f
>>>> V  [libjvm.so+0x18b082a]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
>>>> V  [libjvm.so+0xb15bca]  report_vm_error(char const*, int, char const*, char const*, ...)+0xea
>>>> V  [libjvm.so+0x17373ee]  PhaseIdealLoop::split_up(Node*, Node*, Node*) [clone .part.131]+0xe2e
>>>> V  [libjvm.so+0x1739c01]  PhaseIdealLoop::do_split_if(Node*)+0x8d1
>>>> V  [libjvm.so+0x114dcfb]  PhaseIdealLoop::split_if_with_blocks_post(Node*)+0xf8b
>>>> V  [libjvm.so+0x114ddab]  PhaseIdealLoop::split_if_with_blocks(VectorSet&, Node_Stack&)+0xab
>>>> V  [libjvm.so+0x11445f4]  PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0x1494
>>>> V  [libjvm.so+0xa79f20]  Compile::optimize_loops(int&, PhaseIterGVN&, LoopOptsMode)+0x100
>>>> V  [libjvm.so+0xa82aa2]  Compile::Optimize()+0xe42
>>>> V  [libjvm.so+0xa84350]  Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x1460
>>>> V  [libjvm.so+0x8c41c3]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x123
>>>> V  [libjvm.so+0xa9142d]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x35d
>>>> V  [libjvm.so+0xa924d8]  CompileBroker::compiler_thread_loop()+0x3b8
>>>> :
>>>>
>>>>
>>

