Shenandoah performance problem

Roman Kennke rkennke at redhat.com
Fri Oct 18 11:51:16 UTC 2019


Hi Attila,

You could try to run with fastdebug build, which builds in lots of
asserts. In addition to that, try running with -XX:+ShenandoahVerify.
Maybe this gives us some clues.

Thanks,
Roman


> I've applied the patches on top of
> |http:||//hg||.openjdk.java.net||/shenandoah/jdk|
> 
> $ hg parent
> changeset:   57587:c29454a99107
> tag:         tip
> user:        zgu
> date:        Wed Oct 16 15:13:00 2019 -0400
> summary:     8230773: Shenandoah: enable concurrent class unloading on
> x86_32 platforms
> 
> and did some tests in production.
> 
> Good news: the startup time is back to normal, and the scalability
> bottleneck also seems resolved.
> 
> Bad news: I see some random jvm crashes, not sure if they are related to
> this patch.
> 
> Should I use a different parent revision to apply the patches to?
> 
> 
> The crashes:
> 
> ```
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f7fd69a2f51, pid=20946, tid=20988
> #
> # JRE version: OpenJDK Runtime Environment (14.0) (build
> 14-internal+0-adhoc.axt.shenandoah)
> # Java VM: OpenJDK 64-Bit Server VM (14-internal+0-adhoc.axt.shenandoah,
> mixed mode, tiered, compressed oops, shenandoah gc, linux-amd64)
> # Problematic frame:
> # V  [libjvm.so+0x8a8f51]
> AccessInternal::PostRuntimeDispatch<ShenandoahBarrierSet::AccessBarrier<565366ul,
> ShenandoahBarrierSet>, (AccessInternal::BarrierType)3,
> 565366ul>::oop_access_barrier(oopDesc*, long)+0x1
> #
> # No core dump will be written. Core dumps have been disabled. To enable
> core dumping, try "ulimit -c unlimited" before starting Java again
> Compiled method (c1) 4554513 16071       3
> java.lang.invoke.LambdaForm$DMH/0x0000000800c80440::invokeStatic (21 bytes)
>  total in heap  [0x00007f7fc2a27010,0x00007f7fc2a2a670] = 13920
>  relocation     [0x00007f7fc2a27170,0x00007f7fc2a27490] = 800
>  main code      [0x00007f7fc2a274a0,0x00007f7fc2a29920] = 9344
>  stub code      [0x00007f7fc2a29920,0x00007f7fc2a29a40] = 288
>  oops           [0x00007f7fc2a29a40,0x00007f7fc2a29b78] = 312
>  metadata       [0x00007f7fc2a29b78,0x00007f7fc2a29c78] = 256
>  scopes data    [0x00007f7fc2a29c78,0x00007f7fc2a2a1d8] = 1376
>  scopes pcs     [0x00007f7fc2a2a1d8,0x00007f7fc2a2a608] = 1072
>  dependencies   [0x00007f7fc2a2a608,0x00007f7fc2a2a618] = 16
>  nul chk table  [0x00007f7fc2a2a618,0x00007f7fc2a2a670] = 88
> ```
> or when flight recording is enabled:
> ```
> #  SIGSEGV (0xb) at pc=0x00007f4b66fdd031, pid=1674, tid=1678
> #
> # JRE version: OpenJDK Runtime Environment (14.0) (build
> 14-internal+0-adhoc.axt.shenandoah)
> # Java VM: OpenJDK 64-Bit Server VM (14-internal+0-adhoc.axt.shenandoah,
> mixed mode, tiered, compressed oops, shenandoah gc, linux-amd64)
> # Problematic frame:
> # V  [libjvm.so+0x932031]
> JfrSymbolId::mark_unsafe_anonymous_klass_name(InstanceKlass const*,
> bool)+0x31
> #
> 
> 
> Current thread (0x00007f4b60185800):  ConcurrentGCThread
> "ShenandoahControlThread" [stack: 0x00007f4b64c2a000,0x00007f4b64d2a000]
> [id=1678]
> 
> Stack: [0x00007f4b64c2a000,0x00007f4b64d2a000], sp=0x00007f4b64d28540, 
> free space=1017k
> Native frames: (J=compiled Java code, A=aot compiled Java code,
> j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x932031]
> JfrSymbolId::mark_unsafe_anonymous_klass_name(InstanceKlass const*,
> bool)+0x31
> V  [libjvm.so+0x932ed5]  JfrArtifactSet::mark(Klass const*, bool)+0x35
> V  [libjvm.so+0x927cc4] write_klass(JfrCheckpointWriter*, Klass const*,
> bool)+0x114
> V  [libjvm.so+0x930f94]  JfrArtifactCallbackHost<Klass const*,
> CompositeFunctor<Klass const*,
> JfrTypeWriterHost<JfrPredicatedTypeWriterImplHost<Klass const*,
> SerializePredicate<Klass const*>, &(write__klass(JfrC
> heckpointWriter*, void const*))>, 20u>, KlassArtifactRegistrator>
>>::do_artifact(void const*)+0x34
> V  [libjvm.so+0x5fe27e]  ClassLoaderData::classes_do(void (*)(Klass*))+0x1e
> V  [libjvm.so+0x6025cb] ClassLoaderDataGraph::classes_unloading_do(void
> (*)(Klass*))+0x2b
> V  [libjvm.so+0x9300f4] JfrTypeSet::serialize(JfrCheckpointWriter*,
> JfrCheckpointWriter*, bool)+0x21d4
> V  [libjvm.so+0x926afc]
> JfrTypeManager::write_type_set_for_unloaded_classes()+0x5c
> V  [libjvm.so+0xf14d11] SystemDictionary::do_unloading(GCTimer*)+0x81
> V  [libjvm.so+0xe86cd0]  ShenandoahUnload::unload()+0x60
> V  [libjvm.so+0xdfacb5] ShenandoahHeap::entry_roots()+0x125
> V  [libjvm.so+0xdf0056]
> ShenandoahControlThread::service_concurrent_normal_cycle(GCCause::Cause)+0xb6
> 
> V  [libjvm.so+0xdf115b] ShenandoahControlThread::run_service()+0x6fb
> V  [libjvm.so+0x6735bb]  ConcurrentGCThread::run()+0x2b
> V  [libjvm.so+0xf66671]  Thread::call_run()+0x111
> V  [libjvm.so+0xc94261] thread_native_entry(Thread*)+0xf1
> 
> 
> ```
> Thanks,
>   axt
> 
> 
> 
> On 10/18/19 1:10 AM, Attila Axt wrote:
>> Thanks for the fix and the quick response.  I've tried the first
>> changeset locally against the benchmark, it seems to work fine.
>>
>> I can try the full changeset earliest on monday in production
>> environment. I will feedback on the results.
>>
>> Greetings,
>>   axt
>>
>> On 10/17/19 8:53 PM, Aleksey Shipilev wrote:
>>> On 10/17/19 6:29 PM, Aleksey Shipilev wrote:
>>>> On 10/17/19 12:49 PM, Aleksey Shipilev wrote:
>>>>> This workload runs:
>>>>>    Parallel GC, -t 1:     0.112 ± 0.006  ms/op
>>>>>    Parallel GC, -t max:   0.146 ± 0.002  ms/op
>>>>>    Shenandoah GC, -t 1:   0.120 ± 0.002  ms/op
>>>>>    Shenandoah GC, -t max: 1.595 ± 0.050  ms/op <--- scalability
>>>>> bottleneck
>>>>>
>>>>> And you are right, that's the contention on pinning/unpinning, I am
>>>>> looking what can be done about
>>>>> that...
>>>> This bunch of fixes makes it much better:
>>>> https://cr.openjdk.java.net/~shade/shenandoah/async-pinning/
>>> Filed:
>>>    https://bugs.openjdk.java.net/browse/JDK-8232575
>>>
>>> ...would RFR it tomorrow.
>>>



More information about the shenandoah-dev mailing list