Performance impact impressions for card marking in Shenandoah

Wed Aug 19 21:27:36 UTC 2020

Very nice! I had expected the impact to be bigger.

Thank you for posting this, and for working on this!

Thanks,
Roman

On Wednesday, 19.08.2020, at 20:13 +0000, Mathiske, Bernd wrote:
> Here are some numbers regarding what might happen in terms of
> slowdown if you add card marking to Shenandoah (see previous emails
> for the patch). Summary: "minor" impact, as expected from anecdotal
> experience with card marking in Parallel and CMS. The following is
> by no means a comprehensive study, just a report of what I have so
> far, so that you need not wait any longer to see where this is going.
> 
> I ran SPECjvm2008 on a c5.2xlarge AWS instance (8 virtual CPUs, 16.0
> GiB memory), using OpenJDK 11.0.7 with Shenandoah.
> 
> .../java -Xms2g -Xmx2g -XX:+UnlockExperimentalVMOptions
> -XX:+UseShenandoahGC -XX:-TieredCompilation -jar SPECjvm2008.jar -coe
> -ict -ikv -wt 15s -it 20s -bt 2
> <compress|crypto|mpegaudio|scimark.large|scimark.small|serial|sunflow
> |xml>
> 
> I could not get the SPECjvm2008 benchmark "compiler" to run on any
> JVM, and "startup" seemed irrelevant in this context. "derby" seems to
> be very sensitive to -Xmx and thus produces red-herring scores for
> our focal point, relative barrier performance. So I am leaving
> "derby" out for now, too.
> 
> In my code, there is still a bug tickled by "derby" and another one
> in array marking by C1 code. The latter prevents me from invoking C1
> here. So I ran all this with C2 only. I figure this is where we would
> see the most impact anyway. I compared my patched JVM ("CardShen") to
> the same JVM with Parallel, CMS, G1, and to an unpatched vanilla
> Shenandoah ("Shen") JVM without card marking. These are the scores in
> SPECjvm ops/sec, averaged over 3 overall runs. Not very precise, due
> to small run lengths, with variations between runs around 0.5-1%, but
> stable enough to get a qualitative idea.
> 
>                  Parallel   CMS    G1   Shen   CardShen
>  compress             127   127   127    108        108
>  crypto               261   255   260    249        248
>  mpegaudio             90    91    92     78         76
>  scimark.large         89    89    89     90         90
>  scimark.small        182   182   173    171        173
>  serial               129   127   120    116        113
>  sunflow              118   117   126    113        113
>  xml                  358   349   315    292        280
> 
> Adding unconditional card marking makes the JVM with Shenandoah have
> on average about 1% lower scores than vanilla Shenandoah. I have
> briefly tried Shenandoah with conditional card marking and saw
> similar results. The more noticeable differences are dominated by
> which collector one chooses to begin with. I would expect that there
> is a throughput penalty for using a concurrent collector, and this
> seems to shine through here sometimes.
> 
> Next, I shall repeat this with JDK tip. 
> Yes, I want to / will fix the remaining bugs, eventually. :-)
> 
> Bernd
> 
>
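
For anyone who wants to check the "about 1% lower" figure against the table, here is a minimal sketch that recomputes the CardShen/Shen ratio per benchmark and averages it. The scores are copied from the quoted table; the class and method names are made up for this illustration, and whether Bernd used an arithmetic or a geometric mean is not stated, so the sketch computes both.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RelativeScores {
    // SPECjvm2008 ops/sec from the quoted table: {Shen, CardShen}.
    static final Map<String, double[]> SCORES = new LinkedHashMap<>();
    static {
        SCORES.put("compress",      new double[] {108, 108});
        SCORES.put("crypto",        new double[] {249, 248});
        SCORES.put("mpegaudio",     new double[] {78, 76});
        SCORES.put("scimark.large", new double[] {90, 90});
        SCORES.put("scimark.small", new double[] {171, 173});
        SCORES.put("serial",        new double[] {116, 113});
        SCORES.put("sunflow",       new double[] {113, 113});
        SCORES.put("xml",           new double[] {292, 280});
    }

    /** Arithmetic mean of the per-benchmark CardShen/Shen ratios. */
    static double meanRatio(Map<String, double[]> scores) {
        double sum = 0.0;
        for (double[] pair : scores.values()) {
            sum += pair[1] / pair[0];
        }
        return sum / scores.size();
    }

    /** Geometric mean of the same ratios, computed via logarithms. */
    static double geometricMeanRatio(Map<String, double[]> scores) {
        double logSum = 0.0;
        for (double[] pair : scores.values()) {
            logSum += Math.log(pair[1] / pair[0]);
        }
        return Math.exp(logSum / scores.size());
    }

    public static void main(String[] args) {
        // Per-benchmark change of CardShen relative to Shen, in percent.
        for (Map.Entry<String, double[]> e : SCORES.entrySet()) {
            double[] pair = e.getValue();
            double changePercent = (pair[1] / pair[0] - 1.0) * 100.0;
            System.out.printf("%-14s %+.1f%%%n", e.getKey(), changePercent);
        }
        System.out.printf("arithmetic mean ratio: %.4f%n", meanRatio(SCORES));
        System.out.printf("geometric mean ratio:  %.4f%n", geometricMeanRatio(SCORES));
    }
}
```

Both means land close to 0.99, which agrees with the roughly 1% average slowdown stated in the quoted message. Since the table holds rounded scores and the run-to-run variation is given as 0.5-1%, the smaller per-benchmark differences should be treated as noise rather than as a measured effect.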