[11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention

Lindenmaier, Goetz goetz.lindenmaier at sap.com
Mon Jun 29 16:26:03 UTC 2020


Hi Patrick,

If you use sizeof(uint), it builds on Mac. uint is also what
jdk/jdk uses here.
I put it into our CI again to make sure all platforms build.
I'll update you tomorrow (or ping me if I forget).

Also I think we should move the line below the
declaration of _bottom, as it now depends on the
type used there.

Best regards,
Goetz.

diff --git a/src/hotspot/share/gc/shared/taskqueue.hpp b/src/hotspot/share/gc/shared/taskqueue.hpp
--- a/src/hotspot/share/gc/shared/taskqueue.hpp
+++ b/src/hotspot/share/gc/shared/taskqueue.hpp
@@ -113,6 +113,8 @@
   // The first free element after the last one pushed (mod N).
   volatile uint _bottom;
+  // Add paddings to reduce false-sharing cache contention between _bottom and _age
+  DEFINE_PAD_MINUS_SIZE(0, DEFAULT_CACHE_LINE_SIZE, sizeof(uint));
   enum { MOD_N_MASK = N - 1 };
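
For readers following the diff, here is a minimal sketch of what the padding does to the field layout, and why the size argument has to match the type of _bottom. Everything in it is an assumption made for illustration: 64 bytes stands in for DEFAULT_CACHE_LINE_SIZE (which is platform dependent), SKETCH_PAD_MINUS_SIZE is a simplified stand-in for HotSpot's DEFINE_PAD_MINUS_SIZE, QueueFieldsSketch is a hypothetical struct rather than the real TaskQueueSuper, and _age is modeled as a plain size_t.

```cpp
#include <cstddef>

// Assumption: 64 bytes stands in for DEFAULT_CACHE_LINE_SIZE.
constexpr std::size_t kCacheLine = 64;

// Simplified, hypothetical stand-in for DEFINE_PAD_MINUS_SIZE: a char
// array that fills what is left of `alignment` bytes after `size`
// bytes of payload.
#define SKETCH_PAD_MINUS_SIZE(id, alignment, size) \
  char _pad_buf##id[(alignment) - (size)]

typedef unsigned int uint_sketch;  // plays the role of HotSpot's uint

// Hypothetical reduction of the two contended fields.
struct QueueFieldsSketch {
  volatile uint_sketch _bottom;
  // The size argument must be the size of the field declared just
  // above, which is why the pad belongs right below _bottom.
  SKETCH_PAD_MINUS_SIZE(0, kCacheLine, sizeof(uint_sketch));
  volatile std::size_t _age;  // modeled as size_t for this sketch only
};

// Distance in bytes between the start of _bottom and the start of _age.
constexpr std::size_t age_distance() {
  return offsetof(QueueFieldsSketch, _age) -
         offsetof(QueueFieldsSketch, _bottom);
}

// With the pad in place the two fields start one full line apart, so
// they cannot share a cache line of kCacheLine bytes.
static_assert(age_distance() == kCacheLine,
              "_bottom and _age should be one cache line apart");
```

If the type of _bottom changed without the size argument being updated, the static_assert would be the first thing to fail, which is the dependency mentioned above.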


From: Patrick Zhang OS <patrick at os.amperecomputing.com>
Sent: Saturday, June 27, 2020 11:33 AM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; jdk-updates-dev at openjdk.java.net
Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject: RE: [11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention

Thanks Goetz

I updated the list of reviewers: http://cr.openjdk.java.net/~qpzhang/8248214/webrev.02/jdk11u-dev.changeset. Regarding performance, I ran tests on Linux with a couple of x86_64 and aarch64 servers. I am not sure whether mentioning specjbb here is appropriate, but so far most results of this benchmark are positive, especially for the metrics sensitive to GC stability (G1 or ParallelGC). There is no obvious change in the others, probably due to microarchitecture-level differences in handling exclusive load/store. This is similar to the original patch [1].

I also updated the "Fix request (11u)" comment with a risk estimate for this downport; please see JBS [1].

I am not familiar with the jdk-updates process. Is it OK to push this downport patch now, or should I still wait for the maintainer's approval in JBS (jdk11u-fix-yes?)?

[1] https://bugs.openjdk.java.net/browse/JDK-8248214?focusedCommentId=14349531&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14349531


Regards

Patrick



-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Sent: Friday, June 26, 2020 3:17 PM
To: Patrick Zhang OS <patrick at os.amperecomputing.com>; jdk-updates-dev at openjdk.java.net
Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject: RE: [11u] RFR: 8244214: Add paddings for TaskQueueSuper to reduce false-sharing cache contention



Hi Patrick,

I had a look at your change.
I think it makes sense to bring this to 11, if there actually is the performance gain you mention.
Reviewed.

Please add the risk of downporting this to the "Fix request" comment in JBS. I also think it should be "Fix request (11u)",
because different people will review your fix requests for 11 and 8.

Best regards,
  Goetz.



> -----Original Message-----
> From: jdk-updates-dev <jdk-updates-dev-bounces at openjdk.java.net> On Behalf Of Patrick Zhang OS
> Sent: Wednesday, June 24, 2020 11:55 AM
> To: jdk-updates-dev at openjdk.java.net
> Cc: hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
> Subject: [DMARC FAILURE] [11u] RFR: 8244214: Add paddings for
> TaskQueueSuper to reduce false-sharing cache contention
>
> Hi
>
> Could I ask for a review of this simple patch which takes a tiny part
> from the original ticket JDK-8243326 [1]. The reason that I do not
> want a full backport is, the majority of the patch at jdk/jdk [2] is
> to clean up the volatile use and may be not very meaningful to 11u,
> furthermore the context (dependencies on atomic.hpp refactor) is too
> complicated to generate a clear backport (I tried, ~81 files need to be changed).
>
> The purpose of having this one-line change to 11u is, the two volatile
> variables in TaskQueueSuper: _bottom, _age and corresponding atomic
> operations upon, may cause severe cache contention inside GC with
> larger number of threads, i.e., specified by -XX:ParallelGCThreads=##,
> adding paddings (up to DEFAULT_CACHE_LINE_SIZE) in-between can reduce
> the possibility of false-sharing cache contention. I do not need the
> paddings before _bottom and after _age from the original patch [2],
> because the instances of TaskQueueSuper are usually (always) allocated
> in a set of queues, in which they are naturally separated. Please review, thanks.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248214
> Webrev: http://cr.openjdk.java.net/~qpzhang/8248214/webrev.01/
> Testing: tier1-2 pass with the patch, commercial benchmarks and small
> C++ test cases (to simulate the data struct and work-stealing
> algorithm atomics) validated the performance, no regression.
>
> By the way, I am going to request for 8u backport as well once 11u
> would have it.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8243326 Cleanup use of
> volatile in taskqueue code
> [2] https://hg.openjdk.java.net/jdk/jdk/rev/252a1602b4c6
>
> Regards
> Patrick

>
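
The contention described in the quoted request can be illustrated with a small C++ sketch. This is not the test case mentioned under "Testing", which is not shown in this thread. It is a rough stand-in in which one thread updates a counter playing the role of _bottom while another updates a counter playing the role of _age. The struct names, the hammer() helper, and the 64-byte line size are all assumptions made for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>

// Assumption: 64-byte cache lines; the real value is platform dependent.
constexpr std::size_t kLineBytes = 64;

// Hypothetical layout without padding: both counters are likely to
// end up on the same cache line.
struct Unpadded {
  std::atomic<unsigned> bottom{0};
  std::atomic<std::size_t> age{0};
};

// Hypothetical layout with padding: age starts kLineBytes after bottom.
struct Padded {
  std::atomic<unsigned> bottom{0};
  char pad[kLineBytes - sizeof(std::atomic<unsigned>)];
  std::atomic<std::size_t> age{0};
};

// Runs two threads, one per field, each doing `iterations` atomic
// increments. Returns the elapsed wall-clock time in milliseconds.
template <typename Fields>
long long hammer(Fields& f, unsigned iterations) {
  const auto start = std::chrono::steady_clock::now();
  std::thread owner([&f, iterations] {
    for (unsigned i = 0; i < iterations; ++i) {
      f.bottom.fetch_add(1, std::memory_order_relaxed);
    }
  });
  std::thread thief([&f, iterations] {
    for (unsigned i = 0; i < iterations; ++i) {
      f.age.fetch_add(1, std::memory_order_relaxed);
    }
  });
  owner.join();
  thief.join();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
      .count();
}
```

Calling hammer() on an Unpadded and then on a Padded instance with the same iteration count gives two timings to compare. Whether the padded layout is measurably faster depends on the machine, which matches the mixed results reported in this thread.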




