RFR: 8232782: Shenandoah: streamline post-LRB CAS barrier (aarch64)

Fri Jun 26 20:21:10 UTC 2020

Is there consensus that we should use the CAS instruction instead of ldxr/stxr?

Presumably there are some platforms where ldxr/stxr performs better than CAS, or at least such platforms could exist.

Perhaps the JIT and run-time should adjust their behavior depending on the host platform.

Perhaps the whole issue of which synchronization primitives to use should be addressed in a different ticket.

I am willing to rework this patch.  I just need clear guidance on which direction to take it.

Thanks.

On 6/24/20, 8:28 AM, "Roman Kennke" <rkennke at redhat.com> wrote:

    On Wed, 2020-06-24 at 16:22 +0100, Andrew Haley wrote:
    > On 24/06/2020 15:48, Roman Kennke wrote:
    > > On Wed, 2020-06-24 at 15:29 +0100, Andrew Haley wrote:
    > > > On 24/06/2020 14:54, Nilsen, Kelvin wrote:
    > > > > Is this ok to merge?
    > > >
    > > > One thing:
    > > >
    > > > Some CPUs, in particular those based on Neoverse N1, can perform very
    > > > badly when using ldxr/stxr. For that reason, all code doing CAS
    > > >
    > > > I can't see any reason why your code needs to use ldxr/stxr. Is there
    > > > any?
    > >
    > > As far as I know, Shenandoah's AArch64-CAS-implementation always did it
    > > that way (don't remember why). If regular CAS is generally better, then
    > > we should go for it.
    >
    > Does this algorithm need a full barrier even when CAS fails?

    We need to do extra work *only* when CAS fails. We need to catch false
    negatives -- when the compare-value is to-space (that's guaranteed) and
    the value in memory is the from-space copy of the same object.

    Roman
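
For reference while deciding the direction, here is a minimal sketch of the false-negative handling Roman describes, written against std::atomic rather than any particular instruction sequence. It is not the patch and not HotSpot code: Obj, forwardee, resolve() and cas_ref() are hypothetical names made up for illustration, and the object model (a to-space copy forwards to itself) is an assumption. The point is only that the extra work sits entirely on the failure path, regardless of whether the underlying CAS is a single instruction or an ldxr/stxr loop.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object model (an assumption for this sketch): every object
// carries a forwarding pointer. A to-space copy forwards to itself; a
// from-space copy forwards to its to-space copy.
struct Obj {
    Obj* forwardee;
    int  payload;
};

// Hypothetical helper: map any reference to its to-space copy.
inline Obj* resolve(Obj* o) {
    return o == nullptr ? nullptr : o->forwardee;
}

// CAS on a reference field that tolerates false negatives.
// Precondition taken from the thread: 'expected' is a to-space reference.
inline bool cas_ref(std::atomic<Obj*>& field, Obj* expected, Obj* new_val) {
    assert(resolve(expected) == expected);

    Obj* witness = expected;
    // Fast path: the first iteration is a plain CAS. Nothing extra is done
    // when it succeeds.
    while (!field.compare_exchange_strong(witness, new_val)) {
        // On failure, 'witness' holds the value actually found in memory.
        if (resolve(witness) != expected) {
            return false;  // genuine failure: a different object is stored
        }
        // False negative: memory holds another copy of the same object
        // (the from-space copy, or the to-space copy if another thread has
        // updated the field in the meantime). Retry with the witnessed
        // value as the compare value.
    }
    return true;
}
```

With a field that still holds the from-space pointer, cas_ref(field, to_space_obj, new_obj) succeeds on the second attempt, while a field holding an unrelated object makes it return false after a single failed CAS.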