RFR: 8232782: Shenandoah: streamline post-LRB CAS barrier (aarch64)

Roman Kennke rkennke at redhat.com
Fri Jun 26 21:00:34 UTC 2020


I believe that if you do what Andrew Haley suggested and use
MacroAssembler::cmpxchg(), it will emit a CAS instruction if the
platform supports it, or an ldxr/stxr loop if not. So either we simply
use that, or we come up with two different implementations and select
one or the other, like MacroAssembler::cmpxchg() does. I am not sure
there would be any advantage in the latter.
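
To make the "select one or the other" idea concrete, here is a minimal
sketch of an emitter that picks between a single CAS instruction and an
ldxr/stxr-style retry loop. It is a toy model, not HotSpot code:
ToyAssembler, has_lse and the register names are hypothetical, and the
instruction strings are only illustrative. As far as I know the real
MacroAssembler::cmpxchg() keys this choice on the UseLSE flag, but treat
that as an assumption here.

```cpp
#include <string>
#include <vector>

// Toy stand-in for an assembler; it records mnemonics instead of encoding them.
// All names here are hypothetical and exist only for illustration.
struct ToyAssembler {
  bool has_lse;                    // assumed platform feature flag
  std::vector<std::string> code;   // emitted "instructions"

  void emit(const std::string& insn) { code.push_back(insn); }

  // Emit a compare-and-swap sequence, choosing the form at code-generation time.
  void cmpxchg() {
    if (has_lse) {
      // Single-instruction CAS path.
      emit("mov   result, expected");
      emit("casal result, new_val, [addr]");
      emit("cmp   result, expected");
    } else {
      // Load-exclusive / store-exclusive retry loop.
      emit("retry:");
      emit("ldaxr result, [addr]");
      emit("cmp   result, expected");
      emit("b.ne  done");
      emit("stlxr status, new_val, [addr]");
      emit("cbnz  status, retry");
      emit("done:");
    }
  }
};
```

With a single entry point like this, callers such as the Shenandoah
barrier would not need their own copy of the selection logic.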

Roman

On Fri, 2020-06-26 at 20:21 +0000, Nilsen, Kelvin wrote:
> Is there consensus that we should use CAS instruction instead of
> ldxr/stxr?
> 
> Presumably, there are some platforms where ldxr/stxr performs better
> than CAS, or at least there is the potential that such would exist.
> 
> Perhaps the JIT and run-time should adjust their behavior depending
> on the host platform.
> 
> Perhaps the whole issue of which synchronization primitives to use
> should be addressed in a different ticket.
> 
> I am willing to rework this patch.  Just need some clear guidance as
> to which direction to move it.
> 
> Thanks.
> 
> 
> On 6/24/20, 8:28 AM, "Roman Kennke" <rkennke at redhat.com> wrote:
> 
>       On Wed, 2020-06-24 at 16:22 +0100, Andrew Haley wrote:
>     > On 24/06/2020 15:48, Roman Kennke wrote:
>     > > On Wed, 2020-06-24 at 15:29 +0100, Andrew Haley wrote:
>     > > > On 24/06/2020 14:54, Nilsen, Kelvin wrote:
>     > > > > Is this ok to merge?
>     > > >
>     > > > One thing:
>     > > >
>     > > > Some CPUs, in particular those based on Neoverse N1, can perform
>     > > > very badly when using ldxr/stxr. For that reason, all code doing CAS
>     > > >
>     > > > I can't see any reason why your code needs to use ldxr/stxr. Is
>     > > > there any?
>     > >
>     > > As far as I know, Shenandoah's AArch64-CAS-implementation always
>     > > did it that way (don't remember why). If regular CAS is generally
>     > > better, then we should go for it.
>     >
>     > Does this algorithm need a full barrier even when CAS fails?
> 
>     We need to do extra work *only* when CAS fails. We need to catch false
>     negatives -- when the compare-value is to-space (that's guaranteed) and
>     the value in memory is a from-space copy of the same object.
> 
>     Roman
> 
> 
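
The false-negative case described in the quoted message above can be
sketched in portable C++ as follows. This is a simplified model under
stated assumptions, not Shenandoah's barrier: Obj, forwardee, resolve()
and cas_oop() are hypothetical names, a from-space copy is modelled as an
object with a non-null forwardee, and std::atomic stands in for whatever
CAS primitive the platform provides.

```cpp
#include <atomic>

// Toy object model: a from-space copy points at its to-space copy.
struct Obj {
  Obj* forwardee = nullptr;  // non-null only for from-space copies
};

// Hypothetical helper: map any reference to its to-space copy.
inline Obj* resolve(Obj* o) {
  return (o != nullptr && o->forwardee != nullptr) ? o->forwardee : o;
}

// CAS on a reference field. `expected` is assumed to be a to-space reference.
// Returns true if the field logically held `expected` and now holds `new_val`.
inline bool cas_oop(std::atomic<Obj*>& field, Obj* expected, Obj* new_val) {
  Obj* witness = expected;
  // Fast path: a successful first CAS needs no extra work at all.
  while (!field.compare_exchange_strong(witness, new_val)) {
    // On failure, `witness` holds the value actually found in memory.
    if (resolve(witness) != expected) {
      return false;  // genuine failure: a different object is stored
    }
    // False negative: memory holds a from-space copy of `expected`
    // (or was concurrently updated to `expected` itself).
    // Retry using the value we actually saw as the compare value.
  }
  return true;
}
```

The extra work runs only after a failed CAS, so the common case stays a
single compare-and-swap.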


