[aarch64-port-dev ] RFR: 8205694: AArch64: Add test to validate volatile load, store and CAS code generation

Andrew Dinn adinn at redhat.com
Thu Jun 28 10:54:26 UTC 2018


Hi Zhongwei,

Thanks very much for looking at this.

On 28/06/18 10:21, Zhongwei Yao wrote:

> Sorry for my separate comment, but I have a question about
> "storestore" barrier eliding in StoreCM node.
> 
> In http://cr.openjdk.java.net/~adinn/8205694/webrev.01/test/hotspot/jtreg/compiler/c2/aarch64/TestVolatiles.java.html
> line: 301, when "storestore" barrier is elided, is following sequence
> still ordered?:
>   stlrw
>   strb
> 
> To my knowledge, strb could be reordered in above sequence. But "stlrw;
> stlrb" will be ordered as following said.

This was discussed some time ago when the original change was made and I
believe it is correct.

G1 and CMS both insert a StoreCM node into the node graph to model at an
abstract level the card mark that should follow a StoreP/N  (volatile or
non-volatile). The other two standard GCs (Parallel, Serial) simply
insert a StoreB.

The purpose of the StoreCM is to allow non-TSO architectures to generate
a StoreStore barrier between the str generated for the StoreP/N and the
strb generated for the StoreCM i.e. ensure they are correctly ordered.
For Aarch64, that would mean planting something like

  1) normal   2) volatile  3) volatile
                 via stlr  via barriers
                           dmb ish
  str         stlr         str
  . . .       . . .        . . .
  dmb ishst   dmb ishst    dmb ishst
  . . .       . . .        . . .
  strb        strb         strb
                           dmb ish

where the ... might be replaced with other instructions generated in
accordance with the GC barrier. First, notice that in case 2 the
StoreStore barrier is always redundant because the StoreP/N has been
translated to stlr. This guarantees that the field write is visible
before the card mark. So, we can always omit the dmb ishst if we know we
are translating a volatile store.

However, the predicate for unnecessary_storestore also optimizes two
further cases.

For G1, the GC barrier plants a card mark membar_volatile between the
StoreP/N and the StoreCM. So, for this case if we generated a storestore
barrier we  would see this

  1) normal   2) volatile  3) volatile
                 via stlr  via barriers
                           dmb ish
  str         stlr         str
  . . .       . . .        . . .
  dmb ish     dmb ish      dmb ish
  . . .       . . .       . . .
  dmb ishst   dmb ishst    dmb ishst
  . . .       . . .        . . .
  strb        strb         strb
                           dmb ish

Clearly, the dmb ishst is not needed for G1.

The same thing applies when using CMS with +UseCondCardMark. The GC
barrier for this case introduces a card mark membar_volatile between the
str/stlr and strb as with G1. So, this makes the dmb ishst redundant in
all 3 cases.

CMS with -UseCondCardMark does not introduce a card mark
membar_volatile. So, in that case that we need to introduce the dmb
ishst in cases 1 and 3. If we have case 2 i.e. we know we have a
volatile store and are not relying on barriers it can be omitted. I
believe the checks in the predicate test for precisely these conditions.

> And I see some discrepancy in the comment of aarch64.ad (without
> your patch applied, line: 1483 to 1484), storeCM is translated to:
>   dmb ishst
>   strlb
> 
> But actually, it is translated to:
>   dmb ishst
>   strb

Yes, that comment is wrong. It should say strb. I am afraid I have
already pushed the fix. Perhaps we can sneak a patch to the comment into
a later fix.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander


More information about the hotspot-compiler-dev mailing list