RFR: 8248238: Implementation of JEP: Windows AArch64 Support [v5]

Monica Beckwith mbeckwit at openjdk.java.net
Wed Sep 23 13:23:43 UTC 2020


On Mon, 21 Sep 2020 14:47:15 GMT, Bernhard Urban-Forster <burban at openjdk.org> wrote:

>>> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [build-dev](mailto:build-dev at openjdk.java.net):_
>>> 
>>> On 18/09/2020 11:14, Monica Beckwith wrote:
>>> 
>>> > This is a continuation of https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-August/009566.html
>>> 
>>> The diffs in assembler_aarch64.cpp are mostly spurious. Please try this.
>> 
>> 
>> Thank you, Andrew. Is the goal to reduce the patch diff in `assembler_aarch64.cpp`? If so, we need to get rid of the
>> retry in your patch to avoid additional calls to `random`, e.g. something like this (the diff for the generated part
>> would indeed look nicer with that: https://gist.github.com/lewurm/2e7b0e00447696c75e00febb83034ba1 ):
>> --- a/src/hotspot/cpu/aarch64/aarch64-asmtest.py
>> +++ b/src/hotspot/cpu/aarch64/aarch64-asmtest.py
>> @@ -13,6 +13,8 @@ class Register(Operand):
>> 
>>      def generate(self):
>>          self.number = random.randint(0, 30)
>> +        if self.number == 18:
>> +            self.number = 17
>>          return self
>> 
>>      def astr(self, prefix):
>> @@ -37,6 +39,8 @@ class GeneralRegisterOrZr(Register):
>> 
>>      def generate(self):
>>          self.number = random.randint(0, 31)
>> +        if self.number == 18:
>> +            self.number = 16
>>          return self
>> 
>>      def astr(self, prefix = ""):
>> @@ -54,6 +58,8 @@ class GeneralRegisterOrZr(Register):
>>  class GeneralRegisterOrSp(Register):
>>      def generate(self):
>>          self.number = random.randint(0, 31)
>> +        if self.number == 18:
>> +            self.number = 15
>>          return self
>> 
>>      def astr(self, prefix = ""):
>> @@ -1331,7 +1337,7 @@ generate(SpecialCases, [["ccmn",   "__ ccmn(zr, zr, 3u, Assembler::LE);",
>>                          ["st1w",   "__ sve_st1w(z0, __ S, p1, Address(r0, 7));",         "st1w\t{z0.s}, p1, [x0, #7, MUL VL]"],
>>                          ["st1b",   "__ sve_st1b(z0, __ B, p2, Address(sp, r1));",        "st1b\t{z0.b}, p2, [sp, x1]"],
>>                          ["st1h",   "__ sve_st1h(z0, __ H, p3, Address(sp, r8));",        "st1h\t{z0.h}, p3, [sp, x8, LSL #1]"],
>> -                        ["st1d",   "__ sve_st1d(z0, __ D, p4, Address(r0, r18));",       "st1d\t{z0.d}, p4, [x0, x18, LSL #3]"],
>> +                        ["st1d",   "__ sve_st1d(z0, __ D, p4, Address(r0, r17));",       "st1d\t{z0.d}, p4, [x0, x17,
>> LSL #3]"],
>>                          ["ldr",    "__ sve_ldr(z0, Address(sp));",                       "ldr\tz0, [sp]"],
>>                          ["ldr",    "__ sve_ldr(z31, Address(sp, -256));",                "ldr\tz31, [sp, #-256, MUL VL]"],
>>                          ["str",    "__ sve_str(z8, Address(r8, 255));",                  "str\tz8, [x8, #255, MUL VL]"],
>
>> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [build-dev](mailto:build-dev at openjdk.java.net):_
>> 
>> On 21/09/2020 09:18, Bernhard Urban-Forster wrote:
>> 
>> > Thank you, Andrew. Is the goal to reduce the patch diff in `assembler_aarch64.cpp`? If so, we need to get rid of the
>> > retry in your patch to avoid additional calls to `random`, e.g. something like this (the diff for the generated part
>> > would indeed look nicer with that: https://gist.github.com/lewurm/2e7b0e00447696c75e00febb83034ba1 ):
>> > [...]
>> 
>> Yes, better. Thanks.
> 
> Great, I've updated the PR. Thank you!

Thanks, Andrew, for catching that. I have made the necessary changes.
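
For anyone skimming the asmtest diff above: the reason for mapping 18 to a fixed register, rather than re-rolling, is that the generator draws from a seeded pseudo-random stream, so every extra call to `random` shifts all the values that follow and would churn the whole regenerated file. A tiny sketch of that effect (hypothetical C++, not the Python generator itself):

```c++
#include <cstdio>
#include <random>

int main() {
    // Two generators seeded identically stay in lock-step...
    std::mt19937 a(42), b(42);
    std::uniform_int_distribution<int> reg(0, 31);

    (void)reg(b);  // ...until one of them makes an extra draw (the analogue
                   // of a "retry" whenever register 18 comes up).

    // From here on b's register choices drift out of step with a's, which is
    // why a retry would rewrite far more of the generated
    // assembler_aarch64.cpp than a fixed remap does.
    for (int i = 0; i < 5; ++i)
        std::printf("a=%2d  b=%2d\n", reg(a), reg(b));
    return 0;
}
```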

-Monica

> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [build-dev](mailto:build-dev at openjdk.java.net):_
> 
> On 18/09/2020 11:14, Monica Beckwith wrote:
> 
> > This is a continuation of
> > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-August/009566.html
> > Changes since then:
> > * We've improved the write barrier as suggested by Andrew [1]
> 
> It's still wrong, I'm afraid. This is not a full barrier:
> 
> +#define FULL_MEM_BARRIER atomic_thread_fence(std::memory_order_acq_rel);
> 
> it is only StoreStore|LoadStore|LoadLoad, but you need StoreLoad as
> well. It might well be that you get the same DMB ISH instruction, but
> unless you use a StoreLoad barrier here it's theoretically possible
> for a compiler to reorder accesses so that a processor sees its own
> stores before other processors do. (And it's confusing for the reader
> too.)
> 
> Use:
> 
> +#define FULL_MEM_BARRIER atomic_thread_fence(std::memory_order_seq_cst);
> 
> See here:
> 
> https://en.cppreference.com/w/cpp/atomic/memory_order
> 
> memory_order_seq_cst "...plus a single total order exists in which all
> threads observe all modifications in the same order (see
> Sequentially-consistent ordering below)"
> 
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
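
For reference, here is a minimal, self-contained illustration of the distinction Andrew describes. Only the `FULL_MEM_BARRIER` definition comes from the suggested change; the Dekker-style flags and functions around it are made up for the example:

```c++
#include <atomic>
#include <cassert>
#include <thread>

// A sequentially consistent fence orders StoreLoad in addition to
// StoreStore|LoadStore|LoadLoad. With memory_order_acq_rel the StoreLoad
// part is not guaranteed, even if both happen to emit DMB ISH on AArch64.
#define FULL_MEM_BARRIER std::atomic_thread_fence(std::memory_order_seq_cst);

std::atomic<int> flag0{0}, flag1{0};
std::atomic<int> in_critical{0};

void thread0() {
    flag0.store(1, std::memory_order_relaxed);
    FULL_MEM_BARRIER  // StoreLoad: the load of flag1 cannot move ahead of the store to flag0
    if (flag1.load(std::memory_order_relaxed) == 0)
        in_critical.fetch_add(1, std::memory_order_relaxed);
}

void thread1() {
    flag1.store(1, std::memory_order_relaxed);
    FULL_MEM_BARRIER
    if (flag0.load(std::memory_order_relaxed) == 0)
        in_critical.fetch_add(1, std::memory_order_relaxed);
}

int main() {
    std::thread t0(thread0), t1(thread1);
    t0.join();
    t1.join();
    // With seq_cst fences at most one thread can still observe the other's
    // flag as 0; with only an acq_rel fence both could, in principle, enter
    // the critical section together.
    assert(in_critical.load() <= 1);
    return 0;
}
```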

-------------

PR: https://git.openjdk.java.net/jdk/pull/212

