[aarch64-port-dev ] RFR: AARCH64: Changes to HotSpot shared code

Thu Nov 13 09:31:49 UTC 2014

On 12/11/14 20:23, Dean Long wrote:
> On 11/11/2014 11:02 AM, Andrew Haley wrote:
>> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch
>>
>> Everything except cpu/ and os_cpu/.
>>
>> Most of this is obvious and trivial, with a few exceptions.
>>
>> In memory/metaspace.cpp, we allocated the memory for metadata in a
>> different way.  This is because we want to be able to decode and
>> encode compressed metadata pointers with a single instruction, and we
>> can always do that iff the base address is of a particular form.
>>
>> In opto/, we have made some changes in order to be able to use AArch64
>> store release instructions for volatile field stores.  These don't
>> require leading or trailing barriers.  We have tried several times to
>> do this without changing shared code, but it is impossible with the
>> current back-end interface.
> Is this something ppc64 can also take advantage of?  I hope Vladimir can
> suggest a more flexible way to do this, perhaps with a runtime flag.

Perhaps so, but as far as I'm aware AArch64 is the only CPU with
exactly these semantics.  From my point of view, it would be ideal if
we simply emitted volatile store and volatile load as nodes and let
the back end handle them.  But if we do that we lose the opportunity
to coalesce barriers in C2 optimization.  Hmmm....  :-)
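
To make the distinction concrete, here is a minimal sketch in portable
C++ rather than HotSpot code.  All the names are hypothetical.  The
first publisher brackets a plain store with explicit fences, which is
roughly the shape C2 gives a volatile store today (MemBarRelease, store,
MemBarVolatile).  The second makes the ordering a property of the store
itself, which is what a store release instruction provides.  I am
assuming a typical AArch64 compiler here, which as far as I know emits
stlr for the second form.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names throughout; this is an illustration, not HotSpot code.
static int payload = 0;                 // ordinary, non-atomic data
static std::atomic<bool> ready{false};  // stands in for a Java volatile field

// Style 1: a plain store bracketed by explicit barriers.
void publish_with_fences(int v) {
  payload = v;
  std::atomic_thread_fence(std::memory_order_release);  // leading barrier
  ready.store(true, std::memory_order_relaxed);         // the store itself
  std::atomic_thread_fence(std::memory_order_seq_cst);  // trailing barrier
}

// Style 2: the ordering is carried by the store, with no separate barriers.
void publish_with_ordered_store(int v) {
  payload = v;
  ready.store(true, std::memory_order_seq_cst);
}

// Reader: spin until the flag is visible, then read the payload.
int consume() {
  while (!ready.load(std::memory_order_seq_cst)) {
    std::this_thread::yield();
  }
  return payload;
}

// Run one writer and one reader and return the value the reader saw.
int run(bool use_fences) {
  payload = 0;
  ready.store(false);
  int seen = -1;
  std::thread reader([&seen] { seen = consume(); });
  if (use_fences) {
    publish_with_fences(42);
  } else {
    publish_with_ordered_store(42);
  }
  reader.join();
  return seen;
}
```

Both styles give the reader the same guarantee.  The difference is that
the fences in the first are separate operations that an optimizer can
see and merge with their neighbours, while in the second there is
nothing left to merge.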

>> In several places a release store is used where the AArch64 memory
>> model makes it unnecessary.  From earlier emails on this list we
>> discovered that the only architecture which requires this release
>> store is IA64, and OpenJDK does not support it anyway.  We should
>> perhaps look at re-engineering the way that memory barriers and memory
>> accesses are handled in HotSpot with a view to pushing all these
>> architecture-dependent assumptions out to the back ends.
> I agree.  More comments below.
>> Andrew.
> c1_Canonicalizer.cpp
>      Can this be handled in the back-end?  I imagine other platforms, 
> such as x86, have similar limitations.

It certainly could be.  Maybe pd_valid_shift_count() ?  But I'm striving
not to touch any other ports.

> c1_LIR.cpp
>      It looks like you need a temp for convert because your backend
> is checking the FPSR.
>      What happens if you ignore the FPSR, do you get a wrong result?

I've looked for a while, and I'm sorry but I don't understand which
hunk this refers to.

> c1_LinearScan.cpp
>      I'm not familiar with what the changed code is doing.  Can you 
> explain why it applies to x86 and aarch64?

It certainly was needed at the time.  I'll investigate to see if this
is still needed.

> c1_Runtime1.cpp
>      This will break our closed port that uses NOP instructions for
>      patching.

Ah, interesting.  I spent quite a lot of time kicking around ideas for
C1 patching, but (to my surprise) deoptimizing instead didn't seem to
have a significant adverse effect.

>      How about moving your deopt-instead-of-patch support
>      into Runtime1::patch_code() and enable it with a read-only
>      platform-specific developer runtime flag
>      (see INTPRESSURE for example)?

Okay.  I'll have a look at that.

> compiledIC.hpp
>      You should be able to use set_inst_mark()/cbuf.insts_mark() to set 
> and retrieve the mark address.

Okay.

> arguments.cpp
>      I wish there was a way to fix ReservedCodeCacheSize in the back-end.

Indeed.

Thanks,
Andrew.