[aarch64-port-dev ] RFR: AARCH64: Changes to HotSpot shared code

Fri Nov 14 10:07:12 UTC 2014

On 14/11/14 06:15, Dean Long wrote:
> On 11/13/2014 1:31 AM, Andrew Haley wrote:
>> On 12/11/14 20:23, Dean Long wrote:
>>> On 11/11/2014 11:02 AM, Andrew Haley wrote:
>>>> http://cr.openjdk.java.net/~aph/aarch64-JDK-8064611/hotspot.patch
>>>>
>>>> Everything except cpu/ and os_cpu/.
>>>>
>>>> Most of this is obvious and trivial, with a few exceptions.
>>>>
>>>> In memory/metaspace.cpp, we allocated the memory for metadata in a
>>>> different way.  This is because we want to be able to decode and
>>>> encode compressed metadata pointers with a single instruction, and we
>>>> can always do that iff the base address is of a particular form.
>>>>
>>>> In opto/, we have made some changes in order to be able to use AArch64
>>>> store release instructions for volatile field stores.  These don't
>>>> require leading or trailing barriers.  We have tried several times to
>>>> do this without changing shared code, but it is impossible with the
>>>> current back-end interface.
>>> Is this something ppc64 can also take advantage of?  I hope Vladimir can
>>> suggest a more flexible way to do this, perhaps with a runtime flag.
>> Perhaps so, but as far as I'm aware AArch64 is the only CPU with
>> exactly these semantics.  From my point of view, it would be ideal if
>> we simply emitted volatile store and volatile load as nodes and let
>> the back end handle them.  But if we do that we lose the opportunity
>> to coalesce barriers in C2 optimization.  Hmmm....  :-)
>>
>>>> In several places a release store is used where the AArch64 memory
>>>> model makes it unnecessary.  From earlier emails on this list we
>>>> discovered that the only architecture which requires this release
>>>> store is IA64, and OpenJDK does not support it anyway.  We should
>>>> perhaps look at re-engineering the way that memory barriers and memory
>>>> accesses are handled in HotSpot with a view to pushing all these
>>>> architecture-dependent assumptions out to the back ends.
>>> I agree.  More comments below.
>>>> Andrew.
>>> c1_Canonicalizer.cpp
>>>       Can this be handled in the back-end?  I imagine other platforms,
>>> such as x86, have similar limitations.
>> It certainly could be.  Maybe pd_valid_shift_count() ?  But I'm striving
>> not to touch any other ports.
>
> What I'm actually wondering is what happens if you remove the AARCH64
> log2_scale check altogether.  As far as I can tell, it isn't needed,
> because in do_UnsafeGetRaw, the scale is only used directly in the
> LIR_Address for X86 and ARM.  For other platforms, we do:
> 
>        LIR_Opr tmp = new_pointer_register();
>        __ shift_left(index_op, log2_scale, tmp);
>        addr = new LIR_Address(base_op, tmp, dst_type);
> 
> so you don't have to worry about a mis-scaled load on AARCH64.

Aha!  Okay, that may be a more recent change.  I don't think I would
have made that change if it wasn't necessary at the time, but never
mind, one hunk is gone, thanks.
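
For illustration, here is a minimal C++ sketch of what the generic path
quoted above computes: the index is shifted into a temporary first, and
the address is then formed from base plus that temporary, so the scale
never has to fit an addressing mode.  The name scaled_address is
hypothetical and is not HotSpot code.

```cpp
#include <cstdint>

// Hypothetical helper (not part of HotSpot): mirrors the generic LIR
// sequence, where the scaled index goes into a temporary register and
// the address is then just base + temporary.
std::uintptr_t scaled_address(std::uintptr_t base,
                              std::uintptr_t index,
                              unsigned log2_scale) {
    // Corresponds to: __ shift_left(index_op, log2_scale, tmp);
    std::uintptr_t tmp = index << log2_scale;
    // Corresponds to: new LIR_Address(base_op, tmp, dst_type);
    return base + tmp;
}
```

Because the shift is a separate operation, any log2_scale smaller than
the pointer width works, regardless of the size of the access.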

>>> c1_LIR.cpp
>>>       It looks like you need a temp for convert because your backend
>>> is checking the FPSR.
>>>       What happens if you ignore the FPSR, do you get a wrong result?
>> I've looked for a while, and I'm sorry but I don't understand which
>> hunk this refers to.
> 
> This one, for lir_convert:
> 
> #if defined(PPC) || defined(TARGET_ARCH_aarch64)
>        if (opConvert->_tmp1->is_valid()) do_temp(opConvert->_tmp1);
>        if (opConvert->_tmp2->is_valid()) do_temp(opConvert->_tmp2);
> #endif
> 
> I'm wondering if, for example, d2l in the back-end needs to check
> FPSCR.IOC?  If you get the correct result even if FPSCR.IOC is set,
> then you should be able to simply ignore FPSCR.IOC.

Yes, you're quite right.
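
For reference, a minimal C++ sketch of the result Java requires from
d2l: NaN becomes 0 and out-of-range values saturate.  As far as I know,
the AArch64 FCVTZS instruction already produces these results by
itself, which is why the invalid-operation flag can be ignored.  The
name java_d2l is hypothetical and is not HotSpot code.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of HotSpot): the value Java's d2l
// conversion must produce for any double input.
std::int64_t java_d2l(double d) {
    // NaN converts to zero.
    if (std::isnan(d)) return 0;
    // 2^63 is exactly representable as a double; anything at or above
    // it (including +infinity) saturates to Long.MAX_VALUE.
    if (d >= 9223372036854775808.0)
        return std::numeric_limits<std::int64_t>::max();
    // Anything at or below -2^63 (including -infinity) saturates to
    // Long.MIN_VALUE.
    if (d <= -9223372036854775808.0)
        return std::numeric_limits<std::int64_t>::min();
    // In range: truncate toward zero.
    return static_cast<std::int64_t>(d);
}
```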

Thanks again,
Andrew.