[aarch64-port-dev ] Problem in MacroAssembler::needs_explicit_null_check
Andrew Dinn
adinn at redhat.com
Wed Jul 26 16:15:08 UTC 2017
Having just talked to Andrew Haley about this I'll add that I'm
suggesting reworking this code (see bottom of original note for a
proposed alternative) not to fix any bug but just to make it clearer
what is being checked and why.
regards,
Andrew Dinn
-----------
On 26/07/17 15:40, Andrew Dinn wrote:
> The Problem
> -----------
>
> This is a follow-up of my review of the Shenandoah patch addressing a
> curiosity that was found in the fault address values passed into
> MacroAssembler::needs_explicit_null_check from the SEGV handler. It
> turns out that when we dereference address -8 (aka 0xffff_ffff_ffff_fff8
> aka Brooks pointer load via null) the fault address passed into the SEGV
> handler in siginfo->si_addr is 0x00ff_ffff_ffff_fff8. The top 8 bits of
> the address are zeroed.
>
> Why only 56 bits of address? Indeed, why are there not 48 bits of
> address? The relevant info can be found at
>
>
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/BABBEFAE.html
>
> To summarize:
>
> The virtual address space requires use of all 64 bits but it is in two
> parts with a hole in the middle i.e. addresses in the range
> 0x1_0000_0000_0000 to 0xFFFE_FFFF_FFFF_FFFF inclusive are not mappable.
> The top and bottom mappable regions each allow for 48 bits worth of
> vmem. So, the total available space adds up to 2 * 48 bits worth of
> space. However these two regions can only be mapped onto an underlying
> physical address space of 48 bits. So, only 48 bits worth of vmem can be
> in use at any given time.
>
> There are two corresponding sets of vmem to pmem translation tables, one
> for use in app and kernel land (TTBR0) the other kernel land only (TTBR1).
>
> TTRB0 maps from the low 48 bit space i.e. top 16 bits of vmem
> addresses are all 0, TTBR0 mappings are active in all execution levels
> (0-3).
>
> TTBR1 maps from the high 48-bit space where the top 16 bits are all 1.
> TTBR1 mappings are only active at execution level 3. So, accessing an
> address in this range from user land results in an automatic SEGV.
>
> The gap between these two spaces (i.e. where the top 16 bits are mixed)
> cannot be mapped via either translation table.
>
> The crucial comment on the page linked above is
>
> "You can enable VA tagging to exclude the top eight bits from the check"
>
> In this case the top 8 bits of an address used by a program /don't/
> actually need to be all 0s or all 1s for addressing still to work. Only
> the top 56 bits are assumed to be valid.
>
> This latter option is documented here
>
>
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html
>
> It is enabled by writing a relevant bit in the translation table
> management register.
>
>
> What does this mean for us?
> ---------------------------
>
> It is probably the case that Linux on AArch64 is enabling VA tagging but
> it may not be
>
> If VA tagging is enabled then it may be that in consequence the hardware
> is only passing the top 56 bits to the trap handler and, thence, to the
> signal handler.
>
> Alternatively, it may be that the hardware passes the full address
> whatever the hardware setting and the OS trap handler is doing the
> masking off because it has noticed that VA tagging has been enabled.
>
> A third possibility is that VA tagging may not be enabled but the trap
> handler may be doing it irrespective of the setting i.e. we may never
> see 0xffff.
>
> I suspect this last case is unlikely. Anyway the upshot is we probably
> cannot rely on VA tagging being on or off. Nor, probably, can we rely on
> the top 8 bits being zeroed if it is on or being left unchanged if it is
> off.
>
> Whichever of these is the case we do know that we are not using tag bits
> in Java. So, we should only see either 0x0000, 0x00ff or 0xffff in the
> top 16 bits of any offset coming into method needs_explicit_null_check.
>
> The first case indicates an address in the valid vmem range and needs
> comparing to the heap base to determine whether to compare against the
> null check table.
>
> The second and third cases can only arise in two ways -- either
> because of a Brooks pointer load via a null oop or because something
> tried to load a garbage address in the kernel only range.
>
> In either of the last two cases we don't need to adjust the offset to do
> a heap check. This cannot be a heap address.
>
> So, when using Shenandoah on AArch64
>
> For the case where (top 16 bits == 0xffff) we just need to test
> whether (offset == -8).
>
> For the case where (top 16 bits == 0x00ff) we can restore the top 8
> bits to 0xff, on the assumption that these were the bits in the address
> that caused the fault i.e. offset = ((offset << 8) >> 8)] and then test
> whether (offset == -8).
>
>
> What does this mean for the current code?
> -----------------------------------------
>
> Here is the current Shenandoah repo code
>
> bool MacroAssembler::needs_explicit_null_check(intptr_t offset) {
> // Exception handler checks the nmethod's implicit null checks table
> // only when this method returns false.
> #ifdef AARCH64
> // AArch64 uses 48-bit addresses
> const uintptr_t address_bits = (uintptr_t)0xfffffffffffful;
> #else
> const uintptr_t address_bits = ~(uintptr_t)0;
> #endif
> #ifdef _LP64
> if (UseCompressedOops && Universe::narrow_oop_base() != NULL) {
> assert (Universe::heap() != NULL, "java heap should be initialized");
> // The first page after heap_base is unmapped and
> // the 'offset' is equal to [heap_base + offset] for
> // narrow oop implicit null checks.
> uintptr_t base = (uintptr_t)Universe::narrow_oop_base();
> int adj = MIN2(0, UseShenandoahGC ? BrooksPointer::byte_offset() : 0);
> if ((uintptr_t)((offset - adj) & address_bits) >= base) {
> // Normalize offset for the next check.
> offset = (intptr_t)(pointer_delta((void*)offset, (void*)base, 1));
> }
> }
> #endif
>
> if (UseShenandoahGC) {
> if ((offset & address_bits) == (BrooksPointer::byte_offset() &
> address_bits)) {
> return false;
> }
> }
> return offset < 0 || os::vm_page_size() <= offset;
> }
>
> To fix this I think we need to special case AArch64 processing separate
> from non-AArch64 processing. In the former case the Shenandoah checks
> can be done under #ifdef AARCH64 before the #ifdef LP64 block. This
> makes the check currently inserted after the #ifdef LP64 block redundant
> on AArch64 and hence require guarding with #ifndef AARCH64. Here is my
> suggested alternative:
>
> bool MacroAssembler::needs_explicit_null_check(intptr_t offset) {
> // Exception handler checks the nmethod's implicit null checks table
> // only when this method returns false.
> #ifdef AARCH64
> // AArch64 addresses should only have 0x0000 or 0xffff
> // in the top 16 bits but in a fault address the latter
> // may be reset to 0x00ff. Non-zero bits are only legitimate
> // when Shenandoah loads a Brooks pointer via a null oop
> // i.e. original address should be - BrooksPointer::byte_offset().
> long loffset = (long)offset;
> long hi = loffset >> 48;
> int adj = BrooksPointer::byte_offset();
> if (hi != 0) {
> if (hi == 0x00ffL || hi == 0xffffL) {
> // the top 8 bits may have been cleared
> // if so make sure they are set again
> loffset = (loffset << 8) >> 8);
> // if this is not 0 - Brooks pointer offset
> // we need an explicit null check
> return ((loffset + adj) != 0);
> }
> }
> #endif
> #ifdef _LP64
> if (UseCompressedOops && Universe::narrow_oop_base() != NULL) {
> assert (Universe::heap() != NULL, "java heap should be initialized");
> // The first page after heap_base is unmapped and
> // the 'offset' is equal to [heap_base + offset] for
> // narrow oop implicit null checks.
> uintptr_t base = (uintptr_t)Universe::narrow_oop_base();
> int adj = MIN2(0, UseShenandoahGC ? BrooksPointer::byte_offset() : 0);
> if (((uintptr_t)(offset - adj)) >= base) {
> // Normalize offset for the next check.
> offset = (intptr_t)(pointer_delta((void*)offset, (void*)base, 1));
> }
> }
> #endif
>
> #ifndef AARCH64
> if (UseShenandoahGC) {
> if (offset == BrooksPointer::byte_offset()) {
> return false;
> }
> }
> #endif
> return offset < 0 || os::vm_page_size() <= offset;
> }
>
> Comments and or corrections welcome.
>
> regards,
>
>
> Andrew Dinn
> -----------
More information about the aarch64-port-dev
mailing list