RFR: JDK-8213199: GC abstraction for Assembler::needs_explicit_null_check()
Andrew Dinn
adinn at redhat.com
Mon Nov 5 10:59:07 UTC 2018
On 02/11/18 22:31, Roman Kennke wrote:
> I had a discussion with Andrew Dinn (who actually wrote that code back
> then) and will copy his explanation here, I think it covers the situation:
Err, ... just to be clear ... I don't believe I did write that code (the
relevant check-in was by shade). I might, perhaps, have been consulted
when it was written and I think my explanation (below) of what it does
is correct. Perhaps Aleksey could also comment?
> ""
> Let's just recap without considering Shenandoah:
>
> MacroAssembler::needs_explicit_null_check is called from several places,
> not just the SEGV handler.
>
> In those other cases offset is a putative field offset for an object.
>
> With uncompressed pointers the method has to decide whether this offset
> will lie within the read-protected page at address 0 i.e. 0 <= offset <
> os::vm_page_size(). If that is true then a null check can be omitted for
> a read or write operation. If not an explicit check is needed.
>
> With compressed pointers when Universe::narrow_oop_base() != NULL the
> test also checks whether this offset may lie within the read-protected
> page at the heap base only -- oh -- that's the same test! Of course,
> we'll see that this is different when the SEGV handler makes the call.
>
> n.b. I think this case is included because there is the possibility that
> a null pointer can be represented by the narrow oop base address.
> However, I think in reality special case checks in all encode decode
> operations ensure that a null oop is always represented as zero (for
> AArch64 at least). I'm not absolutely sure of that though.
>
> Now, when called from the SEGV handler offset is actually the offending
> read/write address that caused the SEGV. In that case a null pointer
> will only have been dereferenced if the address lies in the protected
> zero page or in the protected heap base page. The first if clause
> handles the compressed pointer case and reduces it to the uncompressed
> pointer case by subtracting base when offset >= base.
>
> So, either way the offending address is potentially a null pointer
> dereference under exactly the same conditions as for those other calls.
>
> Next, how does Shenandoah modify this? Well, it looks like that depends
> on how this method is called. If it can be invoked from any of those
> other places to check whether a Brooks pointer read or write is ok to
> validate with an implicit null check then the offset may be passed as -8
> i.e. 0xfffffffffffffff8. However, that's actually not enough.
>
> When invoked from the handler because of an access at of (0 + offset)
> then offset may legitimately be 0x00fffffffffffff8 or lie between 0 and
> os::vm_page_size().
>
> When invoked form the handler because of a dereference of
> (narrow_oop_base + offset) then the subtract i the if clause means that
> offset may legitimately be 0xfffffffffffffff8 or lie between 0 and
> os::vm_page_size().
>
> So, the masking in the Shenandoah case is to reduce those two special
> cases into one.
> ""
>
> Roman
>
>
>
>> Hi Roman,
>>
>> On 2018-11-01 17:58, Roman Kennke wrote:
>>> Hi Erik,
>>>
>>>> Would you mind explaining how you need to override this and why? I'm
>>>> afraid I did not quite get it from your description in the RFC (which is
>>>> also why I didn't come up with a solution either).
>>>>
>>>> Am I correct that you need to return false if the passed in offset is
>>>> the cell header offset -8, or is there anything else at all required to
>>>> that logic?
>>>
>>> No, it's really just that. Plus take care of it in the case of combined
>>> narrow-oops-offset plus -8.
>>
>>>> You mentioned something about the high order byte being
>>>> masked on AArch64, but didn't quite connect the dot to how that is
>>>> related to this code. Is it?
>>>
>>> Yes, we also need to mask the high order byte in AArch64 because of the
>>> way addressing works on that platform, and because -8 flips on those
>>> negative bits.
>>
>> Okay. Looking at
>> https://builds.shipilev.net/patch-openjdk-shenandoah-jdk/latest/src/hotspot/share/asm/assembler.cpp.udiff.html
>> that you posted before, assuming this is how it will be extended.
>>
>> So in the case with a heap base, the resulting address will never be
>> negative, and you know it will trap (presuming there is an mprotected
>> page at offset -8 from the heap base pointer. So it seems in this case,
>> the check could have just been offset == -8.
>>
>> So let's say we access null -8 without a heap base then. The address
>> becomes 0xFFFFFFFFFFFFFFF8. When that traps, the value you see in the
>> signal handler becomes 0x00FFFFFFFFFFFFF8 because of the virtual address
>> masking of the high order byte. I presume that is what the AArch64 code
>> is dealing with. But I don't quite understand why. Addresses starting
>> with 0x00FF are not valid in user space so we never need to worry about
>> any memory being mapped in there accidentally. So I'm not sure what this
>> code protects against.
>>
>> So that takes me back to my original intuition: isn't this the same as
>> checking at the very top: if (offset == BrooksPointer::byte_offset())
>> return false;
>>
>> If it really is as simple as just that, and I did not miss something not
>> obvious, then perhaps we would be better off abstracting a cell header
>> check here, as opposed to wrapping the whole thing in a virtual member
>> function. Especially since there are now a number of these occurrences
>> where some basic knowledge about cell header size that is 0 for all GCs
>> except Shenandoah where it is 8, leaves us wrapping a whole bunch of
>> stuff behind a GC interface, only to handle that one difference.
>>
>> While I don't know exactly what this cell abstraction would reasonably
>> look like in its full shape, perhaps we could start with something
>> simple like a virtual member function size_t cell_header_size() on
>> CollectedHeap with a suitable comment explaining that a cell header is a
>> GC thing we sometimes put above the object header. And then see if there
>> are more related functions that lead us to a helper class for cells or
>> something like that.
>>
>> I'm open to suggestions of course. Thought I'd check with you if this
>> sounds sane to you. Of course if my assumption is wrong that this check
>> could be written much simpler than it looks like, then I suppose I need
>> to give up on that idea. It all boils down to that really.
>>
>> Thanks,
>> /Erik
>>
>>> Also, the mach5 job came back with FAILED (see below). Can somebody with
>>> access check and see what's up?
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>> Build Details: 2018-11-01-1146402.roman.source
>>> 0 Failed Tests
>>> Mach5 Tasks Results Summary
>>>
>>> EXECUTED_WITH_FAILURE: 6
>>> KILLED: 0
>>> UNABLE_TO_RUN: 21
>>> PASSED: 48
>>> FAILED: 0
>>> NA: 0
>>> Build
>>>
>>> 6 Executed with failure
>>> solaris-sparcv9-solaris-sparcv9-build-8 error while
>>> building, return value: 2
>>> solaris-sparcv9-debug-solaris-sparcv9-build-9 error while
>>> building, return value: 2
>>> windows-x64-windows-x64-build-10 error while building,
>>> return value: 2
>>> windows-x64-debug-windows-x64-build-11 error while building,
>>> return value: 2
>>> windows-x64-open-windows-x64-build-12 error while building,
>>> return value: 2
>>> windows-x64-open-debug-windows-x64-build-13 error while
>>> building, return value: 2
>>> 2 Not run
>>> solaris-sparcv9-install-solaris-sparcv9-build-16 Dependency
>>> task failed: mach5...-8300-solaris-sparcv9-solaris-sparcv9-build-8
>>> windows-x64-install-windows-x64-build-17 Dependency task
>>> failed: YJftjiBUYc
>>>
>>> Test
>>>
>>> 19 Not run
>>>
>>> tier1-solaris-sparc-open_test_hotspot_jtreg_tier1_common-solaris-sparcv9-57
>>>
>>> Dependency task failed:
>>> mach5...-8300-solaris-sparcv9-solaris-sparcv9-build-8
>>>
>>> tier1-solaris-sparc-open_test_hotspot_jtreg_tier1_common-solaris-sparcv9-debug-58
>>>
>>> Dependency task failed:
>>> mach5...solaris-sparcv9-debug-solaris-sparcv9-build-9
>>>
>>> tier1-product-open_test_hotspot_jtreg_tier1_common-windows-x64-20
>>> Dependency task failed: YJftjiBUYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_common-windows-x64-debug-26
>>> Dependency task failed: YJftjiBVYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_compiler_1-windows-x64-debug-29
>>> Dependency
>>> task failed: YJftjiBVYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_compiler_2-windows-x64-debug-32
>>> Dependency
>>> task failed: YJftjiBVYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_compiler_3-windows-x64-debug-35
>>> Dependency
>>> task failed: YJftjiBVYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_compiler_not_xcomp-windows-x64-debug-38
>>>
>>> Dependency task failed: YJftjiBVYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_gc_1-windows-x64-debug-41
>>> Dependency task failed: YJftjiBVYc
>>>
>>> tier1-debug-open_test_hotspot_jtreg_tier1_gc_2-windows-x64-debug-44
>>> Dependency task failed: YJftjiBVYc
>>> See all 19...
>>>
>>>
>>>
>>>> Thanks,
>>>> /Erik
>>>>
>>>> On 2018-11-01 12:20, Roman Kennke wrote:
>>>>> Hi Kim, thanks for reviewing! I'll push it through jdk/submit now.
>>>>>
>>>>> Erik: ok for you too?
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>>>> On Oct 31, 2018, at 6:14 PM, Roman Kennke <rkennke at redhat.com> wrote:
>>>>>>>
>>>>>>> Hi Erik,
>>>>>>>
>>>>>>> right. Fixed this, and what what Kim mentioned plus a missing
>>>>>>> include:
>>>>>>>
>>>>>>> Incremental:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213199/webrev.01.diff/
>>>>>>> Full:
>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213199/webrev.01/
>>>>>>>
>>>>>>> Ok now?
>>>>>> Looks good.
>>>>>>
>>>>
>>>
>
--
regards,
Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
More information about the hotspot-dev
mailing list