RFR: JDK-8213199: GC abstraction for Assembler::needs_explicit_null_check()

Erik Österlund erik.osterlund at oracle.com
Tue Nov 6 14:33:49 UTC 2018


Hi Roman,

Yes, the check is indeed the same for addresses and offsets, but only 
because it has been written in a non-obvious way to allow that. I think 
that is a bad idea compared to keeping the two cases separate and 
writing the corresponding code in a more obvious way, where these two 
different quantities cannot be mixed.

For offsets, we are interested in the following range:
[-cell_header_size, os::vm_page_size)

For addresses, we are interested in the following ranges:
1) with heap base: [heap_base - cell_header_size, heap_base + 
os::vm_page_size)
2) without heap base: [-cell_header_size, os::vm_page_size)

Just expressing these trivial ranges separately would have been very 
easy to understand.
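
To make that concrete, here is a rough sketch of what I mean 
(illustration only, not the actual patch; heap_base, cell_header_size 
and page_size stand in for Universe::narrow_oop_base(), the GC's cell 
header size and os::vm_page_size()):

   #include <stdint.h>

   // Offsets relative to a (possibly NULL) base pointer:
   // [-cell_header_size, page_size)
   static bool offset_in_protected_range(intptr_t offset,
                                         intptr_t cell_header_size,
                                         intptr_t page_size) {
     return -cell_header_size <= offset && offset < page_size;
   }

   // Raw fault addresses from the signal handler. A single unsigned
   // range check covers both cases: with a heap base the interesting
   // range is [heap_base - cell_header_size, heap_base + page_size);
   // without one, heap_base is 0 and the same expression covers
   // [-cell_header_size, page_size), wrapping around for the negative
   // part.
   static bool address_in_protected_range(uintptr_t addr,
                                          uintptr_t heap_base, // 0 if none
                                          intptr_t cell_header_size,
                                          intptr_t page_size) {
     return addr - (heap_base - (uintptr_t)cell_header_size)
            < (uintptr_t)(cell_header_size + page_size);
   }

needs_explicit_null_check() would then just return the negation of the 
applicable predicate: the signal handler consults the address variant 
and the compilers consult the offset variant.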

By baking this into the same check, there are multiple things that make 
me scratch my head before I understand how/if it works.
For example, let's say we request a 32 GB heap that just barely fits 
with compressed oops encoding, but just barely does not fit into the 
lower 32 GB virtual address space because it is off by one page - the 
NULL page. And for simplicity, let's say that we then map the heap with 
a heap base at an address equal to os::vm_page_size(), because that is 
the heap base we are given from the OS when reserving our heap.

Then the compiler asks if it needs a null check for an offset equal to 
os::vm_page_size() + 8. With the current logic, that offset gets 
subtracted by os::vm_page_size(), because the offset "looks like an 
address" despite being an offset. This results in the JIT thinking an 
implicit null check is fine, as the resulting offset 8 (after 
normalizing it like an address into an offset, even though it is not an 
address) looks like it will hit the protected page. But it is in fact 
not fine: given a NULL base pointer, this access lands in the heap past 
the first protected page, without trapping.
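
Concretely, assuming a 4K page size, the numbers in that scenario work 
out like this:

   heap_base = os::vm_page_size()      = 0x1000
   offset    = os::vm_page_size() + 8  = 0x1008

   current check:  offset >= heap_base, so treat it as an address and
                   subtract heap_base     -> 0x1008 - 0x1000 = 8
                   8 < os::vm_page_size() -> "implicit null check is ok"

   actual access:  NULL + 0x1008 = address 0x1008, which lies past the
                   protected page [0, 0x1000) -> no trap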

So it seems that in order for the current approach that treats offsets 
and addresses the same to work, we have an implicit contract that no 
offsets observed by the compiler may be greater than the compressed 
oops heap base, or things will go horribly wrong. Do we actually 
enforce this contract in any way? I see we have the HeapBaseMinAddress 
JVM flag, which typically defaults to 2G. But we have also defined 
MAX_OBJECT_SIZE like this:
#define MAX_OBJECT_SIZE \
   ( arrayOopDesc::header_size(T_DOUBLE) * HeapWordSize \
     + ((julong)max_jint * sizeof(double)) )
Since MAX_OBJECT_SIZE appears to be greater than HeapBaseMinAddress, it 
seems the contract is not enforced: we can map the heap at 2G and have 
offsets >2G, which will confuse the check into treating offsets like 
addresses, which can cause us to crash the VM. Do we have another seat 
belt that makes this okay? Not that I'm aware of.
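
For reference, plugging rough numbers into that macro (assuming an 
8-byte HeapWord and a header term of a few dozen bytes, which is 
negligible here):

   MAX_OBJECT_SIZE ~= header + (2^31 - 1) * 8 bytes ~= 16 GB
   HeapBaseMinAddress (typical default)             =   2 GB

so offsets of up to ~16 GB are considered plausible object field 
offsets, well above a 2 GB heap base.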

With all offsets now also being "adjusted" by 8 to determine whether to 
treat things like an address or not, the head scratching gets worse.

We now have a whole bunch of different cases handled by the same 
function, including:
1) normal offsets
2) negative offsets
3) offsets greater than heap base (probably does not work reliably today)
4) unmasked positive addresses before the heap base
5) unmasked positive addresses after the heap base
6) unmasked positive addresses without a heap base
7) unmasked negative addresses without a heap base
8) masked negative addresses without a heap base

And the single check that treats all these different scenarios the same 
is very far from obvious compared to simply checking the offset range 
and the address range separately, knowing that one function gets passed 
offsets and the other gets passed addresses.

Here is a patch with a suggested implementation prototype with my take:
http://cr.openjdk.java.net/~eosterlund/8213199/webrev.00/

* It performs sign extension in the AArch64 signal handler to fix 
masked addresses, so we don't need to special-case the masked addresses 
(see the sketch after this list)
* It splits the address vs offset cases so you don't have to scratch 
your head as much
* It does not rely on where the heap is mapped in order to work 
(avoiding the corner cases that come from blurring offsets that "look 
like addresses" with actual addresses)
* It supports GCs with a non-zero cell header size, in a trivially 
understandable way
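
For the first point, the signal handler fix is conceptually just this 
(a sketch of the idea only, not the exact code in the webrev; it 
assumes the kernel reports the fault address with the high-order byte 
masked off, and that bit 55 carries the sign of the original address):

   #include <stdint.h>

   // Restore the high-order byte of a masked AArch64 fault address by
   // sign-extending from bit 55, so that e.g. 0x00fffffffffffff8
   // becomes 0xfffffffffffffff8 again before it reaches the null check
   // logic.
   static uintptr_t unmask_fault_address(uintptr_t addr) {
     if (addr & (UINT64_C(1) << 55)) {
       addr |= UINT64_C(0xff) << 56;
     }
     return addr;
   }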

Thoughts?

Thanks,
/Erik

On 2018-11-05 16:25, Roman Kennke wrote:
> Another way to look at it is this:
>
> The method checks if:
>
> 1. An offset that, if it were dereferenced on NULL (0), would hit the
> protected 0-page (0 <= offset < page_size). Or
> 2. An actual address (the fault address in the signal handler) that
> hits the protected 0-page.
>
> ... which basically amounts to the exact same check. Extra handling is
> there to:
>
> - handle narrow-oops
> - mask aarch64 address (should be moved to signal handler)
> - with Shenandoah, handle the -8 offset, which would wrap the fault
> address around to 0xFF... if dereferenced on NULL
>
> The proposed patch seems like a reasonably good solution to me. And I
> have tried a bunch of different approaches too (basically keeping the
> main 'algorithm' in place and only adding a hook or two to handle
> Shenandoah), which didn't work out quite as nicely.
>
>
> Roman
>
>> Hi Roman,
>>
>> So if I get what you are saying right, we are using the bool
>> needs_explicit_null_check(intptr_t offset) function both with offsets
>> (which is what the parameter seems to suggest should be passed in) and
>> with raw addresses from the signal handler, posing as offsets. And we
>> happened to get away with that because the check was written in such a
>> clever way that it would return the right answer regardless of whether
>> it was given an offset or an address, saving us from writing the ~3
>> extra lines of code for a new function expecting an address.
>>
>> Except now, with negative offsets that additionally get masked by the
>> AArch64 Linux signal handler, this suddenly falls apart and we can no
>> longer seamlessly pass in offsets and addresses and get consistent
>> answers. The same function suddenly handles different combinations of
>> masked/unmasked positive/negative addresses, with/without heap base, as
>> well as positive/negative offsets.
>>
>> The first obvious suggestion that comes to mind is to sign extend the
>> high-order byte inside the signal handler on AArch64, to fix the
>> address before it gets passed in to needs_explicit_null_check, as
>> opposed to letting the shared needs_explicit_null_check function take
>> that responsibility, handling not just offsets and virtual addresses,
>> but also masked virtual addresses from the signal handler.
>>
>> So with the AArch64 stuff gone from this function, I still can't help
>> but think that the conditions in this code could be written in a
>> trivially understandable way that doesn't require me to scratch my
>> head, if there were one function expecting addresses and one function
>> expecting offsets. And in that case, I think it would be trivial to
>> have a cell header check in those functions, which would fit in well,
>> I think.
>>
>> Thoughts?
>>
>> Thanks,
>> /Erik
>>
>> On 2018-11-02 23:31, Roman Kennke wrote:
>>> Hi Erik,
>>>
>>> I had a discussion with Andrew Dinn (who actually wrote that code back
>>> then) and will copy his explanation here, I think it covers the
>>> situation:
>>>
>>> ""
>>> Let's just recap without considering Shenandoah:
>>>
>>> MacroAssembler::needs_explicit_null_check is called from several places,
>>> not just the SEGV handler.
>>>
>>> In those other cases offset is a putative field offset for an object.
>>>
>>> With uncompressed pointers the method has to decide whether this offset
>>> will lie within the read-protected page at address 0, i.e. 0 <= offset <
>>> os::vm_page_size(). If that is true then a null check can be omitted for
>>> a read or write operation. If not, an explicit check is needed.
>>>
>>> With compressed pointers when Universe::narrow_oop_base() != NULL the
>>> test also checks whether this offset may lie within the read-protected
>>> page at the heap base only -- oh -- that's the same test! Of course,
>>> we'll see that this is different when the SEGV handler makes the call.
>>>
>>> n.b. I think this case is included because there is the possibility that
>>> a null pointer can be represented by the narrow oop base address.
>>> However, I think in reality special-case checks in all encode/decode
>>> operations ensure that a null oop is always represented as zero (for
>>> AArch64 at least). I'm not absolutely sure of that though.
>>>
>>> Now, when called from the SEGV handler offset is actually the offending
>>> read/write address that caused the SEGV. In that case a null pointer
>>> will only have been dereferenced if the address lies in the protected
>>> zero page or in the protected heap base page. The first if clause
>>> handles the compressed pointer case and reduces it to the uncompressed
>>> pointer case by subtracting base when offset >= base.
>>>
>>> So, either way the offending address is potentially a null pointer
>>> dereference under exactly the same conditions as for those other calls.
>>>
>>> Next, how does Shenandoah modify this? Well, it looks like that depends
>>> on how this method is called. If it can be invoked from any of those
>>> other places to check whether a Brooks pointer read or write is ok to
>>> validate with an implicit null check then the offset may be passed as -8
>>> i.e. 0xfffffffffffffff8. However, that's actually not enough.
>>>
>>> When invoked from the handler because of an access at (0 + offset)
>>> then offset may legitimately be 0x00fffffffffffff8 or lie between 0 and
>>> os::vm_page_size().
>>>
>>> When invoked from the handler because of a dereference of
>>> (narrow_oop_base + offset) then the subtraction in the if clause means
>>> that offset may legitimately be 0xfffffffffffffff8 or lie between 0 and
>>> os::vm_page_size().
>>>
>>> So, the masking in the Shenandoah case is to reduce those two special
>>> cases into one.
>>> ""
>>>
>>> Roman
>>>
>>>
>>>
>>>> Hi Roman,
>>>>
>>>> On 2018-11-01 17:58, Roman Kennke wrote:
>>>>> Hi Erik,
>>>>>
>>>>>> Would you mind explaining how you need to override this and why? I'm
>>>>>> afraid I did not quite get it from your description in the RFC
>>>>>> (which is
>>>>>> also why I didn't come up with a solution either).
>>>>>>
>>>>>> Am I correct that you need to return false if the passed in offset is
>>>>>> the cell header offset -8, or is there anything else at all
>>>>>> required to
>>>>>> that logic?
>>>>> No, it's really just that. Plus take care of it in the case of combined
>>>>> narrow-oops-offset plus -8.
>>>>>> You mentioned something about the high order byte being
>>>>>> masked on AArch64, but didn't quite connect the dot to how that is
>>>>>> related to this code. Is it?
>>>>> Yes, we also need to mask the high order byte in AArch64 because of the
>>>>> way addressing works on that platform, and because -8 flips on those
>>>>> negative bits.
>>>> Okay. Looking at
>>>> https://builds.shipilev.net/patch-openjdk-shenandoah-jdk/latest/src/hotspot/share/asm/assembler.cpp.udiff.html
>>>>
>>>> that you posted before, assuming this is how it will be extended.
>>>>
>>>> So in the case with a heap base, the resulting address will never be
>>>> negative, and you know it will trap (presuming there is an mprotected
>>>> page at offset -8 from the heap base pointer). So it seems in this case,
>>>> the check could have just been offset == -8.
>>>>
>>>> So let's say we access null -8 without a heap base then. The address
>>>> becomes 0xFFFFFFFFFFFFFFF8. When that traps, the value you see in the
>>>> signal handler becomes 0x00FFFFFFFFFFFFF8 because of the virtual address
>>>> masking of the high order byte. I presume that is what the AArch64 code
>>>> is dealing with. But I don't quite understand why. Addresses starting
>>>> with 0x00FF are not valid in user space so we never need to worry about
>>>> any memory being mapped in there accidentally. So I'm not sure what this
>>>> code protects against.
>>>>
>>>> So that takes me back to my original intuition: isn't this the same as
>>>> checking at the very top: if (offset == BrooksPointer::byte_offset())
>>>> return false;
>>>>
>>>> If it really is as simple as just that, and I did not miss something
>>>> non-obvious, then perhaps we would be better off abstracting a cell
>>>> header check here, as opposed to wrapping the whole thing in a virtual
>>>> member function. Especially since there are now a number of these
>>>> occurrences where some basic knowledge about the cell header size
>>>> (which is 0 for all GCs except Shenandoah, where it is 8) leaves us
>>>> wrapping a whole bunch of stuff behind a GC interface, only to handle
>>>> that one difference.
>>>>
>>>> While I don't know exactly what this cell abstraction would reasonably
>>>> look like in its full shape, perhaps we could start with something
>>>> simple like a virtual member function size_t cell_header_size() on
>>>> CollectedHeap with a suitable comment explaining that a cell header is a
>>>> GC thing we sometimes put above the object header. And then see if there
>>>> are more related functions that lead us to a helper class for cells or
>>>> something like that.
>>>>
>>>> I'm open to suggestions of course. Thought I'd check whether this
>>>> sounds sane to you. Of course, if my assumption is wrong that this check
>>>> could be written much more simply than it currently looks, then I
>>>> suppose I need to give up on that idea. It all boils down to that
>>>> really.
>>>>
>>>> Thanks,
>>>> /Erik
>>>>
>>>>> Also, the mach5 job came back with FAILED (see below). Can somebody
>>>>> with
>>>>> access check and see what's up?
>>>>>
>>>>> Thanks,
>>>>> Roman
>>>>>
>>>>>
>>>>> Build Details: 2018-11-01-1146402.roman.source
>>>>> 0 Failed Tests
>>>>> Mach5 Tasks Results Summary
>>>>>
>>>>>    EXECUTED_WITH_FAILURE: 6
>>>>>    KILLED: 0
>>>>>    UNABLE_TO_RUN: 21
>>>>>    PASSED: 48
>>>>>    FAILED: 0
>>>>>    NA: 0
>>>>>
>>>>>    Build
>>>>>       6 Executed with failure
>>>>>          solaris-sparcv9-solaris-sparcv9-build-8: error while building, return value: 2
>>>>>          solaris-sparcv9-debug-solaris-sparcv9-build-9: error while building, return value: 2
>>>>>          windows-x64-windows-x64-build-10: error while building, return value: 2
>>>>>          windows-x64-debug-windows-x64-build-11: error while building, return value: 2
>>>>>          windows-x64-open-windows-x64-build-12: error while building, return value: 2
>>>>>          windows-x64-open-debug-windows-x64-build-13: error while building, return value: 2
>>>>>       2 Not run
>>>>>          solaris-sparcv9-install-solaris-sparcv9-build-16: Dependency task failed: mach5...-8300-solaris-sparcv9-solaris-sparcv9-build-8
>>>>>          windows-x64-install-windows-x64-build-17: Dependency task failed: YJftjiBUYc
>>>>>
>>>>>    Test
>>>>>       19 Not run
>>>>>          tier1-solaris-sparc-open_test_hotspot_jtreg_tier1_common-solaris-sparcv9-57: Dependency task failed: mach5...-8300-solaris-sparcv9-solaris-sparcv9-build-8
>>>>>          tier1-solaris-sparc-open_test_hotspot_jtreg_tier1_common-solaris-sparcv9-debug-58: Dependency task failed: mach5...solaris-sparcv9-debug-solaris-sparcv9-build-9
>>>>>          tier1-product-open_test_hotspot_jtreg_tier1_common-windows-x64-20: Dependency task failed: YJftjiBUYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_common-windows-x64-debug-26: Dependency task failed: YJftjiBVYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_compiler_1-windows-x64-debug-29: Dependency task failed: YJftjiBVYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_compiler_2-windows-x64-debug-32: Dependency task failed: YJftjiBVYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_compiler_3-windows-x64-debug-35: Dependency task failed: YJftjiBVYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_compiler_not_xcomp-windows-x64-debug-38: Dependency task failed: YJftjiBVYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_gc_1-windows-x64-debug-41: Dependency task failed: YJftjiBVYc
>>>>>          tier1-debug-open_test_hotspot_jtreg_tier1_gc_2-windows-x64-debug-44: Dependency task failed: YJftjiBVYc
>>>>>          See all 19...
>>>>>
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> /Erik
>>>>>>
>>>>>> On 2018-11-01 12:20, Roman Kennke wrote:
>>>>>>> Hi Kim, thanks for reviewing! I'll push it through jdk/submit now.
>>>>>>>
>>>>>>> Erik: ok for you too?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>>>> On Oct 31, 2018, at 6:14 PM, Roman Kennke <rkennke at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Erik,
>>>>>>>>>
>>>>>>>>> right. Fixed this, and what Kim mentioned, plus a missing
>>>>>>>>> include:
>>>>>>>>>
>>>>>>>>> Incremental:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213199/webrev.01.diff/
>>>>>>>>> Full:
>>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8213199/webrev.01/
>>>>>>>>>
>>>>>>>>> Ok now?
>>>>>>>> Looks good.
>>>>>>>>


