[foreign] RFR 8212987 : Binder should allows negative values for native pointers

Jorn Vernee jbvernee at xs4all.nl
Tue Nov 6 19:10:22 UTC 2018


Hey Henry, thanks for taking a look at this again. Comments inline...

Henry Jen schreef op 2018-11-06 18:46:
> I am afraid any attempt to support negative address without completely
> considered address unsigned is not gonna working well.

That is my current intent; the address (i.e. min) is consider unsigned. 
The length of BoundedMemoryRegion is just there to be able to do a 
bounds check, not having a length for `Everything` currently just means 
that you can't do any bounds checks for a memory region that is larger 
than Long.MAX_VALUE, since that is the limit for the length.

> On the SPARC note, I did a simple test on sun4u and sun4v, both not
> showing negative address. I suspect the negative address is observed
> on 32-bit OS.
> 
> void main(int argc, char** argv) {
>     char* buf = malloc(1024);
>     printf("Allocated address: %x\n", buf);
> 
>     printf("printf address: %x\n", printf);
> }

OK, I'm not sure about the negative addresses then.

> The patch failed TestJextractFFI failed because of EVERYTHING throw
> exception on length(), I think we could simply limit max length to -1L
> or Long.MAX_VALUE depending on if we want to allow 63-bit or 64-bit
> address and have Everything return accordingly. Before we have
> unsigned address story complete, I prefer EVERYTHING should simply be
> 0-Long.MAX_VALUE.

Ah, sorry, looks like that check is not in my local copy due to a bad 
merge (I have the unboxValue method in SystemABI).

The check that it's crashing on can be changed to 
`if(struct.ptr().type().bytesSize() != 0) {` instead, and that should 
fix the crash. Instead of checking the size of the underlying memory 
region, it would check the size of the struct's layout, and it seems 
that that is the actual intent of that check as well? i.e. checking that 
it's not an empty struct.

About the length() method; I have another patch that moves the 
bytesSize() and elements() functions from `Pointer` to `Array` and that 
removes the need to have a length() method on BoundedMemoryRegion at 
all, since they can instead rely on the size field of the array. [1] 
(that patch is designed to apply on top of the one from this RFR)

This RFR should enable unsigned addresses, at least as far as 
BoundedMemoryRegion goes. It doesn't complete the story of having a 
BoundedMemoryRegion larger than Long.MAX_VALUE, but that is a different 
problem. If I make EVERYTHING be a region from 0-Long.MAX_VALUE again, 
but the offset is unsigned then AFAICT there are no use-cases left for 
EVERYTHING, since the offset could be outside of it, so it would always 
be out of bounds.

> Any negative value exposed now may cause problem if user doesn’t treat
> those as unsigned values.

Would just documenting that behavior be sufficient for the user you 
think? AFAICT the only way the offset is currently exposed is through 
the `Pointer::addr` method and through the string returned by 
`BoundedPointer::toString` (although it doesn't show up as negative for 
the latter).

Thanks,
Jorn

[1] : 
http://cr.openjdk.java.net/~jvernee/panama/webrevs/arrays/webrev.01/

> Cheers,
> Henry
> 
> 
> LibClang crash recovery disabled
> null
> java.lang.UnsupportedOperationException
> 	at
> java.base/jdk.internal.foreign.memory.BoundedMemoryRegion$Everything.length(BoundedMemoryRegion.java:351)
> 	at
> java.base/jdk.internal.foreign.memory.BoundedPointer.bytesSize(BoundedPointer.java:103)
> 	at
> java.base/jdk.internal.foreign.invokers.UniversalNativeInvoker.unboxValue(UniversalNativeInvoker.java:176)
> 	at
> java.base/jdk.internal.foreign.invokers.UniversalNativeInvoker.invoke(UniversalNativeInvoker.java:125)
> 	at
> clang/clang.Index$Impl/0x00000007c0170040.clang_getTokenSpelling(Unknown
> Source)
> 	at
> jdk.internal.clang/jdk.internal.clang.TranslationUnit$Token.spelling(TranslationUnit.java:139)
> 	at
> jdk.internal.clang/jdk.internal.clang.TranslationUnit.tokens(TranslationUnit.java:72)
> 	at
> jdk.jextract/com.sun.tools.jextract.parser.Parser.handleMacro(Parser.java:168)
> 	at
> jdk.jextract/com.sun.tools.jextract.parser.Parser.lambda$parse$8(Parser.java:127)
> 	at
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
> 	at
> java.base/java.util.stream.ReferencePipeline$11$1.accept(ReferencePipeline.java:442)
> 	at
> java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
> 	at
> java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
> 	at
> java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
> 	at
> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> 	at
> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> 	at
> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> 	at
> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
> 	at 
> jdk.jextract/com.sun.tools.jextract.parser.Parser.parse(Parser.java:104)
> 	at jdk.jextract/com.sun.tools.jextract.Context.parse(Context.java:411)
> 	at jdk.jextract/com.sun.tools.jextract.Context.parse(Context.java:387)
> 	at jdk.jextract/com.sun.tools.jextract.Main.run(Main.java:251)
> 	at jdk.jextract/com.sun.tools.jextract.Main.main(Main.java:316)
> 
>> On Nov 2, 2018, at 2:09 PM, Jorn Vernee <jbvernee at xs4all.nl> wrote:
>> 
>> Comments inline...
>> 
>> Henry Jen schreef op 2018-11-02 19:19:
>>> On Nov 2, 2018, at 4:46 AM, Jorn Vernee <jbvernee at xs4all.nl> wrote:
>>>> Henry Jen schreef op 2018-11-02 05:02:
>>>>> This pretty much passed all tests, except that we are expecting 
>>>>> wrong
>>>>> exception, depends on SPEC, we either need to fix test or fix
>>>>> implementation. I’ll let Maurizio comments on what exception should 
>>>>> we
>>>>> expect.
>>>> This depends on the used invoker again. Since the universal native 
>>>> invoker wraps the IOOBE inside an IllegalStateException that's what 
>>>> I'm checking for. TestNG doesn't seem to have an easy way to check 
>>>> the cause of an exception. I could do that manually, but I agree 
>>>> that it depends on the spec, so it needs to be decided what kind of 
>>>> exception we want to see there.
>>>>> However, change the BoundedPointer to create new MemoryRegion has a
>>>>> subtle difference, which I think worth attention. EVERYTHING has a
>>>>> special meaning that it’s the whole available memory. With the new
>>>>> implementation, the region won’t allow access before the address, 
>>>>> this
>>>>> is not correct in turns of Pointer as it can be moved freely in the
>>>>> bounded memory region, even backwards.
>>>> I hadn't thought about that yet, that's a good point.
>>>> Though I have to point out that the current implementation of some 
>>>> of the BoundedPointer factories were already disallowing this by 
>>>> creating a memory region at the requested offset [1].
>>> Thanks for all the work, you can tell this is prototype and not
>>> production quality. :P
>>> To me, Pointer to throw IndexOutOfBoundException seems reasonable.
>>> Like you said, the UniversalNativeInvoker actually rely on
>>> Pointer.get() to get the return Pointer, and that wrap up the IOOBE
>>> into ISE.
>>> For now, I think we can probably just relax the expectation to be
>>> RuntimeException?
>> 
>> Yeah, that seems fine.
>> 
>>>>> Although we can argue that it’s better to have Pointer construct 
>>>>> out
>>>>> of a strictly bounded memory region, it’s a change of current 
>>>>> behavior
>>>>> and that probably need some discussion regarding to use cases.
>>>>> Anyway, like I said, I think enable full 64bit address space is
>>>>> probably not a bad thing. I suspect once we change the length 
>>>>> instead
>>>>> of using min and max address value, the bound check won’t be that
>>>>> confusing, but you may feel differently.
>>>> I had a version which used min and max offset, but to make that safe 
>>>> I had to implement a version of Math.addExact for unsigned numbers, 
>>>> which was being called in checkBounds and checkRange to make sure 
>>>> that the offset being passed in didn't overflow when adding to the 
>>>> region's `min` address and length. The downside of this was that it 
>>>> made the implementation noticeably slower, I guess since my version 
>>>> of addExact isn't intrinsified. I got rid of one of the overflow 
>>>> checks by using lengthMinusOne (i.e. length - 1). I could probably 
>>>> look into that some more though.
>>>> Really, since BoundedMemoryRegion only does a bounds check and then 
>>>> an access, maybe we should choose a design where we create an 
>>>> unbounded version of `Pointer` and use that instead of using 
>>>> EVERYTHING. If the intent is really to make every 64bit address 
>>>> accessible than that should behave the same I think, but it seems 
>>>> much easier to make. It just leaves the problem of not being able to 
>>>> create a memory region of size >= Long.MAX_VALUE for another time.
>>> I like your idea. For special cases, that is with address to be
>>> negative, we can have a special unbounded pointer that cannot be
>>> dereferenced. After all, this is the use case here, a special error
>>> code. I doubt there would be a valid address with negative value, but
>>> I don’t have extensive knowledge about CPU architecture.
>> 
>> It seems this doesn't hold for e.g. Solaris [1], but it depends on the 
>> OS. So it seems that dereferencing negative pointers will have to be 
>> supported at some point.
>> 
>> In the meantime I've worked on things a little more and revisited the 
>> idea of a subclass of BoundedMemoryRegion that doesn't do the bounds 
>> check. After thinking about it, I don't think this is as hacky as it 
>> seemed. The idea is that you can either have a memory region with an 
>> explicit length between 0 and Long.MAX_VALUE, and any offset, OR have 
>> a memory region starting at 0 that covers everything, without an 
>> explicit length. Since the only restriction is that the explicit 
>> length can not be larger than Long.MAX_VALUE, but still any offset is 
>> accepted I think this might get us a little more mileage.
>> 
>> Here's the webrev: 
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8212987/webrev.09/
>> 
>> Some implementation notes:
>> 
>> - Right now the memory region without an explicit offset or length 
>> (called Everything) is a nested subclass of BoundedMemoryRegion, but 
>> maybe these should be refactored into an abstract class and 
>> implementation pair.
>> 
>> - I noticed some missing access checks in put-/get-Address, and 
>> putBits was checking for read access instead of write access. So I've 
>> refactored the checks into checkRead and checkWrite methods to 
>> simplify things. The put-/get-Bits methods were also checking the size 
>> twice, both explicitly and as part of the switch, so I removed the 
>> explicit check.
>> 
>> - I've changed the BoundedPointer factories to use the new Everything 
>> class, instead of creating a memory region at the argument offset. 
>> That should allow 'looking back' with the returned pointer as well.
>> 
>> The other thing is that since the EVERYTHING region doesn't have a 
>> length, you can't request a pointer's `byteSize()` [2], or iterate 
>> over it's `elements()` [3]. I'm not sure I find those public API 
>> points all that sensible to be honest; with `byteSize()` I'd expect to 
>> get the size of the Pointer's layout, but instead it returns the size 
>> of the memory region - offset. You get the size of the layout by using 
>> `(type.layout().bitsSize() / 8)`. And something like `elements()` 
>> seems more at home in `Array`, since that has an explicit element 
>> count associated with it, while a pointer does not. Imho these API 
>> points should be moved to Array. The doc for `elements()` already 
>> talks about the pointer as if it's an array, so it seems like these 
>> API points were from before you had the Array interface?
>> 
>> Cheers,
>> Jorn
>> 
>> [1] : 
>> http://mail.openjdk.java.net/pipermail/panama-dev/2018-October/002995.html
>> [2] : 
>> http://hg.openjdk.java.net/panama/dev/file/cfee9bc4ebe4/src/java.base/share/classes/java/foreign/memory/Pointer.java#l94
>> [3] : 
>> http://hg.openjdk.java.net/panama/dev/file/cfee9bc4ebe4/src/java.base/share/classes/java/foreign/memory/Pointer.java#l58


More information about the panama-dev mailing list