[foreign] RFR 8218772: Limit struct member pointers to size of the field

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Feb 14 21:52:59 UTC 2019


On 14/02/2019 21:23, Jorn Vernee wrote:
> Yes, this is a good idea.
>
> Update webrev: 
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218772/webrev.02/
>
> Struct::assign and Array::assign were pretty similar, so I factored 
> out the code into a Pointer::assign method. I also implemented 
> equals/hashCode/toString for LayoutTypeImpl for that, which seem 
> useful to have any ways. (I can imagine there are scenarios where a 
> user would want to have a Map<LayoutType, ...> for instance).

Not uber convinced about Pointer::assign, mostly because we already have 
a Pointer::copy - we should only have the latter, and perhaps that's the 
right name for the new method?

Also, not greatly convinced by the need of the Array::ptr method - maybe 
hide that into the impl class?

>
> The new checking also uncovered a bug where allocation of a 0 length 
> array would return Pointer.nullPointer(), but this does not have the 
> right LayoutType. Besides, there's already a special case for 0 length 
> regions by returning MemoryBoundInfo.NOTHING. I just removed the 
> nullPointer() return, so we end up with a pointer with the right 
> LayoutType, and perhaps more importantly, the right Scope.
>
> However, this again caused a crash in ScopeTest, so I've also updated 
> the bounds checking code to remove the special checkBounds case for 
> accessing NOTHING at offset 0, and added a special case to checkRange 
> w.r.t. empty ranges, which fixes the issue.

Could you expand a bit on exactly who was calling checkRange with offset 
and length set to 0? I guess it was Pointer::addr?

The changes in MemoryBoundInfo look odd - for instance:

if (length == 0) {
+ if (offset < 0 || offset > this.length) {
+ throw new IllegalStateException("offset: " + offset + ", region 
length: " + this.length);
+ }
+ }


This  stuff inside the braces is 99% the same as checkBounds(offset), 
except for a '<' instead of '<='.

Wouldn't it be better to special case checkBounds to deal with zero 
length gracefully, and then just call it from checkRange?

Also, and probably even better - if this stuff is there just to handle 
NOTHING - wouldn't we better off overriding the check methods for the 
NOTHING instance (as we do for everything) ? I'm not a fan of these 
hard-coded special cases, which will be nearly impossible to read 2 
months from now.

Cheers
Maurizio


> Jorn
>
> Maurizio Cimadamore schreef op 2019-02-14 16:19:
>> On 14/02/2019 15:18, Maurizio Cimadamore wrote:
>>> Why not using the Array.assign API point, which already performs all 
>>> these checks?
>>
>> Similarly, I think References.OfStruct::set should use Struct.assign -
>> as there's always a chance of mis-using the setter MH to write a
>> struct that is bigger or smaller than the expected one.
>>
>> Maurizio
>>
>>>
>>> Maurizio
>>>
>>> On 14/02/2019 14:31, Jorn Vernee wrote:
>>>> Okay, I have implemented the check in References.OfArray::set.
>>>>
>>>> Updated webrev: 
>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218772/webrev.01/
>>>>
>>>> I was also tinkering a long time with simplifying the original 
>>>> code, but kept running into issues with bounds checks in 
>>>> MemoryBoundInfo. The current bounds checking code isn't really 
>>>> suited for dealing with 0 length range checks.
>>>>
>>>> Jorn
>>>>
>>>> Jorn Vernee schreef op 2019-02-13 19:49:
>>>>> Jorn Vernee schreef op 2019-02-13 19:44:
>>>>>> Henry Jen schreef op 2019-02-13 19:19:
>>>>>>> On Feb 13, 2019, at 9:42 AM, Jorn Vernee <jbvernee at xs4all.nl> 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I don't think this bug is just a symptom of the bulk copy of 
>>>>>>>> Arrays. Note that you can also cause an overwrite when you have 
>>>>>>>> a struct with 2 ints, take a pointer to the first one, cast it 
>>>>>>>> to a long pointer, and write to it.
>>>>>>>>
>>>>>>>
>>>>>>> This is perfectly OK. The working theory is that as long as you
>>>>>>> operated in the allocated memory region, you should be able to 
>>>>>>> cast as
>>>>>>> needed.
>>>>>>>
>>>>>>> In you array case earlier, that’s not OK as the type of array has a
>>>>>>> length, so I would expect an exception. But if you cast the 
>>>>>>> pointer to
>>>>>>> an Array of 3, then it should be able to overwrite the int. 
>>>>>>> After all,
>>>>>>> [2i32]i32 and [3i32] and i32i32i32 are the same size of memory 
>>>>>>> block.
>>>>>>
>>>>>> The problem is that there is no way to know the size of the target
>>>>>> array type once we're in References.OfArray::set, since the target
>>>>>> pointer might not actually have that type. The earliest location 
>>>>>> where
>>>>>> we do know this size is in RuntimeSupport, when retrieving the 
>>>>>> pointer
>>>>>> to the field.
>>>>>
>>>>> Actually, maybe we do know the type... I'm wondering about use of
>>>>> unsafe set operations in boxing code. I will just try it out and see.
>>>>>
>>>>> Jorn
>>>>>
>>>>>> Jorn
>>>>>>
>>>>>>> Cheers,
>>>>>>> Henry
>>>>>>>
>>>>>>>
>>>>>>>> It simply seems incorrect to me that a pointer to a struct's 
>>>>>>>> field provides access outside of that field's memory.
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> FWIW, I think the Array abstraction is a useful one. It signals 
>>>>>>>> the difference between having a chunk of memory (Array) vs. 
>>>>>>>> having just a cursor into memory (Pointer), and as such, I 
>>>>>>>> think it's fine to say that writing an Array constitutes a copy 
>>>>>>>> of the array vs. just a copy of the pointer. If a bulk-copy is 
>>>>>>>> not wanted, users can use the elementPointer() instead (same 
>>>>>>>> with Struct::ptr).
>>>>>>>>
>>>>>>>> The native types do not map perfectly into Java types, so there 
>>>>>>>> are some things that have to be learned when using the API. I 
>>>>>>>> think the pitfall here is the assumption that, since in C an 
>>>>>>>> array is just a pointer in a lot of cases, and setting a 
>>>>>>>> pointer does not incur a copy, the Array type in Java must also 
>>>>>>>> really be just a Pointer, and setting it should not incur a 
>>>>>>>> copy. But in the API we have 2 distinct types, Pointer and 
>>>>>>>> Array, so I don't think it's unreasonable to say that those 2 
>>>>>>>> will behave differently.
>>>>>>>>
>>>>>>>> Jorn
>>>>>>>>
>>>>>>>> Maurizio Cimadamore schreef op 2019-02-13 17:57:
>>>>>>>>> Hi Jorn,
>>>>>>>>> I was looking at something very related to this - e.g. 
>>>>>>>>> relationship
>>>>>>>>> between pointers and arrays, and, in general bulk-write 
>>>>>>>>> operations for
>>>>>>>>> structs and arrays (with Pointer::set) and I thought I might add
>>>>>>>>> something to this discussion, to see if the issues that you are
>>>>>>>>> running into are just bugs, or symptoms of something deeper.
>>>>>>>>> Over the last few weeks I've been toying with the idea of merging
>>>>>>>>> Array and Pointers - after all a BoundedPointer is expressive 
>>>>>>>>> enough
>>>>>>>>> to represent both. We could e.g. setup an approach like the 
>>>>>>>>> one below:
>>>>>>>>> - Pointer has array-like accessors Pointer::get(long), 
>>>>>>>>> Pointer::set(long, X)
>>>>>>>>> - regular accessor defaults to zero offset - that is, 
>>>>>>>>> Pointer::get()
>>>>>>>>> -> Pointer::get(0)
>>>>>>>>> - jextract does NOT generate setters for array struct fields, 
>>>>>>>>> or array
>>>>>>>>> global variables - only getters
>>>>>>>>> - an API is provided (well, one exists already) to do bulk copy
>>>>>>>>> between different arrays/structs
>>>>>>>>> While I like the unification this brings about (only one 
>>>>>>>>> abstraction
>>>>>>>>> instead of two, Pointer and Array), and I also like the fact 
>>>>>>>>> that we
>>>>>>>>> move towards a model where Pointer::get, Pointer::set are O(1)
>>>>>>>>> operations (with bulk operations left to higher level APIs), 
>>>>>>>>> there is
>>>>>>>>> something that doesn't 100% convinces me: if we go down this 
>>>>>>>>> path, the
>>>>>>>>> following layouts:
>>>>>>>>> u64:u64:i32 (pointer to pointer to int)
>>>>>>>>> and
>>>>>>>>> u64:[5 i32] (pointer to array of int)
>>>>>>>>> will effectively have the same carrier type:
>>>>>>>>> Pointer<Pointer<Integer>>
>>>>>>>>> The difference will be visible only upon closer inspection: we 
>>>>>>>>> could
>>>>>>>>> call Pointer::type() obtain a LayoutType, then do 
>>>>>>>>> LayoutType::layout()
>>>>>>>>> and see whether the pointee layout is a sequence or an address.
>>>>>>>>> Now, when we perform a get() on such a pointer, we can, given the
>>>>>>>>> layout, construct the right pointer with the right size info 
>>>>>>>>> (if any).
>>>>>>>>> But what about set() ? I see different options here, none of 
>>>>>>>>> which
>>>>>>>>> seems particularly satisfying:
>>>>>>>>> 1) Pointer::set should throw if you are trying to set an 
>>>>>>>>> array-like
>>>>>>>>> pointer (which would require bulk copy)
>>>>>>>>> 2) Pointer::set will silently perform bulk copy of the incoming
>>>>>>>>> pointer into the pointed region
>>>>>>>>> Of these, (2) is similar to what we have now, whereas (1) 
>>>>>>>>> would be a
>>>>>>>>> stricter variant.
>>>>>>>>> The thing that puzzles me is that, looking at the code it will be
>>>>>>>>> absolutely impossible to understand what's going on - e.g. if 
>>>>>>>>> native
>>>>>>>>> arrays and native pointers map to the same Java carrier 
>>>>>>>>> (Pointer) it
>>>>>>>>> then becomes super hard to explain/understand/predict why a given
>>>>>>>>> operation (e.g. Pointer::set) resulted in bulk copy/exception.
>>>>>>>>> One of the principles I've been trying to adhere to when 
>>>>>>>>> designing the
>>>>>>>>> foreign API is to avoid surprises - e.g. the semantics of a given
>>>>>>>>> operation should be clear from the carrier types involved in the
>>>>>>>>> operation. In that respect, maybe a pointer with an optional 
>>>>>>>>> length()
>>>>>>>>> is tolerable, but having Pointer::set behaving differently 
>>>>>>>>> depending
>>>>>>>>> on the result of Pointer::type() is that acceptable or too 
>>>>>>>>> subtle?
>>>>>>>>> This seems to suggest that the current Pointer vs. Array split 
>>>>>>>>> carries
>>>>>>>>> some weight. If carrier type are different then it's easy to 
>>>>>>>>> see for
>>>>>>>>> the user which operation might or might not be supported (or 
>>>>>>>>> might
>>>>>>>>> have bulk-logic).
>>>>>>>>> <sidebar>
>>>>>>>>> On the other hand, one might argue that this is already 
>>>>>>>>> happening with
>>>>>>>>> pointers to incomplete objects - if I have a Pointer<?> and I 
>>>>>>>>> call
>>>>>>>>> get() I don't know if I'll get an exception or not. It again 
>>>>>>>>> depends
>>>>>>>>> on whether the layout complete or not - e.g. if it's the void 
>>>>>>>>> layout
>>>>>>>>> or, if it's partial layout (because it refers to an unresolved 
>>>>>>>>> struct)
>>>>>>>>> an exception will be thrown.
>>>>>>>>> </sidebar>
>>>>>>>>> If we do retain Pointer vs. Array then I think we are free to 
>>>>>>>>> decide
>>>>>>>>> whether for Pointer::set we want (1) or (2) - that is, when we 
>>>>>>>>> have
>>>>>>>>> Pointer<Array<Integer>>
>>>>>>>>> or
>>>>>>>>> Pointer<StructFoo>
>>>>>>>>> should Pointer::set do bulk copy, or should it throw? If the 
>>>>>>>>> latter,
>>>>>>>>> should jextract even emit setters for struct fields of type
>>>>>>>>> array/struct? (or, perhaps, emulate such setters by performing
>>>>>>>>> explicit copy, rather than by using Pointer::set internally).
>>>>>>>>> These are all questions that are relevant, I believe, to the 
>>>>>>>>> fix you
>>>>>>>>> brought up - I'm of course fine with the fix, but I'd like to 
>>>>>>>>> also
>>>>>>>>> understand whether the bulk-copy on Pointer::set is one magic 
>>>>>>>>> trick
>>>>>>>>> too far and also, more generally, what do Panama-land feels 
>>>>>>>>> about the
>>>>>>>>> Array vs. Pointer split/lump.
>>>>>>>>> Feedback welcome.
>>>>>>>>> Cheers
>>>>>>>>> Maurizio
>>>>>>>>> On 13/02/2019 15:47, Jorn Vernee wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> I found a bug where it was possible to overwrite trailing 
>>>>>>>>>> fields of a struct by writing an oversized array to a 
>>>>>>>>>> previous array field (see bug). Overwriting is also possible 
>>>>>>>>>> in other cases by forcing an oversized write to a struct 
>>>>>>>>>> field. This can be fixed (relatively easily) by limiting the 
>>>>>>>>>> size of memory regions of pointers to struct members to the 
>>>>>>>>>> size of their fields.
>>>>>>>>>> Please review the following.
>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218772
>>>>>>>>>> Webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218772/webrev.00/ 
>>>>>>>>>> Thanks,
>>>>>>>>>> Jorn


More information about the panama-dev mailing list