[foreign] RFR 8218772: Limit struct member pointers to size of the field

Jorn Vernee jbvernee at xs4all.nl
Thu Feb 14 21:23:52 UTC 2019


Yes, this is a good idea.

Update webrev: 
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218772/webrev.02/

Struct::assign and Array::assign were pretty similar, so I factored out 
the code into a Pointer::assign method. I also implemented 
equals/hashCode/toString for LayoutTypeImpl for that, which seem useful 
to have any ways. (I can imagine there are scenarios where a user would 
want to have a Map<LayoutType, ...> for instance).

The new checking also uncovered a bug where allocation of a 0 length 
array would return Pointer.nullPointer(), but this does not have the 
right LayoutType. Besides, there's already a special case for 0 length 
regions by returning MemoryBoundInfo.NOTHING. I just removed the 
nullPointer() return, so we end up with a pointer with the right 
LayoutType, and perhaps more importantly, the right Scope.

However, this again caused a crash in ScopeTest, so I've also updated 
the bounds checking code to remove the special checkBounds case for 
accessing NOTHING at offset 0, and added a special case to checkRange 
w.r.t. empty ranges, which fixes the issue.

Jorn

Maurizio Cimadamore schreef op 2019-02-14 16:19:
> On 14/02/2019 15:18, Maurizio Cimadamore wrote:
>> Why not using the Array.assign API point, which already performs all 
>> these checks?
> 
> Similarly, I think References.OfStruct::set should use Struct.assign -
> as there's always a chance of mis-using the setter MH to write a
> struct that is bigger or smaller than the expected one.
> 
> Maurizio
> 
>> 
>> Maurizio
>> 
>> On 14/02/2019 14:31, Jorn Vernee wrote:
>>> Okay, I have implemented the check in References.OfArray::set.
>>> 
>>> Updated webrev: 
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218772/webrev.01/
>>> 
>>> I was also tinkering a long time with simplifying the original code, 
>>> but kept running into issues with bounds checks in MemoryBoundInfo. 
>>> The current bounds checking code isn't really suited for dealing with 
>>> 0 length range checks.
>>> 
>>> Jorn
>>> 
>>> Jorn Vernee schreef op 2019-02-13 19:49:
>>>> Jorn Vernee schreef op 2019-02-13 19:44:
>>>>> Henry Jen schreef op 2019-02-13 19:19:
>>>>>> On Feb 13, 2019, at 9:42 AM, Jorn Vernee <jbvernee at xs4all.nl> 
>>>>>> wrote:
>>>>>>> 
>>>>>>> I don't think this bug is just a symptom of the bulk copy of 
>>>>>>> Arrays. Note that you can also cause an overwrite when you have a 
>>>>>>> struct with 2 ints, take a pointer to the first one, cast it to a 
>>>>>>> long pointer, and write to it.
>>>>>>> 
>>>>>> 
>>>>>> This is perfectly OK. The working theory is that as long as you
>>>>>> operated in the allocated memory region, you should be able to 
>>>>>> cast as
>>>>>> needed.
>>>>>> 
>>>>>> In you array case earlier, that’s not OK as the type of array has 
>>>>>> a
>>>>>> length, so I would expect an exception. But if you cast the 
>>>>>> pointer to
>>>>>> an Array of 3, then it should be able to overwrite the int. After 
>>>>>> all,
>>>>>> [2i32]i32 and [3i32] and i32i32i32 are the same size of memory 
>>>>>> block.
>>>>> 
>>>>> The problem is that there is no way to know the size of the target
>>>>> array type once we're in References.OfArray::set, since the target
>>>>> pointer might not actually have that type. The earliest location 
>>>>> where
>>>>> we do know this size is in RuntimeSupport, when retrieving the 
>>>>> pointer
>>>>> to the field.
>>>> 
>>>> Actually, maybe we do know the type... I'm wondering about use of
>>>> unsafe set operations in boxing code. I will just try it out and 
>>>> see.
>>>> 
>>>> Jorn
>>>> 
>>>>> Jorn
>>>>> 
>>>>>> Cheers,
>>>>>> Henry
>>>>>> 
>>>>>> 
>>>>>>> It simply seems incorrect to me that a pointer to a struct's 
>>>>>>> field provides access outside of that field's memory.
>>>>>>> 
>>>>>>> ---
>>>>>>> 
>>>>>>> FWIW, I think the Array abstraction is a useful one. It signals 
>>>>>>> the difference between having a chunk of memory (Array) vs. 
>>>>>>> having just a cursor into memory (Pointer), and as such, I think 
>>>>>>> it's fine to say that writing an Array constitutes a copy of the 
>>>>>>> array vs. just a copy of the pointer. If a bulk-copy is not 
>>>>>>> wanted, users can use the elementPointer() instead (same with 
>>>>>>> Struct::ptr).
>>>>>>> 
>>>>>>> The native types do not map perfectly into Java types, so there 
>>>>>>> are some things that have to be learned when using the API. I 
>>>>>>> think the pitfall here is the assumption that, since in C an 
>>>>>>> array is just a pointer in a lot of cases, and setting a pointer 
>>>>>>> does not incur a copy, the Array type in Java must also really be 
>>>>>>> just a Pointer, and setting it should not incur a copy. But in 
>>>>>>> the API we have 2 distinct types, Pointer and Array, so I don't 
>>>>>>> think it's unreasonable to say that those 2 will behave 
>>>>>>> differently.
>>>>>>> 
>>>>>>> Jorn
>>>>>>> 
>>>>>>> Maurizio Cimadamore schreef op 2019-02-13 17:57:
>>>>>>>> Hi Jorn,
>>>>>>>> I was looking at something very related to this - e.g. 
>>>>>>>> relationship
>>>>>>>> between pointers and arrays, and, in general bulk-write 
>>>>>>>> operations for
>>>>>>>> structs and arrays (with Pointer::set) and I thought I might add
>>>>>>>> something to this discussion, to see if the issues that you are
>>>>>>>> running into are just bugs, or symptoms of something deeper.
>>>>>>>> Over the last few weeks I've been toying with the idea of 
>>>>>>>> merging
>>>>>>>> Array and Pointers - after all a BoundedPointer is expressive 
>>>>>>>> enough
>>>>>>>> to represent both. We could e.g. setup an approach like the one 
>>>>>>>> below:
>>>>>>>> - Pointer has array-like accessors Pointer::get(long), 
>>>>>>>> Pointer::set(long, X)
>>>>>>>> - regular accessor defaults to zero offset - that is, 
>>>>>>>> Pointer::get()
>>>>>>>> -> Pointer::get(0)
>>>>>>>> - jextract does NOT generate setters for array struct fields, or 
>>>>>>>> array
>>>>>>>> global variables - only getters
>>>>>>>> - an API is provided (well, one exists already) to do bulk copy
>>>>>>>> between different arrays/structs
>>>>>>>> While I like the unification this brings about (only one 
>>>>>>>> abstraction
>>>>>>>> instead of two, Pointer and Array), and I also like the fact 
>>>>>>>> that we
>>>>>>>> move towards a model where Pointer::get, Pointer::set are O(1)
>>>>>>>> operations (with bulk operations left to higher level APIs), 
>>>>>>>> there is
>>>>>>>> something that doesn't 100% convinces me: if we go down this 
>>>>>>>> path, the
>>>>>>>> following layouts:
>>>>>>>> u64:u64:i32 (pointer to pointer to int)
>>>>>>>> and
>>>>>>>> u64:[5 i32] (pointer to array of int)
>>>>>>>> will effectively have the same carrier type:
>>>>>>>> Pointer<Pointer<Integer>>
>>>>>>>> The difference will be visible only upon closer inspection: we 
>>>>>>>> could
>>>>>>>> call Pointer::type() obtain a LayoutType, then do 
>>>>>>>> LayoutType::layout()
>>>>>>>> and see whether the pointee layout is a sequence or an address.
>>>>>>>> Now, when we perform a get() on such a pointer, we can, given 
>>>>>>>> the
>>>>>>>> layout, construct the right pointer with the right size info (if 
>>>>>>>> any).
>>>>>>>> But what about set() ? I see different options here, none of 
>>>>>>>> which
>>>>>>>> seems particularly satisfying:
>>>>>>>> 1) Pointer::set should throw if you are trying to set an 
>>>>>>>> array-like
>>>>>>>> pointer (which would require bulk copy)
>>>>>>>> 2) Pointer::set will silently perform bulk copy of the incoming
>>>>>>>> pointer into the pointed region
>>>>>>>> Of these, (2) is similar to what we have now, whereas (1) would 
>>>>>>>> be a
>>>>>>>> stricter variant.
>>>>>>>> The thing that puzzles me is that, looking at the code it will 
>>>>>>>> be
>>>>>>>> absolutely impossible to understand what's going on - e.g. if 
>>>>>>>> native
>>>>>>>> arrays and native pointers map to the same Java carrier 
>>>>>>>> (Pointer) it
>>>>>>>> then becomes super hard to explain/understand/predict why a 
>>>>>>>> given
>>>>>>>> operation (e.g. Pointer::set) resulted in bulk copy/exception.
>>>>>>>> One of the principles I've been trying to adhere to when 
>>>>>>>> designing the
>>>>>>>> foreign API is to avoid surprises - e.g. the semantics of a 
>>>>>>>> given
>>>>>>>> operation should be clear from the carrier types involved in the
>>>>>>>> operation. In that respect, maybe a pointer with an optional 
>>>>>>>> length()
>>>>>>>> is tolerable, but having Pointer::set behaving differently 
>>>>>>>> depending
>>>>>>>> on the result of Pointer::type() is that acceptable or too 
>>>>>>>> subtle?
>>>>>>>> This seems to suggest that the current Pointer vs. Array split 
>>>>>>>> carries
>>>>>>>> some weight. If carrier type are different then it's easy to see 
>>>>>>>> for
>>>>>>>> the user which operation might or might not be supported (or 
>>>>>>>> might
>>>>>>>> have bulk-logic).
>>>>>>>> <sidebar>
>>>>>>>> On the other hand, one might argue that this is already 
>>>>>>>> happening with
>>>>>>>> pointers to incomplete objects - if I have a Pointer<?> and I 
>>>>>>>> call
>>>>>>>> get() I don't know if I'll get an exception or not. It again 
>>>>>>>> depends
>>>>>>>> on whether the layout complete or not - e.g. if it's the void 
>>>>>>>> layout
>>>>>>>> or, if it's partial layout (because it refers to an unresolved 
>>>>>>>> struct)
>>>>>>>> an exception will be thrown.
>>>>>>>> </sidebar>
>>>>>>>> If we do retain Pointer vs. Array then I think we are free to 
>>>>>>>> decide
>>>>>>>> whether for Pointer::set we want (1) or (2) - that is, when we 
>>>>>>>> have
>>>>>>>> Pointer<Array<Integer>>
>>>>>>>> or
>>>>>>>> Pointer<StructFoo>
>>>>>>>> should Pointer::set do bulk copy, or should it throw? If the 
>>>>>>>> latter,
>>>>>>>> should jextract even emit setters for struct fields of type
>>>>>>>> array/struct? (or, perhaps, emulate such setters by performing
>>>>>>>> explicit copy, rather than by using Pointer::set internally).
>>>>>>>> These are all questions that are relevant, I believe, to the fix 
>>>>>>>> you
>>>>>>>> brought up - I'm of course fine with the fix, but I'd like to 
>>>>>>>> also
>>>>>>>> understand whether the bulk-copy on Pointer::set is one magic 
>>>>>>>> trick
>>>>>>>> too far and also, more generally, what do Panama-land feels 
>>>>>>>> about the
>>>>>>>> Array vs. Pointer split/lump.
>>>>>>>> Feedback welcome.
>>>>>>>> Cheers
>>>>>>>> Maurizio
>>>>>>>> On 13/02/2019 15:47, Jorn Vernee wrote:
>>>>>>>>> Hi,
>>>>>>>>> I found a bug where it was possible to overwrite trailing 
>>>>>>>>> fields of a struct by writing an oversized array to a previous 
>>>>>>>>> array field (see bug). Overwriting is also possible in other 
>>>>>>>>> cases by forcing an oversized write to a struct field. This can 
>>>>>>>>> be fixed (relatively easily) by limiting the size of memory 
>>>>>>>>> regions of pointers to struct members to the size of their 
>>>>>>>>> fields.
>>>>>>>>> Please review the following.
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218772
>>>>>>>>> Webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8218772/webrev.00/ 
>>>>>>>>> Thanks,
>>>>>>>>> Jorn


More information about the panama-dev mailing list