Obtain MemorySegment from MemoryAddress for a given MemoryLayout

Tue Jan 5 08:30:20 UTC 2021

On 04/01/2021 22:12, Sebastian Stenzel wrote:
>> On 4. Jan 2021, at 11:46, Jorn Vernee <jorn.vernee at oracle.com> wrote:
>>
>> Hi,
>>
>> On 28/12/2020 19:15, Sebastian Stenzel wrote:
>>> During my latest experiments with Panama, I have had to deal a lot with MemoryAddresses pointing to structs that I need to access from Java:
>>>
>>> Let's say I have demo.h declaring my_struct:
>>>
>>> ```
>>> struct my_struct {
>>>    int foo;
>>> };
>>>
>>> ```
>>>
>>> If I run jextract on demo.h, it'll generate code that allows me to get "foo" from a segment, but not from an address:
>>>
>>> ```
>>> int getFooFromMyStruct(MemoryAddress addr) {
>>>      var segment = demo_h.my_struct.ofAddressRestricted(addr); // is there a better way, if I know the layout for sure?
>>>      return demo_h.my_struct.foo$get(segment); // segment required here
>>> }
>>> ```
>>>
>>> Now I'm wondering whether there is any safe (read: not restricted) way to obtain a MemorySegment for a given layout and address?
>> No, there is no non-restricted variant for turning a MemoryAddress into a MemorySegment. The latter is directly dereferencable from Java (hence that being the required type for the getter), while the former is not. The conversion from a MemoryAddress to a MemorySegment needs to essentially attach a size to a pointer to define a region of dereferencable memory, but there is no way to mechanically check that the size being attached is actually correct. So, dereferencing the resulting MemorySegment can still result in crashes and/or silent memory corruption, and hence it is restricted functionality.
>>
>> Jorn
>>
> I understand that it is impossible to cast a "point" to a "region" without any additional information, thus considering this a "dangerous" operation seems reasonable.
>
> However, in the case of jextract'ed code _any_ non-primitive parameter will essentially become a MemoryAddress, despite jextract knowing the exact type in many cases. Of course it is still black magic to create a segment from a `void *pointer`, but jextract knows the exact MemoryLayout of a `my_struct *pointer`.
>
> (Technically it might be possible for the native library to ignore type information and pass any address at runtime, but I'd consider this an upstream bug and nothing an API consumer in a strongly-typed environment should carry the can for. In other words: In this case, a crash seems legitimate, as it would have happened in any other language as well.)
>
> Long story short: Since jextract will give us MemoryAddresses (despite knowing the memory layout) almost all of the time (except for primitives), but at the same time generates code that requires a segment (as you explained above), basically all projects that make use of this generated code will require `-Dforeign.restricted=permit`. Therefore I'm asking if jextract could give us some "safe" way to deal with addresses with a known layout. Maybe there could be some internal API only accessible from extracted code that doesn't require restricted operations?
>
> I understand that this might not be that easy and might even affect the design of JEP 370/383 significantly.

Note that `-Dforeign.restricted=permit` is required anyway when using 
jextract bindings because using CLinker requires it (unless you never 
plan to call any functions from the generated bindings).

The safe way to deal with MemoryAddresses is to not dereference them at 
all, which doesn't require converting them to MemorySegments, i.e. no 
restricted operation is needed in that case. Addresses with a known 
layout should already be MemorySegments (e.g. a MemorySegment returned 
by allocateNative), and can already be dereferenced without using 
restricted operations.
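
As a rough illustration of why a region that carries its own size can be 
dereferenced safely, here is a minimal sketch that uses a plain 
java.nio.ByteBuffer as a stand-in for a segment. It is not the Panama 
API and not what jextract generates; the class name, constants and 
helper methods are hypothetical, and the 4-byte size assumes the 
my_struct from the example above with a 32-bit int.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BoundedRegionSketch {
    // Hypothetical stand-in for the layout of `struct my_struct { int foo; }`.
    static final int MY_STRUCT_SIZE = Integer.BYTES;
    static final int FOO_OFFSET = 0;

    // Analogous to allocating a region for a known layout: the size is fixed
    // at creation time, so every later access can be bounds-checked.
    static ByteBuffer allocateMyStruct() {
        return ByteBuffer.allocateDirect(MY_STRUCT_SIZE)
                         .order(ByteOrder.nativeOrder());
    }

    // Analogous to a generated getter that requires a bounded region
    // rather than a bare address.
    static int getFoo(ByteBuffer myStruct) {
        return myStruct.getInt(FOO_OFFSET);
    }

    public static void main(String[] args) {
        ByteBuffer s = allocateMyStruct();
        s.putInt(FOO_OFFSET, 42);
        System.out.println("foo = " + getFoo(s));

        try {
            // Reading past the end is rejected with an exception instead of
            // touching memory outside the region.
            s.getInt(MY_STRUCT_SIZE);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out-of-bounds access rejected");
        }
    }
}
```

A bare address has no such size attached, so there is nothing to check 
an access against; that missing information is what the restricted 
conversion has to supply on trust.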

"Technically it might be possible for the native library to ignore type 
information and pass any address at runtime, but I'd consider this an 
upstream bug and nothing an API consumer in a strongly-typed environment 
should carry the can for. In other words: In this case, a crash seems 
legitimate, as it would have happened in any other language as well."

Note that foreign.restricted does not draw a line between legitimate and 
illegitimate crashes. It draws a line between operations that can crash 
the VM (or silently corrupt memory) and operations that cannot. The goal 
of this API is to allow interop with 
native libraries, not to allow writing native code in Java. However, 
some operations are required that do not conform to Java's safety 
standards (those that can cause VM crashes), and these are marked as 
restricted for that reason.

Jorn