New candidate JEP: 419: Foreign Function & Memory API (Second Incubator)

Tue Oct 12 10:03:05 UTC 2021

On 12/10/2021 07:00, Sebastian Stenzel wrote:
> I looked at these two lines from the example on the JEP page:
>
>      MemorySegment cString = MemorySegment.allocateNative(javaStrings[i].length + 1, ...);
>      cString.setUtf8String(0, javaStrings[i]);
>
> Just looking at the code, I have to guess where exactly null-termination happens. There are two options:
>
> * Either `allocateNative` fills the allocated memory with zeros - while inefficient, this might be a legitimate assumption for people who only know Java
allocateNative does zero-init the memory.
> * Or `setUtf8String` adds a null byte - which should then be documented somewhere
That too.
>
> So I looked at the JavaDoc and I believe the documentation can be improved for both methods:
I agree javadoc should be improved.
>
> First of all, regarding `allocateNative`, I think it should be explicitly stated that allocation does not mean initialization. While this might be obvious for most people here, we should keep in mind that by making the non-Java world more accessible for Java developers, we should do our best to warn those people about the different laws of nature they need to obey when entering this world.
Except that allocateNative does zero the memory :-) There are some
obscure flags by which you can disable this, but in general that's the
lay of the land. If a more optimized allocator is needed, a
SegmentAllocator is probably the way to go, since it leaves more
freedom as to how memory is actually allocated and initialized. Note
that ByteBuffer.allocateDirect also zeroes its memory.
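
As a small illustration of the ByteBuffer point, here is a minimal sketch
that checks the zeroing behaviour using only the standard java.nio API
(not the incubating API). The class and method names are made up for this
example.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
class DirectZeroCheck {
    // Returns true if every byte of a freshly allocated direct buffer is zero.
    static boolean isZeroed(int size) {
        ByteBuffer buf = ByteBuffer.allocateDirect(size);
        for (int i = 0; i < buf.capacity(); i++) {
            if (buf.get(i) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("direct buffer zeroed: " + isZeroed(64));
    }
}
```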
>
>
> Furthermore, here are my two cents about the `setUtf8String` JavaDoc. Currently it says:
>
>> Writes a UTF-8 encoded, null-terminated string into this segment at given offset.
> ... which, being nit-picky, I read as "please pass a UTF-8 encoded, null-terminated string to this method". Instead, I'd suggest:
>
>> Writes the given string into this segment at given offset, converting it to a null-terminated byte sequence using UTF-8 encoding.
Sounds good.
>
> Last but not least, I'd like to note that the size of the allocated segment in the aforementioned example depends on the string length (which should be `length()`, not `length`), and this is inaccurate for strings containing code points that take more than one byte in UTF-8. Just thought I'd mention it, as people may copy this code.

This is all true, but let me put the example in context:

* that's the first example in the JEP
* the goal of that example is to give a "feel" of what the API can do
(you will see there are a lot of sections with "..." in them) - copy and
paste won't work
* in principle, it's easier to just use an allocator to allocate AND
initialize a UTF-8 native string with null termination; but allocators
felt like too much for this first example
* if you scroll down, there's a more detailed example on how to call
strlen - and that does use an implicitAllocator

The alternative is to replace the snippet with:

implicitAllocator().allocateUtf8String(javaStrings[i]);

Perhaps we might just do that (in the earlier iteration we did not have 
a constant for that allocator).
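
For anyone who does size the buffer by hand, the sizing concern quoted
above can be sketched with plain JDK classes (this deliberately avoids
the incubating API, and the class and method names are hypothetical):
the size has to come from the encoded UTF-8 byte count plus one for the
terminator, not from String.length().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class Utf8Sizing {
    // Number of bytes needed for a null-terminated UTF-8 copy of s.
    static int cStringSize(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length + 1;
    }

    // Encodes s as UTF-8 and appends a single zero byte as terminator.
    static byte[] toCString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        // copyOf pads the extra trailing element with zero.
        return Arrays.copyOf(utf8, utf8.length + 1);
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // 5 chars, but U+00E9 takes 2 bytes in UTF-8
        System.out.println("length() + 1   = " + (s.length() + 1));
        System.out.println("UTF-8 size + 1 = " + cStringSize(s));
    }
}
```

An allocator that encodes and terminates the string in one step
sidesteps this arithmetic entirely.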

Maurizio

>
>
>> On 12. Oct 2021, at 00:41, mark.reinhold at oracle.com wrote:
>>
>> https://openjdk.java.net/jeps/419
>>
>>   Summary: Introduce an API by which Java programs can interoperate with
>>   code and data outside of the Java runtime. By efficiently invoking
>>   foreign functions (i.e., code outside the JVM), and by safely accessing
>>   foreign memory (i.e., memory not managed by the JVM), the API enables
>>   Java programs to call native libraries and process native data without
>>   the brittleness and danger of JNI.
>>
>> - Mark