FFM API: questions about reinterpret and MemorySegment

Anastasiya Lisitskaya lisnas at gmail.com
Wed Oct 2 22:58:23 UTC 2024


Hi,

That is very helpful!

So, if I want to use data from the heap without extra copying to off-heap
(native MemorySegment), should using String be avoided? It seems there is
no way to use a String without copying, as we can't guarantee a trailing
null terminator.
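
For illustration, a minimal sketch of keeping the text on the heap in
already-terminated form. This is only one possible approach, not something
confirmed in this thread: the names HeapCString and toCString are
hypothetical, and encoding the String into a byte[] is still one on-heap
copy. MemorySegment.ofArray only wraps the array, so a segment like this is
the kind of argument one would pass to a handle linked with
Linker.Option.critical(true).

```java
import java.lang.foreign.MemorySegment;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the FFM API.
class HeapCString {
    // Encode as UTF-8 and append one zero byte.
    // Arrays.copyOf pads the extra slot with 0, which is the terminator.
    static byte[] toCString(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(utf8, utf8.length + 1);
    }

    public static void main(String[] args) {
        byte[] cString = toCString("мое все 123 аи92");
        // Wraps the heap array; no off-heap allocation and no further copy.
        MemorySegment heapSegment = MemorySegment.ofArray(cString);
        // getString reads UTF-8 bytes up to the terminator.
        System.out.println(heapSegment.getString(0));
    }
}
```

If the same text is passed many times, the terminated array can be built
once and reused across calls.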

One thing still concerns me: is the result of processing an unterminated
string unpredictable? Only one test in my suite fails (either returning
this extra symbol or crashing).
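
On the predictability question: in C and C++, reading past the end of an
unterminated buffer is undefined behavior, so the outcome depends on
whatever bytes happen to follow the text in memory. The sketch below only
simulates that in pure Java (no native call is made); the class name, the
filler byte, and the gap parameter are made up for the demonstration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: simulates what a C-style reader sees
// when the text itself carries no terminator.
class UnterminatedDemo {
    // Copies `text` (without terminator) into a buffer pre-filled with
    // `filler`, places a zero byte `gap` bytes after the text, then reads
    // the buffer the way a C string reader would.
    static String readWithTrailing(String text, byte filler, int gap) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment buffer = arena.allocate(utf8.length + gap + 1);
            buffer.fill(filler);
            MemorySegment.copy(utf8, 0, buffer, ValueLayout.JAVA_BYTE, 0, utf8.length);
            buffer.set(ValueLayout.JAVA_BYTE, utf8.length + gap, (byte) 0);
            return buffer.getString(0);
        }
    }

    public static void main(String[] args) {
        // A zero happens to follow the text: looks fine.
        System.out.println(readWithTrailing("abc", (byte) 1, 0).length()); // 3
        // One SOH byte (0x01) sits before the next zero: extra symbol.
        System.out.println(readWithTrailing("abc", (byte) 1, 1).length()); // 4
    }
}
```

With real native memory nothing guarantees that a zero byte follows at all,
which would explain why most tests pass while one returns an extra symbol
or crashes.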

Many thanks!

On Wed, 2 Oct 2024 at 13:11, Maurizio Cimadamore <
maurizio.cimadamore at oracle.com>:

> Hi, some replies below:
> On 01/10/2024 20:40, Anastasiya Lisitskaya wrote:
>
> Hi,
>
> I'm trying to use the FFM API (jdk 22) to call my C++ method and I need
> to pass a text (java String) and receive a text response. While
> implementing this, I encountered several issues:
>
>    1. What are the best practices for defining newSize for use in the
>    reinterpret(long newSize) method? Can I use constants like
>    Long.MAX_VALUE or Integer.MAX_VALUE as newSize, or could that cause
>    some problems?
>
> If the size of the returned string (I assume it's a char*) is known, then
> use that size. Otherwise, use Long.MAX_VALUE. MemorySegment::getString will
> read the string bytes up to the null terminator.
>
>
>
>    2. When I tried to use an in-heap MemorySegment with
>    Linker.Option.critical(true) and passed
>    MemorySegment.ofArray(text.getBytes()), I started getting an extra
>    symbol such as SOH in the response. What am I doing wrong? (Sample
>    snippets are listed below.) Changing the newSize value in
>    reinterpret(long newSize) doesn't help.
>
>
>    1. If I inline MemorySegment.ofArray(text.getBytes()) into
>    invokeExact, I expected: "мое все 123 аи92", but got:
>
>>    uncaught exception:
>>        address -> 0x60000120d710
>>        what() -> "util/charset/wide.h:366: failed to decode UTF-8 string
>>    at pos 25 in string "\xD0\x9C\xD0\xBE\xD1\x91 \xD0\xB2\xD1\x81\xD1\x91 123
>>    \xD0\x90\xD0\23092\1\xCF\xFD\xBD_""
>>        type -> yexception
>
>
> I'm definitely doing something wrong. Please help me figure it out and
> understand. Thanks!
>
> I think your problem is that the segment you are creating has no NULL
> terminator at the end?
>
> E.g. you take a Java string, get its byte array, and turn the byte array
> into a segment.
>
> To work with strings safely, I suggest you use String-accepting
> allocation/accessor methods. Either Arena::allocateFrom(String), or
> MemorySegment::setString. Those will add the required terminator.
>
> I think even your first example looks incorrect (where you use
> `allocateFrom(JAVA_BYTE, text.getBytes())`), but you are probably saved
> there by the fact that malloc allocated a bigger chunk of memory and a zero
> just happens to be at the end of the string bytes?
>
> You can't pass the byte array of a Java string to a C/C++ function
> expecting a null-terminated string without performing some sort of copy
> and adding the required trailing terminator. Some C/C++ APIs might work
> with unterminated strings, in which case they will probably accept a
> size, e.g. how many characters are expected in the char*. But this
> doesn't seem to be the case here.
>
> Hope this helps
> Maurizio
>
>
>
>
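
To make the two suggestions above concrete, here is a hedged sketch, not
the code from my project. Arena::allocateFrom(String) adds the terminator,
and a raw pointer is widened with reinterpret(Long.MAX_VALUE) before
getString reads up to that terminator. No real native function is called:
MemorySegment.ofAddress stands in for the zero-length segment a char*
return value would produce, and the class and method names are
hypothetical. Note that reinterpret is a restricted method, so the JVM may
print a native-access warning.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

// Hypothetical demo class; only the FFM calls themselves are real API.
class TerminatedRoundTrip {
    // allocateFrom(String) encodes as UTF-8 and appends a zero byte,
    // so the segment is one byte larger than the encoded text.
    static long allocatedSize(String text) {
        try (Arena arena = Arena.ofConfined()) {
            return arena.allocateFrom(text).byteSize();
        }
    }

    // Simulates receiving a char* of unknown length from native code.
    static String readThroughRawPointer(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            // Zero-length segment: same address, but no accessible bytes yet.
            MemorySegment raw = MemorySegment.ofAddress(cString.address());
            // Widen the bounds, then let getString stop at the terminator.
            return raw.reinterpret(Long.MAX_VALUE).getString(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(allocatedSize("abc"));         // 4
        System.out.println(readThroughRawPointer("abc")); // abc
    }
}
```

When the length of the returned string is known, passing that length (plus
one for the terminator) to reinterpret keeps the bounds check meaningful.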

-- 
Best regards, Nastya Lisitskaya


More information about the panama-dev mailing list