FFM API: questions about reinterpret and MemorySegment
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Oct 3 08:45:05 UTC 2024
On 02/10/2024 23:58, Anastasiya Lisitskaya wrote:
> Hi,
>
> It is very helpful!
>
> So, if I want to use data from the heap without extra copying to
> off-heap (native MemorySegment), should using String be avoided? It
> seems there is no way to use a String without copying, as we can't
> guarantee a trailing null terminator.
I'm afraid that's the case. The Java String API does not concern itself
with string terminators because, in Java, all strings have a known
length. In C that's not the case, so in general you need to append a
terminator, and that will involve some degree of copying.
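
As a rough sketch of what "some degree of copying" could look like while staying on the heap: the encoded bytes are copied once more into an array that is one byte longer, so the last byte is zero. The class and helper names below (HeapCString, toTerminatedUtf8) are made up for illustration, and whether such a heap segment suits your downcall depends on how the linker options are set up on your side.

```java
import java.lang.foreign.MemorySegment;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HeapCString {
    // Hypothetical helper: encodes the text as UTF-8 and appends a single 0 byte.
    static byte[] toTerminatedUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // Arrays.copyOf pads the extra slot with 0, which serves as the terminator.
        return Arrays.copyOf(bytes, bytes.length + 1);
    }

    public static void main(String[] args) {
        String text = "мое все 123 аи92";
        // Heap segment backed by the terminated byte array.
        MemorySegment heapSegment = MemorySegment.ofArray(toTerminatedUtf8(text));
        // getString reads up to the terminator and decodes as UTF-8.
        System.out.println(heapSegment.getString(0));
    }
}
```

This still copies (once in getBytes, once in copyOf), but nothing is allocated off-heap.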
>
> One thing still concerns me: is processing an unterminated string
> unpredictable? Only one test from my suite fails (returning this extra
> symbol or crashing).
Processing an unterminated string leads to undefined behavior.
Effectively, your program is scanning _past_ the contents of your
string, in search of a zero. Because of the way some allocators work
(e.g. malloc) it is likely that a zero will be found more or less where
expected. But that behavior is OS/platform dependent and absolutely
cannot be relied upon.
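
To make the "scanning past the contents" point concrete, here is a small simulation in plain Java. It is only a model: the byte array stands in for native memory, the leftover bytes after the text are invented, and scanToZero is a hypothetical strlen-like helper. In real native code there is no bounds check, so the scan continues until it happens to hit a zero (or crashes).

```java
import java.nio.charset.StandardCharsets;

public class UnterminatedScan {
    // strlen-like scan: counts bytes from 'start' up to the first 0 byte.
    static int scanToZero(byte[] memory, int start) {
        int i = start;
        while (memory[i] != 0) {
            i++;
        }
        return i - start;
    }

    public static void main(String[] args) {
        byte[] text = "аи92".getBytes(StandardCharsets.UTF_8);
        // Simulated memory: the string bytes, some leftover bytes, then a zero.
        byte[] memory = new byte[text.length + 4];
        System.arraycopy(text, 0, memory, 0, text.length);
        memory[text.length] = 0x01;            // SOH, as in the reported output
        memory[text.length + 1] = (byte) 0xCF; // made-up leftover bytes
        memory[text.length + 2] = (byte) 0xFD;
        // memory[text.length + 3] stays 0 and ends the scan.
        System.out.println("actual length:  " + text.length);
        System.out.println("scanned length: " + scanToZero(memory, 0));
    }
}
```

The scanned length is larger than the actual one, and the extra bytes are what ends up being decoded as part of the string.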
Maurizio
>
> Many thanks!
>
> Wed, 2 Oct 2024 at 13:11, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com>:
>
> Hi, some replies below:
>
> On 01/10/2024 20:40, Anastasiya Lisitskaya wrote:
>> Hi,
>>
>> I'm trying to use the FFM API (JDK 22) to call my C++ method and
>> I need to pass text (a Java String) and receive a text response.
>> While implementing this, I encountered several issues:
>>
>> 1.
>>
>> What are the best practices for defining |newSize| for use in
>> the |reinterpret(long newSize)| method? Can I use constants
>> like |Long.MAX_VALUE| or |Integer.MAX_VALUE| as |newSize|, or
>> could that cause some problems?
>>
> If the size of the returned string (I assume it's a char*) is
> known, then use that size. Otherwise, use Long.MAX_VALUE.
> MemorySegment::getString will read the string bytes up to the null
> terminator.
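
A minimal sketch of the reinterpret step, under some assumptions: the zero-length segment below is created with MemorySegment.ofAddress and merely stands in for the char* a native function would return, and readCString is a hypothetical helper name. Note that reinterpret is a restricted method, so the JDK may print a native-access warning when it runs.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

public class ReinterpretString {
    // Hypothetical helper: reads a null-terminated UTF-8 string from a
    // zero-length segment (e.g. a pointer returned by a downcall).
    static String readCString(MemorySegment pointer) {
        // Size unknown, so resize to the maximum; getString stops at the terminator.
        return pointer.reinterpret(Long.MAX_VALUE).getString(0);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // Terminated native string, standing in for memory owned by the C++ side.
            MemorySegment original = arena.allocateFrom("hello");
            // Zero-length segment with the same address, like a returned char*.
            MemorySegment pointer = MemorySegment.ofAddress(original.address());
            System.out.println(pointer.byteSize());
            System.out.println(readCString(pointer));
        }
    }
}
```

This only works because the memory really does contain a terminator; the resized segment performs no meaningful bounds checking.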
>
>
>> 2.
>>
>> When I tried to use an in-heap |MemorySegment| with
>> |Linker.Option.critical(true)| and passed
>> |MemorySegment.ofArray(text.getBytes())|, I started getting
>> an extra symbol such as SOH in the response. What am I doing
>> wrong? (Sample snippets listed below). Changing the newSize
>> value in reinterpret(long newSize) doesn't help.
>>
>> 1.
>> If I inline MemorySegment.ofArray(text.getBytes()) into
>> invokeExact, I expected : "мое все 123 аи92", but got:
>>
>> uncaught exception:
>> address -> 0x60000120d710
>> what() -> "util/charset/wide.h:366: failed to decode
>> UTF-8 string at pos 25 in string
>> "\xD0\x9C\xD0\xBE\xD1\x91 \xD0\xB2\xD1\x81\xD1\x91 123
>> \xD0\x90\xD0\23092\1\xCF\xFD\xBD_""
>> type -> yexception
>>
>> I'm definitely doing something wrong. Please help me figure it
>> out and understand. Thanks!
>
> I think your problem is that the segment you are creating has no
> NULL terminator at the end?
>
> E.g. you take a Java string, get its byte array, and turn the byte
> array into a segment.
>
> To work with string safely, I suggest you use String-accepting
> allocation/accessor methods. Either Arena::allocateFrom(String),
> or MemorySegment::setString. Those will add the required terminator.
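
For reference, a short sketch of the two String-accepting routes mentioned above. The class name and the copyWithSetString helper are made up for illustration; both routes use the default UTF-8 charset and end up with one extra zero byte after the encoded text.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

public class TerminatedAllocation {
    // Hypothetical helper: allocates room for the text plus a terminator,
    // then lets setString write the bytes and the trailing zero.
    static MemorySegment copyWithSetString(Arena arena, String text) {
        long size = text.getBytes(StandardCharsets.UTF_8).length + 1L;
        MemorySegment segment = arena.allocate(size);
        segment.setString(0, text);
        return segment;
    }

    public static void main(String[] args) {
        String text = "мое все 123 аи92";
        try (Arena arena = Arena.ofConfined()) {
            // Route 1: allocation and terminated copy in a single call.
            MemorySegment viaAllocate = arena.allocateFrom(text);
            // Route 2: write into a segment you already own.
            MemorySegment viaSet = copyWithSetString(arena, text);

            long last = viaAllocate.byteSize() - 1;
            System.out.println(viaAllocate.get(ValueLayout.JAVA_BYTE, last));
            System.out.println(viaSet.getString(0).equals(text));
        }
    }
}
```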
>
> I think even your first example looks incorrect (where you use
> `allocateFrom(JAVA_BYTE, text.getBytes())`), but you are probably
> saved there by the fact that malloc allocated a bigger chunk of
> memory and a zero just happens to be at the end of the string bytes?
>
> You can't pass the byte array of a Java string to a C/C++ function
> expecting a null-terminated string w/o performing some sort of
> copy and adding the required trailing terminator. Some C/C++ APIs
> might work with unterminated strings, in which case they will
> probably accept a size - e.g. how many characters are expected in
> the char*. But this doesn't seem to be the case here.
>
> Hope this helps
> Maurizio
>
>
>
>
>
> --
> Best regards, Nastya Lisitskaya