possible problem with JNI GetStringUTFChars

David Holmes david.holmes at oracle.com
Fri Jan 25 06:04:24 UTC 2019


On 25/01/2019 4:39 am, Alan Snyder wrote:
> Thank you. That post does explain what is happening, but leaves open the question of whether GetStringUTFChars should be changed.
> 
> What is the value of the current implementation of GetStringUTFChars versus one that returns true UTF-8?

Well, that's really a HotSpot question as it concerns JNI, but this is 
ancient history. There's little point musing over the "why" of decisions 
made back in the late 1990s. But I suspect the main reason is the 
avoidance of embedded NUL characters.
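
To illustrate the difference, here is a minimal sketch in plain Java. It
assumes that DataOutputStream.writeUTF is a fair stand-in for what
GetStringUTFChars returns, since both are documented as using modified
UTF-8; it does not call JNI itself. The class and method names are made
up for this example. The emoji is U+1F37B, which is the code point that
the byte values quoted later in this thread decode to.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the JDK or of the original thread.
class ModifiedUtf8Demo {

    // Standard UTF-8, as produced by String.getBytes(UTF_8).
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8, obtained through DataOutputStream.writeUTF.
    // writeUTF prepends a 2-byte length field, which is stripped here.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] withLength = buf.toByteArray();
        return Arrays.copyOfRange(withLength, 2, withLength.length);
    }

    public static void main(String[] args) {
        String nul = "\0";              // U+0000
        String emoji = "\uD83C\uDF7B";  // U+1F37B as a surrogate pair

        // U+0000: one zero byte in standard UTF-8, two non-zero bytes in modified UTF-8.
        System.out.println(Arrays.toString(standardUtf8(nul)));   // [0]
        System.out.println(Arrays.toString(modifiedUtf8(nul)));   // [-64, -128]

        // Supplementary character: 4 bytes in standard UTF-8,
        // 2 x 3 bytes (one group per surrogate) in modified UTF-8.
        System.out.println(Arrays.toString(standardUtf8(emoji))); // [-16, -97, -115, -69]
        System.out.println(Arrays.toString(modifiedUtf8(emoji))); // [-19, -96, -68, -19, -67, -69]
    }
}
```

Because U+0000 never appears as a zero byte in the modified form, native
code can treat the result as an ordinary NUL-terminated C string.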

The only bug report I can see on this (basically the same issue you are 
reporting) was back in 2004:

https://bugs.openjdk.java.net/browse/JDK-5030776

so it simply has not been an issue. As per the SO article that Claes 
referenced, anyone needing true UTF-8 has a couple of paths to achieve that.
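
One such path, and the one already reported as working earlier in this
thread, is to do the conversion on the Java side and hand the native
code a byte array. A minimal sketch follows; the class name, the helper
name, and the native method mentioned in the comment are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; not part of the JDK or of the original thread.
class NativePathBytes {

    // Encode the path as standard UTF-8 and append a NUL terminator so
    // the bytes can be used directly as a C string.
    static byte[] toCString(String path) {
        byte[] utf8 = path.getBytes(StandardCharsets.UTF_8);
        // copyOf pads the extra element with 0, which acts as the terminator.
        return Arrays.copyOf(utf8, utf8.length + 1);
    }

    public static void main(String[] args) {
        byte[] bytes = toCString("\uD83C\uDF7B.txt");
        System.out.println(Arrays.toString(bytes));
        // [-16, -97, -115, -69, 46, 116, 120, 116, 0]

        // The array would then be passed to a native method taking byte[]
        // (for example a hypothetical "openPath(byte[] path)"), where the
        // C side reads it with GetByteArrayElements or GetByteArrayRegion
        // instead of calling GetStringUTFChars on a jstring.
    }
}
```

Note that a Java string containing U+0000 would produce an embedded zero
byte with this approach, so the caller needs to reject or handle such
input separately.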

Cheers,
David
-----


>    Alan
> 
> 
> 
> 
>> On Jan 24, 2019, at 10:32 AM, Claes Redestad <claes.redestad at oracle.com> wrote:
>>
>> Hi Alan,
>>
>> GetStringUTFChars unfortunately doesn't give you true UTF-8, but a modified UTF-8 sequence
>> as used by the VM internally for historical reasons.
>>
>> See answers to this related question on SO (which contains links to official docs):
>> https://stackoverflow.com/questions/32205446/getting-true-utf-8-characters-in-java-jni
>>
>> HTH
>>
>> /Claes
>>
>> On 2019-01-24 19:23, Alan Snyder wrote:
>>> I am having a problem with file names that contain emojis when passed to a macOS system call.
>>>
>>> Things work when I convert the path to bytes in Java, but fail (file not found) when I convert the path to bytes in native code using GetStringUTFChars.
>>>
>>> For example, where String.getBytes() returns
>>>
>>> -16 -97 -115 -69
>>>
>>> GetStringUTFChars returns:
>>>
>>> -19 -96 -68 -19 -67 -69
>>>
>>> I’m not a UTF expert, so can someone say whether I should file a bug report?
>>>
>>> (Tested in JDK 9, 11, and a fairly recent 12)
>>>
>>
> 


More information about the core-libs-dev mailing list