RFR: 8195129: System.load() fails to load from unicode paths [v3]

Tue Jun 1 18:45:22 UTC 2021

On Thu, 27 May 2021 16:14:38 GMT, Maxim Kartashev <github.com+28651297+mkartashev at openjdk.org> wrote:

>> Character strings within JVM are produced and consumed in several formats. Strings come from/to Java in the UTF8 format and POSIX APIs (like fprintf() or dlopen()) consume strings also in UTF8. On Windows, however, the situation is far less simple: some new(er) APIs expect UTF16 (wide-character strings), some older APIs can only work with strings in a "platform" format, where not all UTF8 characters can be represented; which ones can depends on the current "code page".
>> 
>> This commit switches the Windows version of native library loading code to using the new UTF16 API `LoadLibraryW()` and attempts to streamline the use of various string formats in the surrounding code. 
>> 
>> Namely, exception messages are made to consume strings explicitly in the UTF8 format, while logging functions (that end up using legacy Windows API) are made to consume "platform" strings in most cases. One exception is `JVM_LoadLibrary()` logging where the UTF8 name of the library is logged, which can, of course, be fixed, but was considered not worth the additional code (NB: this isn't a new bug).
>> 
>> The test runs in a separate JVM in order to make NIO happy about non-ASCII characters in the file name; tests are executed with LC_ALL=C and that doesn't let NIO work with non-ASCII file names even on Linux or MacOS.
>> 
>> Tested by running `test/hotspot/jtreg:tier1` on Linux and `jtreg:test/hotspot/jtreg/runtime` on Windows 10. The new test (`   jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode`) was explicitly ran on those platforms as well.
>> 
>> Results from Linux:
>> 
>> Test summary
>> ==============================
>>    TEST                                              TOTAL  PASS  FAIL ERROR   
>>    jtreg:test/hotspot/jtreg:tier1                     1784  1784     0     0   
>> ==============================
>> TEST SUCCESS
>> 
>> 
>> Building target 'run-test-only' in configuration 'linux-x86_64-server-release'
>> Test selection 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
>> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
>> 
>> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
>> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
>> Test results: passed: 1
>> 
>> 
>> Results from Windows 10:
>> 
>> Test summary
>> ==============================
>>    TEST                                              TOTAL  PASS  FAIL ERROR
>>    jtreg:test/hotspot/jtreg/runtime                    746   746     0     0
>> ==============================
>> TEST SUCCESS
>> Finished building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug'
>> 
>> 
>> Building target 'run-test-only' in configuration 'windows-x86_64-server-fastdebug'
>> Test selection 'test/hotspot/jtreg/runtime/jni/loadLibraryUnicode', will run:
>> * jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode
>> 
>> Running test 'jtreg:test/hotspot/jtreg/runtime/jni/loadLibraryUnicode'
>> Passed: runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java
>> Test results: passed: 1
>
> Maxim Kartashev has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Coding style-related corrections.
>  - Corrected the test to use Platform.sharedLibraryExt()

src/hotspot/os/windows/os_windows.cpp line 1491:

> 1489: static errno_t convert_UTF8_to_UTF16(char const* utf8_str, LPWSTR* utf16_str) {
> 1490:   return convert_to_UTF16(utf8_str, CP_UTF8, utf16_str);
> 1491: }

IIUC, `utf8_str` is the "modified" UTF-8 string in JNI. Using it as the standard UTF-8 (I believe Windows' encoding `CP_UTF8` is the one) may end up in incorrect conversions in some corner cases, e.g., for supplementary characters.

test/hotspot/jtreg/runtime/jni/loadLibraryUnicode/LoadLibraryUnicodeTest.java line 42:

> 40:         String nativePathSetting = "-Dtest.nativepath=" + getSystemProperty("test.nativepath");
> 41:         ProcessBuilder pb = ProcessTools.createTestJvm(nativePathSetting, LoadLibraryUnicode.class.getName());
> 42:         pb.environment().put("LC_ALL", "en_US.UTF-8");

Some environments/user configs may not have `UTF-8` codeset on the platform. May need to gracefully exit in such a case.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4169