RFR: 8240197: Cannot start JVM when $JAVA_HOME includes CJK characters
Yasumasa Suenaga
suenaga at oss.nttdata.com
Thu Mar 5 00:18:21 UTC 2020
Hi Ioi,
How about this change?
http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.05/
I already tried to use ResouceMark, but it was crashed because wide_abs_unc_path() would be called from outside of java thread at boot time.
This webrev uses os::free() instead of memory cleaner class, but I think it is simple because I tweaked memory lifecycle.
Thanks,
Yasumasa
On 2020/03/05 6:56, Ioi Lam wrote:
> Hi Yasumasa,
>
> This version looks good!
>
>
> [0] One thing I forgot ... is it possible to use "ResourceMark rm" in this function? If so, you can allocate the temp arrays using NEW_RESOURCE_ARRAY and then don't need to worry about freeing them.
>
>
> [1] for this code
>
> 4202 // This call would be success because it is checked in above
> 4203 GetFullPathNameW(unicode_path, full_path_len, full_path, NULL);
>
> How about this instead:
>
> DWORD len = GetFullPathNameW(unicode_path, full_path_len, full_path, NULL);
> assert(len <= full_path_len, "length already checked above");
>
> (same for the MultiByteToWideChar call in convert_to_unicode)
>
>
> [2] This is no longer needed
>
> 4240 os::free(buf);
>
>
> [3] I think the following code is not needed ....
>
> 4269 } else {
> 4270 size_t converted_path_size = sizeof(WCHAR) * (wcslen(unicode_path) + 1);
> 4271 converted_path = reinterpret_cast<LPWSTR>(os::malloc(converted_path_size, mtInternal));
> 4272 if (converted_path == NULL) {
> 4273 vm_exit_out_of_memory(buf_size, OOM_MALLOC_ERROR, "wide_abs_unc_path");
> 4274 }
> 4275 memcpy(converted_path, unicode_path, converted_path_size);
>
>
> ... if you free the original buffer inside get_full_path (or use ResourceMark):
>
> static errno_t get_full_path(LPCWSTR& unicode_path) {
> LPCWSTR full_path = .....;
> ....
> free(unicode_path);
> unicode_path = full_path;
> return ERROR_SUCCESS;
> }
>
> [4] This looks good! Much more readable than the original code!
>
> 4285 _snwprintf(result, result_len, L"%s%s", prefix, &converted_path[prefix_off]);
>
>
> Thanks
> - Ioi
>
>
> On 3/4/20 12:22 AM, Yasumasa Suenaga wrote:
>> Hi Ioi,
>>
>> Thanks for your comment.
>> How about this change?
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.04/
>>
>> This webrev adds the function both converting to unicode and getting full path,
>> they would be called from wide_abs_unc_path().
>> Also MemoryCleaner class which is introduced in this webrev frees memory auomatically in d'tor.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/03/04 16:05, Ioi Lam wrote:
>>>
>>>
>>> On 3/3/20 10:22 PM, Yasumasa Suenaga wrote:
>>>> Hi Ioi,
>>>>
>>>>> [1] I think the memory freeing can be done using a destructor, so you can simply return in case of error, without having a lot of "if" checks:
>>>>
>>>> What do you think to use `goto` statement like Linux kernel?
>>>>
>>>>
>>> I don't have a strong opinion but I think many/most HotSpot developers are against it. Also with goto, you need to store the return value in a variable, but with destructors, you can just say "return value", so the code is cleaner.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/04 14:50, Ioi Lam wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I think this function is getting too long and it's hard to figure out all the error conditions.
>>>>>
>>>>> (I feel sorry for you because the code wasn't pretty messy to begin with, but I think it's at a point that we need to do the cleanup. It took me like 30 minutes to understand what is going on)
>>>>>
>>>>> [0] I noticed you changed the return type to LPWSTR, but all the callers are using wchar_t*. Are these two types the same?
>>>>>
>>>>> If you want to use LPWSTR within this function for consistency, maybe you can return the value as
>>>>>
>>>>> return (wchar_t*)result; // LPWSTR and wchat_t* are the same type on Windows.
>>>>>
>>>>> That way people unfamiliar with Windows (most of us) won't be confused.
>>>>>
>>>>> [1] I think the memory freeing can be done using a destructor, so you can simply return in case of error, without having a lot of "if" checks:
>>>>>
>>>>> class State {
>>>>> errno_t & err;
>>>>> char* buf;
>>>>> LPWSTR unicode_path;
>>>>> LPWSTR result;
>>>>>
>>>>> State(errno_t & e) : buf(NULL), unicode_path(NULL), result(NULL), err(e) {
>>>>> err = SUCCESS;
>>>>> }
>>>>> ~State() {
>>>>> os::free(buf);
>>>>> os::free(unicode_path);
>>>>> if (err != ERROR_SUCCESS) {
>>>>> os::free(result);
>>>>> }
>>>>> }
>>>>> LPWSTR as_error(errno_t e) {
>>>>> err = e;
>>>>> return NULL;
>>>>> }
>>>>> } st;
>>>>>
>>>>>
>>>>> if ((st.buf = static_cast<char*>(os::malloc(buf_size, mtInternal))) == NULL) {
>>>>> return st.as_error(ENOMEM);
>>>>> }
>>>>>
>>>>> ....
>>>>> return st.result;
>>>>>
>>>>> [2] Will it be possible to break the functon down into smaller pieces, so it's easier to understand what's going on?
>>>>>
>>>>> E.g., Maybe the code that converts the C string into a unicode string can be moved into a separate function?
>>>>>
>>>>> // Get required buffer size to convert to Unicode
>>>>> int unicode_path_len = MultiByteToWideChar(CP_THREAD_ACP,
>>>>> MB_ERR_INVALID_CHARS,
>>>>> buf, -1,
>>>>> NULL, 0);
>>>>> if (unicode_path_len == 0) {
>>>>> os::free(buf);
>>>>> err = EINVAL;
>>>>> return NULL;
>>>>> }
>>>>>
>>>>> ....
>>>>>
>>>>> LPWSTR unicode_path = static_cast<LPWSTR>(os::malloc(sizeof(WCHAR) * unicode_path_len, mtInternal));
>>>>> if (unicode_path == NULL) {
>>>>> err = ENOMEM;
>>>>> } else {
>>>>> // This call would be success because it is checked in above
>>>>> err = ERROR_SUCCESS;
>>>>> MultiByteToWideChar(CP_THREAD_ACP,
>>>>> MB_ERR_INVALID_CHARS,
>>>>> buf, -1,
>>>>> unicode_path, unicode_path_len);
>>>>>
>>>>>
>>>>>
>>>>> [3] size_t result_len = prefix_len - prefix_off + additional_space;
>>>>>
>>>>> The variable name result_len is confusing. It's not the length of the result.
>>>>>
>>>>> [4] The old copying code was already hard to understand, and the new version now has 2 versions of the copying code. I think the reason is you want to avoid an extra memcpy by calling GetFullPathNameW directly on the result.
>>>>>
>>>>> if (needs_fullpath) {
>>>>> // Get required buffer size to convert to full path
>>>>> DWORD full_path_len = GetFullPathNameW(unicode_path, 0, NULL, NULL);
>>>>> if (full_path_len == 0) {
>>>>> err = EINVAL;
>>>>> } else {
>>>>> size_t result_size = sizeof(WCHAR) * (result_len + full_path_len);
>>>>> result = static_cast<LPWSTR>(os::malloc(result_size, mtInternal));
>>>>> if (result == NULL) {
>>>>> err = ENOMEM;
>>>>> } else {
>>>>> // This call would be success because it is checked in above
>>>>> GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - prefix_off, NULL);
>>>>> // Copy prefix
>>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>> }
>>>>> }
>>>>> } else {
>>>>> size_t result_size = sizeof(WCHAR) * (result_len + unicode_path_len);
>>>>> result = static_cast<LPWSTR>(os::malloc(result_size, mtInternal));
>>>>> if (result == NULL) {
>>>>> err = ENOMEM;
>>>>> } else {
>>>>> // Copy unicode path
>>>>> memcpy(result + prefix_len - prefix_off, unicode_path, sizeof(WCHAR) * unicode_path_len);
>>>>> // Copy prefix
>>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>> }
>>>>> }
>>>>>
>>>>> I think the code can be simplified if you do the fullpath version in a separate function:
>>>>>
>>>>> if (needs_fullpath) {
>>>>> unicode_path = to_full_path(unicode_path, err, &unicode_path_len);
>>>>> }
>>>>>
>>>>> size_t result_size = ......
>>>>>
>>>>> This will do one more memcpy, but I think it's not performance critical
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>
>>>>> On 3/3/20 4:19 PM, Yasumasa Suenaga wrote:
>>>>>> Thanks Ralf!
>>>>>> I fixed them.
>>>>>>
>>>>>> I need a Reviewer to push, so I uploaded new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.03/
>>>>>>
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2020/03/03 23:25, Schmelter, Ralf wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> thanks for your work. The code should now be able to handle any path.
>>>>>>>
>>>>>>> Two small changes (no new webrev needed):
>>>>>>>
>>>>>>> size_t result_len = prefix_len + prefix_off + additional_space
>>>>>>> should be
>>>>>>> size_t result_len = prefix_len - prefix_off + additional_space
>>>>>>>
>>>>>>> And you should copy the prefix after you copied the filename, since sometimes the prefix is required to overwrite the start of the filename:
>>>>>>>
>>>>>>> // This call would be success because it is checked in above
>>>>>>> GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - prefix_off, NULL);
>>>>>>> // Copy prefix
>>>>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> // Copy unicode path
>>>>>>> memcpy(result + prefix_len - prefix_off, unicode_path, sizeof(WCHAR) * unicode_path_len);
>>>>>>> // Copy prefix
>>>>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>>>>
>>>>>>> With these changes the os_windows gtest and the appcds jtreg tests run without errors.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Ralf
>>>>>>>
>>>>>
>>>
>
More information about the hotspot-runtime-dev
mailing list