RFR: 8240197: Cannot start JVM when $JAVA_HOME includes CJK characters
Yasumasa Suenaga
suenaga at oss.nttdata.com
Wed Mar 4 08:22:07 UTC 2020
Hi Ioi,
Thanks for your comment.
How about this change?
http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.04/
This webrev adds the function both converting to unicode and getting full path,
they would be called from wide_abs_unc_path().
Also MemoryCleaner class which is introduced in this webrev frees memory auomatically in d'tor.
Thanks,
Yasumasa
On 2020/03/04 16:05, Ioi Lam wrote:
>
>
> On 3/3/20 10:22 PM, Yasumasa Suenaga wrote:
>> Hi Ioi,
>>
>>> [1] I think the memory freeing can be done using a destructor, so you can simply return in case of error, without having a lot of "if" checks:
>>
>> What do you think to use `goto` statement like Linux kernel?
>>
>>
> I don't have a strong opinion but I think many/most HotSpot developers are against it. Also with goto, you need to store the return value in a variable, but with destructors, you can just say "return value", so the code is cleaner.
>
> Thanks
> - Ioi
>
>> Yasumasa
>>
>>
>> On 2020/03/04 14:50, Ioi Lam wrote:
>>> Hi Yasumasa,
>>>
>>> I think this function is getting too long and it's hard to figure out all the error conditions.
>>>
>>> (I feel sorry for you because the code wasn't pretty messy to begin with, but I think it's at a point that we need to do the cleanup. It took me like 30 minutes to understand what is going on)
>>>
>>> [0] I noticed you changed the return type to LPWSTR, but all the callers are using wchar_t*. Are these two types the same?
>>>
>>> If you want to use LPWSTR within this function for consistency, maybe you can return the value as
>>>
>>> return (wchar_t*)result; // LPWSTR and wchat_t* are the same type on Windows.
>>>
>>> That way people unfamiliar with Windows (most of us) won't be confused.
>>>
>>> [1] I think the memory freeing can be done using a destructor, so you can simply return in case of error, without having a lot of "if" checks:
>>>
>>> class State {
>>> errno_t & err;
>>> char* buf;
>>> LPWSTR unicode_path;
>>> LPWSTR result;
>>>
>>> State(errno_t & e) : buf(NULL), unicode_path(NULL), result(NULL), err(e) {
>>> err = SUCCESS;
>>> }
>>> ~State() {
>>> os::free(buf);
>>> os::free(unicode_path);
>>> if (err != ERROR_SUCCESS) {
>>> os::free(result);
>>> }
>>> }
>>> LPWSTR as_error(errno_t e) {
>>> err = e;
>>> return NULL;
>>> }
>>> } st;
>>>
>>>
>>> if ((st.buf = static_cast<char*>(os::malloc(buf_size, mtInternal))) == NULL) {
>>> return st.as_error(ENOMEM);
>>> }
>>>
>>> ....
>>> return st.result;
>>>
>>> [2] Will it be possible to break the functon down into smaller pieces, so it's easier to understand what's going on?
>>>
>>> E.g., Maybe the code that converts the C string into a unicode string can be moved into a separate function?
>>>
>>> // Get required buffer size to convert to Unicode
>>> int unicode_path_len = MultiByteToWideChar(CP_THREAD_ACP,
>>> MB_ERR_INVALID_CHARS,
>>> buf, -1,
>>> NULL, 0);
>>> if (unicode_path_len == 0) {
>>> os::free(buf);
>>> err = EINVAL;
>>> return NULL;
>>> }
>>>
>>> ....
>>>
>>> LPWSTR unicode_path = static_cast<LPWSTR>(os::malloc(sizeof(WCHAR) * unicode_path_len, mtInternal));
>>> if (unicode_path == NULL) {
>>> err = ENOMEM;
>>> } else {
>>> // This call would be success because it is checked in above
>>> err = ERROR_SUCCESS;
>>> MultiByteToWideChar(CP_THREAD_ACP,
>>> MB_ERR_INVALID_CHARS,
>>> buf, -1,
>>> unicode_path, unicode_path_len);
>>>
>>>
>>>
>>> [3] size_t result_len = prefix_len - prefix_off + additional_space;
>>>
>>> The variable name result_len is confusing. It's not the length of the result.
>>>
>>> [4] The old copying code was already hard to understand, and the new version now has 2 versions of the copying code. I think the reason is you want to avoid an extra memcpy by calling GetFullPathNameW directly on the result.
>>>
>>> if (needs_fullpath) {
>>> // Get required buffer size to convert to full path
>>> DWORD full_path_len = GetFullPathNameW(unicode_path, 0, NULL, NULL);
>>> if (full_path_len == 0) {
>>> err = EINVAL;
>>> } else {
>>> size_t result_size = sizeof(WCHAR) * (result_len + full_path_len);
>>> result = static_cast<LPWSTR>(os::malloc(result_size, mtInternal));
>>> if (result == NULL) {
>>> err = ENOMEM;
>>> } else {
>>> // This call would be success because it is checked in above
>>> GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - prefix_off, NULL);
>>> // Copy prefix
>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>> }
>>> }
>>> } else {
>>> size_t result_size = sizeof(WCHAR) * (result_len + unicode_path_len);
>>> result = static_cast<LPWSTR>(os::malloc(result_size, mtInternal));
>>> if (result == NULL) {
>>> err = ENOMEM;
>>> } else {
>>> // Copy unicode path
>>> memcpy(result + prefix_len - prefix_off, unicode_path, sizeof(WCHAR) * unicode_path_len);
>>> // Copy prefix
>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>> }
>>> }
>>>
>>> I think the code can be simplified if you do the fullpath version in a separate function:
>>>
>>> if (needs_fullpath) {
>>> unicode_path = to_full_path(unicode_path, err, &unicode_path_len);
>>> }
>>>
>>> size_t result_size = ......
>>>
>>> This will do one more memcpy, but I think it's not performance critical
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> On 3/3/20 4:19 PM, Yasumasa Suenaga wrote:
>>>> Thanks Ralf!
>>>> I fixed them.
>>>>
>>>> I need a Reviewer to push, so I uploaded new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.03/
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2020/03/03 23:25, Schmelter, Ralf wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> thanks for your work. The code should now be able to handle any path.
>>>>>
>>>>> Two small changes (no new webrev needed):
>>>>>
>>>>> size_t result_len = prefix_len + prefix_off + additional_space
>>>>> should be
>>>>> size_t result_len = prefix_len - prefix_off + additional_space
>>>>>
>>>>> And you should copy the prefix after you copied the filename, since sometimes the prefix is required to overwrite the start of the filename:
>>>>>
>>>>> // This call would be success because it is checked in above
>>>>> GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - prefix_off, NULL);
>>>>> // Copy prefix
>>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>>
>>>>> and
>>>>>
>>>>> // Copy unicode path
>>>>> memcpy(result + prefix_len - prefix_off, unicode_path, sizeof(WCHAR) * unicode_path_len);
>>>>> // Copy prefix
>>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>>
>>>>> With these changes the os_windows gtest and the appcds jtreg tests run without errors.
>>>>>
>>>>> Best regards,
>>>>> Ralf
>>>>>
>>>
>
More information about the hotspot-runtime-dev
mailing list