RFR: 8240197: Cannot start JVM when $JAVA_HOME includes CJK characters

Wed Mar 4 07:05:48 UTC 2020


On 3/3/20 10:22 PM, Yasumasa Suenaga wrote:
> Hi Ioi,
>
>> [1] I think the memory freeing can be done using a destructor, so you 
>> can simply return in case of error, without having a lot of "if" checks:
>
> What do you think to use `goto` statement like Linux kernel?
>
>
I don't have a strong opinion but I think many/most HotSpot developers 
are against it. Also with goto, you need to store the return value in a 
variable, but with destructors, you can just say "return value", so the 
code is cleaner.

Thanks
- Ioi

> Yasumasa
>
>
> On 2020/03/04 14:50, Ioi Lam wrote:
>> Hi Yasumasa,
>>
>> I think this function is getting too long and it's hard to figure out 
>> all the error conditions.
>>
>> (I feel sorry for you because the code wasn't pretty messy to begin 
>> with, but I think it's at a point that we need to do the cleanup. It 
>> took me like 30 minutes to understand what is going on)
>>
>> [0] I noticed you changed the return type to LPWSTR, but all the 
>> callers are using wchar_t*. Are these two types the same?
>>
>>      If you want to use LPWSTR within this function for consistency, 
>> maybe you can return the value as
>>
>>      return (wchar_t*)result; // LPWSTR and wchat_t* are the same 
>> type on Windows.
>>
>>      That way people unfamiliar with Windows (most of us) won't be 
>> confused.
>>
>> [1] I think the memory freeing can be done using a destructor, so you 
>> can simply return in case of error, without having a lot of "if" checks:
>>
>>      class State {
>>         errno_t & err;
>>          char* buf;
>>          LPWSTR unicode_path;
>> LPWSTR result;
>>
>>          State(errno_t & e) : buf(NULL), unicode_path(NULL), 
>> result(NULL), err(e) {
>>              err = SUCCESS;
>>           }
>>          ~State() {
>>            os::free(buf);
>>            os::free(unicode_path);
>>            if (err != ERROR_SUCCESS) {
>>               os::free(result);
>>            }
>>          }
>>          LPWSTR as_error(errno_t e) {
>>              err = e;
>>              return NULL;
>>          }
>>      } st;
>>
>>
>>      if ((st.buf = static_cast<char*>(os::malloc(buf_size, 
>> mtInternal))) == NULL) {
>>          return st.as_error(ENOMEM);
>>      }
>>
>>      ....
>>      return st.result;
>>
>> [2] Will it be possible to break the functon down into smaller 
>> pieces, so it's easier to understand what's going on?
>>
>> E.g., Maybe the code that  converts the C string into a unicode 
>> string can be moved into a separate function?
>>
>>    // Get required buffer size to convert to Unicode
>>    int unicode_path_len = MultiByteToWideChar(CP_THREAD_ACP,
>> MB_ERR_INVALID_CHARS,
>>                                               buf, -1,
>>                                               NULL, 0);
>>    if (unicode_path_len == 0) {
>>      os::free(buf);
>>      err = EINVAL;
>>      return NULL;
>>    }
>>
>>    ....
>>
>>    LPWSTR unicode_path = static_cast<LPWSTR>(os::malloc(sizeof(WCHAR) 
>> * unicode_path_len, mtInternal));
>>    if (unicode_path == NULL) {
>>      err = ENOMEM;
>>    } else {
>>      // This call would be success because it is checked in above
>>      err = ERROR_SUCCESS;
>>      MultiByteToWideChar(CP_THREAD_ACP,
>>                          MB_ERR_INVALID_CHARS,
>>                          buf, -1,
>>                          unicode_path, unicode_path_len);
>>
>>
>>
>> [3] size_t result_len = prefix_len - prefix_off + additional_space;
>>
>> The variable name result_len is confusing. It's not the length of the 
>> result.
>>
>> [4] The old copying code was already hard to understand, and the new 
>> version now has 2 versions of the copying code. I think the reason is 
>> you want to avoid an extra memcpy by calling GetFullPathNameW 
>> directly on the result.
>>
>>      if (needs_fullpath) {
>>        // Get required buffer size to convert to full path
>>        DWORD full_path_len = GetFullPathNameW(unicode_path, 0, NULL, 
>> NULL);
>>        if (full_path_len == 0) {
>>          err = EINVAL;
>>        } else {
>>          size_t result_size = sizeof(WCHAR) * (result_len + 
>> full_path_len);
>>          result = static_cast<LPWSTR>(os::malloc(result_size, 
>> mtInternal));
>>          if (result == NULL) {
>>            err = ENOMEM;
>>          } else {
>>            // This call would be success because it is checked in above
>>            GetFullPathNameW(unicode_path, full_path_len, result + 
>> prefix_len - prefix_off, NULL);
>>            // Copy prefix
>>            memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>          }
>>        }
>>      } else {
>>        size_t result_size = sizeof(WCHAR) * (result_len + 
>> unicode_path_len);
>>        result = static_cast<LPWSTR>(os::malloc(result_size, 
>> mtInternal));
>>        if (result == NULL) {
>>          err = ENOMEM;
>>        } else {
>>          // Copy unicode path
>>          memcpy(result + prefix_len - prefix_off, unicode_path, 
>> sizeof(WCHAR) * unicode_path_len);
>>          // Copy prefix
>>          memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>        }
>>      }
>>
>> I think the code can be simplified if you do the fullpath version in 
>> a separate function:
>>
>>      if (needs_fullpath) {
>>         unicode_path = to_full_path(unicode_path, err, 
>> &unicode_path_len);
>>      }
>>
>>      size_t result_size = ......
>>
>> This will do one more memcpy, but I think it's not performance critical
>>
>> Thanks
>> - Ioi
>>
>>
>> On 3/3/20 4:19 PM, Yasumasa Suenaga wrote:
>>> Thanks Ralf!
>>> I fixed them.
>>>
>>> I need a Reviewer to push, so I uploaded new webrev:
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.03/
>>>
>>>
>>> Yasumasa
>>>
>>>
>>> On 2020/03/03 23:25, Schmelter, Ralf wrote:
>>>> Hi Yasumasa,
>>>>
>>>> thanks for your work. The code should now be able to handle any path.
>>>>
>>>> Two small changes (no new webrev needed):
>>>>
>>>> size_t result_len = prefix_len + prefix_off + additional_space
>>>> should be
>>>> size_t result_len = prefix_len - prefix_off + additional_space
>>>>
>>>> And you should copy the prefix after you copied the filename, since 
>>>> sometimes the prefix is required to overwrite the start of the 
>>>> filename:
>>>>
>>>> // This call would be success because it is checked in above
>>>> GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - 
>>>> prefix_off, NULL);
>>>> // Copy prefix
>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>
>>>> and
>>>>
>>>> // Copy unicode path
>>>> memcpy(result + prefix_len - prefix_off, unicode_path, 
>>>> sizeof(WCHAR) * unicode_path_len);
>>>> // Copy prefix
>>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>>
>>>> With these changes the os_windows gtest and the appcds jtreg tests 
>>>> run without errors.
>>>>
>>>> Best regards,
>>>> Ralf
>>>>
>>