RFR: 8240197: Cannot start JVM when $JAVA_HOME includes CJK characters

Wed Mar 4 06:22:10 UTC 2020

Hi Ioi,

> [1] I think the memory freeing can be done using a destructor, so you can simply return in case of error, without having a lot of "if" checks:

What do you think to use `goto` statement like Linux kernel?

Yasumasa

On 2020/03/04 14:50, Ioi Lam wrote:
> Hi Yasumasa,
> 
> I think this function is getting too long and it's hard to figure out all the error conditions.
> 
> (I feel sorry for you because the code wasn't pretty messy to begin with, but I think it's at a point that we need to do the cleanup. It took me like 30 minutes to understand what is going on)
> 
> [0] I noticed you changed the return type to LPWSTR, but all the callers are using wchar_t*. Are these two types the same?
> 
>      If you want to use LPWSTR within this function for consistency, maybe you can return the value as
> 
>      return (wchar_t*)result; // LPWSTR and wchat_t* are the same type on Windows.
> 
>      That way people unfamiliar with Windows (most of us) won't be confused.
> 
> [1] I think the memory freeing can be done using a destructor, so you can simply return in case of error, without having a lot of "if" checks:
> 
>      class State {
>         errno_t & err;
>          char* buf;
>          LPWSTR unicode_path;
> LPWSTR result;
> 
>          State(errno_t & e) : buf(NULL), unicode_path(NULL), result(NULL), err(e) {
>              err = SUCCESS;
>           }
>          ~State() {
>            os::free(buf);
>            os::free(unicode_path);
>            if (err != ERROR_SUCCESS) {
>               os::free(result);
>            }
>          }
>          LPWSTR as_error(errno_t e) {
>              err = e;
>              return NULL;
>          }
>      } st;
> 
> 
>      if ((st.buf = static_cast<char*>(os::malloc(buf_size, mtInternal))) == NULL) {
>          return st.as_error(ENOMEM);
>      }
> 
>      ....
>      return st.result;
> 
> [2] Will it be possible to break the functon down into smaller pieces, so it's easier to understand what's going on?
> 
> E.g., Maybe the code that  converts the C string into a unicode string can be moved into a separate function?
> 
>    // Get required buffer size to convert to Unicode
>    int unicode_path_len = MultiByteToWideChar(CP_THREAD_ACP,
>                                               MB_ERR_INVALID_CHARS,
>                                               buf, -1,
>                                               NULL, 0);
>    if (unicode_path_len == 0) {
>      os::free(buf);
>      err = EINVAL;
>      return NULL;
>    }
> 
>    ....
> 
>    LPWSTR unicode_path = static_cast<LPWSTR>(os::malloc(sizeof(WCHAR) * unicode_path_len, mtInternal));
>    if (unicode_path == NULL) {
>      err = ENOMEM;
>    } else {
>      // This call would be success because it is checked in above
>      err = ERROR_SUCCESS;
>      MultiByteToWideChar(CP_THREAD_ACP,
>                          MB_ERR_INVALID_CHARS,
>                          buf, -1,
>                          unicode_path, unicode_path_len);
> 
> 
> 
> [3] size_t result_len = prefix_len - prefix_off + additional_space;
> 
> The variable name result_len is confusing. It's not the length of the result.
> 
> [4] The old copying code was already hard to understand, and the new version now has 2 versions of the copying code. I think the reason is you want to avoid an extra memcpy by calling GetFullPathNameW directly on the result.
> 
>      if (needs_fullpath) {
>        // Get required buffer size to convert to full path
>        DWORD full_path_len = GetFullPathNameW(unicode_path, 0, NULL, NULL);
>        if (full_path_len == 0) {
>          err = EINVAL;
>        } else {
>          size_t result_size = sizeof(WCHAR) * (result_len + full_path_len);
>          result = static_cast<LPWSTR>(os::malloc(result_size, mtInternal));
>          if (result == NULL) {
>            err = ENOMEM;
>          } else {
>            // This call would be success because it is checked in above
>            GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - prefix_off, NULL);
>            // Copy prefix
>            memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>          }
>        }
>      } else {
>        size_t result_size = sizeof(WCHAR) * (result_len + unicode_path_len);
>        result = static_cast<LPWSTR>(os::malloc(result_size, mtInternal));
>        if (result == NULL) {
>          err = ENOMEM;
>        } else {
>          // Copy unicode path
>          memcpy(result + prefix_len - prefix_off, unicode_path, sizeof(WCHAR) * unicode_path_len);
>          // Copy prefix
>          memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>        }
>      }
> 
> I think the code can be simplified if you do the fullpath version in a separate function:
> 
>      if (needs_fullpath) {
>         unicode_path = to_full_path(unicode_path, err, &unicode_path_len);
>      }
> 
>      size_t result_size = ......
> 
> This will do one more memcpy, but I think it's not performance critical
> 
> Thanks
> - Ioi
> 
> 
> On 3/3/20 4:19 PM, Yasumasa Suenaga wrote:
>> Thanks Ralf!
>> I fixed them.
>>
>> I need a Reviewer to push, so I uploaded new webrev:
>>
>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8240197/webrev.03/
>>
>>
>> Yasumasa
>>
>>
>> On 2020/03/03 23:25, Schmelter, Ralf wrote:
>>> Hi Yasumasa,
>>>
>>> thanks for your work. The code should now be able to handle any path.
>>>
>>> Two small changes (no new webrev needed):
>>>
>>> size_t result_len = prefix_len + prefix_off + additional_space
>>> should be
>>> size_t result_len = prefix_len - prefix_off + additional_space
>>>
>>> And you should copy the prefix after you copied the filename, since sometimes the prefix is required to overwrite the start of the filename:
>>>
>>> // This call would be success because it is checked in above
>>> GetFullPathNameW(unicode_path, full_path_len, result + prefix_len - prefix_off, NULL);
>>> // Copy prefix
>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>
>>> and
>>>
>>> // Copy unicode path
>>> memcpy(result + prefix_len - prefix_off, unicode_path, sizeof(WCHAR) * unicode_path_len);
>>> // Copy prefix
>>> memcpy(result, prefix, sizeof(WCHAR) * prefix_len);
>>>
>>> With these changes the os_windows gtest and the appcds jtreg tests run without errors.
>>>
>>> Best regards,
>>> Ralf
>>>
>