RFR(S) 8190737: use unicode version of the canonicalize() function to handle long path on windows

Xueming Shen xueming.shen at oracle.com
Thu Sep 13 01:52:03 UTC 2018


Hi Calvin,

I believe I was thinking of keeping the "getPrefixed" as an 
implementation details and
instead exporting a "A" version of winFileHandleOpen() back then. But I 
don't have a
strong opinion on this one, so your approach looks fine.

No, I don't have a better idea for now to avoid those mb->ws->mb as long 
as the canonicalize()
and zip_open() are two separate invocations. But since they are only for 
 > max_path path. I'm
fine with it.

Yes, I think it's time to consider to migrate from interpreting the 
char* as "mb chars" to probably
"utf8 chars" (take a step back, I don't think it's easy to do to 
actually a wchar interface) for windows
platform, but that would be a separate and big rfes.

nit:

130             fname,              /* Ascii char path name */

the comment probably should be "path name in mb char", or ANSI charset.

Thanks,

-Sherman

On 9/12/18, 4:16 PM, Calvin Cheung wrote:
> Hi Sherman,
>
> Thanks for your review.
> Please refer to my reply below...
>
> On 9/10/18, 5:05 PM, Xueming Shen wrote:
>> On 9/10/18, 11:42 AM, Calvin Cheung wrote:
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8190737
>>>
>>> webrev: http://cr.openjdk.java.net/~ccheung/8190737/webrev.00/
>>>
>>> Please review this change for handling long path specified in the 
>>> -Xbootclasspath/a on windows.
>>>
>>> Highlight of changes:
>>> - canonicalize_md.c
>>>     it makes use of the unicode version of canonicalize 
>>> (wcanonicalize) if the original path is >= MAX_PATH.
>> So we are converting mb->wc->mb in wcanonicalize (when > max_path), 
>> and then feed the
>> resulting path into ZFILE_Open, in which we again do mb->wc-> + 
>> prefix -> CreateFileW(...)?
> That's the minimal change I could come up with. Let me know if you 
> have other suggestions.
>>
>> Maybe it's time to consider a "wchar" interface between vm and 
>> libraries.
> Good idea. I think it should be done in a separate RFE.
>> Maybe it's fine here given we
>> are only do this for > max_path case?
> This was done so that this change has no impact on the <= MAX_PATH case.
>>
>> Alan, now reading the win version canonicalize(...), remind me what's 
>> the real purpose of doing
>> FindFirstFile/FindClose() here? the API appears to suggest it is used 
>> to find the first match if there
>> is/are wildcards but we actually have checked/rejected the wildcards 
>> at the very beginning ? To
>> get the "real name" for case?
>>
>>>
>>> - zip_util.c
>>>     it uses the unicode version of CreateFile (CreateFileW) if the 
>>> original path is >= MAX_PATH.
>>>
>>> - io_util_md.c
>>>     Since zip_util.c (libzip) calls the getPrefixed() function in 
>>> canonicalize_md.c (libjava), the getPrefixed() function needs to be 
>>> exported.
>>
>> I kinda remembered that we once wanted to avoid export an variant of 
>> winFileHandleOpen() from
>> libjava to libzip ... in this case the alternative is to simply 
>> copy/paste the getPrefix into libzip ...
>> but I don't remember the exact concern back then.
> I also thought of copy/paste the getPrefix function into libzip.
> After looking at the lib/CoreLibraries.gmk, I think libzip already has 
> a dependency on libjava:
> $(eval $(call SetupJdkLibrary, BUILD_LIBZIP, \
>     NAME := zip, \
>     OPTIMIZATION := LOW, \
>     EXCLUDES := $(LIBZIP_EXCLUDES), \
>     CFLAGS := $(CFLAGS_JDKLIB) \
>         $(LIBZ_CFLAGS), \
>     CFLAGS_unix := $(BUILD_LIBZIP_MMAP) -UDEBUG, \
>     LDFLAGS := $(LDFLAGS_JDKLIB) \
>         $(call SET_SHARED_LIBRARY_ORIGIN), \
>     LIBS_unix := -ljvm -ljava $(LIBZ_LIBS), \
>     LIBS_windows := jvm.lib $(WIN_JAVA_LIB), \
> ))
>
> I've done tier1,2,3 testing and didn't see any new failures.
> I can do higher tier testing if needed.
>
> thanks,
> Calvin
>
> p.s. I didn't see this email until Ioi forwarded it to me since I 
> wasn't in the core-libs-dev alias. (I've subscribed to the alias this 
> morning.)
>
>>
>> -Sherman
>>
>>
>>
>>>
>>>
>>> - java_md.h
>>>     The JVM_MAXPATHLEN has been increased from _MAX_PATH to 1024. It 
>>> is because the the current usage of the canonicalize() function from 
>>> vm code is to preallocate the output buffer as follows:
>>>             char* canonical_path = 
>>> NEW_RESOURCE_ARRAY_IN_THREAD(thread, char, JVM_MAXPATHLEN);
>>>             if (!get_canonical_path(path, canonical_path, 
>>> JVM_MAXPATHLEN)) {
>>>     Also the unix version of canonicalize() function has the 
>>> following check:
>>>             int
>>>            canonicalize(char *original, char *resolved, int len)
>>>            {
>>>                if (len < PATH_MAX) {
>>>                errno = EINVAL;
>>>                return -1;
>>>            }
>>>    So dynamically allocating the output buffer requires more work 
>>> beyond the scope of this change.
>>>
>>> - LongBCP.java
>>>     added a scenario to the test - a long path to a jar file in 
>>> -Xbootclasspath/a.
>>>
>>> Testing: tier-[1,2,3].
>>>
>>> thanks,
>>> Calvin
>>



More information about the core-libs-dev mailing list