RFR(S) 8190737: use unicode version of the canonicalize() function to handle long path on windows
Xueming Shen
xueming.shen at oracle.com
Thu Sep 13 01:52:03 UTC 2018
Hi Calvin,
I believe I was thinking of keeping the "getPrefixed" as an
implementation details and
instead exporting a "A" version of winFileHandleOpen() back then. But I
don't have a
strong opinion on this one, so your approach looks fine.
No, I don't have a better idea for now to avoid those mb->ws->mb as long
as the canonicalize()
and zip_open() are two separate invocations. But since they are only for
> max_path path. I'm
fine with it.
Yes, I think it's time to consider to migrate from interpreting the
char* as "mb chars" to probably
"utf8 chars" (take a step back, I don't think it's easy to do to
actually a wchar interface) for windows
platform, but that would be a separate and big rfes.
nit:
130 fname, /* Ascii char path name */
the comment probably should be "path name in mb char", or ANSI charset.
Thanks,
-Sherman
On 9/12/18, 4:16 PM, Calvin Cheung wrote:
> Hi Sherman,
>
> Thanks for your review.
> Please refer to my reply below...
>
> On 9/10/18, 5:05 PM, Xueming Shen wrote:
>> On 9/10/18, 11:42 AM, Calvin Cheung wrote:
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8190737
>>>
>>> webrev: http://cr.openjdk.java.net/~ccheung/8190737/webrev.00/
>>>
>>> Please review this change for handling long path specified in the
>>> -Xbootclasspath/a on windows.
>>>
>>> Highlight of changes:
>>> - canonicalize_md.c
>>> it makes use of the unicode version of canonicalize
>>> (wcanonicalize) if the original path is >= MAX_PATH.
>> So we are converting mb->wc->mb in wcanonicalize (when > max_path),
>> and then feed the
>> resulting path into ZFILE_Open, in which we again do mb->wc-> +
>> prefix -> CreateFileW(...)?
> That's the minimal change I could come up with. Let me know if you
> have other suggestions.
>>
>> Maybe it's time to consider a "wchar" interface between vm and
>> libraries.
> Good idea. I think it should be done in a separate RFE.
>> Maybe it's fine here given we
>> are only do this for > max_path case?
> This was done so that this change has no impact on the <= MAX_PATH case.
>>
>> Alan, now reading the win version canonicalize(...), remind me what's
>> the real purpose of doing
>> FindFirstFile/FindClose() here? the API appears to suggest it is used
>> to find the first match if there
>> is/are wildcards but we actually have checked/rejected the wildcards
>> at the very beginning ? To
>> get the "real name" for case?
>>
>>>
>>> - zip_util.c
>>> it uses the unicode version of CreateFile (CreateFileW) if the
>>> original path is >= MAX_PATH.
>>>
>>> - io_util_md.c
>>> Since zip_util.c (libzip) calls the getPrefixed() function in
>>> canonicalize_md.c (libjava), the getPrefixed() function needs to be
>>> exported.
>>
>> I kinda remembered that we once wanted to avoid export an variant of
>> winFileHandleOpen() from
>> libjava to libzip ... in this case the alternative is to simply
>> copy/paste the getPrefix into libzip ...
>> but I don't remember the exact concern back then.
> I also thought of copy/paste the getPrefix function into libzip.
> After looking at the lib/CoreLibraries.gmk, I think libzip already has
> a dependency on libjava:
> $(eval $(call SetupJdkLibrary, BUILD_LIBZIP, \
> NAME := zip, \
> OPTIMIZATION := LOW, \
> EXCLUDES := $(LIBZIP_EXCLUDES), \
> CFLAGS := $(CFLAGS_JDKLIB) \
> $(LIBZ_CFLAGS), \
> CFLAGS_unix := $(BUILD_LIBZIP_MMAP) -UDEBUG, \
> LDFLAGS := $(LDFLAGS_JDKLIB) \
> $(call SET_SHARED_LIBRARY_ORIGIN), \
> LIBS_unix := -ljvm -ljava $(LIBZ_LIBS), \
> LIBS_windows := jvm.lib $(WIN_JAVA_LIB), \
> ))
>
> I've done tier1,2,3 testing and didn't see any new failures.
> I can do higher tier testing if needed.
>
> thanks,
> Calvin
>
> p.s. I didn't see this email until Ioi forwarded it to me since I
> wasn't in the core-libs-dev alias. (I've subscribed to the alias this
> morning.)
>
>>
>> -Sherman
>>
>>
>>
>>>
>>>
>>> - java_md.h
>>> The JVM_MAXPATHLEN has been increased from _MAX_PATH to 1024. It
>>> is because the the current usage of the canonicalize() function from
>>> vm code is to preallocate the output buffer as follows:
>>> char* canonical_path =
>>> NEW_RESOURCE_ARRAY_IN_THREAD(thread, char, JVM_MAXPATHLEN);
>>> if (!get_canonical_path(path, canonical_path,
>>> JVM_MAXPATHLEN)) {
>>> Also the unix version of canonicalize() function has the
>>> following check:
>>> int
>>> canonicalize(char *original, char *resolved, int len)
>>> {
>>> if (len < PATH_MAX) {
>>> errno = EINVAL;
>>> return -1;
>>> }
>>> So dynamically allocating the output buffer requires more work
>>> beyond the scope of this change.
>>>
>>> - LongBCP.java
>>> added a scenario to the test - a long path to a jar file in
>>> -Xbootclasspath/a.
>>>
>>> Testing: tier-[1,2,3].
>>>
>>> thanks,
>>> Calvin
>>
More information about the core-libs-dev
mailing list