RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures

Calvin Cheung calvin.cheung at oracle.com
Wed Nov 1 19:07:26 UTC 2017



On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
> Hi Calvin,
>
> On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung 
> <calvin.cheung at oracle.com <mailto:calvin.cheung at oracle.com>> wrote:
>
>     Hi Thomas,
>
>     On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>
>         Hi Calvin,
>
>         this is a worthwhile addition, thank you for your work!
>
>     Thanks for your review.
>
>
> Thanks for your work :)
>
> First off, there is another issue in file_attribute_data_to_stat(). We 
> actually had this issue before, but your work uncovered it:
>
> According to POSIX 
> (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html) 
> and every unix manpage I looked at (solaris, linux, aix..), st_ctime 
> is not the file creation time but the last time file status was 
> changed. Only Microsoft gets this wrong in their posix layer, its 
> stat() function returns 
>  (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) creation 
> time in st_ctime.
>
> But as os::stat() is a platform independent layer, I'd say we should 
> behave the same on all platforms, and that would be the Posix way.
>
> I did not find any code calling os::stat() and using st_ctime, so this 
> is still save to change for windows. (Note that I found that 
> perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as "creation 
> time" - I opened a bug for that 
> (https://bugs.openjdk.java.net/browse/JDK-8190260 - but that does not 
> affect us, as they do not call os::stat()).
>
> There is one more complication: in os::stat() we use plain ::stat() in 
> the single byte case.:
>
> *+ if (strlen(path) < MAX_PATH) {*
> *+     ret = ::stat(pathbuf, sbuf);*
> *+   } else {*
> *
> *
> But ::stat() on Windows is broken, as I wrote above. We should not use 
> it, especially not if we change the meaing of st_ctime in the double 
> byte path.
>
> My suggestion would be to just always call GetFileAttributes(), also 
> for the single byte path. A very simple solution would be to just 
> always go the double byte path with UNC translation, regardless of the 
> path length. Or, if you are worried about performance, for paths 
> shorter than MAX_PATH we use GetFileAttributesA(), for longer paths 
> GetFileAttributesW with UNC translation. In both cases you get 
> a WIN32_FILE_ATTRIBUTE_DATA which you can feed tp your  
> file_attribute_data_to_stat() routine and have the struct stat you 
> return be always consistent.
I ran into an issue with the dwFileAttributes value for a jar file. On 
Windows Server O/S, the value is set to 0x2020 which is 
(FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE) but on a 
desktop O/S like Windows 7, it is set to 0x0020 which is just 
FILE_ATTRIBUTE_ARCHIVE. I've fixed it in file_attribute_data_to_stat().

I've also used GetTickCounts() to measure the performance of ::stat() vs 
GetFileAttributesA() plus file_attribute_data_to_stat(). There's no 
difference at the ms resolution. Using the high-resolution timestamp 
(https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx), 
it doesn't show much difference.
>
>
> But onward:
>
>
>
>         =========================
>
>         src/hotspot/os/windows/os_windows.cpp
>
>         Could you please use wchar_t instead of WCHAR?
>
>     Fixed.
>
>
>         ---------------
>         in os::stat():
>
>         This cumbersome malloc/strcpy sequence:
>
>         !   size_t path_len = strlen(path) + 1;
>         +   char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
>         mtInternal);
>         +   if (pathbuf == NULL) {
>         +     errno = ENOMEM;
>               return -1;
>             }
>             os::native_path(strcpy(pathbuf, path));
>
>          can be reduced to a simple strdup:
>
>         char* pathbuf = os::strdup(path, mtInternal);
>         if (pathbuf == NULL) {
>           errno = ENOMEM;
>           return -1;
>         }
>         os::native_path(pathbuf);
>
>         This also would separate strcpy() from os::native_path() which
>         is a bit unreadable.
>
>     I've made the change.
>
>
>         --------------------
>
>         The single-byte path (strdup, ::stat()), together with its
>         free(), can be moved inside the (strlen(path) < MAX_PATH)
>         condition. os::native_path will not change the path length (it
>         better not, as it operates on the input buffer). That avoids 
>         having two allocations on the doublebyte path.
>
>
>     os::native_path() converts a path like C:\\\\somedir to C:\\somedir
>     So I'll need the converted path in the wchar_t case too. The
>     freeing of the pathbuf is being done at the end of the function or
>     in the middle of the wchar_t case if there's an error.
>
>
> Oh, you are right,of course. I missed that it this is needed for both 
> paths.
>
>
>
>         -----------------------
>
>         But seeing that translation to UNC paths is something we might
>         want more often, I would combine allocation and UNC prefix
>         adding to one function like this, to avoid the memove and
>         increase reusability:
>
>         // Creates an unc path from a single byte path. Return buffer
>         is allocated in C heap and needs to be freed by caller.
>         Returns NULL on error.
>         static whchar_t* create_unc_path(const char* s) {
>           - does s start with \\?\ ?
>         - yes:
>                     - os::malloc(strlen(s) + 1) and mbstowcs_s
>                 - no:
>                     - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to
>         fourth position in string, prefix with L"\\?\"
>         }
>
>
>     I also include the case for adding  L"\\\\?\\UNC\0" at the
>     beginning to be consistent with libjava/canonicalize_md.c. 
>
>
>         We also need error checking to mbstowcs_s.
>
>     I've added assert like the following after the call:
>
>     assert(converted_chars == path_len, "sanity");
>
>
>
> create_unc_path() :
>
> - could you convert the /**/ to // comments, please?
Fixed.
>
> - not sure about the assert after mbstowcs_s: if we happen to 
> encounter an invalid multibyte character, function will fail and 
> return an error:
>
> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
> "If mbstowcs_s encounters an invalid multibyte character, it puts 0 in 
> *``pReturnValue, sets the destination buffer to an empty string, sets 
> errno to EILSEQ, and returns EILSEQ."
I've changed create_unc_path() so that the caller will get the errno and 
removed the assert.

static wchar_t* create_unc_path(const char* path, errno_t &err)
>
>  As this is dependent on user data, we should not assert, but handle 
> the return code of mbstowcs_s gracefully. Same goes for 
> libjava/canonicalize_md.c.
>
> - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7, path, 
> path_len);
> third parameter is wrong, as we hand in an offset into the buffer, we 
> must decrement the buffer size by this offset, so correct would be 
> path_len +7 - 7 or just path_len.
>
> - Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4], 
> path_len + 4, path, path_len);
Fixed in both places.
>
>
>
>         Just for cleanliness, I would then wrap deallocation into an
>         own function.
>
>         static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>
>     I've added the destroy function.
>
>
>         These two functions could be candidates of putting into
>         os::win32 namespace, as this form of ANSI->UNC path
>         translation is quite common - whoever wants to use the xxxxW()
>         functions must do this.
>
>     I'm leaving them in os_windows.cpp since they're being used only
>     within that file.
>
>
> Fine by me.
>
>
>
>         -----------------------
>
>         FindFirstFileW:
>
>         I am pretty sure that you can achieve the same result with
>         GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
>
>     It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar
>     to WIN32_FIND_DATAW.
>
>
>         It is way more straightforward to use than FindFirstFileW, as
>         it does not require you to write a callback hook.
>
>     I've switched to using GetFileAttributesExW().
>
>
> Thank you, this is way more readable.
> Another issue: do we still need the fix for 6539723 if we switch from 
> stat() to GetFileAttributesExW()? This fix looks important, but the 
> comment
> indicates that it could break things if the original bug is not present.
>
> Btw, this is another strong argument for scrapping ::stat() altogether 
> on all code paths, not only for long input paths, because stat() and 
> GetFileAttributesExW() may behave
> differently. So it would be good to use the same API on all code 
> paths, in order to get the best test coverage.
For this round of change, I'm using GetFileAttributesEx[A|W]() for both 
code paths.

webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/

thanks,
Calvin
>
>
>
>         -------
>
>         eval_find_data(): This is more of a generic helper function,
>         could you rename this to something clearer, e.g.
>         make_double_word() ?
>
>     Ok. I've renamed it.
>
>
>         Also, setup_stat(), could this be renamed more clearly to
>         something like WIN32_FIND_DATA_to_stat? or lowercase if this
>         bothers you :)
>
>     I'm naming the function as file_attribute_data_to_stat() to match
>     with the data structure name.
>
>
> Thanks for taking my suggestions.
>
>
>
>         ==================================
>         src/hotspot/share/classfile/classLoader.cpp
>
>         In ClassPathDirEntry::open_stream(), I would feel better if we
>         were asserting _dir and name to be != NULL before feeding it
>         to strlen.
>
>     I've added an assert statement.
>
>
>         ===================
>
>     Here's an updated webrev:
>     http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
>     <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>
>
> Thanks!
>
> Thomas
>
>     thanks,
>     Calvin
>
>
>         Thanks!
>
>         Thomas
>
>
>
>
>
>
>         On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
>         <calvin.cheung at oracle.com <mailto:calvin.cheung at oracle.com>
>         <mailto:calvin.cheung at oracle.com
>         <mailto:calvin.cheung at oracle.com>>> wrote:
>
>             I've reworked this fix by using the Unicode version of open
>             (wopen) to handle path name longer than max path with the path
>             prefixed to indicate an extended length path as described
>         in [1].
>
>             The Unicode version of stat (wstat) doesn't work well with
>         long
>             path [2]. So FindFirstFileW will be used.The data in
>             WIN32_FIND_DATA returned from FindFirstFileW needs to be
>             transferred to the stat structure since the caller expects a
>             return stat structure and other platforms return a stat
>         structure.
>
>             In classLoader.cpp, calculate the size of buffer required
>         instead
>             of limiting it to JVM_MAXPATHLEN.
>             In os_windows.cpp, dynamically allocate buffers in
>         os::open and
>             os::stat.
>
>             updated webrev:
>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>
>             It passed hs-tier2 testing using mach5.
>
>             thanks,
>             Calvin
>
>             [1]
>         https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
>         <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
>         <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
>         <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>
>             [2]
>         https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>         <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
>         <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
>         <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>
>
>
>             On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>
>                 JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
>         <https://bugs.openjdk.java.net/browse/JDK-8188122>
>         <https://bugs.openjdk.java.net/browse/JDK-8188122
>         <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>
>                 Adding a warning message if the full path or the directory
>                 length exceeds MAX_PATH on windows.
>
>                 Example warning messages.
>
>                 1) The full path exceeds MAX_PATH:
>
>                 Java HotSpot(TM) 64-Bit Server VM warning: construct
>         full path
>                 name failed: total length 270 exceeds max length of 260
>                     dir
>                
>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>                 length 259
>                     name Hello.class length 11
>
>                 2) The directory path exceeds MAX_PATH:
>
>                 Java HotSpot(TM) 64-Bit Server VM warning: os::stat
>         failed:
>                 path length 265 exceeds max length 260
>                     path
>                
>         T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
>                 webrev:
>         http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
>         <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>
>                 Testing:
>                     JPRT (including the new test)
>
>                 thanks,
>                 Calvin
>
>
>


More information about the hotspot-runtime-dev mailing list