RFR(S): 8188122: Path length limits on Windows leads to obscure class loading failures
Calvin Cheung
calvin.cheung at oracle.com
Wed Nov 1 19:07:26 UTC 2017
On 10/27/17, 12:55 AM, Thomas Stüfe wrote:
> Hi Calvin,
>
> On Fri, Oct 27, 2017 at 2:03 AM, Calvin Cheung
> <calvin.cheung at oracle.com <mailto:calvin.cheung at oracle.com>> wrote:
>
> Hi Thomas,
>
> On 10/25/17, 11:54 PM, Thomas Stüfe wrote:
>
> Hi Calvin,
>
> this is a worthwhile addition, thank you for your work!
>
> Thanks for your review.
>
>
> Thanks for your work :)
>
> First off, there is another issue in file_attribute_data_to_stat(). We
> actually had this issue before, but your work uncovered it:
>
> According to POSIX
> (http://pubs.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)
> and every unix manpage I looked at (solaris, linux, aix..), st_ctime
> is not the file creation time but the last time file status was
> changed. Only Microsoft gets this wrong in their posix layer, its
> stat() function returns
> (https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx) creation
> time in st_ctime.
>
> But as os::stat() is a platform independent layer, I'd say we should
> behave the same on all platforms, and that would be the Posix way.
>
> I did not find any code calling os::stat() and using st_ctime, so this
> is still save to change for windows. (Note that I found that
> perfMemory_xxx.cpp uses plain OS ::stat and misuses ctime as "creation
> time" - I opened a bug for that
> (https://bugs.openjdk.java.net/browse/JDK-8190260 - but that does not
> affect us, as they do not call os::stat()).
>
> There is one more complication: in os::stat() we use plain ::stat() in
> the single byte case.:
>
> *+ if (strlen(path) < MAX_PATH) {*
> *+ ret = ::stat(pathbuf, sbuf);*
> *+ } else {*
> *
> *
> But ::stat() on Windows is broken, as I wrote above. We should not use
> it, especially not if we change the meaing of st_ctime in the double
> byte path.
>
> My suggestion would be to just always call GetFileAttributes(), also
> for the single byte path. A very simple solution would be to just
> always go the double byte path with UNC translation, regardless of the
> path length. Or, if you are worried about performance, for paths
> shorter than MAX_PATH we use GetFileAttributesA(), for longer paths
> GetFileAttributesW with UNC translation. In both cases you get
> a WIN32_FILE_ATTRIBUTE_DATA which you can feed tp your
> file_attribute_data_to_stat() routine and have the struct stat you
> return be always consistent.
I ran into an issue with the dwFileAttributes value for a jar file. On
Windows Server O/S, the value is set to 0x2020 which is
(FILE_ATTRIBUTE_NOT_CONTENT_INDEXED | FILE_ATTRIBUTE_ARCHIVE) but on a
desktop O/S like Windows 7, it is set to 0x0020 which is just
FILE_ATTRIBUTE_ARCHIVE. I've fixed it in file_attribute_data_to_stat().
I've also used GetTickCounts() to measure the performance of ::stat() vs
GetFileAttributesA() plus file_attribute_data_to_stat(). There's no
difference at the ms resolution. Using the high-resolution timestamp
(https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx),
it doesn't show much difference.
>
>
> But onward:
>
>
>
> =========================
>
> src/hotspot/os/windows/os_windows.cpp
>
> Could you please use wchar_t instead of WCHAR?
>
> Fixed.
>
>
> ---------------
> in os::stat():
>
> This cumbersome malloc/strcpy sequence:
>
> ! size_t path_len = strlen(path) + 1;
> + char* pathbuf = (char*)os::malloc(path_len * sizeof(char),
> mtInternal);
> + if (pathbuf == NULL) {
> + errno = ENOMEM;
> return -1;
> }
> os::native_path(strcpy(pathbuf, path));
>
> can be reduced to a simple strdup:
>
> char* pathbuf = os::strdup(path, mtInternal);
> if (pathbuf == NULL) {
> errno = ENOMEM;
> return -1;
> }
> os::native_path(pathbuf);
>
> This also would separate strcpy() from os::native_path() which
> is a bit unreadable.
>
> I've made the change.
>
>
> --------------------
>
> The single-byte path (strdup, ::stat()), together with its
> free(), can be moved inside the (strlen(path) < MAX_PATH)
> condition. os::native_path will not change the path length (it
> better not, as it operates on the input buffer). That avoids
> having two allocations on the doublebyte path.
>
>
> os::native_path() converts a path like C:\\\\somedir to C:\\somedir
> So I'll need the converted path in the wchar_t case too. The
> freeing of the pathbuf is being done at the end of the function or
> in the middle of the wchar_t case if there's an error.
>
>
> Oh, you are right,of course. I missed that it this is needed for both
> paths.
>
>
>
> -----------------------
>
> But seeing that translation to UNC paths is something we might
> want more often, I would combine allocation and UNC prefix
> adding to one function like this, to avoid the memove and
> increase reusability:
>
> // Creates an unc path from a single byte path. Return buffer
> is allocated in C heap and needs to be freed by caller.
> Returns NULL on error.
> static whchar_t* create_unc_path(const char* s) {
> - does s start with \\?\ ?
> - yes:
> - os::malloc(strlen(s) + 1) and mbstowcs_s
> - no:
> - os::malloc(strlen(s) + 1 + 4), mbstowcs_s to
> fourth position in string, prefix with L"\\?\"
> }
>
>
> I also include the case for adding L"\\\\?\\UNC\0" at the
> beginning to be consistent with libjava/canonicalize_md.c.
>
>
> We also need error checking to mbstowcs_s.
>
> I've added assert like the following after the call:
>
> assert(converted_chars == path_len, "sanity");
>
>
>
> create_unc_path() :
>
> - could you convert the /**/ to // comments, please?
Fixed.
>
> - not sure about the assert after mbstowcs_s: if we happen to
> encounter an invalid multibyte character, function will fail and
> return an error:
>
> https://msdn.microsoft.com/en-us/library/eyktyxsx.aspx
> "If mbstowcs_s encounters an invalid multibyte character, it puts 0 in
> *``pReturnValue, sets the destination buffer to an empty string, sets
> errno to EILSEQ, and returns EILSEQ."
I've changed create_unc_path() so that the caller will get the errno and
removed the assert.
static wchar_t* create_unc_path(const char* path, errno_t &err)
>
> As this is dependent on user data, we should not assert, but handle
> the return code of mbstowcs_s gracefully. Same goes for
> libjava/canonicalize_md.c.
>
> - Here: ::mbstowcs_s(&converted_chars, &wpath[7], path_len + 7, path,
> path_len);
> third parameter is wrong, as we hand in an offset into the buffer, we
> must decrement the buffer size by this offset, so correct would be
> path_len +7 - 7 or just path_len.
>
> - Same error below: + ::mbstowcs_s(&converted_chars, &wpath[4],
> path_len + 4, path, path_len);
Fixed in both places.
>
>
>
> Just for cleanliness, I would then wrap deallocation into an
> own function.
>
> static viud destroy_unc_path(whchar_t* s) { os::free(s); }
>
> I've added the destroy function.
>
>
> These two functions could be candidates of putting into
> os::win32 namespace, as this form of ANSI->UNC path
> translation is quite common - whoever wants to use the xxxxW()
> functions must do this.
>
> I'm leaving them in os_windows.cpp since they're being used only
> within that file.
>
>
> Fine by me.
>
>
>
> -----------------------
>
> FindFirstFileW:
>
> I am pretty sure that you can achieve the same result with
> GetFileAttributesExW(). It also returns WIN32_FIND_DATAW.
>
> It actually returns WIN32_FILE_ATTRIBUTE_DATA and is very similar
> to WIN32_FIND_DATAW.
>
>
> It is way more straightforward to use than FindFirstFileW, as
> it does not require you to write a callback hook.
>
> I've switched to using GetFileAttributesExW().
>
>
> Thank you, this is way more readable.
> Another issue: do we still need the fix for 6539723 if we switch from
> stat() to GetFileAttributesExW()? This fix looks important, but the
> comment
> indicates that it could break things if the original bug is not present.
>
> Btw, this is another strong argument for scrapping ::stat() altogether
> on all code paths, not only for long input paths, because stat() and
> GetFileAttributesExW() may behave
> differently. So it would be good to use the same API on all code
> paths, in order to get the best test coverage.
For this round of change, I'm using GetFileAttributesEx[A|W]() for both
code paths.
webrev: http://cr.openjdk.java.net/~ccheung/8188122/webrev.03/
thanks,
Calvin
>
>
>
> -------
>
> eval_find_data(): This is more of a generic helper function,
> could you rename this to something clearer, e.g.
> make_double_word() ?
>
> Ok. I've renamed it.
>
>
> Also, setup_stat(), could this be renamed more clearly to
> something like WIN32_FIND_DATA_to_stat? or lowercase if this
> bothers you :)
>
> I'm naming the function as file_attribute_data_to_stat() to match
> with the data structure name.
>
>
> Thanks for taking my suggestions.
>
>
>
> ==================================
> src/hotspot/share/classfile/classLoader.cpp
>
> In ClassPathDirEntry::open_stream(), I would feel better if we
> were asserting _dir and name to be != NULL before feeding it
> to strlen.
>
> I've added an assert statement.
>
>
> ===================
>
> Here's an updated webrev:
> http://cr.openjdk.java.net/~ccheung/8188122/webrev.02/
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.02/>
>
>
> Thanks!
>
> Thomas
>
> thanks,
> Calvin
>
>
> Thanks!
>
> Thomas
>
>
>
>
>
>
> On Wed, Oct 25, 2017 at 9:48 PM, Calvin Cheung
> <calvin.cheung at oracle.com <mailto:calvin.cheung at oracle.com>
> <mailto:calvin.cheung at oracle.com
> <mailto:calvin.cheung at oracle.com>>> wrote:
>
> I've reworked this fix by using the Unicode version of open
> (wopen) to handle path name longer than max path with the path
> prefixed to indicate an extended length path as described
> in [1].
>
> The Unicode version of stat (wstat) doesn't work well with
> long
> path [2]. So FindFirstFileW will be used.The data in
> WIN32_FIND_DATA returned from FindFirstFileW needs to be
> transferred to the stat structure since the caller expects a
> return stat structure and other platforms return a stat
> structure.
>
> In classLoader.cpp, calculate the size of buffer required
> instead
> of limiting it to JVM_MAXPATHLEN.
> In os_windows.cpp, dynamically allocate buffers in
> os::open and
> os::stat.
>
> updated webrev:
> http://cr.openjdk.java.net/~ccheung/8188122/webrev.01/
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.01/>>
>
> It passed hs-tier2 testing using mach5.
>
> thanks,
> Calvin
>
> [1]
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#MAX_PATH
> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>
> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH
> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx#MAX_PATH>>
>
> [2]
> https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>
> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral
> <https://social.msdn.microsoft.com/Forums/vstudio/en-US/3c093ea9-f0aa-446d-b648-2dabe8480430/stat-and-long-names?forum=vcgeneral>>
>
>
>
> On 10/16/17, 3:15 PM, Calvin Cheung wrote:
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8188122
> <https://bugs.openjdk.java.net/browse/JDK-8188122>
> <https://bugs.openjdk.java.net/browse/JDK-8188122
> <https://bugs.openjdk.java.net/browse/JDK-8188122>>
>
> Adding a warning message if the full path or the directory
> length exceeds MAX_PATH on windows.
>
> Example warning messages.
>
> 1) The full path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: construct
> full path
> name failed: total length 270 exceeds max length of 260
> dir
>
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> length 259
> name Hello.class length 11
>
> 2) The directory path exceeds MAX_PATH:
>
> Java HotSpot(TM) 64-Bit Server VM warning: os::stat
> failed:
> path length 265 exceeds max length 260
> path
>
> T:\\testoutput\\jtreg\\JTwork\\classes\\2\\runtime\\LoadClass\\LongPath.d\\xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\xxxxx
>
> webrev:
> http://cr.openjdk.java.net/~ccheung/8188122/webrev.00/
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/
> <http://cr.openjdk.java.net/%7Eccheung/8188122/webrev.00/>>
>
> Testing:
> JPRT (including the new test)
>
> thanks,
> Calvin
>
>
>
More information about the hotspot-runtime-dev
mailing list