[EXTERNAL] Re: Bug in Files.isSameFile(Path,Path)

Nhat Nguyen honguye at microsoft.com
Mon Nov 2 20:26:50 UTC 2020


Hi Alan,

We got the answers from our internal team.

It turns out that volSerialNumber, as per the Windows Protocol Specs, isn't required to be non-zero [1].
Fortunately, we can use the file index value to determine whether the file system supports a 64-bit file index [2].

A file index of 0 means the file system doesn't support a 64-bit file index. In the case of a webdav drive, the
file index is indeed 0 from my testing. We were also cautioned that file systems that use 128-bit file index
such as ReFS may return -1 in case the index doesn't fit into a 64-bit value. In such cases, they recommend
that we use GetFileInformationByHandleEx [3] to get FILE_ID_INFO [4] which returns the 128-bit file id.

The caveat is that FILE_ID_INFO is only supported from Windows 8.0/2012, and NTFS might have only
implemented it as of 8.1/2012 R2. Furthermore, file systems are not required to implement it.
From my testing, webdav drives don't support getting FILE_ID_INFO using GetFileInformationByHandleEx.

Given that FILE_ID_INFO is not fully supported by all versions of Windows that LTS JDK releases are required
to run on, and that file systems are not required to implement it, we think the simplest best-effort approach
is to guard against 0 and -1 for the file index, and fall back to comparing paths using GetFinalPathNameByHandle.
What is your opinion on this?
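
To make the proposal concrete, here is a minimal Java sketch of the guard, not the actual JDK code. The class
SameFileSketch, the FileId holder, and the finalPath field are hypothetical names used only for illustration;
finalPath stands in for whatever GetFinalPathNameByHandle would return for the open handle.

```java
/** Hypothetical sketch of the proposed guard; not the actual JDK implementation. */
final class SameFileSketch {

    /** Illustrative stand-in for the values read from an open file handle. */
    static final class FileId {
        final int volSerialNumber;
        final int fileIndexHigh;
        final int fileIndexLow;
        final String finalPath; // assumed: the path resolved from the handle

        FileId(int volSerialNumber, int fileIndexHigh, int fileIndexLow, String finalPath) {
            this.volSerialNumber = volSerialNumber;
            this.fileIndexHigh = fileIndexHigh;
            this.fileIndexLow = fileIndexLow;
            this.finalPath = finalPath;
        }

        /** Combines the two 32-bit halves into one 64-bit file index. */
        long fileIndex() {
            return ((long) fileIndexHigh << 32) | (fileIndexLow & 0xFFFFFFFFL);
        }
    }

    /** 0: no 64-bit file index support; -1: the real ID does not fit into 64 bits. */
    static boolean isUsableIndex(long fileIndex) {
        return fileIndex != 0L && fileIndex != -1L;
    }

    static boolean isSameFile(FileId a, FileId b) {
        if (isUsableIndex(a.fileIndex()) && isUsableIndex(b.fileIndex())) {
            // Both indexes are meaningful, so compare volume and index as before.
            return a.volSerialNumber == b.volSerialNumber
                    && a.fileIndex() == b.fileIndex();
        }
        // Fallback: compare the resolved paths. Plain equals() is a simplification;
        // a real implementation would likely need a Windows-aware comparison.
        return a.finalPath.equals(b.finalPath);
    }

    public static void main(String[] args) {
        // A webdav-like case: all identifiers are zero, so only the paths can decide.
        FileId x = new FileId(0, 0, 0, "\\\\server\\share\\a.txt");
        FileId y = new FileId(0, 0, 0, "\\\\server\\share\\b.txt");
        System.out.println(isSameFile(x, x));
        System.out.println(isSameFile(x, y));
    }
}
```

Note that the sketch treats the volume serial number as an ordinary value and only guards on the file index,
since the serial number is not required to be non-zero [1].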

Also, we would very much appreciate it if you could help us create a JBS bug for this issue so we can open
a PR against it.

Thanks,
Nhat

[1]: "VolumeSerialNumber: No specific format or content of this field is required for protocol interoperation. This value is not required to be unique." (https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/bf691378-c34e-4a13-976e-404ea1a87738)
[2]: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/2d3333fe-fc98-4a6f-98a2-4bb805aff407
[3]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getfileinformationbyhandleex
[4]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_id_info


-----Original Message-----
From: Nikola Grcevski <Nikola.Grcevski at microsoft.com> 
Sent: Thursday, October 29, 2020 8:24 AM
To: Alan Bateman <Alan.Bateman at oracle.com>; Nhat Nguyen <honguye at microsoft.com>; WarnerJan Veldhuis <veldhuis at freedom.nl>; nio-dev at openjdk.java.net
Subject: RE: [EXTERNAL] Re: Bug in Files.isSameFile(Path,Path)

Hi Alan,

We are still trying to get to the bottom of this. It seems Microsoft documented the "non-zero is safe to use" behaviour of the API for Python (https://github.com/python/cpython/pull/5764/files), but it's not there in MSDN. We are waiting to hear back from the API owners on our internal developer support channel.

Cheers,
Nikola

-----Original Message-----
From: Alan Bateman <Alan.Bateman at oracle.com>
Sent: October 29, 2020 10:38 AM
To: Nhat Nguyen <honguye at microsoft.com>; Nikola Grcevski <Nikola.Grcevski at microsoft.com>; WarnerJan Veldhuis <veldhuis at freedom.nl>; nio-dev at openjdk.java.net
Subject: Re: [EXTERNAL] Re: Bug in Files.isSameFile(Path,Path)

On 29/10/2020 00:13, Nhat Nguyen wrote:
> Hi everyone,
>
> We looked into the issue and found that the values of volSerialNumber, 
> fileIndexHigh, and fileIndexLow in the file attributes, which are used 
> to determine if the two files are the same, are all zeroes when used 
> with webdav drives. However, this looks like a known issue as cpython seems to suffer from the same behaviour as well [1].
>
> We have a suggested fix [2] that falls back to using 
> GetFinalPathNameByHandle and comparing the two paths when the 
> volSerialNumber is zero. We have also reached out to the Windows team to confirm if this is the preferred way to detect such cases; we will get back to you once we hear back from them.
>
> [1]: https://github.com/python/cpython/pull/5764#discussion_r169221544
> [2]: https://github.com/nhat-nguyen/jdk/commit/bd8fdc2809ed08e5564fe1eab4a7c1c2f32df84b
>
Thanks for looking into it. Do you know if the serial number = 0 is documented anywhere?

I have a number of comments on how this workaround should fit in, I'll reply soon on this.

-Alan.

