RFR: 8245194: Unix domain socket channel implementation

Michael McMahon michaelm at openjdk.java.net
Fri Sep 11 14:44:10 UTC 2020


On Mon, 7 Sep 2020 12:05:07 GMT, Michael McMahon <michaelm at openjdk.org> wrote:

> Continuing this review as a PR on github with the comments below incorporated. I expect there will be a few more
> iterations before integrating.
> On 06/09/2020 19:47, Alan Bateman wrote:
>> On 26/08/2020 15:24, Michael McMahon wrote:
>>>
>>> As I mentioned the other day, I wasn't able to use the suggested method on Windows which returns an absolute path. So,
>>> I added a method to WindowsPath which returns the path in the expected UTF-8 encoding as a byte[]. Let me know what you
>>> think of that.
>>>
>>> There is a fair bit of other refactoring and simplification done also. Next version is at:
>>>
>>> http://cr.openjdk.java.net/~michaelm/8245194/impl.webrev/webrev.9/
>>>
>> Adding a method to WindowsPath to encode the path using UTF-8 is okay but I don't think we should be caching it as the
>> encoding for sun_path is an outlier on Windows. I'm also a bit dubious about encoding a relative path when the resolved
>> path (before encoding) is > 247 chars. The documentation on the MS site isn't very completely and I think there are a
>> number points that require clarification to fully understand how this will work with relative, directly relative and
>> drive relative paths.
>>
> 
> Maybe it would be better to just use the path returned from toString() and do the conversion to UTF-8 externally. That
> would leave WindowsPath unchanged.
>> In the same area, the new PathUtil is a bit inconsistent with the existing provider code. One suggestion is to add a
>> method to AbstractFileSystemProvider instead. That is the base class the platform providers and would be a better place
>> to get the file path in bytes.
>>
> 
> Okay, I gave the method a name that is specific to Unix domain sockets because this UTF-8 strangeness is not likely to
> be useful by other components.
>> One other comment on the changes to the file system provider it should be okay to change UnixUserPrinipals ad its
>> fromUid and fromGid method to be public. This would mean that the patch shouldn't need to add UnixUserGroup (the main
>> issue is that class is that it means we end up with two classes with static factory methods doing the same thing).
> 
> Okay, that does simplify it a bit.
> 
> Thanks,
> Michael.
> 
>> -Alan.
>>
>>
>>
>>
>>
>>

> _Mailing list message from [Alan Bateman](mailto:Alan.Bateman at oracle.com) on
> [nio-dev](mailto:nio-dev at openjdk.java.net):_
> On 07/09/2020 13:14, Michael McMahon wrote:
> 
> > > In the same area, the new PathUtil is a bit inconsistent with the existing provider code. One suggestion is to add a
> > > method to AbstractFileSystemProvider instead. That is the base class the platform providers and would be a better place
> > > to get the file path in bytes.
> > 
> > 
> > Okay, I gave the method a name that is specific to Unix domain sockets because this UTF-8 strangeness is not likely to
> > be useful by other components.
> 
> The file system provider shouldn't know anything about Unix domain
> sockets so I prefer not to have "UnixDomain" in the name of this method.
> I also would prefer if null is an exception so that it is consistent
> with the other methods.
> 

I'm not sure what to call the method then. It returns a UTF-8 string
converted to bytes on Windows and the output of getByteArrayForSysCalls on Unix.
It could be called getByteArrayUTF8 on Windows, but that would not
be right on Unix.

Since sockets are file system entities, why not have socket in the method name?

What about getByteArrayForSocket() ?

> I think it would be useful to also summarize how the bind/connect works
> on Windows. For example, suppose the working directory is 240 characters
> and the UnixDomainSocketAddress is to a simple relative path of 20
> characters. If I understand correctly, the proposal will encode the
> simple relative path (not the resolved absolute path) to bytes using
> UTF-8 so it will probably be 20+ bytes in this case.

Right

> This hould be okay
> but inconsistent with the rest of? file system implementation which will
> use the long form. Also if the name is long then it won't use the long
> path prefix (\?) but instead rely on failure because the resulting
> sequence of bytes when encoded is > 100, is that correct?
> 

Yes, that is how it is working currently. We check the length in native code
just before calling bind(). Whether the long path prefix is present or not
won't matter because the path will not be used once it is longer than ~100 bytes.

Michael.
> -Alan.

-------------

PR: https://git.openjdk.java.net/jdk/pull/52


More information about the nio-dev mailing list