RFR 8245194: Unix domain socket channel implementation

Michael McMahon michael.x.mcmahon at oracle.com
Fri Jul 3 11:38:43 UTC 2020


Hi Mark,


Thanks for reviewing the spec.


On 02/07/2020 01:13, mark sheppard wrote:
> Hi Michael,
>
>
> A couple of comments and observations from the API perspective
>
>
> Trojan work done here.
>
> Looking at the API - overall it looks reasonably good, but there are 
> possibly a few inconsistencies,
>
> at least in my interpretation.
>
>
> I have some reservations about the representation of unix domain 
> sockets as an integral part of
>
> SocketChannel and ServerSocketChannel, which will be addressed further 
> below.
>
>
> A couple of small points;
>
>
> — Open, bind, connect methods ??
>
>
> * Does open(SocketAddress remoteAddress) result in an implicit bind 
> for the returned SocketChannel?
>
> The implementation note implies that this is an automatic assignment in 
> the case of a connect on an unbound Unix domain SocketChannel.
>
> The implication is that this open includes an implicit connect, which 
> in turn would include an implicit bind?
>
>
open(SocketAddress) does not itself invoke bind(). Therefore, the API 
note added to bind()
does not apply in this case.


However, there is a distinction between implicit bind (when bind is not 
called) and "automatic assignment" (where you call bind(null)), which 
applies with Unix domain, but not with TCP/IP sockets. In the case of 
TCP/IP sockets the result is the same: you get a randomly chosen port 
number. With Unix domain sockets, implicit bind results in an "unnamed" 
local address, and therefore no socket file in the file-system. This is 
the most likely use-case for client sockets with Unix domain.


An "automatically assigned" address, by contrast, gives you an explicit 
(randomly chosen) address (a path in the ${java.io.tmpdir} directory). 
This is equivalent to the randomly chosen port number with TCP.


I think it's important to support both named and unnamed client sockets 
for Unix domain. So, that is how I ended up with that distinction.


Basically, TCP has no concept of an unnamed local port. So, that 
difference has to
be reflected somewhere in the API.
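
The implicit-bind case can be sketched as follows. This is a minimal sketch, 
assuming the API as it eventually shipped in JDK 16 
(StandardProtocolFamily.UNIX, UnixDomainSocketAddress) and an OS with Unix 
domain socket support. The class name, the temporary directory and the file 
name "server.sock" are made up for illustration. It deliberately does not 
call bind(null) on the client, since that behaviour is the point under 
discussion in this review and may differ in a released JDK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;

final class ImplicitBindSketch {

    /** Connects a client without calling bind() and returns its local path as text. */
    static String clientLocalPath() {
        try {
            // Hypothetical location for the server's socket file.
            Path dir = Files.createTempDirectory("uds");
            Path socketFile = dir.resolve("server.sock");
            UnixDomainSocketAddress serverAddress = UnixDomainSocketAddress.of(socketFile);
            try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX);
                 SocketChannel client = SocketChannel.open(StandardProtocolFamily.UNIX)) {
                server.bind(serverAddress);    // explicit bind: creates the socket file
                client.connect(serverAddress); // no bind() first: implicit bind
                UnixDomainSocketAddress local =
                        (UnixDomainSocketAddress) client.getLocalAddress();
                return local.getPath().toString();
            } finally {
                // The socket file outlives the channel, so remove it by hand.
                Files.deleteIfExists(socketFile);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("client local path: '" + clientLocalPath() + "'");
    }
}
```

The client ends up with an unnamed local address (an empty path), so the 
only socket file involved is the server's.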


> * The no-arg open creates a SocketChannel/ServerSocketChannel for an 
> Internet protocol
>
>
>    * <p> A server-socket channel is created by invoking one of the {@code open}
>    * methods of this class. The no-arg {@link #open() open} method opens a server-socket
>    * channel for an <i>Internet protocol</i> socket. The {@link #open(ProtocolFamily)}
>    * method is used to open a server-socket channel for a socket of a specified
>    * protocol family.
>
>
>    * Opens a server-socket channel for an <i>Internet protocol</i> socket.
>
> ServerSocketChannel::open()
>
>
> As such, does this mean that a subsequent bind with a 
> UnixDomainSocketAddress will fail for this unbound {Server}SocketChannel?
>
>
Yes, that should throw UnsupportedAddressTypeException as it does 
currently for unknown address types.
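
A minimal sketch of that case, assuming the JDK 16 API 
(UnixDomainSocketAddress). The class name and the path "/tmp/example.sock" 
are hypothetical; the address type is expected to be rejected before any 
socket file would be created.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.UnsupportedAddressTypeException;

final class DefaultFamilyBindSketch {

    /** Returns true if a no-arg (Internet protocol) channel rejects a Unix domain address. */
    static boolean unixAddressRejected() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Hypothetical path, used only to construct an address of the wrong type.
            server.bind(UnixDomainSocketAddress.of("/tmp/example.sock"));
            return false;
        } catch (UnsupportedAddressTypeException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + unixAddressRejected());
    }
}
```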
>
> And similarly, if an open(ProtocolFamily) is invoked with one of INET, 
> INET6 or UNIX, then are there any constraints on a subsequent bind on 
> that {Server}SocketChannel?
>
> Or constraints on SocketChannel::connect?
>
> For example, if INET was specified, does that mean only a SocketAddress 
> representing an IPv4 address may be bound, and similarly with INET6 only 
> an IPv6 address, and lastly will UNIX constrain bind and connect to 
> using a UnixDomainSocketAddress?
>
> And in each case an appropriate exception will be thrown to indicate 
> the invalid binding.
>
>
> — UnixDomainSocketAddress ??
>
>
> If a UnixDomainSocketAddress is provided to a 
> {Server}SocketChannel::bind, after creation with an open taking an INET 
> or INET6 argument, will an exception be thrown?
>
>
> * Is an unnamed SocketAddress extant on an unbound Unix domain 
> SocketChannel? That is to say, will getLocalAddress return a 
> UnixDomainSocketAddress representing the unbound address, or should it 
> return null?
>
>
Yes, the socket family (set when creating the channel) must be 
compatible with the address type used when binding and connecting. 
Otherwise UnsupportedAddressTypeException (UATE) is thrown.
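
The family check can be sketched like this, assuming the JDK 16 API and an 
OS with Unix domain socket support. The class and method names are 
hypothetical. Only two mismatches are shown: a UNIX channel given an 
InetSocketAddress, and an INET channel given an IPv6 address.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ProtocolFamily;
import java.net.SocketAddress;
import java.net.StandardProtocolFamily;
import java.net.UnknownHostException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.UnsupportedAddressTypeException;

final class FamilyMismatchSketch {

    /** Returns true if a channel of the given family refuses to bind to the address. */
    static boolean rejects(ProtocolFamily family, SocketAddress address) {
        try (ServerSocketChannel server = ServerSocketChannel.open(family)) {
            server.bind(address);
            return false;
        } catch (UnsupportedAddressTypeException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** UNIX channel, Internet address (wildcard address, ephemeral port). */
    static boolean unixChannelRejectsInetAddress() {
        return rejects(StandardProtocolFamily.UNIX, new InetSocketAddress(0));
    }

    /** INET (IPv4) channel, IPv6 loopback address built from raw bytes. */
    static boolean inetChannelRejectsIpv6Address() {
        byte[] loopback = new byte[16];
        loopback[15] = 1; // ::1
        try {
            InetAddress ipv6 = InetAddress.getByAddress(loopback);
            return rejects(StandardProtocolFamily.INET, new InetSocketAddress(ipv6, 0));
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UNIX + Inet address rejected: " + unixChannelRejectsInetAddress());
        System.out.println("INET + IPv6 address rejected: " + inetChannelRejectsIpv6Address());
    }
}
```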

> In the module description documentation
>
>
>    <p><i>Internet Protocol</i> sockets support network communication using TCP/IP and
>    are addressed using {@link InetSocketAddress}es which encapsulate an IP address
>    and port number. <i>Internet Protocol</i> sockets are the default kind created,
>    when a protocol family is not specified in the channel factory creation method.
>
>
> But Internet Protocol sockets also include UDP, which is abstracted 
> as DatagramChannel, and SCTP, which is an SctpChannel in the jdk.sctp 
> module.
>
> This is not a big deal, and I understand what is being said, but it 
> somehow doesn’t read so fluidly, from a pedantic perspective.
>
> It is as if it is trying to be abstract and be non-committal on the 
> protocol.
>
> SocketChannel and ServerSocketChannel have been up to this point 
> synonymous with TCP. The underlying protocol is TCP, so I think it could 
> be stated explicitly.
>
>
Yes, there is a slight inconsistency there in that specific paragraph, 
because it is a general one about java.net.StandardProtocolFamily. I 
might change "support network communication using TCP/IP" to "support 
network communication using TCP and UDP". Since SCTP is not part of 
Java SE, I don't think we need to refer to it here.

> — UnixDomainSocketAddress spec
>
>
>    * <p> An <a id="unnamed"></a><i>unnamed</i> {@code UnixDomainSocketAddress} has
>    * an empty path. The local address of a Unix domain socket that is automatically
>    * bound will be unnamed.
>
>
> WRT UnixDomainSocketAddress, an unnamed address equates to an empty 
> Path, rather than a null Path.
>
>
> But an empty Path is significant, as per the Path spec stated below:
>
>
> A Path is considered to be an /empty path/ if it consists solely of 
> one name element that is empty.
>
> Accessing a file using an /empty path/ is equivalent to accessing the 
> default directory of the file system.
>
>
I don't see an issue with the above comment. The last sentence 
describing file behavior for empty paths is not directly relevant to 
sockets.
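
For illustration, an unnamed address can be built and recognised without 
touching the file system at all. A minimal sketch, assuming the JDK 16 
UnixDomainSocketAddress API; the class and helper names are hypothetical.

```java
import java.net.UnixDomainSocketAddress;
import java.nio.file.Path;

final class UnnamedAddressSketch {

    /** An unnamed address is simply one whose path is the empty path. */
    static UnixDomainSocketAddress unnamed() {
        return UnixDomainSocketAddress.of("");
    }

    /** Hypothetical helper: true if the address carries an empty path. */
    static boolean isUnnamed(UnixDomainSocketAddress address) {
        return address.getPath().toString().isEmpty();
    }

    public static void main(String[] args) {
        Path path = unnamed().getPath();
        // Per the Path spec, an empty path has exactly one (empty) name element.
        System.out.println("name elements: " + path.getNameCount());
        System.out.println("unnamed: " + isUnnamed(unnamed()));
    }
}
```

Nothing here opens or resolves the path, which is why the "default 
directory" behaviour of empty paths does not come into play.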


> Also, this implies that auto-bound sockets have unnamed representations. 
> However, this is at variance with ServerSocketChannel:
>
>
>    @apiNote
>    Binding a channel to a <i>Unix Domain</i> socket creates a file corresponding to
>    the file path in the {@link UnixDomainSocketAddress}. This file persists
>    after the channel is closed, and must be removed before another socket can
>    bind to the same name. Binding to an address that is automatically assigned
>    will create a unique file in some system temporary location. The associated
>    socket file also persists after the channel is closed. Its name can be
>    obtained from the channel's local socket address.
>
>
> There are also some contradictions in the following with the 
> UnixDomainSocketAddress spec, in that auto binding for SocketChannel 
> and ServerSocketChannel are distinct procedures:
>
>
>    If a Unix domain {@link SocketChannel} is automatically bound by connecting it
>    without calling {@link SocketChannel#bind(SocketAddress) bind} first, then its
>    address is <i>unnamed</i>; it has an empty path field, and therefore has no
>    associated file in the file-system. Explicitly binding a {@code SocketChannel}
>    to any unnamed address has the same effect.
>    <p>
>    If a Unix domain {@link ServerSocketChannel} is automatically bound by passing a
>    {@code null} address to one of the {@link ServerSocketChannel#bind(SocketAddress) bind}
>    methods, the channel is bound to a unique name in the temporary directory identified
>    by the {@code "java.io.tmpdir"} system property. The exact pathname can be obtained by
>    calling {@link ServerSocketChannel#getLocalAddress() getLocalAddress} after bind returns.
>    It is an error to bind a {@code ServerSocketChannel} to an unnamed address.
>
>
> Why not auto bind SocketChannel using a similar mechanism to that of 
> ServerSocketChannel, and have the unnamed address explicitly represent 
> an unbound SocketChannel?
>
>
>
This is the distinction I referred to above with respect to "implicit 
binding" and "automatic assignment" of addresses. It's just a difference 
between Unix domain and Inet, where we need to support unnamed client 
sockets for Unix domain, but the concept does not make sense for server 
sockets.
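
The server-side "automatic assignment" can be sketched as below, assuming 
the JDK 16 API and an OS with Unix domain socket support. The class and 
method names are hypothetical. The sketch only relies on what the quoted 
apiNote states: bind(null) yields a named address, and the socket file 
persists after close.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;

final class AutoAssignedServerSketch {

    /**
     * Binds a Unix domain server channel with bind(null), closes it, and
     * reports { address was named, socket file still existed after close }.
     */
    static boolean[] bindNullThenClose() {
        try {
            Path socketFile;
            try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
                server.bind(null); // automatic assignment of a unique path
                socketFile = ((UnixDomainSocketAddress) server.getLocalAddress()).getPath();
            }
            boolean named = !socketFile.toString().isEmpty();
            boolean persisted = Files.exists(socketFile);
            // Closing the channel does not remove the file, so clean up explicitly.
            Files.deleteIfExists(socketFile);
            return new boolean[] { named, persisted };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = bindNullThenClose();
        System.out.println("named: " + result[0] + ", file persisted after close: " + result[1]);
    }
}
```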


> In any case all is hunky dory when considering SocketChannel 
> encapsulating a Unix domain socket in the context of sending and 
> receiving data. However, Unix domain sockets have additional 
> capabilities and functionality which distinguish them from a TCP 
> counterpart. This is in terms of sending file descriptors and 
> capabilities. This is significant functionality.
>
> So much so, that it should at least be considered that they are 
> represented by their own distinct set of abstractions, analogous to 
> SctpChannel.
>
> This could be done through either subclassing {Server}SocketChannel 
> or placing a relevant set of abstractions at the same level in the 
> SelectableChannel hierarchy.
>
>
> I think having a separate set of abstractions for “unix domain 
> sockets” would make them easier to use, allow for easier specification 
> of the additional functionality and behaviour, and leave SocketChannel 
> and ServerSocketChannel synonymous with the TCP protocol.
>
>
> As such, an abstraction such as:
>
> * LocalChannel, LocalServerChannel, or
>
> * LocalSocketChannel, LocalServerSocketChannel, or
>
> * LocalDomainChannel, LocalDomainServerChannel
>
> Are worth considering
>
>
I have done some prototyping of support for capabilities like that. But, 
it's out of scope for this
JEP, and we can look at it afterwards. For what it's worth, it is 
possible to implement sending and
receiving of channels without requiring any new API types.
>
> Some motivating factors for suggesting a separate set of API 
> abstractions are:
>
>
> It will more clearly separate out the underlying protocol domains.
>
>
> Provides clearer and cleaner semantics. For example, other than the 
> fact that a TCP socket and a Unix domain socket both exhibit stream 
> behaviour, a Unix domain socket can exhibit different behaviour in 
> connection establishment and send/receive behaviour; they are not 
> subject to flow control etc., and may exhibit different reliability 
> semantics, in that it is possible for a sender to flood a receiver.
>
>
> They have a different access model in being subject to OS level 
> permission on the OS filesystem.
>
> This leads to different bind behaviour, connect behaviour - different 
> causes for IOException or SocketException.
>
> They use a different “socket address” namespace i.e. the filesystem 
> name space.
>
> Close behaviour is different - requires additional explicit management 
> of the socket domain
>
> namespace in the filesystem.
>
>
> Having separate API abstractions allows for specific descriptions and 
> specification for those abstractions
>
> Having a separate abstraction will provide cleaner, more precise and 
> clearer semantics for a Unix domain channel abstraction
>
> Allows for clearer, more precise specification definition, and 
> developer documentation, avoiding possible ambiguities.
>
>
>
> Eliminates the StandardProtocolFamily extensions
>
> BUT will require addition of openLocalChannel/openLocalServerChannel 
> methods to SelectorProvider
>
>
> Eliminates the socket method and the throwing of UnsupportedOperationException
>
>
> backward compatibility in the API - SocketChannel and 
> ServerSocketChannel remain as they are
>
> and the new functionality is defined in a new distinct set of abstractions
>
>
> Allows for easier integration of the additional features and 
> functionality associated with Unix domain sockets, for example, the 
> passing of file descriptors, or capability exchanges (sendCredential, 
> receiveCredential).
>
>
> File descriptor passing can be easily included in a separate 
> abstraction (e.g. LocalChannel), with additional methods, such as 
> sendFD(FileDescriptor fd) and receiveFD(FileDescriptor fd), provided 
> for the purpose, or even with an overloaded send and receive method.
>
> Thus, avoiding the need to retrofit such extended functionality into 
> the existing SocketChannel api.
>
> Of course, alternatively such functionality could be supported by a 
> mediating utility abstraction, such as 
> FileDescriptorExchanger::send(FileDescriptor, SocketChannel) and 
> FileDescriptorExchanger::recv(FileDescriptor, SocketChannel), for the 
> purpose of exchanging file descriptors.
>
>
> In short, a Unix domain socket is a different functional abstraction, 
> which has different and extended behaviour.
>
> As such, defining a separate API abstraction provides for clear, 
> unambiguous behaviour, making it easier to use, and allowing for easier 
> integration of additional extended behaviour.
>
>
>

I'm not convinced that new abstractions are needed other than the bare 
minimum for representing the new address type and protocol family etc. 
You can take a look at the prototype I did for sending & receiving 
channels in the sandbox (sendchannels branch). It is implemented as 
SocketOption<Channel>. Setting the option is used for sending a channel 
and getting it receives one. There's a bit more to it than that, but new 
APIs are not required. And, in any case, this is out of scope for this JEP.
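
To show only the shape of that idea, here is a purely hypothetical sketch. 
The option SO_CHANNEL_SKETCH, the class and the method names are invented 
for illustration and are not the names used in the sandbox prototype. A 
stock JDK does not support such an option, so the sketch just shows the 
call pattern and that an unmodified channel rejects it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketOption;
import java.nio.channels.Channel;
import java.nio.channels.SocketChannel;

final class ChannelPassingSketch {

    /** HYPOTHETICAL option whose value is a channel; no such option exists in the JDK. */
    static final SocketOption<Channel> SO_CHANNEL_SKETCH = new SocketOption<Channel>() {
        @Override public String name() { return "SO_CHANNEL_SKETCH"; }
        @Override public Class<Channel> type() { return Channel.class; }
    };

    /**
     * Call pattern for "sending": setOption(option, channelToSend).
     * "Receiving" would be the mirror image, carrier.getOption(option).
     * Returns false when the carrier does not support the option.
     */
    static boolean trySend(SocketChannel carrier, Channel payload) {
        try {
            carrier.setOption(SO_CHANNEL_SKETCH, payload);
            return true;
        } catch (UnsupportedOperationException notSupported) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** On an unmodified JDK this is expected to report false. */
    static boolean stockJdkSupportsSketchOption() {
        try (SocketChannel carrier = SocketChannel.open();
             SocketChannel payload = SocketChannel.open()) {
            return trySend(carrier, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("supported: " + stockJdkSupportsSketchOption());
    }
}
```

The point of the pattern is that SocketOption is already generic in its 
value type, so a channel-valued option needs no new API types, only an 
implementation that recognises it.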


Thanks

Michael


> best regards
> Mark
>
> ------------------------------------------------------------------------
> *From:* nio-dev <nio-dev-bounces at openjdk.java.net> on behalf of 
> Michael McMahon <michael.x.mcmahon at oracle.com>
> *Sent:* Thursday 18 June 2020 14:31
> *To:* nio-dev at openjdk.java.net <nio-dev at openjdk.java.net>
> *Subject:* RFR 8245194: Unix domain socket channel implementation
>
> Hi,
>
> I'd like to start the review for JEP-380 (Unix Domain Socket Channels) 
> [1]. The first and smaller part of this JEP (8241305: Add protocol 
> specific factory creation methods to SocketChannel and 
> ServerSocketChannel) has already been integrated in JDK 15. This main 
> part of the JEP will hopefully be targeted to 16 under this bugid 
> (8245194 Unix Domain socket channel implementation [3])
>
> The full webrev [4] touches a lot of files, so I have put the public 
> API change in a separate webrev at [5] and there is a specdiff at [6].
>
> The implementation, while it touches a lot of files, is mostly about 
> re-factoring the existing SocketImpl and ServerSocketImpl 
> implementation classes into separate Inet and Unix variants, along 
> with the new implementation code for Unix domain.
>
> Comments welcome on either the implementation or the API, although I 
> would like to concentrate on the API to start with as I expect the 
> review will have several iterations.
>
> Regards,
>
> Michael.
>
>
> [1] https://openjdk.java.net/jeps/380
>
> [2] https://bugs.openjdk.java.net/browse/JDK-8241305
>
> [3] https://bugs.openjdk.java.net/browse/JDK-8245194
>
> [4] http://cr.openjdk.java.net/~michaelm/8245194/impl.webrev/webrev.1/
>
> [5] http://cr.openjdk.java.net/~michaelm/8245194/api.webrev/webrev.1/
>
> [6] 
> http://cr.openjdk.java.net/~michaelm/8245194/specdiff/specout.1/overview-summary.html
>