RFR 8245194: Unix domain socket channel implementation

mark sheppard macanaoire at hotmail.com
Thu Jul 2 00:13:46 UTC 2020


Hi Michael,



A couple of comments and observations from the API perspective


Trojan work done here.

Looking at the API - overall it looks reasonably good, but there are possibly a few inconsistencies,

at least in my interpretation.


I have some reservations about the representation of unix domain sockets as an integral part of

SocketChannel and ServerSocketChannel, which will be addressed further below.


A couple of small points:


— Open, bind, connect methods ??


* Does open(SocketAddress remoteAddress) result in an implicit bind for the returned SocketChannel?

The implementation note implies that an address is assigned automatically when connect is invoked on an unbound
Unix domain SocketChannel. The inference is that this open includes an implicit connect, which in turn
includes an implicit bind. Is that correct?
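
To make the question concrete, here is a minimal sketch of the behaviour for the Internet protocol case, where open(remote) is documented to work as if by open() followed by connect(remote). The class and method names are hypothetical and only for illustration. Whether a Unix domain channel behaves the same way is exactly what I am asking.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of the webrev.
class OpenWithRemoteDemo {

    // Returns the local address a TCP channel reports right after open(remote).
    static SocketAddress localAddressAfterOpen() throws IOException {
        try (ServerSocketChannel listener = ServerSocketChannel.open()) {
            // Port 0 asks the OS for an ephemeral port on the loopback interface.
            listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(listener.getLocalAddress())) {
                // No explicit bind or connect was issued on the client.
                return client.getLocalAddress();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("local address after open(remote): " + localAddressAfterOpen());
    }
}
```

For TCP the returned address is an OS-assigned InetSocketAddress, so the bind is implicit. The spec should say what the equivalent is for a UnixDomainSocketAddress argument.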


* The no-arg open creates a SocketChannel/ServerSocketChannel for an Internet protocol socket:


     * <p> A server-socket channel is created by invoking one of the {@code open}
     * methods of this class. The no-arg {@link #open() open} method opens a server-socket
     * channel for an <i>Internet protocol</i> socket. The {@link #open(ProtocolFamily)}
     * methods is used to open a server-socket channel for a socket of a specified
     * protocol family.

     * Opens a server-socket channel for an <i>Internet protocol</i> socket.

ServerSocketChannel::open()


As such, does this mean that a subsequent bind with a UnixDomainSocketAddress will fail for this unbound {Server}SocketChannel?

Similarly, if open(ProtocolFamily) is invoked with one of INET, INET6 or UNIX, are there any constraints
on a subsequent bind on that {Server}SocketChannel, or on SocketChannel::connect?

For example, if INET was specified, does that mean only a SocketAddress representing an IPv4 address
may be bound? Similarly, with INET6 only an IPv6 address? And lastly, will UNIX constrain bind and connect
to using a UnixDomainSocketAddress?

And in each case, will an appropriate exception be thrown to indicate the invalid binding?
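
A small probe along the following lines would answer the question empirically. It is only a sketch: the class name and socket file name are hypothetical, and it assumes the factory names as they appear in JDK 16 (UnixDomainSocketAddress.of(Path), ServerSocketChannel.open(ProtocolFamily)). NetworkChannel::bind already documents UnsupportedAddressTypeException for an address type that is not supported, so that would be the natural candidate.

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the webrev.
class FamilyMismatchDemo {

    // Tries to bind an INET server channel to a Unix domain address.
    // Returns "bound" on success, otherwise the simple name of the exception thrown.
    static String bindInetChannelToUnixAddress() throws IOException {
        Path dir = Files.createTempDirectory("uds");
        Path socketFile = dir.resolve("demo.sock"); // hypothetical file name
        try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.INET)) {
            server.bind(UnixDomainSocketAddress.of(socketFile));
            return "bound";
        } catch (RuntimeException | IOException e) {
            return e.getClass().getSimpleName();
        } finally {
            // Clean up in case a socket file was created after all.
            Files.deleteIfExists(socketFile);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("INET channel + Unix address: " + bindInetChannelToUnixAddress());
    }
}
```

Whatever this prints, the point is that the outcome should be stated in the bind and connect specifications rather than discovered by experiment.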


— UnixDomainSocketAddress ??


If a UnixDomainSocketAddress is provided to {Server}SocketChannel::bind after the channel was created with an open taking an INET or INET6 argument, will
an exception be thrown?


* Is an unnamed SocketAddress extant on an unbound Unix domain SocketChannel? That is to say, will getLocalAddress return a UnixDomainSocketAddress representing the unbound address, or should it return null?
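
The existing NetworkChannel::getLocalAddress contract is to return null when the channel's socket is not bound. The following sketch checks what an unbound Unix domain SocketChannel reports. The class name is hypothetical, and it assumes StandardProtocolFamily.UNIX and SocketChannel.open(ProtocolFamily) as in JDK 16, on a platform that supports Unix domain sockets.

```java
import java.io.IOException;
import java.net.SocketAddress;
import java.net.StandardProtocolFamily;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of the webrev.
class UnboundLocalAddressDemo {

    // Opens a Unix domain channel, does not bind or connect it,
    // and returns whatever getLocalAddress reports.
    static SocketAddress localAddressOfUnboundUnixChannel() throws IOException {
        try (SocketChannel channel = SocketChannel.open(StandardProtocolFamily.UNIX)) {
            return channel.getLocalAddress();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbound local address: " + localAddressOfUnboundUnixChannel());
    }
}
```

If the answer is null, then "unnamed" and "unbound" are distinct states, and the spec should make that distinction explicit.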


In the module description documentation:

     * <p><i>Internet Protocol</i> sockets support network communication using TCP/IP and
     * are addressed using {@link InetSocketAddress}es which encapsulate an IP address
     * and port number. <i>Internet Protocol</i> sockets are the default kind created,
     * when a protocol family is not specified in the channel factory creation method.


But Internet Protocol sockets also include UDP, which is abstracted as DatagramChannel, and SCTP, which is
SctpChannel in the jdk.sctp module.

This is not a big deal, and I understand what is being said, but from a pedantic perspective it doesn't read fluidly.
It is as if it is trying to be abstract and non-committal about the protocol.

SocketChannel and ServerSocketChannel have up to this point been synonymous with TCP. The underlying protocol
is TCP, so I think it could be stated explicitly.


 — UnixDomainSocketAddress spec


     * <p> An <a id="unnamed"></a><i>unnamed</i> {@code UnixDomainSocketAddress} has
     * an empty path. The local address of a Unix domain socket that is automatically
     * bound will be unnamed.


WRT UnixDomainSocketAddress, an unnamed address equates to an empty Path rather than a null Path.

But an empty Path is significant, as per the Path spec quoted below:


A Path is considered to be an empty path if it consists solely of one name element that is empty.

Accessing a file using an empty path is equivalent to accessing the default directory of the file system.
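
The significance is easy to demonstrate with a short sketch (the class and helper names are hypothetical): an empty Path is a real, usable path that resolves to the default directory, so using it as the "no name" marker overloads a value that already means something.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, for illustration only.
class EmptyPathDemo {

    // Mirrors the Path spec: exactly one name element, and that element is empty.
    static boolean isEmptyPath(Path p) {
        return p.getNameCount() == 1 && p.getName(0).toString().isEmpty();
    }

    public static void main(String[] args) {
        Path empty = Path.of("");
        System.out.println("is empty path:       " + isEmptyPath(empty));
        // Accessing the empty path accesses the default directory.
        System.out.println("is a directory:      " + Files.isDirectory(empty));
        System.out.println("absolute equivalent: " + empty.toAbsolutePath());
    }
}
```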


Also, this implies that auto-bound sockets have unnamed representations. However, this is at variance with ServerSocketChannel:


     * @apiNote
     * Binding a channel to a <i>Unix Domain</i> socket creates a file corresponding to
     * the file path in the {@link UnixDomainSocketAddress}. This file persists
     * after the channel is closed, and must be removed before another socket can
     * bind to the same name. Binding to an address that is automatically assigned
     * will create a unique file in some system temporary location. The associated
     * socket file also persists after the channel is closed. Its name can be
     * obtained from the channel's local socket address.


There are also some contradictions between the following and the UnixDomainSocketAddress spec, in that auto-binding for
SocketChannel and ServerSocketChannel are distinct procedures:


     * If a Unix domain {@link SocketChannel} is automatically bound by connecting it
     * without calling {@link SocketChannel#bind(SocketAddress) bind} first, then its
     * address is <i>unnamed</i>; it has an empty path field, and therefore has no
     * associated file in the file-system. Explicitly binding a {@code SocketChannel}
     * to any unnamed address has the same effect.
     * <p>
     * If a Unix domain {@link ServerSocketChannel} is automatically bound by passing a
     * {@code null} address to one of the {@link ServerSocketChannel#bind(SocketAddress) bind}
     * methods, the channel is bound to a unique name in the temporary directory identified
     * by the {@code "java.io.tmpdir"} system property. The exact pathname can be obtained by
     * calling {@link ServerSocketChannel#getLocalAddress() getLocalAddress} after bind returns.
     * It is an error to bind a {@code ServerSocketChannel} to an unnamed address.
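
The two auto-bind procedures quoted above can be put side by side in a short sketch. The class name is hypothetical, and it assumes the API names as in JDK 16 and a platform with Unix domain socket support. The returned pair is the server's and the client's local address after each has been bound automatically.

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;

// Hypothetical demo class, not part of the webrev.
class AutoBindDemo {

    // Returns { server local address, client local address }.
    static UnixDomainSocketAddress[] autoBoundAddresses() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
            server.bind(null); // automatic assignment on the server side
            UnixDomainSocketAddress serverAddr =
                    (UnixDomainSocketAddress) server.getLocalAddress();
            try (SocketChannel client = SocketChannel.open(StandardProtocolFamily.UNIX)) {
                client.connect(serverAddr); // connect without calling bind first
                UnixDomainSocketAddress clientAddr =
                        (UnixDomainSocketAddress) client.getLocalAddress();
                return new UnixDomainSocketAddress[] { serverAddr, clientAddr };
            } finally {
                // The socket file persists after close, so remove it explicitly.
                Files.deleteIfExists(serverAddr.getPath());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        UnixDomainSocketAddress[] addrs = autoBoundAddresses();
        System.out.println("server: '" + addrs[0].getPath() + "'");
        System.out.println("client: '" + addrs[1].getPath() + "'");
    }
}
```

Per the quoted text, the server side should report a file path and the client side an empty one, so "automatically bound" means two different things depending on the channel type.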


Why not auto-bind SocketChannel using a mechanism similar to that of ServerSocketChannel, and have the unnamed
address explicitly represent an unbound SocketChannel?



In any case, all is hunky dory when considering SocketChannel encapsulating a Unix domain socket in the context of
sending and receiving data. However, Unix domain sockets have additional capabilities and functionality which distinguish
them from their TCP counterpart, namely sending file descriptors and capabilities. This is significant functionality.

So much so that it should at least be considered whether they are represented by their own distinct set of abstractions, analogous to SctpChannel.
This could be done either by subclassing {Server}SocketChannel or by placing a relevant set of abstractions at the same
level in the SelectableChannel hierarchy.


I think having a separate set of abstractions for “unix domain sockets” would make them easier to use, allow for easier specification of the additional

functionality and behaviour and  leave SocketChannel and ServerSocketChannel synonymous with the TCP protocol.


As such, abstractions such as:

 * LocalChannel, LocalServerChannel, or

 * LocalSocketChannel, LocalServerSocketChannel, or

 * LocalDomainChannel, LocalDomainServerChannel

are worth considering.


Some motivating factors for suggesting a separate set of api abstractions  are:


It will more clearly separate out the underlying protocol domains.


Provides clearer and cleaner semantics. For example, other than the fact that a TCP socket
and a Unix domain socket both exhibit stream behaviour, a Unix domain socket can exhibit
different behaviour in connection establishment and in send/receive;
they are not subject to flow control etc., and may exhibit different reliability semantics,
in that it is possible for a sender to flood a receiver.


They have a different access model in being subject to OS level permission on the OS filesystem.

This leads to different bind behaviour, connect behaviour - different causes for IOException or SocketException.

They use a different “socket address” namespace i.e. the filesystem name space.

Close behaviour is different - requires additional explicit management of the socket domain

namespace in the filesystem.


Having separate API abstractions allows for descriptions and specification specific to those abstractions.

Having a separate abstraction will provide cleaner, more precise and clearer semantics for a Unix domain channel abstraction.

Allows for clearer, more precise specification and developer documentation, avoiding possible ambiguities.



Eliminates the StandardProtocolFamily extensions,
BUT will require the addition of openLocalChannel/openLocalServerChannel methods to SelectorProvider.


Eliminates the socket() method and its throwing of UnsupportedOperationException.


Preserves backward compatibility in the API: SocketChannel and ServerSocketChannel remain as they are,
and the new functionality is defined in a new, distinct set of abstractions.


Allows for easier integration of the additional features and functionality associated with Unix domain sockets,
for example, the passing of file descriptors, or capability exchanges (sendCredential, receiveCredential).


File descriptor passing can be easily included in a separate abstraction (e.g. LocalChannel), with additional methods
such as sendFD(FileDescriptor fd) and receiveFD(FileDescriptor fd) provided for the purpose, or even with
overloaded send and receive methods.
This avoids the need to retrofit such extended functionality into the existing SocketChannel API.


Of course, such functionality could alternatively be supported by a mediating utility abstraction, such as
FileDescriptorExchanger::send(FileDescriptor, SocketChannel) and FileDescriptorExchanger::recv(FileDescriptor, SocketChannel),
for the purpose of exchanging file descriptors.


In short, a Unix domain socket is a different functional abstraction, which has different and extended behaviour.
As such, defining a separate API abstraction provides for clear, unambiguous behaviour, making it easier to use and
allowing for easier integration of additional extended behaviour.


best regards
Mark

________________________________
From: nio-dev <nio-dev-bounces at openjdk.java.net> on behalf of Michael McMahon <michael.x.mcmahon at oracle.com>
Sent: Thursday 18 June 2020 14:31
To: nio-dev at openjdk.java.net <nio-dev at openjdk.java.net>
Subject: RFR 8245194: Unix domain socket channel implementation


Hi,

I'd like to start the review for JEP-380 (Unix Domain Socket Channels) [1]. The first and smaller part of this JEP (8241305: Add protocol specific factory creation methods to SocketChannel and ServerSocketChannel [2]) has already been integrated in JDK 15. This main part of the JEP will hopefully be targeted to 16 under this bugid (8245194 Unix Domain socket channel implementation [3]).

The full webrev [4] touches a lot of files, so I have put the public API change in a separate webrev at [5] and there is a specdiff at [6].

The implementation, while it touches a lot of files, is mostly about re-factoring the existing SocketImpl and ServerSocketImpl implementation classes into separate Inet and Unix variants, along with the new implementation code for Unix domain.

Comments welcome on either the implementation or the API, although I would like to concentrate on the API to start with as I expect the review will have several iterations.

Regards,

Michael.


[1] https://openjdk.java.net/jeps/380

[2] https://bugs.openjdk.java.net/browse/JDK-8241305

[3] https://bugs.openjdk.java.net/browse/JDK-8245194

[4] http://cr.openjdk.java.net/~michaelm/8245194/impl.webrev/webrev.1/

[5] http://cr.openjdk.java.net/~michaelm/8245194/api.webrev/webrev.1/

[6] http://cr.openjdk.java.net/~michaelm/8245194/specdiff/specout.1/overview-summary.html