MumbleCloseable

Mon Jun 24 18:04:10 PDT 2013

This note attempts to tie together the issues that have been raised for 
streams having a resource release mechanism.  For an explanation of why 
the initial attempt (which included the high-noise-to-signal classes 
CloseableStream and DelegatingStream) was impractical, see Paul's note here:

http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/2013-June/001987.html

and

http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/2013-June/002014.html

The key goals here are that we want to support both:
  - Users who care about guaranteed resource release, who should be able 
to ensure it reliably and unintrusively; and
  - Users who don't care, who should not have to pay attention to it.

Most of the problems we're discussing here are not new to streams; 
they're a consequence of not having dealt sufficiently with resource 
management in the past.

For example, take the IO packages.  FileInputStream holds a GC-resistant 
(GCR) resource, a file handle.  Therefore, well-behaved code should 
ensure that it calls close() before the FIS goes out of scope.  On the 
other hand, ByteArrayInputStream holds no GCR resources, and in fact its 
close() method is documented to do nothing, and you can even use the 
stream after closing it.
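
To make the contrast concrete, here is a minimal sketch of the two 
behaviors.  The class name, method names, and the scratch file are 
made up for illustration; only the java.io and java.nio.file calls are 
real.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of any proposal.
class CloseContrastDemo {
    // ByteArrayInputStream: close() is documented to have no effect,
    // so a read after close() still returns data.
    static int readByteArrayAfterClose(byte[] data) {
        try {
            ByteArrayInputStream in = new ByteArrayInputStream(data);
            in.close();
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // FileInputStream: close() releases the file handle, so a later
    // read is expected to fail with an IOException.
    static boolean fileReadFailsAfterClose() {
        try {
            Path tmp = Files.createTempFile("gcr-demo", ".bin"); // scratch file
            try {
                Files.write(tmp, new byte[] { 42 });
                FileInputStream in = new FileInputStream(tmp.toFile());
                in.close();
                try {
                    in.read();
                    return false; // read unexpectedly succeeded
                } catch (IOException expected) {
                    return true;
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```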

Where things get nastier is when we attempt to abstract over these.  If 
you wrap a { FileInputStream, ByteArrayInputStream } with a 
BufferedInputStream, what should BIS tell its users about whether 
close() is needed or not?  The answer is that this information is not 
easily captured in the static type system, and users have to reason 
about what the BIS might be wrapping.  [1]  Fortunately, the 
consequences of either error (either closing it unnecessarily, or 
failing to close it when there is a GCR resource there) are rarely fatal.

Streams are like BufferedInputStream, in that they might or might not 
contain a GCR resource, except the default orientation is swapped: most 
of the time, they do not, and failing to close them is just fine (and 
actually encouraged).  But when they do, it is important to be able to 
ensure that the resources held by underlying streams are released. We 
want it to be POSSIBLE to ensure resources held by streams are released, 
without users feeling forced to structure their code around resource 
release in the many cases where they know no resource needs to be released.

The best tool we have for making sure resources are released is 
try-with-resources.  Where possible, our resource management solution 
should build on that, not compete with that.  That said, TWR is imperfect.

One imperfection is that when something is declared AutoCloseable, the 
presumption seems to be that it *must* be closed, and this may put a 
burden on user code.  For example, we don't want the situation where 
users feel they have to do:

   try (Stream s = list.stream().filter(...).map(...)) {
     s.forEach(...);
   }

instead of

   list.stream()
       .filter(...)
       .map(...)
       .forEach(...);

CONCRETE PROPOSAL
-----------------

Extend the notion of AutoCloseable with a new interface, name TBD.  (I 
will use MumbleCloseable as a stand-in):

   package java.util;

   public interface MumbleCloseable extends AutoCloseable {
       void close(); // no exceptions
   }

The semantics of MumbleCloseable would be:
  - This is an object that may, or may not, hold a GCR resource.  The 
close() method is always valid to call, but may do nothing, and because 
it extends AC, can always be used with TWR.  Users can exploit their 
knowledge of what resources are held to avoid calling close() if they 
like, and tools should be cognizant of this. [2]  Frameworks are likely 
to want to be conservative and always close, but top-level user code can 
use its own judgment.
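
As a rough sketch of what these semantics buy us: because close() 
declares no checked exceptions, TWR needs no catch clause, and callers 
who know nothing is held can skip it entirely.  CountingResource and 
the demo methods below are hypothetical, and MumbleCloseable is still 
only the stand-in name.

```java
// Stand-in interface from this note; the final name is TBD.
interface MumbleCloseable extends AutoCloseable {
    @Override
    void close(); // narrows AutoCloseable.close(): no checked exceptions
}

// Hypothetical implementation that just counts close() calls.
class CountingResource implements MumbleCloseable {
    int closeCount = 0;

    @Override
    public void close() {
        closeCount++;
    }
}

class MumbleCloseableDemo {
    // Framework style: conservatively close.  No catch clause is
    // required because close() throws no checked exception.
    static int useAndClose(CountingResource r) {
        try (MumbleCloseable res = r) {
            // ... use the resource ...
        }
        return r.closeCount;
    }

    // Top-level style: the caller knows nothing is held and skips close().
    static int useWithoutClose(CountingResource r) {
        // ... use the resource ...
        return r.closeCount;
    }
}
```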

We would make Stream (and friends) extend MumbleCloseable.  The behavior 
of close() would include that subsequent invocations of intermediate or 
terminal stream operations would fail with an exception, and close 
handlers would be run at the time close() is called.

There would be a way to register an additional close-handler on a 
stream, so that methods like Files.walk() could set up the stream so that 
closing it would close the underlying DirectoryStream.  It is currently 
a bikeshed exercise to come up with the right spelling here.
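
For illustration only, here is a sketch of how a library method might 
use such a registration.  It assumes the spelling onClose(Runnable) on 
Stream (the name objected to in item 5 below), and FakeHandle and 
entries() are hypothetical stand-ins for a real resource such as a 
DirectoryStream and a method such as Files.walk().

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class CloseHandlerDemo {
    // Hypothetical stand-in for a GC-resistant resource.
    static final class FakeHandle {
        boolean open = true;

        void release() {
            open = false;
        }
    }

    // Hypothetical library method: the returned stream releases the
    // handle when close() is called on it.
    static Stream<String> entries(FakeHandle handle, List<String> names) {
        return names.stream().onClose(handle::release);
    }

    // Returns whether the handle is still open after a TWR block.
    static boolean handleOpenAfterTwr() {
        FakeHandle h = new FakeHandle();
        try (Stream<String> s = entries(h, Arrays.asList("a", "b"))) {
            s.filter(n -> !n.isEmpty()).forEach(n -> { });
        }
        return h.open;
    }

    // Returns whether the handle is still open when close() is never called.
    static boolean handleOpenWithoutClose() {
        FakeHandle h = new FakeHandle();
        entries(h, Arrays.asList("a", "b")).forEach(n -> { });
        return h.open;
    }
}
```

The second method shows the flip side: a terminal operation alone does 
not run the close handler, so callers who do hold a real resource need 
to close the stream themselves.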

As always, the hardest part of this is coming up with the right spelling 
for MumbleCloseable.  [3]

As it turns out, the implementation within Streams is simple, clean, and 
imposes minimal incremental performance overhead, since it can piggyback 
on existing features.  Calling close() on a stream which has no 
resources is a handful of field reads and writes; maintaining support 
for closeability, if not used, costs one more field and no additional 
checks on other code paths.  An implementation is here:

   http://cr.openjdk.java.net/~briangoetz/JDK-8017513/webrev/

You'll see the changes overall are quite small and unintrusive.

OUTSTANDING OBJECTIONS
----------------------

A number of objections to aspects of this have been raised.  I will 
attempt to catalog and respond to them here.

1.  Static analysis tools, specifically the Eclipse inspection, key off 
of AutoCloseable and nag the user to close the resource.  This in turn 
will push users to mangle their code as above for no benefit.

Response: The Eclipse team has already expressed ability and willingness 
to tune the inspection to work effectively with java.util.stream.Stream.

2.  More generally, users won't know when they see Stream whether they 
have to close it or not.

Response: This is reality.  A stream might, or might not, wrap a GCR 
resource.  Our options here seem to be:
  - Punt completely, and make these resources non-releasable (bad);
  - Punt more subtly, and not provide useful methods like Files.walk (lame);
  - Provide a mechanism where users who know what their streams contain 
can ensure that resources are released, if they care, while not forcing 
users who don't care to pay attention.

By providing a new marker in the form of MumbleCloseable, we can alert 
users that the sense of AutoCloseable is flipped; that most of the time, 
not closing is likely to be fine.
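
From the user's side, the two cases might look like the sketch below. 
It assumes Files.walk returns a closeable Stream<Path>, as this note 
proposes; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class WhenToCloseDemo {
    // Collection-backed stream: no GCR resource, so no close() is needed.
    static long countNonEmpty(List<String> words) {
        return words.stream().filter(w -> !w.isEmpty()).count();
    }

    // Directory-backed stream: holds open directory handles, so the
    // caller who cares uses TWR to release them.
    static long countEntries(Path dir) {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for trying the above on a fresh, empty scratch directory.
    static long countEntriesOfFreshTempDir() {
        try {
            Path dir = Files.createTempDirectory("walk-demo");
            try {
                return countEntries(dir);
            } finally {
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```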

3.  Calling close() doesn't guarantee that the stream is closed 
immediately, or at all.  I want to be able to call close() in the middle 
of a parallel query and have that short-circuit the query execution, not 
unlike closing a SocketInputStream does.

Response: I too would love a hard cancellation mechanism, but we spent a 
while on that and didn't come up with something we could all live with.  
There are plenty of existing classes with close(), like 
ByteArrayInputStream, where close() doesn't guarantee that anything is 
closed.  So while this is unfortunate, it is merely a minus point.

4.  You're doing this at the wrong level; Spliterator and Iterator 
should have close() methods instead.

Response (a): Even if they did, Stream would still need a close() method.
Response (b): Not realistic, so let's not even waste time musing on this.

5.  I don't like the name onClose(closeAction).

Response: All bikesheds could use more paint.  Better suggestions welcome!

[1] This is not unlike many other cross-cutting concerns that we live 
with every day, like aliasing, mutability, or thread-safety.  Whether or 
not you have to synchronize before invoking a method on a List is purely 
a function of decisions that have been made by the owner of that list, 
and is not reflected in the language or type system.  Users have to 
reason about "is this object shared" and "what synchronization protocol 
does it use", but this is usually tractable since the reference is 
typically confined to code that knows the answer.  The same 
applies to "what resource is backing this stream."

[2] We should also explore, immediately at the conclusion of this 
exercise, whether it makes sense to have some annotations that go with 
MumbleCloseable to indicate "I know I am / am not returning something 
that ought to be closed."  This would help address the concerns raised 
by Stephan of "How do we make the static detectors even better, rather 
than just turning them off when they see a MumbleCloseable."

[3] One possible name is MaybeCloseable, but I don't like that much 
because it suggests "maybe not closeable" which makes users wonder "OK, 
then how do I know if I can call close()?"  We want something that 
suggests you can always call close, but don't always have to.  Best 
suggestion so far is "AdvisoryCloseable" -- thanks Doug -- but taking 
additional suggestions.

I suspect the hardest problem here is the name.  Which isn't a terrible 
place to be.