MumbleCloseable
Brian Goetz
brian.goetz at oracle.com
Mon Jun 24 18:04:10 PDT 2013
This note attempts to tie together the issues that have been raised for
streams having a resource release mechanism. For an explanation of why
the initial attempt (which included the high-noise-to-signal classes
CloseableStream and DelegatingStream) was impractical, see Paul's note here:
http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/2013-June/001987.html
and
http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/2013-June/002014.html
The key goals here are that we want to support both:
- Users who care about guaranteed resource release, who should be able to
get it reliably and unintrusively; and
- Users who don't care, who should not have to pay attention to it.
Most of the problems we're discussing here are not new to streams;
they're a consequence of not having dealt sufficiently with resource
management in the past.
For example, take the IO packages. FileInputStream holds a GC-resistant
(GCR) resource, a file handle. Therefore, well-behaved code should
ensure that it calls close() before the FIS goes out of scope. On the
other hand, ByteArrayInputStream holds no GCR resources, and in fact its
close() method is documented to do nothing, and you can even use the
stream after closing it.
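To make the contrast concrete, here is a minimal sketch of the two behaviors described above. The class and method names (CloseSemanticsDemo, readAfterClose, fileReadFailsAfterClose) are made up for illustration; only the java.io classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class CloseSemanticsDemo {

    /** ByteArrayInputStream: close() is a no-op, so reading afterwards still works. */
    public static int readAfterClose() throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] { 42 });
        in.close();       // documented to have no effect
        return in.read(); // still returns the first byte
    }

    /** FileInputStream: close() releases the file handle, so a later read fails. */
    public static boolean fileReadFailsAfterClose() throws IOException {
        File tmp = File.createTempFile("close-demo", ".bin"); // hypothetical scratch file
        tmp.deleteOnExit();
        FileInputStream in = new FileInputStream(tmp);
        in.close();
        try {
            in.read();
            return false; // not expected: the handle is gone
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("BAIS read after close: " + readAfterClose());
        System.out.println("FIS read fails after close: " + fileReadFailsAfterClose());
    }
}
```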
Where things get nastier is when we attempt to abstract over these. If
you wrap a { FileInputStream, ByteArrayInputStream } with a
BufferedInputStream, what should BIS tell its users about whether
close() is needed or not? The answer is that this information is not
easily captured in the static type system, and users have to reason
about what the BIS might be wrapping. [1] Fortunately, the
consequences of either error (either closing it unnecessarily, or
failing to close it when there is a GCR resource there) are rarely fatal.
Streams are like BufferedInputStream, in that they might or might not
contain a GCR resource, except the default orientation is swapped: most
of the time, they do not, and failing to close them is just fine (and
actually encouraged). But when they do, it is important to be able to
ensure that the resources held by underlying streams are released. We
want it to be POSSIBLE to ensure resources held by streams are released,
without users feeling forced to structure their code around resource
release in the many cases where they know no resource needs to be released.
The best tool we have for making sure resources are released is
try-with-resources. Where possible, our resource management solution
should build on that, not compete with that. That said, TWR is imperfect.
One imperfection is that when something is declared AutoCloseable, the
presumption seems to be that it *must* be closed, and this may put a
burden on user code. For example, we don't want the situation where
users feel they have to do:
    try (Stream s = list.stream().filter(...).map(...)) {
        s.forEach(...);
    }
instead of
    list.stream()
        .filter(...)
        .map(...)
        .forEach(...);
CONCRETE PROPOSAL
-----------------
Extend the notion of AutoCloseable with a new interface, name TBD. (I
will use MumbleCloseable as a stand-in):
    package java.util;
    public interface MumbleCloseable extends AutoCloseable {
        void close(); // no exceptions
    }
The semantics of MumbleCloseable would be:
- This is an object that may, or may not, hold a GCR resource. The
close() method is always valid to call, but may do nothing, and because
it extends AC, can always be used with TWR. Users can exploit their
knowledge of what resources are held to avoid calling close() if they
like, and tools should be cognizant of this. [2] Frameworks are likely
to want to be conservative and always close, but top-level user code can
use its own judgment.
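As a rough sketch of these semantics: because close() is redeclared without a throws clause, a MumbleCloseable can sit in a try-with-resources block with no catch or throws needed. The interface is nested here only so the example compiles on its own; Holder and closeWithTwr are hypothetical names, not part of the proposal.

```java
public class MumbleCloseableDemo {

    /** Stand-in for the proposed interface; the name is a placeholder. */
    public interface MumbleCloseable extends AutoCloseable {
        @Override
        void close(); // narrows AutoCloseable.close(): no checked exceptions
    }

    /** Hypothetical object that may or may not own a resource. */
    public static final class Holder implements MumbleCloseable {
        private final boolean ownsResource;
        private boolean released = false;

        public Holder(boolean ownsResource) {
            this.ownsResource = ownsResource;
        }

        public boolean isReleased() {
            return released;
        }

        @Override
        public void close() {
            // Always valid to call; does nothing when there is nothing to release.
            if (ownsResource) {
                released = true;
            }
        }
    }

    /** Uses try-with-resources; no catch block is required. */
    public static boolean closeWithTwr(boolean ownsResource) {
        Holder holder = new Holder(ownsResource);
        try (Holder h = holder) {
            // ... work with h ...
        }
        return holder.isReleased();
    }

    public static void main(String[] args) {
        System.out.println(closeWithTwr(true));
        System.out.println(closeWithTwr(false));
    }
}
```

A caller who knows the Holder owns no resource could equally skip the try block; the point of the interface is that both choices are legitimate.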
We would make Stream (and friends) extend MumbleCloseable. The behavior
of close() would include that subsequent invocations of intermediate or
terminal stream operations would fail with an exception, and close
handlers would be run at the time close() is called.
There would be a way to register an additional close-handler on a
stream, so that methods like Files.walk() could set up the stream so that
closing it would close the underlying DirectoryStream. It is currently
a bikeshed exercise to come up with the right spelling here.
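For illustration, here is how close-handler registration might look from the caller's side. This sketch assumes the spelling onClose(Runnable), which is the form that later shipped on Stream in Java 8; at the time of this note the name was still open. OnCloseDemo and its methods are made-up names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public class OnCloseDemo {

    private static final List<String> NAMES = Arrays.asList("a", "b", "c");

    /** The handler runs when close() is called, here via try-with-resources. */
    public static boolean handlerRunsOnClose() {
        AtomicBoolean released = new AtomicBoolean(false);
        try (Stream<String> s = NAMES.stream().onClose(() -> released.set(true))) {
            s.filter(n -> !n.equals("b")).forEach(n -> { });
        }
        return released.get();
    }

    /** A terminal operation by itself does not close the stream. */
    public static boolean handlerRunsWithoutClose() {
        AtomicBoolean released = new AtomicBoolean(false);
        NAMES.stream()
             .onClose(() -> released.set(true))
             .forEach(n -> { });
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println("with close():    " + handlerRunsOnClose());
        System.out.println("without close(): " + handlerRunsWithoutClose());
    }
}
```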
As always, the hardest part of this is coming up with the right spelling
for MumbleCloseable. [3]
As it turns out, the implementation within Streams is simple, clean, and
imposes minimal incremental performance overhead, since it can piggyback
on existing features. Calling close() on a stream which has no
resources is a handful of field reads and writes; maintaining support
for closeability, if not used, costs one more field and no additional
checks on other code paths. An implementation is here:
http://cr.openjdk.java.net/~briangoetz/JDK-8017513/webrev/
You'll see the changes overall are quite small and unintrusive.
OUTSTANDING OBJECTIONS
----------------------
A number of objections to aspects of this have been raised. I will
attempt to catalog and respond to them here.
1. Static analysis tools, specifically the Eclipse inspection, key off
of AutoCloseable and nag the user to close the resource. This in turn
will push users to mangle their code as above for no benefit.
Response: The Eclipse team already has expressed ability and willingness
to tune the inspection to work effectively with java.util.stream.Stream.
2. More generally, users won't know when they see Stream whether they
have to close it or not.
Response: This is reality. A stream might, or might not, wrap a GCR
resource. Our options here seem to be:
- Punt completely, and make these resources non-releasable (bad);
- Punt more subtly, and not provide useful methods like Files.walk (lame);
- Provide a mechanism where users who know what their streams contain
can ensure that resources are released, if they care, while not forcing
users who don't care to pay attention.
By providing a new marker in the form of MumbleCloseable, we can alert
users that the sense of AutoCloseable is flipped; that most of the time,
not closing is likely to be fine.
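The two situations might look like this in user code. This is only a sketch: WalkVersusListDemo and its methods are hypothetical names, and it assumes Files.walk returns a Stream<Path> that holds an open directory handle until the stream is closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class WalkVersusListDemo {

    /** Stream backed by a GCR resource: the caller closes it with try-with-resources. */
    public static long countEntries(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.count(); // includes the root directory itself
        }
    }

    /** Stream backed by a collection: nothing to release, so no try block. */
    public static long countShortWords(List<String> words) {
        return words.stream()
                    .filter(w -> w.length() < 4)
                    .count();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("walk-demo"); // scratch directory
        Path file = Files.createFile(dir.resolve("a.txt"));
        System.out.println(countEntries(dir));
        System.out.println(countShortWords(Arrays.asList("map", "filter", "zip")));
        Files.delete(file);
        Files.delete(dir);
    }
}
```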
3. Calling close() doesn't guarantee that the stream is closed
immediately, or at all. I want to be able to call close() in the middle
of a parallel query and have that short-circuit the query execution, not
unlike closing a SocketInputStream does.
Response: I too would love a hard cancellation mechanism, but we spent a
while on that and didn't come up with something we could all live with.
There are plenty of existing classes with close(), like
ByteArrayInputStream, where close() doesn't guarantee that anything is
closed. So while this is unfortunate, it is merely a minus point.
4. You're doing this at the wrong level; Spliterator and Iterator
should have close() methods instead.
Response (a): Even if they did, Stream would still need a close() method.
Response (b): Not realistic, so let's not even waste time musing on this.
5. I don't like the name onClose(closeAction).
Response: All bikesheds could use more paint. Better suggestions welcome!
[1] This is not unlike many other cross-cutting concerns that we live
with every day, like aliasing, mutability, or thread-safety. Whether or
not you have to synchronize before invoking a method on a List is purely
a function of decisions that have been made by the owner of that list,
and is not reflected in the language or type system. Users have to
reason about "is this object shared" and "what synchronization protocol
does it use", but this is usually tractable, since the reference is
typically confined to code that knows the answer. The same
applies to "what resource is backing this stream."
[2] We should also explore, immediately at the conclusion of this
exercise, whether it makes sense to have some annotations that go with
MumbleCloseable to indicate "I know I am / am not returning something
that ought to be closed." This would help address the concerns raised
by Stephan of "How do we make the static detectors even better, rather
than just turning them off when they see a MumbleCloseable."
[3] One possible name is MaybeCloseable, but I don't like that much
because it suggests "maybe not closeable" which makes users wonder "OK,
then how do I know if I can call close()?" We want something that
suggests you can always call close, but don't always have to. Best
suggestion so far is "AdvisoryCloseable" -- thanks Doug -- but taking
additional suggestions.
I suspect the hardest problem here is the name. Which isn't a terrible
place to be.