RFR: 8367561: Getting some "header" property from a file:// URL causes a file descriptor leak
Jaikiran Pai
jpai at openjdk.org
Sat Oct 4 16:29:12 UTC 2025
Can I please get a review of this change which proposes to fix a file descriptor leak in `sun.net.www.protocol.file.FileURLConnection`?
The bug affects all JDK versions since Java 8 through mainline JDK. Furthermore, the issue isn't in any way related to "jar:" URLs or the JAR resources caching involved in the `JarURLConnection`.
`sun.net.www.protocol.file.FileURLConnection` is a `java.net.URLConnection`. For this issue, the APIs on `URLConnection` that are of interest are `connect()`, header related APIs (the getHeaderXXXX(), getLastModified() and similar APIs) and `getInputStream()`.
The existing implementation in `FileURLConnection` in its `connect()` implementation does readability checks on the underlying `File` instance. If the `File` is a directory then the readability check is done by verifying that a `File.list()` call does not return null. On the other hand, if the File is not a directory, then connect() constructs a temporary `java.io.FileInputStream` for the File and lets the FileInputStream's constructor implementation do the necessary readability checks. In either case, if the readability checks fail, then an IOException is thrown from this method and the FileURLConnection stays unconnected.
One important detail of the implementation in `FileURLConnection.connect()` is that the `FileInputStream` that it creates for the regular-file readability check, it keeps it open when it returns from `connect()`. `connect()` itself doesn't return any value, so this `FileInputStream` that was created for readability check remains open without the application having access to it, unless the application calls `(File)URLConnection.getInputStream()`. The implementation of `FileURLConnection.getInputStream()` returns this previously constructed `InputStream` if `connect()` had previously suceeded with its readability checks. The application, as usual, is then expected to `close()` that `InputStream` that was returned from `URLConnection.getInputStream()`. This is the "normal" case, where the application at some point calls the `URLConnection.getInputStream()` and closes that stream.
As noted previously, `URLConnection` has a few other APIs, for example the APIs that provide header values. An example:
Path regularFile = Path.of("hello.txt");
URLConnection conn = regularFile.toUri().toURL().openConnection();
// either of the following header related APIs
long val = conn.getLastModified();
String val = conn.getHeaderField("foobar");
// ... some other header related APIs
The issue/leak arises in these cases. Application code constructs a `URLConnection` and invokes these trivial APIs and neither does an explicit `connect()` nor calls `getInputStream()`, because it doesn't have to. In code like this, the implementation of `FileURLConnection` internally in the implementation of `getLastModified()` and other similar APIs, does a `connect()`. As explained above, the implementation of `connect()` then creates and leaves around a `FileInputStream` after doing readability checks. In this above application code, the `InputStream` never reaches the application code nor does it closed, thus leading to the leak.
The specification of `URLConnection` states:
> The actual connection to the remote object is made, using the {@link #connect() connect} method.
> The remote object becomes available. The header fields and the contents of the remote object can be accessed.
> ...
> Invoking the {@code close()} methods on the {@code InputStream} or {@code OutputStream} of an
> {@code URLConnection} after a request may free network resources associated with this
> instance, unless particular protocol specifications specify different behaviours
> for it.
It can be argued that it seems to expect that the application call `URLConnection.getInputStream()` and then close() it to close the resources, even in the case where it isn't interested in the `InputStream` or its contents. But, I think, it's a bit odd to be expecting applications to be doing that, if at all that's what is expected. So the change in this PR proposes to address the leak that can arise when the application never calls `URLConnection.getInputStream()`.
The change in this PR, keeps the readability checks in the `connect()` method. However, the `FileInputStream` that is opened to do the readability check is closed before returning from the `connect()` method. That way, there is no `InputStream` lying around. When/if the application calls `URLConnection.getInputStream()` then the implementation in that method has been updated to construct and return the `FileInputStream`. This does mean that there's a change in behaviour for the following case (I consider it a corner case):
- URLConnection is constructed for a "foo.txt" file which is readable
- URLConnection.getLastModified() (or similar header related APIs) is invoked on that URLConnection. Internally that results in a connect() and file readability check and that succeeds.
- Behind the scenes on the filesystem (or from some other part of the application), "foo.txt" file's permissions are modified to remove "read" permission.
- URLConnection.getInputStream() is invoked on that previously constructed (and connected) URLConnection.
Before the changes in this PR, the above scenario would result in the `URLConnection.getInputStream()` returning normally a `InputStream` instance (that was cached during the connect() call). The application would then be able to normally/successfully read content from that `InputStream`. With the proposed change in this PR, the `URLConnection.getInputStream()` call will throw a `FileNotFoundException`, when it attempts to construct the `FileInputStream` and noticing that the file doesn't have "read" permission. `URLConnection.getInputStream()` is specified to throw an `IOException`, so the exception (and its type) itself isn't a problem. It however is still a change in behaviour. Having said that, my opinion is that changing file permissions in between API calls on a `URLConnection` is a corner case and I think it might be OK if there is a change in behaviour here (along with a release note maybe), especially since I think the proposed change prevents a leak in a "normal" usage of t
he `URLConnection` APIs in the application code (and also helps the internal implementation in `FileURLConnection` to have a more clearly demarcated lifetime for the `InputStream` instance).
Additional care has been taken to make sure that no other change in behaviour is introduced. 2 new jtreg tests have been introduced. The `FileURLConnStreamLeakTest` is a regression test which reproduces the leak and verifies the fix - several test methods in this test will fail on Windows without the fix and will pass with the fix. The `GetInputStreamTest` is a new test which has been introduced to help improve the coverage of testing for the `InputStream` returned from the `FileURLConnection`.
tier1, tier2 and tier3 tests pass with this change.
-------------
Commit messages:
- 8367561: fix inputstream leak
- 8367561: introduce tests
Changes: https://git.openjdk.org/jdk/pull/27633/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27633&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8367561
Stats: 515 lines in 4 files changed: 483 ins; 11 del; 21 mod
Patch: https://git.openjdk.org/jdk/pull/27633.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/27633/head:pull/27633
PR: https://git.openjdk.org/jdk/pull/27633
More information about the net-dev
mailing list