RFR [7021870] GzipInputStream closes underlying stream during reading
Xueming Shen
xueming.shen at oracle.com
Fri May 10 22:48:11 UTC 2013
The proposed fix does solve the issue.
However it appears it does not fix the root cause that triggers this problem.
The root cause, or the direct trigger of this bug is the fact that ZipInputStream
.availble() is really NOT reliable to tell us whether or not there is any byte
"available" to be read from the input stream. The api doc of the ZIS.available()
specifies "Returns 0 after EOF has reached for the current entry data, otherwise
always return 1.". The implementation is straightforward,
public int available() throws IOException {
if (entryEOF) {
return 0;
} else {
return 1;
}
}
However, the flag "entryEOF" is only set if a read attempt has been tried
and failed (see ZIS.read()), which means if a previous read of the entry
succeeded and it actually read all the available bytes from the entry (with
a big enough buffer), there is 0 byte available for read, the "flag" is not
set, so if you invoke zis.available(), it return 1, but a read() invocation will
returns -1 (yes, if you then try zis.available() again , it returns the expected
0). The test below demonstrates this issue.
public static void main(String args[]) throws IOException {
ZipInputStream zis = new ZipInputStream(new FileInputStream(args[0]));
ZipEntry ze;
while((ze = zis.getNextEntry()) !=null) {
System.out.println("--------------" + ze.getName() + "--------");
byte[] buf = new byte[8102];
while (true) {
System.out.println(" zis.available()=" + zis.available());
int n = zis.read(buf, 0, buf.length);
System.out.println(" readN=" + n);
if (n == -1)
break;
}
}
}
It is arguable that the this is not an implementation bug, since it is reasonable
to argue that the EOF has not been "really" reached yet after the first try, the
implementation happens to return all available bytes, so the next read try finally
"reached" the EOF, but this obviously is not the expectation.
Unfortunately the fix we put in for #4691425 [1] logically depends on
in.available() to decide whether or not we need to read a little further, which
directly triggers this bug when the ZIS.available() "lies".
// If there are more bytes available in "in" or
// the leftover in the "inf" is > 26 bytes:
// this.trailer(8) + next.header.min(10) + next.trailer(8)
// try concatenated case
if (this.in.available() > 0 || n > 26) {
....
}
Not only ZInputStream has this issue, its parent class InflaterStream (and
its sibling GZIPInputStream, GZS inherits InflaterStream's available()
implementation directly) has the same issue, I do have a bug#7031075
filed against GZIPInputStream.
So the proposed fix is more a workaround for this available() issue. The
alternative is to fix the issue directly, for example, to change the ZIS's
available implementation to something like
public int available() throws IOException {
ensureOpen();
if (entryEOF || entry == null)
return 0;
switch (entry.method) {
case DEFLATED:
return (inf.finished() || inf.needsDictionary()) ? 0 : 1;
case STORED:
return remaining > 0 ? 1 : 0;
default:
throw new ZipException("invalid compression method");
}
}
we probably should go further to simply remove the flag "entryEOF"
and move the "deflated" case implementation into InflaterInputStream
(to fix 7031075 as well).
-Sherman
[1] http://cr.openjdk.java.net/~sherman/4691425/webrev/
[2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7031075
On 05/10/2013 01:03 AM, Ivan Gerasimov wrote:
> Hello everybody!
>
> GzipInputStream uses SequenceInputStream to concatenate the underlying stream with some data that reside in memory.
> SequenceInputStream is implementing in such a way that it closes the stream when it reaches EOS.
>
> The solution is to wrap the underlying stream with extended FilterInputStream that overrides the close() method.
>
> BUG: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7021870
> WEBREV: http://cr.openjdk.java.net/~dmeetry/7021870/webrev.0/ <http://cr.openjdk.java.net/%7Edmeetry/7021870/webrev.0/>
>
> Sincerely your,
> Ivan
>
More information about the core-libs-dev
mailing list