RFR [7021870] GzipInputStream closes underlying stream during reading
Ivan Gerasimov
ivan.gerasimov at oracle.com
Mon May 13 13:35:15 UTC 2013
A small correction to the recipe for reproducing the failure.
The failure is reproduced when a SINGLE byte is appended to the gzip
archive.
In the scenario below the echo command appends TWO bytes ('a' plus a
trailing newline). These two bytes are interpreted as an invalid gzip
header and are successfully ignored.
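
For illustration only (not part of the original recipe), here is one way to
append exactly ONE byte, written as a small Java sketch; the class name is
made up:

    import java.io.FileOutputStream;
    import java.io.IOException;

    public class AppendOneByte {
        public static void main(String[] args) throws IOException {
            // Open f1.gz in append mode and write a single byte,
            // without the trailing newline that "echo a >> f1.gz" adds.
            try (FileOutputStream out = new FileOutputStream("f1.gz", true)) {
                out.write('a');
            }
        }
    }
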
On 13.05.2013 16:23, Ivan Gerasimov wrote:
> Hello Sherman!
>
> Thank you for the review!
> While I agree with you that the current implementation of
> ZipInputStream.available() is not what it's supposed to be, I doubt it
> is the only cause of the issue.
> Please consider the following example.
> The zip archive (z.zip) consists of two entries: f1.gz and f2.
> f1.gz - a gzip archive with a couple of extra bytes appended (I understand
> this makes it a malformed gzip archive),
> f2 - a one-line text file.
>
> $ echo a > f1
> $ echo b > f2
> $ gzip f1
> $ echo a >> f1.gz
> $ zip z.zip f1.gz f2
>
> import java.io.FileInputStream;
> import java.io.InputStream;
> import java.util.zip.GZIPInputStream;
> import java.util.zip.ZipInputStream;
>
> public class ZipMix {
>     private static byte[] rbuf = new byte[1];
>
>     public static void main(String[] args) throws Throwable {
>         FileInputStream fis = new FileInputStream("z.zip");
>         ZipInputStream zis = new ZipInputStream(fis);
>         System.out.print("-start-\n");
>         if (zis.getNextEntry() != null) {
>             InputStream gis = new GZIPInputStream(zis, 8192);
>             while (gis.read(rbuf) != -1)
>                 System.out.print(new String(rbuf));
>         }
>         if (zis.getNextEntry() != null) { // <--- throws IOException
>             while (zis.read(rbuf) != -1)
>                 System.out.print(new String(rbuf));
>         }
>         System.out.print("\n-finish-\n");
>     }
> }
>
> This code worked well with jdk1.6.20, since GZIPInputStream did not
> try to read beyond the trailer.
> It breaks with the current jdk, since GZIPInputStream tries to read
> yet another gzip header after the first trailer, and
> SequenceInputStream.read() calls the close() method on the underlying
> stream.
> And it would break even with ZipInputStream.available() fixed.
>
> It seems to me that the root cause here is the use of SequenceInputStream,
> which can close the stream during a read() call.
> And this is what my fix was about - preventing the underlying stream
> from being closed.
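
For reference, a minimal sketch of that idea (a hypothetical wrapper class,
not the actual webrev code): wrap the underlying stream so that
SequenceInputStream's close() cannot propagate to it.

    import java.io.FilterInputStream;
    import java.io.InputStream;

    // Sketch only: a wrapper whose close() is a no-op, so closing the
    // internal SequenceInputStream does not close the caller's stream.
    class UncloseableInputStream extends FilterInputStream {
        UncloseableInputStream(InputStream in) {
            super(in);
        }
        @Override
        public void close() {
            // Intentionally empty: prevents SequenceInputStream from closing
            // the underlying stream when it reaches EOS during read().
        }
    }
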
>
> Sincerely yours,
> Ivan
>
>
> On 11.05.2013 2:48, Xueming Shen wrote:
>> The proposed fix does solve the issue.
>>
>> However, it appears it does not fix the root cause that triggers this
>> problem.
>> The root cause, or the direct trigger of this bug, is the fact that
>> ZipInputStream.available() is really NOT reliable for telling us whether
>> or not there is any byte "available" to be read from the input stream.
>> The API doc of ZIS.available() specifies "Returns 0 after EOF has
>> reached for the current entry data, otherwise always return 1.". The
>> implementation is straightforward,
>>
>> public int available() throws IOException {
>>     if (entryEOF) {
>>         return 0;
>>     } else {
>>         return 1;
>>     }
>> }
>>
>> However, the flag "entryEOF" is only set after a read attempt has been
>> made and failed (see ZIS.read()). This means that if a previous read of
>> the entry succeeded and actually consumed all the available bytes of the
>> entry (with a big enough buffer), there are 0 bytes left to read but the
>> flag is not set. So if you invoke zis.available(), it returns 1, yet a
>> read() invocation will return -1 (and yes, if you then try
>> zis.available() again, it returns the expected 0). The test below
>> demonstrates this issue.
>>
>> public static void main(String args[]) throws IOException {
>>     ZipInputStream zis = new ZipInputStream(new FileInputStream(args[0]));
>>     ZipEntry ze;
>>     while ((ze = zis.getNextEntry()) != null) {
>>         System.out.println("--------------" + ze.getName() + "--------");
>>         byte[] buf = new byte[8102];
>>
>>         while (true) {
>>             System.out.println("    zis.available()=" + zis.available());
>>             int n = zis.read(buf, 0, buf.length);
>>             System.out.println("    readN=" + n);
>>             if (n == -1)
>>                 break;
>>         }
>>     }
>> }
>>
>> It is arguable that this is not an implementation bug: one can
>> reasonably argue that EOF has not "really" been reached yet after the
>> first read, that the implementation just happens to return all
>> available bytes, and that the next read attempt finally "reaches" EOF.
>> But this is obviously not what callers expect.
>>
>> Unfortunately, the fix we put in for #4691425 [1] logically depends on
>> in.available() to decide whether or not we need to read a little
>> further, which directly triggers this bug when ZIS.available() "lies".
>>
>>     // If there are more bytes available in "in" or
>>     // the leftover in the "inf" is > 26 bytes:
>>     // this.trailer(8) + next.header.min(10) + next.trailer(8)
>>     // try concatenated case
>>     if (this.in.available() > 0 || n > 26) {
>>         ....
>>     }
>>
>> Not only does ZipInputStream have this issue; its parent class
>> InflaterInputStream (and its sibling GZIPInputStream, which inherits
>> InflaterInputStream's available() implementation directly) has the same
>> issue. I have bug #7031075 [2] filed against GZIPInputStream.
>>
>> So the proposed fix is more of a workaround for this available() issue.
>> The alternative is to fix the issue directly, for example by changing
>> ZIS's available() implementation to something like
>>
>> public int available() throws IOException {
>>     ensureOpen();
>>     if (entryEOF || entry == null)
>>         return 0;
>>     switch (entry.method) {
>>     case DEFLATED:
>>         return (inf.finished() || inf.needsDictionary()) ? 0 : 1;
>>     case STORED:
>>         return remaining > 0 ? 1 : 0;
>>     default:
>>         throw new ZipException("invalid compression method");
>>     }
>> }
>>
>> We should probably go further and simply remove the flag "entryEOF"
>> and move the "deflated" case implementation into InflaterInputStream
>> (to fix 7031075 as well).
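
A rough sketch of what that might look like (illustration only, not code from
the webrev or the JDK): InflaterInputStream would report availability from the
inflater's own state, with no separate entryEOF flag.

    // Hypothetical InflaterInputStream.available(), shown only to
    // illustrate the suggestion above.
    public int available() throws IOException {
        ensureOpen();
        // 1 while the inflater can still produce output, 0 once it is finished
        return (inf.finished() || inf.needsDictionary()) ? 0 : 1;
    }
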
>>
>> -Sherman
>>
>>
>> [1] http://cr.openjdk.java.net/~sherman/4691425/webrev/
>> [2] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7031075
>>
>> On 05/10/2013 01:03 AM, Ivan Gerasimov wrote:
>>> Hello everybody!
>>>
>>> GZIPInputStream uses SequenceInputStream to concatenate the
>>> underlying stream with some data that resides in memory.
>>> SequenceInputStream is implemented in such a way that it closes the
>>> stream when it reaches EOS.
>>>
>>> The solution is to wrap the underlying stream with an extended
>>> FilterInputStream that overrides the close() method.
>>>
>>> BUG: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7021870
>>> WEBREV: http://cr.openjdk.java.net/~dmeetry/7021870/webrev.0/
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>
>>
>>
>
>
>