Concatenated .gz files/streams

Martin Buchholz martinrb at google.com
Sun May 23 19:07:47 UTC 2010


First, let me apologize for my poor judgment back in 2006.
Concatenated gzip members are clearly supported by rfc1952,
the gzip command, and by the many votes for bug 4691425.
The JDK should support this feature as well.
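
As a minimal sketch of the behavior at stake, two independently written
gzip members are concatenated and read back as one stream. The class and
method names are made up for illustration, and the sketch assumes a JDK
whose GZIPInputStream already includes this fix (and Java 7 syntax):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ConcatenatedGzipDemo {
    // Compress one string into a single, complete gzip member.
    static byte[] gzipMember(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Append b to a, the equivalent of "cat a.gz b.gz".
    static byte[] concat(byte[] a, byte[] b) {
        byte[] both = new byte[a.length + b.length];
        System.arraycopy(a, 0, both, 0, a.length);
        System.arraycopy(b, 0, both, a.length, b.length);
        return both;
    }

    // Decompress everything GZIPInputStream is willing to return.
    static String readAll(byte[] gzipData) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(gzipData))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] both = concat(gzipMember("hello, "), gzipMember("world"));
        // With multi-member support this prints the text of both members.
        System.out.println(readAll(both));
    }
}
```

Without multi-member support, the read would stop after the first member.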

One can argue this change doesn't go far enough,
and that one should be able to perform operations
that motivated the existence of the multiple member feature,
like skip to the next gzip member and to inspect the filename
and extra fields in the gzip header, but that's a much bigger
change.

Probably GZIPInputStream shouldn't extend InflaterInputStream,
but it's too late for that.

--

The values returned by readHeader and readTrailer
should be documented.

--

174         int n = 10;
I would write this kind of code as
int n = 2 + 2 + 6;
for clarity.
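
To spell out where the 10 comes from: rfc1952 gives the fixed part of a
member header as ID1, ID2, CM, FLG, MTIME (4 bytes), XFL and OS. The sketch
below names each field. The constant names are hypothetical, and reading
2 + 2 + 6 as "magic, then CM and FLG, then the six skipped bytes" is an
assumption about how readHeader consumes the header:

```java
public class GzipHeaderSize {
    // Fixed-size fields of a gzip member header (rfc1952, section 2.3).
    static final int MAGIC_BYTES = 2; // ID1, ID2 (0x1f, 0x8b)
    static final int CM_BYTES = 1;    // compression method (8 = deflate)
    static final int FLG_BYTES = 1;   // flags
    static final int MTIME_BYTES = 4; // modification time
    static final int XFL_BYTES = 1;   // extra flags
    static final int OS_BYTES = 1;    // operating system

    // Grouped the same way as "2 + 2 + 6".
    static int fixedHeaderSize() {
        return MAGIC_BYTES
             + (CM_BYTES + FLG_BYTES)
             + (MTIME_BYTES + XFL_BYTES + OS_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(fixedHeaderSize()); // prints 10
    }
}
```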

--
Probably readHeader should call crc.reset() twice,
before and after reading the header,
and then remove all calls to crc.reset() after calling
readHeader.
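
A rough sketch of that shape, not the webrev's code: the class is
hypothetical, the header parsing is cut down to the 10 fixed bytes, and the
optional fields (FEXTRA, FNAME, FCOMMENT, FHCRC) are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.ZipException;

public class HeaderCrcSketch {
    private final CRC32 crc = new CRC32();

    // Simplified, hypothetical readHeader: returns the header length.
    int readHeader(InputStream raw) throws IOException {
        crc.reset(); // drop whatever the previous member left behind
        CheckedInputStream in = new CheckedInputStream(raw, crc);
        byte[] fixed = new byte[2 + 2 + 6];
        new DataInputStream(in).readFully(fixed);
        if ((fixed[0] & 0xff) != 0x1f || (fixed[1] & 0xff) != 0x8b) {
            throw new ZipException("Not in GZIP format");
        }
        // A real implementation would read the optional fields here and
        // could compare the header checksum in crc against FHCRC.
        crc.reset(); // start clean for the member's uncompressed data
        return fixed.length;
    }

    // Helper for the demo: CRC value the caller sees after readHeader.
    static long crcAfterHeader(byte[] header) {
        HeaderCrcSketch s = new HeaderCrcSketch();
        try {
            s.readHeader(new ByteArrayInputStream(header));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return s.crc.getValue();
    }

    public static void main(String[] args) {
        byte[] header = {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 0, 0};
        System.out.println(crcAfterHeader(header)); // prints 0
    }
}
```

With both resets inside readHeader, no caller has to remember to reset
the CRC itself.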
--

typos:
OutputStram
jusr
--
 38             int n = rnd.nextInt(100) % 10 + 1;
Isn't this just rnd.nextInt(10) + 1?
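
A small check of that claim, with made-up class and method names. Both
forms yield values in 1..10, and since 100 is a multiple of 10 the modulo
form is not skewed either. The two sequences for a given seed need not
match, which shouldn't matter for a test:

```java
import java.util.Random;

public class RandomRangeDemo {
    // The form used in the test.
    static int viaModulo(Random rnd) {
        return rnd.nextInt(100) % 10 + 1;
    }

    // The simpler form.
    static int direct(Random rnd) {
        return rnd.nextInt(10) + 1;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        for (int i = 0; i < 100000; i++) {
            int a = viaModulo(rnd);
            int b = direct(rnd);
            if (a < 1 || a > 10 || b < 1 || b > 10) {
                throw new AssertionError("out of range: " + a + ", " + b);
            }
        }
        System.out.println("both forms stayed within 1..10");
    }
}
```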
--
The test should close gzis.
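
For instance, with try/finally (on Java 7 and later, try-with-resources
does the same job). This is only a sketch: gzis is the name from the test,
everything else is hypothetical, including the TrackingInput class that
just records whether close() reached the underlying stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CloseGzisDemo {
    // Records whether close() was propagated to the source stream.
    static class TrackingInput extends ByteArrayInputStream {
        boolean closed;
        TrackingInput(byte[] b) { super(b); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Produce a small gzip stream to read back.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            GZIPOutputStream gz = new GZIPOutputStream(out);
            try {
                gz.write(data);
            } finally {
                gz.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Drain the stream, close it in finally, report if the source closed.
    static boolean readAndClose(byte[] gzipData) {
        TrackingInput src = new TrackingInput(gzipData);
        try {
            GZIPInputStream gzis = new GZIPInputStream(src);
            try {
                byte[] buf = new byte[512];
                while (gzis.read(buf) != -1) {
                    // discard the data; only the close matters here
                }
            } finally {
                gzis.close(); // also ends the inflater and closes src
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return src.closed;
    }

    public static void main(String[] args) {
        System.out.println(readAndClose(gzip(new byte[] {1, 2, 3})));
    }
}
```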
--
The test should probably also clean up src, srcBAOS, and srcBytes:
src is written to srcBAOS and then turned back into a byte[],
which seems pointless to me.
--

Martin

On Thu, May 20, 2010 at 15:19, Xueming Shen <xueming.shen at oracle.com> wrote:
> Martin,
>
> Though there was a different opinion back in 2006 that "it is not
> obvious that accepting multiple .gz files concatenated together is
> actually an improvement" :-) it appears rfc1952 clearly specifies
> "A gzip file consists of a series of "members"..." in its "File format"
> section[1], and more importantly :-) some of my new colleagues are very
> interested in having this rfe addressed because there is a real-world
> product that still heavily uses concatenated .gz files. So here is the
> webrev of a reasonable fix and the test case. (The test case only runs
> against the output from GZIPOutputStream. I did test on a couple of .gz
> files from gzip, I just don't want to introduce a gzip dependency into
> the test case.)
>
> http://cr.openjdk.java.net/~sherman/4691425/webrev/
>
> -Sherman
>
> [1] http://www.gzip.org/zlib/rfc-gzip.html#file-format
>


