RFR: 8322256: Define and document GZIPInputStream concatenated stream semantics [v2]

Lance Andersen lancea at openjdk.org
Fri Aug 30 13:05:19 UTC 2024


On Fri, 30 Aug 2024 11:44:09 GMT, Alan Bateman <alanb at openjdk.org> wrote:

> > The gnu.org docs cover this (concatenating gzip files) as part of its [advanced usage of gzip](https://github.com/openjdk/jdk/pull/20787#issuecomment-2320873616), so I don't think we need to do any more archeology
> 
> Okay, but just very surprising that support was added in JDK 7 without changes to the API docs or other documentation (from a quick search).
> 

I am not sure either, but the implementation is pretty much in line with gzip/gunzip.

> If we are retrofitting the API docs then I think we should treat "GZIP" as a file format. It may require adding overrides so there is a place to document the behavior when reading an entry rolls over into the next stream.

Perhaps an additional blurb (or APINote) in the existing GZIPInputStream::read, but I think the verbiage that is being proposed, with a couple of tweaks, covers the behavior.

BTW, this is what gzip/gunzip does:


bats % ls
Bruce.txt       Robin.txt       batman.txt      hello.txt    

bats % gzip -c batman.txt  > iambatman.gz
bats % gzip -c Bruce.txt  >> iambatman.gz
bats % gzip -c hello.txt  >> iambatman.gz
bats % gzip -c Robin.txt >> iambatman.gz   
bats % gunzip -c iambatman.gz 
I am batman
Bruce Wayne here
hello
Robin here

bats % gzip -c batman.txt  > iambrokenbat.gz
bats % gzip -c Bruce.txt >> iambrokenbat.gz 
bats % cat hello.txt >> iambrokenbat.gz 
bats % gzip -c Robin.txt >> iambrokenbat.gz

bats % gunzip -c iambrokenbat
I am batman
Bruce Wayne here
gunzip: iambrokenbat.gz: trailing garbage ignored


bats % gunzip iambatman    

bats % ls
Bruce.txt       Robin.txt       batman.txt      batman.txt.orig hello.txt       iambatman       iambrokenbat.gz
bats % cat iambatman 
I am batman
Bruce Wayne here
hello
Robin here

bats % gunzip iambrokenbat    
gunzip: iambrokenbat.gz: trailing garbage ignored
bats % ls
Bruce.txt       Robin.txt       batman.txt      batman.txt.orig hello.txt       iambatman       iambrokenbat
bats % cat iambrokenbat 
I am batman
Bruce Wayne here

bats %  gunzip -l iambatman.gz 
  compressed uncompressed  ratio uncompressed_name
         167           11 -99.9% iambatman
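
For comparison, here is a minimal Java sketch of the same clean-concatenation case (the iambatman.gz scenario) read through GZIPInputStream. The class name `ConcatenatedGzipDemo` and its helper methods are made up for illustration and are not part of the PR; it builds the members in memory instead of using files, and it assumes Java 9+ for `InputStream.readAllBytes`. It deliberately does not cover the trailing-garbage case, since that is the behavior still being discussed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ConcatenatedGzipDemo {

    // Compress one string into a complete, standalone gzip member,
    // roughly what `gzip -c file` writes to stdout.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Append the members back to back, like `gzip -c a > x.gz; gzip -c b >> x.gz`.
    static byte[] concatenate(String... texts) throws IOException {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        for (String text : texts) {
            all.write(gzip(text));
        }
        return all.toByteArray();
    }

    // Read everything GZIPInputStream is willing to return from the data.
    static String gunzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = concatenate(
                "I am batman\n", "Bruce Wayne here\n", "hello\n", "Robin here\n");
        // A single GZIPInputStream returns the content of all four members.
        System.out.print(gunzip(data));
    }
}
```

As with `gunzip -c iambatman.gz`, the caller sees one continuous stream and gets no indication of where one member ends and the next begins.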

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20787#issuecomment-2321182631

