RFR: 8322256: Define and document GZIPInputStream concatenated stream semantics [v8]
Archie Cobbs
acobbs at openjdk.org
Tue Jul 30 18:09:35 UTC 2024
On Tue, 30 Jul 2024 17:35:33 GMT, Lance Andersen <lancea at openjdk.org> wrote:
> Based on the above, I am reluctant to change the current behavior given it appears to have been modeled after gzip/gunzip as well as WinZip.
That's a reasonable conclusion... and I have no problem with preserving the existing behavior if that's what we want. But in that case I would lobby that we should also provide some new way to configure a `GZIPInputStream` that guarantees reliable behavior.
The key question here is: "Exactly what current behavior of `new GZIPInputStream(in)` do we want to preserve?"
Let's start by assuming that we want your above test to pass. Putting that into words: "Given a single GZIP stream followed by trailing garbage, `new GZIPInputStream(in)` should successfully decode the GZIP stream and ignore the trailing garbage".
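To make that scenario concrete, here is a minimal sketch of such a test. The class name, helper names, and garbage bytes are made up for illustration; only `GZIPInputStream` and `GZIPOutputStream` are real JDK APIs, and the result reflects the existing behavior under discussion, not any proposed change.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class TrailingGarbageDemo {

    // Compress a string into one complete GZIP stream (header, deflate data, trailer).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Build "GZIP stream + garbage" and decode it with the one-argument constructor.
    static String decodeWithGarbage(String text, byte[] garbage) throws IOException {
        ByteArrayOutputStream input = new ByteArrayOutputStream();
        input.write(gzip(text));
        input.write(garbage); // bytes that do not form a GZIP header
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(input.toByteArray()))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] garbage = {1, 2, 3, 4, 5}; // illustrative garbage, chosen arbitrarily
        // The garbage is silently dropped; only the decoded payload is returned.
        System.out.println(decodeWithGarbage("hello", garbage));
    }
}
```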
Note however that what `new GZIPInputStream(in)` currently provides is stronger than that:
1. Trailing garbage is ignored
2. Any `IOException` thrown while reading trailing garbage is ignored
3. Concatenated streams are automatically decoded
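The second and third behaviors in the list above can be observed with a sketch like the following. `FailingStream` is a hypothetical test double, not a JDK class: it serves its bytes, then throws on any further read while `available()` keeps claiming more data, which is what prompts `GZIPInputStream` to look for another header. The outcome shown is for JDK releases that predate any change discussed in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DefaultBehaviorDemo {

    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] all = new byte[a.length + b.length];
        System.arraycopy(a, 0, all, 0, a.length);
        System.arraycopy(b, 0, all, a.length, b.length);
        return all;
    }

    // Hypothetical test double: serves 'data', then fails on every further read.
    static final class FailingStream extends InputStream {
        private final byte[] data;
        private int pos;

        FailingStream(byte[] data) { this.data = data; }

        @Override public int read() throws IOException {
            if (pos >= data.length) throw new IOException("simulated I/O failure");
            return data[pos++] & 0xff;
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (pos >= data.length) throw new IOException("simulated I/O failure");
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }

        // Always claim more data so the decoder attempts to read a next header.
        @Override public int available() { return 1; }
    }

    static String decode(InputStream raw) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(raw)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Concatenated streams are decoded as one continuous payload.
        byte[] two = concat(gzip("hello "), gzip("world"));
        System.out.println(decode(new ByteArrayInputStream(two)));

        // The IOException raised after the first stream never reaches the caller.
        System.out.println(decode(new FailingStream(gzip("hello"))));
    }
}
```

The second call is the troubling one: a genuine I/O failure on the underlying stream is indistinguishable from a clean end of input.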
So we know we want to preserve 1. What about 2 and/or 3? Your thoughts?
My personal opinions:
* I think 2 is inherently bad and it should not be implemented in any variant
* I think 3 is not required _by default_, but one should be able to enable it somehow
If we were to accept those opinions (preserving only 1), then we would end up at the same place as `GzipCompressorInputStream`:
* Underlying `IOException`s are never suppressed
* `new GZIPInputStream(in)` decodes only one GZIP stream and ignores any trailing garbage
* `new GZIPInputStream(in, true)` decodes concatenated streams; trailing garbage causes `IOException`
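For illustration only, the "decode one stream, ignore whatever follows" semantics can be approximated today with a raw `Inflater`. This is a rough sketch, not a proposed implementation: `decodeFirstMember` is a hypothetical helper, it assumes the fixed 10-byte header with no optional fields that `GZIPOutputStream` writes, and it skips the CRC and length checks in the trailer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

public class SingleMemberSketch {

    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Hypothetical helper, not a JDK API. Decodes the first GZIP stream only.
    // Assumption: 10-byte header with FLG == 0; trailer is not verified.
    static byte[] decodeFirstMember(byte[] gz) throws DataFormatException {
        if (gz.length < 10 || (gz[0] & 0xff) != 0x1f || (gz[1] & 0xff) != 0x8b
                || gz[3] != 0) {
            throw new DataFormatException("missing or unsupported GZIP header");
        }
        Inflater inf = new Inflater(true); // true = raw deflate, no zlib wrapper
        try {
            inf.setInput(gz, 10, gz.length - 10);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[512];
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new DataFormatException("truncated deflate data");
                }
                out.write(buf, 0, n);
            }
            // Everything after the first deflate stream is left unread.
            return out.toByteArray();
        } finally {
            inf.end();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] first = gzip("hello ");
        byte[] second = gzip("world");
        byte[] both = new byte[first.length + second.length];
        System.arraycopy(first, 0, both, 0, first.length);
        System.arraycopy(second, 0, both, first.length, second.length);
        // Only the first stream's payload is returned.
        System.out.println(new String(decodeFirstMember(both), StandardCharsets.UTF_8));
    }
}
```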
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18385#issuecomment-2258919532
More information about the core-libs-dev mailing list