4206909 - adding Z_SYNC_FLUSH support to deflaters
Brandon Long
blong at google.com
Thu Sep 3 20:41:37 UTC 2009
On 09/03/09 Martin Buchholz uttered the following other thing:
> The zlib and Deflater API is hard to understand.
> It seems to me that the proper place to specify the flushing behavior is
> when calling setInput, not when calling one of the deflate methods.
>
> I'm imagining something like
> setInput(byte[]b, int off, int len, int flags)
> or more java-esque
> enum FlushingBehavior {
> NO_FLUSH, SYNC_FLUSH, FULL_FLUSH, FINISH }
> setInput(byte[]b, int off, int len, FlushingBehavior)
Looking at DeflaterOutputStream, setInput is called in write, but flush
is in a very separate place.
We're using this to implement a compressed line oriented network
protocol, and so we're likely to have buffering (multiple writes) and
then a flush when we're done, we're not building up the entire response
and calling write just once.
So, I'm not sure how you'd get the expected "stream" behavior out of
DeflaterOutputStream if you put the flushing behavior on setInput.
Having Deflater.deflate take a flush type more closely mimics the
zlib/jzlib deflate function. Having a separate flush call more closely
matches finish/finished on the API.
GNU's classpath "fixes" this behavior, and their API is a flush call on
Deflater (though no flushed, so they're Deflater.deflate probably loops)
Looking at Android's JDK, it has the same bug as the OpenJDK (no flush).
Anyways, after Martin's review, I changed it to syncFlush and passed
tightened up some other things.. and dropped the
InflaterInputStream.available change which was similar to yours (he felt
that would be harder to get in). I guess it partially depends on
whether we're targetting OpenJDK 6 or 7.
Brandon
> Martin
>
> On Thu, Aug 27, 2009 at 23:46, Xueming Shen<Xueming.Shen at sun.com> wrote:
> > Another long over-due zip RFE
> > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4206909
> > with 93 votes.
> >
> > Background info:
> >
> > zlib has following 6 allowed flush values when invoking deflate()/inflate
> >
> > #define Z_NO_FLUSH 0
> > #define Z_PARTIAL_FLUSH 1 /* will be removed, use Z_SYNC_FLUSH instead */
> > #define Z_SYNC_FLUSH 2
> > #define Z_FULL_FLUSH 3
> > #define Z_FINISH 4
> > #define Z_BLOCK 5
> >
> > The values are not really used by inflate(), so we only need to worry about
> > the
> > deflate().
> >
> > Our implementation now uses Z_NO_FLUSH for all deflation except when
> > "finish()"
> > is invoked, in which case we use Z_FINISH.
> >
> > When Z_NO_FLUSH is used, the deflater may accumulate/pend/hold input data in
> > its internal
> > buffer for better compression (in which the existing output data might not
> > be a "complete"
> > block), some applications (as those mentioned in the bug report)however
> > might need to flush
> > all pending data out so the output data is a "complete" block, therefor the
> > corresponding
> > de-compressor can work on those data.
> >
> > Z_SYNC_FLUSH and Z_FULL_FLUSH achieve this. The difference is that the
> > "full_flush" will
> > reset the existing hash table and start a new one for the coming new data,
> > which is really
> > not very useful. And the only difference compared to current "reset()" is
> > the reset() will
> > need to reallocate the memory, not a big deal.
> >
> > See zlib.h for more details (?)
> >
> > The change in Deflater is actually relatively easy, see the webrev
> >
> > http://cr.openjdk.java.net/~sherman/zipflush/webrev
> >
> > I would like to hear your opinion on
> >
> > (1)I'm proposing to simply add a "flush()" method to support the
> > Z_SYNC_FLUSH only. The
> > alternative is to have a flush(int) method + 2 constant sync_flush and
> > full_flush)
> >
> > (2)What should we do to the flush() method in DeflaterOutputStream class.
> > Ideally it should
> > be overridden like in the webrev, with API doc to explain that it does 2
> > things (1) flush
> > the deflater (2)flush the underlying outputstream, I believe this is what
> > the developer is
> > expecting and what the API should have been implemented at first place.
> > However this definitely
> > is an "incompatible" change, given the existing "flush the underlying os
> > only" behavior has
> > been in place for decade. I just wonder which approach we should take. I
> > tend to leave it
> > un-touched and do our best to "educate" the developer to go with
> >
> > 79 static class FlushableDOS extends DeflaterOutputStream {
> > 80 public FlushableDOS(OutputStream out) {
> > 81 super(out);
> > 82 }
> > 83 84 public void flush() throws IOException {
> > 85 int len = 0;
> > 86 while ((len = def.flush(buf, 0, buf.length)) > 0) {
> > 87 out.write(buf, 0, len);
> > 88 }
> > 89 out.flush();
> > 90 }
> > 91 }
> >
> > We can add something in the class description to explain about the "flush"
> >
> > (3)During the debug, I noticed the InflaterInputStream.available() issue.
> > The method is currently specified as "0 if EOF 1 otherwise", which does not
> > look right. The BufferedReader.readLine() in test case Flush()is blocked in
> > StreamDecoder#325, the inReady() at ln#323 keeps returning true for the
> > inputstream (InflaterInputStream). I believe this available() should be
> > specified/implemented as
> >
> > 0 if EOF, in.available() if inf.needInput() is true, otherwise 1.
> >
> > Opinion? This is going to be a separate bugid, if we agree to fix it. Btw,
> > how to write above spec in English?:-)
> >
> > Sherman
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
--
"I think, therefore, I am confused." -- RAW
http://www.fiction.net/blong/
More information about the core-libs-dev
mailing list