Early version of striding Deflater

Clemens Eisserer linuxhippy at gmail.com
Thu Mar 6 11:18:38 UTC 2008


Hi Dave,

Thanks a lot for your detailed review, and for fixing the warnings and
problems on solaris.
Unfourtunatly I am quite busy for now (Math exam is coming and I am
really bad in maths) - but I'll continue working on it as soon as
times permits.

Thanks a lot, lg Clemens

> There's a slight change to the semantics of "finished".  Current code sets
> Deflater.finished only if setParams is false.  Changed code may set it
> regardlewss of setParams.  I suspect this is OK: if client code changed
> strategy or level and called Deflater.deflate(), it would invoke
> deflateParams(), and not alter the value of finished.  The client's subsequent
> call to Deflater.deflate() would call deflate() which AFAICT would cause
> finished to be set.  What do you think?
>
> Why fall through from Z_OK to Z_BUF_ERROR?  In Z_BUF_ERROR and default cases,
> why continue execution of loop instead of returning 0 as does original code?
> I changed this; see attached Deflater.c
>
> *** Deflater.java: OK
>
> *** DeflaterOutputStream.java: OK
>
> More inline below.
>
> Clemens Eisserer wrote:
> > Hi Dave,
> >
> > Thanks a lot for your reply.
> > To make it short: Of course I understand that this is low-priority
> > (also for me, its a fun-only fix because someone in forums.java.net
> > mentioned it) so don't hurry.
> > Sorry that I wasted your time with my messy files, they were taken
> > from my "playground" thats why they were in such a bad shape - they
> > were only intended to give an idea which "road" I was taking. I
> > attached the new files taken from the mercurial repositories and only
> > modified at the affected places.
> >
> >> With a change of this sort, we really do need tests along with a fix.  Have
> >> you started writing any test cases?
> > I completly agree - I have some simple test-cases which test more or
> > less only very basic functionality of Deflater and they work well
> > (also FlatterTest passes).
> > I'll write some more tests which test exotic use-cases like changing
> > compression-level, ... during compression.
>
> Great, thanks.  It would be a Good Idea to have a test that checks my
> assumption re finished (see above).
>
> > I have some open questions:
> > 1.) Is the seperate structure approach to hold the stride-buffers ok?
>
> I think so.
>
> > 2.) Any suggestions for the following names: 1. strm-field in class
> > (defAdr), 2. defAdr-parameter,3. defptr -  long_to_ptr of defAdr, 4.
> > def_data - name of the structure
>
> Those don't quite match what I see in the code; but what's in the code seems OK:
> def_data for the struct,
> def_adr as a param to deflateBytes, etc. etc.
> defptr to reference a def_data in init and deflateBytes
>
> > 3.) I am not really used to program in C. Are the adress-operations ok
> > which I used to get members of the new struct def_data?
>
> It seems OK.
>
> > Thanks for your patience, lg Clemens
> >
> > Some notes, and changes in ramdom order:
> > * Changed deflate-bytes to the old behaviour to return after the call
> > to deflateParams
>
> Good; AFAICT at maintains the existing semantics.
>
> > * Verified that its ok to call deflateParams when there's not enough
> > space in the output-buffer to flush all "old" data out (thanks to Mark
> > Adler)
> > * I changed the method-signiture of the native method compared to
> > original, because some variables were read from JNI-code, whereas they
> > could have been passed simply down using method parameters. I think
> > its "cleaner" to pass it.
>
> The long argument to deflateBytes is a bit cumbersome, but Ken Russell opined
> that it provides better performance, so it's a Good Thing:
>
>  > Kenneth Russell wrote:
>  > I strongly agree with the contributor's suggestion. Not only is passing
>  > the argument from Java less code, it is also faster since field access
>  > from Java can be optimized by the HotSpot compiler, where field accesses
>  > through the JNI must go through the same set of boilerplate
>  > C/C++/assembly every time.
>
> > * Allocation of the stride-buffers together with the z_stream
> > structure. z_stream is really large, so the two stride-buffers should
> > not add that much overhead. However this has the advantage of not
> > mallocing/freeing and also beeing able to fill the input-stride-buffer
> > once for several calls of the native method.
>
> Looks good.
>
> > * Renamed the strm-adress-parameter to defadr, because it no longer
> > really points to a strm. I did not rename the java field "strm"
> > because I did not have an idea for a proper name.
>
> It should have had a different name from day one.  I'm slightly loathe to make
> a name change, nor do I have (Friday afternoon) a Really Good Idea.
>
> > * Removed striding from DeflaterOutputStream, (looked how code looked in 1.4.2).
>
> Looks good.
>
>  From your other email:
>
>  > I also thought about implementing striding in the CRC32/Adler32
>  > classes which basically suffer from the same block-the-gc behaviour as
>  > Inflater/Deflater did before they were "fixed" ;)
>
> I suspect these are not as much of an issue: presumably CRC32 and adler32
> calculations are faster than deflation (caveat: I haven't measured them).
> Other have pointed out / shamed us because, hey, shouldn't these be in Java
> anyway?
>
>  > Furthermore do you have good ideas for regression tests?
>  > The usual compression/decompression works fine, can you imagine
>  > corner-cases which would be worth special testing?
>  > Should the tests written in the jtreg format?
>
> I'd like to see tests where there's a possibility of semantics having changed;
> as noted above re "finished".
>
> I haven't done a performance analysis, but don't expect a regression.  If
> anything, since the striding is kept within native code, there should be fewer
> Java -> native calls, and better performance (though that is perhaps not
> measurable).
>
> My last comment is about this change in general.  It seems like a reasonable
> fix, though the corresponding bug:
>      http://bugs.sun.com/view_bug.do?bug_id=6399199
> is a low priority one for us, and this is code we generally feel is best left
> alone when possible.  We are still learning how best to work with
> contributions from outside of Sun.  I will check with other who've maintained
> this code in the past, to get their opinion of making this kind of change.
> While I'm currently responsible for jar/zip code, it's only one of the hats I
> currently wear ;-)
>
> Thanks,
>         Dave
>



More information about the core-libs-dev mailing list