Early version of striding Deflater
Clemens Eisserer
linuxhippy at gmail.com
Thu Mar 6 11:18:38 UTC 2008
Hi Dave,
Thanks a lot for your detailed review, and for fixing the warnings and
problems on solaris.
Unfourtunatly I am quite busy for now (Math exam is coming and I am
really bad in maths) - but I'll continue working on it as soon as
times permits.
Thanks a lot, lg Clemens
> There's a slight change to the semantics of "finished". Current code sets
> Deflater.finished only if setParams is false. Changed code may set it
> regardlewss of setParams. I suspect this is OK: if client code changed
> strategy or level and called Deflater.deflate(), it would invoke
> deflateParams(), and not alter the value of finished. The client's subsequent
> call to Deflater.deflate() would call deflate() which AFAICT would cause
> finished to be set. What do you think?
>
> Why fall through from Z_OK to Z_BUF_ERROR? In Z_BUF_ERROR and default cases,
> why continue execution of loop instead of returning 0 as does original code?
> I changed this; see attached Deflater.c
>
> *** Deflater.java: OK
>
> *** DeflaterOutputStream.java: OK
>
> More inline below.
>
> Clemens Eisserer wrote:
> > Hi Dave,
> >
> > Thanks a lot for your reply.
> > To make it short: Of course I understand that this is low-priority
> > (also for me, its a fun-only fix because someone in forums.java.net
> > mentioned it) so don't hurry.
> > Sorry that I wasted your time with my messy files, they were taken
> > from my "playground" thats why they were in such a bad shape - they
> > were only intended to give an idea which "road" I was taking. I
> > attached the new files taken from the mercurial repositories and only
> > modified at the affected places.
> >
> >> With a change of this sort, we really do need tests along with a fix. Have
> >> you started writing any test cases?
> > I completly agree - I have some simple test-cases which test more or
> > less only very basic functionality of Deflater and they work well
> > (also FlatterTest passes).
> > I'll write some more tests which test exotic use-cases like changing
> > compression-level, ... during compression.
>
> Great, thanks. It would be a Good Idea to have a test that checks my
> assumption re finished (see above).
>
> > I have some open questions:
> > 1.) Is the seperate structure approach to hold the stride-buffers ok?
>
> I think so.
>
> > 2.) Any suggestions for the following names: 1. strm-field in class
> > (defAdr), 2. defAdr-parameter,3. defptr - long_to_ptr of defAdr, 4.
> > def_data - name of the structure
>
> Those don't quite match what I see in the code; but what's in the code seems OK:
> def_data for the struct,
> def_adr as a param to deflateBytes, etc. etc.
> defptr to reference a def_data in init and deflateBytes
>
> > 3.) I am not really used to program in C. Are the adress-operations ok
> > which I used to get members of the new struct def_data?
>
> It seems OK.
>
> > Thanks for your patience, lg Clemens
> >
> > Some notes, and changes in ramdom order:
> > * Changed deflate-bytes to the old behaviour to return after the call
> > to deflateParams
>
> Good; AFAICT at maintains the existing semantics.
>
> > * Verified that its ok to call deflateParams when there's not enough
> > space in the output-buffer to flush all "old" data out (thanks to Mark
> > Adler)
> > * I changed the method-signiture of the native method compared to
> > original, because some variables were read from JNI-code, whereas they
> > could have been passed simply down using method parameters. I think
> > its "cleaner" to pass it.
>
> The long argument to deflateBytes is a bit cumbersome, but Ken Russell opined
> that it provides better performance, so it's a Good Thing:
>
> > Kenneth Russell wrote:
> > I strongly agree with the contributor's suggestion. Not only is passing
> > the argument from Java less code, it is also faster since field access
> > from Java can be optimized by the HotSpot compiler, where field accesses
> > through the JNI must go through the same set of boilerplate
> > C/C++/assembly every time.
>
> > * Allocation of the stride-buffers together with the z_stream
> > structure. z_stream is really large, so the two stride-buffers should
> > not add that much overhead. However this has the advantage of not
> > mallocing/freeing and also beeing able to fill the input-stride-buffer
> > once for several calls of the native method.
>
> Looks good.
>
> > * Renamed the strm-adress-parameter to defadr, because it no longer
> > really points to a strm. I did not rename the java field "strm"
> > because I did not have an idea for a proper name.
>
> It should have had a different name from day one. I'm slightly loathe to make
> a name change, nor do I have (Friday afternoon) a Really Good Idea.
>
> > * Removed striding from DeflaterOutputStream, (looked how code looked in 1.4.2).
>
> Looks good.
>
> From your other email:
>
> > I also thought about implementing striding in the CRC32/Adler32
> > classes which basically suffer from the same block-the-gc behaviour as
> > Inflater/Deflater did before they were "fixed" ;)
>
> I suspect these are not as much of an issue: presumably CRC32 and adler32
> calculations are faster than deflation (caveat: I haven't measured them).
> Other have pointed out / shamed us because, hey, shouldn't these be in Java
> anyway?
>
> > Furthermore do you have good ideas for regression tests?
> > The usual compression/decompression works fine, can you imagine
> > corner-cases which would be worth special testing?
> > Should the tests written in the jtreg format?
>
> I'd like to see tests where there's a possibility of semantics having changed;
> as noted above re "finished".
>
> I haven't done a performance analysis, but don't expect a regression. If
> anything, since the striding is kept within native code, there should be fewer
> Java -> native calls, and better performance (though that is perhaps not
> measurable).
>
> My last comment is about this change in general. It seems like a reasonable
> fix, though the corresponding bug:
> http://bugs.sun.com/view_bug.do?bug_id=6399199
> is a low priority one for us, and this is code we generally feel is best left
> alone when possible. We are still learning how best to work with
> contributions from outside of Sun. I will check with other who've maintained
> this code in the past, to get their opinion of making this kind of change.
> While I'm currently responsible for jar/zip code, it's only one of the hats I
> currently wear ;-)
>
> Thanks,
> Dave
>
More information about the core-libs-dev
mailing list