Early version of striding Deflater
Dave Bristor
David.Bristor at Sun.COM
Mon Feb 18 21:21:13 UTC 2008
Here are the attached files that I forgot to attach ;-(
Dave
Dave Bristor wrote:
> Hi Clemens,
>
> Here's some file-by-file feedback, answers to questions, etc. I've
> attached 2 files:
> * Deflater.c.reformat is by-and-large the same as the file that you
> sent, except that it compiles on Solaris without warnings (it wouldn't
> compile there w/o change; perhaps the linux compiler (I'm guessing) you
> used is more lenient), and is formatted more in keeping with the rest of
> the file's style.
>
> * Deflater.c has some further changes on my part, described below. I've
> run all our regression tests on this one and it passes. I haven't run
> JCK tests, nor our more extensive performance suite.
>
> File-by-file commentary:
>
> *** Deflate.c
>
> Doesn't compile, at least not on Solaris. Several warnings. I fixed
> them. See attached Deflater.c.reformat. (I have not tried compiling on
> other platforms.)
>
> Some stylistic issues need attention; see e.g. deflateBytes for brace
> positioning, trailing whitespace, space between keyword and parens.
>
> Should document fields in def_data (compare with zip_util.h).
>
> Some field IDs no longer necessary, since they're passed in as params.
> I removed them.
>
> There's a slight change to the semantics of "finished". Current code
> sets Deflater.finished only if setParams is false. Changed code may set
> it regardlewss of setParams. I suspect this is OK: if client code
> changed strategy or level and called Deflater.deflate(), it would invoke
> deflateParams(), and not alter the value of finished. The client's
> subsequent call to Deflater.deflate() would call deflate() which AFAICT
> would cause finished to be set. What do you think?
>
> Why fall through from Z_OK to Z_BUF_ERROR? In Z_BUF_ERROR and default
> cases, why continue execution of loop instead of returning 0 as does
> original code? I changed this; see attached Deflater.c
>
> *** Deflater.java: OK
>
> *** DeflaterOutputStream.java: OK
>
> More inline below.
>
> Clemens Eisserer wrote:
>> Hi Dave,
>>
>> Thanks a lot for your reply.
>> To make it short: Of course I understand that this is low-priority
>> (also for me, its a fun-only fix because someone in forums.java.net
>> mentioned it) so don't hurry.
>> Sorry that I wasted your time with my messy files, they were taken
>> from my "playground" thats why they were in such a bad shape - they
>> were only intended to give an idea which "road" I was taking. I
>> attached the new files taken from the mercurial repositories and only
>> modified at the affected places.
>>
>>> With a change of this sort, we really do need tests along with a
>>> fix. Have
>>> you started writing any test cases?
>> I completly agree - I have some simple test-cases which test more or
>> less only very basic functionality of Deflater and they work well
>> (also FlatterTest passes).
>> I'll write some more tests which test exotic use-cases like changing
>> compression-level, ... during compression.
>
> Great, thanks. It would be a Good Idea to have a test that checks my
> assumption re finished (see above).
>
>> I have some open questions:
>> 1.) Is the seperate structure approach to hold the stride-buffers ok?
>
> I think so.
>
>> 2.) Any suggestions for the following names: 1. strm-field in class
>> (defAdr), 2. defAdr-parameter,3. defptr - long_to_ptr of defAdr, 4.
>> def_data - name of the structure
>
> Those don't quite match what I see in the code; but what's in the code
> seems OK:
> def_data for the struct,
> def_adr as a param to deflateBytes, etc. etc.
> defptr to reference a def_data in init and deflateBytes
>
>> 3.) I am not really used to program in C. Are the adress-operations ok
>> which I used to get members of the new struct def_data?
>
> It seems OK.
>
>> Thanks for your patience, lg Clemens
>>
>> Some notes, and changes in ramdom order:
>> * Changed deflate-bytes to the old behaviour to return after the call
>> to deflateParams
>
> Good; AFAICT at maintains the existing semantics.
>
>> * Verified that its ok to call deflateParams when there's not enough
>> space in the output-buffer to flush all "old" data out (thanks to Mark
>> Adler)
>> * I changed the method-signiture of the native method compared to
>> original, because some variables were read from JNI-code, whereas they
>> could have been passed simply down using method parameters. I think
>> its "cleaner" to pass it.
>
> The long argument to deflateBytes is a bit cumbersome, but Ken Russell
> opined that it provides better performance, so it's a Good Thing:
>
> > Kenneth Russell wrote:
> > I strongly agree with the contributor's suggestion. Not only is passing
> > the argument from Java less code, it is also faster since field access
> > from Java can be optimized by the HotSpot compiler, where field accesses
> > through the JNI must go through the same set of boilerplate
> > C/C++/assembly every time.
>
>> * Allocation of the stride-buffers together with the z_stream
>> structure. z_stream is really large, so the two stride-buffers should
>> not add that much overhead. However this has the advantage of not
>> mallocing/freeing and also beeing able to fill the input-stride-buffer
>> once for several calls of the native method.
>
> Looks good.
>
>> * Renamed the strm-adress-parameter to defadr, because it no longer
>> really points to a strm. I did not rename the java field "strm"
>> because I did not have an idea for a proper name.
>
> It should have had a different name from day one. I'm slightly loathe
> to make a name change, nor do I have (Friday afternoon) a Really Good Idea.
>
>> * Removed striding from DeflaterOutputStream, (looked how code looked
>> in 1.4.2).
>
> Looks good.
>
> From your other email:
>
> > I also thought about implementing striding in the CRC32/Adler32
> > classes which basically suffer from the same block-the-gc behaviour as
> > Inflater/Deflater did before they were "fixed" ;)
>
> I suspect these are not as much of an issue: presumably CRC32 and
> adler32 calculations are faster than deflation (caveat: I haven't
> measured them). Other have pointed out / shamed us because, hey,
> shouldn't these be in Java anyway?
>
> > Furthermore do you have good ideas for regression tests?
> > The usual compression/decompression works fine, can you imagine
> > corner-cases which would be worth special testing?
> > Should the tests written in the jtreg format?
>
> I'd like to see tests where there's a possibility of semantics having
> changed; as noted above re "finished".
>
> I haven't done a performance analysis, but don't expect a regression.
> If anything, since the striding is kept within native code, there should
> be fewer Java -> native calls, and better performance (though that is
> perhaps not measurable).
>
> My last comment is about this change in general. It seems like a
> reasonable fix, though the corresponding bug:
> http://bugs.sun.com/view_bug.do?bug_id=6399199
> is a low priority one for us, and this is code we generally feel is best
> left alone when possible. We are still learning how best to work with
> contributions from outside of Sun. I will check with other who've
> maintained this code in the past, to get their opinion of making this
> kind of change. While I'm currently responsible for jar/zip code, it's
> only one of the hats I currently wear ;-)
>
> Thanks,
> Dave
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Deflater.c.reformat
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080218/96a31c87/Deflater.c.reformat>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Deflater.c
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080218/96a31c87/Deflater.c>
More information about the core-libs-dev
mailing list