Early version of striding Deflater

Dave Bristor David.Bristor at Sun.COM
Mon Feb 18 21:21:13 UTC 2008


Here are the attached files that I forgot to attach ;-(
	Dave

Dave Bristor wrote:
> Hi Clemens,
> 
> Here's some file-by-file feedback, answers to questions, etc.  I've 
> attached 2 files:
> * Deflater.c.reformat is by-and-large the same as the file that you 
> sent, except that it compiles on Solaris without warnings (it wouldn't 
> compile there w/o change; perhaps the linux compiler (I'm guessing) you 
> used is more lenient), and is formatted more in keeping with the rest of 
> the file's style.
> 
> * Deflater.c has some further changes on my part, described below.  I've 
> run all our regression tests on this one and it passes.  I haven't run 
> JCK tests, nor our more extensive performance suite.
> 
> File-by-file commentary:
> 
> *** Deflate.c
> 
> Doesn't compile, at least not on Solaris.  Several warnings.  I fixed 
> them. See attached Deflater.c.reformat.  (I have not tried compiling on 
> other platforms.)
> 
> Some stylistic issues need attention; see e.g. deflateBytes for brace 
> positioning, trailing whitespace, space between keyword and parens.
> 
> Should document fields in def_data (compare with zip_util.h).
> 
> Some field IDs no longer necessary, since they're passed in as params.  
> I removed them.
> 
> There's a slight change to the semantics of "finished".  Current code 
> sets Deflater.finished only if setParams is false.  Changed code may set 
> it regardlewss of setParams.  I suspect this is OK: if client code 
> changed strategy or level and called Deflater.deflate(), it would invoke 
> deflateParams(), and not alter the value of finished.  The client's 
> subsequent call to Deflater.deflate() would call deflate() which AFAICT 
> would cause finished to be set.  What do you think?
> 
> Why fall through from Z_OK to Z_BUF_ERROR?  In Z_BUF_ERROR and default 
> cases, why continue execution of loop instead of returning 0 as does 
> original code? I changed this; see attached Deflater.c
> 
> *** Deflater.java: OK
> 
> *** DeflaterOutputStream.java: OK
> 
> More inline below.
> 
> Clemens Eisserer wrote:
>> Hi Dave,
>>
>> Thanks a lot for your reply.
>> To make it short: Of course I understand that this is low-priority
>> (also for me, its a fun-only fix because someone in forums.java.net
>> mentioned it) so don't hurry.
>> Sorry that I wasted your time with my messy files, they were taken
>> from my "playground" thats why they were in such a bad shape - they
>> were only intended to give an idea which "road" I was taking. I
>> attached the new files taken from the mercurial repositories and only
>> modified at the affected places.
>>
>>> With a change of this sort, we really do need tests along with a 
>>> fix.  Have
>>> you started writing any test cases?
>> I completly agree - I have some simple test-cases which test more or
>> less only very basic functionality of Deflater and they work well
>> (also FlatterTest passes).
>> I'll write some more tests which test exotic use-cases like changing
>> compression-level, ... during compression.
> 
> Great, thanks.  It would be a Good Idea to have a test that checks my 
> assumption re finished (see above).
> 
>> I have some open questions:
>> 1.) Is the seperate structure approach to hold the stride-buffers ok?
> 
> I think so.
> 
>> 2.) Any suggestions for the following names: 1. strm-field in class
>> (defAdr), 2. defAdr-parameter,3. defptr -  long_to_ptr of defAdr, 4.
>> def_data - name of the structure
> 
> Those don't quite match what I see in the code; but what's in the code 
> seems OK:
> def_data for the struct,
> def_adr as a param to deflateBytes, etc. etc.
> defptr to reference a def_data in init and deflateBytes
> 
>> 3.) I am not really used to program in C. Are the adress-operations ok
>> which I used to get members of the new struct def_data?
> 
> It seems OK.
> 
>> Thanks for your patience, lg Clemens
>>
>> Some notes, and changes in ramdom order:
>> * Changed deflate-bytes to the old behaviour to return after the call
>> to deflateParams
> 
> Good; AFAICT at maintains the existing semantics.
> 
>> * Verified that its ok to call deflateParams when there's not enough
>> space in the output-buffer to flush all "old" data out (thanks to Mark
>> Adler)
>> * I changed the method-signiture of the native method compared to
>> original, because some variables were read from JNI-code, whereas they
>> could have been passed simply down using method parameters. I think
>> its "cleaner" to pass it.
> 
> The long argument to deflateBytes is a bit cumbersome, but Ken Russell 
> opined that it provides better performance, so it's a Good Thing:
> 
>  > Kenneth Russell wrote:
>  > I strongly agree with the contributor's suggestion. Not only is passing
>  > the argument from Java less code, it is also faster since field access
>  > from Java can be optimized by the HotSpot compiler, where field accesses
>  > through the JNI must go through the same set of boilerplate
>  > C/C++/assembly every time.
> 
>> * Allocation of the stride-buffers together with the z_stream
>> structure. z_stream is really large, so the two stride-buffers should
>> not add that much overhead. However this has the advantage of not
>> mallocing/freeing and also beeing able to fill the input-stride-buffer
>> once for several calls of the native method.
> 
> Looks good.
> 
>> * Renamed the strm-adress-parameter to defadr, because it no longer
>> really points to a strm. I did not rename the java field "strm"
>> because I did not have an idea for a proper name.
> 
> It should have had a different name from day one.  I'm slightly loathe 
> to make a name change, nor do I have (Friday afternoon) a Really Good Idea.
> 
>> * Removed striding from DeflaterOutputStream, (looked how code looked 
>> in 1.4.2).
> 
> Looks good.
> 
>  From your other email:
> 
>  > I also thought about implementing striding in the CRC32/Adler32
>  > classes which basically suffer from the same block-the-gc behaviour as
>  > Inflater/Deflater did before they were "fixed" ;)
> 
> I suspect these are not as much of an issue: presumably CRC32 and 
> adler32 calculations are faster than deflation (caveat: I haven't 
> measured them). Other have pointed out / shamed us because, hey, 
> shouldn't these be in Java anyway?
> 
>  > Furthermore do you have good ideas for regression tests?
>  > The usual compression/decompression works fine, can you imagine
>  > corner-cases which would be worth special testing?
>  > Should the tests written in the jtreg format?
> 
> I'd like to see tests where there's a possibility of semantics having 
> changed; as noted above re "finished".
> 
> I haven't done a performance analysis, but don't expect a regression.  
> If anything, since the striding is kept within native code, there should 
> be fewer Java -> native calls, and better performance (though that is 
> perhaps not measurable).
> 
> My last comment is about this change in general.  It seems like a 
> reasonable fix, though the corresponding bug:
>     http://bugs.sun.com/view_bug.do?bug_id=6399199
> is a low priority one for us, and this is code we generally feel is best 
> left alone when possible.  We are still learning how best to work with 
> contributions from outside of Sun.  I will check with other who've 
> maintained this code in the past, to get their opinion of making this 
> kind of change. While I'm currently responsible for jar/zip code, it's 
> only one of the hats I currently wear ;-)
> 
> Thanks,
>     Dave
> 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Deflater.c.reformat
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080218/96a31c87/Deflater.c.reformat>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Deflater.c
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080218/96a31c87/Deflater.c>


More information about the core-libs-dev mailing list