RFR :7088419 : (L) Use x86 Hardware CRC32 Instruction with java.util.zip.CRC32 and java.util.zip.Adler32

Alan Bateman Alan.Bateman at oracle.com
Thu May 16 21:27:19 UTC 2013


On 16/05/2013 15:50, David Chase wrote:
> :
>
> Parallel performance is a little harder to reason about on big x86 boxes (both Intel and AMD), so I am leaving the threshold high.  Dave Dice thought this might be an artifact of cores being put into a power-saving mode and being slow to wake (the particular benchmark I wrote would have been pessimal for this, since it alternated between serial and parallel phases).  The eventual speedups were often impressive (6x-12x) but it was unclear how many hardware threads (out of the 32-64 available) I was using to obtain this.  Yes, I need to plug this into JMH for fine-tuning.  I'm using the system fork-join pool because that initially seemed like the good-citizen thing to do (balance CRC/Adler needs against those of anyone else who might be doing work) but I am starting to wonder if it would make more sense to establish a small private pool with a bounded number of threads, so that I don't need to worry about being a good citizen so much.  It occurs to me, late in the game, that using big-ish units of work is another, different way to be a bad citizen.  (I would prefer to get this checked in if it represents a net improvement, and then work on the tuning afterwards.)
>
The current proposal doesn't change the API at this time, but I wonder 
whether you have considered adding parallelUpdate methods to complement 
the existing serial update methods?
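
To make the question concrete, here is a rough sketch of what a parallel
update could look like. It is an illustration only, not the code under
review: the class name ParallelAdler32Sketch, the method parallelAdler32
and its threads/chunkSize parameters are all hypothetical. It uses Adler32
because partial Adler-32 values over adjacent chunks can be merged with a
little modular arithmetic; CRC32 would need a CRC-combine step that is not
shown here. It also assumes the small private pool with a bounded number
of threads mentioned above, rather than the system fork-join pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.zip.Adler32;

// Hypothetical sketch; none of these names exist in java.util.zip.
class ParallelAdler32Sketch {
    private static final long MOD = 65521L; // Adler-32 modulus

    // Merge the checksum of a prefix (adler1) with the checksum of the
    // len2 bytes that immediately follow it (adler2).
    static long combine(long adler1, long adler2, long len2) {
        long a1 = adler1 & 0xFFFF, b1 = (adler1 >>> 16) & 0xFFFF;
        long a2 = adler2 & 0xFFFF, b2 = (adler2 >>> 16) & 0xFFFF;
        // Each independent checksum starts its low half at 1, so subtract
        // the extra 1 once when joining the two halves.
        long a = (a1 + a2 + MOD - 1) % MOD;
        // Every byte of the second chunk should also have seen (a1 - 1).
        long b = (b1 + b2 + (len2 % MOD) * ((a1 + MOD - 1) % MOD)) % MOD;
        return (b << 16) | a;
    }

    // Checksum data in chunkSize pieces on a private pool of `threads`
    // workers, then merge the partial results in order.
    static long parallelAdler32(byte[] data, int threads, int chunkSize)
            throws Exception {
        List<Callable<Long>> tasks = new ArrayList<>();
        List<Integer> lengths = new ArrayList<>();
        for (int off = 0; off < data.length; off += chunkSize) {
            final int start = off;
            final int len = Math.min(chunkSize, data.length - off);
            lengths.add(len);
            tasks.add(() -> {
                Adler32 adler = new Adler32();
                adler.update(data, start, len);
                return adler.getValue();
            });
        }
        ForkJoinPool pool = new ForkJoinPool(threads); // bounded, private
        try {
            List<Future<Long>> parts = pool.invokeAll(tasks);
            long result = 1L; // Adler-32 of the empty input
            for (int i = 0; i < parts.size(); i++) {
                result = combine(result, parts.get(i).get(), lengths.get(i));
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would compare parallelAdler32(data, 4, 64 * 1024) against a
single Adler32.update over the same array; the two values should match.
The chunk size and thread count are placeholders that would need tuning.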

-Alan.





More information about the core-libs-dev mailing list