RFR :7088419 : (L) Use x86 Hardware CRC32 Instruction with java.util.zip.CRC32 and java.util.zip.Adler32
Alan Bateman
Alan.Bateman at oracle.com
Thu May 16 21:27:19 UTC 2013
On 16/05/2013 15:50, David Chase wrote:
> :
>
> Parallel performance is a little harder to reason about on big x86 boxes (both Intel and AMD), so I am leaving the threshold high. Dave Dice thought this might be an artifact of cores being put into a power-saving mode and being slow to wake (the particular benchmark I wrote would have been pessimal for this, since it alternated between serial and parallel phases). The eventual speedups were often impressive (6x-12x) but it was unclear how many hardware threads (out of the 32-64 available) I was using to obtain this. Yes, I need to plug this into JMH for fine-tuning. I'm using the system fork-join pool because that initially seemed like the good-citizen thing to do (balance CRC/Adler needs against those of anyone else who might be doing work) but I am starting to wonder if it would make more sense to establish a small private pool with a bounded number of threads, so that I don't need to worry about being a good citizen so much. It occurs to me, late in the game, that using big-ish units of work is another, different way to be a bad citizen. (I would prefer to get this checked in if it represents a net improvement, and then work on the tuning afterwards.)
>
The current proposal doesn't change the API, but I wonder whether you
have considered adding parallelUpdate methods to complement the serial
methods?
-Alan.
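
For illustration only, here is a minimal sketch of the two ideas discussed above: a small private fork-join pool with a bounded number of threads, and a parallel update entry point kept separate from the serial one. It is not the code under review. The class name ParallelAdler32Sketch, the methods parallelAdler32 and combine, and the THRESHOLD value are all hypothetical. Adler32 is used because two partial checksums can be merged with simple modular arithmetic; CRC32 would need a different combine step, which is not shown.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.zip.Adler32;

// Hypothetical sketch: parallel Adler-32 over a byte array using a
// caller-supplied (e.g. small, private) ForkJoinPool.
class ParallelAdler32Sketch {
    static final long BASE = 65521;        // Adler-32 modulus
    static final int THRESHOLD = 1 << 16;  // assumed cutoff, untuned

    // Merge the checksum of a first block (adler1) with the checksum of
    // the block that follows it (adler2, covering len2 bytes).
    static long combine(long adler1, long adler2, long len2) {
        long a1 = adler1 & 0xffff, b1 = (adler1 >>> 16) & 0xffff;
        long a2 = adler2 & 0xffff, b2 = (adler2 >>> 16) & 0xffff;
        long rem = len2 % BASE;
        // Each block's "a" sum starts at 1, so one of the 1s is dropped.
        long a = (a1 + a2 + BASE - 1) % BASE;
        // The first block's (a1 - 1) is carried through every byte of block 2.
        long b = (b1 + b2 + rem * ((a1 + BASE - 1) % BASE)) % BASE;
        return (b << 16) | a;
    }

    static final class Task extends RecursiveTask<Long> {
        private final byte[] buf;
        private final int off, len;

        Task(byte[] buf, int off, int len) {
            this.buf = buf; this.off = off; this.len = len;
        }

        @Override
        protected Long compute() {
            if (len <= THRESHOLD) {
                // Small enough: fall back to the serial implementation.
                Adler32 serial = new Adler32();
                serial.update(buf, off, len);
                return serial.getValue();
            }
            int half = len / 2;
            Task left = new Task(buf, off, half);
            Task right = new Task(buf, off + half, len - half);
            left.fork();                      // run left half asynchronously
            long r = right.compute();         // compute right half here
            long l = left.join();
            return combine(l, r, len - half);
        }
    }

    // Hypothetical counterpart to a serial update over the whole array.
    static long parallelAdler32(byte[] buf, ForkJoinPool pool) {
        return pool.invoke(new Task(buf, 0, buf.length));
    }
}
```

A caller would create something like new ForkJoinPool(4) once and pass it in, which caps the number of threads the checksum work can occupy regardless of how many hardware threads the machine has. Whether that is preferable to the common pool is exactly the tuning question raised above.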