review request for 6622432
Xiaobin Lu
Xiaobin.Lu at Sun.COM
Mon Feb 16 22:33:42 UTC 2009
Webrev: http://webrev.invokedynamic.info/xiaobin.lu/6622432/
6622432: RFE: Performance improvements to java.math.BigDecimal
Details:
This work is targeted to improve the performance of BigDecimal and
related classes. Along with the performance improvement, the
implementation of many methods has been simplified or complicated for
reasons described below. Before delving into details of the webrev, here
is the summary of the improvement, for derby benchmark inside
SPECjvm2008, we saw about 84.65% improvement measured on a 2 socket
2.8GHz, 12G RAM Nehalem machine. Similarly, we saw about 6.57%
improvement on SPECjbb2005. The usage of BigDecimal is quite different
for derby and SPECjbb2005, the former benchmark emphasizes more on
inflated cases and the later is mostly using small numbers (i.e.
non-inflated case), so we are trying our best to take care of both in
the whole optimization, as a result, the optimization putting in the
webrev is made as general as possible. The change is quite significant
and I can only try to highlight some general ideas of the optimization.
First is some background of the current implementation of BigDecimal.
Among all the fields of BigDecimal classes, the two most interesting
fields are intCompact which is long type and intVal which is BigInteger
type. When the unscaled value of BigDecimal falls into (Long.MIN_VALUE,
Long.MAX_VALUE], its intCompact field is set and intVal is set to null
most of time until it is used to perform operations with another
BigInteger. This is a good thing to do since it will reduce the object
allocation significantly and make the operation faster since obviously
adding two primitive numbers for example just needs one instruction
while adding two BigInteger object requires a method call, get the
magnitude array out of the object, a for loop and so on. So one of the
big idea of optimization made in the webrev is to take a further step
from this idea and try to make some of the hot methods aware of the
non-inflated and inflated cases. For example, the constructor of
BigDecimal(char[] in, int offset, int len), when the number is not
inflated, we came up with some code to make the constructor as efficient
as possible. Another example is the multiply(BigDecimal multiplicand)
method, when one of the objects is not inflated, we came up with a
internal multiply method in BigInteger to accept long parameter.
The second big change to BigDecimal is its divide method. In the
inflated case, the current implementation calls divide method in
BigInteger which makes MutableBigInteger objects for both operators,
then perform the division. Our change made to divide method directly
makes MutableBigInteger object out of the magnitude array of the intVal
and calls divide method directly. When we get rid of the middle man, we
again can reduce the object allocation significantly when divide is a
hot method.
The third change to BigDecimal is the improvement of calculating the
precision of a given BigDecimal object. The current implementation
actually uses a static tens power array for incoming number to compare
with. And when the number is inflated, it keeps dividing itself by 1
billion until it becomes very small, every division adds 9 digits to the
precision. As you know, the division operation is expensive and the
algorithm to compare with the ten's power array can be made more
efficiently by correlating the bit length of the number with the index
to the array so that the time to compute the precision is O(1) rather
than being O(n) (where n is the length of the array). And that is
exactly what we are doing in the webrev. First, we get the bit length of
the number and then multiply it with log2 (10 base), use the result to
index to a dynamically expanded ten's power array so that division
operation can be avoided completely.
Last but not least, the optimization we've made in layoutChars
implementation significantly reduces the array allocation rate by making
a thread local StringBuilder object. As a result, the buffer to hold the
temporary characters is the same for every thread to call the method.
Further using the idea I mentioned, we also make it intCompact aware.
When the number is not inflated, we use a thread local character array
to hold that number in char(s). This reduces the burden for every method
call to allocate that character array.
Reviewed by:
Verified by:
JCK
regression tests in JDK workspace
Also contributed by:
Doug Lea (thanks so much for a lot of tedious code clean up work you've
done, and of course, many fantastic ideas. )
Joe Darcy contributed most of the tests
Regards,
-Xiaobin
More information about the core-libs-dev
mailing list