review request for 6622432

Mon Feb 16 22:33:42 UTC 2009

Webrev: http://webrev.invokedynamic.info/xiaobin.lu/6622432/

6622432: RFE: Performance improvements to java.math.BigDecimal

Details:

This work is targeted to improve the performance of BigDecimal and 
related classes. Along with the performance improvement, the 
implementation of many methods has been simplified or complicated for 
reasons described below. Before delving into details of the webrev, here 
is the summary of the improvement, for derby benchmark inside 
SPECjvm2008, we saw about 84.65% improvement measured on a 2 socket 
2.8GHz, 12G RAM Nehalem machine. Similarly, we saw about 6.57% 
improvement on SPECjbb2005. The usage of BigDecimal is quite different 
for derby and SPECjbb2005, the former benchmark emphasizes more on 
inflated cases and the later is mostly using small numbers (i.e. 
non-inflated case), so we are trying our best to take care of both in 
the whole optimization, as a result, the optimization putting in the 
webrev is made as general as possible. The change is quite significant 
and I can only try to highlight some general ideas of the optimization.

First is some background of the current implementation of BigDecimal. 
Among all the fields of BigDecimal classes, the two most interesting 
fields are intCompact which is long type and intVal which is BigInteger 
type. When the unscaled value of BigDecimal falls into (Long.MIN_VALUE, 
Long.MAX_VALUE], its intCompact field is set and intVal is set to null 
most of time until it is used to perform operations with another 
BigInteger. This is a good thing to do since it will reduce the object 
allocation significantly and make the operation faster since obviously 
adding two primitive numbers for example just needs one instruction 
while adding two BigInteger object requires a method call, get the 
magnitude array out of the object, a for loop and so on. So one of the 
big idea of optimization made in the webrev is to take a further step  
from this idea and try to make some of the hot methods aware of the 
non-inflated and inflated cases. For example, the constructor of 
BigDecimal(char[] in, int offset, int len), when the number is not 
inflated, we came up with some code to make the constructor as efficient 
as possible. Another example is the multiply(BigDecimal multiplicand) 
method, when one of the objects is not inflated, we came up with a 
internal multiply method in BigInteger to accept long parameter.

The second big change to BigDecimal is its divide method. In the 
inflated case, the current implementation calls divide method in 
BigInteger which makes MutableBigInteger objects for both operators, 
then perform the division. Our change made to divide method directly 
makes MutableBigInteger object out of the magnitude array of the intVal 
and calls divide method directly. When we get rid of the middle man, we 
again can reduce the object allocation significantly when divide is a 
hot method.

The third change to BigDecimal is the improvement of calculating the 
precision of a given BigDecimal object. The current implementation 
actually uses a static tens power array for incoming number to compare 
with. And when the number is inflated, it keeps dividing itself by 1 
billion until it becomes very small, every division adds 9 digits to the 
precision. As you know, the division operation is expensive and the 
algorithm to compare with the ten's power array can be made more 
efficiently by correlating the bit length of the number with the index 
to the array so that the time to compute the precision is O(1) rather 
than being O(n) (where n is the length of the array). And that is 
exactly what we are doing in the webrev. First, we get the bit length of 
the number and then multiply it with log2 (10 base), use the result to 
index to a dynamically expanded ten's power array so that division 
operation can be avoided completely.

Last but not least, the optimization we've made in layoutChars 
implementation significantly reduces the array allocation rate by making 
a thread local StringBuilder object. As a result, the buffer to hold the 
temporary characters is the same for every thread to call the method. 
Further using the idea I mentioned, we also make it intCompact aware. 
When the number is not inflated, we use a thread local character array 
to hold that number in char(s). This reduces the burden for every method 
call to allocate that character array.

Reviewed by:
Verified by:
JCK
regression tests in JDK workspace

Also contributed by:
Doug Lea (thanks so much for a lot of tedious code clean up work you've 
done, and of course, many fantastic ideas. )
Joe Darcy contributed most of the tests

Regards,
-Xiaobin