RFR 4641897: Faster string conversion of large integers

Peter Levart peter.levart at gmail.com
Tue Jun 25 18:13:47 UTC 2013


On 06/22/2013 12:54 PM, Aleksey Shipilev wrote:
>    T get(int index1, index2) {
>       T[][] lc = cache;
>       if (index1 >= lc.length) { // needs resizing
>          lc = <generate_new_T[][]_of_size>((index1 << 1) + 1);
>          cache = lc;
>       }
>       T[] lt = lc[*index2*];
>       if (index2 >= lt.length) { // needs resizing
>          lt = <generate_new_T[]_of_size>((index2 << 1) + 1);
>          lc[index1] = lt;
>          cache = lc; // publish
>       }
>       return lt[index2];
>    }
>
> -Aleksey.

Hi Aleksey,

1st I think there's a typo in above bolded part. There should be index1 
instead of index2 there, right?

Then I think there's a data race in above code which can de-reference 
half-constructed array and an element within it:

tread A:

T[][] lc = cache;
     // skip 1st if, since index1 < lc.length

thread B:

     T[][] lc = cache;
     // skip 1st if, since index1 < lc.length
     T[] lt = lc[index1];
     // enter 2nd if, since index2 >= lt.length;
     lt = <generate_new_T[]_of_size>((index2 << 1) + 1);
     lc[index1] = lt;

thread A:
     T[] lt = lc[index1]; // this reads a reference to new lt array, 
written by thread B (data-race)
     // skip 2nd if, since index2 < lt.length
     return lt[index2];   // this could read and return null, since 
array reference was obtained via data-race


I have studied the constraints of powerCache and have observed:

- the capacity/size of 1st level is constant and doesn't need resizing 
(37 slots, since Character.MAX_RADIX == 36)
- the virtual "length" field of each array is "final", meaning that at 
least length of the array can be obtained safely although the reference 
to the array was obtained via data-race
- the BigInteger class is effectively final (the meaningful state is 
held via final fields and non-final fields are just caches of some info 
which can be re-computed idempotently - like String.hash), meaning that 
a reference to BigInteger can be obtained via data-race and still the 
object will behave consistently.

So the caching could be performed with no synchronization (other than 
that provided by final fields descibed above).

Here is a possible variant of such caching:


     private static final BigInteger[][] powerCache =
         new BigInteger[Character.MAX_RADIX + 1][];

     private static BigInteger getRadixConversionCache(
         int radix,
         int exponent
     ) {
         BigInteger[] cacheLine = powerCache[radix];
         int oldLength = cacheLine == null ? 0 : cacheLine.length;
         if (exponent >= oldLength) { // needs resizing/creation?
             // invariant: each cacheLine has length > 0
             if (oldLength == 0) { // creation
                 cacheLine = new BigInteger[exponent + 1];
             }
             else { // resizing
                 // increase by factor of 1.5 (like ArrayList)
                 int newLength = oldLength + (oldLength >> 1);
                 // if that's not enough, take exact needed length
                 if (newLength <= exponent) newLength = exponent + 1;
                 cacheLine = Arrays.copyOf(cacheLine, newLength);
             }
             powerCache[radix] = cacheLine; // install new cacheLine
         }
         // search for 1st non-null power from min(oldLength - 1, 
exponent) backwards
         int s;
         BigInteger power = null;
         for (s = Math.min(oldLength - 1, exponent); s >= 0; s--) {
             power = cacheLine[s];
             if (power != null) break;
         }
         // calculate the rest up to and including exponent
         for (int i = s + 1; i <= exponent; i++) {
             power = power == null ? BigInteger.valueOf(radix) : 
power.square();
             cacheLine[i] = power;
         }
         return power;
     }



Please, be free to find a fault in this code ;-)

Regards, Peter




More information about the core-libs-dev mailing list