RFR 8017540: Improve multi-threaded contention behavior of BigInteger.toString() radix conversion cache
Peter Levart
peter.levart at gmail.com
Thu Jun 27 09:17:26 UTC 2013
Sorry for high frequency of messages.
This one is even better. powerCacheIndex need not be holding Nodes, but
BigIntegers instead. So no indirection on fast-path:
private static class Node {
final BigInteger value;
Node next;
Node(BigInteger value) { this.value = value; }
}
private static final Node[] powerCache;
private static final BigInteger[][] powerCacheIndex;
private static final int POWER_CACHE_LINE_CHUNK = 16;
static {
powerCache = new Node[Character.MAX_RADIX + 1];
for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX; i++) {
powerCache[i] = new Node(BigInteger.valueOf(i));
}
powerCacheIndex = new BigInteger[Character.MAX_RADIX + 1][];
}
private static BigInteger getRadixConversionCache(int radix, int
exponent) {
BigInteger[] cacheLine = powerCacheIndex[radix];
if (cacheLine != null && exponent < cacheLine.length) { //
cache line is long enough
BigInteger value = cacheLine[exponent];
if (value != null) {
return value;
}
return fillCacheLine(cacheLine, powerCache[radix], exponent);
}
else { // we need to extend / create cache line
cacheLine = new BigInteger[(exponent /
POWER_CACHE_LINE_CHUNK + 1) * POWER_CACHE_LINE_CHUNK];
BigInteger result = fillCacheLine(cacheLine,
powerCache[radix], exponent);
powerCacheIndex[radix] = cacheLine; // install new line
return result;
}
}
private static BigInteger fillCacheLine(BigInteger[] cacheLine,
Node node, int exponent) {
cacheLine[0] = node.value;
for (int i = 1; i <= exponent; i++) {
synchronized (node) {
if (node.next == null) {
node.next = new Node(node.value.pow(2));
}
}
node = node.next;
cacheLine[i] = node.value;
}
return node.value;
}
Regards, Peter
On 06/27/2013 10:51 AM, Peter Levart wrote:
> Hi,
>
> Expanding on the idea of building single growing linked list of values
> and treating the array just as index for quick access which can be
> re-created at any time without synchronization or volatile
> publication, here's the attempt:
>
>
> private static class Node {
> final BigInteger value;
> Node next;
> Node(BigInteger value) { this.value = value; }
> }
>
> private static final Node[] powerCache;
> private static final Node[][] powerCacheIndex;
> private static final int POWER_CACHE_LINE_CHUNK = 16;
>
> static {
> powerCache = new Node[Character.MAX_RADIX + 1];
> for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX; i++) {
> powerCache[i] = new Node(BigInteger.valueOf(i));
> }
> powerCacheIndex = new Node[Character.MAX_RADIX + 1][];
> }
>
> private static BigInteger getRadixConversionCache(int radix, int
> exponent) {
> Node[] cacheLine = powerCacheIndex[radix];
> if (cacheLine != null && exponent < cacheLine.length) { //
> cache line is long enough
> Node node = cacheLine[exponent];
> if (node != null) {
> return node.value;
> }
> return fillCacheLine(cacheLine, powerCache[radix], exponent);
> }
> else { // we need to extend / create cache line
> cacheLine = new Node[(exponent / POWER_CACHE_LINE_CHUNK +
> 1) * POWER_CACHE_LINE_CHUNK];
> BigInteger result = fillCacheLine(cacheLine,
> powerCache[radix], exponent);
> powerCacheIndex[radix] = cacheLine; // install new line
> return result;
> }
> }
>
> private static BigInteger fillCacheLine(Node[] cacheLine, Node
> node0, int exponent) {
> Node node = cacheLine[0] = node0;
> for (int i = 1; i <= exponent; i++) {
> Node nextNode;
> synchronized (node) {
> nextNode = node.next;
> if (nextNode == null) {
> nextNode = new Node(node.value.square());
> node.next = nextNode;
> }
> }
> cacheLine[i] = node = nextNode;
> }
> return node.value;
> }
>
>
>
> What do you think?
>
> Regards, Peter
>
>
> On 06/27/2013 09:27 AM, Peter Levart wrote:
>> On 06/27/2013 08:37 AM, Peter Levart wrote:
>>> Hi,
>>>
>>> This version seems correct. Maybe just replace double volatile read
>>> at length re-check with a single one:
>>>
>>>
>>> private static BigInteger getRadixConversionCache(int radix, int
>>> exponent) {
>>> BigInteger[] cacheLine = powerCache[radix]; // volatile read
>>> if (exponent < cacheLine.length)
>>> return cacheLine[exponent];
>>>
>>> int oldLength = cacheLine.length;
>>> cacheLine = Arrays.copyOf(cacheLine, exponent + 1);
>>> for (int i = oldLength; i <= exponent; i++)
>>> cacheLine[i] = cacheLine[i - 1].pow(2);
>>>
>>> BigInteger[][] pc = powerCache; // volatile read again
>>> if (exponent >= pc[radix].length) {
>>> pc = pc.clone();
>>> pc[radix] = cacheLine;
>>> powerCache = pc; // volatile write, publish
>>> }
>>> return cacheLine[exponent];
>>> }
>>>
>>>
>>> I like this, since it tries to avoid overwriting larger cacheLines
>>> with smaller ones when unexistent exponents are requested
>>> concurrently for same radix. This is good, since computation for
>>> larger exponents takes quadratically longer time (as Alan Eliasen
>>> points out) and possibility of concurrent threads stomping on each
>>> other increases. Re-reading and cloning powerCache reference at end
>>> of computation also takes care of cacheLines for other radixes that
>>> might have expanded while the computation was taking place. So the
>>> only overhead remaining is when concurrent threads are uselessly
>>> computing same results at same time. To avoid this, locking and
>>> waiting would have to be introduced which would *complicate things*.
>>>
>>> Regards, Peter
>>
>> On the other hand, it doesn't complicate thing too much. So if this
>> extra CPU time proves to be a problem, here's a variation of above
>> code which prevents multiple threads from calculating the same result
>> at the same time, by serializing computation with precise granularity:
>>
>> private static class Node {
>> final BigInteger value;
>> Node next;
>> Node(BigInteger value) { this.value = value; }
>> }
>>
>> private static volatile Node[][] powerCache;
>>
>> static {
>> powerCache = new Node[Character.MAX_RADIX + 1][];
>> for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX;
>> i++) {
>> powerCache[i] = new Node[]{new Node(BigInteger.valueOf(i))};
>> }
>> }
>>
>> private static BigInteger getRadixConversionCache(int radix, int
>> exponent) {
>> Node[] cacheLine = powerCache[radix]; // volatile read
>> if (exponent < cacheLine.length)
>> return cacheLine[exponent].value;
>>
>> int oldLength = cacheLine.length;
>> cacheLine = Arrays.copyOf(cacheLine, exponent + 1);
>> Node prevNode = cacheLine[oldLength - 1];
>> for (int i = oldLength; i <= exponent; i++) {
>> Node node;
>> synchronized (prevNode) {
>> node = prevNode.next;
>> if (node == null) {
>> node = new Node(prevNode.value.pow(2));
>> prevNode.next = node;
>> }
>> }
>> cacheLine[i] = prevNode = node;
>> }
>>
>> Node[][] pc = powerCache; // volatile read again
>> if (exponent >= pc[radix].length) {
>> pc = pc.clone();
>> pc[radix] = cacheLine;
>> powerCache = pc; // volatile write, publish
>> }
>> return cacheLine[exponent].value;
>> }
>>
>>
>> This variation differs only in a subtelty. It wraps each BigInteger
>> with a Node, which has a link to next node in chain. Computation of
>> next node is serailized using previous node as a lock. So although
>> Nodes can end up in several arrays (which are used as index for quick
>> access), there is only a single growing chain of Nodes per radix and
>> each Node is computed by single thread.
>>
>> Regards, Peter
>>
>>>
>>> On 06/26/2013 08:13 PM, Brian Burkhalter wrote:
>>>> So do we have consensus on this version?
>>>>
>>>> Thanks for the lively "conversation."
>>>>
>>>> Brian
>>>>
>>>> On Jun 26, 2013, at 12:05 AM, Aleksey Shipilev wrote:
>>>>
>>>>> Yes, like that.
>>>>>
>>>>> -Aleksey
>>>>>
>>>>> On 26.06.2013, at 10:53, Dmitry Nadezhin
>>>>> <dmitry.nadezhin at gmail.com> wrote:
>>>>>
>>>>>>> We could check for the existing cacheLine.length right before
>>>>>>> installing
>>>>>> the new one
>>>>>>
>>>>>> Do you mean something like this ?
>>>>>>
>>>>>> BigInteger getRadixConversionCache(int radix, int exponent) {
>>>>>> BigInteger[] cacheLine = powerCache[radix]; // volatile read
>>>>>> if (exponent < cacheLine.length)
>>>>>> return cacheLine[exponent];
>>>>>>
>>>>>> int oldLength = cacheLine.length;
>>>>>> cacheLine = Arrays.copyOf(cacheLine, exponent + 1);
>>>>>> for (int i = oldLength; i < exponent + 1; i++)
>>>>>> cacheLine[i] = cacheLine[i - 1].square();
>>>>>>
>>>>>> if (exponent >= powerCache[radix].length) { // volatile read again
>>>>>> BigInteger[][] pc = Arrays.copyOf(powerCache);
>>>>>> pc[radix] = cacheLine;
>>>>>> powerCache = pc; // volatile write, publish
>>>>>> }
>>>>>> return cacheLine[exponent];
>>>>>> }
>>>
>>
>
More information about the core-libs-dev
mailing list