RFR (S): CR 8005926: (thread) Merge ThreadLocalRandom state into java.lang.Thread
Chris Hegarty
chris.hegarty at oracle.com
Mon Jan 14 18:20:34 UTC 2013
Ooooooh.... this change goes beyond my review capability!!! I thought we
were just eliminating the indirection of ThreadLocal, anyway...
As a basic review, I don't see anything obviously wrong, and I don't have
a problem with adding the fields to j.l.Thread, but I'm not in a
position to review the algorithm used in detail.
Alan did mention that he may be in a better position to do a detailed
review. I can also sponsor this change.
-Chris.
On 14/01/2013 13:52, Chris Hegarty wrote:
> For what it is worth, it is on my list for today.
>
> -Chris.
>
> On 14/01/2013 13:31, Aleksey Shipilev wrote:
>> I understand everyone is busy with the JDK8 milestone, but maybe some
>> critical people have missed this note? It would be terrific if someone
>> could review this (or even sponsor it!).
>>
>> Thanks,
>> Aleksey.
>>
>> On 01/11/2013 02:31 AM, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> Submitting this on behalf of Doug Lea. The webrev is here:
>>> http://cr.openjdk.java.net/~shade/8005926/webrev.00/
>>>
>>> Bottom line: merge the ThreadLocalRandom state into Thread to optimize
>>> many use cases around j.u.c.* code. Simple performance tests on a 2x2 i5,
>>> Linux x86_64 (4 threads, 5 forks, 3x3s warmup, 5x3s measurement):
>>>
>>> JDK8 (baseline)
>>> TLR.nextInt(): 6.4 +- 0.1 ns/op
>>> TLR.current().nextInt(): 16.1 +- 0.4 ns/op
>>> TL.get().nextInt(): 19.1 +- 0.6 ns/op
>>>
>>> JDK8 (patched)
>>> TLR.nextInt(): 6.5 +- 0.2 ns/op
>>> TLR.current().nextInt(): 6.4 +- 0.1 ns/op
>>> TL.get().nextInt(): 17.2 +- 2.0 ns/op
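>>>
>>> For clarity, my reading of the three measured variants is roughly the
>>> following (an assumption on my side -- the benchmark harness itself is
>>> not shown here, and the ThreadLocal payload is a guess):
>>>
>>>   import java.util.Random;
>>>   import java.util.concurrent.ThreadLocalRandom;
>>>
>>>   public class Variants {
>>>       // "TLR.nextInt()": a ThreadLocalRandom instance held by the
>>>       // measuring thread, so there is no per-call lookup at all.
>>>       static final ThreadLocalRandom HELD = ThreadLocalRandom.current();
>>>
>>>       // "TL.get().nextInt()": a plain ThreadLocal indirection,
>>>       // holding a java.util.Random here as a stand-in.
>>>       static final ThreadLocal<Random> TL =
>>>           ThreadLocal.withInitial(Random::new);
>>>
>>>       static int held()     { return HELD.nextInt(); }
>>>       // "TLR.current().nextInt()": pays for the current() lookup too.
>>>       static int lookedUp() { return ThreadLocalRandom.current().nextInt(); }
>>>       static int viaTL()    { return TL.get().nextInt(); }
>>>
>>>       public static void main(String[] args) {
>>>           // Single-threaded demo; in the real 4-thread run each worker
>>>           // would naturally hold its own instance.
>>>           System.out.println(held() + " " + lookedUp() + " " + viaTL());
>>>       }
>>>   }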
>>>
>>> The first line in each table shows the peak performance of TLR itself;
>>> everything above that is ThreadLocal overhead. One can see that the
>>> patched version bypasses the ThreadLocal machinery completely, so the
>>> overhead is slim to none.
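>>>
>>> The trick, as I understand it, is that the generator state now lives in
>>> plain fields on java.lang.Thread, so current() amounts to a field read
>>> plus a one-time lazy init instead of a ThreadLocalMap probe. A toy
>>> model of that idea (not the webrev code; all names here are made up):
>>>
>>>   // A Thread subclass stands in for the extra per-thread fields the
>>>   // patch adds to j.l.Thread itself.
>>>   class SeededThread extends Thread {
>>>       long rngSeed = System.nanoTime() ^ getId();  // per-thread state
>>>       SeededThread(Runnable r) { super(r); }
>>>   }
>>>
>>>   final class InlinedRandom {
>>>       // An LCG step in the spirit of java.util.Random; illustrative only.
>>>       static int nextInt() {
>>>           SeededThread t = (SeededThread) Thread.currentThread();
>>>           long s = t.rngSeed * 0x5DEECE66DL + 0xBL;  // plain field access,
>>>           t.rngSeed = s;                             // no map lookup
>>>           return (int) (s >>> 17);
>>>       }
>>>   }
>>>
>>>   public class Demo {
>>>       public static void main(String[] args) throws InterruptedException {
>>>           Thread t = new SeededThread(() ->
>>>               System.out.println("sample: " + InlinedRandom.nextInt()));
>>>           t.start();
>>>           t.join();
>>>       }
>>>   }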
>>>
>>> N.B. It gets especially interesting when there are many ThreadLocals
>>> registered. Making 1M ThreadLocals and pre-touching them (sketched
>>> below, after the results) bloats the thread-local maps, and we get:
>>>
>>> JDK8 (baseline), contaminators = 1M:
>>> TLR.nextInt(): 6.4 +- 0.1 ns/op
>>> TLR.current().nextInt(): 21.7 +- 5.3 ns/op
>>> TL.get().nextInt(): 28.7 +- 1.1 ns/op
>>>
>>> JDK8 (patched), contaminators = 1M:
>>> TLR.nextInt(): 6.6 +- 0.2 ns/op
>>> TLR.current().nextInt(): 6.5 +- 0.1 ns/op
>>> TL.get().nextInt(): 29.4 +- 0.5 ns/op
>>>
>>> Note that the patched version successfully dodges this pathological
>>> case.
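>>>
>>> For reference, the 'contaminators' setup above amounts to something
>>> like the following (a rough sketch, not the actual harness):
>>>
>>>   import java.util.ArrayList;
>>>   import java.util.List;
>>>
>>>   public class Contaminate {
>>>       public static void main(String[] args) {
>>>           List<ThreadLocal<Integer>> keep = new ArrayList<>();
>>>           for (int i = 0; i < 1_000_000; i++) {
>>>               ThreadLocal<Integer> tl = ThreadLocal.withInitial(() -> 42);
>>>               tl.get();      // pre-touch: puts an entry into this
>>>                              // thread's ThreadLocalMap
>>>               keep.add(tl);  // strong reference keeps the entry live
>>>           }
>>>           // Every ThreadLocal.get() on this thread now probes a much
>>>           // bigger map -- presumably what slows the baseline
>>>           // TLR.current() down, while the patched version reads Thread
>>>           // fields and is unaffected.
>>>           System.out.println("registered " + keep.size() + " ThreadLocals");
>>>       }
>>>   }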
>>>
>>> Testing:
>>> - Doug tested on his platforms
>>> - Tested on Linux x86_64: builds and runs successfully
>>> - JPRT builds are OK
>>> - JPRT tests are OK (modulo some weird lambda/default-methods test
>>> failures in jdk8/tl)
>>>
>>> Attribution:
>>> - dl: original patch
>>> - shade: testing, copyright headers, etc.
>>>
>>> -Aleksey.
>>>
>>