RFR: 8003246: Add Supplier to ThreadLocal
Peter Levart
peter.levart at gmail.com
Thu Dec 6 20:38:02 UTC 2012
On 12/06/2012 08:08 PM, Remi Forax wrote:
> On 12/06/2012 08:01 PM, Peter Levart wrote:
>> There's a quick trick that guarantees in-lining of get/set/remove:
>>
>> public static class FastThreadLocal<T> extends ThreadLocal<T> {
>>     @Override
>>     public final T get() { return super.get(); }
>>
>>     @Override
>>     public final void set(T value) { super.set(value); }
>>
>>     @Override
>>     public final void remove() { super.remove(); }
>> }
>>
>> ....just use static type FastThreadLocal everywhere in code.
>>
>> I tried it and it works.
>
> No, there is no way to have such a guarantee. Here it works either
> because the only ThreadLocal subclass you load is FastThreadLocal, or
> because the VM has profiled the call site and seen that you only use
> FastThreadLocal for that specific instruction.
Nothing is certain but death and taxes, I agree.
But think deeper, Remi!
How do you explain the following test:
public class ThreadLocalTest {

    static class Int { int value; }

    static class TL0 extends ThreadLocal<Int> {}
    static class TL1 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
    static class TL2 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
    static class TL3 extends ThreadLocal<Int> { public Int get() { return super.get(); } }
    static class TL4 extends ThreadLocal<Int> { public Int get() { return super.get(); } }

    static long doTest(ThreadLocal<Int> tl) {
        long t0 = System.nanoTime();
        for (int i = 0; i < 100000000; i++)
            tl.get().value++;
        return System.nanoTime() - t0;
    }

    static long doTest(FastThreadLocal<Int> tl) {
        long t0 = System.nanoTime();
        for (int i = 0; i < 100000000; i++)
            tl.get().value++;
        return System.nanoTime() - t0;
    }

    static long test0(ThreadLocal<Int> tl) {
        if (tl instanceof FastThreadLocal)
            return doTest((FastThreadLocal<Int>)tl);
        else
            return doTest(tl);
    }

    static void test(ThreadLocal<Int> tl) {
        tl.set(new Int());
        System.out.print(tl.getClass().getName() + ":");
        for (int i = 0; i < 8; i++)
            System.out.print(" " + test0(tl));
        System.out.println();
    }

    public static void main(String[] args) {
        TL0 tl0 = new TL0();
        test(tl0);
        test(new TL1());
        test(new TL2());
        test(new TL3());
        test(new TL4());
        test(tl0);
    }
}
This prints the following, showing an almost 2x slowdown of TL0
(last line compared to first):
test.ThreadLocalTest$TL0: 342716421 326105315 300744544 300654890
                          300726346 300752009 300700781 300735651
test.ThreadLocalTest$TL1: 321424139 312128166 312173383 312125203
                          312142144 312150949 316760957 313393554
test.ThreadLocalTest$TL2: 525661886 524169413 524184405 524215685
                          524162050 524400364 524174966 454370228
test.ThreadLocalTest$TL3: 472042229 471071328 464387909 468047355
                          464795171 464466481 464449567 464365974
test.ThreadLocalTest$TL4: 459651686 454142365 454129481 454180718
                          454217277 454109611 454119988 456978405
test.ThreadLocalTest$TL0: 582252322 582773455 582612509 582753610
                          582626360 582852195 582805654 582598285
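To put numbers on that, here is a small sketch that converts the last sample of the first and of the last TL0 row into a per-iteration cost, using the 100000000 iterations from doTest. The class name SlowdownRatio and its two helpers are only for illustration and are not part of the test above.

```java
public class SlowdownRatio {
    // Loop count used by doTest in the benchmark.
    static final long ITERATIONS = 100_000_000L;

    // Average nanoseconds spent per tl.get().value++ iteration in one timed run.
    static double nanosPerCall(long elapsedNanos) {
        return (double) elapsedNanos / ITERATIONS;
    }

    // How many times slower the slow run is than the fast run.
    static double ratio(long slowNanos, long fastNanos) {
        return (double) slowNanos / fastNanos;
    }

    public static void main(String[] args) {
        long firstTl0 = 300735651L; // last sample of the first TL0 row
        long lastTl0 = 582598285L;  // last sample of the final TL0 row
        System.out.printf("%.2f ns vs %.2f ns per call, ratio %.2f%n",
                nanosPerCall(firstTl0), nanosPerCall(lastTl0),
                ratio(lastTl0, firstTl0));
    }
}
```

That works out to roughly 3.0 ns per call before the other subclasses are loaded and roughly 5.8 ns after, a ratio of about 1.94.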
Now with a simple change of:
static class TL0 extends FastThreadLocal<Int> {}
...the same test prints:
test.ThreadLocalTest$TL0: 330722181 325823711 301171182 309992192
                          321868979 308111417 303806979 300612033
test.ThreadLocalTest$TL1: 330263857 326448062 300607081 300575641
                          307442821 300616794 300548457 303462898
test.ThreadLocalTest$TL2: 319627165 311309477 311465815 311279612
                          311294427 311315803 311470291 311293823
test.ThreadLocalTest$TL3: 526849874 524209792 524421574 524166747
                          524396011 524163313 524395641 524165429
test.ThreadLocalTest$TL4: 464963126 455172216 455466304 455245487
                          455368318 455093735 455125038 455317375
test.ThreadLocalTest$TL0: 300472239 300695398 300480230 303459397
                          300451419 300679904 300445717 300451166
And that's very repeatable! Try it for yourself (on JDK8 of course).
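For anyone who wants to try the pattern in isolation, here is a minimal self-contained sketch. The names FastThreadLocalDemo, COUNTER, bump and getIsFinal are hypothetical and only for illustration. It does not prove anything about inlining by itself. It only shows the two ingredients the trick relies on: the field is declared with the static type FastThreadLocal, and get() on that type is final, so no later-loaded subclass can override it.

```java
import java.lang.reflect.Modifier;

public class FastThreadLocalDemo {
    // Same shape as the FastThreadLocal quoted earlier in this thread:
    // final overrides that only delegate to ThreadLocal.
    static class FastThreadLocal<T> extends ThreadLocal<T> {
        @Override public final T get() { return super.get(); }
        @Override public final void set(T value) { super.set(value); }
        @Override public final void remove() { super.remove(); }
    }

    // Hypothetical field: the declared type is FastThreadLocal, not ThreadLocal,
    // so call sites below resolve to the final methods.
    static final FastThreadLocal<int[]> COUNTER = new FastThreadLocal<>();

    // Increments a thread-local counter 'times' times and returns the result.
    static int bump(int times) {
        COUNTER.set(new int[1]);
        for (int i = 0; i < times; i++) {
            COUNTER.get()[0]++;
        }
        int result = COUNTER.get()[0];
        COUNTER.remove();
        return result;
    }

    // Reports whether FastThreadLocal declares get() as final.
    static boolean getIsFinal() {
        try {
            return Modifier.isFinal(
                    FastThreadLocal.class.getDeclaredMethod("get").getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bump(1000));
        System.out.println(getIsFinal());
    }
}
```

Whether HotSpot actually inlines the call still depends on the JIT, so timing runs like the ones above remain the only real evidence.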
Regards, Peter
>
>>
>>
>> Regards, Peter
>
> cheers,
> Rémi
>
>>
>> On 12/06/2012 01:03 PM, Doug Lea wrote:
>>> On 12/06/12 06:56, Vitaly Davidovich wrote:
>>>> Doug,
>>>>
>>>> When you see the fast to slow ThreadLocal transition due to class loading
>>>> invalidating inlined get(), do you not then see it get restored back to fast
>>>> mode since the receiver type in your call sites is still the monomorphic
>>>> ThreadLocal (and not the unrelated subclasses)? Just trying to understand what
>>>> Rémi and you are saying.
>>>>
>>>
>>> The possible outcomes are fairly non-deterministic, depending
>>> on hotspot's mood about recompiles, tiered-compile interactions,
>>> method size, Amdahl's law interactions, phase of moon, etc.
>>>
>>> (In j.u.c, we have learned that our users appreciate things
>>> being predictably fast enough rather than being
>>> unpredictably sometimes even faster but often slower.
>>> So when we see such cases, as with ThreadLocal, they get added
>>> to todo list.)
>>>
>>> -Doug
>>>
>>>
>>>
>>>
>>
>