RFR: 8003246: Add Supplier to ThreadLocal
Peter Levart
peter.levart at gmail.com
Thu Dec 6 21:21:02 UTC 2012
Ok, I've got an explanation.
It's not because of using a different static type for the variables in
code with final methods, but because TL0 was redirected to a separate
method with separate call sites. The same happens with this variant:
    static class TL0 extends ThreadLocal<Int> {}
    static class TL1 extends ThreadLocal<Int> {
        public Int get() { return super.get(); }
    }
    static class TL2 extends ThreadLocal<Int> {
        public Int get() { return super.get(); }
    }
    static class TL3 extends ThreadLocal<Int> {
        public Int get() { return super.get(); }
    }
    static class TL4 extends ThreadLocal<Int> {
        public Int get() { return super.get(); }
    }

    static long doTest(ThreadLocal<Int> tl) {
        long t0 = System.nanoTime();
        for (int i = 0; i < 100000000; i++)
            tl.get().value++;
        return System.nanoTime() - t0;
    }

    static long doTest0(ThreadLocal<Int> tl) {
        long t0 = System.nanoTime();
        for (int i = 0; i < 100000000; i++)
            tl.get().value++;
        return System.nanoTime() - t0;
    }

    static long test0(ThreadLocal<Int> tl) {
        if (tl instanceof TL0)
            return doTest0(tl);
        else
            return doTest(tl);
    }
But I think that the deoptimizations that Doug is talking about might be
prevented by using the following variant of TL:
    public class FastThreadLocal<T> extends ThreadLocal<T> {
        public final T getFast() { return super.get(); }
        public final void setFast(T value) { super.set(value); }
        public final void removeFast() { super.remove(); }
    }
and invoking the "fast" methods in code.
Right?
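
A minimal sketch of how calling code might use this variant. The class name FastThreadLocalDemo and the helper incrementViaFast are made up for illustration and are not part of the proposal. It only shows that the "fast" methods read and write the same per-thread value as the inherited get/set/remove; it does not measure anything, and whether the JIT actually keeps these calls inlined is exactly the open question above.

```java
public class FastThreadLocalDemo {

    // Same shape as the variant above: new final methods that delegate to
    // the inherited ones instead of overriding get/set/remove.
    static class FastThreadLocal<T> extends ThreadLocal<T> {
        public final T getFast() { return super.get(); }
        public final void setFast(T value) { super.set(value); }
        public final void removeFast() { super.remove(); }
    }

    // Mutable holder, as in the benchmark.
    static class Int { int value; }

    // Hypothetical helper: bumps the current thread's counter n times
    // through getFast() and returns the resulting value.
    static int incrementViaFast(FastThreadLocal<Int> tl, int n) {
        for (int i = 0; i < n; i++)
            tl.getFast().value++;
        return tl.getFast().value;
    }

    public static void main(String[] args) {
        FastThreadLocal<Int> tl = new FastThreadLocal<>();
        tl.setFast(new Int());
        System.out.println(incrementViaFast(tl, 1000));
        // The ordinary API still sees the same per-thread value.
        System.out.println(tl.get().value);
        // After removeFast() the default initialValue() applies again.
        tl.removeFast();
        System.out.println(tl.get());
    }
}
```

Because get/set/remove are left untouched, a subclass elsewhere can still override them without affecting code that calls only the final "fast" methods.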
Regards, Peter
On 12/06/2012 09:38 PM, Peter Levart wrote:
> On 12/06/2012 08:08 PM, Remi Forax wrote:
>> On 12/06/2012 08:01 PM, Peter Levart wrote:
>>> There's a quick trick that guarantees in-lining of get/set/remove:
>>>
>>>     public static class FastThreadLocal<T> extends ThreadLocal<T> {
>>>         @Override
>>>         public final T get() { return super.get(); }
>>>
>>>         @Override
>>>         public final void set(T value) { super.set(value); }
>>>
>>>         @Override
>>>         public final void remove() { super.remove(); }
>>>     }
>>>
>>> ....just use static type FastThreadLocal everywhere in code.
>>>
>>> I tried it and it works.
>>
>> No, there is no way to have such a guarantee. Here it works either
>> because the only ThreadLocal class you load is FastThreadLocal or
>> because the VM has profiled the call site and seen that you only use
>> FastThreadLocal for a specific instruction.
>
> Nothing is certain but death and taxes, I agree.
>
> But think deeper, Remi!
>
> How do you explain the following test:
>
> public class ThreadLocalTest {
>
>     static class Int { int value; }
>
>     static class TL0 extends ThreadLocal<Int> {}
>     static class TL1 extends ThreadLocal<Int> {
>         public Int get() { return super.get(); }
>     }
>     static class TL2 extends ThreadLocal<Int> {
>         public Int get() { return super.get(); }
>     }
>     static class TL3 extends ThreadLocal<Int> {
>         public Int get() { return super.get(); }
>     }
>     static class TL4 extends ThreadLocal<Int> {
>         public Int get() { return super.get(); }
>     }
>
>     static long doTest(ThreadLocal<Int> tl) {
>         long t0 = System.nanoTime();
>         for (int i = 0; i < 100000000; i++)
>             tl.get().value++;
>         return System.nanoTime() - t0;
>     }
>
>     static long doTest(FastThreadLocal<Int> tl) {
>         long t0 = System.nanoTime();
>         for (int i = 0; i < 100000000; i++)
>             tl.get().value++;
>         return System.nanoTime() - t0;
>     }
>
>     static long test0(ThreadLocal<Int> tl) {
>         if (tl instanceof FastThreadLocal)
>             return doTest((FastThreadLocal<Int>)tl);
>         else
>             return doTest(tl);
>     }
>
>     static void test(ThreadLocal<Int> tl) {
>         tl.set(new Int());
>         System.out.print(tl.getClass().getName() + ":");
>         for (int i = 0; i < 8; i++)
>             System.out.print(" " + test0(tl));
>         System.out.println();
>     }
>
>     public static void main(String[] args) {
>         TL0 tl0 = new TL0();
>         test(tl0);
>         test(new TL1());
>         test(new TL2());
>         test(new TL3());
>         test(new TL4());
>         test(tl0);
>     }
> }
>
>
> Which prints the following (demonstrating almost 2x slowdown of TL0 -
> last line compared to first):
>
> test.ThreadLocalTest$TL0: 342716421 326105315 300744544 300654890
> 300726346 300752009 300700781 300735651
> test.ThreadLocalTest$TL1: 321424139 312128166 312173383 312125203
> 312142144 312150949 316760957 313393554
> test.ThreadLocalTest$TL2: 525661886 524169413 524184405 524215685
> 524162050 524400364 524174966 454370228
> test.ThreadLocalTest$TL3: 472042229 471071328 464387909 468047355
> 464795171 464466481 464449567 464365974
> test.ThreadLocalTest$TL4: 459651686 454142365 454129481 454180718
> 454217277 454109611 454119988 456978405
> test.ThreadLocalTest$TL0: 582252322 582773455 582612509 582753610
> 582626360 582852195 582805654 582598285
>
> Now with a simple change of:
>
> static class TL0 extends FastThreadLocal<Int> {}
>
> ...the same test prints:
>
> test.ThreadLocalTest$TL0: 330722181 325823711 301171182 309992192
> 321868979 308111417 303806979 300612033
> test.ThreadLocalTest$TL1: 330263857 326448062 300607081 300575641
> 307442821 300616794 300548457 303462898
> test.ThreadLocalTest$TL2: 319627165 311309477 311465815 311279612
> 311294427 311315803 311470291 311293823
> test.ThreadLocalTest$TL3: 526849874 524209792 524421574 524166747
> 524396011 524163313 524395641 524165429
> test.ThreadLocalTest$TL4: 464963126 455172216 455466304 455245487
> 455368318 455093735 455125038 455317375
> test.ThreadLocalTest$TL0: 300472239 300695398 300480230 303459397
> 300451419 300679904 300445717 300451166
>
>
> And that's very repeatable! Try it for yourself (on JDK8 of course).
>
> Regards, Peter
>
>
>>
>>>
>>>
>>> Regards, Peter
>>
>> cheers,
>> Rémi
>>
>>>
>>> On 12/06/2012 01:03 PM, Doug Lea wrote:
>>>> On 12/06/12 06:56, Vitaly Davidovich wrote:
>>>>> Doug,
>>>>>
>>>>> When you see the fast to slow ThreadLocal transition due to class
>>>>> loading
>>>>> invalidating inlined get(), do you not then see it get restored
>>>>> back to fast
>>>>> mode since the receiver type in your call sites is still the
>>>>> monomorphic
>>>>> ThreadLocal (and not the unrelated subclasses)? Just trying to
>>>>> understand what
>>>>> Rémi and you are saying.
>>>>>
>>>>
>>>> The possible outcomes are fairly non-deterministic, depending
>>>> on hotspot's mood about recompiles, tiered-compile interactions,
>>>> method size, Amdahl's law interactions, phase of moon, etc.
>>>>
>>>> (In j.u.c, we have learned that our users appreciate things
>>>> being predictably fast enough rather than being
>>>> unpredictably sometimes even faster but often slower.
>>>> So when we see such cases, as with ThreadLocal, they get added
>>>> to todo list.)
>>>>
>>>> -Doug
>>>>
>>>>
>>>>
>>>>
>>>
>>
>