Odd decrease of benchmark throughput

Dávid Karnok akarnokd at gmail.com
Tue Sep 6 15:12:26 UTC 2016


I think I found the problem. JITWatch's own analysis (I should have looked
at that earlier) indicated that two of the hottest methods couldn't be
inlined at L4 (the C2 tier) because they were too deep in the call stack.
With -XX:MaxInlineLevel=20 added, the throughput stayed at ~650 ops/s for
any number of measurement iterations.
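
To check which inline depth limit a JVM is actually running with, the flag
can be read at runtime. This is a minimal sketch, assuming a HotSpot JVM;
the class name InlineLevelCheck is made up for illustration and is not part
of my benchmark.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports the MaxInlineLevel value of the running JVM.
final class InlineLevelCheck {

    // Reads the flag through the HotSpot diagnostic MXBean (HotSpot JVMs only).
    static int maxInlineLevel() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return Integer.parseInt(bean.getVMOption("MaxInlineLevel").getValue());
    }

    public static void main(String[] args) {
        System.out.println("MaxInlineLevel = " + maxInlineLevel());
    }
}
```

Started with -XX:MaxInlineLevel=20 on the command line, this should report
20; without the flag it reports the default of that particular JVM build.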

So it seems that until the @Benchmark method got inlined, everything was
relatively fine, but once that outermost method became eligible for
JIT compilation, the hot path ended up deeper than the default inline level
allows, and the resulting code was about 3x slower.
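
The shape of the problem can be sketched with a call chain that is deeper
than a small inline depth limit. This is only an illustration under my own
assumptions, not the code from my benchmark: the class DeepChain and its
methods are hypothetical, and whether the JIT really refuses to inline the
tail of the chain depends on the JVM version and its other heuristics.

```java
// Hypothetical illustration: twelve tiny methods, each calling the next.
// If a caller further out becomes the compilation root, the methods at the
// end of the chain sit even deeper and may exceed the inline depth limit.
final class DeepChain {
    static int level1(int x)  { return level2(x) + 1; }
    static int level2(int x)  { return level3(x) + 1; }
    static int level3(int x)  { return level4(x) + 1; }
    static int level4(int x)  { return level5(x) + 1; }
    static int level5(int x)  { return level6(x) + 1; }
    static int level6(int x)  { return level7(x) + 1; }
    static int level7(int x)  { return level8(x) + 1; }
    static int level8(int x)  { return level9(x) + 1; }
    static int level9(int x)  { return level10(x) + 1; }
    static int level10(int x) { return level11(x) + 1; }
    static int level11(int x) { return level12(x) + 1; }
    static int level12(int x) { return x * 31; } // the "hot" leaf

    // Outermost loop, standing in for the method that drives the hot path.
    static int run(int iterations) {
        int acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += level1(i);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Enough iterations to make the loop hot; the result is printed so
        // the work cannot be discarded as dead code.
        System.out.println(run(50_000_000));
    }
}
```

Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining, with and
without -XX:MaxInlineLevel=20, shows which calls the compiler decided to
inline in each case.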

Thank you for your time.


2016-09-06 16:18 GMT+02:00 Dávid Karnok <akarnokd at gmail.com>:

> Thank you for the answer. I guess I'd need xperf on Windows, but that tool
> requires Win 8+. I'll try my luck with JITWatch again to see the difference
> between the C1 and C2 assemblies.
>
> 2016-09-06 16:00 GMT+02:00 Aleksey Shipilev <ashipile at redhat.com>:
>
>> On 09/06/2016 01:05 PM, Dávid Karnok wrote:
>> > # Run progress: 16,67% complete, ETA 00:01:47
>> > # Fork: 1 of 1
>> > # Warmup Iteration   1: 622,250 ops/s
>> > # Warmup Iteration   2: 646,154 ops/s
>> > # Warmup Iteration   3: 637,035 ops/s
>> > # Warmup Iteration   4: 639,014 ops/s
>> > # Warmup Iteration   5: 645,212 ops/s
>> > Iteration   1: 648,120 ops/s
>> > Iteration   2: 647,042 ops/s
>> > Iteration   3: 650,176 ops/s
>> > Iteration   4: 335,979 ops/s
>> > Iteration   5: 195,415 ops/s
>> >
>> > (Running Windows 7 x64, Java 8u102, i7 4790)
>> >
>> > Please advise.
>>
>> We have seen behavior like that before.
>>
>> The way to further diagnose this: prepare two runs where the measurement
>> phase a) has only 650 ops/s iterations; b) has only 195 ops/s iterations
>> -- vary warmup/measurement durations to fit. After that, -prof perfasm
>> both runs and see where the difference in profiles is. perfasm takes
>> only the measurement phase into consideration.
>>
>> 99% bet is on different compilation, and it is important to know what
>> exactly compiled differently in 195 ops/sec iterations.
>>
>> Thanks,
>> -Aleksey
>>
>
>
>
> --
> Best regards,
> David Karnok
>



-- 
Best regards,
David Karnok


More information about the jmh-dev mailing list