tableswitch does not use an array of offsets
Azeem Jiva
azeem.jiva at oracle.com
Wed Mar 27 09:17:26 PDT 2013
Ahh yes, that old piece of code :) 18 was picked because, after some investigation, it seemed at the time to be the best balance between binary search and jump tables. That's probably different now on better/newer hardware. I think 18 was picked for SPECjvm98, which seemed to run best at 18.
--
Azeem Jiva
@javawithjiva
On Mar 27, 2013, at 8:57 AM, Krystal Mo <krystal.mo at oracle.com> wrote:
> I'm curious, too. Did a search and found a piece of relevant history:
>
> 6335285 : 17% regression for alacrity_jetstream at linux-amd64 for 1.6.0b55 vs 1.6.0b54
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6335285
>
> That change set MinJumpTableSize to 18 after some benchmarking. It was made in 2005.
>
> It looks like in the early days C2 only compiled tableswitch as a decision tree, and generating code with an actual jump table was added later (circa 2004).
>
> - Kris
>
> On 2013/3/27 7:16, Chuck Rasbold wrote:
>> The behavior is expected, but the rationale behind it may be lost to history.
>>
>> There's a HotSpot flag, -XX:MinJumpTableSize=<NN> that controls the minimum number of consecutive case branches needed to provoke the creation of a jump table.
>>
>> As you've discovered, the default is 18. I'm sure that value had some empirical backing when it was set (at least six) years ago, but perhaps it could be differently tuned for today's hardware.
>>
>> -- Chuck
>>
>>
>> On Wed, Mar 27, 2013 at 1:56 AM, Yann Le Tallec <ylt at letallec.org> wrote:
>> The code below is compiled to a tableswitch bytecode instruction, as one would expect, because the case values are contiguous. However, the JIT (HotSpot, JDK/JRE 1.7u17 64-bit on x86/Windows, in either -server or -client mode) compiles it into a succession of cmp/je/jg instructions.
>>
>> When adding one more case statement (case 17) to reach a total number of 18 case statements, the JIT does compile the switch using an array of offsets to calculate the jump.
>>
>> A micro-benchmark (controlled for compilation, inlining (off), gc) shows that adding an additional case statement to the method below (case 17) to reach a total of 18 improves performance materially (up to 25% depending on how the method is exercised).
>>
>> Is the decision to *not* use an array of offsets for <18 cases derived from profiling? Is that the expected behaviour and what is the underlying reason?
>>
>> Many thanks,
>> Yann
>>
>> static double multiplyByPowerOfTen(final double d, final int exponent) {
>>     switch (exponent) {
>>         case 0:
>>             return d;
>>         case 1:
>>             return d * 10;
>>         case 2:
>>             return d * 100;
>>         case 3:
>>             return d * 1000;
>>         case 4:
>>             return d * 10000;
>>         case 5:
>>             return d * 100000;
>>         case 6:
>>             return d * 1000000;
>>         case 7:
>>             return d * 10000000;
>>         case 8:
>>             return d * 100000000;
>>         case 9:
>>             return d * 1000000000;
>>         case 10:
>>             return d * 10000000000L;
>>         case 11:
>>             return d * 100000000000L;
>>         case 12:
>>             return d * 1000000000000L;
>>         case 13:
>>             return d * 10000000000000L;
>>         case 14:
>>             return d * 100000000000000L;
>>         case 15:
>>             return d * 1000000000000000L;
>>         case 16:
>>             return d * 10000000000000000L;
>>         default:
>>             throw new RuntimeException("Unhandled power of ten " + exponent);
>>     }
>> }
>>
>
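
For readers unfamiliar with the two dispatch strategies discussed in this thread, the following is a minimal conceptual sketch in plain Java. It is not HotSpot code and does not reproduce what C2 emits; the class name SwitchDispatchSketch, the methods viaDecisionTree and viaJumpTable, and the constants are all hypothetical names chosen for illustration. It only models the difference between a chain of compare-and-branch steps (the cmp/je/jg sequence) and a single indexed lookup (the array of offsets), for 18 contiguous case values.

```java
public class SwitchDispatchSketch {
    // Assumption for illustration: 18 contiguous cases (0..17), the default
    // threshold mentioned in the thread. Each case's "branch target" is
    // modelled as its own index; -1 stands in for the default label.
    static final int CASES = 18;
    static final int DEFAULT_TARGET = -1;

    // Strategy 1: decision tree. A binary search over the case range,
    // costing roughly log2(n) compare-and-branch steps per lookup.
    static int viaDecisionTree(int key) {
        int lo = 0;
        int hi = CASES - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (key == mid) {
                return mid;
            }
            if (key < mid) {
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        return DEFAULT_TARGET;
    }

    // Strategy 2: jump table. One range check, then one indexed load
    // from a table holding a target per case value.
    static final int[] JUMP_TABLE = new int[CASES];
    static {
        for (int i = 0; i < CASES; i++) {
            JUMP_TABLE[i] = i;
        }
    }

    static int viaJumpTable(int key) {
        if (key < 0 || key >= CASES) {
            return DEFAULT_TARGET;
        }
        return JUMP_TABLE[key];
    }

    public static void main(String[] args) {
        // Both strategies must select the same target for every key,
        // including out-of-range keys that fall through to the default.
        for (int key = -2; key <= CASES + 1; key++) {
            if (viaDecisionTree(key) != viaJumpTable(key)) {
                throw new AssertionError("mismatch at key " + key);
            }
        }
        System.out.println("both strategies agree");
    }
}
```

The two methods always pick the same target; they differ only in how much work it takes to get there, which is the trade-off the threshold tunes. To see what the JIT actually does for a real switch, the -XX:MinJumpTableSize=<NN> flag mentioned above can be varied when running the original method.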