Optimising invoke and return

Fri Feb 20 07:23:03 PST 2009

Hi again folks,

The benchmarking so far has shown me that to get any real improvement on a (semi) real-world
application like Think Free Office I must look at a number of things...

1) Optimisation of invoke and return

The way invoke and return are done at the moment in the CC_INTERP is disgusting.

When 'run' encounters an invoke or return it does some limited processing, sets the VM
state to 'method_entry' or 'returning' and then returns from 'run' to 'main_loop' which
then handles the invoke or return and then calls 'run' again.

Now with the 'optimised' interpreter, when it encounters an invoke or return it has to
thread its way back up from run_opt to main_loop and then back down to run_opt.

Pre-optimisation handling of invoke / return

main_loop -> run -> main_loop -> run

Post 'optimisation'

main_loop -> run -> run_opt -> run -> main_loop -> run -> run_opt.

Oops.
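
To make the extra hops concrete, here is a toy Python model of the two call chains. It is only a sketch: the function names mirror the ones above, but the bodies, the `state` dictionary and the `pending_invoke` flag are assumptions made up for illustration, not CC_INTERP code.

```python
def trace_one_invoke(optimised):
    """Return the chain of functions control passes through for one invoke."""
    trace = []

    def run_opt(state):
        trace.append("run_opt")
        if state["pending_invoke"]:
            # Fast path cannot handle the invoke itself: flag it and bail out.
            state["mode"] = "method_entry"
            state["pending_invoke"] = False
        else:
            state["mode"] = "done"

    def run(state):
        trace.append("run")
        if optimised:
            run_opt(state)
            if state["mode"] == "method_entry":
                trace.append("run")  # unwinding through run on the way up
            return
        if state["pending_invoke"]:
            state["mode"] = "method_entry"
            state["pending_invoke"] = False
        else:
            state["mode"] = "done"

    def main_loop(state):
        trace.append("main_loop")
        run(state)
        while state["mode"] == "method_entry":
            trace.append("main_loop")  # back up here to set up the callee
            state["mode"] = "running"
            run(state)

    main_loop({"mode": "running", "pending_invoke": True})
    return " -> ".join(trace)


print(trace_one_invoke(False))
print(trace_one_invoke(True))
```

The first call reproduces the four-step chain and the second the seven-step chain, so in this model every invoke costs three extra transitions once run_opt sits below run.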

The way around this is to flatten main_loop into run so that it is all handled
in 'run'. Then the code for invoke/return can be migrated down to run_opt (at least
for the simple cases where the method is already resolved, not synchronised(?), and
no exceptions are involved).
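
As a rough sketch of what "flattened" means, the toy Python interpreter below handles the simple invoke and return cases inside a single loop, so control never leaves that loop for them. Everything in it is hypothetical: the opcode names, the frame layout and the `methods` table are invented for illustration and assume every callee is already resolved, unsynchronised and exception-free.

```python
def run_flat(methods, entry):
    """Interpret `entry`, handling simple invoke/return inside one loop."""
    frames = [{"code": methods[entry], "pc": 0, "stack": []}]
    while True:
        frame = frames[-1]
        op, arg = frame["code"][frame["pc"]]
        frame["pc"] += 1
        if op == "push":
            frame["stack"].append(arg)
        elif op == "add":
            b = frame["stack"].pop()
            a = frame["stack"].pop()
            frame["stack"].append(a + b)
        elif op == "invoke":
            # Simple case only: push the callee frame and keep looping.
            frames.append({"code": methods[arg], "pc": 0, "stack": []})
        elif op == "return":
            result = frame["stack"].pop()
            frames.pop()
            if not frames:
                return result  # returned from the entry method
            frames[-1]["stack"].append(result)  # hand result to the caller
        else:
            # A real interpreter would drop to a slow path here instead.
            raise ValueError("unhandled opcode: " + op)


methods = {
    "main": [("push", 1), ("invoke", "two"), ("add", None), ("return", None)],
    "two": [("push", 2), ("return", None)],
}
print(run_flat(methods, "main"))
```

The hard cases (unresolved methods, synchronised methods, pending exceptions) are exactly the ones this sketch leaves out; they would still need a slower path.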

However, this will involve large-scale munging of the codebase, which I wanted to avoid.

Increasingly I am feeling that I am flogging a dead horse with the CC_INTERP.

I had a look at the template interpreter over the past few days. It doesn't actually look
too bad. It all seems quite neatly architected (at least to the depth I examined it).

2) Optimisation of native invoke and return

3) Optimisation of the native libraries.

My pointy-haired boss is in town today and I think my proposal to him will be that we
do the template interpreter. Based on some benchmarking I did on the PC comparing zero
with -Xint, this should give us a 6x performance increase over bare-bones zero (4x
over my current optimised implementation).

Doing the template interpreter would also serve as a useful first step to doing HotSpot
(should we decide to go that way).

Gary: Does the HotSpot compiler use the template interpreter to do its codegen, or does
it have its own codegen and only use the template interpreter to interpret uncompiled
code? (I could check this myself in the source, but I am lazy.)

Regards,
Ed.