Assembly output from JRuby 'fib'

Charles Oliver Nutter headius at headius.com
Thu Apr 28 11:28:01 PDT 2011


On Thu, Apr 28, 2011 at 11:18 AM, Rémi Forax <forax at univ-mlv.fr> wrote:
>>> >  Any thoughts on how we can make this even faster? The bulk of the code
>>> >  seems to be taken up by a few operations inherent to Fixnum math:
>>> >
>>> >  * Memory accesses relating to CallSite subclasses (LtCallSite and friends)
>>> >  * instanceof checks in those math-related CallSites
>
> It should be a class check, not an instanceof check, no?

The fast-path math call sites (as in subclasses of JRuby's CallSite)
check two things:

* Whether the receiver is a RubyFixnum object
* Whether Fixnum has been reopened and modified (replacing +, -, whatever)

If both checks pass, the call site casts the receiver to RubyFixnum and
calls the target method directly. So far this is faster than
invokedynamic, but if that changes, these custom call sites would
become unnecessary.
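
In rough shape, a guarded fast path of that sort looks something like
the sketch below (hypothetical names, not JRuby's actual CallSite API):

    // Minimal sketch of a guarded fast-path call site for Fixnum "+".
    interface RObject { RObject dispatchPlus(RObject arg); }   // generic dynamic dispatch

    final class RFixnum implements RObject {
        final long value;
        RFixnum(long value) { this.value = value; }
        public RObject dispatchPlus(RObject arg) { return plus(arg); }
        RObject plus(RObject arg) {                    // the method the fast path targets
            if (arg instanceof RFixnum) return new RFixnum(value + ((RFixnum) arg).value);
            throw new UnsupportedOperationException("coercion path omitted");
        }
    }

    final class PlusCallSite {
        static volatile boolean fixnumPlusRedefined = false;  // flipped if Fixnum#+ is replaced

        RObject call(RObject receiver, RObject arg) {
            // Guard 1: the receiver really is a Fixnum.
            // Guard 2: Fixnum#+ has not been reopened/replaced by user code.
            if (receiver instanceof RFixnum && !fixnumPlusRedefined) {
                return ((RFixnum) receiver).plus(arg);  // cast and call directly
            }
            return receiver.dispatchPlus(arg);          // fall back to generic dispatch
        }
    }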

>>> >  * Fixnum overflow checks in + and - operations
>
> Do you specialize the overflow check depending on the call site?
> For fib(n - 1), you just have to check whether n is different from
> Integer.MIN_VALUE; for fib(n - 2), whether n is <= Integer.MIN_VALUE + 1;
> and for + use the double-xor trick.

Not at the call site, but addition and subtraction do have specialized
overflow checking. I'm not sure if these are being done as cheaply as
they could be:

    private static boolean additionOverflowed(long original, long other, long result) {
        return (~(original ^ other) & (original ^ result) & SIGN_BIT) != 0;
    }

    private static boolean subtractionOverflowed(long original, long other, long result) {
        return (~(original ^ ~other) & (original ^ result) & SIGN_BIT) != 0;
    }
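
The idea is that addition has overflowed exactly when the operands have
the same sign but the result's sign differs, and subtraction when the
operands have opposite signs and the result's sign differs from the
first operand's. A standalone version, assuming SIGN_BIT is the long
sign bit (1L << 63), just to make the behavior concrete:

    // Standalone sketch; SIGN_BIT is assumed to be the high (sign) bit of a long.
    public class OverflowCheckDemo {
        private static final long SIGN_BIT = 1L << 63;

        // Overflow iff original and other have the same sign but result's sign differs.
        static boolean additionOverflowed(long original, long other, long result) {
            return (~(original ^ other) & (original ^ result) & SIGN_BIT) != 0;
        }

        // Note ~(original ^ ~other) == original ^ other: overflow iff the operands
        // have opposite signs and the result's sign differs from original's.
        static boolean subtractionOverflowed(long original, long other, long result) {
            return (~(original ^ ~other) & (original ^ result) & SIGN_BIT) != 0;
        }

        public static void main(String[] args) {
            System.out.println(additionOverflowed(Long.MAX_VALUE, 1, Long.MAX_VALUE + 1));    // true
            System.out.println(subtractionOverflowed(Long.MIN_VALUE, 1, Long.MIN_VALUE - 1)); // true
            System.out.println(additionOverflowed(1, 2, 3));                                  // false
        }
    }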

>>> >  * Fixnum allocation/initialization costs (or Fixnum cache accesses)
>>> >
>>> >  As it stands today, the overhead of Fixnum operations is the primary
>>> >  factor preventing us from writing a lot more of JRuby's code in Ruby.
>>> >  Fixnums are too expensive to use for iterating over an array, doing a
>>> >  loop, etc. Of course we could do some code analysis to try to reduce
>>> >  loops to simple int operations, but barring that...does anyone have
>>> >  suggestions for reducing the cost of actual Fixnum operations?
>
> You should do what I'm doing with PHP.reboot.
> The interpreter profiles the dynamic types of the variables, and when
> you generate the code you insert type checks to verify that the
> optimistic assumptions are correct.
>
> Compared to your dynopt approach, this means you don't have to
> generate both versions of the code. You generate the one with int +
> overflow checks, and if an overflow occurs you escape (using
> invokedynamic) to the code that works with Fixnum.
> The tricky part is: how do you come back?

Yes, that's where I get stuck too :) I don't see an easy way to
specialize Fixnum code to long because of the overflow check (and the
potential for someone mutating Fixnum, though that's very rare). I
*do* however see that it would be easy to specialize Float code to
double, since Float operations only ever produce Float. Wouldn't it be
unusual if JRuby were able to do floating-point math faster (like
orders of magnitude faster) than integer math? :)
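
To make the contrast concrete, here is a rough sketch (hypothetical
names, not actual JRuby code) of the two specialized bodies: the double
version needs no guard on its result, while the long version has to
detect overflow and escape back to the generic path:

    import java.math.BigInteger;

    // Hypothetical sketch of specialized arithmetic bodies a compiler might emit.
    class SpecializedMath {
        // Float specialization: Float + Float always yields Float, so the
        // unboxed double result can stay unboxed with no checks at all.
        static double floatAdd(double a, double b) {
            return a + b;
        }

        // Fixnum specialization: the unboxed long fast path must detect
        // overflow and escape to the generic path.
        static Object fixnumAdd(long a, long b) {
            long result = a + b;
            if (((a ^ result) & (b ^ result)) < 0) {   // xor overflow check
                return slowPathAdd(a, b);              // escape: deoptimize / rebind the call site
            }
            return result;                             // fast path (boxed here for simplicity)
        }

        // Stand-in for the generic path; in JRuby this would produce a Bignum.
        static Object slowPathAdd(long a, long b) {
            return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
        }
    }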

The main reason Fixnums are a concern to me is that we'd like to move
more of JRuby's logic into Ruby code, which would necessarily require a
lot of Fixnum operations...think array access, array walking, loops,
basic math, bitmasking, and so on. We *could* allow for a special "core
class" mode that opts out of bounds checks where none are needed. This
would cover all such cases, allowing us to specialize to long or int.
Obviously a general solution would be better, since it would apply to
all Ruby code...but the "core" trick would be fine for moving more of
JRuby into Ruby code.

- Charlie

