TaggedArrays
Charles Oliver Nutter
headius at headius.com
Wed Sep 12 19:06:07 PDT 2012
On Wed, Sep 12, 2012 at 5:24 PM, Jim Laskey <jlaskey at me.com> wrote:
> The builtin tagged array (TaggedArrayNativeImpl) has no levels of
> indirection, the simple and optimized forms are fall backs. So the
> performance is similar to Java arrays. And yes the code that Rickard
> submitted here is for native performance.
>
> One would assume your math operations would assume tagged integers. So
> there so there should only be ints and BigIntegers (rare).
>
> Naive Ex:
This is very illustrative, thank you! I never got around to commenting
on your original email, so I'm taking some time now (I'm also doing a
talk on "high performance Ruby" at Golden Gate RubyConf on Friday, so
I'm looking at perf and numbers again).
> TaggedArray stack = TaggedArrays.allocate(512 * 1024);
> int SP = 0;
>
> SP = add(stack, SP);
>
>
> int add(TaggedArray stack, int SP) {
> try {
> long x = stack.getValue(SP--); // simple array indexing with tag detection
> long y = stack.getValue(SP--); // simple array indexing with tag detection
> long z = x + y - 1; // simple integer addition, accounting for the tag bit
> isOverflow(z); // Throws exception for overflow
> stack.setValue(++SP); // simple array indexing with tag detection
> return SP;
> } catch ( ) {
> // slow case for arithmetic overflow, BigIntegers and stack overflow
> exceptions.
> }
> So most of the time you will be using straight ints. Tag detection is a bit
> test and branch.
I get this example, but I'll describe it a bit for myself (and maybe others).
* Each thread gets a TaggedArray to use for storing values. You bite
chunks off this by passing an on-stack (Java stack) int offset along
with the TaggedArray, and all local variable use would work with the
TaggedArray elements rather than Java locals.
* Because we don't store locals in Object slots on the Java stack
frame, we aren't forced to use boxed numerics all the time.
* Accessing the elements anywhere requires passing the TaggedArray and
the offset where the value is, e.g. to math logic (which can either
receive the SP where the result should go or return an SP stating
where it put the result).
As Mark pointed out, for those of us with growable instance var lists,
those would be TaggedArray implementations, and we could store long
values directly.
Yes, I can see how this would work. HOWEVER... :)
It seems like it's still impossible for this to ever approach the
performance of actual Java primitives (or tagged numerics) living
directly on the JVM stack. At a minimum, you're doing all that
bouncing in and out of TaggedArray, which for any non-trivial program
will be hundreds of elements long. Indeed, most of that should be in
cache, but it's still not going to be as fast as JVM local/stack
access, right?
I am intrigued by the idea, I will admit. It's not something I can
spend a day wiring into JRuby, of course, but I'm going to ponder it a
bit. JRuby has managed to get pretty solid performance even with boxed
numbers by slowly and carefully *eliminating* our artificial stack as
much as possible. Many Ruby methods now use nothing but JVM locals,
and for algorithms where we can make that guarantee we're many times
faster than the next fastest Ruby impl. But by using JVM locals and
stack for everything, we are also backing ourselves into a bit of a
corner, since we have to use Object references everywhere and the JVM
is not doing a good enough job of seeing through them.
I might play with prototyping a similar impl in JRuby, just to explore
feasibility.
- Charlie
More information about the mlvm-dev
mailing list