TaggedArrays (Proposal)
Rémi Forax
forax at univ-mlv.fr
Mon Jul 2 06:57:30 PDT 2012
On 07/02/2012 03:05 PM, Jim Laskey wrote:
> During a week in the rarefied air of Stockholm back in May, a
> sleepless night got me thinking. The day after that, the thinking
> became a reality. I've been sitting on the code since, not sure what
> to do next. So..., why not start the month leading up to the JVMLS
> with a discussion about dynamic values.
>
> Every jLanguage developer knows that primitive boxing is the enemy.
> Even more so for untyped languages. We need a way to interleave
> primitive types with references.
>
> Tagged values (value types) for dynamic languages have been approached
> from a dozen different angles over the history of Java. However, no
> one seems to be satisfied with any of the proposals so far. Either
> the implementation is too limiting for the language developer or too
> complex to implement.
>
> Most recently, John (Rose) proposed hiding value tagging in the JVM
> via the Integer/Long/Float/Double.valueof methods. I saw a few issues
> with this proposal. First, the implementation works differently on 32
> bit and 64 bit platforms (only half a solution on each). Secondly,
> control of the tag bits is hidden such that it doesn't give a language
> implementor any leeway on bit usage. Finally, it will take a long
> time for it to get introduced into the JVM. The implementation is
> complex, scattered all over the VM and will lead to a significant
> multiplier for testing coverage.
but it will also help Java perf.
>
> It occurred to me on that sleepless Monday night, that the solution
> for most dynamic languages could be so much simpler. First, we have
> to look at what it is we really need. Ultimately it's about boxing.
> We want to avoid allocating memory whenever we need to store a
> primitive value in an object. Concerning ourselves with passing
> around tagged values in registers and storing in stack frames is all
> red herring. All that is needed is a mechanism for storing tagged
> values (or compressed values) in a no-type slot of a generic object.
> Thinking about it in these terms isolates all issues to a single
> array-like class, and thus simplifies implementation and simplifies
> testing. Instances of this class can be used as objects, as stack
> frames and even full stacks. A good percentage of a dynamic language
> needs are covered.
using it as a stack frames will require a pretty good escape analysis if
you want same perf as the native stack
or is there a trick somewhere ?
But given that there is a trick to avoid boxing for local variables (see
my talk at next JVM Summit),
having an array like this just for storing fields is enough to pull its
weight.
>
> So, Rickard Bäckman (also of Oracle) and I defined an API and
> implemented (in HotSpot) an interface called TaggedArray.
> Conceptional, TaggedArray is a fixed array of no-type slots (64-bit),
> where each slot can contain either a reference or a tagged long value
> (least significant bit set.) Internally, TaggedArray class's doOop
> method knows that it should skip any 64-bit value with the least
> significant bit set. How the language developer uses the other 63
> bits is up to them. References are just addresses. On 32 bit
> machines, the address (or packed address) is stored in the high
> 32-bits (user has no access) So there is no interference with the tag
> bit.
>
> We supply four implementations of the API. 1) is a naive two parallel
> arrays (one Object[], one long[]) implementation for platforms not
> supporting TaggedArrays (and JDK 1.7), 2) an optimized version of 1)
> that allocates each array on demand, 3) a JNI implementation
> (minimally needed for the interpreter) that uses the native
> implementation and 4) the native implementation that is recognized by
> both the C1/C2 compilers (effort only partially completed.) In
> general, the implementation choice is transparent to the user (optimal
> choice.)
Being able to subclass it in order to add fixed field like a metaclass
field, i.e a field that
is always a reference, would be cool too.
About the API, the two method set should be setValue()/setReference().
I think that getValue()/setValue() should return the long with the bit
set because
If i want to execute x + 1, I can convert it to x + 2 at compile time
thus avoid the shifts at runtime.
>
> I've enclosed a JavaDoc and the roughed out source. For discussion.
> Fire away.
>
> Cheers,
>
> -- Jim
cheers,
Rémi
More information about the mlvm-dev
mailing list