TaggedArrays (Proposal)

ravenex ravenex at qq.com
Mon Jul 2 07:11:46 PDT 2012


Very cool stuff, Jim and Rickard! I guess people are going to start missing NaN-encoded tagged values/pointers now that there's something real to play with ;-)

@Remi The subclass suggestion sounds a lot like Maxine's Hybrid objects, where named fields and an untyped array are bundled into a single object. That pretty much emulates what people like to do in C/C++, and is something nice to have.

> I think that getValue()/setValue() should return the long with the bit set because
> If i want to execute x + 1, I can convert it to x + 2 at compile time thus avoid the shifts at runtime.

Even without changing the API, this kind of transformation could easily be intrinsified in the JITs, so it's not a big worry.

Cheers,
Raven

------------------ Original ------------------
From: "Rémi Fora";
Date: Mon, Jul 2, 2012 09:57 PM
To: "mlvm-dev";
Subject: Re: TaggedArrays (Proposal)

On 07/02/2012 03:05 PM, Jim Laskey wrote:
> During a week in the rarefied air of Stockholm back in May, a
> sleepless night got me thinking. The day after that, the thinking
> became a reality. I've been sitting on the code since, not sure what
> to do next. So..., why not start the month leading up to the JVMLS
> with a discussion about dynamic values.
>
> Every jLanguage developer knows that primitive boxing is the enemy.
> Even more so for untyped languages. We need a way to interleave
> primitive types with references.
>
> Tagged values (value types) for dynamic languages have been approached
> from a dozen different angles over the history of Java. However, no
> one seems to be satisfied with any of the proposals so far. Either
> the implementation is too limiting for the language developer or too
> complex to implement.
>
> Most recently, John (Rose) proposed hiding value tagging in the JVM
> via the Integer/Long/Float/Double.valueOf methods. I saw a few issues
> with this proposal.
> First, the implementation works differently on 32
> bit and 64 bit platforms (only half a solution on each). Secondly,
> control of the tag bits is hidden such that it doesn't give a language
> implementor any leeway on bit usage. Finally, it will take a long
> time for it to get introduced into the JVM. The implementation is
> complex, scattered all over the VM and will lead to a significant
> multiplier for testing coverage.

but it will also help Java perf.

>
> It occurred to me on that sleepless Monday night, that the solution
> for most dynamic languages could be so much simpler. First, we have
> to look at what it is we really need. Ultimately it's about boxing.
> We want to avoid allocating memory whenever we need to store a
> primitive value in an object. Concerning ourselves with passing
> around tagged values in registers and storing in stack frames is all
> a red herring. All that is needed is a mechanism for storing tagged
> values (or compressed values) in a no-type slot of a generic object.
> Thinking about it in these terms isolates all issues to a single
> array-like class, and thus simplifies implementation and simplifies
> testing. Instances of this class can be used as objects, as stack
> frames and even full stacks. A good percentage of a dynamic language's
> needs are covered.

Using it as a stack frame will require pretty good escape analysis if
you want the same perf as the native stack, or is there a trick somewhere?
But given that there is a trick to avoid boxing for local variables (see
my talk at the next JVM Summit), having an array like this just for storing
fields is enough to pull its weight.

>
> So, Rickard Bäckman (also of Oracle) and I defined an API and
> implemented (in HotSpot) an interface called TaggedArray.
> Conceptually, TaggedArray is a fixed array of no-type slots (64-bit),
> where each slot can contain either a reference or a tagged long value
> (least significant bit set.)
> Internally, TaggedArray class's doOop
> method knows that it should skip any 64-bit value with the least
> significant bit set. How the language developer uses the other 63
> bits is up to them. References are just addresses. On 32 bit
> machines, the address (or packed address) is stored in the high
> 32-bits (user has no access). So there is no interference with the tag
> bit.
>
> We supply four implementations of the API. 1) is a naive two parallel
> arrays (one Object[], one long[]) implementation for platforms not
> supporting TaggedArrays (and JDK 1.7), 2) an optimized version of 1)
> that allocates each array on demand, 3) a JNI implementation
> (minimally needed for the interpreter) that uses the native
> implementation and 4) the native implementation that is recognized by
> both the C1/C2 compilers (effort only partially completed.) In
> general, the implementation choice is transparent to the user (optimal
> choice.)

Being able to subclass it in order to add fixed fields like a metaclass
field, i.e. a field that is always a reference, would be cool too.

About the API, the two set methods should be setValue()/setReference().
I think that getValue()/setValue() should return the long with the bit
set, because if I want to execute x + 1, I can convert it to x + 2 at
compile time and thus avoid the shifts at runtime.

>
> I've enclosed a JavaDoc and the roughed out source. For discussion.
> Fire away.
>
> Cheers,
>
> -- Jim

cheers,
Rémi

_______________________________________________
mlvm-dev mailing list
mlvm-dev at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
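
For readers following the thread, here is a minimal Java sketch of the two ideas discussed above: the "x + 1 becomes x + 2" trick on tagged longs, and the naive two-parallel-arrays fallback (one Object[], one long[]). It is only an illustration under stated assumptions, not the API from the enclosed JavaDoc. The class names TaggedSketch and ParallelTaggedArray, the helpers tag/untag/isTagged/addConstant, and the exact method signatures are all hypothetical; the thread itself only mentions getValue(), setValue() and the suggested setReference(). The encoding assumed here is (value << 1) | 1, which matches "least significant bit set" but may differ from what a language implementor would choose for the other 63 bits.

```java
// Hypothetical sketch; none of these names come from the actual TaggedArray API.
final class TaggedSketch {

    // Assumed encoding: shift the payload left by one and set the low bit.
    static long tag(long value) {
        return (value << 1) | 1L;
    }

    // Arithmetic shift right drops the tag bit and preserves the sign.
    static long untag(long tagged) {
        return tagged >> 1;
    }

    static boolean isTagged(long slot) {
        return (slot & 1L) != 0;
    }

    // Adding a constant c to the payload equals adding 2*c to the tagged
    // form, so no untag/retag shifts are needed (x + 1 becomes x + 2).
    static long addConstant(long tagged, long c) {
        return tagged + (c << 1);
    }

    // Naive fallback in the spirit of implementation 1): two parallel arrays.
    static final class ParallelTaggedArray {
        private final Object[] references;
        private final long[] values;

        ParallelTaggedArray(int size) {
            references = new Object[size];
            values = new long[size];
        }

        void setReference(int index, Object reference) {
            references[index] = reference;
            values[index] = 0L; // low bit clear marks "not a value"
        }

        // Takes and stores the long with the tag bit already set.
        void setValue(int index, long tagged) {
            if (!isTagged(tagged)) {
                throw new IllegalArgumentException("tag bit not set");
            }
            values[index] = tagged;
            references[index] = null;
        }

        boolean isValue(int index) {
            return isTagged(values[index]);
        }

        long getValue(int index) {
            return values[index];
        }

        Object getReference(int index) {
            return references[index];
        }
    }

    public static void main(String[] args) {
        ParallelTaggedArray slots = new ParallelTaggedArray(2);
        slots.setReference(0, "metaclass");
        slots.setValue(1, tag(20));

        // Increment the stored number without untagging it first.
        slots.setValue(1, addConstant(slots.getValue(1), 1));
        System.out.println(untag(slots.getValue(1)));
    }
}
```

Because setValue()/getValue() in this sketch pass the tagged long through unchanged, a compiler for a dynamic language could fold constant arithmetic into the tagged form as described; whether the real API exposes the tag bit this way is exactly the open question in the thread.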

