TaggedArrays (Proposal)
Mark Roos
mroos at roos.com
Mon Jul 2 10:23:30 PDT 2012
From Jim:
It occurred to me on that sleepless Monday night, that the solution for
most dynamic languages could be so much simpler. First, we have to look
at what it is we really need. Ultimately it's about boxing. We want to
avoid allocating memory whenever we need to store a primitive value in an
object. Concerning ourselves with passing around tagged values in
registers and storing in stack frames is all red herring. All that is
needed is a mechanism for storing tagged values (or compressed values) in
a no-type slot of a generic object. Thinking about it in these terms
isolates all issues to a single array-like class, and thus simplifies
implementation and simplifies testing. Instances of this class can be
used as objects, as stack frames and even full stacks. A good percentage
of a dynamic language needs are covered.
I have just spent the last year implementing Smalltalk on the JVM, so the
issue of boxing (particularly for integers) is of interest to me. While I
agree with your statement that allocating memory is the big issue, I don't
really understand your comment about 'when we store a primitive'. My most
visible issue is in a 'for loop'-like construct where I generate an index
and use it to access a byte array. In this case I need to create both the
index and the byte accessed.
For instance, scanning a million-byte array requires that I create a
million index objects and then another million (assuming no cache), one
for each byte accessed. These are all then discarded. It's not clear to me
how your proposal would help here.
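
A minimal Java sketch of the cost described above. The names (BoxedScan,
IntBox, box) are hypothetical stand-ins for a runtime's boxed integer and
are not from any actual implementation. The counter records how many boxes
the loop requests; a JIT may remove some of the real allocations.

```java
// Hypothetical illustration; class and method names are assumptions.
final class BoxedScan {
    /** Stand-in for a runtime's boxed integer object. */
    static final class IntBox {
        final int value;
        IntBox(int value) { this.value = value; }
    }

    /** Number of boxes requested so far. */
    static long allocations = 0;

    static IntBox box(int v) {
        allocations++;
        return new IntBox(v);
    }

    /** Sums the array while boxing both the index and each byte read. */
    static long sum(byte[] bytes) {
        long total = 0;
        for (int i = 0; i < bytes.length; i++) {
            IntBox index = box(i);                    // one box per index
            IntBox element = box(bytes[index.value]); // one box per byte read
            total += element.value;                   // both boxes are now garbage
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[1_000_000];
        java.util.Arrays.fill(bytes, (byte) 1);
        allocations = 0;
        long total = sum(bytes);
        // Two boxes per element: prints "1000000 2000000"
        System.out.println(total + " " + allocations);
    }
}
```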
As I have thought about the boxing issue (the Smalltalk I am using as a
reference has tagged integers), I keep thinking that any JVM solution is
probably going to have some 'Java' or other target-driven characteristics
that make it hard to use. In my case I have a Java class that all of my
objects are instances of.
If there were a 'tagged' object type, then it would have to be able to
substitute as one of my instances, hold or reference the same information
(like method lookups, class, shape, etc.), exist on the stack or in a temp,
be testable in a GWT (guardWithTest) ....
Here I agree with you that making this work in the JVM is probably too
difficult.
So my thoughts are back to how to have a boxed primitive with no
allocation overhead unless it is saved or accessed outside of a thread; in
other words, how to reuse the box. I can do this for some situations where
I can prove the scope, but I have yet to figure out a general solution.
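
A rough sketch of reusing a box where the scope is known, assuming the
boxes never leave the loop. MutableInt, newBox and copy are hypothetical
names, and the scope check that would decide when reuse is safe is not
shown. A value that must be saved is copied first, which is the only point
where a new box is needed.

```java
// Hypothetical illustration; class and method names are assumptions.
final class ReusableBox {
    /** Mutable box that can be overwritten instead of reallocated. */
    static final class MutableInt {
        int value;

        /** Snapshot for a value that must outlive the current scope. */
        MutableInt copy() {
            MutableInt c = newBox();
            c.value = value;
            return c;
        }
    }

    /** Number of boxes requested so far. */
    static long allocations = 0;

    static MutableInt newBox() {
        allocations++;
        return new MutableInt();
    }

    /** Sums the array using two boxes for the whole scan. */
    static long sum(byte[] bytes) {
        MutableInt index = newBox();   // reused for every index
        MutableInt element = newBox(); // reused for every byte read
        long total = 0;
        for (index.value = 0; index.value < bytes.length; index.value++) {
            element.value = bytes[index.value];
            total += element.value;
        }
        return total;
    }
}
```

This only works while nothing else holds on to index or element, which is
the hard part in general.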
So while I can see a use for mixing references and primitives in an array,
it has not shown up in my work as a major issue. Perhaps this is due to my
not keeping parallel stacks?
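
For comparison, one way to emulate an array that mixes references and
primitives in plain Java is with two parallel arrays, sketched below. This
is only an emulation under assumed names (TaggedSlots, setValue,
getReference and so on are hypothetical); as I read the quoted proposal,
the JVM would instead keep the tagged value in a single slot.

```java
// Hypothetical emulation; not the API of the TaggedArrays proposal.
final class TaggedSlots {
    /** Sentinel marking a slot that currently holds a primitive. */
    private static final Object PRIMITIVE = new Object();

    private final Object[] refs; // reference stored in each slot, or PRIMITIVE
    private final long[] prims;  // primitive stored in each slot, if any

    TaggedSlots(int size) {
        refs = new Object[size];
        prims = new long[size];
    }

    /** Stores a primitive without allocating a box. */
    void setValue(int i, long v) {
        refs[i] = PRIMITIVE;
        prims[i] = v;
    }

    void setReference(int i, Object o) {
        refs[i] = o;
    }

    boolean isValue(int i) {
        return refs[i] == PRIMITIVE;
    }

    long getValue(int i) {
        if (!isValue(i)) throw new IllegalStateException("slot " + i + " holds a reference");
        return prims[i];
    }

    Object getReference(int i) {
        if (isValue(i)) throw new IllegalStateException("slot " + i + " holds a primitive");
        return refs[i];
    }
}
```

The parallel arrays double the storage per slot, which is the overhead a
single tagged slot would avoid.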
In any case, I hope to hear more on this at the summit.
regards
mark