TaggedArrays (Proposal)
Krystal Mok
rednaxelafx at gmail.com
Mon Jul 2 22:58:01 PDT 2012
@Howard
Your suggestion could pretty much work if the underlying VM is using a
conservative collector, where it'd actually include a set of filters to
check if a value is a (or "looks like a") real reference.
There are also a couple of runtimes that implements exact GC by tagging
values, but most JVMs that I know of doesn't work that way. They'd expect
all reference type fields to contain valid references.
- Kris
On Tue, Jul 3, 2012 at 1:46 PM, Howard Lovatt <howard.lovatt at gmail.com>wrote:
> @Kris, I was assuming that the tag would be sufficient for the JVM since
> 'real' references would be aligned and hence naturally not tagged. But I
> don't know enough about the JVM and hence you could well be correct. --
> Howard.
>
>
> On 3 July 2012 15:40, Krystal Mok <rednaxelafx at gmail.com> wrote:
>
>> On Tue, Jul 3, 2012 at 1:28 PM, Howard Lovatt <howard.lovatt at gmail.com>wrote:
>>
>>> I like the idea and something along these lines would be a great
>>> addition to the standard library, which I will come back to as a PS.
>>>
>>> In com.sun.misc.Unsafe there are already getLong(Object, int) and
>>> setLong(Object, int, long) methods and the same for Object. No doubt if we
>>> used getLong and setLong as they stand they would fail due to type checking
>>> if used to access references to Objects in an Object[]. However if similar
>>> methods were added that circumvented the type checking,
>>> getLongOrReference(Object, int) and setLongOrReference(Object, int, long),
>>> then you could write:
>>>
>>> private Object[] vsAndRs;
>>>
>>> ...
>>>
>>> @Override public long getValue(final int index) {
>>> checkIndex(pos);
>>> final long value = Unsafe.getLongOrReference(vsAndRs, index);
>>> if ( (value & TaggedArrays.TAG) != 0 ) {
>>> return value;
>>> }
>>> throw new TaggedArrayException("value at index =" + index + " is not
>>> tagged");
>>> }
>>>
>>> @Override public Object getReference(final int index) {
>>> if ( isReference(index) ) {
>>> return vsAndRs[index];
>>> }
>>> throw new TaggedArrayException("reference at index =" + index +" is
>>> tagged");
>>> }
>>>
>>> ..
>>>
>>> Which would save half the space and still be understandable Java (i.e.
>>> not a collection of native calls).
>>>
>>> It's not just about type checking at Java level, but also something deep
>> within the VM: the need to distinguish between a plain value and a
>> reference (pointer), which is fundamental to a exact/accurate garbage
>> collector.
>> Jim's idea includes modifying the doOop() function in the VM, which is
>> exactly where HotSpot handles reference fields within an object.
>>
>> If you put random values into a place where HotSpot (or other JVM impls
>> using exact GC) think it's a reference, then something bad may happen later
>> (e.g. during GC), that the VM think it had found a corrupted reference.
>>
>> - Kris
>>
>>
>>> Great work,
>>>
>>> -- Howard.
>>>
>>> PS A few random thoughts for the interface, assuming that something like
>>> this makes it into the JDK then the following additions might be nice:
>>>
>>> 1. Fills that accept a lambda that receives the index as its argument
>>> (filling function as opposed to filling constant).
>>> 2. A second interface and second factory that define tagged arrays for
>>> long and double in addition to Objects, like the example code given.
>>> 3. TaggedArray32, primarily for phones etc., with the extended
>>> interface, point 2 above, for TaggedArray32 accepting Objects, ints, and
>>> floats. On a 64 bit system TaggedArray32 would be the same as TaggedArray,
>>> but for 32 bit systems it would save space.
>>>
>>>
>>> On 2 July 2012 23:05, Jim Laskey <jlaskey at me.com> wrote:
>>>
>>>> During a week in the rarefied air of Stockholm back in May, a sleepless
>>>> night got me thinking. The day after that, the thinking became a reality.
>>>> I've been sitting on the code since, not sure what to do next. So..., why
>>>> not start the month leading up to the JVMLS with a discussion about dynamic
>>>> values.
>>>>
>>>> Every jLanguage developer knows that primitive boxing is the enemy.
>>>> Even more so for untyped languages. We need a way to interleave primitive
>>>> types with references.
>>>>
>>>> Tagged values (value types) for dynamic languages have been approached
>>>> from a dozen different angles over the history of Java. However, no one
>>>> seems to be satisfied with any of the proposals so far. Either the
>>>> implementation is too limiting for the language developer or too complex to
>>>> implement.
>>>>
>>>> Most recently, John (Rose) proposed hiding value tagging in the JVM via
>>>> the Integer/Long/Float/Double.valueof methods. I saw a few issues with
>>>> this proposal. First, the implementation works differently on 32 bit and
>>>> 64 bit platforms (only half a solution on each). Secondly, control of the
>>>> tag bits is hidden such that it doesn't give a language implementor any
>>>> leeway on bit usage. Finally, it will take a long time for it to get
>>>> introduced into the JVM. The implementation is complex, scattered all over
>>>> the VM and will lead to a significant multiplier for testing coverage.
>>>>
>>>> It occurred to me on that sleepless Monday night, that the solution for
>>>> most dynamic languages could be so much simpler. First, we have to look at
>>>> what it is we really need. Ultimately it's about boxing. We want to avoid
>>>> allocating memory whenever we need to store a primitive value in an object.
>>>> Concerning ourselves with passing around tagged values in registers and
>>>> storing in stack frames is all red herring. All that is needed is a
>>>> mechanism for storing tagged values (or compressed values) in a no-type
>>>> slot of a generic object. Thinking about it in these terms isolates all
>>>> issues to a single array-like class, and thus simplifies implementation and
>>>> simplifies testing. Instances of this class can be used as objects, as
>>>> stack frames and even full stacks. A good percentage of a dynamic language
>>>> needs are covered.
>>>>
>>>> So, Rickard Bäckman (also of Oracle) and I defined an API and
>>>> implemented (in HotSpot) an interface called TaggedArray. Conceptional,
>>>> TaggedArray is a fixed array of no-type slots (64-bit), where each slot can
>>>> contain either a reference or a tagged long value (least significant bit
>>>> set.) Internally, TaggedArray class's doOop method knows that it
>>>> should skip any 64-bit value with the least significant bit set. How
>>>> the language developer uses the other 63 bits is up to them. References
>>>> are just addresses. On 32 bit machines, the address (or packed address) is
>>>> stored in the high 32-bits (user has no access) So there is no
>>>> interference with the tag bit.
>>>>
>>>> We supply four implementations of the API. 1) is a naive two parallel
>>>> arrays (one Object[], one long[]) implementation for platforms not
>>>> supporting TaggedArrays (and JDK 1.7), 2) an optimized version of 1) that
>>>> allocates each array on demand, 3) a JNI implementation (minimally needed
>>>> for the interpreter) that uses the native implementation and 4) the native
>>>> implementation that is recognized by both the C1/C2 compilers (effort only
>>>> partially completed.) In general, the implementation choice is transparent
>>>> to the user (optimal choice.)
>>>>
>>>> I've enclosed a JavaDoc and the roughed out source. For discussion.
>>>> Fire away.
>>>>
>>>> Cheers,
>>>>
>>>> -- Jim
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> mlvm-dev mailing list
>>>> mlvm-dev at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>>>
>>>>
>>>
>>>
>>> --
>>> -- Howard.
>>>
>>>
>>> _______________________________________________
>>> mlvm-dev mailing list
>>> mlvm-dev at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>>
>>>
>>
>> _______________________________________________
>> mlvm-dev mailing list
>> mlvm-dev at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>
>>
>
>
> --
> -- Howard.
>
>
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/mlvm-dev/attachments/20120703/3e17a428/attachment-0001.html
More information about the mlvm-dev
mailing list