A special, built-in value type: a 64-bit "fixnum"
vladimir.x.ivanov at oracle.com
Thu Apr 23 16:05:16 UTC 2015
Here are some relevant discussions on mlvm-dev about Rickard's
heterogeneous array work:
On 4/23/15 6:55 PM, Ron Pressler wrote:
> John, having read your document -- which pretty much describes my proposal
> exactly -- I'd like you to explain your reservations. As value types are
> being added anyway as part of Valhalla, why not tack this on? Also, can you
> please describe Rickard Backman's heterogeneous array work in more detail?
> On Wed, Apr 22, 2015 at 10:45 PM, John Rose <john.r.rose at oracle.com> wrote:
>> The point is to be able to overlay primitives with references,
>> so that their storage is compact, and so they can be arranged in arrays.
>> I worked through this line of thought and got here after some refinements:
>> 63 bits for a reference is more than anybody needs for a long time.
>> OTOH if you only ask for about 50 bits for references (still generous),
>> you can have all possible double values and nearly all long values.
>> The simplicity of the null check, and any new is-ref check, critically
>> affects GC performance. Also, the HotSpot GC assumes strongly that managed
>> pointers are never encoded or obscured (except, uniformly, by scaling
>> when they are compressed).
>> These factors push us towards an "address-native" storage format. The
>> details of the format depend sensitively on pointer compression mode, endian-ness,
>> and whether object addresses can be negative (sign bits set). For that
>> reason, any API for such a tagged value must hide the position and coding of the
>> tag bits.
>> ("Address-native" means that if the variable is in the is-ref state the
>> contents are indistinguishable from a regular managed reference. This
>> means that loading a long or double requires some sort of rotation in value
>> space.)
>> The union check done by the GC becomes a range check (or high-bit test)
>> instead of a single-bit test. This is preferable to a bit test because it
>> can (on some machines) be merged into the null check which the GC already
>> does.
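
To make the range-check idea concrete, here is a minimal Java sketch that simulates such a word in a plain long. It is an assumption for illustration only, not the layout of tagu.patch or of HotSpot: the class TaggedWord, the 50-bit limit REF_LIMIT and the offset encoding are all hypothetical. Words below 2^50 (unsigned) count as references, and doubles are shifted above that range by adding the limit.

```java
// Hypothetical simulation of an "address-native" tagged word in a long.
// Nothing here is HotSpot's actual format.
final class TaggedWord {
    // Assumption: raw words in [0, 2^50) are references, 0 being null.
    static final long REF_LIMIT = 1L << 50;

    // The is-ref test is an unsigned range check, not a single-bit test.
    static boolean isRef(long word) {
        return Long.compareUnsigned(word, REF_LIMIT) < 0;
    }

    // doubleToLongBits collapses every NaN to 0x7ff8000000000000L, so the
    // largest unsigned input is -Infinity (0xfff0...) and the addition
    // below cannot wrap around into the reference range.
    static long encodeDouble(double d) {
        return Double.doubleToLongBits(d) + REF_LIMIT;
    }

    static double decodeDouble(long word) {
        if (isRef(word)) {
            throw new IllegalArgumentException("word holds a reference");
        }
        return Double.longBitsToDouble(word - REF_LIMIT);
    }
}
```

Callers only ever see isRef, encodeDouble and decodeDouble, so the position and coding of the tag could change without touching them. NaN payloads are not preserved in this sketch.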
>> All that said, any such change is going to be really costly if it introduces a new
>> signature type. The next break we make for signatures must have a bigger
>> payoff—either parametric polymorphism or full value types. This is why
>> I (personally) stopped working on "tagu.patch".
>> But, to end on a more hopeful note, Rickard Backman has prototyped
>> something like this in a clever way that avoids committing us to a new
>> value type or signature: He has created an ad hoc array object that
>> can hold the sorts of two-way ref/prim unioned things you want.
>> I suppose you could build heterogeneous sequences on top of this.
>> One final "but": You can build compact heterogeneous sequences today,
>> with a little care. A bundle of three arrays would do nicely: N bytes for
>> tags, P longs for the primitive bits, and R objects for the refs (where
>> N = P + R).
>> On some JVMs, that could be more compact than an array of unions,
>> when there are mostly (32-bit) refs. Three array headers is more than
>> one, yes, but that only matters if you have very short sequences.
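
A rough Java sketch of the three-array bundle described above, under stated assumptions: the class name HeteroSeq, its methods and the initial capacities are hypothetical, and this is not the JSR 292 implementation. One byte of tag per element selects between the long array and the Object array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical N/P/R bundle: N tag bytes, P longs, R refs, with N = P + R.
final class HeteroSeq {
    private static final byte PRIM = 0, REF = 1;

    private byte[] tags = new byte[8];
    private long[] prims = new long[8];
    private Object[] refs = new Object[8];
    private int n, p, r; // elements used in tags, prims, refs

    void addLong(long v) {
        growTags();
        if (p == prims.length) prims = Arrays.copyOf(prims, 2 * p);
        tags[n++] = PRIM;
        prims[p++] = v;
    }

    void addRef(Object o) {
        growTags();
        if (r == refs.length) refs = Arrays.copyOf(refs, 2 * r);
        tags[n++] = REF;
        refs[r++] = o;
    }

    private void growTags() {
        if (n == tags.length) tags = Arrays.copyOf(tags, 2 * n);
    }

    int size() {
        return n;
    }

    // Sequential walk: the slot of element i depends on the tags before it.
    // Primitives are boxed only on the way out.
    List<Object> toList() {
        List<Object> out = new ArrayList<>(n);
        int pi = 0, ri = 0;
        for (int i = 0; i < n; i++) {
            out.add(tags[i] == PRIM ? Long.valueOf(prims[pi++]) : refs[ri++]);
        }
        return out;
    }
}
```

Random access by index would need a prefix count over the tags (or a side index), which is one reason this layout suits sequences that are mostly walked front to back.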
>> In the JSR 292 implementation we use old-fashioned Object varargs
>> arrays of boxed numbers, when necessary. I periodically reconsider
>> using N/P/R bundles, but it hasn't seemed worth it yet. Perhaps
>> your use case makes Object arrays impractical?
>> — John
>> On Apr 22, 2015, at 4:11 AM, Ron Pressler <ron at paralleluniverse.co> wrote:
>>> I'd like to propose that the Valhalla project include a single special,
>>> built-in value type: a 64-bit "fixnum". The value has a single bit
>>> discriminating between a reference or a 63-bit long. It will, of course,
>>> be treated correctly by the GC.
>>> For completeness, a couple of static helper functions may be introduced.
>>> One that takes a long and, preserving the sign, truncates it to 63 bits,
>>> throwing an exception in the case of an overflow, and the other taking a
>>> double and truncating down to 63 bits, truncating precision by one bit
>>> (and another for the reverse 63-bit double -> double operation).
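
The helpers described above might look roughly like the following Java sketch. All names (Fixnum63, toFixnum, tagLong, doubleTo63 and so on) are hypothetical, and placing the discriminating bit in the low bit of the word is an assumption; the proposal does not say where the bit lives.

```java
// Hypothetical helpers for a 63-bit payload; not a proposed API.
final class Fixnum63 {
    static final long MIN = -(1L << 62);
    static final long MAX = (1L << 62) - 1;

    // Range-checks a long against 63 signed bits, throwing on overflow.
    static long toFixnum(long v) {
        if (v < MIN || v > MAX) {
            throw new ArithmeticException("does not fit in 63 bits: " + v);
        }
        return v;
    }

    // Assumption: low bit 1 marks a primitive, low bit 0 a reference.
    static long tagLong(long v) {
        return (toFixnum(v) << 1) | 1L;
    }

    // The arithmetic shift restores the sign of the original value.
    static long untagLong(long word) {
        return word >> 1;
    }

    // Drops the lowest mantissa bit, leaving a 63-bit payload.
    static long doubleTo63(double d) {
        return Double.doubleToRawLongBits(d) >>> 1;
    }

    // Reverse operation; the dropped bit comes back as zero.
    static double doubleFrom63(long payload) {
        return Double.longBitsToDouble(payload << 1);
    }
}
```

Dropping the low mantissa bit costs at most one ulp for ordinary values. One edge case worth noting: a NaN whose only payload bit is the lowest one would come back as an infinity.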
>>> I believe this will be immensely useful for some applications that
>>> currently require two separate arrays to store a value of either a
>>> primitive or a reference, yet would require minimal work for GC support.
>>> Of course, this proposal can be extended to directly support any 63-bit (or
>>> smaller) value type, but even in its minimal form it is extremely useful.
More information about the valhalla-dev mailing list