Thoughts on peeling and readability
Brian Goetz
brian.goetz at oracle.com
Sun Dec 13 19:08:10 UTC 2015
Primitives like int and long are "special" in that all bit patterns are
valid and there are no integrity constraints that would prevent a client
from requesting a specific bit pattern. But this is the special case,
not the general case. A linguistic construct like T.default would have
to work for *all* T, not just the special cases.
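
To make the "integrity constraints" point concrete, here is a minimal
sketch in today's Java. It uses an ordinary final class as a stand-in
for a value type; the name Percent and its range check are hypothetical,
invented only for illustration. Every int bit pattern is a valid int,
but only 101 of them are valid Percent values, so a generic "build a T
from these bits" operation would have to bypass the constructor.

```java
// Hypothetical example: an ordinary final class standing in for a value type.
final class Percent {
    private final int value; // invariant: 0 <= value <= 100

    Percent(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("not a percentage: " + value);
        }
        this.value = value;
    }

    int value() {
        return value;
    }

    // Demonstration helper: reports whether the constructor accepts the given bits.
    static boolean accepts(int bits) {
        try {
            new Percent(bits);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Here Percent.accepts(12345) is false, which is the kind of request a
bit-oriented T.default(12345) could not honor for every T.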
There's nothing to stop you from writing an implementation that takes
advantage of knowledge of specific types like int; there's a range of
options there. What we're not interested in doing is allowing clients to
have unsafe bit-level access to the representation of all value types.
There's also nothing to stop you from writing value types that allow
raw-bit operations; we're just not going to require that every value
type support that (which is what asking for a bit-oriented T.default
would amount to).
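
As a sketch of what "taking advantage of knowledge of specific types"
can look like, the following uses the raw-bit conversions that float
already exposes (Float.floatToRawIntBits and Float.intBitsToFloat). The
class and method names are hypothetical. The trick depends on the IEEE
754 layout of float and is deliberately restricted to positive finite
inputs; it is a type-specific implementation, not a generic operation.

```java
// Sketch: a type-specific "next value" built on raw-bit access that float already exposes.
final class RawBitsDemo {
    // For a positive, finite float, adding 1 to the raw bits yields the next larger float.
    // This relies on the IEEE 754 layout of float; it does not generalize to arbitrary types.
    static float nextUpPositive(float value) {
        // The negated comparison also rejects NaN, zero and negative inputs.
        if (!(value > 0.0f) || Float.isInfinite(value)) {
            throw new IllegalArgumentException("only positive finite floats are handled here");
        }
        int rawBits = Float.floatToRawIntBits(value);
        return Float.intBitsToFloat(rawBits + 1);
    }
}
```

For inputs in that range the result agrees with Math.nextUp, for
example RawBitsDemo.nextUpPositive(1.0f) == Math.nextUp(1.0f).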
T.default will almost certainly not go through a constructor; the VM
will zero out the bits as it does with the existing built-in types (the
eight primitive types plus references). This process is pretty obvious
from both a specification and implementation perspective, but it does
create some responsibility for writers of value types -- specifically
the need to deal with the default bit pattern.
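
The zeroing behavior for the existing built-in types can be observed in
today's Java, and the "responsibility for writers of value types" can be
sketched with an ordinary class. The names DefaultsDemo and Rational are
hypothetical. Since current Java has no T.default, the sketch can only
show which value the all-zero field state would represent; it assumes a
design where the denominator is stored minus one so that zeroed fields
read as 0/1 rather than the invalid 0/0.

```java
// Sketch: zero-initialization as Java already does it for built-in types.
final class DefaultsDemo {
    // Array elements are not explicitly initialized, so they hold the default value.
    static int defaultInt() { return (new int[1])[0]; }
    static double defaultDouble() { return (new double[1])[0]; }
    static boolean defaultBoolean() { return (new boolean[1])[0]; }
    static Object defaultReference() { return (new Object[1])[0]; }
}

// Hypothetical value-like class designed so that its all-zero state is meaningful:
// the denominator is stored minus one, so zeroed fields represent 0/1.
final class Rational {
    private final long numerator;
    private final long denominatorMinusOne;

    Rational(long numerator, long denominator) {
        if (denominator <= 0) {
            throw new IllegalArgumentException("denominator must be positive");
        }
        this.numerator = numerator;
        this.denominatorMinusOne = denominator - 1;
    }

    long numerator() { return numerator; }

    long denominator() { return denominatorMinusOne + 1; }

    // True when every field holds its zero value, i.e. the state zeroing would produce.
    boolean isAllZeroBits() {
        return numerator == 0 && denominatorMinusOne == 0;
    }
}
```

With this layout, new Rational(0, 1) has exactly the field state that
zeroing would produce, so the default bit pattern never violates the
"denominator is positive" invariant.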
As our friends in the .NET community have discovered, trying to enforce
that the no-arg ctor is always executed before a value is exposed is a
game of whack-a-mole, so having T.default go through a constructor
simply reduces the probability of the "implicit null" surprise, but
doesn't banish it. One can be as snarky as one likes about the
tradeoffs ("reinvented null" vs "reinvented serialization"), but the
reality here is that there are risks lurking around both corners.
Personally I like the tradeoff we're converging towards, but we are
aware it is not perfect.
Stepping back, rather than arguing the merits or demerits of a
particular solution (which at this point we've more than exhausted),
it's far more helpful to talk about the problem instead of the
solution. So, let me ask -- what use cases are you concerned about,
other than the manufacture of multiple sentinels for use inside data
structure implementations?
(I suspect in the end you will find that you will be able to accomplish
what you need with the tools available, but the "let me specify the bit
pattern for an arbitrary value type" approach is not the way to get there.)
On 12/13/2015 1:32 PM, Timo Kinnunen wrote:
>
> Well, it’s ints and longs, and all primitive types copyable around as
> ints and longs, and all objects serializable to and from arrays of
> ints and longs, and all arrays of such, and all values made of such,
> and all arrays of such values, and all values made of such values,
> aaand I’m probably missing a dimension or two somewhere.
>
> These values are just as valid regardless of which bit patterns,
> all-zero or not, were used to construct them. They are safe to be
> copied around and, if you implemented them yourself, hashCode, equals,
> toString and any component-wise operations could also be done safely.
> Such operations simply can’t call any foreign code of any of the value
> or reference types involved. We don’t expect that we can take an
> arbitrary Object, use reflection to zero out its fields and then be
> able to call its instance methods like nothing had happened either.
> So, for reference types these operations would have to be done using
> reflection; for value types VarHandles might give better performance.
>
> Ultimately I guess it all depends on whether T.default invokes some
> constructor or not. If it doesn’t or if the constructor is specified
> to always succeed trivially then we have reinvented null. Our new
> nulls trade off a large number of NPEs for silent invariant
> violations. This could be a good tradeoff but without knowing about
> the consequences of the violations beforehand it’s gonna be hard to
> say for certain. The good news is that hey, null is back!
>
> If T.default executes a constructor that can refuse an all-zero bit
> pattern, then we have reinvented serialization for value types and are
> requiring all value types support it. Our serialization protocol only
> recognizes one input value and can only deserialize, so it’s a bit
> useless. But with the addition of the missing serialize-function we
> can then define transforms for long[] <-> val <-> long[] and will have
> a quite general and capable system already.
>
> Or are we gonna just include the worst parts from both?
>
> --
> Have a nice day,
> Timo
>
> Sent from Mail for Windows 10
>
>
> *From: *Brian Goetz
> *Sent: *Sunday, December 13, 2015 01:20
> *To: *Timo Kinnunen
> *Cc: *Maurizio Cimadamore;Paul Benedict;valhalla-dev at openjdk.java.net
> *Subject: *Re: Thoughts on peeling and readability
>
> No. In general, to/from raw bit operations are not safe except in a
> few corner cases (like int and long.) Values are not uncontrolled
> buckets of bits.
>
> On the other hand, T.default *is* safe, because every type has a
> default bit pattern which is the initialization of any otherwise
> uninitialized field or array element. (It so happens that this
> default bit pattern corresponds to all zero bits for all types, though
> this is mostly a convenience for VM implementors.) For a composite
> value, the default value is comprised of the default value for all
> fields. By *definition*, the all-zero bit pattern is a valid element
> of all value types. However, there is no guarantee that any other bit
> pattern is valid for any given value type.
>
> If a particular value type wants to expose to/from raw bit
> constructors, that’s fine — but you’re asking for a language feature
> that applies to *all* values — and there is no guarantee that this is
> a safe operation for all values.
>
> On Dec 12, 2015, at 5:44 PM, Timo Kinnunen <timo.kinnunen at gmail.com> wrote:
>
>
>
> Field layout and bit fiddling isn’t exactly what I was thinking.
> Rather I was thinking something like Float.floatToRawIntBits() and
> Double.doubleToRawLongBits(), but without having to know about the
> types Float and Double or how many bits are in their raw bits. So
> something like this syntax:
>
> static <any T> T nextUp(T value) {
>     <?missing type?> rawBits = T.toRawBits(value);
>     T nextValue = T.fromRawBits(rawBits + 1);
>     return nextValue;
> }
>
> This should fit in Valhalla reasonably well, as it is just a
> generalization of T.default with its complement operation
> included. And as it is, all of the problems you listed already
> apply to T.default. For example, a value type with one long field:
> If the long value in the field is a handle pointing to a
> memory-mapped buffer then any use of a default value of such a
> type could cause a crash. Which can include asking a properly
> constructed value if it is equal to any of the values in an array
> you have.
>
>
>
>
> --
> Have a nice day,
> Timo
>
> Sent from Mail for Windows 10
>
>
> *From:*Brian Goetz
> *Sent:*Saturday, December 12, 2015 18:21
> *To:*Timo Kinnunen;Maurizio Cimadamore;Paul Benedict;valhalla-dev at openjdk.java.net
> *Subject:*Re: Thoughts on peeling and readability
>
> Precise layout and bit control of values are anti-goals of Valhalla, so
> we're not really exploring this direction at this time.
>
> The problem with approaches like the one you suggest is they fall apart
> as soon as you leave the realm of "primitives modeled as values." What
> about values that have refs in them? What about values whose
> representations are private? Their implementation is supposed to be in
> sole control of their representation. This runs contrary to the "codes
> like a class" dictum.
>
> On 12/12/2015 4:43 AM, Timo Kinnunen wrote:
>
> > Hi,
> >
> > One thing that I don’t remember seeing is any syntax for
> > constructing arbitrary values in generic code without having to
> > know about the precise field layouts and what the meaning of such
> > fields is. Something like T.default but for values other than 0.
> > Perhaps T.default(12345) or some such?
> >
> > Or maybe this is slated to go with bytecode type specialization…
> > What sort of syntax is envisioned to be driving that anyways?