lazy statics design notes

forax at univ-mlv.fr
Sun Mar 3 13:18:49 UTC 2019


Hi John,
For me it's not clear whether the sentinel has to be user-controlled or not for getfield (think of vulls, for example). Given that this looks like solving how to do a CAS on a value type (and its interaction with vulls), something we are still working on, I think we should refrain from trying to solve getfield on a lazy final until our work on value types is finished.

As you said, we don't have this issue for getstatic, which is why I think we should design getstatic of a lazy final without disallowing null. getfield will likely have more constraints, and that's fine.

Rémi

----- Original Message -----
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>, "Brian Goetz"
> <brian.goetz at oracle.com>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Sunday, March 3, 2019 00:15:16
> Subject: Re: lazy statics design notes

> Remi, Maurizio, Brian, I shot my last round, and I'm out.
> I agree we shouldn't tinker with the (value sets of the) types.
> 
> Instead let's reach for ways to adjoin extra sentinel values
> in the case of lazies (and optionals, and lazies of optionals),
> of both null-default and zero-default types.  These sentinel
> values will encode as disjoint from the base value set of the
> type T (whether T is null-default/ref or zero-default/prim).
> 
> Sentinels will denote the states outside of the normal "T value
> is present" state, either:  unbound-lazy or empty-optional.
> A lazy optional needs both sentinels, while a plain lazy or
> optional needs just one.
> 
> In the case where T is a reference, the JVM might add in one
> or two new references (perhaps with tag bits for extra dynamic
> checking).  This can be done outside the safe type system in
> the case of the JVM, if it puts the right decoding barriers in
> the right places, to strip the sentinels before using them in
> a T-safe manner.
> 
> In the case where T's encoding space is fully tensioned (like int)
> the sentinel will have to take the form of an extra field of
> two states.  One is "I'm the sentinel" and the other is "there's
> a T value in my other component".  This is just Optional all
> over again, which uses a sentinel (null!) today.
> 
> (If two sentinels are required, for a lazy-optional, then the extra
> field can take three states.  Or we append two extra fields.)
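
The extra-field idea can be sketched in plain Java as a tag stored next to the int payload. This is only an illustration: the class name, the tag constants, and the choice of tag 0 for the unbound state are assumptions of the sketch, not something the JVM or the proposal defines.

```java
// Hypothetical illustration: an int payload plus a three-state tag,
// as a lazy-optional of a "fully tensioned" type would need.
final class LazyOptionalInt {
    // Tag values are arbitrary choices for this sketch.
    static final byte UNBOUND = 0; // lazy value not yet computed
    static final byte EMPTY   = 1; // computed, but the optional is empty
    static final byte PRESENT = 2; // computed, payload holds a real int

    private final byte tag;
    private final int payload; // meaningful only when tag == PRESENT

    private LazyOptionalInt(byte tag, int payload) {
        this.tag = tag;
        this.payload = payload;
    }

    static LazyOptionalInt unbound() { return new LazyOptionalInt(UNBOUND, 0); }
    static LazyOptionalInt empty()   { return new LazyOptionalInt(EMPTY, 0); }
    static LazyOptionalInt of(int v) { return new LazyOptionalInt(PRESENT, v); }

    boolean isUnbound() { return tag == UNBOUND; }
    boolean isEmpty()   { return tag == EMPTY; }
    boolean isPresent() { return tag == PRESENT; }

    int get() {
        if (tag != PRESENT) {
            throw new IllegalStateException("no int value present");
        }
        return payload;
    }
}
```

Because the tag lives outside the payload, every int (including 0) stays representable as a present value.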
> 
> If we are buffering T on the heap in a stand-alone object, the
> extra state can (with some ad hoc hacking) be folded into the
> object header, because it is almost certain that the object header
> has some slack that's usable for the purpose.  Since buffered value
> objects won't need to store synchronization state (individually,
> at least), the bits which usually denote synchronization state can
> be co-opted to store a sentinel state, for a buffered T.  This usually
> won't be necessary, though, since if a T value is buffered, the
> client that is holding the reference is also capable of holding
> a real null, which more directly represents an out-of-type value
> point for T.  This is today's situation with Integer, which is
> null-default, while its payload type int is zeroable but not nullable.
> 
> If we were to load a value-like Integer onto the stack, the extra
> sentinel field would have to be manufactured like this:
>    boolean hasPayload = (p == null ? false : true);
>    int payload = (p == null ? int.default : p.value);
> This pair of values on stack would act like a value type whose
> default zero bits encode null, while an ordinary int payload value
> would be accompanied by a 'true' bit that distinguishes it from
> the null encoding.  This value type should, of course, be null-default,
> even though it carries a zero-default payload.
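
A rough Java rendering of that pair is given here, with a small holder class standing in for the two stack values. The class and method names are hypothetical, and the literal 0 stands in for `int.default`, which is not valid Java today.

```java
// Hypothetical holder for the (hasPayload, payload) pair described above.
final class NullableIntPair {
    final boolean hasPayload; // false encodes null (the all-zero state)
    final int payload;        // 0 whenever hasPayload is false

    private NullableIntPair(boolean hasPayload, int payload) {
        this.hasPayload = hasPayload;
        this.payload = payload;
    }

    // "Load" a possibly-null Integer into the flattened pair.
    static NullableIntPair load(Integer p) {
        return (p == null)
                ? new NullableIntPair(false, 0)
                : new NullableIntPair(true, p.intValue());
    }

    // Reverse direction: rebuild the nullable reference from the pair.
    Integer box() {
        return hasPayload ? Integer.valueOf(payload) : null;
    }
}
```

Note that `load(null)` and `load(0)` differ only in the flag, which is what lets the all-zero pair mean null.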
> 
> In the case where T's encoding space has some slack (like boolean)
> a sentinel or two can be created by using unencoded bit patterns.
> If T is a value type containing a reference or floating point field,
> then the option exists to "steal" the encoding from inside that field.
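
As one possible illustration of stealing an encoding from a floating point field, the sketch below keeps a double as raw bits and reserves one non-canonical NaN bit pattern as the "unbound" sentinel. The class name and the particular bit pattern are assumptions of this sketch. It relies on `Double.doubleToLongBits` collapsing every NaN to the canonical pattern 0x7ff8000000000000L, so a stored value can never collide with the sentinel. It is not thread-safe.

```java
// Hypothetical cell that hides an "unbound" state inside unused NaN space.
final class LazyDoubleCell {
    // A quiet NaN with an arbitrary non-canonical payload (sketch-specific).
    private static final long UNBOUND_BITS = 0x7ff8dead00000001L;

    private long bits = UNBOUND_BITS;

    boolean isBound() {
        return bits != UNBOUND_BITS;
    }

    void bind(double value) {
        // doubleToLongBits canonicalizes NaN, so the sentinel is never stored here.
        bits = Double.doubleToLongBits(value);
    }

    double get() {
        if (!isBound()) {
            throw new IllegalStateException("unbound");
        }
        return Double.longBitsToDouble(bits);
    }
}
```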
> 
> The all-zero-bits state is favorable in the heap because it is most
> reliably the first state of the object.  In the case of both optional
> and lazy (lazy-optional is just lazy here), the sentinel encodes
> the initial state, which encourages us to implement the sentinel
> with a default value (zero or null) for T.  This means that the normal
> corresponding default (zero or null) should actually be encoded
> with a special sentinel value.
> 
> On the stack the all-zero-bits state is less directly useful, but of
> course it's good if the stack and heap encodings can be as close
> as possible.
> 
> The getfield operation which loads a lazy instance field should do
> two things: 1. check for the encoding of the unbound state (which
> should be all-zeroes), 2. check for the encoding of the bound-to-default
> state (which should be a specially crafted sentinel).  In case 1, the
> lazy value binding code must be executed.  In case 2, the sentinel
> must be replaced by a true default value.  Something like this probably
> needs to happen anyway for null-default value types, since the
> zero-default encoding of a null-default value type needs to be
> replaced by a null pointer when it is loaded.
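
The two checks can be sketched at the library level for a reference-typed lazy, assuming null (all zeroes) encodes "unbound" and a private sentinel object encodes "bound to null". The class `LazyRef` and its members are hypothetical. A real getfield would do this inside the JVM. In this sketch the initializer may run more than once under a race, but only the first published result is ever returned.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical library-level model of a lazy field with two encodings.
final class LazyRef<T> {
    // Out-of-type marker meaning "bound, and the bound value is null".
    private static final Object NULL_SENTINEL = new Object();

    // null in the slot means "unbound" (the initial all-zero state).
    private final AtomicReference<Object> slot = new AtomicReference<>();
    private final Supplier<? extends T> initializer;

    LazyRef(Supplier<? extends T> initializer) {
        this.initializer = initializer;
    }

    @SuppressWarnings("unchecked")
    T get() {
        Object raw = slot.get();
        if (raw == null) {
            // Case 1: unbound, so run the binding code and encode its result.
            T computed = initializer.get();
            Object encoded = (computed == null) ? NULL_SENTINEL : computed;
            // First writer wins; a losing thread adopts the winner's value.
            slot.compareAndSet(null, encoded);
            raw = slot.get();
        }
        // Case 2: strip the sentinel before handing out a T.
        return (raw == NULL_SENTINEL) ? null : (T) raw;
    }
}
```

The key property is that a lazy whose initializer returns null is still bound after the first read, so the initializer is not re-run on later reads.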
> 
> It looks to me like there are at least three places where a "raw"
> value is "wrapped" to give it adjusted semantics.  First, a null-default
> value type wraps the underlying zero-default bits by swapping
> out the zero and swapping in the null.  Second, an optional wraps
> the internal value by adjoining the "empty" value point.  Third,
> a lazy wraps its non-lazy value by adjoining the "unbound" state.
> 
> Sentinels are just one way to do it; surely there are others.  But if
> you don't use sentinels in some capacity to overlay new values on
> T's value set, you probably need a side bit to convey the variable's
> state; as I've said before, managing that correctly seems to require
> transactional memory.
> 
> Condy doesn't require a sentinel.  But of course HotSpot *internally*
> uses a sentinel to distinguish a resolved null value from the unresolved
> state.  The unresolved state is a null pointer while a resolved null
> is a special out-of-type non-null reference (called "the null sentinel")
> which condy swaps out for a resolved null after it does the null check.
> That's the same trick as I've described above.  Surprise; I wrote it.
> Great minds may think alike, but mediocre minds think the same
> thing repeatedly.
> 
> — John

