Value types, encapsulation, and uninitialized values

Sun Oct 28 19:53:36 UTC 2018

There’s a matrix of possibilities as to what to do here.  

On one axis, we have user model issues — what does the user see when they have an uninitialized value?  I can imagine at least four possibilities: 

 - Zeroes.  This is simple but sharp-edged; any value could be zeroes, and both clients and value class implementors have to deal with this forced-on-you value, no matter how ill-suited it may be to the domain.  

 - User-provided.  Here, the no-arg constructor is consulted, if present (wave hands about what happens when it is not a pure function).  Values are not nullable, but every value is the result of running a constructor, so users are not surprised by seeing a value that was invisibly manufactured.  But every value class that cares still has to implement some sort of “null object pattern” for itself, which may or may not be a burden (C#’s experience suggests it’s not too bad).  

 - Nullable.  Here, after suitable opt-in, null is made to appear to be a member of the value set for the class, and this is the default value.  Dereferencing a null results in an NPE.  

 - Vullable. This is like nullable, but somehow specific to values.  
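
To make the “user-provided” row concrete, here is a rough sketch in today’s Java.  An ordinary final class stands in for a value class, since the value-type syntax is not settled; the Temperature class, its fields, and the idea that the no-arg constructor supplies the default are all illustrative assumptions, not a proposed design.  It shows why “zeroes” is ill-suited to some domains (0 K is a legitimate temperature, not “unset”) and what the hand-rolled null object pattern costs the class author.

```java
// Sketch only: an ordinary final class stands in for a value class.
final class Temperature {
    private final double kelvin;
    private final boolean known;

    // Hypothetical "user-provided" default: under this model, the value
    // produced here is what an uninitialized field or array slot would hold.
    Temperature() {
        this.kelvin = Double.NaN;
        this.known = false;
    }

    Temperature(double kelvin) {
        if (kelvin < 0) {
            throw new IllegalArgumentException("negative kelvin: " + kelvin);
        }
        this.kelvin = kelvin;
        this.known = true;
    }

    // The "null object pattern" part: the class itself answers "am I set?"
    boolean isKnown() {
        return known;
    }

    double kelvin() {
        if (!known) {
            throw new IllegalStateException("temperature not set");
        }
        return kelvin;
    }
}

final class UserProvidedDefaultDemo {
    public static void main(String[] args) {
        Temperature unset = new Temperature();
        Temperature boiling = new Temperature(373.15);
        System.out.println(unset.isKnown());
        System.out.println(boiling.kelvin());
    }
}
```

Under the nullable row, the same class would drop the known flag and let null play the “unset” role, with kelvin() on an unset value failing with an NPE instead of an exception the class chose.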

I think everyone agrees that “zeroes” is too sharp-edged to inflict on the user base.  Vullable asks the user to learn a new like-but-not-quite-the-same-as-null concept, so it is probably less desirable than nullable.  Any of them would have to be opt-in, because they all have costs we don’t want to inflict on all values.  So the two reasonable rows are “user-provided” and “nullable”.  

On the other axis, we have implementation strategies, which we’ve not discussed yet in detail.  Suffice it to say there are quite a few ways we could implement either, varying in their performance cost, safety guarantees, and intrusiveness.  At one end of this spectrum, the JVM knows nothing about this at all, and the static compiler tries to arrange the illusion.  (For example, if a class has a field of this kind of value type, and that field isn’t definitely assigned (DA) at exit from every constructor, the compiler inserts extra initialization code, and so on.)  At the other end of the spectrum, we can push awareness of nullable value types into getfield, invoke*, etc.  And in between, there are at least three or four ways to get to each of the credible user models.  
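
As a rough illustration of the compiler-managed end of that spectrum, the sketch below shows what “the compiler inserts extra initialization code” might amount to, written out by hand in today’s Java.  The Range and Holder classes are hypothetical, a final class again stands in for a value class, and the exact rewriting a real compiler would perform is an assumption; the point is only that the inserted call goes through the class’s own no-arg constructor rather than leaving the all-zero state visible.

```java
// Sketch only: a final class stands in for a value class.
final class Range {
    final int lo;
    final int hi;

    // Hypothetical user-provided default. The all-zero state (0, 0)
    // would violate the lo < hi invariant checked below.
    Range() {
        this(0, 1);
    }

    Range(int lo, int hi) {
        if (lo >= hi) {
            throw new IllegalArgumentException("need lo < hi");
        }
        this.lo = lo;
        this.hi = hi;
    }
}

// As written by the user (hypothetical source):
//
//     class Holder {
//         Range range;
//         Holder() { }                    // range is not DA at exit
//         Holder(Range r) { range = r; }  // range is DA at exit
//     }
//
// As the compiler might effectively emit it:
final class Holder {
    Range range;

    Holder() {
        this.range = new Range();   // inserted: field was not definitely assigned
    }

    Holder(Range r) {
        this.range = r;             // already assigned, nothing inserted
    }
}
```

Nothing here involves the VM, which is why this approach cannot stop a race, or code compiled by another language, from observing the field before the inserted assignment runs.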

I think what Doug is saying is: (user-provided, language-managed) is the sweet spot here; it keeps the VM model simple, and all the cost of opting into special treatment is borne by classes that want the special treatment, while other languages get to make their own choices about how to use value types. It provides no real safety guarantees (races may still allow values to be observed in their zero state, and other languages could manufacture values that bypass the checks), but it provides enough relief from the sharp edges.  

> On Oct 28, 2018, at 8:11 AM, Doug Lea <dl at cs.oswego.edu> wrote:
> 
> 
> Sorry for even slower response...
> 
> On 10/11/18 10:14 AM, Brian Goetz wrote:
>> 
>> Except, the current story falls down here, because authors must contend
>> with the special all-zero default value, because, unlike classes, we
>> cannot guarantee that instances are the result of a constructor.
> 
> Can we enumerate the cases where this is encountered? The main one is:
> 
> If a value class does not have a no-arg constructor, why not disallow
> array construction (dynamically if necessary), but also supply special
> methods such as createAndFillArray(int size, T proto) and a few others
> to still allow safe array construction when it applies.
> 
> In most cases where not-yet-present is a common case, one would think
> that people would choose to make T an Object type or use Optional<T>
> rather than using a pure value type.
> 
> -Doug
>
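
For reference, the createAndFillArray(int size, T proto) method suggested in the quoted message could be approximated in today’s Java roughly as follows.  Only the method name and parameter list come from that message; the ValueArrays holder class, the use of reflection to pick the array’s component type from the prototype, and the null check are assumptions made for illustration.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Objects;

final class ValueArrays {
    private ValueArrays() { }

    // Creates an array in which every slot already holds a constructed value,
    // so no caller can observe an invisibly manufactured default.
    @SuppressWarnings("unchecked")
    static <T> T[] createAndFillArray(int size, T proto) {
        Objects.requireNonNull(proto, "proto");
        // The component type is taken from the prototype's runtime class.
        T[] array = (T[]) Array.newInstance(proto.getClass(), size);
        Arrays.fill(array, proto);
        return array;
    }
}

final class CreateAndFillArrayDemo {
    public static void main(String[] args) {
        String[] names = ValueArrays.createAndFillArray(3, "unknown");
        System.out.println(Arrays.toString(names));
    }
}
```

Sharing one prototype across all slots is harmless for values, which have no identity; with ordinary mutable objects the slots would alias each other.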