it's a value! it's a reference!, was Substitutability, was Re: Finding the spirit of L-World

Tue Mar 5 14:53:15 UTC 2019

To add some fuel to the fire, JLS and JVMS are not the
only specifications of the Java platform. Looking at the JNI spec, the
term “Value Type” is already defined:

https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/types.html#the_value_type

Fred

> On Feb 23, 2019, at 19:53, John Rose <john.r.rose at oracle.com> wrote:
> 
> In the same vein, here's another challenging text from the JLS:
> 
>> 4.3.1. Objects
> 
>> An object is a class instance or an array.
> 
> OK, nice start, but then it gets harder.
> 
>> The reference values (often just references) are pointers to
>> these objects, and a special null reference, which refers to no object.
> 
> Back to "the reference values" mentioned earlier.  Those include
> references to value* objects, right?  (And, oh dear, what's a pointer?
> Close your eyes, my friends!)  Perhaps it should say:
> 
> "The reference values (often just references) [are pointers]{.del}
> [refer]{.ins} to these objects, and [also include]{.ins} a special
> null reference, which refers to no object."
> 
> Maybe the JLS could also fly the value* vs. reference* flag more
> clearly at this point:
> 
> "A class instance can be either a value* class instance or a
> reference* class instance.  These can be referred to briefly
> as value* instances, or reference* instances, or also value*
> objects, or reference* objects, as the case may be.  The
> bare terms value* or reference* are unsuitably ambiguous
> to use as terms for objects."
> 
>> A class instance is explicitly created by a class instance creation expression (§15.9).
> 
> OK, that's fine for value* objects also.  Most of the operator
> definitions are fine for value* objects.
> 
>> ...
>> The operators on references to objects are:
>> …
>> 	• The reference equality operators == and != (§15.21.3)
> 
> This one is scary, but read on.  We can fix 15.21.3 later.
> 
>> ...
>> 
>> There may be many references to the same object.
> 
> Here we see the value-ness of references peek out.
> 
> We also encounter the undefined term "same" and are
> taught that many variables can refer to the same object.
> In fact that's true of primitives too:  There can be many
> variables which refer to the same primitive value, where
> same-ness is based solely on content, not on the history
> of the value's storage in the heap.  Let's try it on value*
> instances:
> 
> "There may be many references to the same value* object."
> 
> When are two instances of the same value* class the same?
> There's only one answer, I think:  When the two instances
> are substitutable for each other.  Maybe we should just adopt
> the word "same*" for "substitutably equal", as in "System.isSame",
> "System.samenessHashCode", etc.  So I'll add a star to "same*".
> 
> This points out a key (*the* key?) difference between
> reference* objects and value* objects:  Two references can
> refer to the same* reference* object only if they are derived
> from a common definition (some "new C") executed in
> the past and connected by dataflow to both references.
> 
> Two references can refer to the same* value* object even if
> that value* object was created twice, from unrelated
> construction operations (vdefault, withfield) on field
> inputs that also happen to be the same* (pairwise).
> 
> Maybe "value*" should be spelled "unfettered by data
> flow from historical instance construction"?  Quick,
> release the Thesaurus!
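
The distinction John draws above can be sketched in today's Java. This is only an illustration under stated assumptions: Account and Point are hypothetical classes, and Point's equals() merely stands in for the same* test, since current Java has no value* classes and == on two separately constructed Point instances would still be false.

```java
// Sketch only: Account and Point are hypothetical names, and Point.equals()
// is a stand-in for the same* (substitutability) test discussed in the thread.
final class SameDemo {
    // An ordinary reference* class: every "new" yields a distinct identity.
    static final class Account {
        final String owner;
        Account(String owner) { this.owner = owner; }
    }

    // A value-like class: immutable, compared purely by field contents.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    public static void main(String[] args) {
        Account a1 = new Account("ann");
        Account a2 = a1;                  // data flow from the same "new"
        Account a3 = new Account("ann");  // unrelated construction, same fields
        System.out.println(a1 == a2);     // true
        System.out.println(a1 == a3);     // false

        Point p1 = new Point(1, 2);
        Point p2 = new Point(1, 2);       // unrelated construction, same fields
        System.out.println(p1.equals(p2)); // true
    }
}
```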
> 
> Back to the JLS:
> 
>> Most objects have state, stored in the fields of objects
>> that are instances of classes or in the variables that
>> are the components of an array object.
> 
> Here we may learn that both Value Based(tm) classes
> and proper value* classes have state, despite the fact
> that their fields are final and/or immutable.  I guess
> that's fine, since we also learn from the JLS that every
> variable has state (a.k.a. its value, assigned in that
> variable's storage) as well as type.
> 
> (Note to self:  Hold off on bets that "stateless" will
> supplant "value*" in the terminology sweepstakes.)
> 
> The other classy terminology—object, field, instance,
> class—imports cleanly into Valhalla.  "Codes like a class"
> is our core mantra, and it saves a lot of trouble.
> 
>> If two variables contain references to the same object,
>> the state of the object can be modified using one variable's
>> reference to the object, and then the altered state can be
>> observed through the reference in the other variable.
> 
> This means well, but conflicts not only with value* classes
> but also with immutables.  We just learned that a VBC
> like java.lang.Integer (not quite a VBC but maybe someday)
> has something called "state", in the form of its "value" field of
> type "int".  And now we are promised that, given any
> two variables of type Integer, if they both point to the
> same instance (say, Integer.valueOf(42)) we can modify
> the "value" field of that instance to 43.  Surely someone
> is teasing us!
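
A small sketch of the Integer case, for concreteness. It relies only on the documented behavior that Integer.valueOf caches values in the range -128 to 127; the class name IntegerStateDemo is made up for the example.

```java
// Sketch only: IntegerStateDemo is a hypothetical name.
final class IntegerStateDemo {
    public static void main(String[] args) {
        Integer a = Integer.valueOf(42);
        Integer b = Integer.valueOf(42);
        System.out.println(a == b);  // true: both variables refer to the cached instance

        // Integer offers no way to change its "value" field, so the nearest
        // thing to "modifying" is rebinding one variable to another object.
        a = Integer.valueOf(43);
        System.out.println(b);       // 42: the shared instance is unchanged
    }
}
```

So the premise of the JLS sentence (state modified through one reference) can never hold for such an instance.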
> 
> Actually, the JLS seems to be further defining the term
> "same" here by appealing to state which is *modifiable*
> (though not all state is modifiable, surely).  Let's fix
> that and also make sure that value* objects fulfill the
> quoted specification:
> 
>> If two variables contain references to the same* object,
>> [and if]{.ins} the state of the object can be modified using one variable's
>> reference to the object, [and]{.del} then the altered state can be
>> observed through the reference in the other variable.
> 
> Hmm, that wasn't so bad.  So the moral of this story is
> that value* objects can have state.  (Generally speaking,
> even primitives can have state.  So:  Objects have state.)
> Also, two value* objects with the same* state (as determined
> by examining their fields pairwise) are actually the same*
> value.
> 
> The JLS should probably say something about VBCs at this
> point, because they *don't* have mutable state but *do* have
> same*-ness distinctions.  Ultimately, the identity of a reference*
> object depends on data flow of references through a program.
> 
> Two object references which derive from the same instance
> creation expression are always the same*, whether the
> object is a value* or a reference* object.   But two
> object references which derive from distinct instance
> creation expressions *must not* be the same* if the
> object is a reference* object, and *may be* the same* if
> the object is a value* object, as long as the two distinct
> instance creation expressions supplied the same* field
> values to the two value* instances.  Say that ten times
> fast.
> 
> Flushed by a sense of success, let's try a bit of 15.21.3:
> 
>> At run time, the result of == is true if the operand values are both null or both refer to the same object or array; otherwise, the result is false.
>> 
>> The result of != is false if the operand values are both null or both refer to the same object or array; otherwise, the result is true.
> 
> Hey, all we need to do is add the star to "same*" and we're done.
> 
> There's a bonus section right here in the JLS, which I will annotate with
> a star:
> 
>> While == may be used to compare references of type String,
>> such an equality test determines whether or not the two operands
>> refer to the same* String object.
> 
>> The result is false if the operands are distinct [(i.e., not the same*)]{.ins}
>> String objects, even if they contain the same sequence
>> of characters (§3.10.5).
> 
>> The contents of two strings s and t can be tested for equality
>> by the method invocation s.equals(t).
> 
> What we have here is that word same* again, and also the different
> though related concept of "same sequence".  The JLS admits frankly that
> two not-the-same* strings might have the same (no star!!) contents.
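
The String case quoted above, as a minimal runnable sketch (the class name StringSameDemo is invented for the example):

```java
// Sketch only: StringSameDemo is a hypothetical name.
final class StringSameDemo {
    public static void main(String[] args) {
        String s = new String("hello");  // two distinct instance creations...
        String t = new String("hello");  // ...with the same character sequence
        String u = s;                    // another reference to the first object

        System.out.println(s == u);      // true:  the same* String object
        System.out.println(s == t);      // false: distinct objects
        System.out.println(s.equals(t)); // true:  same (no star) contents
    }
}
```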
> 
> Perhaps that means this elusive term "value*" is really short for
> "same if contents are the same", and the term "reference*" means
> "same only if they have the same instance creation event".
> Roget himself couldn't boil all that down to a snappy pair of terms.
> 
> The best I can do is "content*" vs. "identity*":  "content class Foo{}"
> vs. "identity class Bar{}".  A content* class depends only on content,
> while an identity* class cares about where it came from.  (The JLS
> makes only sparing use of the terms "content" and "identity"; see
> Example 4.3.1-2 for the latter.)
> 
> Let's try a trip to the JVMS.
> 
> All the above about same* means the definition of "acmp" we have been
> talking about is somehow related to same*-ness as derived from the
> language of the JLS, and modified to suit values.
> 
> And Brian's argument (which I agree with) is that "acmp" should just do
> a same*-ness test, without further clever optimizations.
> 
> Let's see what that does to the JVM spec in Chapter 6:
> 
>> Both value1 and value2 must be of type reference. They are both popped from the operand stack and compared. The results of the comparison are as follows:
>> 	•  if_acmpeq succeeds if and only if value1 = value2
>> 	•  if_acmpne succeeds if and only if value1 ≠ value2
> 
> Here's same-ness under another guise, the deceptively simple
> looking (in-)equality operator.  Perhaps it should just read:
> 
>> 	•  if_acmpeq succeeds if and only if value1 and value2 are the same*
>> 	•  if_acmpne succeeds if and only if value1 and value2 are not the same*
> 
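
One way to picture the reworded if_acmpeq is as a plain Java model. This is a rough sketch, not the proposed JVM semantics: isSame and Point are hypothetical, Point stands in for a value* class, and a real implementation would have to handle every value* class and apply the test recursively to nested fields.

```java
// Sketch only: a hand-written model of a same* test. AcmpSketch, Point and
// isSame are hypothetical names, not part of any JDK or Valhalla API.
final class AcmpSketch {
    // Stand-in for a value* class: immutable fields, no meaningful identity.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Models "if_acmpeq succeeds if and only if a and b are the same*".
    static boolean isSame(Object a, Object b) {
        if (a == b) return true;              // same identity, or both null
        if (a instanceof Point && b instanceof Point) {
            Point p = (Point) a, q = (Point) b;
            return p.x == q.x && p.y == q.y;  // pairwise field comparison
        }
        return false;                         // distinct reference* objects, or null vs. non-null
    }

    public static void main(String[] args) {
        System.out.println(isSame(new Point(1, 2), new Point(1, 2))); // true
        System.out.println(isSame(new Object(), new Object()));       // false
        System.out.println(isSame(null, null));                       // true
    }
}
```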
> 
> Again, HTH.
> 
> — John