generic specialization design discussion

Tue Apr 9 19:38:12 UTC 2019

On Apr 9, 2019, at 11:55 AM, John Rose <john.r.rose at oracle.com> wrote:
> 
> It's not like the UML experts are looking at the last
> two decades of our thrashing out the value model,
> and saying "yep, we wondered when you would get
> here".  They are as stuck in the Smalltalk model as
> we were.

P.S. I did a quick scan for mentions of object identity
in this publicly available document:
  https://www.omg.org/spec/UML/2.5.1/PDF

The term "identity" is assumed but not defined, which
IMO is a hallmark of Smalltalk-era object modeling,
the kind of model we are struggling to get at arm's
length so we can wrap our arms around it.

Here's a quote from page 151:

> The effect property may be used to specify what happens to objects passed in or out of a Parameter. It does not apply to parameters typed by data types, because these do not have identity with which to detect changes.

(Data types are like Java primitives or value types.)

Page 459, on identity tests:

> If an object is classified solely as an instance of one or more Classes, then testing whether it is the “same object” as another object is based on the identity of the object, independent of the current values for its StructuralFeatures or any links in which it participates (see sub clause 11.4.2).

Page 460 on class changes (e.g. CLOS change-class):

> The identity of the input object is preserved, no behaviors are executed, and no default value expressions are evaluated. The newClassifiers replace existing classifiers in an atomic step, so that structural feature values and links are not lost during the reclassification when the oldClassifiers and newClassifiers have structural features and associations in common.

If this paragraph were guarded by language saying
"the input object must be of the Foo kind", then we
could look for an anti-Foo in the spec.  But it's not.

One useful bit is the definition of "data type", on page
167.  And this is where the UML folks have the best
claim to asking us "what took you so long?".

> 10.2.3.1 DataTypes
> 
> A DataType is a kind of Classifier. DataType differs from Class in that instances of a DataType are identified only by their value. All instances of a DataType with the same value are considered to be equal instances.
> 
> If a DataType has attributes (i.e., Properties owned by it and in its namespace) it is called a structured DataType. Instances of a structured DataType contain attribute values matching its attributes. Instances of a structured DataType are considered to be equal if and only if the structure is the same and the values of the corresponding attributes are equal.
> 
> A DataType may be parameterized, bound, and used as TemplateParameters.

As a bonus, we also have:

> 10.2.3.2 Primitive Types
> 
> A PrimitiveType defines a predefined DataType, without any substructure. A PrimitiveType may have algebra and operations defined outside of UML, for example, mathematically. The run-time instances of a PrimitiveType are values that correspond to mathematical elements defined outside of UML (for example, the Integers).

So, if we want to follow UML, we could call value/inline classes
something like "data type" classes or "structured data" classes.
(Now I sympathize with C# structs.)
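To make the UML rule for structured DataTypes concrete, here is a
minimal Java sketch of "equal if and only if the structure is the
same and the values of the corresponding attributes are equal".
The class name Point and its fields are hypothetical, chosen only
for illustration.  Note the caveat: in today's Java a Point still
carries identity, so this only approximates a UML data type by
making equals/hashCode depend on attribute values alone.

```java
// Hypothetical illustration; Point is not taken from the UML spec or from Valhalla.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // Equal iff the other value has the same structure (it is a Point)
    // and the corresponding attribute values are equal.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Point)) {
            return false;
        }
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    // Hash depends only on attribute values, never on identity.
    @Override
    public int hashCode() {
        return java.util.Objects.hash(x, y);
    }
}

final class DataTypeDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b)); // true: same structure, same values
        System.out.println(a == b);      // false: for an ordinary class, == still compares identity
    }
}
```

The second println is the gap between this sketch and a true data
type: an identity-free instance would have no identity for == to
compare.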

We might have to give up on saying that "classes have instances
which are objects" and similar things because UML makes a strong
distinction between identity-free data types and identity-laden
objects.  One of our basic principles in Valhalla is "codes like a
class".  UML says "classifier" instead of "class", and treats a
data type as a kind of classifier, so that's OK.  We'd have to
give up our use of the term "object" or bend away from UML
usage there, because our inline classes define structured data,
not objects, in UML terms.

Basically, UML policy is to first distinguish by-value structured
data types from by-identity objects, but that's not our policy.
We build everything from objects.  Oddly, this doesn't contradict
UML as a modeling facility, but where UML allows identity-sensitive
operations on arbitrary objects, we have to say (a) such an operation
is partial, and applies only to some objects, or (b) such an operation
is interpreted (as == and hashCode are) without reference to identity,
for objects which lack identity.

Maybe we could smuggle classifiers for NoIdentity and HasIdentity
into a future version of UML, with appropriate bounds for UML's
identity-sensitive operations?  Then it could describe the structure
we are building.
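The two readings (a) and (b) of an identity-sensitive operation can
be sketched in Java roughly as follows.  Everything here is an
assumption made for illustration: HasIdentity and NoIdentity are
modeled as plain marker interfaces, Account and Money are made-up
example classes, and sameObjectPartial / sameObjectTotal are
hypothetical names, not UML operations or Valhalla APIs.

```java
// Hypothetical marker interfaces standing in for the two classifiers.
interface HasIdentity {}
interface NoIdentity {}

// An identity-laden object: two accounts with equal balances are still distinct.
final class Account implements HasIdentity {
    long balance;
    Account(long balance) { this.balance = balance; }
}

// An identity-free value: determined entirely by its attribute.
final class Money implements NoIdentity {
    final long cents;
    Money(long cents) { this.cents = cents; }

    @Override
    public boolean equals(Object other) {
        return other instanceof Money && ((Money) other).cents == cents;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(cents);
    }
}

final class IdentityOps {
    // Option (a): the operation is partial and rejects identity-free operands.
    static boolean sameObjectPartial(Object a, Object b) {
        if (a instanceof NoIdentity || b instanceof NoIdentity) {
            throw new IllegalArgumentException(
                "identity test applied to an identity-free object");
        }
        return a == b;
    }

    // Option (b): the operation is total; for identity-free operands it is
    // reinterpreted as a comparison of values, without reference to identity.
    static boolean sameObjectTotal(Object a, Object b) {
        if (a instanceof NoIdentity && b instanceof NoIdentity) {
            return a.equals(b);
        }
        return a == b;
    }
}
```

Under (a), asking whether two Money values are the "same object"
throws; under (b), the same question is answered by comparing
their values.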

— John