The Spirit of acmp

Wed Apr 10 01:55:26 UTC 2019

Hello,

Acmp poses some interesting problems that could influence fundamental aspects of Java. 
No consensus seems to exist about acmp yet, but we know that there are different categories of “value types” with unique needs. 

The point of this mail is to explore the consequences of making a case distinction between different categories of “value types” and find solutions to general concerns.
I tried to be concise, but I think such a delicate topic needs to be treated carefully (please bear with me).

Before making my point, I want to reiterate the critical aspects of "==":
    - It must have a defined behavior for "value types" since they can be cast to Object.
    - It must be fast. A recursive descent for tree- or list-like structures is not an option.
    - It _should_ perform a substitutability test for "some cases".

In this context I think "some cases" actually refers to "data carriers" such as "Point".
At least for numerics I think it is a hard sell not to have "complex1 == complex2" in a language.

Nonetheless, the real problem of "==" comes from non-data classes like "Cursor" or "ValueTreeNode", as they violate the performance constraint when checking for substitutability.
Because of this violation the only easy way out is to always return "false".

The question is, can we get away with having the latter while still keeping the former? For this we could make a case distinction.
First, we have "composed primitive" types, which really want "==" and compare contiguous bits in memory. But this puts some constraints on field types.
"Composed primitives" can contain only primitives or other "composed primitives" (hence the name). Technically, RefObjects could be legal fields, but relying on object identity for value substitutability scares me.
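
To make the "compare bits" idea concrete, here is a minimal sketch that emulates it with an ordinary final class. The name "Complex" and the helper "sameBits" are assumptions for illustration only; they are not part of any proposal, and a real "composed primitive" would get this comparison from the VM rather than from a hand-written method.

```java
// Hypothetical "composed primitive", emulated with an ordinary final class.
final class Complex {
    final double re;
    final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Emulates a bitwise "==": compares the raw bits of every field.
    static boolean sameBits(Complex a, Complex b) {
        return Double.doubleToRawLongBits(a.re) == Double.doubleToRawLongBits(b.re)
            && Double.doubleToRawLongBits(a.im) == Double.doubleToRawLongBits(b.im);
    }
}
```

Note that a pure bit comparison is not the same as "==" on the individual doubles: it treats 0.0 and -0.0 as different and a NaN as equal to a NaN with the same bit pattern.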

More pressing are "inline class" types, which are more akin to objects without identity (where "==" breaks apart).
They have no restrictions on their fields, and "==" on them would always return "false". I think the consequences are manageable.
First, you can counteract this by not letting "==" compile, which will make many developers aware of the problem.

Furthermore, the application of "==" is rather narrow in scope. You use it on numbers, in null checks and as optimization for comparisons (after all, it is just a subset of the logical .equals).
As a side effect, making "==" return "false" puts more pressure on .equals (since it is often called when an identity check fails). Because of this pressure the .equals method of "inline classes" should implement the exhaustive evaluation that is so scary to use for "==". If you want fast substitutability, use Objects.
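
As a rough sketch of what such an exhaustive .equals could look like, assume a hypothetical "ValueTreeNode" with an int payload and two nullable children (the field layout is my assumption, and an ordinary final class stands in for an "inline class"). There is deliberately no identity shortcut, since an "inline class" would not have one.

```java
import java.util.Objects;

// Hypothetical stand-in for an "inline class"; the fields are assumed for illustration.
final class ValueTreeNode {
    final int value;
    final ValueTreeNode left;   // may be null
    final ValueTreeNode right;  // may be null

    ValueTreeNode(int value, ValueTreeNode left, ValueTreeNode right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Null-aware structural comparison; only null checks, no identity check.
    private static boolean same(ValueTreeNode a, ValueTreeNode b) {
        return a == null ? b == null : a.equals(b);
    }

    // Exhaustive (recursive) evaluation: cost grows with the size of the tree.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ValueTreeNode)) {
            return false;
        }
        ValueTreeNode other = (ValueTreeNode) o;
        return value == other.value
            && same(left, other.left)
            && same(right, other.right);
    }

    @Override
    public int hashCode() {
        return Objects.hash(value, left, right);
    }
}
```

The recursion in equals is exactly the cost that makes this unsuitable as the meaning of "==", but acceptable for an explicit .equals call.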

Consequently, banning "==" from "inline classes" means that you cannot use "==" on generic types that include "inline classes" (just like you can't assign null).
Would it be so bad to force users to rely on logical equality in this case? Since .equals is a superset of identity checks, bugs should only come up in questionable code.

However, if identity checks turn out to be significant for performance in a large enough number of cases, we could introduce a new operator for logical equality (similar to Stephen's suggestion).
Now, if the type parameter is a RefObject this operator could compile to an additional identity check before .equals is called.
Similarly, this could give you an easy way to perform a null-check on an "any T" type. It could correctly check nullable types for zeroes and return false otherwise.
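
For reference types, one possible desugaring of such an operator is sketched below as a plain static method. The class "Equality" and the method "logicalEquals" are hypothetical names; nothing here is proposed syntax. For an "inline class" operand the identity fast path would simply be left out.

```java
final class Equality {
    private Equality() {}

    // Hypothetical desugaring of a logical-equality operator for RefObjects.
    static boolean logicalEquals(Object a, Object b) {
        if (a == b) {
            return true;              // identity fast path; also covers null on both sides
        }
        if (a == null || b == null) {
            return false;             // exactly one side is null
        }
        return a.equals(b);           // fall back to logical equality
    }
}
```

This has the same shape as the existing java.util.Objects.equals, so the operator would mostly be syntax for a pattern libraries already use.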

This operator could help in the migration work of libraries to support "value types" and clean up some syntax.
In general, an equality operator could guide the language (and its users) towards a more value-like path that is less dependent on identity (but still benefits from it when applicable).
I can only imagine that an equality operator like .= or ?= would be adopted rather quickly, both for migration purposes and in general.

Coming full circle, the notion of a "value type" just might be too broad. Conceptually, "composed primitives" (true value types) are different from "inline classes" (technical value types). Maybe they should be treated as such.

Here are some follow-up thoughts which are out of scope:
- 1: If "==" is allowed on "composed primitives" it could be interpreted as a form of operator overloading. Building upon this, a "composed primitive" could implement a "numerical interface" which allows the use of more operators (%, *, /, + or -).
      Technically, value types and their generic specialization are an exciting premise for new math APIs. And given the constraints of "composed primitives" I think the room for misusing/abusing operator overloading is limited.

- 2: I think the core problem of "value types with substitutability check" aligns itself with "record classes". Where "composed primitives" are data carriers of primitives, records are data carriers of all types. Perhaps "composed primitives" are really just "primitive records"?

That's it. I hope there were some interesting bits.
Patrick
