Fwd: The Spirit of acmp

Mon Sep 30 20:11:08 UTC 2019

This was received on the -comments list (some time ago).

The gist of this comment is that this particular commenter is willing to 
trade completeness for speed, by trying to make a "best efforts" attempt 
to have ACMP compare values, but punt when it comes to recursion (an 
idea we've discussed before.)  He gets there by appealing to the 
distinction between "pure" and "polluted" values, which has also come up 
before in the discussion.

-------- Forwarded Message --------
Subject: 	The Spirit of acmp
Date: 	Wed, 10 Apr 2019 03:55:12 +0200
From: 	Patrick Plieschnegger <Patrick at Plieschnegger.com>
To: 	valhalla-spec-comments at openjdk.java.net

Hello,

Acmp poses some interesting problems that could influence fundamental 
aspects of Java. No consensus seems to exist about acmp yet, but we know 
that there are different categories of “value types” with unique needs.
The point of this mail is to explore the consequences of making a case 
distinction between different categories of “value types” and find 
solutions to general concerns.
I tried to be concise, but I think such a delicate topic needs to be 
treated carefully (please bear with me).

Before making my point, I want to reiterate the critical aspects of 
"==":
- It must have a defined behavior for "value types" since they can be 
cast to Object.
- It must be fast. A recursive descent for tree- or list-like structures 
is not an option.
- It _should_ perform a substitutability test for "some cases".

In this context I think "some cases" actually refers to "data carriers" 
such as "Point".
At least for numerics I think it is a hard sell not to have "complex1 == 
complex2" in a language.

Nonetheless, the real problem of "==" comes from non data classes like 
"Cursor" or "ValueTreeNode" as they violate the performance constraint 
when checking for substitutability.
Because of this violation the only easy way out is to always return "false".

The question is, can we get away with having the latter while still 
keeping the former? For this we could make a case distinction.
First, we have "composed primitive" types, which really want "==" and 
compare contiguous bits in memory. But this puts some constraints on 
field types.
"Composed primitives" can contain only primitives or other "composed 
primitives" (hence the name). Technically, RefObjects could be legal 
fields, but relying on object identity for value substitutability scares me.
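A minimal sketch of what "compare contiguous bits" could mean for a 
"composed primitive", under stated assumptions: Java has no such construct 
today, so an ordinary final class stands in for one, and the class name 
Complex and the helper substitutable are hypothetical names chosen for 
illustration. The check touches a fixed number of fields and never recurses.

```java
// Hypothetical stand-in for a "composed primitive": an ordinary final
// class whose fields are all primitives.
final class Complex {
    final double re;
    final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    // Bitwise substitutability: two values are interchangeable when every
    // field holds exactly the same bits. Fixed cost, no recursion.
    static boolean substitutable(Complex a, Complex b) {
        return Double.doubleToRawLongBits(a.re) == Double.doubleToRawLongBits(b.re)
            && Double.doubleToRawLongBits(a.im) == Double.doubleToRawLongBits(b.im);
    }

    public static void main(String[] args) {
        Complex c1 = new Complex(1.0, 2.0);
        Complex c2 = new Complex(1.0, 2.0);
        System.out.println(c1 == c2);              // identity comparison on today's objects
        System.out.println(substitutable(c1, c2)); // field-by-field bit comparison
    }
}
```

One consequence of comparing raw bits rather than numeric values: 0.0 and 
-0.0 are not substitutable, because their bit patterns differ.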

More pressing are the "inline class" types, which are more akin to 
objects without identity (where "==" breaks apart).
They have no restrictions on their fields and always return “false”... 
and I think the consequences are manageable.
First, you can counteract this by not letting "==" compile, which will 
make many developers aware of the problem.

Furthermore, the application of "==" is rather narrow in scope. You use 
it on numbers, in null checks and as optimization for comparisons (after 
all, it is just a subset of the logical .equals).
As a side effect, making "==" return "false" puts more pressure on 
.equals (since it is often called when an identity check fails). Because 
of this pressure the .equals method of "inline classes" should implement 
the exhaustive evaluation that is so scary to use for "==". If you want 
fast substitutability, use Objects.
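To illustrate the "exhaustive evaluation" that would move into .equals, 
here is a sketch under assumptions: the mail names ValueTreeNode but does 
not describe it, so the fields value, left and right are hypothetical, and 
an ordinary final class stands in for an "inline class".

```java
import java.util.Objects;

// Hypothetical tree-shaped stand-in for an "inline class".
final class ValueTreeNode {
    final int value;
    final ValueTreeNode left;   // may be null
    final ValueTreeNode right;  // may be null

    ValueTreeNode(int value, ValueTreeNode left, ValueTreeNode right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Exhaustive structural comparison: recurses into both subtrees, so
    // the cost grows with the size of the tree.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ValueTreeNode)) {
            return false;
        }
        ValueTreeNode other = (ValueTreeNode) o;
        return value == other.value
            && Objects.equals(left, other.left)
            && Objects.equals(right, other.right);
    }

    // Kept consistent with equals: structurally equal trees hash alike.
    @Override
    public int hashCode() {
        return Objects.hash(value, left, right);
    }
}
```

The recursion is exactly the cost that is considered unacceptable for 
"==", which is why it is confined to .equals here.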

Consequently, banning "==" from "inline classes" means that you cannot 
use "==" on generic types that include "inline classes" (just like you 
can't assign null).
Would it be so bad to force users to rely on logical equality in this 
case? Since .equals is a superset of identity checks, bugs should only 
come up in questionable code.

However, if identity checks turn out to be significant for performance 
in a large enough number of cases, we could introduce a new operator for 
logical equality (similar to Stephen's suggestion).
Now, if the type parameter is a RefObject this operator could compile to 
an additional identity check before .equals is called.
Similarly, this could give you an easy way to perform a null-check on an 
“any T” type. It can correctly check nullable types for zeroes and 
return false otherwise.
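For reference types, the proposed operator could plausibly expand to 
something like the sketch below. This is an assumption about how such an 
operator might be lowered, not a description of any existing proposal; 
LogicalEq and eq are hypothetical names.

```java
// Hypothetical lowering of a logical-equality operator such as "a .= b"
// when both operands are references.
final class LogicalEq {
    static boolean eq(Object a, Object b) {
        if (a == b) {
            return true;    // identity fast path, also covers null on both sides
        }
        if (a == null) {
            return false;   // only one side is null
        }
        return a.equals(b); // fall back to logical equality
    }

    public static void main(String[] args) {
        System.out.println(eq(null, null));
        System.out.println(eq(new String("ab"), new String("ab")));
        System.out.println(eq("ab", null));
    }
}
```

This has the same shape as the existing java.util.Objects.equals(a, b). 
For an "inline class" operand the identity fast path would simply be 
omitted, leaving only the call to .equals.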

This operator could help in the migration work of libraries to support 
"value types" and clean up some syntax.
In general, an equality operator could guide the language (and its 
users) towards a more value-like path that is less dependent on identity 
(but still benefits from it when applicable).
I can only imagine that an equality operator like .= or ?= would be 
adopted rather quickly for migration purposes and in general.

Coming full circle, the notion of a "value type" just might be too broad. 
Conceptually, "composed primitives" (true value types) are different from 
"inline classes" (technical value types). Maybe they should be treated 
as such.

Here are some further thoughts which are out of scope:
- 1: If "==" is allowed on "composed primitives" it could be interpreted 
as a form of operator overloading. Building upon this, a “composed 
primitive” could implement a "numerical interface" which allows the use of 
more operators (%, *, /, + or -). Technically, value types and their 
generic specialization are an exciting premise for new math APIs. And 
given the constraints of "composed primitives" I think the room for 
misusing/abusing operator overloading is limited.

- 2: I think the core problem of "value types with substitutability 
check" aligns itself with "record classes". Where "composed primitives" 
are data carriers of primitives, records are data carriers of all types. 
Perhaps "composed primitives" are really just "primitive records"?

That's it. I hope there were some interesting bits.
Patrick
