Proposal: Sameness operators
Derek Foster
vapor1 at teleport.com
Mon Mar 30 22:51:38 PDT 2009
AUTHOR:
Derek Foster
OVERVIEW
Many Java objects implement the Comparable<T> interface, and/or override the Object.equals() method, to provide a way of ordering their instances or detecting whether two instances are equivalent. However, the syntax for using these methods has historically been fairly ugly:
// We want to write "if a >= b", but we have to write...
if (a.compareTo(b) >= 0) {
whatever();
}
// We want to write "if a == b", but we have to write...
if (a.equals(b)) {
whatever();
}
The ugliness of these methods has often motivated the creation of special-purpose APIs to simplify the use of classes which implement them, such as the 'Date.before(Date)' and 'Date.after(Date)' methods in java.util.Date. However, these methods are inconsistent from class to class, and often not present.
Furthermore, the existing language == and != operators exhibit a strange asymmetry between the behavior of objects and that of primitive types that often catches new users of Java by surprise:
int i = 5;
if (i == 5) {
// gets here if the VALUE of i is 5
}
String j = new String("abc");
if (j == "abc") {
// probably never gets here! Comparing the values of the references, not the values of the strings.
}
String k = new String("abc");
if (k.equals("abc")) {
// gets here if the VALUE of k is "abc".
}
This behavior often confuses newcomers to Java, and is a common source of bugs even for experienced programmers.
This proposal suggests that a new set of relational operators be added to Java which would simplify comparing objects by their declared orderings (as specified by the Comparable<T> interface and Object.equals()), while still yielding syntax that is as simple as using the existing >=, <=, ==, and != operators.
FEATURE SUMMARY:
Adds four new operators to Java:
a $$ b "same as": a==null ? b==null : a.equals(b), or a == b for primitive types.
a !$ b "not same as": a==null ? b!=null : !a.equals(b), or a != b for primitive types.
a >$ b "greater than or same": a.compareTo(b) >= 0, or a >= b for primitive types.
a <$ b "less than or same": a.compareTo(b) <= 0, or a <= b for primitive types.
and adds additional overloadings to existing operators:
a < b a.compareTo(b) < 0, or a < b for primitive types.
a > b a.compareTo(b) > 0, or a > b for primitive types.
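As a rough illustration, the expansions listed above can be emulated in current Java with static helper methods. This is only a sketch: the class name SamenessOps and its method names are hypothetical and are not part of the proposal, and the generic bound on the ordering helpers is an assumption about how the Comparable requirement might be expressed.

```java
// Hypothetical helper class; names are illustrative assumptions only.
final class SamenessOps {
    private SamenessOps() {}

    // Emulates "a $$ b": null-safe equivalence via equals().
    static boolean same(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }

    // Emulates "a !$ b": the negation of same().
    static boolean notSame(Object a, Object b) {
        return a == null ? b != null : !a.equals(b);
    }

    // Emulates "a >$ b"; throws NullPointerException if a is null.
    static <T extends Comparable<? super T>> boolean greaterOrSame(T a, T b) {
        return a.compareTo(b) >= 0;
    }

    // Emulates "a <$ b"; throws NullPointerException if a is null.
    static <T extends Comparable<? super T>> boolean lessOrSame(T a, T b) {
        return a.compareTo(b) <= 0;
    }

    // Emulates the proposed overloading of "a > b" for Comparable objects.
    static <T extends Comparable<? super T>> boolean greater(T a, T b) {
        return a.compareTo(b) > 0;
    }

    // Emulates the proposed overloading of "a < b" for Comparable objects.
    static <T extends Comparable<? super T>> boolean less(T a, T b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        String j = new String("abc");
        System.out.println(same(j, "abc"));          // true: values are compared
        System.out.println(same(null, null));        // true
        System.out.println(notSame(null, "abc"));    // true
        System.out.println(greaterOrSame("b", "a")); // true
        System.out.println(lessOrSame(3, 3));        // true (boxed Integers)
        System.out.println(less("b", "a"));          // false
    }
}
```

The helpers take boxed operands, so the primitive cases in the list above are covered only indirectly through autoboxing.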
MAJOR ADVANTAGE:
Use of new operators for these relational tests would simplify code and make it more clear what relational tests are being made, as well as reducing the opportunity for mistakes (such as accidentally typing a.compareTo(b) >=0 when <=0 was intended).
MAJOR BENEFIT:
Clearer code due to the use of infix operators instead of using method calls and an extra pair of parentheses, plus possible extra tests (often accidentally omitted) for nullness around calls to Object.equals(Object).
Future proposals involving limited uses of operator overloading for code clarity (for BigInteger, etc. classes) would no longer run into the iceberg of "but you can't change the behavior of == in a backwards-compatible fashion!"
MAJOR DISADVANTAGE:
Modifications to the compiler would be required. These do not appear particularly difficult, but would take some effort.
The !$ operator would not be available for Perl-style pattern matching, if that were ever added to Java.
ALTERNATIVES:
Keep using the Comparable.compareTo and Object.equals methods as they are.
It would have been better to define == and != this way in the first place, and use some other operator (perhaps === as in some other languages?) to indicate comparison by object identity. This would have made Java simpler and easier for newcomers to understand. However, that's not how the language is currently defined, and to change the behavior of == now would be a backwards incompatible change.
EXAMPLES
SIMPLE EXAMPLE:
void getOrdering(String first, String second) {
if (first $$ second) {
System.out.println("They are equal");
} else if (first !$ second) {
System.out.println("They are not equal");
} else if (first > second) {
System.out.println("The first is after the second");
} else if (first >$ second) {
System.out.println("The first is same as or after the second");
} else if (first < second) {
System.out.println("The first is before the second");
} else if (first <$ second) {
System.out.println("The first is before or the same as the second");
}
}
ADVANCED EXAMPLE:
Really, the simple example pretty much illustrates the feature.
DETAILS
SPECIFICATION:
The following new tokens
$$ !$ >$ <$
shall be added to section 3.12 of the JLS3.
The expression grammar in section 15.20 shall be modified like so:
RelationalExpression:
ShiftExpression
RelationalExpression < ShiftExpression
RelationalExpression > ShiftExpression
RelationalExpression <= ShiftExpression
RelationalExpression >= ShiftExpression
RelationalExpression >$ ShiftExpression
RelationalExpression <$ ShiftExpression
RelationalExpression instanceof ReferenceType
The expression grammar in section 15.21 shall be modified like so:
EqualityExpression:
RelationalExpression
EqualityExpression == RelationalExpression
EqualityExpression != RelationalExpression
EqualityExpression $$ RelationalExpression
EqualityExpression !$ RelationalExpression
Semantically, the behavior of these new operators is as follows:
$$ and !$ operators:
Evaluation of these operators shall occur exactly as they do for the == and != operators, as specified in section 15.21 of the JLS3 ("Equality Operators"), with the exception that for the purposes of these operators, section 15.21.3 ("Reference equality operators == and !=") shall be disregarded and replaced with:
If the operands of a sameness operator are both of either reference type or the null type, then the operation is object equivalence. The behavior described below is for the $$ operator. The !$ operator shall behave identically except that it shall return false when the $$ operator would return true, and vice versa. The procedure for evaluating the $$ operator is as follows:
If both operands are null, then the result shall be 'true'.
Otherwise, if the left operand is null and the right operand is not null, then the result shall be 'false'.
Otherwise, the return value shall be the value of the expression "left.equals(right)", evaluated using the java.lang.Object.equals(Object) method.
>, >$, <, and <$ operators:
Evaluation of these operators shall occur exactly as it does for corresponding >, >=, <, and <= operators, in section 15.20.1 of the JLS3 ("Numerical Comparison Operators <, <=, >, and >=") with the exception of the following.
The text "The type of each of the operands of a numerical comparison operator must be a type that is convertible (§5.1.8) to a primitive numeric type, or a compile-time error occurs." shall be replaced with the text "If the type of both of the operands of a numerical comparison operator is a type that is convertible (§5.1.8) to a primitive numeric type, the following algorithm is used to evaluate the operator. Otherwise, the operation is object ordering (See §15.20.1.1)"
A new section 15.20.1.1 shall be added consisting of the following text:
If one or both operands of a relational operator cannot be converted to primitive numeric types, then boxing conversions shall be used to convert both operands to object types.
After this conversion, if the type of the left operand does not extend the raw type java.lang.Comparable, and also does not extend java.lang.Comparable<T> for some type T which is equal to or a supertype of the type of the right operand, then a compiler error shall be reported.
Otherwise, the result of the comparison shall be evaluated at runtime as follows:
If the left operand is null, a NullPointerException shall be thrown.
Otherwise, the method left.compareTo(right) shall be called. The following table shall be used to determine the result of the operator evaluation, based on the value returned from this method:
             left.compareTo(right)
Operator     < 0      == 0     > 0
>            false    false    true
<            true     false    false
>$           false    true     true
<$           true     true     false
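The object-ordering rules above can be sketched in current Java as follows. The class name ObjectOrdering, the Op enum, and the evaluate method are hypothetical names introduced for illustration. The generic bound "L extends Comparable<? super R>" is an assumption that approximates the compile-time check described above; it does not model the raw-type case.

```java
// Hypothetical sketch of the proposed section 15.20.1.1; all names are assumptions.
final class ObjectOrdering {
    // Stand-ins for the four operators: >, <, >$, <$
    enum Op { GREATER, LESS, GREATER_OR_SAME, LESS_OR_SAME }

    // The left operand must be Comparable to the right operand's type
    // (or a supertype of it), otherwise the call does not compile.
    static <R, L extends Comparable<? super R>> boolean evaluate(Op op, L left, R right) {
        if (left == null) {
            // Runtime rule: a null left operand throws NullPointerException.
            throw new NullPointerException("left operand is null");
        }
        int c = left.compareTo(right);
        // Map the sign of compareTo() to the operator result, as in the table.
        switch (op) {
            case GREATER:         return c > 0;
            case LESS:            return c < 0;
            case GREATER_OR_SAME: return c >= 0;
            case LESS_OR_SAME:    return c <= 0;
            default:              throw new AssertionError(op);
        }
    }

    public static void main(String[] args) {
        // Prints one row per operator; columns are compareTo < 0, == 0, > 0.
        for (Op op : Op.values()) {
            System.out.println(op + ": "
                    + evaluate(op, 1, 2) + " "
                    + evaluate(op, 2, 2) + " "
                    + evaluate(op, 3, 2));
        }
        // evaluate(Op.LESS, "a", 1);  // rejected at compile time: String is not Comparable to Integer
    }
}
```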
COMPILATION:
The body of the example given above would be desugared as follows:
if (first==null ? second==null : first.equals(second)) {
System.out.println("They are equal");
} else if (first==null ? second != null : !first.equals(second)) {
System.out.println("They are not equal");
} else if (first.compareTo(second) > 0) {
System.out.println("The first is after the second");
} else if (first.compareTo(second) >= 0) {
System.out.println("The first is same as or after the second");
} else if (first.compareTo(second) < 0) {
System.out.println("The first is before the second");
} else if (first.compareTo(second) <= 0) {
System.out.println("The first is before or the same as the second");
}
TESTING:
The feature can be tested by ensuring, first of all that the new operators return the same results as existing operators when invoked on operands which are both convertible to numeric primitive types.
Secondly, that the new operators return the same results as their desugaring equivalents when invoked in circumstances where one or both operands are not convertible to numeric primitive types.
Thirdly, that the relational operators throw NullPointerExceptions when the left operand is null.
Fourthly, that compiler errors are reported exactly in the cases when the desugared equivalents of the operators would not compile.
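Since the operators themselves do not exist yet, the first three checks can only be approximated today by testing the desugared forms. The sketch below does that for boxed Integer operands; the class SamenessSelfCheck and its methods are hypothetical, and the fourth check (compile-time errors) cannot be expressed as a runtime test.

```java
// Hypothetical self-check of the desugared forms; names are assumptions.
final class SamenessSelfCheck {
    // Desugared form of "a $$ b".
    static boolean same(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }

    // compareTo() call shared by the desugared forms of >, <, >$ and <$.
    static <T extends Comparable<? super T>> int order(T a, T b) {
        return a.compareTo(b);
    }

    // Counts disagreements between the desugared forms on boxed Integers
    // and the built-in operators on the same int values in [lo, hi].
    static int countMismatches(int lo, int hi) {
        int mismatches = 0;
        for (int x = lo; x <= hi; x++) {
            for (int y = lo; y <= hi; y++) {
                Integer a = x;
                Integer b = y;
                if (same(a, b) != (x == y)) mismatches++;
                if ((order(a, b) > 0) != (x > y)) mismatches++;
                if ((order(a, b) < 0) != (x < y)) mismatches++;
                if ((order(a, b) >= 0) != (x >= y)) mismatches++;
                if ((order(a, b) <= 0) != (x <= y)) mismatches++;
            }
        }
        return mismatches;
    }

    // Checks that a null left operand leads to a NullPointerException.
    static boolean throwsNpeOnNullLeft() {
        try {
            Integer left = null;
            order(left, 1);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(-200, 200));
        System.out.println("NPE on null left operand: " + throwsNpeOnNullLeft());
    }
}
```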
LIBRARY SUPPORT:
No library changes are needed for this feature.
REFLECTIVE APIS:
No changes to reflective APIs are needed for this feature.
OTHER CHANGES:
No other parts of the platform need to be updated.
MIGRATION:
A tool such as Eclipse or IntelliJ IDEA might identify for users existing calls to .compareTo or .equals which could be converted to use the new operators, and could offer to perform the change automatically at the user's request. Or a user could simply refactor code to use the new operators as desired.
COMPATIBILITY
BREAKING CHANGES:
These operators should not cause any existing valid programs to break, with one rare exception. A program which, in source code, used the $ character at the start of an identifier might be affected, if it contained code such as:
if (a<$something) {
}
meaning "if ( a < $something )" rather than "if ( a <$ something )" which is how it would be parsed according to this proposal. This would almost certainly result in a compiler error about a missing variable "something", particularly if the body of generated code was large, so the odds of this resulting in correctly compiling but silently misinterpreted code are small.
This would only happen in generated code (the only place a $ character is supposed to be used, according to the JLS), and is expected to be an extremely rare phenomenon, since generated code rarely starts identifiers with dollar signs, and also usually generates code with spaces around operators for readability. This problem can be easily fixed when upgrading to Java 7 by altering the code generator to put a space in front of any such identifiers, either globally or only when they follow a < or > character. Note that internal synthetic variables generated by a compiler would be immune from this problem since they would never appear in source code form.
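For reference, the following is valid Java today and shows the kind of code that would be affected. The class and method names are hypothetical; only the expression "a<$something" matters.

```java
// Hypothetical demo of the tokenization issue described above.
final class DollarIdentifierDemo {
    static boolean lessThanDollar(int a, int $something) {
        // The current lexer reads this as "a < $something". Under the
        // proposal the same characters would be read as "a <$ something",
        // which would refer to an undeclared variable named "something".
        return a<$something;
    }

    public static void main(String[] args) {
        System.out.println(lessThanDollar(1, 2)); // prints "true"
    }
}
```

Writing the expression as "a < $something" keeps its current meaning under either tokenization.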
Alternatively, this proposal could be altered to use another character instead of $ for its tokens. One good candidate for this would be "~", which lends itself to being read as "is equivalent to" (rather than "$" for "is the same as"). There are, however, rare cases that this could break as well ( if (a<~b) { ... } ) and although relational tests against negated values are quite rare (since combining bitwise operations and relationals is usually a nonsensical operation), unfortunately this breakage could occur in non-generated code, which is much more common than generated code.
Yet another option would be "#", which somewhat resembles an equals sign. Thus, instead of the $$, !$, <$, and >$ operators, this proposal could be implemented to use the ~~, !~, <~, and >~ operators, or the ##, !#, <#, and ># operators. The # operator has the advantage of being unused in Java, and so is guaranteed not to break code. However, it is highly sought after by writers of language change proposals, and so is under heavy competition as to its future meaning.
EXISTING PROGRAMS:
Except for the minor, rare, breaking change listed above, this change should not create incompatibilities with existing source or class files.
REFERENCES
EXISTING BUGS:
There are a variety of proposals in the Bug Database related to various people's desires to support operator overloading for certain built-in mathematical classes such as BigInteger. A couple of such proposals are listed below.
"Add [], -, +, *, / operators to core classes as appropriate" (related)
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5099780
"BigInteger should support autoboxing"
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6407464
Also, see discussion on the Project Coin mailing list regarding the proposal "Draft proposal: allow the use of relational operators on Comparable classes," particularly with regards to why that proposal was withdrawn (namely, inability to make the == and != operators work properly on Comparable classes).
http://mail.openjdk.java.net/pipermail/coin-dev/2009-March/000361.html
URL FOR PROTOTYPE (optional):
None.