Value equality

Remi Forax forax at univ-mlv.fr
Wed May 18 22:52:18 UTC 2016


Even without talking about optimizations,
LIFE for a value type is not correct because CE may not be included in EQ.

value class Foo {
  int value;

  public boolean equals(Foo foo) {
    return value & 0xFF == foo.value & 0xFF;
  }
}

here, if CE do a bitwise comparaison, the semantics of LIFE is not the semantics of EQ.

Rémi

----- Mail original -----
> De: "John Rose" <john.r.rose at oracle.com>
> À: "Brian Goetz" <Brian.Goetz at oracle.com>
> Cc: valhalla-spec-experts at openjdk.java.net
> Envoyé: Jeudi 19 Mai 2016 00:11:12
> Objet: Re: Value equality
> 
> On May 18, 2016, at 7:57 AM, Brian Goetz <Brian.Goetz at Oracle.COM> wrote:
> > 
> > [1] John points out that if == is CE, then (a==b||a.equals(b)) will
> > redundantly load the fields on failed ==.  But, many equals
> > implementations start with "a==b" as a short-circuiting optimization,
> > which means "a==b" will be a common (pure) subexpression in the resulting
> > expansion (and for values, methods are monomorphic and will get inlined
> > more frequently), so the two checks can be collapsed.
> 
> Proposed acronym:  Call the explicit legacy idiom for equality "LIFE".  This
> is "a==b||a.equals(b)", maybe with a null (or NaN?) check.  Semantically it
> is CE||OE.
> 
> The question of optimization is worth expanding on.  In a nutshell, pure
> control flow test like a.f=ACMP=b.f are often collapsible.  (The dominating
> test makes the later test go away, or in other words, the dominated test can
> rise upward to merge into the dominating test.)  But, even pure tests cannot
> be sunk past exception-producing code.  If a sub-operation of OE can produce
> an exception, it cannot be reordered with a previous CE sub-operation.
> Unfortunately, exceptions provide a semantic "window" into order of
> operations; this is one of the major difficulties with Java optimization.
> 
> For example:
> 
> if (a.someTag != b.someTag)  return NOT_EQUAL;  //CE, first pass
> if (!a.equals(b))  return NOT_EQUAL; //OE, second pass
> return IS_EQUAL;
> 
> If the heavy comparison somehow includes a repetition of the "someTag"
> comparison, that latter comparison can be removed as a dominating test.
> 
> But, if the heavy comparison can throw an exception (or make any side
> effect), then the "someTag" comparison cannot be moved *down* into or past
> the equals method.
> 
> The net of this is that the writer of the equals method has, in general, no
> control over the first pass of the equals algorithm, even after
> optimization.  The code might get some spot benefits from removal of
> explicit tests in the OE pass if they are already covered (dominated) by the
> CE pass.  If the code *omits* the equivalent of a CE test (e.g., an ACMP
> comparison) for some reason, the CE test is extra work.  Not much extra
> work, but it could cause extra fills or spills as the JIT works to provide a
> first-pass access to a component that will be needed, again, in a second
> pass after arbitrary intermediate labor.
> 
> So a mandatory LIFE sentence (couldn't resist that), if op== is CE (that's
> P1), means some loss of performance control around equals methods.
> 
> By comparison, if op== is OE (that's P2), there is no need to explicitly call
> CE; the .equals method does the exact job, and the coder has control.
> 
> In legacy code where op== is OE, the LIFE expression produces back-to-back
> calls to .equals, semantically OE||OE.  That would be even worse than
> CE||OE, unless we have a way to optimize OE||OE to OE, or until the users
> pick the LIFE (or the Legacy Idiom Connoting Equality) out of their code.
> 
> But, if it is allowed, collapsing two adjacent calls to .equals is a much
> simpler JIT transform than merging an adjacent CE and NE.  It's a
> macro-level CSE on partially lowered IR.
> 
> As noted above, merging a CE into an NE requires that all the logic of the CE
> be performed before the first possible exception produced by NE.
> 
> Let's try a more complete example:
> 
> class MyVal {
>   TypeDesc type;
>   long payload1;
>   long[] payload2;
>   boolean equals(MyVal that) {
>     return this.type.equals(that.type) && this.payload1 == that.payload1 &&
>     Arrays.equals(this.payload1, that.payload2);
>   }
> }
> class User {
>    boolean foo(MyVal a, MyVal b) {
>       return a==b || a.equals(b);  // CE || OE == LIFE
>    }
> }
> 
> In P1, the intermediate representation for the legacy idiom will contain
> these operations:
> 
> PASS1: // start CE
> if (a.type !ACMP= b.type)  goto PASS2;
> if (a.payload1 !LCMP= b.payload1)  goto PASS2;
> if (a.payload2 !ACMP= b.payload2)  goto PASS2;
> return IS_EQUAL;
> PASS2: // start OE
> z = a.type.INVOKEVIRTUAL[TypeDesc, equals, (TypeDesc)Z](b.type); //[DT
> a.type?]
> … // random logic here; can throw exceptions in general
> if (!z)  return NOT_EQUAL;
> if (a.payload1 !LCMP= b.payload1)  return NOT_EQUAL; //[DT a.payload1]
> z = INVOKESTATIC[Arrays, equals, ([J [J)Z](a.payload2, b.payload2); //[DT
> a.payload2]
> … // random logic here; can throw exceptions in general
> if (!z)  return NOT_EQUAL;
> return IS_EQUAL;
> 
> What's wrong with this?  Well, from the JIT POV it is micromanagement, by the
> CE code, of the order of operations.  The JIT has to reverse any unfortunate
> decisions in the CE code if it wants to respect the order of operations in
> the OE code.
> 
> From the POV of the author of MyVal.equals. the CE enforces a policy that it
> is more profitable to compare the bits of payload1 and the object identity
> of payload2 before the structure of TypeDesc is consulted.  Sometimes this
> is right, sometimes it is wrong, but the CE+OE idiom fits everybody into a
> one-size-fits all policy, and takes away ordering control from the .equals
> method.
> 
> Could the JIT reorder stuff any way it wants, and make the code more like
> what the author intended?  After all, CE is provably a subset of OE.  There
> are two problems with that.  First, the subset relation is subtle and not
> visible (to the JIT) at the bytecode level.  (After all, CE and OE are two
> completely different bytecode shapes.)
> 
> Second, as noted above, the possibility of exceptions places barriers to
> reordering.  You cannot reorder exceptions with normal exits.  In the above
> IR, if the payload1 fields differ, TypeDesc.equals is never called, so even
> if it were going to throw an exception (due to an assert or a bug or a
> malformed MyVal or TypeDesc, say) the IR says it can't, and the JIT can't
> reorder the CE code to merge it into the OE code.
> 
> In this particular example, the possible optimizations are removal of the
> dominated tests marked "[DT …]" above.  The DT on payload2 is inside the
> implementation of Arrays.equals.  The DT on a.type may or may not be present
> in the TypeDesc.equals method; it's a programmer choice.
> 
> More perturbations are possible with larger value objects:  If you add more
> 64-bit payload fields, the bitwise comparison clearly needs to load the
> entire value object (from whatever buffer or box it is in) before the first
> call to TypeDesc.equals.  If the programmer knows that TypeDesc.equals
> method is likely to prove inequality, without examination of the payload,
> then the first pass can be wasted bandwidth, in cases of degenerate values
> with (say) all zeroes payloads.  Is this a problem?  Depends on the data
> structure.  In this case, the dominated tests (e.g., DT of payload1) can
> usually be elided, so at worst it is a reordering of programmer intentions.
> 
> On the other hand, if you add more reference fields (like type or payload2),
> the first pass amounts to a reordering of (likely) ACMP operations inside
> the .equals method of each successive component.  Is that bad?  Well, as
> noted above, it forces the JIT to load each reference field twice, once to
> do ACMP, and then (much later) to call .equals.  Are there enough registers
> to avoid double loads?  Often, yes, until the computation gets register
> bound, and then you simply have multiple loads, just to ensure the semantics
> of exception ordering.
> 
> In summary, optimizing CE || OE (the legacy semantics of LIFE) is not
> horrible but will remove some programmer control from .equals ordering, and
> will cause more data motion in the presence of multiple reference
> components.
> 
> LIFE with P1 will also make the JIT slightly slower and less robust, as it
> pattern matches on more ad hoc and verbose bytecode shapes.  (Complex
> pattern matching is difficult; that's why we moved the string concatenation
> idioms into invokedynamic in JDK 9.)
> 
> Let's go to P2 for a moment.  In the presence of LIFE, the IR will first
> contain unexpanded intrinsics for a repeated OE || OE:
> 
> // a==b
> z = INTRINSIC[a.equals(b)];
> if (!z)  goto NOT_EQUAL;
> // a.equals(b)
> z = INTRINSIC[a.equals(b)];
> if (!z)  goto NOT_EQUAL;
> goto IS_EQUAL;
> 
> We need a special permission, an above-the-bytecode guarantee that any normal
> or abnormal result produced by the second call will be produced by the first
> call.  In that case, we can elide the IR for the second call (as provably
> redundant):
> 
> z = INTRINSIC[a.equals(b)];
> if (!z)  goto NOT_EQUAL;
> goto IS_EQUAL;
> 
> Expansion then continues as above, but starting with PASS2.  Code needs less
> optimization and is more under the control of the author of .equals.
> 
> HTH
> — John


More information about the valhalla-spec-observers mailing list