[Exp] First prototype of new acmp bytecode

Fri Feb 16 13:13:58 UTC 2018

Hi John,

thanks a lot for looking at this! Please find comments below.

For reference, here's the webrev that John looked at:
http://cr.openjdk.java.net/~thartmann/valhalla/exp/acmp.01/

On 15.02.2018 20:04, John Rose wrote:
> # templateTable_x86
> Suggest `assert((mask >> LogKlassAlignmentInBytes) != 0, "invalid shift")`.
> (Same thing in graphKit also.)

I think for now we can use an even stronger assert:
assert((mask >> LogKlassAlignmentInBytes) == 1, "invalid shift");

But this code might change in the future in case we use a different perturbation scheme.

> Suggest trying memory-to-register cmov instead of test/jcc/bind(is_null)

Do you mean something like this?

__ testptr(rdx, rdx);
__ cmovl(Assembler::notZero, rbx, Address(rdx, oopDesc::klass_offset_in_bytes()));

That won't work because the load from rdx will be executed even if the cmov condition evaluates to
false (i.e., if rdx is NULL). See page 254 of the Intel x86_64 manual:

temp ← SRC
IF condition TRUE
THEN
  DEST ← temp;
FI
[...]

https://software.intel.com/sites/default/files/managed/ad/01/253666-sdm-vol-2a.pdf

> (or maybe do that in JIT code where it might matter).

In the cases were it's possible to convert to a CMove and when it also pays off performance-wise,
the JIT should do this automatically during the SplitIfBlocks optimization. But I'll check the
generated code.

> # graphKit
> Should be structured like this:
>     if (ta->is_value_based() || tb->is_value_based()) { ... }
>     else if (ta->can be_value_based() && tb->can_be_value_based())) { ... }
>     //else use old_acmp

Done.

> (Looking ahead to subnode, I see you have that there.  There's some
> duplication here between the parser and Node::Ideal, probably more
> than we want.)

Yes, sure. I was looking into improving this later. Just wanted to get a working version first.

> Also in the second case (both *might* be VB) you should swap the operands,
> if the second is non-null and the first might be null; that avoids a null check.

Yes, I was about to add a test for the case where one operand is known to be never null (see "//
TODO testcase for this"). I've now added tests (see "// new acmp without null check") and
implemented the swap optimization.

> In the long term, acmp should have a null_seen profile bit so we can do
> trap-based implicit null checks.  I think that would respond to one of your
> TODO notes (speculate a being non-null).

Yes, we can use Type::speculative_maybe_null()/speculative_always_null() for this. And yes, that's
what I meant with the TODO comment. We might need more profile points though.

I'll also look into if we can remove the null check after parsing if we don't need it anymore for
acmp (for example, if we know the other operand is always non-null). The problem is that other
optimizations might have relied on this null check so we probably cannot just simply remove it.

> The (OrX (CastP2X a) (CastP2X b)) makes my skin crawl a little, but I
> guess it's safe, except that I don't think you want the CastX2P wrapped
> around it.  That's because the result of the OrX is not a valid oop,
> and we don't want it to accidentally be classified as one (say, after
> copy removal and LRG merging).

Yes, it's ugly but currently the CastX2P is required because CmpP expects pointer operands.

> Also, I don't think the OrX will fold up in GVN in all the cases where
> it should.  If one operand folds to null, the whole thing should turn
> into a null check on the other operand.  If one operand folds to a
> NotNull type, the null-test of the whole thing should fold to false.
> But that can probably be solved by some pattern matching in
> Node::Ideal.  Such extra logic can also turn a CmpX(x,0) to a
> CmpP(x,null), when that pops up after GVN.  I think that's one
> reason you wanted to add the CastX2P before the CmpP,
> instead of a CmpX.

Yes, but right now the plan is to do that with pattern matching in CmpPNode::Ideal().

We might even want to introduce a new (macro) node for the perturbation (see below).

> (It's almost as if we want a CmpP(x,y) with a new BoolNode type
> of both-zero, where the CmpP/BothZero is lowered to "orcc" or
> some such.  And the notion applies to non-pointers also, as an
> optimization of things like "if (x == 0 && y == 0)".  The general
> case doesn't stop at two inputs, but two inputs *might* be an
> interesting case to chase after.)
> 
> As with the OrX for double-null test, the output of perturbing
> a pointer (OrX with mask) should be an int/long rather than a
> pointer.  That is, the CastX2P seems dangerous to me in
> both places.  So one of the three inputs of the perturbed
> CmpP should be an int/long, not a perturbed oop.

Actually, the "three input solution" is just a hack that I've used to simplify the optimizations in
Ideal and make sure that Ideal gets invoked once the type of one of the input operand changes
(because type updates are not propagated through the perturbation operations). I'm looking into
other, cleaner ways to implement this.

> (My problems with CastX2P get less if the result is very
> clearly a RawPtr, and has no chance of getting into an
> oop map and seen by the GC.)

Yes, the result is indeed a RawPtr but I think the "real" problem is the following:

If the oop is perturbed at runtime, the GC does not need to know about it. We can just treat it as a
RawPtr and make sure that it's never treated as a valid oop at a safepoint after optimizations. The
following cmp will return false no matter if the GC moved the object or not.

However, if the oop is *not* perturbed at runtime and live at a safepoint, the GC needs to be aware
of it because a subsequent cmp needs to use the updated oop in case the GC moves the object. That
said, the GC may need to update the (not) perturbed oop.

Roland and I've discussed this today and that might be a hard issue. My current take would be to
introduce a new node for the perturbation and make sure that the result is not live "over" a
safepoint. I have to think more about it though.

> # parse2
> 
> Suggest moving can_be_value_based tests into new_acmp,
> where their interaction with is_value_based will be clearer.

Done.

> # type
> 
> TypeInstPtr::can_be_value_based should also test for other supertypes
> of value based types, notably interfaces.  For safety, allow all interfaces,
> if any value-based types are not final, since a subclass might implement
> a new interface. 

Yes, I have neglected interface supertypes for now (also in the test). Will look into this next.

> Speaking of subclasses, you probably want to do an
> TypePtr::xmeet rather than klass equality check.

Could you elaborate on where you would like to see a xmeet?

> (Thing to remember:  Object functions as an honorary interface to value
> types.  So wherever Object is special-cased, interfaces probably belong
> in the special case also.)

Yes, that's a good point.

> # subnode
> 
> This looks right, except for my reservations about the CastX2P.
> (Which might go away if that's a safely marked RawPtr.)
> 
> The swapping trick works in subnode.  Specifically:
> 
> Node* a; Node *b;
> if (might_be_null(a) && !might_be_null(b))
>   b = perturb_ptr_if_value(b);
> else
>   a = perturb_ptr_if_value(a);

Right, but the swapping only pays off if we can remove the null check that was emitted for the other
operand during parsing (see issues I've mentioned above).

Here's the new (intermediate) webrev that I will update while I make progress:
http://cr.openjdk.java.net/~thartmann/valhalla/exp/acmp.02/

Thanks a lot!

Tobias