RFR (S): CR 8014447: Object.hashCode intrinsic breaks inline caches

Aleksey Shipilev aleksey.shipilev at oracle.com
Wed Sep 25 13:55:41 PDT 2013


On 09/25/2013 02:25 AM, Vladimir Kozlov wrote:
> I would go with your second solution but rename is_low_priority to
> does_virtual_dispatch. is_predicted() does not do virtual dispatch, so
> you don't need to include it in the factored method (add an assert for
> protection). As a result, you don't need the new inline_intrinsic(). Just
> cache the result in a local var:
> 
> if (cg->does_virtual_dispatch()) {
>   cg_intrinsic = cg;
>   cg = NULL;
> }

Thank you, Vladimir. The updated webrev is here:
  http://cr.openjdk.java.net/~shade/8014447/webrev.02/

It passes JPRT so far (almost all testing is done; the rest is stuck in
the queues), and still delivers the improvement for Object.hashCode():

baseline:
  HashCodeBench.stat_i_i:  3.7 +- 0.1 ns/op
  HashCodeBench.stat_o_i:  3.7 +- 0.1 ns/op
  HashCodeBench.stat_o_o:  3.7 +- 0.1 ns/op
  HashCodeBench.virt_i_i:  1.5 +- 0.1 ns/op
  HashCodeBench.virt_o_i:  8.6 +- 0.1 ns/op // <--- !!!
  HashCodeBench.virt_o_o:  4.2 +- 0.1 ns/op

patched:
  HashCodeBench.stat_i_i:  3.6 +- 0.1 ns/op
  HashCodeBench.stat_o_i:  3.6 +- 0.1 ns/op
  HashCodeBench.stat_o_o:  3.6 +- 0.1 ns/op
  HashCodeBench.virt_i_i:  1.5 +- 0.1 ns/op
  HashCodeBench.virt_o_i:  2.0 +- 0.1 ns/op // improvement
  HashCodeBench.virt_o_o:  3.8 +- 0.1 ns/op
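
For readers without the benchmark source at hand, the call shape in
question can be sketched roughly as below. This is only an illustration,
not the actual HashCodeBench code: the class and method names are
hypothetical, and it shows nothing more than hashCode() being invoked
through an Object-typed reference whose runtime class may override it.

```java
public class HashCodeCallShape {
    // Hypothetical subclass that overrides hashCode().
    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return id; }
    }

    // The static receiver type is Object, so the call site names
    // Object.hashCode() and the target is selected by the runtime class.
    static int viaObject(Object o) {
        return o.hashCode();
    }

    public static void main(String[] args) {
        Object plain = new Object();
        Object key = new Key(42);
        // Dispatches to Key.hashCode().
        System.out.println(viaObject(key)); // prints 42
        // Dispatches to Object.hashCode(), i.e. the identity hash.
        System.out.println(viaObject(plain) == System.identityHashCode(plain)); // prints true
    }
}
```

A call site like viaObject() can see both plain Objects and overriding
subclasses, which is the case where the compiler has to choose between
the intrinsic and a regular virtual call.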

I also marked the clone() intrinsic with does_virtual_dispatch, but
the effect is nil, since it is hard to call Object.clone() with Object
as the formal receiver type while the actual receiver is a subclass. It
does not degrade clone() performance though, so I'm inclined to keep it
for symmetry, in case anybody finds a trick for invoking it that way.
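
To illustrate why that call shape is hard to get from plain Java source:
Object.clone() is protected, so code outside java.lang cannot call it on
an Object-typed reference. The sketch below uses hypothetical class names
and is not taken from CloneBench; it only shows the usual route, where
the static receiver type is already the subclass.

```java
public class CloneCallShape {
    // Hypothetical cloneable class; super.clone() is the usual way
    // to reach Object.clone().
    static final class Point implements Cloneable {
        final int x;
        Point(int x) { this.x = x; }
        @Override public Point clone() {
            try {
                return (Point) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: Point is Cloneable
            }
        }
    }

    // The static receiver type here is Point, not Object.
    static Point copy(Point p) {
        return p.clone();
    }

    public static void main(String[] args) {
        Object o = new Point(7);
        // Object c = o.clone(); // does not compile: clone() has protected access in Object
        Point c = copy((Point) o);
        System.out.println(c.x);    // prints 7
        System.out.println(c != o); // prints true
    }
}
```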

baseline:
  CloneBench.cln:        18.0 +- 0.3 ns/op
  CloneBench.cln_cln:    23.3 +- 0.5 ns/op
  CloneBench.obj_cln:    23.0 +- 0.6 ns/op
  CloneBench.obj_obj:    23.6 +- 0.7 ns/op

patched:
  CloneBench.cln:        18.0 +- 0.3 ns/op
  CloneBench.cln_cln:    23.7 +- 0.7 ns/op
  CloneBench.obj_cln:    23.5 +- 0.7 ns/op
  CloneBench.obj_obj:    23.6 +- 0.7 ns/op

Thanks,
-Aleksey.


More information about the hotspot-compiler-dev mailing list