RFR: 8255397: x86: coalesce reference and int entry points into vtos bytecodes

Claes Redestad redestad at openjdk.java.net
Tue Oct 27 19:49:23 UTC 2020


On Tue, 27 Oct 2020 19:22:51 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> It rubs me the wrong way that we are effectively changing `push_ptr` to `push_i` for `aep`. While it is implemented in the same manner in `interp_masm_x86.cpp` -- delegating to `push`, it still means if `push_i` implementation changes, `aep` would do the `push_i` _as if_ it is integer, not pointer. Ditto a change in `push_ptr` (adding verification, maybe?) would miss this code.

Verification is done explicitly with `__ verify_oop(..)` and friends, so it seems unlikely we'll overload `push_ptr` any time soon (and `push_ptr` and `push_i` have been semantically identical for many years, even before the merging of 32- and 64-bit `interp_masm_x86...`). I acknowledge this adds some fragility, but perhaps there are assertions we can add to check that `push_ptr` and `push_i` stay semantically the same?

> 
> So, how much of the improvement we are talking about to sacrifice this?

A few hundred thousand instructions and branches on Hello World (seems unconditional jumps are logged as branches by `perf`?):

Baseline:
       103,795,433      instructions              #    0.59  insn per cycle           ( +-  0.07% )
        20,263,519      branches                  #  200.867 M/sec                    ( +-  0.08% )
           731,187      branch-misses             #    3.61% of all branches          ( +-  0.15% )

       0.067306367 seconds time elapsed                                          ( +-  0.24% )

Patch:
       103,466,523      instructions              #    0.59  insn per cycle           ( +-  0.07% )
        20,068,162      branches                  #  201.935 M/sec                    ( +-  0.08% )
           727,575      branch-misses             #    3.63% of all branches          ( +-  0.13% )

       0.066568115 seconds time elapsed                                          ( +-  0.27% )

For Hello World, maybe half of that comes from reduced generation overhead, the rest from quickening quite a few bytecode transitions. There's a scaling component (I've seen gains of a few million instructions on slightly larger apps), but it's nothing huge.
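
To make "a few hundred thousand" concrete, here is a small sketch that recomputes the differences from the counter values pasted above. The dictionary layout and the `deltas` helper are illustrative names only, not part of the patch or of `perf`:

```python
# Counter values copied from the perf output quoted in this message.
baseline = {
    "instructions": 103_795_433,
    "branches": 20_263_519,
    "branch-misses": 731_187,
}
patch = {
    "instructions": 103_466_523,
    "branches": 20_068_162,
    "branch-misses": 727_575,
}


def deltas(before, after):
    """Return {counter: (absolute reduction, reduction as a percentage of before)}."""
    return {
        name: (before[name] - after[name],
               100.0 * (before[name] - after[name]) / before[name])
        for name in before
    }


result = deltas(baseline, patch)
for name, (saved, pct) in result.items():
    print(f"{name:>15}: {saved:>9,} fewer ({pct:.2f}%)")
```

That works out to 328,910 fewer instructions and 195,357 fewer branches. Both reductions are several times the run-to-run variation `perf` reports for those counters.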

-------------

PR: https://git.openjdk.java.net/jdk/pull/865

