RFR: 8281469: aarch64: Improve interpreter stack banging

Xin Liu xliu at openjdk.java.net
Mon Mar 28 22:36:40 UTC 2022


On Mon, 28 Mar 2022 17:26:56 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> This is the AArch64 counterpart of X86 change: https://github.com/openjdk/jdk/commit/3a13425bc9088cbb6d95e1a46248d7eba27fb1a6.
> 
> Motivational performance improvements on Raspberry Pi 3:
> 
> 
>  Performance counter stats for 'baseline/bin/java -version' (10 runs):
> 
>             476.96 msec task-clock                #    1.288 CPUs utilized            ( +-  0.11% )
>                166      context-switches          #    0.348 K/sec                    ( +-  0.93% )
>                  8      cpu-migrations            #    0.017 K/sec                    ( +-  9.33% )
>              2,954      page-faults               #    0.006 M/sec                    ( +-  0.04% )
>        560,690,251      cycles                    #    1.176 GHz                      ( +-  0.07% )
>        239,068,958      instructions              #    0.43  insn per cycle           ( +-  0.04% )
>         30,236,426      branches                  #   63.394 M/sec                    ( +-  0.05% )
>          4,145,994      branch-misses             #   13.71% of all branches          ( +-  0.09% )
> 
>           0.370225 +- 0.000285 seconds time elapsed  ( +-  0.08% )
> 
>  Performance counter stats for 'patched/bin/java -version' (10 runs):
> 
>             456.01 msec task-clock                #    1.283 CPUs utilized            ( +-  0.12% )
>                156      context-switches          #    0.341 K/sec                    ( +-  0.99% )
>                  8      cpu-migrations            #    0.018 K/sec                    ( +-  4.30% )
>              2,957      page-faults               #    0.006 M/sec                    ( +-  0.07% )
>        536,970,476      cycles                    #    1.178 GHz                      ( +-  0.12% )
>        236,527,954      instructions              #    0.44  insn per cycle           ( +-  0.04% )
>         30,195,820      branches                  #   66.218 M/sec                    ( +-  0.04% )
>          4,128,388      branch-misses             #   13.67% of all branches          ( +-  0.13% )
> 
>           0.355460 +- 0.000741 seconds time elapsed  ( +-  0.21% )
> 
> 
> SPECjvm2008 with `-Xint`:
> 
> 
> Compress: +54%
> Serial: +56%
> 
> 
> Additional testing:
>  - [x] Linux aarch64 fastdebug, `tier1`
>  - [x] Linux aarch64 fastdebug, `tier2`
>  - [x] Ad-hoc benchmarks

LGTM. I am not a Reviewer, so another Reviewer needs to approve it.

One subtlety is that `sub` can only encode a uimm24 immediate. I think that is safe for a 4K page size, because uimm24 covers offsets of up to 2^24 / 2^12 = 2^12 pages.
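
To make the arithmetic concrete, here is a minimal sketch that only models the range check; it is not the HotSpot code. It assumes the immediate is an unsigned 24-bit value, as stated above. The names `fits_uimm24` and `max_encodable_pages` are hypothetical, and the 16K and 64K page sizes are included only for comparison.

```python
# Sketch of the uimm24 range arithmetic; not taken from the patch.
# Assumption: the subtract immediate is an unsigned 24-bit value (uimm24).
UIMM24_BITS = 24


def fits_uimm24(offset: int) -> bool:
    """Return True if a byte offset is representable as an unsigned 24-bit value."""
    return 0 <= offset < (1 << UIMM24_BITS)


def max_encodable_pages(page_size: int) -> int:
    """Largest whole number of pages whose byte offset still fits in uimm24."""
    largest_offset = (1 << UIMM24_BITS) - 1
    return largest_offset // page_size


if __name__ == "__main__":
    # 4K is the page size discussed above; 16K and 64K are illustrative only.
    for page_size in (4 * 1024, 16 * 1024, 64 * 1024):
        pages = max_encodable_pages(page_size)
        print(f"page size {page_size}: offsets of up to {pages} pages fit")
```

Strictly, the largest encodable offset is 2^24 - 1, so with 4K pages the last offset that fits is 4095 pages, just under the 2^12 bound. Larger page sizes reduce the number of pages that fit in proportion.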

-------------

Marked as reviewed by xliu (Committer).

PR: https://git.openjdk.java.net/jdk/pull/8001

