RFR: 8271128: InlineIntrinsics support for 32-bit ARM [v4]
Christoph Göttschkes
cgo at openjdk.java.net
Fri Aug 6 08:49:50 UTC 2021
> Hi,
>
> please review this patch, which adds support for InlineIntrinsics to the 32-bit ARM port. The old aarch32 port had this intrinsic implemented and enabled by default.
>
> As on many other platforms, the 32-bit ARM port simply calls into the `SharedRuntime` to intrinsify the basic `java.lang.Math` methods. InlineIntrinsics is already implemented for C1 on 32-bit ARM, which does the same thing.
>
> testing: hotspot tier1 on ARMv5TE (soft-float) and ARMv7-A (hard-float)
>
> I used the existing micro benchmark `test/micro/org/openjdk/bench/java/lang/MathBench.java`. The soft-float results are not very meaningful, since I ran them in QEMU.
>
> __hard-float__ `-Xint -XX:+InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 1169.574 | +/- 133.694 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 759.902 | +/- 573.852 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 854.753 | +/- 67.217 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 902.034 | +/- 22.413 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 895.470 | +/- 113.811 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 936.136 | +/- 40.661 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 864.670 | +/- 68.329 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 1082.589 | +/- 92.570 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 853.715 | +/- 122.427 | ops/ms |
>
> __hard-float__ `-Xint -XX:-InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 450.907 | +/- 10.402 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 592.242 | +/- 14.011 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 167.614 | +/- 7.530 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 572.099 | +/- 55.089 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 596.588 | +/- 24.976 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 212.673 | +/- 4.060 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 584.873 | +/- 42.774 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 514.690 | +/- 30.568 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 566.586 | +/- 23.995 | ops/ms |
>
> __soft-float__ `-Xint -XX:+InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 279.575 | +/- 56.455 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 137.005 | +/- 72.561 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 117.778 | +/- 30.186 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 107.957 | +/- 10.158 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 101.341 | +/- 3.914 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 222.220 | +/- 3.854 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 112.715 | +/- 9.088 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 119.341 | +/- 76.528 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 105.224 | +/- 30.477 | ops/ms |
>
> __soft-float__ `-Xint -XX:-InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 173.150 | +/- 36.279 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 129.774 | +/- 8.795 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 53.524 | +/- 1.679 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 132.503 | +/- 4.274 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 135.483 | +/- 1.150 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 54.266 | +/- 0.699 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 105.636 | +/- 4.647 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 204.550 | +/- 7.206 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 101.072 | +/- 3.701 | ops/ms |

Christoph Göttschkes has updated the pull request incrementally with one additional commit since the last revision:

  Rename use_runtime_function -> use_runtime_call

-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/4927/files
- new: https://git.openjdk.java.net/jdk/pull/4927/files/3c20c980..6de922f8
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4927&range=03
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4927&range=02-03
Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod
Patch: https://git.openjdk.java.net/jdk/pull/4927.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4927/head:pull/4927
PR: https://git.openjdk.java.net/jdk/pull/4927