RFR: 8271128: InlineIntrinsics support for 32-bit ARM
Christoph Göttschkes
cgo at openjdk.java.net
Wed Aug 4 07:34:27 UTC 2021
On Thu, 29 Jul 2021 09:40:08 GMT, Christoph Göttschkes <cgo at openjdk.org> wrote:
> Hi,
>
> please review this patch, which adds support for InlineIntrinsics to the 32-bit ARM port. The old aarch32 port had this intrinsic implemented and enabled by default.
>
> As on many other platforms, the 32-bit ARM port simply calls into `SharedRuntime` to intrinsify the basic `java.lang.Math` methods. InlineIntrinsics is already implemented for C1 on 32-bit ARM, which does the same thing.
>
> testing: hotspot tier1 on ARMv5TE (soft-float) and ARMv7-A (hard-float)
>
> I used the existing microbenchmark `test/micro/org/openjdk/bench/java/lang/MathBench.java`. The soft-float results are not that meaningful, since I ran them in QEMU.
>
> __hard-float__ `-Xint -XX:+InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 1169.574 | +/- 133.694 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 759.902 | +/- 573.852 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 854.753 | +/- 67.217 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 902.034 | +/- 22.413 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 895.470 | +/- 113.811 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 936.136 | +/- 40.661 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 864.670 | +/- 68.329 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 1082.589 | +/- 92.570 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 853.715 | +/- 122.427 | ops/ms |
>
> __hard-float__ `-Xint -XX:-InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 450.907 | +/- 10.402 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 592.242 | +/- 14.011 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 167.614 | +/- 7.530 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 572.099 | +/- 55.089 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 596.588 | +/- 24.976 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 212.673 | +/- 4.060 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 584.873 | +/- 42.774 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 514.690 | +/- 30.568 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 566.586 | +/- 23.995 | ops/ms |
>
> __soft-float__ `-Xint -XX:+InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 279.575 | +/- 56.455 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 137.005 | +/- 72.561 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 117.778 | +/- 30.186 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 107.957 | +/- 10.158 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 101.341 | +/- 3.914 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 222.220 | +/- 3.854 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 112.715 | +/- 9.088 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 119.341 | +/- 76.528 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 105.224 | +/- 30.477 | ops/ms |
>
> __soft-float__ `-Xint -XX:-InlineIntrinsics`
>
> | Benchmark | (seed) | Mode | Cnt | Score | Error | Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble | 0 | thrpt | 5 | 173.150 | +/- 36.279 | ops/ms |
> | MathBench.cosDouble | 0 | thrpt | 5 | 129.774 | +/- 8.795 | ops/ms |
> | MathBench.expDouble | 0 | thrpt | 5 | 53.524 | +/- 1.679 | ops/ms |
> | MathBench.log10Double | 0 | thrpt | 5 | 132.503 | +/- 4.274 | ops/ms |
> | MathBench.logDouble | 0 | thrpt | 5 | 135.483 | +/- 1.150 | ops/ms |
> | MathBench.powDouble | 0 | thrpt | 5 | 54.266 | +/- 0.699 | ops/ms |
> | MathBench.sinDouble | 0 | thrpt | 5 | 105.636 | +/- 4.647 | ops/ms |
> | MathBench.sqrtDouble | 0 | thrpt | 5 | 204.550 | +/- 7.206 | ops/ms |
> | MathBench.tanDouble | 0 | thrpt | 5 | 101.072 | +/- 3.701 | ops/ms |
`jdk:tier1` tests on soft-float and hard-float look good. `IeeeRecommendedTests` has some issues on soft-float, because I use the glibc implementation of floating-point arithmetic rather than an external soft-float library, and the glibc implementation doesn't have the required precision. Someone should probably pick up [JDK-8215902](https://bugs.openjdk.java.net/browse/JDK-8215902) again and fix this situation.
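
To make the hard-float tables quoted above easier to compare, here is a minimal sketch that divides each `-XX:+InlineIntrinsics` score by the matching `-XX:-InlineIntrinsics` score. The class name `IntrinsicSpeedup` and its helper methods are hypothetical and not part of the patch or of `MathBench`; the numbers are copied from the tables. The ratios ignore the Error column, so noisy rows such as `cosDouble` should be read with caution.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: summarizes the quoted hard-float results.
public class IntrinsicSpeedup {

    // Throughput ratio: score with +InlineIntrinsics divided by score with -InlineIntrinsics.
    public static double speedup(double withIntrinsics, double withoutIntrinsics) {
        return withIntrinsics / withoutIntrinsics;
    }

    // Hard-float scores in ops/ms, copied from the tables:
    // { -Xint -XX:+InlineIntrinsics, -Xint -XX:-InlineIntrinsics }
    public static Map<String, double[]> hardFloatScores() {
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("absDouble",   new double[] {1169.574, 450.907});
        scores.put("cosDouble",   new double[] { 759.902, 592.242});
        scores.put("expDouble",   new double[] { 854.753, 167.614});
        scores.put("log10Double", new double[] { 902.034, 572.099});
        scores.put("logDouble",   new double[] { 895.470, 596.588});
        scores.put("powDouble",   new double[] { 936.136, 212.673});
        scores.put("sinDouble",   new double[] { 864.670, 584.873});
        scores.put("sqrtDouble",  new double[] {1082.589, 514.690});
        scores.put("tanDouble",   new double[] { 853.715, 566.586});
        return scores;
    }

    public static void main(String[] args) {
        // Print one line per benchmark with the throughput ratio.
        for (Map.Entry<String, double[]> entry : hardFloatScores().entrySet()) {
            double[] s = entry.getValue();
            System.out.printf("%-12s %.2fx%n", entry.getKey(), speedup(s[0], s[1]));
        }
    }
}
```

Every hard-float ratio computed this way is above 1, with `expDouble` and `powDouble` showing the largest gains. The soft-float tables are left out on purpose, since those runs were done in QEMU.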
-------------
PR: https://git.openjdk.java.net/jdk/pull/4927