RFR: 8271128: InlineIntrinsics support for 32-bit ARM [v2]

Christoph Göttschkes cgo at openjdk.java.net
Thu Aug 5 12:30:56 UTC 2021


> Hi,
> 
> Please review this patch, which adds support for InlineIntrinsics to the 32-bit ARM port. The old aarch32 port had this intrinsic implemented and enabled by default.
> 
> As on many other platforms, the 32-bit ARM port simply calls into the `SharedRuntime` to intrinsify the basic `java.lang.Math` methods. InlineIntrinsics is already implemented for C1 on 32-bit ARM, which does the same thing.
> 
> Testing: hotspot tier1 on ARMv5TE (soft-float) and ARMv7-A (hard-float)
> 
> I used the existing micro benchmark `test/micro/org/openjdk/bench/java/lang/MathBench.java`. The soft-float results are not very meaningful, since I ran them in QEMU.
> 
> __hard-float__ `-Xint -XX:+InlineIntrinsics`
> 
> | Benchmark                     | (seed) |  Mode | Cnt |    Score |       Error |  Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble           |      0 | thrpt |   5 | 1169.574 | +/- 133.694 | ops/ms |
> | MathBench.cosDouble           |      0 | thrpt |   5 |  759.902 | +/- 573.852 | ops/ms |
> | MathBench.expDouble           |      0 | thrpt |   5 |  854.753 | +/-  67.217 | ops/ms |
> | MathBench.log10Double         |      0 | thrpt |   5 |  902.034 | +/-  22.413 | ops/ms |
> | MathBench.logDouble           |      0 | thrpt |   5 |  895.470 | +/- 113.811 | ops/ms |
> | MathBench.powDouble           |      0 | thrpt |   5 |  936.136 | +/-  40.661 | ops/ms |
> | MathBench.sinDouble           |      0 | thrpt |   5 |  864.670 | +/-  68.329 | ops/ms |
> | MathBench.sqrtDouble          |      0 | thrpt |   5 | 1082.589 | +/-  92.570 | ops/ms |
> | MathBench.tanDouble           |      0 | thrpt |   5 |  853.715 | +/- 122.427 | ops/ms |
> 
> __hard-float__ `-Xint -XX:-InlineIntrinsics`
> 
> | Benchmark                     | (seed) |  Mode | Cnt |    Score |       Error |  Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble           |      0 | thrpt |   5 |  450.907 | +/-  10.402 | ops/ms |
> | MathBench.cosDouble           |      0 | thrpt |   5 |  592.242 | +/-  14.011 | ops/ms |
> | MathBench.expDouble           |      0 | thrpt |   5 |  167.614 | +/-   7.530 | ops/ms |
> | MathBench.log10Double         |      0 | thrpt |   5 |  572.099 | +/-  55.089 | ops/ms |
> | MathBench.logDouble           |      0 | thrpt |   5 |  596.588 | +/-  24.976 | ops/ms |
> | MathBench.powDouble           |      0 | thrpt |   5 |  212.673 | +/-   4.060 | ops/ms |
> | MathBench.sinDouble           |      0 | thrpt |   5 |  584.873 | +/-  42.774 | ops/ms |
> | MathBench.sqrtDouble          |      0 | thrpt |   5 |  514.690 | +/-  30.568 | ops/ms |
> | MathBench.tanDouble           |      0 | thrpt |   5 |  566.586 | +/-  23.995 | ops/ms |
> 
> __soft-float__ `-Xint -XX:+InlineIntrinsics`
> 
> | Benchmark                     | (seed) |  Mode | Cnt |    Score |       Error |  Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble           |      0 | thrpt |   5 |  279.575 | +/-  56.455 | ops/ms |
> | MathBench.cosDouble           |      0 | thrpt |   5 |  137.005 | +/-  72.561 | ops/ms |
> | MathBench.expDouble           |      0 | thrpt |   5 |  117.778 | +/-  30.186 | ops/ms |
> | MathBench.log10Double         |      0 | thrpt |   5 |  107.957 | +/-  10.158 | ops/ms |
> | MathBench.logDouble           |      0 | thrpt |   5 |  101.341 | +/-   3.914 | ops/ms |
> | MathBench.powDouble           |      0 | thrpt |   5 |  222.220 | +/-   3.854 | ops/ms |
> | MathBench.sinDouble           |      0 | thrpt |   5 |  112.715 | +/-   9.088 | ops/ms |
> | MathBench.sqrtDouble          |      0 | thrpt |   5 |  119.341 | +/-  76.528 | ops/ms |
> | MathBench.tanDouble           |      0 | thrpt |   5 |  105.224 | +/-  30.477 | ops/ms |
> 
> __soft-float__ `-Xint -XX:-InlineIntrinsics`
> 
> | Benchmark                     | (seed) |  Mode | Cnt |    Score |       Error |  Units |
> | :---------------------------- | -----: | ----: | --: | -------: | ----------: | -----: |
> | MathBench.absDouble           |      0 | thrpt |   5 |  173.150 | +/-  36.279 | ops/ms |
> | MathBench.cosDouble           |      0 | thrpt |   5 |  129.774 | +/-   8.795 | ops/ms |
> | MathBench.expDouble           |      0 | thrpt |   5 |   53.524 | +/-   1.679 | ops/ms |
> | MathBench.log10Double         |      0 | thrpt |   5 |  132.503 | +/-   4.274 | ops/ms |
> | MathBench.logDouble           |      0 | thrpt |   5 |  135.483 | +/-   1.150 | ops/ms |
> | MathBench.powDouble           |      0 | thrpt |   5 |   54.266 | +/-   0.699 | ops/ms |
> | MathBench.sinDouble           |      0 | thrpt |   5 |  105.636 | +/-   4.647 | ops/ms |
> | MathBench.sqrtDouble          |      0 | thrpt |   5 |  204.550 | +/-   7.206 | ops/ms |
> | MathBench.tanDouble           |      0 | thrpt |   5 |  101.072 | +/-   3.701 | ops/ms |
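
To make the hard-float tables above easier to compare, the per-benchmark ratio between the `-XX:+InlineIntrinsics` and `-XX:-InlineIntrinsics` scores can be computed directly. The following is only a minimal sketch: the class name `IntrinsicSpeedup` and the helper `speedup` are hypothetical and not part of the patch, and the numbers are copied from the two hard-float tables in the quoted description.

```java
import java.util.Locale;

// Hypothetical helper, not part of the patch: compares the quoted hard-float scores.
final class IntrinsicSpeedup {
    static final String[] NAMES = {
        "absDouble", "cosDouble", "expDouble", "log10Double", "logDouble",
        "powDouble", "sinDouble", "sqrtDouble", "tanDouble"
    };
    // Scores in ops/ms with -Xint -XX:+InlineIntrinsics (hard-float table).
    static final double[] ON = {
        1169.574, 759.902, 854.753, 902.034, 895.470,
        936.136, 864.670, 1082.589, 853.715
    };
    // Scores in ops/ms with -Xint -XX:-InlineIntrinsics (hard-float table).
    static final double[] OFF = {
        450.907, 592.242, 167.614, 572.099, 596.588,
        212.673, 584.873, 514.690, 566.586
    };

    // Throughput ratio: values above 1.0 mean the intrinsic path was faster.
    static double speedup(double on, double off) {
        return on / off;
    }

    public static void main(String[] args) {
        for (int i = 0; i < NAMES.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%-12s %.2fx",
                    NAMES[i], speedup(ON[i], OFF[i])));
        }
    }
}
```

By this measure the largest hard-float gains are for `expDouble` (roughly 5.1x) and `powDouble` (roughly 4.4x). The ratios ignore the reported error margins, so a result such as `cosDouble`, whose error is +/- 573.852 ops/ms with the intrinsic enabled, should be read with caution.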

Christoph Göttschkes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Rename generate_transcendental_entry -> generate_math_runtime_call
 - Merge branch 'master' into JDK-8271128-arm32-math-intrinsics
   
   * master: (97 commits)
     8270903: sun.net.httpserver.HttpConnection: Improve toString
     8271722: [TESTBUG] gc/g1/TestMixedGCLiveThreshold.java can fail if G1 Full GC uses >1 workers
     8270058: Use Objects.check{Index,FromIndexSize} for java.desktop
     4819544: SwingSet2 JTable Demo throws NullPointerException
     8271878: UnProblemList jdk/jfr/event/gc/detailed/TestEvacuationFailedEvent.java in JDK18
     8271895: UnProblemList javax/swing/JComponent/7154030/bug7154030.java in JDK18
     8271863: ProblemList serviceability/sa/TestJmapCore.java on linux-x64 with ZGC
     8271898: disable os.release_multi_mappings_vm on macOS-X64
     8271893: mark hotspot runtime/PerfMemDestroy/PerfMemDestroy.java test as ignoring external VM flags
     8271887: mark hotspot runtime/CDSCompressedKPtrs tests which ignore external VM flags
     8271891: mark hotspot runtime/Safepoint tests which ignore external VM flags
     8271886: mark hotspot runtime/InvocationTests tests which ignore external VM flags
     8271890: mark hotspot runtime/Dictionary tests which ignore external VM flags
     8271894: ProblemList javax/swing/JComponent/7154030/bug7154030.java in JDK17
     8271888: build error after JDK-8271599
     8271456: Avoid looking up standard charsets in "java.desktop" module
     8271589: fatal error with variable shift count integer rotate operation.
     8271599: Javadoc of floorDiv() and floorMod() families is inaccurate in some places
     8271877: ProblemList jdk/jfr/event/gc/detailed/TestEvacuationFailedEvent.java in JDK17
     8271217: Fix race between G1PeriodicGCTask checks and GC request
     ...
 - 8271128: InlineIntrinsics support for 32-bit ARM

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4927/files
  - new: https://git.openjdk.java.net/jdk/pull/4927/files/d8926dfe..faadc2d5

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4927&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4927&range=00-01

  Stats: 37172 lines in 345 files changed: 33855 ins; 1917 del; 1400 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4927.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4927/head:pull/4927

PR: https://git.openjdk.java.net/jdk/pull/4927

