RFR: 8308776: [AArch64] Math.log is 10% slower than StrictMath.log on aarch64 [v2]
Andrew Dinn
adinn at openjdk.org
Wed Nov 26 09:13:50 UTC 2025
On Tue, 25 Nov 2025 22:23:36 GMT, Dhamoder Nalla <dhanalla at openjdk.org> wrote:
>> This PR introduces an optimized AArch64 intrinsic for Math.log using reciprocal refinement and a table-driven polynomial.
>> It improves throughput for double logarithms while preserving IEEE-754 corner-case behavior (±0, subnormals, negatives, NaN).
>>
>>
>>
>> The micro-benchmark results from MathBench and StrictMathBench below show the performance improvement of Math.log:
>>
>>
>> **Before change**
>>
>> Benchmark | Mode | Cnt | Score | Error | Units
>> -- | -- | -- | -- | -- | --
>> MathBench.logDouble | thrpt | 10 | **15549.705** | ±357.439 | ops/ms
>> StrictMathBench.logDouble | thrpt | 10 | 219408.158 | ±16484.680 | ops/ms
>>
>> **After adding Math.log intrinsic**
>>
>> Benchmark | Mode | Cnt | Score | Error | Units
>> -- | -- | -- | -- | -- | --
>> MathBench.logDouble | thrpt | 10 | **300086.773** | ±6675.936 | ops/ms
>> StrictMathBench.logDouble | thrpt | 10 | 226521.817 | ±4038.975 | ops/ms
>
> Dhamoder Nalla has updated the pull request incrementally with one additional commit since the last revision:
>
> [AArch64] Math.log is 10% slower than StrictMath.log on aarch64
src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8812:
> 8810: address generate_dlog() {
> 8811: __ align(CodeEntryAlignment);
> 8812: StubCodeMark mark(this, "StubRoutines", "dlog");
This StubCodeMark needs to be declared with a StubId as argument. See other stub generators in this file or the equivalent code in `cpu/x86/stubGenerator_x86_64_log.cpp` for an example of what it should look like.
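
As a rough illustration of the shape being asked for, here is a self-contained sketch. It does not use the real HotSpot headers: `StubId`, `StubCodeMark`, `StubGenerator` and `stub_name` below are simplified mock stand-ins written only for this example, and the enum constant `stubgen_dlog_id` is an assumption about the naming, so check the neighbouring generators in the file for the actual spelling. The point is only that the mark is constructed from an id rather than from two string literals, so the stub name is derived from the id.

```cpp
#include <cassert>
#include <string>

// Mock stand-in for HotSpot's StubId enum (assumption: the real enum is
// generated from the stub declarations and has many more entries).
enum class StubId { stubgen_dlog_id, stubgen_dsin_id };

// Hypothetical helper mapping an id to its stub name; in this mock it
// replaces the hand-written "StubRoutines"/"dlog" string pair.
inline const char* stub_name(StubId id) {
  switch (id) {
    case StubId::stubgen_dlog_id: return "dlog";
    case StubId::stubgen_dsin_id: return "dsin";
  }
  return "unknown";
}

struct StubGenerator;  // forward declaration for the mock mark below

// Mock StubCodeMark: takes the generator and a StubId, not string literals.
class StubCodeMark {
 public:
  StubCodeMark(StubGenerator* gen, StubId id) : _gen(gen), _id(id) {}
  std::string name() const { return stub_name(_id); }
 private:
  StubGenerator* _gen;  // unused in the mock, kept to mirror the call shape
  StubId _id;
};

struct StubGenerator {
  // Sketch of the generator prologue: declare the id first, then pass it
  // to the mark. The mock returns the name instead of a code address.
  std::string generate_dlog() {
    StubId stub_id = StubId::stubgen_dlog_id;  // assumed constant name
    StubCodeMark mark(this, stub_id);
    return mark.name();
  }
};
```

In the mock, `StubGenerator().generate_dlog()` yields the name associated with the id; in the real generator the function would go on to emit the stub code and return its entry address.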
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28306#discussion_r2564133696