RFR: 8287788: reuse intermediate segments allocated during FFM stub invocations

Matthias Ernst duke at openjdk.org
Sun Jan 19 21:09:15 UTC 2025


On Fri, 17 Jan 2025 14:58:37 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:

> Could you add the benchmark you're using to the PR as well? 

Done. I slotted it into the "points" benchmark suite. I had to define another struct, "DoublePoint", though, since the existing int/int pair gets packed into a long.

Full disclosure: I'm not sure how to run it inside the JDK build structure, so I ran it outside instead and can only hope it builds in-tree. (`make test TEST="micro:java.lang.foreign.points"` fails with `Error: Unable to access jarfile /Users/mernst/IdeaProjects/jdk/build/macosx-aarch64-server-fastdebug/images/test/micro/benchmarks.jar`.)


It exercises a loop like this:

struct DoublePoint { double x; double y; };
DoublePoint unit_rotate(double phi);                 // returns an HFA (homogeneous floating-point aggregate) by value: requires an intermediate buffer
void unit_rotate_ptr(DoublePoint* out, double phi);  // reference case: no intermediate buffer

DoublePoint *points = new DoublePoint[N];
for (i in 0...N) points[i] = unit_rotate(2*pi*i/N);    // by value
vs
for (i in 0...N) unit_rotate_ptr(points+i, 2*pi*i/N);  // by pointer
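
For readers less familiar with the FFM API, here is a minimal sketch of what a by-value struct return looks like from the Java side. It is not the benchmark from the PR: since the native `unit_rotate` is not available outside the benchmark's library, it assumes the C library's `lldiv` (which returns a two-field struct by value) as a stand-in, and the class name `ByValueReturnSketch` is made up for illustration. It also assumes Java 22 or later, where `java.lang.foreign` is final. Whether a given struct actually needs an intermediate buffer depends on the platform ABI.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class ByValueReturnSketch {
    // Layout of C's lldiv_t { long long quot; long long rem; }, a 16-byte struct.
    static final StructLayout LLDIV_T = MemoryLayout.structLayout(
            ValueLayout.JAVA_LONG.withName("quot"),
            ValueLayout.JAVA_LONG.withName("rem"));

    // Downcall handle for: lldiv_t lldiv(long long numer, long long denom);
    // Assumption: lldiv is resolvable through the linker's default lookup.
    static final MethodHandle LLDIV = Linker.nativeLinker().downcallHandle(
            Linker.nativeLinker().defaultLookup().find("lldiv").orElseThrow(),
            FunctionDescriptor.of(LLDIV_T, ValueLayout.JAVA_LONG, ValueLayout.JAVA_LONG));

    // Calls lldiv and copies the returned struct's fields into a Java array.
    static long[] divide(long numer, long denom) {
        try (Arena arena = Arena.ofConfined()) {
            // Because the return layout is a struct, the handle takes a leading
            // SegmentAllocator that supplies the segment holding the result.
            MemorySegment result =
                    (MemorySegment) LLDIV.invokeExact((SegmentAllocator) arena, numer, denom);
            return new long[] {
                result.get(ValueLayout.JAVA_LONG, 0),  // quot
                result.get(ValueLayout.JAVA_LONG, 8)   // rem
            };
        } catch (Throwable t) {
            throw new IllegalStateException("lldiv downcall failed", t);
        }
    }

    public static void main(String[] args) {
        long[] qr = divide(17, 5);
        System.out.println(qr[0] + " remainder " + qr[1]);
    }
}
```

The segment the caller sees comes from the `SegmentAllocator` passed in. The intermediate buffer discussed in this thread is separate: the linker's generated stub manages it internally.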


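With the patch, the by-value call is now almost competitive with the by-pointer call (13.2 ns/op vs. 9.1 ns/op, down from 46.5 ns/op, about 3.5x faster), and the memory profile looks a lot better (0.224 B/op instead of 160.224 B/op). The first table is the 25-ea baseline, the second is my patched build: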

# VM version: JDK 25-ea, OpenJDK 64-Bit Server VM, 25-ea+3-283
Benchmark                                        Mode  Cnt      Score      Error   Units
PointsAlloc.circle_by_ptr                        avgt    5      8.964 ±   0.351   ns/op
PointsAlloc.circle_by_ptr:·gc.alloc.rate         avgt    5     95.301 ±   3.665  MB/sec
PointsAlloc.circle_by_ptr:·gc.alloc.rate.norm    avgt    5      0.224 ±   0.001    B/op
PointsAlloc.circle_by_ptr:·gc.count              avgt    5      2.000            counts
PointsAlloc.circle_by_ptr:·gc.time               avgt    5      3.000                ms
PointsAlloc.circle_by_value                      avgt    5     46.498 ±   2.336   ns/op
PointsAlloc.circle_by_value:·gc.alloc.rate       avgt    5  13141.578 ± 650.425  MB/sec
PointsAlloc.circle_by_value:·gc.alloc.rate.norm  avgt    5    160.224 ±   0.001    B/op
PointsAlloc.circle_by_value:·gc.count            avgt    5    116.000            counts
PointsAlloc.circle_by_value:·gc.time             avgt    5     44.000                ms

# VM version: JDK 25-internal, OpenJDK 64-Bit Server VM, 25-internal-adhoc.mernst.jdk
Benchmark                                        Mode  Cnt   Score    Error   Units
PointsAlloc.circle_by_ptr                        avgt    5   9.108 ±  0.477   ns/op
PointsAlloc.circle_by_ptr:·gc.alloc.rate         avgt    5  93.792 ±  4.898  MB/sec
PointsAlloc.circle_by_ptr:·gc.alloc.rate.norm    avgt    5   0.224 ±  0.001    B/op
PointsAlloc.circle_by_ptr:·gc.count              avgt    5   2.000           counts
PointsAlloc.circle_by_ptr:·gc.time               avgt    5   4.000               ms
PointsAlloc.circle_by_value                      avgt    5  13.180 ±  0.611   ns/op
PointsAlloc.circle_by_value:·gc.alloc.rate       avgt    5  64.816 ±  2.964  MB/sec
PointsAlloc.circle_by_value:·gc.alloc.rate.norm  avgt    5   0.224 ±  0.001    B/op
PointsAlloc.circle_by_value:·gc.count            avgt    5   2.000           counts
PointsAlloc.circle_by_value:·gc.time             avgt    5   5.000               ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23142#issuecomment-2599586149

