RFR: 8352536: Add overloads to parse and build class files from/to MemorySegment [v5]
David M. Lloyd
duke at openjdk.org
Thu Mar 27 17:42:22 UTC 2025
On Thu, 27 Mar 2025 17:38:49 GMT, David M. Lloyd <duke at openjdk.org> wrote:
>> Provide method overloads to the ClassFile interface of the java.lang.classfile API which allow parsing of classes found in memory segments, as well as allowing built class files to be output to them.
>
> David M. Lloyd has updated the pull request incrementally with one additional commit since the last revision:
>
> Add a benchmark for class file emission
Here are the raw benchmark results against `AbstractMap` and `TreeMap`:
Benchmark                                  Mode  Cnt       Score      Error  Units
MemorySegmentBenchmark.emitWithCopy0      thrpt    5  198061.082 ± 2300.146  ops/s
MemorySegmentBenchmark.emitWithCopy1      thrpt    5   35352.167 ±  320.823  ops/s
MemorySegmentBenchmark.emitWithoutCopy0   thrpt    5  265208.111 ± 1416.120  ops/s
MemorySegmentBenchmark.emitWithoutCopy1   thrpt    5   53215.327 ±  354.228  ops/s
`0` is the smaller `AbstractMap` class bytes and `1` is the larger `TreeMap` class bytes. For case 0 we see an overall improvement of around 34%, and case 1 shows an improvement closer to 50% (which is expected, since larger classes mean copying more bytes as well as putting more pressure on the GC).
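For context, the two emission paths being compared look roughly like the sketch below. This is not the benchmark code from the PR: the class name, the trivial generated class, the pre-sized shared segment, and in particular the `buildTo`-style overload name are placeholders/assumptions, shown only to illustrate the "with copy" vs. "without copy" shapes.

```java
import java.lang.classfile.ClassFile;
import java.lang.constant.ClassDesc;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class EmitSketch {
    private final ClassFile cf = ClassFile.of();
    // Destination segment, sized generously for the classes used here (assumption).
    private final MemorySegment target = Arena.ofShared().allocate(64 * 1024);

    @Benchmark
    public MemorySegment emitWithCopy() {
        // Existing path: build into a fresh heap byte[] per operation,
        // then bulk-copy those bytes into the destination segment.
        byte[] bytes = cf.build(ClassDesc.of("example.Generated"), cb -> { });
        target.asSlice(0, bytes.length).copyFrom(MemorySegment.ofArray(bytes));
        return target;
    }

    @Benchmark
    public MemorySegment emitWithoutCopy() {
        // New path: emit directly into the segment using the overload added by
        // this PR, skipping the intermediate byte[]. The call below is a
        // placeholder name, not the actual API:
        // cf.buildTo(target, ClassDesc.of("example.Generated"), cb -> { });
        return target;
    }
}
```

The point of the comparison is that the "with copy" path pays for one large, short-lived byte[] per operation plus the bulk copy, while the "without copy" path writes the class file bytes into the caller-supplied segment directly.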
Here is the same benchmark with `-prof gc` enabled:
Benchmark                                                     Mode  Cnt       Score      Error   Units
MemorySegmentBenchmark.emitWithCopy0                         thrpt    5  197728.066 ± 3107.524   ops/s
MemorySegmentBenchmark.emitWithCopy0:gc.alloc.rate           thrpt    5    3900.963 ±   61.292  MB/sec
MemorySegmentBenchmark.emitWithCopy0:gc.alloc.rate.norm      thrpt    5   20688.004 ±    0.001    B/op
MemorySegmentBenchmark.emitWithCopy0:gc.count                thrpt    5     680.000             counts
MemorySegmentBenchmark.emitWithCopy0:gc.time                 thrpt    5     415.000                 ms
MemorySegmentBenchmark.emitWithCopy1                         thrpt    5   35504.531 ±  260.423   ops/s
MemorySegmentBenchmark.emitWithCopy1:gc.alloc.rate           thrpt    5    3512.621 ±   25.778  MB/sec
MemorySegmentBenchmark.emitWithCopy1:gc.alloc.rate.norm      thrpt    5  103744.020 ±    0.001    B/op
MemorySegmentBenchmark.emitWithCopy1:gc.count                thrpt    5     673.000             counts
MemorySegmentBenchmark.emitWithCopy1:gc.time                 thrpt    5     413.000                 ms
MemorySegmentBenchmark.emitWithoutCopy0                      thrpt    5  265533.600 ± 1707.914   ops/s
MemorySegmentBenchmark.emitWithoutCopy0:gc.alloc.rate        thrpt    5    3547.167 ±   22.811  MB/sec
MemorySegmentBenchmark.emitWithoutCopy0:gc.alloc.rate.norm   thrpt    5   14008.003 ±    0.001    B/op
MemorySegmentBenchmark.emitWithoutCopy0:gc.count             thrpt    5     651.000             counts
MemorySegmentBenchmark.emitWithoutCopy0:gc.time              thrpt    5     392.000                 ms
MemorySegmentBenchmark.emitWithoutCopy1                      thrpt    5   52727.917 ±  624.059   ops/s
MemorySegmentBenchmark.emitWithoutCopy1:gc.alloc.rate        thrpt    5    3531.104 ±   42.004  MB/sec
MemorySegmentBenchmark.emitWithoutCopy1:gc.alloc.rate.norm   thrpt    5   70224.013 ±    0.001    B/op
MemorySegmentBenchmark.emitWithoutCopy1:gc.count             thrpt    5     683.000             counts
MemorySegmentBenchmark.emitWithoutCopy1:gc.time              thrpt    5     412.000                 ms
You can see that in addition to the overhead of the copy itself, we also put a bit more pressure on the GC: we allocate roughly the same *number* of objects in either case, but the extra large array per operation fills the allocation regions more quickly, so a little more time is spent in GC on average.
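Reading the `gc.alloc.rate.norm` rows the same way (my arithmetic on the numbers above, not additional benchmark output): case 0 allocates 20688 − 14008 ≈ 6680 B/op more with the copy, and case 1 allocates 103744 − 70224 ≈ 33520 B/op more, which is consistent with one extra byte[] roughly the size of the emitted class per operation.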
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24139#issuecomment-2758926428