RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v3]
Aleksey Shipilev
shade at openjdk.org
Fri Aug 11 14:45:58 UTC 2023
On Mon, 22 May 2023 19:19:01 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> In multi-array allocation benchmarks, there is a hot path through the native VM allocation code that calls many notification methods, even when those methods would return immediately because the allocation was satisfied from the existing TLAB. Not calling these helper methods from the `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>>
>> Example on M1:
>>
>>
>> Benchmark (size) Mode Cnt Score Error Units
>>
>> # Before
>> MultiArrayAlloc.full 1 avgt 15 74,053 ± 0,869 ns/op
>> MultiArrayAlloc.full 2 avgt 15 87,800 ± 0,931 ns/op
>> MultiArrayAlloc.full 4 avgt 15 124,814 ± 0,615 ns/op
>> MultiArrayAlloc.full 8 avgt 15 188,562 ± 0,785 ns/op
>> MultiArrayAlloc.full 16 avgt 15 313,007 ± 1,108 ns/op
>> MultiArrayAlloc.full 32 avgt 15 640,276 ± 4,560 ns/op
>> MultiArrayAlloc.full 64 avgt 15 1395,220 ± 5,860 ns/op
>> MultiArrayAlloc.full 128 avgt 15 3417,848 ± 11,345 ns/op
>> MultiArrayAlloc.full 256 avgt 15 9955,360 ± 102,057 ns/op
>> MultiArrayAlloc.full 512 avgt 15 27738,002 ± 244,940 ns/op
>> MultiArrayAlloc.full 1024 avgt 15 147507,008 ± 1434,085 ns/op
>>
>> # After
>> MultiArrayAlloc.full 1 avgt 15 70,434 ± 0,373 ns/op ; 5% better
>> MultiArrayAlloc.full 2 avgt 15 82,394 ± 0,137 ns/op ; 7% better
>> MultiArrayAlloc.full 4 avgt 15 108,542 ± 0,129 ns/op ; 15% better
>> MultiArrayAlloc.full 8 avgt 15 170,697 ± 4,480 ns/op ; 11% better
>> MultiArrayAlloc.full 16 avgt 15 272,902 ± 0,877 ns/op ; 15% better
>> MultiArrayAlloc.full 32 avgt 15 524,486 ± 1,447 ns/op ; 22% better
>> MultiArrayAlloc.full 64 avgt 15 1088,932 ± 2,739 ns/op ; 28% better
>> MultiArrayAlloc.full 128 avgt 15 3151,144 ± 14,621 ns/op ; 8% better
>> MultiArrayAlloc.full 256 avgt 15 8455,293 ± 12,656 ns/op ; 18% better
>> MultiArrayAlloc.full 512 avgt 15 26060,055 ± 116,524 ns/op ; 6% better
>> MultiArrayAlloc.full 1024 avgt 15 130824,480 ± 831,703 ns/op ; 13% better
>>
>>
>> Additional testing:
>> - [x] Ad-hoc micro-benchmarks
>> - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>> - [x] Linux x86_64 fastdebug `jdk/jfr`
>> - [x] Linux x86_64 fastdebug `t...
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase.
I force-pushed because there were lots of changes in the related code and the merge was exceedingly tedious. The `MultiArrayAlloc` benchmarks still improve; I am going to post more thorough benchmark results next week. Meanwhile, @iklam, do you want to give this a spin with the [JDK-8310823](https://bugs.openjdk.org/browse/JDK-8310823) prototype? I think it would be sensitive to this change as well.
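
For readers new to the thread, the idea in the quoted description can be sketched roughly as below. This is only an illustration, not the HotSpot code: `AllocationGuard`, `notify_slow_path_allocation`, `allocate`, and `run_demo` are hypothetical names, and the sketch assumes the guard can tell from a cheap inline flag whether the allocation left the TLAB fast path.

```cpp
#include <cstddef>

// Hypothetical stand-ins; none of these names come from HotSpot.
static int g_notify_calls = 0;

// Pretend this is an out-of-line helper that is costly to call.
static void notify_slow_path_allocation() { ++g_notify_calls; }

class AllocationGuard {
  bool _allocated_outside_tlab = false;

public:
  // The allocator sets this flag only when it leaves the TLAB fast path.
  void mark_outside_tlab() { _allocated_outside_tlab = true; }

  ~AllocationGuard() {
    // Cheap inline check first: on the fast path the helper is never called,
    // instead of being called and returning immediately.
    if (_allocated_outside_tlab) {
      notify_slow_path_allocation();
    }
  }
};

// Simulated allocation: returns true if the request fit in the TLAB.
static bool allocate(std::size_t size, std::size_t tlab_free) {
  AllocationGuard guard;
  bool from_tlab = size <= tlab_free;
  if (!from_tlab) {
    guard.mark_outside_tlab();
  }
  return from_tlab;
}

// Runs 1000 fast-path allocations and one slow-path allocation,
// then reports how many times the notification helper was invoked.
int run_demo() {
  g_notify_calls = 0;
  for (int i = 0; i < 1000; ++i) {
    allocate(16, 1024);   // fits in the TLAB: no notification
  }
  allocate(4096, 1024);   // does not fit: one notification
  return g_notify_calls;
}
```

In this sketch `run_demo()` reports a single helper call for 1001 allocations, which is the effect the change is after: the common TLAB case pays for one flag test rather than a series of calls.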
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14019#issuecomment-1674890219