RFR: 8345687: Improve the implementation of SegmentFactories::allocateSegment

Wed Dec 11 18:44:49 UTC 2024

On Fri, 6 Dec 2024 16:30:47 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

> Hi,
> 
> This patch improves the performance of a typical `Arena::allocate` in several ways:
> 
> - Delay the creation of the NativeMemorySegmentImpl. This avoids the merge of the instance with the one obtained from the call in the uncommon path, increasing the chance of the object being scalar replaced.
> - Split the allocation of over-aligned memory to a slow-path method.
> - Align the memory to 8 bytes, allowing faster zeroing.
> - Use a dedicated method to zero the just-allocated native memory, reducing code size and making the code more straightforward.
> - Make `VM.pageAlignDirectMemory` a `Boolean` instead of a `boolean` so that a `false` value can be constant folded.
> 
> Please take a look and leave your reviews, thanks a lot.

The benchmark results of the updated version are a bit better than those of the previous version:

    Benchmark                 (size)  Mode  Cnt   Score   Error  Units
    AllocTest.alloc_confined       5  avgt   30  15.796 ± 0.066  ns/op
    AllocTest.alloc_confined      20  avgt   30  16.402 ± 0.203  ns/op
    AllocTest.alloc_confined     100  avgt   30  17.804 ± 0.142  ns/op
    AllocTest.alloc_confined     500  avgt   30  20.037 ± 0.176  ns/op
    AllocTest.alloc_confined    2000  avgt   30  39.397 ± 3.176  ns/op
    AllocTest.alloc_confined    8000  avgt   30  77.413 ± 1.621  ns/op
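
To illustrate the 8-byte alignment mentioned in the quoted description, here is a minimal sketch of how the benchmark sizes above would be padded, assuming the requested size is simply rounded up to the next multiple of 8. The class and method names (`AlignSketch`, `roundUpTo8`) are hypothetical and are not taken from the patch; the actual code in `SegmentFactories` may compute this differently.

```java
public class AlignSketch {
    // Hypothetical helper (not from the patch): round a non-negative byte size
    // up to the next multiple of 8, so the memory can be zeroed in 8-byte words.
    static long roundUpTo8(long byteSize) {
        // -8L is ...11111000 in binary, so the mask clears the low three bits.
        return (byteSize + 7L) & -8L;
    }

    public static void main(String[] args) {
        // The sizes used by the benchmark parameter in the table above.
        long[] sizes = {5, 20, 100, 500, 2000, 8000};
        for (long size : sizes) {
            System.out.println(size + " -> " + roundUpTo8(size));
        }
    }
}
```

Under this assumption, sizes that are already multiples of 8 (2000 and 8000 here) are unchanged, and the others grow by at most 7 bytes.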

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22610#issuecomment-2536839516