RFR: 8357531: The `SegmentBulkOperations::fill` method can be improved using overlaps [v5]

Per Minborg pminborg at openjdk.org
Thu May 22 12:00:55 UTC 2025


On Thu, 22 May 2025 11:52:34 GMT, Per Minborg <pminborg at openjdk.org> wrote:

>> This PR builds on a concept John Rose told me about some time ago. Instead of combining memory operations of various sizes, a single large and skewed memory operation can be made to clean up the tail of remaining bytes.
>> 
>> This has the effect of simplifying and shortening the code. The number of branches to evaluate is reduced.
>> 
>> It should be noted that the performance of the fill operation affects the allocation of new segments (as they are zeroed out before being returned to the client code).
>
> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update benchmark to reflect new fill method
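
For readers unfamiliar with the overlap idea described in the quoted text, here is a minimal sketch of it. It is not the code in `SegmentBulkOperations::fill`; the class `OverlapFill`, its `fill` signature, and the use of `ByteBuffer` in place of a memory segment are assumptions made purely for illustration. The point is that the tail is cleaned up by one store of the widest size, skewed backwards so that it ends exactly at the last byte, rather than by a chain of 4-, 2- and 1-byte stores.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the JDK implementation.
final class OverlapFill {

    // Fills buf[offset, offset + len) with 'value' using overlapping stores.
    static void fill(ByteBuffer buf, int offset, int len, byte value) {
        // Broadcast the byte into all eight lanes of a long.
        // Every lane is identical, so byte order does not matter.
        long v = (value & 0xFFL) * 0x0101010101010101L;
        if (len >= 8) {
            // Bulk part: aligned-to-start 8-byte stores.
            for (int i = 0; i < len - 8; i += 8) {
                buf.putLong(offset + i, v);
            }
            // Tail: one skewed 8-byte store ending at the last byte.
            // It may overlap bytes that were already written.
            buf.putLong(offset + len - 8, v);
        } else if (len >= 4) {
            // 4..7 bytes: two 4-byte stores that may overlap.
            buf.putInt(offset, (int) v);
            buf.putInt(offset + len - 4, (int) v);
        } else if (len >= 2) {
            // 2..3 bytes: two 2-byte stores that may overlap.
            buf.putShort(offset, (short) v);
            buf.putShort(offset + len - 2, (short) v);
        } else if (len == 1) {
            buf.put(offset, value);
        }
    }
}
```

Rewriting a few bytes twice is harmless for a fill because every store writes the same value, and in exchange the tail handling needs no per-size branching beyond the initial length check.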

Updated benchmarks:

Base:

Benchmark                                   (ELEM_SIZE)  Mode  Cnt        Score        Error  Units
SegmentBulkFill.nativeSegmentFillJava                 2  avgt   30        1.618 ±      0.060  ns/op
SegmentBulkFill.nativeSegmentFillJava                 3  avgt   30        1.602 ±      0.042  ns/op
SegmentBulkFill.nativeSegmentFillJava                 4  avgt   30        1.775 ±      0.070  ns/op
SegmentBulkFill.nativeSegmentFillJava                 5  avgt   30        1.759 ±      0.051  ns/op
SegmentBulkFill.nativeSegmentFillJava                 6  avgt   30        1.771 ±      0.051  ns/op
SegmentBulkFill.nativeSegmentFillJava                 7  avgt   30        1.785 ±      0.049  ns/op
SegmentBulkFill.nativeSegmentFillJava                 8  avgt   30        2.383 ±      0.061  ns/op
(The Base value for ELEM_SIZE 12 is estimated in the chart below.)
SegmentBulkFill.nativeSegmentFillJava                64  avgt   30        4.010 ±      0.255  ns/op
SegmentBulkFill.nativeSegmentFillJava               512  avgt   30        6.622 ±      0.246  ns/op
SegmentBulkFill.nativeSegmentFillJava              4096  avgt   30       44.431 ±      0.832  ns/op
SegmentBulkFill.nativeSegmentFillJava             32768  avgt   30      331.429 ±      3.073  ns/op
SegmentBulkFill.nativeSegmentFillJava            262144  avgt   30     4174.795 ±     76.096  ns/op
SegmentBulkFill.nativeSegmentFillJava           2097152  avgt   30    33084.699 ±     53.530  ns/op
SegmentBulkFill.nativeSegmentFillJava          16777216  avgt   30   298953.004 ±  11241.262  ns/op
SegmentBulkFill.nativeSegmentFillJava         134217728  avgt   30  2857973.939 ± 128453.291  ns/op


Patch:

Benchmark                              (ELEM_SIZE)  Mode  Cnt  Score   Error  Units
SegmentBulkFill.nativeSegmentFillJava            2  avgt   30  1.322 ± 0.020  ns/op
SegmentBulkFill.nativeSegmentFillJava            3  avgt   30  1.313 ± 0.009  ns/op
SegmentBulkFill.nativeSegmentFillJava            4  avgt   30  1.323 ± 0.023  ns/op
SegmentBulkFill.nativeSegmentFillJava            5  avgt   30  1.309 ± 0.006  ns/op
SegmentBulkFill.nativeSegmentFillJava            6  avgt   30  1.310 ± 0.017  ns/op
SegmentBulkFill.nativeSegmentFillJava            7  avgt   30  1.308 ± 0.004  ns/op
SegmentBulkFill.nativeSegmentFillJava            8  avgt   30  1.312 ± 0.008  ns/op
SegmentBulkFill.nativeSegmentFillJava           12  avgt   30  1.316 ± 0.025  ns/op
SegmentBulkFill.nativeSegmentFillJava           64  avgt   30  3.829 ± 0.199  ns/op
SegmentBulkFill.nativeSegmentFillJava          512  avgt   30  6.661 ± 0.077  ns/op



![image](https://github.com/user-attachments/assets/69462223-e967-4d37-b7fd-ac47d0e04db9)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25383#issuecomment-2900966164


More information about the core-libs-dev mailing list