[foreign-memaccess+abi] RFR: Improve strlen performance [v6]

Per Minborg pminborg at openjdk.org
Wed Aug 16 11:30:33 UTC 2023


On Tue, 15 Aug 2023 14:40:55 GMT, Per Minborg <pminborg at openjdk.org> wrote:

>> This PR suggests removing the use of native calls for strlen and using Java implementations instead.
>> 
>> The PR also suggests performance improvements for quad word strlen.
>> 
>> Here are some benchmarks that compare the performance of the new proposed methods with the performance of the JDK 21 variants (called "legacy" methods):
>> 
>> <img width="2596" alt="image" src="https://github.com/openjdk/panama-foreign/assets/7457876/c785f341-c826-4e3e-bf56-8387b1e96010">
>
> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Allow heap segments of Long and Double to use chunked len calc

Here are some up-to-date benchmarks:


Benchmark                                  (size)  Mode  Cnt    Score   Error  Units
InternalStrLen.legacyDoubleByte                 1  avgt   30    1.884 ± 0.012  ns/op
InternalStrLen.legacyDoubleByte                 4  avgt   30    3.748 ± 0.014  ns/op
InternalStrLen.legacyDoubleByte                16  avgt   30   11.264 ± 0.059  ns/op
InternalStrLen.legacyDoubleByte               251  avgt   30  164.827 ± 0.514  ns/op
InternalStrLen.legacyDoubleByte              1024  avgt   30  647.327 ± 2.600  ns/op

InternalStrLen.legacyQuadByte                   1  avgt   30    1.961 ± 0.082  ns/op
InternalStrLen.legacyQuadByte                   4  avgt   30    3.746 ± 0.012  ns/op
InternalStrLen.legacyQuadByte                  16  avgt   30   11.253 ± 0.048  ns/op
InternalStrLen.legacyQuadByte                 251  avgt   30  165.220 ± 0.809  ns/op
InternalStrLen.legacyQuadByte                1024  avgt   30  647.814 ± 2.576  ns/op

InternalStrLen.legacySingleByte                 1  avgt   30    1.586 ± 0.016  ns/op
InternalStrLen.legacySingleByte                 4  avgt   30    2.995 ± 0.056  ns/op
InternalStrLen.legacySingleByte                16  avgt   30   10.032 ± 1.354  ns/op
InternalStrLen.legacySingleByte               251  avgt   30  125.963 ± 0.636  ns/op
InternalStrLen.legacySingleByte              1024  avgt   30  489.120 ± 2.170  ns/op
InternalStrLen.legacySingleByteMisaligned       1  avgt   30    1.607 ± 0.035  ns/op
InternalStrLen.legacySingleByteMisaligned       4  avgt   30    3.004 ± 0.024  ns/op
InternalStrLen.legacySingleByteMisaligned      16  avgt   30    8.779 ± 0.113  ns/op
InternalStrLen.legacySingleByteMisaligned     251  avgt   30  127.541 ± 1.196  ns/op
InternalStrLen.legacySingleByteMisaligned    1024  avgt   30  494.867 ± 5.129  ns/op

InternalStrLen.newDoubleByte                    1  avgt   30    1.967 ± 0.204  ns/op
InternalStrLen.newDoubleByte                    4  avgt   30    4.002 ± 0.014  ns/op
InternalStrLen.newDoubleByte                   16  avgt   30    6.025 ± 0.050  ns/op
InternalStrLen.newDoubleByte                  251  avgt   30   28.835 ± 0.185  ns/op
InternalStrLen.newDoubleByte                 1024  avgt   30   72.941 ± 1.307  ns/op

InternalStrLen.newQuadByte                      1  avgt   30    1.430 ± 0.020  ns/op
InternalStrLen.newQuadByte                      4  avgt   30    3.844 ± 0.017  ns/op
InternalStrLen.newQuadByte                     16  avgt   30    6.822 ± 0.052  ns/op
InternalStrLen.newQuadByte                    251  avgt   30   48.940 ± 1.078  ns/op
InternalStrLen.newQuadByte                   1024  avgt   30  196.276 ± 1.722  ns/op

InternalStrLen.newSingleByte                    1  avgt   30    1.920 ± 0.016  ns/op
InternalStrLen.newSingleByte                    4  avgt   30    5.717 ± 0.748  ns/op
InternalStrLen.newSingleByte                   16  avgt   30    5.450 ± 0.047  ns/op
InternalStrLen.newSingleByte                  251  avgt   30   22.148 ± 0.155  ns/op
InternalStrLen.newSingleByte                 1024  avgt   30   39.291 ± 0.220  ns/op
InternalStrLen.newSingleByteMisaligned          1  avgt   30    1.761 ± 0.010  ns/op
InternalStrLen.newSingleByteMisaligned          4  avgt   30    4.937 ± 0.029  ns/op
InternalStrLen.newSingleByteMisaligned         16  avgt   30    9.741 ± 0.041  ns/op
InternalStrLen.newSingleByteMisaligned        251  avgt   30   28.152 ± 0.346  ns/op
InternalStrLen.newSingleByteMisaligned       1024  avgt   30   44.796 ± 0.805  ns/op
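
For readers unfamiliar with the idea behind a "chunked" length calculation, here is a minimal sketch of the general technique for the single-byte case. It is not the code from this PR: it works on a plain byte[] through a byte-array view VarHandle rather than on a MemorySegment, and the class and method names (ChunkedStrlen, containsZeroByte, strlenChunked) are hypothetical. It reads eight bytes at a time, uses a well-known bit trick to test whether any of them is zero, and falls back to a byte-wise scan for the final chunk and the tail.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Illustrative only: names and structure are assumptions, not the PR's implementation.
public final class ChunkedStrlen {

    // Reads a long (8 bytes) from a byte[] at an arbitrary byte index.
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    private ChunkedStrlen() {}

    // True if at least one of the eight bytes in v is zero.
    static boolean containsZeroByte(long v) {
        return ((v - 0x0101010101010101L) & ~v & 0x8080808080808080L) != 0;
    }

    // Plain byte-at-a-time scan starting at 'from'; returns the index of the first NUL.
    static int strlenByteWise(byte[] bytes, int from) {
        for (int i = from; i < bytes.length; i++) {
            if (bytes[i] == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("no NUL terminator found");
    }

    // Skips ahead eight bytes at a time while no chunk contains a zero byte,
    // then locates the exact terminator position byte by byte.
    static int strlenChunked(byte[] bytes) {
        int i = 0;
        int limit = bytes.length - Long.BYTES;
        for (; i <= limit; i += Long.BYTES) {
            long chunk = (long) LONG_VIEW.get(bytes, i);
            if (containsZeroByte(chunk)) {
                break;
            }
        }
        return strlenByteWise(bytes, i);
    }
}
```

Because the zero-byte test only answers "is there a zero somewhere in this chunk", the sketch does not depend on byte order; the exact position is found by the byte-wise fallback.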

-------------

PR Comment: https://git.openjdk.org/panama-foreign/pull/862#issuecomment-1680428464

