RFR: 8266332: Adler32 intrinsic for x86 64-bit platforms [v4]

Sandhya Viswanathan sviswanathan at openjdk.java.net
Mon May 3 22:07:54 UTC 2021


On Mon, 3 May 2021 18:45:19 GMT, Xubo Zhang <github.com+58006833+xbzhang99 at openjdk.org> wrote:

>> Implement an Adler32 intrinsic for x86 64-bit platforms using vector instructions.
>> 
>> For the following benchmark:
>> http://cr.openjdk.java.net/~pli/rfr/8216259/TestAdler32.java
>> 
>> The optimization shows a roughly 4x to 5.5x improvement in average time per operation, depending on input size.
>> 
>> Base:
>> Benchmark                          (count)  Mode  Cnt   Score   Error  Units
>> TestAdler32Perf.testAdler32Update       64  avgt   25   0.084 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update      128  avgt   25   0.104 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update      256  avgt   25   0.146 ± 0.002  us/op
>> TestAdler32Perf.testAdler32Update      512  avgt   25   0.226 ± 0.002  us/op
>> TestAdler32Perf.testAdler32Update     1024  avgt   25   0.390 ± 0.005  us/op
>> TestAdler32Perf.testAdler32Update     2048  avgt   25   0.714 ± 0.007  us/op
>> TestAdler32Perf.testAdler32Update     4096  avgt   25   1.359 ± 0.014  us/op
>> TestAdler32Perf.testAdler32Update     8192  avgt   25   2.751 ± 0.023  us/op
>> TestAdler32Perf.testAdler32Update    16384  avgt   25   5.494 ± 0.077  us/op
>> TestAdler32Perf.testAdler32Update    32768  avgt   25  11.058 ± 0.160  us/op
>> TestAdler32Perf.testAdler32Update    65536  avgt   25  22.198 ± 0.319  us/op
>> 
>> 
>> With patch:
>> Benchmark                          (count)  Mode  Cnt   Score   Error  Units
>> TestAdler32Perf.testAdler32Update       64  avgt   25   0.020 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update      128  avgt   25   0.025 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update      256  avgt   25   0.031 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update      512  avgt   25   0.048 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update     1024  avgt   25   0.078 ± 0.001  us/op
>> TestAdler32Perf.testAdler32Update     2048  avgt   25   0.139 ± 0.002  us/op
>> TestAdler32Perf.testAdler32Update     4096  avgt   25   0.262 ± 0.004  us/op
>> TestAdler32Perf.testAdler32Update     8192  avgt   25   0.524 ± 0.010  us/op
>> TestAdler32Perf.testAdler32Update    16384  avgt   25   1.017 ± 0.022  us/op
>> TestAdler32Perf.testAdler32Update    32768  avgt   25   2.058 ± 0.052  us/op
>> TestAdler32Perf.testAdler32Update    65536  avgt   25   3.994 ± 0.013  us/op
>
> Xubo Zhang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   changed copyright year to 2021 in macroAssembler_x86_adler.cpp
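
For reference, the scalar computation that an Adler32 intrinsic accelerates can be sketched as below. This is a minimal Java sketch, not code from the patch: the class name Adler32Reference and its method are hypothetical, and the result is cross-checked against java.util.zip.Adler32.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

// Hypothetical reference class, for illustration only.
class Adler32Reference {
    private static final int BASE = 65521; // largest prime below 2^16

    // Scalar Adler-32 over buf[off, off + len), starting from the initial value 1.
    static long adler32(byte[] buf, int off, int len) {
        long a = 1; // running sum of bytes (plus 1), modulo BASE
        long b = 0; // running sum of the successive values of a, modulo BASE
        for (int i = off; i < off + len; i++) {
            a = (a + (buf[i] & 0xff)) % BASE;
            b = (b + a) % BASE;
        }
        return (b << 16) | a;
    }

    public static void main(String[] args) {
        byte[] data = "Wikipedia".getBytes(StandardCharsets.US_ASCII);

        Adler32 jdk = new Adler32();
        jdk.update(data, 0, data.length);

        long ref = adler32(data, 0, data.length);
        System.out.println(Long.toHexString(ref));   // prints 11e60398
        System.out.println(ref == jdk.getValue());   // prints true
    }
}
```

The two sums are independent per byte only up to the modulo step, which is what makes a vectorized version worthwhile for long inputs: partial sums can be accumulated in parallel and reduced modulo 65521 less often.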

src/hotspot/cpu/x86/macroAssembler_x86_adler.cpp line 82:

> 80:       cmpptr(data, end);
> 81:       jcc(Assembler::aboveEqual, SKIP_LOOP_1A);
> 82: 

align(32) is needed here.

src/hotspot/cpu/x86/macroAssembler_x86_adler.cpp line 113:

> 111:       vpaddd(xa, xa, xtmp0, Assembler::AVX_256bit);
> 112:       vpaddd(xb, xb, xtmp1, Assembler::AVX_256bit);
> 113:       vpaddd(xsa, xsa, xtmp2, Assembler::AVX_256bit);

This should be Assembler::AVX_128bit.

src/hotspot/cpu/x86/macroAssembler_x86_adler.cpp line 179:

> 177:       movdl(rax, xb);
> 178:       addl(b_d, rax);
> 179: 

align(32) is needed here.

src/hotspot/cpu/x86/macroAssembler_x86_adler.cpp line 183:

> 181:       movzbl(rax, Address(data, 0)); //movzx   eax, byte[data]
> 182:       addl(a_d, rax);
> 183:       incl(data);

data is a pointer, so incl(data) should be either incptr(data) or addptr(data, 1).
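
To illustrate why the operand width matters here: on x86-64, a 32-bit operation on a register zero-extends its result into the full 64-bit register, so a 32-bit increment of a register holding a 64-bit address discards the upper half. The Java sketch below only models that arithmetic, assuming incl emits a 32-bit increment as its suffix suggests; the class, the method names, and the sample addresses are hypothetical.

```java
// Hypothetical model of register arithmetic, for illustration only.
class PointerIncrementSketch {
    // Models a 32-bit increment of a 64-bit register:
    // the low 32 bits are incremented and the result is zero-extended.
    static long inc32(long reg) {
        return ((int) reg + 1) & 0xFFFFFFFFL;
    }

    // Models a pointer-width (64-bit) increment.
    static long inc64(long reg) {
        return reg + 1;
    }

    public static void main(String[] args) {
        long data = 0x00007F1234567890L; // made-up address above 4 GiB

        System.out.println(Long.toHexString(inc32(data))); // prints 34567891
        System.out.println(Long.toHexString(inc64(data))); // prints 7f1234567891

        // The low half can also wrap without carrying into the high half.
        long edge = 0x00000001FFFFFFFFL;
        System.out.println(Long.toHexString(inc32(edge))); // prints 0
        System.out.println(Long.toHexString(inc64(edge))); // prints 200000000
    }
}
```

In both cases the 32-bit form yields a different address than the pointer-width form, which is the failure the review comment is guarding against.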

-------------

PR: https://git.openjdk.java.net/jdk/pull/3806


More information about the hotspot-dev mailing list