RFR: 8248188: Add HotSpotIntrinsicCandidate and API for Base64 decoding

CoreyAshford github.com+51754783+CoreyAshford at openjdk.java.net
Mon Sep 28 17:20:45 UTC 2020


On Mon, 28 Sep 2020 16:35:59 GMT, CoreyAshford <github.com+51754783+CoreyAshford at openjdk.org> wrote:

>> AOT support needs an update:
>> #  Internal Error (jdk/src/hotspot/share/aot/aotCodeHeap.cpp:557), pid=345656, tid=364316
>> #  guarantee(adr != NULL) failed: AOT Symbol not found _aot_stub_routines_base64_decodeBlock
>> 
>> V  [jvm.dll+0x1dbc6e]  AOTCodeHeap::link_stub_routines_symbols+0xf7e  (aotcodeheap.cpp:557)
>> V  [jvm.dll+0x1d95e8]  AOTCodeHeap::link_global_lib_symbols+0x2f8  (aotcodeheap.cpp:603)
>> V  [jvm.dll+0x1dc616]  AOTCodeHeap::load_klass_data+0x476  (aotcodeheap.cpp:840)
>> V  [jvm.dll+0x1e1021]  AOTLoader::load_for_klass+0x161  (aotloader.cpp:55)
>> V  [jvm.dll+0x5fec96]  InstanceKlass::initialize_impl+0x4e6  (instanceklass.cpp:1159)
>> V  [jvm.dll+0x5fead6]  InstanceKlass::initialize_impl+0x326  (instanceklass.cpp:1133)
>> V  [jvm.dll+0xc5a633]  Threads::initialize_java_lang_classes+0x93  (thread.cpp:3766)
>> V  [jvm.dll+0xc57c32]  Threads::create_vm+0xa12  (thread.cpp:4037)
>> 
>> Can be reproduced by running JTREG tests:
>> compiler/aot/calls/fromAot
> 
> Thanks for catching that!  Will fix on next round.

Martin Doerr wrote:
...
> I can see you're using clrldi to clear the upper bits of the parameters. But seems like it clears one bit too few.

You're right.  I misread the instruction as clearing bits 0 through N, but it actually creates a mask with ones in
bits N to 63 and zeroes elsewhere, then ANDs it with the source register.

Will fix.
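
To make the off-by-one concrete, here is a minimal Java model of the clrldi semantics described above.  It is only a
sketch: the class name, the register value, and the choice of 32 versus 31 are illustrative assumptions, not values
taken from the patch.  Bits are numbered the Power way, with bit 0 as the most significant.

```java
public class ClrldiSketch {
    // Models "clrldi RA,RS,n": AND the source with a mask that has ones in
    // bits n..63 (bit 0 = most significant), which clears the n high-order bits.
    public static long clrldi(long src, int n) {
        if (n < 0 || n > 63) {
            throw new IllegalArgumentException("n must be in 0..63");
        }
        return src & (-1L >>> n);
    }

    public static void main(String[] args) {
        // Hypothetical register contents: garbage in the upper 32 bits,
        // the value 42 (0x2a) in the lower 32 bits.
        long dirty = 0xFFFF_FFFF_0000_002AL;

        // n = 32 clears the whole upper half.
        System.out.println(Long.toHexString(clrldi(dirty, 32))); // prints 2a

        // n = 31 clears one bit too few: one garbage bit survives.
        System.out.println(Long.toHexString(clrldi(dirty, 31))); // prints 10000002a
    }
}
```

With n one smaller than intended, the mask keeps one extra high-order bit, so any garbage in that position leaks into
the result.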

> You can also use cmpwi for the boolean one.

Ah, good!  Thanks.  Will change.

> 
> I wonder about the loop unrolling. It doesn't look beneficial because the loop body is large.
> Did you measure performance gain by this unrolling?
> I think for aggressive tuning we'd have to apply techniques like modulo scheduling, but that's much more work.
> So please only use unrolling as far as a benefit is measurable.

I did test on a prototype written in C using vector intrinsics, and 8 was the sweet spot.  However, the structure of
that code was a bit different, and I should have verified that the same amount of loop unrolling makes sense for the
Java intrinsic.  I will perform those experiments.
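
One rough way to compare unrolling factors from the Java side is to time java.util.Base64 decoding on JVM builds that
differ only in the unroll count.  The harness below is a sketch under that assumption: the class name, buffer sizes,
seed, and iteration counts are arbitrary, and a proper benchmark framework such as JMH would give more trustworthy
numbers.  It also does not verify that the intrinsic was actually used.

```java
import java.util.Base64;
import java.util.Random;

public class Base64DecodeTiming {
    // Builds a Base64-encoded buffer from pseudo-random bytes; the fixed
    // seed keeps the input identical across runs and JVM builds.
    public static byte[] encodedInput(int rawSize, long seed) {
        byte[] raw = new byte[rawSize];
        new Random(seed).nextBytes(raw);
        return Base64.getEncoder().encode(raw);
    }

    // Decodes the same buffer `iterations` times and returns elapsed nanoseconds.
    public static long timeDecode(byte[] encoded, int iterations) {
        Base64.Decoder decoder = Base64.getDecoder();
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            byte[] out = decoder.decode(encoded);
            // Consume the result so the JIT cannot discard the decode call.
            checksum += out.length == 0 ? 0 : out[0];
        }
        long elapsed = System.nanoTime() - start;
        if (checksum == Long.MIN_VALUE) {
            System.out.println(checksum); // keeps checksum live; not expected to print
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 20_000;
        for (int rawSize : new int[] {48, 480, 4_800, 48_000}) {
            byte[] encoded = encodedInput(rawSize, 42L);
            timeDecode(encoded, iterations); // warm-up so the decode path gets compiled
            long nanos = timeDecode(encoded, iterations);
            double mibPerSec = (double) encoded.length * iterations / nanos * 1e9 / (1 << 20);
            System.out.printf("raw=%d bytes: %.1f MiB/s%n", rawSize, mibPerSec);
        }
    }
}
```

Running the same class against each build and comparing throughput per buffer size would show whether a given
unrolling factor yields a measurable benefit.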

> 
> But you may want to align the loop start to help instruction fetch.

Interesting.  I did add an align, but I must have lost it somehow during my patch cleanup.  I will add it back.
Sorry for that mistake.

> We'll test it, but we don't have Power 10. You guys need to cover that.

I did test on Power10, but I wasn't able to do performance testing because I ran on an instruction-level simulator.
Real hardware will be available in the coming months.

Thanks for your careful look at the code, and the regression testing you've done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/293


More information about the core-libs-dev mailing list