[vectorIntrinsics] RFR: Add utf8 decoding benchmarks
Paul Sandoz
psandoz at openjdk.java.net
Wed Nov 25 21:28:08 UTC 2020
On Wed, 25 Nov 2020 17:37:01 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Following discussions on the mailing list, I'm submitting three benchmarks around UTF-8 decoding:
>> - decode: uses a while-loop based implementation currently in use in the JDK
>> - decodeVector: uses a lookup table with vector operations for 1-3 byte characters
>> - decodeVectorASCII: uses a simple vector operation to accelerate parsing ASCII-only characters
>>
>> We don't observe the expected speedups with either decodeVector or decodeVectorASCII, so these are, I think, good test cases to further develop the Vector API.
>
> @luhenry Looks good to me as well. Please do create a separate branch (other than vectorIntrinsics) as indicated above by OpenJDK Skara bot and resubmit the pull request from the new branch.
This benchmark has identified a few issues:
1. Operations on species of byte/64 are not intrinsic.
2. `ShortVector` loads from and stores to `char[]` are not intrinsic (I suspected this to be the case).
For now:
1) can be unblocked by focusing on byte/128 and short/256.
2) can be unblocked with the following patch (it appears to work, but needs a more detailed review) or by using `short[]`.
diff --git a/src/hotspot/share/opto/vectorIntrinsics.cpp b/src/hotspot/share/opto/vectorIntrinsics.cpp
index db7b69a9137..399130b0e45 100644
--- a/src/hotspot/share/opto/vectorIntrinsics.cpp
+++ b/src/hotspot/share/opto/vectorIntrinsics.cpp
@@ -624,7 +624,10 @@ bool LibraryCallKit::inline_vector_mem_operation(bool is_store) {
// Handle loading masks.
// If there is no consistency between array and vector element types, it must be special byte array case or loading masks
if (arr_type != NULL && !using_byte_array && elem_bt != arr_type->elem()->array_element_basic_type() && !is_mask) {
- return false;
+ if (elem_bt == T_SHORT && arr_type->elem()->array_element_basic_type() == T_CHAR) {
+ } else {
+ return false;
+ }
}
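The `short[]` alternative in 2) relies on `char` and `short` both being 16 bits wide, so a cast changes only the signedness and loses no bits. A minimal sketch of that workaround, assuming a hypothetical helper class `CharShortBridge` (it is not part of the benchmark or the JDK): the vector code would then use only `short[]`, for example via `ShortVector.fromArray`, and the result is converted back to `char[]` at the end.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only: copies between char[] and
// short[] so that the vector code can operate on short[] exclusively.
final class CharShortBridge {
    // char and short are both 16 bits wide; the cast keeps every bit and
    // only changes the interpretation (unsigned vs. signed).
    static short[] toShorts(char[] src) {
        short[] dst = new short[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (short) src[i];
        }
        return dst;
    }

    // Inverse conversion, applied once after the vectorized work is done.
    static char[] toChars(short[] src) {
        char[] dst = new char[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (char) src[i];
        }
        return dst;
    }

    public static void main(String[] args) {
        char[] decoded = {'a', '\u00e9', '\uffff'};
        short[] asShorts = toShorts(decoded);
        System.out.println(Arrays.toString(asShorts));
        // The round trip reproduces the original characters.
        System.out.println(Arrays.equals(decoded, toChars(asShorts)));
    }
}
```

The extra copy has a cost of its own, so this is only a way to keep the benchmark on an intrinsic path until `char[]` is handled directly.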
In general, to find the causes of issues, I recommend extracting vector sub-expressions and placing them in separate benchmarks. That makes it easier to analyze the generated code.
The benchmark is also storing vectors/shuffles on the heap. Instead, I recommend storing such data in compatible arrays, then loading it into vector instances held in local variables.
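A minimal sketch of that last recommendation, assuming a hypothetical class `LocalVectorSketch` with a made-up per-lane table `OFFSETS` (neither is taken from the benchmark). The table is kept in a `byte[]` field and loaded into a local `ByteVector` inside the method. The same pattern should apply to shuffles kept as `int[]` and loaded with `VectorShuffle.fromArray`. Compiling and running it requires `--add-modules jdk.incubator.vector`.

```java
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical example, for illustration only.
final class LocalVectorSketch {
    static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_128;

    // Constant data is stored in a plain array, not in a ByteVector field.
    static final byte[] OFFSETS = new byte[SPECIES.length()];
    static {
        for (int i = 0; i < OFFSETS.length; i++) {
            OFFSETS[i] = (byte) i; // made-up table: lane i holds the value i
        }
    }

    // Adds the per-lane offset to every byte of src and writes the result to dst.
    static void addOffsets(byte[] src, byte[] dst) {
        // Load the table once into a local variable; the vector instance
        // is never written to a field.
        ByteVector offsets = ByteVector.fromArray(SPECIES, OFFSETS, 0);
        int i = 0;
        int bound = SPECIES.loopBound(src.length);
        for (; i < bound; i += SPECIES.length()) {
            ByteVector v = ByteVector.fromArray(SPECIES, src, i);
            v.add(offsets).intoArray(dst, i);
        }
        // Scalar tail for the remaining elements; blocks start at multiples
        // of the vector length, so i % length is the lane index.
        for (; i < src.length; i++) {
            dst[i] = (byte) (src[i] + OFFSETS[i % SPECIES.length()]);
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[20];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] dst = new byte[src.length];
        addOffsets(src, dst);
        System.out.println(java.util.Arrays.toString(dst));
    }
}
```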
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/26