RFR: 8282047: Enhance StringDecode/Encode microbenchmarks [v2]

Fri Feb 25 11:09:10 UTC 2022

On Fri, 25 Feb 2022 10:19:17 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> test/micro/org/openjdk/bench/java/lang/StringDecode.java line 72:
>> 
>>> 70:     public void setup() {
>>> 71:         charset = Charset.forName(charsetName);
>>> 72:         asciiString = LOREM.substring(0, 32).getBytes(charset);
>> 
>> This is problematic IMO in that it's missing short strings such as "Claes". Average Java strings are about 32 bytes long AFAICR, and people writing (vectorized) intrinsics have a nasty habit of optimizing for long strings, to the detriment of typical-length ones.
>> Whether we like it or not, people will optimize for benchmarks, so it's important that benchmark data is realistic. The shortest here is 15 bytes, as far as I can see. I'd certainly include a short string of just a few bytes so that intrinsics don't cause regressions in important cases.
>
> All good points. I've added a number of such short variants to all(?) relevant microbenchmarks. The tests should now better cover a mix of input lengths and encodings.
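
For illustration, here is a minimal sketch of the mixed-length inputs discussed above. It is plain Java rather than JMH, and it is not taken from the PR: the class name `ShortStringInputs`, its helper methods, the shortened `LOREM` text, and the chosen lengths are all assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ShortStringInputs {
    // Stand-in for the benchmark's LOREM constant (assumption: the real text differs).
    static final String LOREM = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.";

    // Encode the first `length` chars of LOREM, mirroring the quoted setup() line.
    static byte[] encodedPrefix(int length, Charset charset) {
        return LOREM.substring(0, length).getBytes(charset);
    }

    // Decode the bytes back into a String; a decode benchmark would time this step.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        Charset charset = StandardCharsets.UTF_8;
        // Illustrative lengths only: a few bytes, 15, 32, and the full stand-in text.
        int[] lengths = {5, 15, 32, LOREM.length()};
        for (int length : lengths) {
            byte[] input = encodedPrefix(length, charset);
            System.out.println(length + " chars -> " + input.length + " bytes: "
                    + decode(input, charset));
        }
    }
}
```

In a real JMH benchmark each length would more likely be a separate field or `@Param` value, so that a regression on very short inputs shows up as its own result instead of being averaged away.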

-------------

PR: https://git.openjdk.java.net/jdk/pull/7516