RFR: 8314774: Optimize URLEncoder [v8]
Claes Redestad
redestad at openjdk.org
Mon Aug 28 13:43:12 UTC 2023
On Thu, 24 Aug 2023 10:38:57 GMT, Glavo <duke at openjdk.org> wrote:
>> I mainly made these optimizations:
>>
>> * Avoid allocating `StringBuilder` when there are no characters in the URL that need to be encoded;
>> * Implement a fast path for UTF-8.
>>
>> In addition to improving performance, these optimizations also reduce temporary objects:
>>
>> * It no longer allocates any object when there are no characters in the URL that need to be encoded;
>> * The initial size of StringBuilder is larger to avoid expansion as much as possible;
>> * For UTF-8, the temporary `CharArrayWriter`, strings and byte arrays are no longer needed.
>>
>> The results of the `URLEncodeDecode` benchmark:
>>
>>
>> Before:
>> Benchmark                       (count)  (maxLength)  (mySeed)  Mode  Cnt  Score   Error  Units
>> URLEncodeDecode.testEncodeUTF8     1024         1024         3  avgt   15  5.587 ± 0.010  ms/op
>>
>> After:
>> Benchmark                       (count)  (maxLength)  (mySeed)  Mode  Cnt  Score   Error  Units
>> URLEncodeDecode.testEncodeUTF8     1024         1024         3  avgt   15  3.582 ± 0.054  ms/op
>>
>>
>> I also updated the tests to add more test cases.
>
> Glavo has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove UTF-8 fast path
I've opened a PR to improve the OpenJDK `URLEncodeDecode` microbenchmark so that it better captures real-world mixed scenarios. While the microbenchmark could be improved further to include inputs with high code points and surrogate pairs, I think this fixes a few glaring issues with the current benchmark (which only tests strings with a lot of `'\u0000'` control characters) and should help guide optimization efforts better.
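
For readers following along, here is a rough sketch of the "no allocation when nothing needs encoding" idea from the quoted PR description. It is not the code under review: the class name `EncodeSketch`, the `SAFE` table and the `len + 16` builder size are assumptions made for illustration, and the slow path is deliberately simplified (UTF-8 only, one code point at a time).

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical illustration only; not the java.net.URLEncoder implementation.
final class EncodeSketch {
    // Characters that form-style URL encoding leaves untouched.
    private static final BitSet SAFE = new BitSet(128);
    private static final String HEX = "0123456789ABCDEF";

    static {
        SAFE.set('a', 'z' + 1);
        SAFE.set('A', 'Z' + 1);
        SAFE.set('0', '9' + 1);
        SAFE.set('-');
        SAFE.set('_');
        SAFE.set('.');
        SAFE.set('*');
    }

    static String encode(String s) {
        int len = s.length();
        int i = 0;
        // Fast scan: skip the leading run of characters that need no encoding.
        while (i < len && SAFE.get(s.charAt(i))) {
            i++;
        }
        if (i == len) {
            return s; // nothing to encode, so return the input without allocating
        }
        // Slow path: size the builder generously (assumed heuristic) and
        // copy the already-scanned safe prefix in one call.
        StringBuilder sb = new StringBuilder(len + 16);
        sb.append(s, 0, i);
        while (i < len) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (SAFE.get(cp)) {
                sb.append((char) cp);
            } else if (cp == ' ') {
                sb.append('+');
            } else {
                // Simplification: encode one code point at a time as UTF-8.
                byte[] bytes = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    sb.append('%')
                      .append(HEX.charAt((b >> 4) & 0xF))
                      .append(HEX.charAt(b & 0xF));
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String plain = "abc-XYZ_0.9*";
        System.out.println(encode(plain) == plain); // prints true
        System.out.println(encode("a b&c"));        // prints a+b%26c
    }
}
```

A benchmark whose inputs are dominated by control characters never reaches the early `return s`, which is one reason a mixed input set gives a more useful picture.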
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15354#issuecomment-1695721273