RFR: 8314774: Optimize URLEncoder [v8]
Claes Redestad
redestad at openjdk.org
Thu Aug 24 12:14:27 UTC 2023
On Thu, 24 Aug 2023 10:38:57 GMT, Glavo <duke at openjdk.org> wrote:
>> I mainly made these optimizations:
>>
>> * Avoid allocating `StringBuilder` when there are no characters in the URL that need to be encoded;
>> * Implement a fast path for UTF-8.
>>
>> In addition to improving performance, these optimizations also reduce temporary objects:
>>
>> * It no longer allocates any object when there are no characters in the URL that need to be encoded;
>> * The initial size of StringBuilder is larger to avoid expansion as much as possible;
>> * For UTF-8, the temporary `CharArrayWriter`, strings and byte arrays are no longer needed.
>>
>> The results of the `URLEncodeDecode` benchmark:
>>
>>
>> Before:
>> Benchmark (count) (maxLength) (mySeed) Mode Cnt Score Error Units
>> URLEncodeDecode.testEncodeUTF8 1024 1024 3 avgt 15 5.587 ± 0.010 ms/op
>>
>> After:
>> Benchmark (count) (maxLength) (mySeed) Mode Cnt Score Error Units
>> URLEncodeDecode.testEncodeUTF8 1024 1024 3 avgt 15 3.582 ± 0.054 ms/op
>>
>>
>> I also updated the tests to add more test cases.
>
> Glavo has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove UTF-8 fast path
Does your benchmark test a healthy mix of strings: some that need encoding and some that don't? Perhaps it should be weighted so that most inputs need encoding only in the latter half, which is common since the protocol and host seldom need encoding.
For strings that don't need encoding at all, this optimization alone should get you close to the numbers for the full thing.
The heuristic for sizing the `StringBuilder` could perhaps discount the chars we copy 1:1, to reduce allocation pressure: `i + ((s.length() - i) << 1)`, with `i` being the index of the first char that needs encoding.
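To make the sizing suggestion concrete, here is a minimal sketch, not the code in the PR. The class name `EncoderSizingSketch`, its methods, and the simplified safe-character set are all hypothetical; it assumes `i` is the index of the first char that cannot be copied unchanged.

```java
import java.util.BitSet;

public class EncoderSizingSketch {
    // Hypothetical, simplified stand-in for the set of chars copied 1:1.
    // Space is deliberately left out because it is rewritten to '+'.
    private static final BitSet COPIED_AS_IS = new BitSet(128);
    static {
        COPIED_AS_IS.set('a', 'z' + 1);
        COPIED_AS_IS.set('A', 'Z' + 1);
        COPIED_AS_IS.set('0', '9' + 1);
        COPIED_AS_IS.set('-');
        COPIED_AS_IS.set('_');
        COPIED_AS_IS.set('.');
        COPIED_AS_IS.set('*');
    }

    // Index of the first char that must be rewritten, or s.length() if
    // there is none (in which case the input can be returned as is).
    static int firstCharToEncode(String s) {
        int i = 0;
        while (i < s.length() && COPIED_AS_IS.get(s.charAt(i))) {
            i++;
        }
        return i;
    }

    // Suggested heuristic: the first i chars are known to be copied 1:1,
    // so only the remainder is given room to grow (here: doubled).
    static int initialCapacity(String s, int i) {
        return i + ((s.length() - i) << 1);
    }

    // Starts the output with the already-scanned prefix copied in bulk.
    static StringBuilder startBuilder(String s, int i) {
        StringBuilder sb = new StringBuilder(initialCapacity(s, i));
        sb.append(s, 0, i);
        return sb;
    }
}
```

Compared with sizing the whole builder at `s.length() << 1`, this saves `i` chars of capacity per call, which adds up when the part needing encoding sits near the end of the input.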
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15354#issuecomment-1691559924