RFR: 8314774: Optimize URLEncoder [v8]

Glavo duke at openjdk.org
Thu Aug 24 10:47:37 UTC 2023


On Thu, 24 Aug 2023 10:38:57 GMT, Glavo <duke at openjdk.org> wrote:

>> I mainly made these optimizations:
>> 
>> * Avoid allocating `StringBuilder` when there are no characters in the URL that need to be encoded;
>> * Implement a fast path for UTF-8.
>> 
>> In addition to improving performance, these optimizations also reduce temporary objects:
>> 
>> * It no longer allocates any object when there are no characters in the URL that need to be encoded;
>> * The initial size of `StringBuilder` is larger to avoid expansion as much as possible;
>> * For UTF-8, the temporary `CharArrayWriter`, strings and byte arrays are no longer needed.
>> 
>> The results of the `URLEncodeDecode` benchmark:
>> 
>> 
>> Before:
>> Benchmark                       (count)  (maxLength)  (mySeed)  Mode  Cnt  Score   Error  Units
>> URLEncodeDecode.testEncodeUTF8     1024         1024         3  avgt   15  5.587 ± 0.010  ms/op
>> 
>> After:
>> Benchmark                       (count)  (maxLength)  (mySeed)  Mode  Cnt  Score   Error  Units
>> URLEncodeDecode.testEncodeUTF8     1024         1024         3  avgt   15  3.582 ± 0.054  ms/op
>> 
>> 
>> I also updated the tests to add more test cases.
>
> Glavo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove UTF-8 fast path

I've removed the UTF-8 fast path and this is the JMH result now:


Baseline:
Benchmark                                          (count)  (maxLength)  (mySeed)  Mode  Cnt        Score   Error   Units
URLEncodeDecode.testEncodeUTF8                        1024         1024         3  avgt   15        5.582 ± 0.009   ms/op
URLEncodeDecode.testEncodeUTF8:gc.alloc.rate          1024         1024         3  avgt   15     1439.974 ± 2.386  MB/sec
URLEncodeDecode.testEncodeUTF8:gc.alloc.rate.norm     1024         1024         3  avgt   15  8429374.434 ± 0.239    B/op
URLEncodeDecode.testEncodeUTF8:gc.count               1024         1024         3  avgt   15        6.000          counts
URLEncodeDecode.testEncodeUTF8:gc.time                1024         1024         3  avgt   15        9.000              ms

This PR:
Benchmark                                          (count)  (maxLength)  (mySeed)  Mode  Cnt        Score    Error   Units
URLEncodeDecode.testEncodeUTF8                        1024         1024         3  avgt   15        4.452 ±  0.092   ms/op
URLEncodeDecode.testEncodeUTF8:gc.alloc.rate          1024         1024         3  avgt   15     1702.963 ± 35.595  MB/sec
URLEncodeDecode.testEncodeUTF8:gc.alloc.rate.norm     1024         1024         3  avgt   15  7949307.100 ±  0.721    B/op
URLEncodeDecode.testEncodeUTF8:gc.count               1024         1024         3  avgt   15      120.000           counts
URLEncodeDecode.testEncodeUTF8:gc.time                1024         1024         3  avgt   15      125.000               ms


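This PR now shows about a 25% performance increase (5.582 to 4.452 ms/op, i.e. about 20% less time per operation) and roughly a 6% reduction in memory allocated per operation (`gc.alloc.rate.norm`: 8429374 to 7949307 B/op) in the `URLEncodeDecode` benchmark.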

That doesn't seem like a significant improvement, which is why I initially wanted to submit it together with the UTF-8 fast path, which would significantly reduce memory allocations.
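
For readers who have not looked at the diff, here is a minimal sketch of the idea behind the remaining optimization: scan first, and return the input unchanged when nothing needs encoding. This is not the code in the PR. The class `EncodeSketch`, the method `encode` and the helper `isUnreserved` are hypothetical names, the `+ 16` headroom is an arbitrary choice, and the sketch always uses UTF-8 rather than taking a charset argument.

```java
import java.nio.charset.StandardCharsets;

public class EncodeSketch {
    // Characters URLEncoder leaves as-is: letters, digits, '-', '_', '.', '*'.
    static boolean isUnreserved(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.' || c == '*';
    }

    static String encode(String s) {
        // Scan first: find the first character that needs encoding.
        int i = 0;
        while (i < s.length() && isUnreserved(s.charAt(i))) {
            i++;
        }
        if (i == s.length()) {
            return s; // nothing to encode, so nothing is allocated
        }

        // Slow path: allocate once with some headroom and copy the clean prefix.
        StringBuilder out = new StringBuilder(s.length() + 16);
        out.append(s, 0, i);
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isUnreserved(c)) {
                out.append(c);
                i++;
            } else if (c == ' ') {
                out.append('+');
                i++;
            } else {
                // Encode one whole code point so surrogate pairs stay together.
                int n = Character.charCount(s.codePointAt(i));
                byte[] bytes = s.substring(i, i + n).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%');
                    out.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0xF, 16)));
                    out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
                }
                i += n;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String clean = "abc-123";
        System.out.println(encode(clean) == clean); // true: same object returned
        System.out.println(encode("a b&c"));        // a+b%26c
    }
}
```

Note that the slow path in this sketch still creates a temporary substring and byte array for every escaped code point. Temporaries of that kind are what the UTF-8 fast path was meant to eliminate.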

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15354#issuecomment-1691443610

