RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers

Claes Redestad redestad at openjdk.org
Thu Sep 21 15:26:52 UTC 2023


On Thu, 21 Sep 2023 14:32:46 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> `URLEncoder` currently appends chars that need encoding into a `java.io.CharArrayWriter`, converts that to a `String`, uses `String::getBytes` to get the encoded bytes and then appends these bytes in an escaped manner to the output `StringBuilder`. This is somewhat inefficient.
> 
> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in small chunks, with greatly reduced allocation as a result.
> 
> The exact size of the buffers is an open question, but generally it seems that a tiny buffer wins by virtue of allocating less, and that the per-chunk overheads are relatively small.
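
For readers who want a concrete picture of the chunked approach described in the quoted text, here is a minimal sketch. It is not the code from the PR: the class name `ChunkedEscapeSketch`, the `CHUNK` size of 8 and the helper `isSafe` are assumptions made for illustration only. It collects runs of chars that need escaping into a small reusable `CharBuffer`, encodes each run into a reusable `ByteBuffer` with a `CharsetEncoder`, and appends the bytes as `%XX` escapes to the output `StringBuilder`.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ChunkedEscapeSketch {
    // Hypothetical chunk size; the PR leaves the exact buffer size open.
    private static final int CHUNK = 8;
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Chars that URLEncoder leaves unchanged (space is handled separately).
    static boolean isSafe(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '-' || c == '_' || c == '.' || c == '*';
    }

    static String encode(String s, Charset cs) {
        StringBuilder out = new StringBuilder(s.length());
        CharsetEncoder enc = cs.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Both buffers are allocated once and reused for every chunk.
        CharBuffer cb = CharBuffer.allocate(CHUNK);
        ByteBuffer bb = ByteBuffer.allocate(
            (int) Math.ceil(CHUNK * (double) enc.maxBytesPerChar()));

        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isSafe(c)) {
                out.append(c);
                i++;
            } else if (c == ' ') {
                out.append('+');
                i++;
            } else {
                // Gather up to CHUNK chars that need escaping.
                cb.clear();
                while (i < s.length() && cb.hasRemaining()) {
                    c = s.charAt(i);
                    if (isSafe(c) || c == ' ') break;
                    boolean pair = Character.isHighSurrogate(c)
                        && i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1));
                    if (pair) {
                        // Never split a surrogate pair across two chunks.
                        if (cb.remaining() < 2) break;
                        cb.put(c).put(s.charAt(i + 1));
                        i += 2;
                    } else {
                        cb.put(c);
                        i++;
                    }
                }
                cb.flip();
                bb.clear();
                enc.reset();
                CoderResult r = enc.encode(cb, bb, true);
                if (!r.isUnderflow()) {
                    throw new IllegalStateException("unexpected: " + r);
                }
                enc.flush(bb);
                bb.flip();
                // Append each encoded byte as an uppercase %XX escape.
                while (bb.hasRemaining()) {
                    int b = bb.get() & 0xFF;
                    out.append('%').append(HEX[b >> 4]).append(HEX[b & 0xF]);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints a+b%2F%C3%A9
        System.out.println(encode("a b/\u00e9", StandardCharsets.UTF_8));
    }
}
```

The only per-call allocations in this sketch are the encoder, the two small buffers and the output builder, regardless of how many chars need escaping. A larger `CHUNK` would mean fewer `encode` calls per run but bigger buffers, which is the trade-off the quoted text describes.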

Microbenchmarks show a modest win in time per operation (roughly 11-23%) and a large reduction in allocation per operation (about 35%) for variants that need to change the URL either partially or completely, and no regression when the URL remains unchanged:

Name                             (unchanged) Cnt        Base    Error         Test    Error   Unit   Diff%
URLEncodeDecode.testEncodeLatin1           0  15       3.471 ±  0.103        2.796 ±  0.078  ms/op   19.5% (p = 0.000*)
  :gc.alloc.rate                         N/A  15     828.462 ± 25.054      673.090 ± 19.214 MB/sec  -18.8% (p = 0.000*)
  :gc.alloc.rate.norm                    N/A  15 3013680.062 ±  0.721  1972347.384 ±  0.540   B/op  -34.6% (p = 0.000*)
  :gc.count                              N/A  15      20.000                17.000          counts
  :gc.time                               N/A  15      16.000                15.000              ms
URLEncodeDecode.testEncodeLatin1          75  15       1.269 ±  0.028        1.132 ±  0.029  ms/op   10.8% (p = 0.000*)
  :gc.alloc.rate                         N/A  15     606.924 ± 12.993      443.802 ± 11.184 MB/sec  -26.9% (p = 0.000*)
  :gc.alloc.rate.norm                    N/A  15  807656.807 ±  0.191   526711.840 ±  0.197   B/op  -34.8% (p = 0.000*)
  :gc.count                              N/A  15      16.000                11.000          counts
  :gc.time                               N/A 15/11      16.000                10.000              ms
URLEncodeDecode.testEncodeLatin1         100  15       0.542 ±  0.000        0.542 ±  0.000  ms/op   -0.0% (p = 0.932 )
  :gc.alloc.rate                         N/A  15       0.007 ±  0.000        0.007 ±  0.000 MB/sec    0.0% (p = 0.358 )
  :gc.alloc.rate.norm                    N/A  15       3.730 ±  0.004        3.731 ±  0.001   B/op    0.0% (p = 0.356 )
  :gc.count                              N/A  15       0.000                 0.000          counts
URLEncodeDecode.testEncodeUTF8             0  15       3.469 ±  0.137        2.678 ±  0.023  ms/op   22.8% (p = 0.000*)
  :gc.alloc.rate                         N/A  15     843.593 ± 32.562      711.747 ±  6.147 MB/sec  -15.6% (p = 0.000*)
  :gc.alloc.rate.norm                    N/A  15 3065136.041 ±  0.948  1999098.562 ±  0.161   B/op  -34.8% (p = 0.000*)
  :gc.count                              N/A  15      22.000                18.000          counts
  :gc.time                               N/A  15      20.000                15.000              ms
URLEncodeDecode.testEncodeUTF8            75  15       1.337 ±  0.031        1.192 ±  0.107  ms/op   10.8% (p = 0.000*)
  :gc.alloc.rate                         N/A  15     586.133 ± 13.420      429.661 ± 36.595 MB/sec  -26.7% (p = 0.000*)
  :gc.alloc.rate.norm                    N/A  15  821529.273 ±  0.216   533888.255 ±  0.744   B/op  -35.0% (p = 0.000*)
  :gc.count                              N/A  15      15.000                11.000          counts
  :gc.time                               N/A 15/11      10.000                10.000              ms
URLEncodeDecode.testEncodeUTF8           100  15       0.542 ±  0.000        0.541 ±  0.000  ms/op    0.1% (p = 0.000*)
  :gc.alloc.rate                         N/A  15       0.007 ±  0.000        0.007 ±  0.000 MB/sec    0.0% (p = 0.771 )
  :gc.alloc.rate.norm                    N/A  15       3.731 ±  0.001        3.727 ±  0.002   B/op   -0.1% (p = 0.000*)
  :gc.count                              N/A  15       0.000                 0.000          counts
  * = significant

Invariant parameters used by above microbenchmarks:
encodeChars:      6
maxLength  :   1024

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15865#issuecomment-1729715342

