RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers [v6]

Claes Redestad redestad at openjdk.org
Sun Sep 24 21:52:22 UTC 2023


On Fri, 22 Sep 2023 08:53:07 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> `URLEncoder` currently appends chars that need encoding into a `java.io.CharArrayWriter`, converts that to a `String`, uses `String::getBytes` to get the encoded bytes and then appends these bytes in an escaped manner to the output stream. This is somewhat inefficient.
>> 
>> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in small chunks, with greatly reduced allocation as a result.
>> 
>> The exact size of the buffers is an open question, but generally it seems that a tiny buffer wins by virtue of allocating less, and that the per chunk overheads are relatively small.
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update src/java.base/share/classes/java/net/URLEncoder.java
>   
>   Co-authored-by: ExE Boss <3889017+ExE-Boss at users.noreply.github.com>
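
For readers following along, the chunked approach described in the PR summary above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the class name `ChunkedUrlEncodeSketch`, the helper names, and the chunk size of 8 chars are all hypothetical, and the set of pass-through characters is the one documented for `URLEncoder`.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

// Illustrative sketch only; names and sizes are assumptions, not the PR's code.
class ChunkedUrlEncodeSketch {
    // Hypothetical chunk size; the PR leaves the exact buffer size open.
    private static final int CHUNK_CHARS = 8;
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Characters that URLEncoder documents as passing through unchanged.
    static boolean isUnreserved(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.' || c == '*';
    }

    static String encode(String s, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // One small CharBuffer/ByteBuffer pair, reused for the whole input.
        CharBuffer chars = CharBuffer.allocate(CHUNK_CHARS);
        ByteBuffer bytes = ByteBuffer.allocate(
                (int) Math.ceil(CHUNK_CHARS * encoder.maxBytesPerChar()));
        StringBuilder out = new StringBuilder(s.length());

        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isUnreserved(c)) { out.append(c); i++; continue; }
            if (c == ' ') { out.append('+'); i++; continue; }

            // Start of a run of chars that need escaping: encode it chunk by chunk.
            encoder.reset();
            while (i < s.length() && !isUnreserved(c = s.charAt(i)) && c != ' ') {
                chars.put(c);
                i++;
                if (!chars.hasRemaining()) {
                    drain(encoder, chars, bytes, out, false);
                }
            }
            drain(encoder, chars, bytes, out, true);
        }
        return out.toString();
    }

    // Encodes the buffered chars and appends the resulting bytes as %XX escapes.
    private static void drain(CharsetEncoder encoder, CharBuffer chars, ByteBuffer bytes,
                              StringBuilder out, boolean endOfInput) {
        chars.flip();
        CoderResult result;
        do {
            result = encoder.encode(chars, bytes, endOfInput);
            if (endOfInput && result.isUnderflow()) {
                result = encoder.flush(bytes);
            }
            bytes.flip();
            while (bytes.hasRemaining()) {
                int b = bytes.get() & 0xFF;
                out.append('%').append(HEX[b >> 4]).append(HEX[b & 0xF]);
            }
            bytes.clear();
        } while (result.isOverflow());
        // Keeps a high surrogate left over at a chunk boundary for the next chunk.
        chars.compact();
    }
}
```

With this sketch, `ChunkedUrlEncodeSketch.encode("a b&c", StandardCharsets.UTF_8)` returns `a+b%26c`. The only allocations per call are the two small buffers, the encoder, and the output `StringBuilder`, which is the point of the change.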

> > Micros show a small throughput win and a large allocation reduction for variants that need to change the URL either partially or completely, and no regression when the URL remains unchanged:
> > ```
> > Name                             (unchanged) Cnt        Base    Error         Test    Error   Unit   Diff%
> > URLEncodeDecode.testEncodeLatin1           0  15       3.471 ±  0.103        2.796 ±  0.078  ms/op   19.5% (p = 0.000*)
> >   :gc.alloc.rate                                     828.462 ± 25.054      673.090 ± 19.214 MB/sec  -18.8% (p = 0.000*)
> >   :gc.alloc.rate.norm                            3013680.062 ±  0.721  1972347.384 ±  0.540   B/op  -34.6% (p = 0.000*)
> >   :gc.count                                           20.000                17.000          counts
> >   :gc.time                                            16.000                15.000              ms
> > <snip>
> > ```
> 
> Can you share how to generate this kind of diff report? What I am doing now is to run the baseline and the current version separately, and then use the processing program I wrote to generate the diff report. Can you share your experience?

It's a small utility I've toyed with and iterated on over the last couple of weeks. It's tangled up with some legacy proprietary code, but I think I can refactor out the relevant pieces and put it in my OpenJDK repo in the next couple of weeks.
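
In the meantime, the `Diff%` column for the allocation rows can be approximated with a plain relative change between the two scores. The snippet below is a minimal sketch of that arithmetic only, not the utility mentioned above; the class and method names are hypothetical, and it does not attempt the significance test behind the `p` values.

```java
import java.util.Locale;

// Hypothetical helper; not the utility referred to in this thread.
class DiffPercentSketch {
    // Relative change of test versus base, in percent.
    // A negative result means the test score is lower than the base score.
    static double relativeChangePercent(double base, double test) {
        return (test - base) / base * 100.0;
    }

    // Formats the value with one decimal, as in the report's Diff% column.
    static String format(double percent) {
        return String.format(Locale.ROOT, "%.1f%%", percent);
    }
}
```

Applied to the `gc.alloc.rate.norm` row, `format(relativeChangePercent(3013680.062, 1972347.384))` gives `-34.6%`. The `ms/op` row in the table appears to use the opposite sign, so that a positive value means the test is faster; that reading is an inference from the numbers, not something stated in the thread.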

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15865#issuecomment-1732677299

