RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers [v5]

Fri Sep 22 08:53:10 UTC 2023

On Fri, 22 Sep 2023 08:37:04 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> `URLEncoder` currently appends chars that need encoding into a `java.io.CharArrayWriter`, converts that to a `String`, uses `String::getBytes` to get the encoded bytes and then appends these bytes in an escaped manner to the output stream. This is somewhat inefficient.
>> 
>> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in small chunks, with greatly reduced allocation as a result.
>> 
>> The exact size of the buffers is an open question, but generally it seems that a tiny buffer wins by virtue of allocating less, and that the per chunk overheads are relatively small.
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add comment, split long line

src/java.base/share/classes/java/net/URLEncoder.java line 299:

> 297:      * Encodes input chars in cb and appends the byte values in an escaped format
> 298:      * ("%FF") to out. The temporary byte buffer, bb, must be able to accept
> 299:      * cb.position() * ce.maxBytesPerChar() bytes.

The following might read better. It also uses `"%XX"` instead of `"%FF"`, since `FF` is a concrete value (255 in hex) rather than a placeholder for any byte:
Suggestion:

     * Encodes input chars in {@code cb} and appends the byte values in an escaped
     * format ({@code "%XX"}) to {@code out}. The temporary byte buffer, {@code bb},
     * must be able to accept {@code cb.position() * ce.maxBytesPerChar()} bytes.
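
For readers following along, here is a minimal sketch of the contract that comment describes. It is not the code in the PR: the class name, `CHUNK_CHARS`, `flushEscaped` and `isSafe` are hypothetical names picked for illustration, and the chunk size of 8 is an arbitrary assumption. It only shows why `bb` has to hold `cb.position() * ce.maxBytesPerChar()` bytes and what the `"%XX"` output looks like.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ChunkedEscapeSketch {
    // Hypothetical chunk size; the PR leaves the exact buffer size open.
    private static final int CHUNK_CHARS = 8;

    // Encodes the chars pending in cb and appends each byte as "%XX" to out.
    // bb must have room for cb.position() * ce.maxBytesPerChar() bytes.
    static void flushEscaped(StringBuilder out, CharBuffer cb, ByteBuffer bb,
                             CharsetEncoder ce) {
        cb.flip();
        ce.reset();
        ce.encode(cb, bb, true); // cannot overflow given the sizing above
        ce.flush(bb);
        bb.flip();
        while (bb.hasRemaining()) {
            int b = bb.get() & 0xFF;
            out.append('%');
            out.append(Character.toUpperCase(Character.forDigit(b >> 4, 16)));
            out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
        }
        cb.clear();
        bb.clear();
    }

    // Characters that URLEncoder passes through unchanged.
    static boolean isSafe(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.' || c == '*';
    }

    static String encode(String s, Charset cs) {
        CharsetEncoder ce = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // One small buffer pair, reused for every chunk.
        CharBuffer cb = CharBuffer.allocate(CHUNK_CHARS);
        ByteBuffer bb = ByteBuffer.allocate(
                (int) Math.ceil(CHUNK_CHARS * ce.maxBytesPerChar()));
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (isSafe(c) || c == ' ') {
                // Emit any pending escaped chars before the literal one.
                if (cb.position() > 0) flushEscaped(out, cb, bb, ce);
                out.append(c == ' ' ? '+' : c);
            } else {
                // Keep a surrogate pair inside a single chunk.
                int needed = Character.isHighSurrogate(c) ? 2 : 1;
                if (cb.remaining() < needed) flushEscaped(out, cb, bb, ce);
                cb.put(c);
            }
        }
        if (cb.position() > 0) flushEscaped(out, cb, bb, ce);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a b\u00FC", StandardCharsets.UTF_8)); // prints a+b%C3%BC
    }
}
```

Sizing `bb` from `maxBytesPerChar()` up front means the flush never has to handle an overflow result, which is the point the "must be able to accept" sentence in the comment is making.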

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15865#discussion_r1334082986