RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers [v5]
ExE Boss
duke at openjdk.org
Fri Sep 22 08:53:10 UTC 2023
On Fri, 22 Sep 2023 08:37:04 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> `URLEncoder` currently appends chars that need encoding into a `java.io.CharArrayWriter`, converts that to a `String`, uses `String::getBytes` to get the encoded bytes and then appends these bytes in an escaped manner to the output stream. This is somewhat inefficient.
>>
>> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in small chunks, with greatly reduced allocation as a result.
>>
>> The exact size of the buffers is an open question, but generally it seems that a tiny buffer wins by virtue of allocating less, and that the per-chunk overheads are relatively small.
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
>
> Add comment, split long line
src/java.base/share/classes/java/net/URLEncoder.java line 299:
> 297: * Encodes input chars in cb and appends the byte values in an escaped format
> 298: * ("%FF") to out. The temporary byte buffer, bb, must be able to accept
> 299: * cb.position() * ce.maxBytesPerChar() bytes.
The following might read better: it marks up the identifiers with `{@code}` and uses `"%XX"` as the placeholder instead of `"%FF"`, since `FF` is specifically 255 in hex:
Suggestion:
* Encodes input chars in {@code cb} and appends the byte values in an escaped
* format ({@code "%XX"}) to {@code out}. The temporary byte buffer, {@code bb},
* must be able to accept {@code cb.position() * ce.maxBytesPerChar()} bytes.
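For illustration, here is a minimal sketch of the contract the comment describes, assuming a UTF-8 encoder and an 8-char chunk. It is not the PR's actual code: the names `EscapeSketch`, `flushEscaped`, `escapeAll` and `HEX` are hypothetical, and unlike `URLEncoder` it escapes every char rather than passing safe ones through.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the code under review.
class EscapeSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /**
     * Encodes the chars written to cb so far and appends each resulting byte
     * to out as "%XX". bb must be able to accept
     * cb.position() * ce.maxBytesPerChar() bytes.
     */
    static void flushEscaped(StringBuilder out, CharsetEncoder ce,
                             CharBuffer cb, ByteBuffer bb) {
        if (cb.position() == 0) {
            return; // nothing buffered
        }
        if (bb.capacity() < cb.position() * ce.maxBytesPerChar()) {
            throw new IllegalArgumentException("byte buffer too small");
        }
        cb.flip();   // switch cb from writing to reading
        bb.clear();
        ce.reset();
        CoderResult cr = ce.encode(cb, bb, true);
        if (!cr.isUnderflow()) {
            throw new IllegalStateException("unexpected coder result: " + cr);
        }
        ce.flush(bb);
        bb.flip();   // switch bb from writing to reading
        while (bb.hasRemaining()) {
            int b = bb.get() & 0xFF;
            out.append('%').append(HEX[b >> 4]).append(HEX[b & 0xF]);
        }
        cb.clear();  // ready for the next chunk
    }

    /** Escapes every char of s, reusing one small buffer pair across chunks. */
    static String escapeAll(String s) {
        CharsetEncoder ce = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer cb = CharBuffer.allocate(8); // assumed chunk size
        ByteBuffer bb = ByteBuffer.allocate((int) (cb.capacity() * ce.maxBytesPerChar()));
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Reserve two slots for a high surrogate so a pair is never split across chunks.
            int needed = Character.isHighSurrogate(c) ? 2 : 1;
            if (cb.remaining() < needed) {
                flushEscaped(out, ce, cb, bb);
            }
            cb.put(c);
        }
        flushEscaped(out, ce, cb, bb);
        return out.toString();
    }
}
```

Sizing `bb` from `maxBytesPerChar()` up front means a flush of a full chunk should never overflow, which is why the sketch treats any result other than underflow as a programming error.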
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/15865#discussion_r1334082986