RFR: 8316681: Rewrite URLEncoder.encode to use small reusable buffers [v6]
温绍锦
duke at openjdk.org
Thu Oct 5 00:32:26 UTC 2023
On Fri, 22 Sep 2023 08:53:07 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> `URLEncoder` currently appends chars that need encoding into a `java.io.CharArrayWriter`, converts that to a `String`, uses `String::getBytes` to get the encoded bytes and then appends these bytes in an escaped manner to the output stream. This is somewhat inefficient.
>>
>> This PR replaces the `CharArrayWriter` with a reusable `CharBuffer` + `ByteBuffer` pair. This allows us to encode to the output `StringBuilder` in small chunks, with greatly reduced allocation as a result.
>>
>> The exact size of the buffers is an open question, but generally it seems that a tiny buffer wins by virtue of allocating less, and that the per chunk overheads are relatively small.
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
>
> Update src/java.base/share/classes/java/net/URLEncoder.java
>
> Co-authored-by: ExE Boss <3889017+ExE-Boss at users.noreply.github.com>
`URLEncoder#DONT_NEED_ENCODING` is a `BitSet`, but it is really used as a lookup table over the ASCII range. Should we consider replacing it with two `long` bit masks, along these lines?
public class URLEncoder {
    // Bit i is set if char i (0 - 63) does not need encoding
    static final long DONT_NEED_ENCODING_FLAGS_0;
    // Bit (i - 64) is set if char i (64 - 127) does not need encoding
    static final long DONT_NEED_ENCODING_FLAGS_1;

    static {
        long flags0 = 0;
        flags0 |= 1L << ' '; // ASCII 32
        flags0 |= 1L << '*'; // ASCII 42
        flags0 |= 1L << '-'; // ASCII 45
        flags0 |= 1L << '.'; // ASCII 46
        // ASCII 48 - 57
        for (int i = '0'; i <= '9'; ++i) {
            flags0 |= 1L << i;
        }
        DONT_NEED_ENCODING_FLAGS_0 = flags0;

        long flags1 = 0;
        // ASCII 65 - 90
        for (int i = 'A'; i <= 'Z'; ++i) {
            flags1 |= 1L << (i - 64);
        }
        flags1 |= 1L << ('_' - 64); // ASCII 95
        // ASCII 97 - 122
        for (int i = 'a'; i <= 'z'; ++i) {
            flags1 |= 1L << (i - 64);
        }
        DONT_NEED_ENCODING_FLAGS_1 = flags1;
    }

    private static boolean dontNeedEncoding(char c) {
        int prefix = c >> 6;
        if (prefix > 1) {
            // Anything outside ASCII (c >= 128) always needs encoding
            return false;
        }
        long flags = prefix == 0 ? DONT_NEED_ENCODING_FLAGS_0 : DONT_NEED_ENCODING_FLAGS_1;
        // A long shift only uses the low 6 bits of the count, so this is 1L << (c & 63)
        return (flags & (1L << c)) != 0;
    }
}
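
To sanity-check the idea, here is a small standalone sketch that builds the same set of characters both as a `BitSet` and as two `long` masks, then compares the two lookups over every `char` value. The class name `DontNeedEncodingCheck` and the helper `firstMismatch` are made up for illustration and are not part of the JDK; the character set is the one listed above.

```java
import java.util.BitSet;

public class DontNeedEncodingCheck {
    // Reference table, BitSet-based (hypothetical stand-in for the current lookup).
    static final BitSet REFERENCE = new BitSet(128);
    // Bit masks for chars 0 - 63 and 64 - 127 respectively.
    static final long FLAGS_0;
    static final long FLAGS_1;

    static {
        String punctuation = " *-._";
        long f0 = 0;
        long f1 = 0;
        for (char c = 0; c < 128; c++) {
            boolean keep = (c >= '0' && c <= '9')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z')
                    || punctuation.indexOf(c) >= 0;
            if (!keep) {
                continue;
            }
            REFERENCE.set(c);
            if (c < 64) {
                f0 |= 1L << c;
            } else {
                f1 |= 1L << (c - 64);
            }
        }
        FLAGS_0 = f0;
        FLAGS_1 = f1;
    }

    // Mask-based lookup, same shape as the proposal above.
    public static boolean dontNeedEncoding(char c) {
        int prefix = c >> 6;
        if (prefix > 1) {
            return false;
        }
        long flags = prefix == 0 ? FLAGS_0 : FLAGS_1;
        return (flags & (1L << (c & 63))) != 0;
    }

    // Returns the first char value on which the two lookups disagree, or -1 if none.
    public static int firstMismatch() {
        for (int i = 0; i <= Character.MAX_VALUE; i++) {
            if (REFERENCE.get(i) != dontNeedEncoding((char) i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("first mismatch: " + firstMismatch());
    }
}
```

If the two representations agree, `firstMismatch()` returns -1. This only checks equivalence; whether the masks are actually faster than the `BitSet` would still need a benchmark.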
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15865#issuecomment-1747842588