RFR: 8314774: Optimize URLEncoder
Claes Redestad
redestad at openjdk.org
Thu Aug 24 09:50:40 UTC 2023
On Wed, 23 Aug 2023 18:51:37 GMT, Daniel Fuchs <dfuchs at openjdk.org> wrote:
> The fast path that just returns the given string if ASCII-only and no encoding looks simple enough. I don't particularly like the idea of embedding the logic of encoding UTF-8 into that class though, that increases the complexity significantly, and Charset encoders are there for that. Also I don't understand the reason for changing BitSet into a boolean array - that seems gratuitous?
One difference between the `BitSet` and the `boolean[]` in this code that is perhaps key for performance is that the latter is `static final @Stable` and thus easy for the JIT to optimize. The `words` array held by a `BitSet` is neither `final` nor `@Stable`, so the JIT likely needs to keep a few extra checks around every access.
An interesting experiment would be to instead model this as a `ConstantBitSet` with a `final @Stable` internal array. This could get most (or all?) of the benefit while keeping things at a higher abstraction level and allowing for some reuse. Retaining the compactness of `BitSet`s is nice too, though that might not be very important for constant bit sets.
The API would need to be worked out, but something like adding a public method `BitSet::asConstant` and hiding away the details might be a good starting point:
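To make the comparison concrete, here is a minimal sketch of the two lookup styles side by side. The class and field names are hypothetical, the character set is only an illustrative approximation of what `URLEncoder` leaves unencoded, and `@Stable` is omitted because it is a JDK-internal annotation that is not usable from application code:

```java
import java.util.BitSet;

// Hypothetical sketch; not the code from the PR.
final class DontNeedEncodingSketch {
    // Flat table held in a static final field (in the JDK this would also carry @Stable).
    private static final boolean[] TABLE = new boolean[128];
    // Same information in a BitSet; its internal long[] is a plain mutable field.
    private static final BitSet BITS = new BitSet(128);

    static {
        for (char c = 'a'; c <= 'z'; c++) mark(c);
        for (char c = 'A'; c <= 'Z'; c++) mark(c);
        for (char c = '0'; c <= '9'; c++) mark(c);
        for (char c : "-_.*".toCharArray()) mark(c);
    }

    private static void mark(char c) {
        TABLE[c] = true;
        BITS.set(c);
    }

    // One bounds check plus one array load.
    static boolean viaArray(char c) {
        return c < TABLE.length && TABLE[c];
    }

    // Goes through BitSet.get, which reads the non-final words array.
    static boolean viaBitSet(char c) {
        return BITS.get(c);
    }

    public static void main(String[] args) {
        for (char c : "a b-%".toCharArray()) {
            System.out.println("'" + c + "' array=" + viaArray(c) + " bitset=" + viaBitSet(c));
        }
    }
}
```

Both methods return the same answers; any performance difference would come from how well the JIT can optimize each access, which would need to be confirmed with a benchmark.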
    public BitSet asConstant() {
        return new ConstantBitSet(this);
    }

    private static class ConstantBitSet extends BitSet {
        private @Stable final long[] words;

        private ConstantBitSet(BitSet bitSet) {
            // Arrays.copyOf needs an explicit length; take a full defensive copy
            words = Arrays.copyOf(bitSet.words, bitSet.words.length);
        }
        // override all BitSet methods, make mutating methods throw (IllegalStateException?)
        // -- for a public API perhaps extract an interface
    }
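For experimenting outside of `java.util`, a standalone approximation of the same idea could look like the sketch below. It uses composition instead of subclassing, since `BitSet.words` is private, and it cannot use the JDK-internal `@Stable` annotation, so it only illustrates the shape of the API rather than the JIT benefit. The class and method names are hypothetical:

```java
import java.util.BitSet;

// Hypothetical sketch of an immutable, read-only bit set.
final class ConstantBitSetSketch {
    // Never written after construction; inside the JDK this would also be @Stable.
    private final long[] words;

    private ConstantBitSetSketch(long[] words) {
        this.words = words;
    }

    // BitSet.toLongArray() returns a fresh copy, so later changes to 'bits' are not seen.
    static ConstantBitSetSketch copyOf(BitSet bits) {
        return new ConstantBitSetSketch(bits.toLongArray());
    }

    boolean get(int bitIndex) {
        if (bitIndex < 0) {
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);
        }
        int wordIndex = bitIndex >> 6; // 64 bits per long
        // A long shift uses only the low 6 bits of bitIndex, selecting the bit within the word.
        return wordIndex < words.length && (words[wordIndex] & (1L << bitIndex)) != 0;
    }

    public static void main(String[] args) {
        BitSet source = new BitSet();
        source.set('a');
        ConstantBitSetSketch constant = copyOf(source);
        source.clear('a'); // does not affect the constant copy
        System.out.println(constant.get('a'));
    }
}
```

Exposing no mutators at all sidesteps the question of which exception the mutating methods should throw, at the cost of not being usable where a `BitSet` is expected.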
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15354#issuecomment-1691364488
More information about the net-dev mailing list