Request for Comments: Adding bulk-read method "CharSequence.getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)"

Chen Liang liangchenblue at gmail.com
Tue Nov 5 12:15:45 UTC 2024


Thanks Markus, let's continue the API discussion here.

I do believe that allowing a batch copy into an array is a good idea.
The JDK's CharSequence can provide a safe "ranged copy from source into
destination" operation.  However, users must be aware that a malicious
CharSequence implementation may retain a reference to the array passed
to it; users who want to store the result should store a copy of that array.
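
To illustrate the concern, here is a minimal sketch. BulkCharSequence,
LeakySequence and readDefensively are hypothetical names made up for this
example; the interface only stands in for the proposed method and is not
meant as the final API.

```java
import java.util.Arrays;

final class GetCharsAliasingDemo {

    // Hypothetical stand-in for the proposed CharSequence bulk-read method.
    interface BulkCharSequence extends CharSequence {
        void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin);
    }

    // A hostile implementation: it copies correctly, but keeps the caller's array.
    static final class LeakySequence implements BulkCharSequence {
        private final String value;
        char[] retained; // reference to the caller's buffer, kept for later tampering

        LeakySequence(String value) { this.value = value; }

        @Override public int length() { return value.length(); }
        @Override public char charAt(int index) { return value.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return value.subSequence(start, end);
        }
        @Override public String toString() { return value; }

        @Override
        public void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) {
            value.getChars(srcBegin, srcEnd, dst, dstBegin);
            retained = dst; // the "malicious" part
        }
    }

    // Caller-side defence: hand out a scratch array, then keep only a private copy.
    static char[] readDefensively(BulkCharSequence untrusted) {
        char[] scratch = new char[untrusted.length()];
        untrusted.getChars(0, scratch.length, scratch, 0);
        return Arrays.copyOf(scratch, scratch.length);
    }

    public static void main(String[] args) {
        LeakySequence evil = new LeakySequence("secret");
        char[] safe = readDefensively(evil);
        evil.retained[0] = 'X'; // tampering through the retained reference
        System.out.println(new String(safe)); // prints "secret"
    }
}
```

The extra Arrays.copyOf is the cost of not trusting the implementation; a
caller that discards the scratch array right after use does not need it.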

This is acceptable for Reader.of, since arbitrary readers should never be
trusted anyway.  However, I think it could affect many other uses of
getChars where users pass a trusted char array into such a method.

-Chen

On Sun, Oct 27, 2024 at 3:44 AM Markus Karg <markus at headcrashing.eu> wrote:

> >Hi Markus,
>
> >Should we drop the srcBegin/srcEnd parameters, as they can be replaced by
> a subSequence(srcBegin, srcEnd) call?
>
> Chen, I do understand your idea and while originally I had the same in
> mind (it really *is* appealing!), I came up with a draft using the
> *original* String.getChars() signature instead, due to the following
> drawbacks:
>
>    - There might exist (possibly lots of) CharSequence.getChars(int, int,
>    char[], int) implementations already, as this problem (and the idea
>    how to solve it) is anything but new. At least such implementations are
>    String, StringBuilder and StringBuffer. If we come up with a different
>    signature, then *none* of these already existing performance boosters
>    will get used by Reader.of(CharSequence) automatically - at least
>    until they come up with alias methods. Effectively this leads to (possibly
>    lots of) alias methods. At least it leads to alias methods in String,
>    StringBuilder, StringBuffer and CharBuffer. In contrast, when keeping
>    the signature copied from String.getChars, chances are good that
>    (possibly lots of) implementations will *instantly* be supported by
>    Reader.of(CharSequence) without alias methods. At least, String,
>    StringBuilder and StringBuffer will be.
>    - For decades people have been used to StringBuilder.getChars(int,
>    int, char[], int), so (possibly a lot of) people might simply *expect*
>    us to come up with that lengthy signature. These people might be rather
>    confused (not to say frustrated) when we now force them to write an
>    intermediate subSequence(int, int) for something that was "so
>    simple" before.
>    - Custom implementations of CharSequence.subSequence could come up
>    with the (performance-wise "bad") idea of creating *copies* instead of
>    views. At least it seems like AbstractStringBuilder is doing that, so
>    chances are "good" that custom libs will do that, too. For example, because
>    they need it for safety. Or possibly, because they have a technical reason
>    that *enforces* a copy. That would (possibly massively, depending on
>    the actual class) spoil the idea of performance-boosting this RFC is all
>    about.
>
> -Markus
>
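
A rough sketch of the third point in the list above, relying only on the
documented behaviour of StringBuilder: subSequence there returns an
independent copy, while getChars copies the requested range straight into
the destination.  RangedCopyDemo and its two helper methods are hypothetical
names used for illustration.

```java
final class RangedCopyDemo {

    // Direct ranged copy using the existing StringBuilder.getChars signature.
    static char[] copyDirect(StringBuilder src, int begin, int end) {
        char[] dst = new char[end - begin];
        src.getChars(begin, end, dst, 0);
        return dst;
    }

    // Alternative shape: take a subSequence first, then copy char by char.
    static char[] copyViaSubSequence(CharSequence src, int begin, int end) {
        CharSequence sub = src.subSequence(begin, end); // may allocate a copy
        char[] dst = new char[sub.length()];
        for (int i = 0; i < dst.length; i++) {
            dst[i] = sub.charAt(i);
        }
        return dst;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hello world");
        CharSequence sub = sb.subSequence(0, 5);
        sb.setCharAt(0, 'J'); // mutate the builder after taking the subsequence

        System.out.println(sub);                                        // hello
        System.out.println(new String(copyDirect(sb, 0, 5)));           // Jello
        System.out.println(new String(copyViaSubSequence(sb, 6, 11)));  // world
    }
}
```

The first line of output stays "hello" because the subsequence is a copy
rather than a view, which is the extra allocation the ranged signature avoids.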