RFC: 8356679: Using CharSequence::getChars internally
Markus KARG
markus at headcrashing.eu
Tue May 13 09:05:59 UTC 2025
Thank you, Roger.
Actually the method helps in the "toString()" variants, too, as in some
places we could *get rid* of "toString()" (which is more work than
"just" a buffer due to the added compression complexity).
In fact, I already took the time to rewrite *all* of them while waiting
for the approval of this list posting. In *all* cases *less* buffering /
copying is needed, and *less* "toString()" conversion (which is a copy
under the hood) is needed. So if I were allowed to show the code as
a PR, it would be much easier to explain and discuss.
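
To illustrate the kind of change meant here, the following is only a
minimal sketch, not the actual patch. The class BulkAppendSketch, its
method names, and the chunk size of 512 are hypothetical. So that it
also compiles on JDKs before 25, the bulk read goes through the
long-existing String and StringBuilder getChars methods; with the new
API, the private fill helper would presumably collapse into a single
csq.getChars(from, to, dst, 0) call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper for illustration only; not code from the JDK or from any PR.
final class BulkAppendSketch {

    // Assumed chunk size; any small fixed size would do for this sketch.
    private static final int CHUNK = 512;

    // Char-by-char: needs no extra buffer, but costs one charAt call
    // and one write call per character.
    static void appendCharByChar(Writer out, CharSequence csq, int start, int end) {
        try {
            for (int i = start; i < end; i++) {
                out.write(csq.charAt(i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bulk: copy up to CHUNK chars into a reusable array, then write that
    // array in one call. Assumes 0 <= start <= end <= csq.length().
    static void appendInChunks(Writer out, CharSequence csq, int start, int end) {
        char[] chunk = new char[Math.min(end - start, CHUNK)];
        try {
            for (int pos = start; pos < end; ) {
                int n = Math.min(chunk.length, end - pos);
                fill(csq, pos, pos + n, chunk);
                out.write(chunk, 0, n);
                pos += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for the bulk read. Uses the getChars methods that String and
    // StringBuilder have always had, and falls back to charAt otherwise.
    private static void fill(CharSequence csq, int from, int to, char[] dst) {
        if (csq instanceof String) {
            ((String) csq).getChars(from, to, dst, 0);
        } else if (csq instanceof StringBuilder) {
            ((StringBuilder) csq).getChars(from, to, dst, 0);
        } else {
            for (int i = from; i < to; i++) {
                dst[i - from] = csq.charAt(i);
            }
        }
    }
}
```

Both methods produce the same output. The difference is that the
chunked variant neither calls toString() on the sequence nor issues
one write per character, at the price of one small temporary array.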
A PR is the best place to discuss "how the code would change". In the
worst case, let's drop it if we see that it is actually a bad thing.
-Markus
On 12.05.2025 at 20:18, Roger Riggs wrote:
> Hi Markus,
>
> On the surface, it looks constructive.
> I suspect that many of these cases will turn into discussions about
> the right/best/better way to buffer the characters.
> The getChars method only helps when extracting to a char array; many
> of the current implementations create strings as the intermediary. The
> advantage of the one-character-at-a-time technique is not needing a
> (separately allocated) buffer.
> Consider taking a few at a time before launching into the whole set.
>
> $.02, Roger
>
> On 5/11/25 2:45 AM, Markus KARG wrote:
>> Dear Core Libs Team,
>>
>> I am hereby requesting comments on JDK-8356679.
>>
>> I would like to invest some time and set up a PR implementing Chen
>> Liang's proposal laid out in
>> https://bugs.openjdk.org/browse/JDK-8356679. For your convenience,
>> the text of that JBS is copied below. According to the Developer's
>> Guide I do need to get broad agreement BEFORE filing a PR. Therefore,
>> I kindly ask everybody to briefly show consent, so I may file a PR.
>>
>> Thanks
>> -Markus
>>
>>
>> Copy from https://bugs.openjdk.org/browse/JDK-8356679:
>>
>> Recently OpenJDK adopted the new method CharSequence::getChars(int,
>> int, char[], int) for inclusion in Java 25. As a bulk reader method,
>> it allows potentially improved efficiency over the previously
>> available char-by-char reader method CharSequence::charAt(int).
>>
>> Chen Liang suggested on March 23rd on the core-libs-dev mailing list
>> to use the new method within the internal source code of OpenJDK for
>> the implementation of Appendables (see
>> https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141521.html).
>> The idea behind this is that the implementations might be more
>> efficient then.
>>
>> A quick analysis of the OpenJDK source code identified (at least) the
>> following classes which could potentially run more efficiently when
>> using CharSequence::getChars internally, thanks to bulk reading and /
>> or prevention of internal copies / toString() conversions:
>> * java.io.Writer
>> * java.io.StringWriter
>> * java.io.PrintWriter
>> * java.io.BufferedWriter
>> * java.io.CharArrayWriter
>> * java.io.FileWriter
>> * java.io.OutputStreamWriter
>> * sun.nio.cs.StreamEncoder
>> * java.io.PrintStream
>> * java.nio.CharBuffer
>>
>> In the sense of "eat your own dog food", it makes sense to implement
>> Chen's idea in (at least) those classes. Possibly more classes could
>> get identified when taking a deeper look. Besides the potential
>> efficiency improvements, it would be a good showcase for the usage
>> of the new API.
>>
>> The risk of this change should be low, as test coverage exists, and
>> as the intended changes are solely internal to the implementation. No
>> API will get changed. In some cases the JavaDocs will get slightly
>> adapted where they currently expose the actual implementation (so
>> that they remain accurate in the future).
>>
>