RFC: 8356679: Using CharSequence::getChars internally
Roger Riggs
roger.riggs at oracle.com
Tue May 13 13:10:40 UTC 2025
Hi Markus,
A main point was to avoid trying to do everything at once.
Otherwise the PR comments become intermingled and hard to follow, and it takes
longer to get agreement because of the thrash in the PR.
Roger
On 5/13/25 5:05 AM, Markus KARG wrote:
> Thank you, Roger.
>
> Actually the method helps in the "toString()" variants, too, as in
> some places we could *get rid* of "toString()" (which is more work
> than "just" a buffer due to the added compression complexity).
>
> In fact, I already took the time to rewrite *all* of them while
> waiting for the approval of this list posting. In *all* cases *less*
> buffering / copying is needed, and *less* "toString()" conversion
> (which is a copy under the hood) is needed. So if I were allowed
> to show the code as a PR, it would be much easier to explain and discuss.
>
> A PR is the best place to discuss "how the code would change". In the
> worst case, let's drop it if we see that it is actually a bad thing.
>
> -Markus
>
>
> On 12.05.2025 at 20:18, Roger Riggs wrote:
>> Hi Markus,
>>
>> On the surface, it looks constructive.
>> I suspect that many of these cases will turn into discussions about
>> the right/best/better way to buffer the characters.
>> The getChars method only helps when extracting to a char array; many
>> of the current implementations create strings as the intermediary.
>> The advantage of the one-character-at-a-time technique is not needing a
>> (separately allocated) buffer.
>> Consider taking a few at a time before launching into the whole set.
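
The trade-off described above can be sketched with three ways of pushing a CharSequence into a Writer. This is only an illustration under stated assumptions, not the code of any OpenJDK class: the class name, the method names and the `fill` helper are hypothetical, and `fill` merely stands in for a bulk read such as CharSequence::getChars so that the sketch also compiles on releases before Java 25.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class AppendStrategiesSketch {

    // (1) One character at a time: no extra buffer, but one write call per char.
    static void appendCharByChar(Writer out, CharSequence csq) throws IOException {
        for (int i = 0; i < csq.length(); i++) {
            out.write(csq.charAt(i));
        }
    }

    // (2) String intermediary: toString() may copy the whole sequence first.
    static void appendViaString(Writer out, CharSequence csq) throws IOException {
        out.write(csq.toString());
    }

    // (3) Chunked char[] buffer: bounded scratch space, one write call per chunk.
    // Assumes buf.length > 0.
    static void appendChunked(Writer out, CharSequence csq, char[] buf) throws IOException {
        int len = csq.length();
        int pos = 0;
        while (pos < len) {
            int n = Math.min(buf.length, len - pos);
            fill(csq, pos, pos + n, buf);
            out.write(buf, 0, n);
            pos += n;
        }
    }

    // Hypothetical stand-in for a bulk read into buf[0 .. end - begin).
    static void fill(CharSequence csq, int begin, int end, char[] buf) {
        if (csq instanceof StringBuilder) {
            ((StringBuilder) csq).getChars(begin, end, buf, 0);
        } else {
            for (int i = begin; i < end; i++) {
                buf[i - begin] = csq.charAt(i);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        CharSequence csq = new StringBuilder("getChars");
        StringWriter a = new StringWriter();
        StringWriter b = new StringWriter();
        StringWriter c = new StringWriter();
        appendCharByChar(a, csq);
        appendViaString(b, csq);
        appendChunked(c, csq, new char[3]); // deliberately small buffer to force several chunks
        System.out.println(a + " " + b + " " + c);
    }
}
```

All three variants write the same characters; they differ only in how much intermediate storage is allocated and how many write calls are made, which is the kind of question each class would probably need to settle individually.
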
>>
>> $.02, Roger
>>
>> On 5/11/25 2:45 AM, Markus KARG wrote:
>>> Dear Core Libs Team,
>>>
>>> I am hereby requesting comments on JDK-8356679.
>>>
>>> I would like to invest some time and set up a PR implementing Chen
>>> Liang's proposal laid out in
>>> https://bugs.openjdk.org/browse/JDK-8356679. For your convenience,
>>> the text of that JBS is copied below. According to the Developer's
>>> Guide I do need to get broad agreement BEFORE filing a PR.
>>> Therefore, I kindly ask everybody to briefly show consent, so I may
>>> file a PR.
>>>
>>> Thanks
>>> -Markus
>>>
>>>
>>> Copy from https://bugs.openjdk.org/browse/JDK-8356679:
>>>
>>> Recently OpenJDK adopted the new method CharSequence::getChars(int,
>>> int, char[], int) for inclusion in Java 25. As a bulk reader method,
>>> it allows potentially improved efficiency over the previously
>>> available char-by-char reader method CharSequence::charAt(int).
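
For illustration, the difference between the two access styles can be sketched as follows. This is a minimal sketch, not code from the JBS issue or from OpenJDK: the class `BulkCopySketch` and both helper methods are hypothetical, and the bulk variant relies on the long-standing `getChars` methods of String, StringBuilder and StringBuffer so that it also compiles on releases before Java 25.

```java
public class BulkCopySketch {

    // Char-by-char copy: one charAt call per character, works for any CharSequence.
    static void copyCharByChar(CharSequence src, int srcBegin, int srcEnd,
                               char[] dst, int dstBegin) {
        for (int i = srcBegin; i < srcEnd; i++) {
            dst[dstBegin++] = src.charAt(i);
        }
    }

    // Bulk copy: delegates to the type-specific getChars methods where available.
    // Assumption: on Java 25 and later, this dispatch could be replaced by a single
    // call to src.getChars(srcBegin, srcEnd, dst, dstBegin) on the interface.
    static void copyBulk(CharSequence src, int srcBegin, int srcEnd,
                         char[] dst, int dstBegin) {
        if (src instanceof String) {
            ((String) src).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else if (src instanceof StringBuilder) {
            ((StringBuilder) src).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else if (src instanceof StringBuffer) {
            ((StringBuffer) src).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else {
            // Unknown implementation: fall back to the per-character loop.
            copyCharByChar(src, srcBegin, srcEnd, dst, dstBegin);
        }
    }

    public static void main(String[] args) {
        CharSequence src = new StringBuilder("core-libs-dev");
        char[] a = new char[4];
        char[] b = new char[4];
        copyCharByChar(src, 5, 9, a, 0);
        copyBulk(src, 5, 9, b, 0);
        System.out.println(new String(a) + " " + new String(b)); // prints "libs libs"
    }
}
```

Both helpers produce the same result; whether the bulk form is actually faster depends on the concrete CharSequence implementation and would need to be measured.
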
>>>
>>> Chen Liang suggested on March 23rd on the core-libs-dev mailing list
>>> to use the new method within the internal source code of OpenJDK for
>>> the implementation of Appendables (see
>>> https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141521.html).
>>> The idea behind this is that the implementations might be more
>>> efficient then.
>>>
>>> A quick analysis of the OpenJDK source code identified (at least)
>>> the following classes which could potentially run more efficiently
>>> when using CharSequence::getChars internally, thanks to bulk reading
>>> and / or prevention of internal copies / toString() conversions:
>>> * java.io.Writer
>>> * java.io.StringWriter
>>> * java.io.PrintWriter
>>> * java.io.BufferedWriter
>>> * java.io.CharArrayWriter
>>> * java.io.FileWriter
>>> * java.io.OutputStreamWriter
>>> * sun.nio.cs.StreamEncoder
>>> * java.io.PrintStream
>>> * java.nio.CharBuffer
>>>
>>> In the sense of "eat your own dog food", it makes sense to implement
>>> Chen's idea in (at least) those classes. Possibly more classes could
>>> get identified when taking a deeper look. Besides the potential
>>> efficiency improvements, it would be a good showcase for the usage
>>> of the new API.
>>>
>>> The risk of this change should be low, as test coverage exists, and
>>> as the intended changes are solely internal to the implementation.
>>> No API will get changed. In some cases the JavaDocs will get
>>> slightly adapted where it currently exposes the actual
>>> implementation (so that it stays accurate in the future).
>>>
>>