RFC: 8356679: Using CharSequence::getChars internally

Markus KARG markus at headcrashing.eu
Wed May 14 10:48:30 UTC 2025


Many of the modified classes derive from a common super class and share 
one needed common change (which is one of the points which are easy to 
see once you see all of those classes in a single PR, but hard to 
explain in plaint-text pre-PR mailing list threads), so at least those 
need to be discussed *together*. But to spare JBS and PRs, I can open 
the PR with just the first set of changes, and once we agree that this 
set is fine, I can push the next commit *in the same PR*. Otherwise we 
would need endless JBS, mailing list threads, and PRs, just to fixe a 
dozen internal code lines.

Having said that, does the current state of this thread count as 
"reached common agreement to file a PR" or do I still have to wait until 
more people chime in?

-Markus


Am 13.05.2025 um 15:10 schrieb Roger Riggs:
> Hi Markus,
>
> A main point was to avoid trying to do everything at once.
> The PR comments become hard to follow and intermingled and it takes 
> longer to get agreement because of the thrash in the PR.
>
> Roger
>
> On 5/13/25 5:05 AM, Markus KARG wrote:
>> Thank you, Roger.
>>
>> Actually the method helps in the "toString()" variants, too, as in 
>> some places we could *get rid* of "toString()" (which is more work 
>> than "just" a buffer due to the added compression complexity).
>>
>> In fact, I already took the time to rewrite *all* of them while 
>> waiting for the approval of this list posting. In *all* cases *less* 
>> buffering / copying is needed, and *less* "toString()" conversion 
>> (which is a copy under the hood) is needed. So if I would be allowed 
>> to show the code as a PR, it would be much easier to explain and 
>> discuss.
>>
>> A PR is the best place to discuss "how to code would change". In the 
>> worst case, let's drop it if we see that it is actually a bad thing.
>>
>> -Markus
>>
>>
>> Am 12.05.2025 um 20:18 schrieb Roger Riggs:
>>> Hi Markus,
>>>
>>> On the surface, its looks constructive.
>>> I suspect that many of these cases will turn into discussions about 
>>> the right/best/better way to buffer the characters.
>>> The getChars method only helps when extracting to a char array, many 
>>> of the current implementations create strings as the intermediary. 
>>> The advantage of the 1 character at a time technique is not needing 
>>> a (separated allocated) buffer.
>>> Consider taking a few at a time before launching into the whole set.
>>>
>>> $.02, Roger
>>>
>>> On 5/11/25 2:45 AM, Markus KARG wrote:
>>>> Dear Core Libs Team,
>>>>
>>>> I am hereby requesting comments on JDK-8356679.
>>>>
>>>> I would like to invest some time and set up a PR implementing Chen 
>>>> Liangs's proposal laid out in 
>>>> https://bugs.openjdk.org/browse/JDK-8356679. For your convenience, 
>>>> the text of that JBS is copied below. According to the Developer's 
>>>> Guide I do need to get broad agreement BEFORE filing a PR. 
>>>> Therefore, I kindly ask everybody to briefly show consent, so I may 
>>>> file a PR.
>>>>
>>>> Thanks
>>>> -Markus
>>>>
>>>>
>>>> Copy from https://bugs.openjdk.org/browse/JDK-8356679:
>>>>
>>>> Recently OpenJDK adopted the new method CharSequence::getChars(int, 
>>>> int, char[], int) for inclusion in Java 25. As a bulk reader 
>>>> method, it allows potentially improved efficiency over the 
>>>> previously available char-by-char reader method 
>>>> CharSequence::charAt(int).
>>>>
>>>> Chen Liang suggested on March 23rd on the core-lib-dev mailing list 
>>>> to use the new method within the internal source code of OpenJDK 
>>>> for the implementation of Appendables (see 
>>>> https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141521.html). 
>>>> The idea behind this is that the implementations might be more 
>>>> efficient then.
>>>>
>>>> A quick analysis of the OpenJDK source code identified (at least) 
>>>> the following classes which could potentially run more efficient 
>>>> when using CharSequence::getChars internally, thanks to bulk 
>>>> reading and / or prevention of internal copies / toString() 
>>>> conversions:
>>>> * java.io.Writer
>>>> * java.io.StringWriter
>>>> * java.io.PrintWriter
>>>> * java.io.BufferedWriter
>>>> * java.io.CharArrayWriter
>>>> * java.io.FileWriter
>>>> * java.io.OutputStreamWriter
>>>> * sun.nio.cs.StreamEncoder
>>>> * java.io.PrintStream
>>>> * java.nio.CharBuffer
>>>>
>>>> In the sense of "eat your own dog food", it makes sense to 
>>>> implement Chen's idea in (at least) those classes. Possibly more 
>>>> classes could get identified when taking a deeper look. Besides the 
>>>> potential efficiency improvements, it would be a good show case for 
>>>> the usage of the new API.
>>>>
>>>> The risk of this change should be low, as test coverage exists, and 
>>>> as the intended changes are solely internal to the implementation. 
>>>> No API will get changed. In some cases the JavaDocs will get 
>>>> slightly adapted where it currently exposes the actual 
>>>> implementation (to not lie in future).
>>>>
>>>
>


More information about the core-libs-dev mailing list