RFC: 8356679: Using CharSequence::getChars internally

Markus KARG markus at headcrashing.eu
Sun May 18 15:09:43 UTC 2025


Roger,

thank you for your comments.

Following your advice I have splitted the larger work of JDK-8356679 
into sub tasks.

I would like to start with a first PR implementing the *foundational* 
work, i. e. optimizing Writer::append for efficiency (JDK-8357183). For 
convenience, attached below is a copy of the description.

Comments Welcome!

If everybody is fine with this, I would be happy to get some +1 to 
publish the PR for JDK-8357183!

Am 14.05.2025 um 15:57 schrieb Roger Riggs:
> Hi Markus,
>
> Starting out with the common case is a good idea for the first PR.
>
> I much prefer a PR with a single goal and that comes to a conclusion 
> and does not add new features or changes after the PR is submitted.
> I tend to lose interest in PRs with lots of churn, it means I have to 
> re-review the bulk of it when there is a change and may wait days to 
> let it settle down before coming back to it.
>
> I tend to think the PR was not really ready to be reviewed if simple 
> issues and corrections have to be made frequently.
> Do your own checking for typos and copyrights and simple refactoring 
> before opening the PR.
> Quality before quantity or speed.
>
> I'm fine with separate Jira issues that clearly delineate a specific 
> scope and goal.
>
> The title of this issue (8356679) doesn't identify the real goal.
> It seems to be to improve performance or memory usage, not just to use 
> a new API.
>
> These are my personal opinions about contributions and process.
>
> Regards, Roger
>
>
> On 5/14/25 6:48 AM, Markus KARG wrote:
>> Many of the modified classes derive from a common super class and 
>> share one needed common change (which is one of the points which are 
>> easy to see once you see all of those classes in a single PR, but 
>> hard to explain in plaint-text pre-PR mailing list threads), so at 
>> least those need to be discussed *together*. But to spare JBS and 
>> PRs, I can open the PR with just the first set of changes, and once 
>> we agree that this set is fine, I can push the next commit *in the 
>> same PR*. Otherwise we would need endless JBS, mailing list threads, 
>> and PRs, just to fixe a dozen internal code lines.
>>
>> Having said that, does the current state of this thread count as 
>> "reached common agreement to file a PR" or do I still have to wait 
>> until more people chime in?
>>
>> -Markus
>>
>>
>> Am 13.05.2025 um 15:10 schrieb Roger Riggs:
>>> Hi Markus,
>>>
>>> A main point was to avoid trying to do everything at once.
>>> The PR comments become hard to follow and intermingled and it takes 
>>> longer to get agreement because of the thrash in the PR.
>>>
>>> Roger
>>>
>>> On 5/13/25 5:05 AM, Markus KARG wrote:
>>>> Thank you, Roger.
>>>>
>>>> Actually the method helps in the "toString()" variants, too, as in 
>>>> some places we could *get rid* of "toString()" (which is more work 
>>>> than "just" a buffer due to the added compression complexity).
>>>>
>>>> In fact, I already took the time to rewrite *all* of them while 
>>>> waiting for the approval of this list posting. In *all* cases 
>>>> *less* buffering / copying is needed, and *less* "toString()" 
>>>> conversion (which is a copy under the hood) is needed. So if I 
>>>> would be allowed to show the code as a PR, it would be much easier 
>>>> to explain and discuss.
>>>>
>>>> A PR is the best place to discuss "how to code would change". In 
>>>> the worst case, let's drop it if we see that it is actually a bad 
>>>> thing.
>>>>
>>>> -Markus
>>>>
>>>>
>>>> Am 12.05.2025 um 20:18 schrieb Roger Riggs:
>>>>> Hi Markus,
>>>>>
>>>>> On the surface, its looks constructive.
>>>>> I suspect that many of these cases will turn into discussions 
>>>>> about the right/best/better way to buffer the characters.
>>>>> The getChars method only helps when extracting to a char array, 
>>>>> many of the current implementations create strings as the 
>>>>> intermediary. The advantage of the 1 character at a time technique 
>>>>> is not needing a (separated allocated) buffer.
>>>>> Consider taking a few at a time before launching into the whole set.
>>>>>
>>>>> $.02, Roger
>>>>>
>>>>> On 5/11/25 2:45 AM, Markus KARG wrote:
>>>>>> Dear Core Libs Team,
>>>>>>
>>>>>> I am hereby requesting comments on JDK-8356679.
>>>>>>
>>>>>> I would like to invest some time and set up a PR implementing 
>>>>>> Chen Liangs's proposal laid out in 
>>>>>> https://bugs.openjdk.org/browse/JDK-8356679. For your 
>>>>>> convenience, the text of that JBS is copied below. According to 
>>>>>> the Developer's Guide I do need to get broad agreement BEFORE 
>>>>>> filing a PR. Therefore, I kindly ask everybody to briefly show 
>>>>>> consent, so I may file a PR.
>>>>>>
>>>>>> Thanks
>>>>>> -Markus
>>>>>>
>>>>>>
>>>>>> Copy from https://bugs.openjdk.org/browse/JDK-8356679:
>>>>>>
>>>>>> Recently OpenJDK adopted the new method 
>>>>>> CharSequence::getChars(int, int, char[], int) for inclusion in 
>>>>>> Java 25. As a bulk reader method, it allows potentially improved 
>>>>>> efficiency over the previously available char-by-char reader 
>>>>>> method CharSequence::charAt(int).
>>>>>>
>>>>>> Chen Liang suggested on March 23rd on the core-lib-dev mailing 
>>>>>> list to use the new method within the internal source code of 
>>>>>> OpenJDK for the implementation of Appendables (see 
>>>>>> https://mail.openjdk.org/pipermail/core-libs-dev/2025-March/141521.html). 
>>>>>> The idea behind this is that the implementations might be more 
>>>>>> efficient then.
>>>>>>
>>>>>> A quick analysis of the OpenJDK source code identified (at least) 
>>>>>> the following classes which could potentially run more efficient 
>>>>>> when using CharSequence::getChars internally, thanks to bulk 
>>>>>> reading and / or prevention of internal copies / toString() 
>>>>>> conversions:
>>>>>> * java.io.Writer
>>>>>> * java.io.StringWriter
>>>>>> * java.io.PrintWriter
>>>>>> * java.io.BufferedWriter
>>>>>> * java.io.CharArrayWriter
>>>>>> * java.io.FileWriter
>>>>>> * java.io.OutputStreamWriter
>>>>>> * sun.nio.cs.StreamEncoder
>>>>>> * java.io.PrintStream
>>>>>> * java.nio.CharBuffer
>>>>>>
>>>>>> In the sense of "eat your own dog food", it makes sense to 
>>>>>> implement Chen's idea in (at least) those classes. Possibly more 
>>>>>> classes could get identified when taking a deeper look. Besides 
>>>>>> the potential efficiency improvements, it would be a good show 
>>>>>> case for the usage of the new API.
>>>>>>
>>>>>> The risk of this change should be low, as test coverage exists, 
>>>>>> and as the intended changes are solely internal to the 
>>>>>> implementation. No API will get changed. In some cases the 
>>>>>> JavaDocs will get slightly adapted where it currently exposes the 
>>>>>> actual implementation (to not lie in future).
>>>>>>
>>>>>
>>>
>


More information about the core-libs-dev mailing list