java.io.Writer uses CharSequence.toString()

Pavel Rappo pavel.rappo at oracle.com
Sat Jul 30 13:31:28 UTC 2016


Could you please prototype what you suggest in code so we could discuss it more
constructively? Otherwise I feel this discussion is getting too broad and as
such may not achieve anything in particular.

Thanks,
-Pavel

> On 30 Jul 2016, at 10:03, Fabian Lange <lange.fabian at gmail.com> wrote:
> 
> Hi,
> so why did you guys invent CharSequence as an API if it cannot be used?
> I kind of understand why people use unsafe and come up with their own
> character data implementations.
> 
> So you prefer multi megabyte string allocations including their arraycopy,
> (java.lang.StringBuilder.toString())
> 
>    public String toString() {
>        // Create a copy, don't share the array
>        return new String(value, 0, count);
>    }
> 
> which then will arraycopy again this multimegabyte char array
> java.io.Writer.write(String, int, int)
> 
>            } else {    // Don't permanently allocate very large buffers.
>                cbuf = new char[len];
>            }
>            str.getChars(off, (off + len), cbuf, 0);
>            write(cbuf, 0, len);
> 
> over an Implementation which doesn't do that?
> 
> Fabian
> 
> On Sat, Jul 30, 2016 at 1:23 AM, Brent Christian
> <brent.christian at oracle.com> wrote:
>> Hi,
>> 
>> This idea has been brought up before [1].
>> 
>> I concur with Pavel's assessment.  I would add that now that Latin-1 Strings
>> are stored in a more compact form in JDK 9 ("Compact Strings" [2]), the
>> performance profile of string data is further complicated.
>> 
>> Thanks,
>> -Brent
>> 
>> 1. https://bugs.openjdk.java.net/browse/JDK-6206838
>> 2. https://bugs.openjdk.java.net/browse/JDK-8054307
>> 
>> On 07/29/2016 10:21 AM, Pavel Rappo wrote:
>>> 
>>> Once again, while I agree that in some places it could probably have been
>>> done a bit better, I would say it's good to have a look at benchmarks first.
>>> 
>>> If they show there's indeed a big difference between
>>> 
>>>    char[] copy = new char[charSequence.length()];
>>>    String s = charSequence.toString();
>>>    s.getChars(0, s.length(), copy, 0);
>>> 
>>> and
>>> 
>>>    char[] copy = new char[charSequence.length()];
>>>    charSequence.getChars(0, charSequence.length(), copy, 0);
>>> 
>>> it could justify an increase in complexity of CharBuffer.append or
>>> introducing a new default method (getChars/fillInto) into CharSequence.
>>> Possibly. Or maybe not. Because there might be some nontrivial effects we
>>> are completely unaware of.
>>> 
>>> Btw, what do you mean by "extract char[]" from StringBuilder? Do you want
>>> StringBuilder to give away a reference to its char[] outside? If not, then
>>> what's the difference between "extract char[]" from StringBuilder and "use
>>> String" in your algorithm?
>>> 
>>> The bottom line is that whatever you suggest would likely need a good
>>> justification. To me it's not immediately obvious that something like this
>>> 
>>>     public CharBuffer append(CharSequence csq) {
>>>         if (csq == null) {
>>>             put("null");
>>>         } else if (csq instanceof StringBuilder) {
>>>             char[] chars = new char[csq.length()];
>>>             ((StringBuilder) csq).getChars(0, csq.length(), chars, 0);
>>>             put(chars);
>>>         } else if (csq instanceof StringBuffer) {
>>>             char[] chars = new char[csq.length()];
>>>             ((StringBuffer) csq).getChars(0, csq.length(), chars, 0);
>>>             put(chars);
>>>         } else if (csq instanceof CharBuffer) {
>>>             CharBuffer buffer = (CharBuffer) csq;
>>>             int p = buffer.position();
>>>             put(buffer);
>>>             buffer.position(p);
>>>         } else {
>>>             for (int i = 0; i < csq.length(); i++) {
>>>                 put(csq.charAt(i));
>>>             }
>>>         }
>>>         return this;
>>>     }
>>> 
>>> is better than this (what's there today)
>>> 
>>>     public CharBuffer append(CharSequence csq) {
>>>         if (csq == null)
>>>             return put("null");
>>>         else
>>>             return put(csq.toString());
>>>     }
>>> 
>>>> On 29 Jul 2016, at 15:12, ecki at zusammenkunft.net wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> I have to agree with Fabian: the handling of CharSequences (and the
>>>> special case of StringBuilder) is pretty weak. In
>>>> CharBuffer.append(CharSequence) you see the same toString. I would expect
>>>> it to do:
>>>> - instanceof String -> use it
>>>> - instanceof StringBuilder -> extract char[] and iterate
>>>> - instanceof CharBuffer -> handle
>>>> - Otherwise: loop over charAt
>>>> 
>>>> (the "otherwise" case might be a tradeoff between allocation and
>>>> (not) inlined bounds checks)
>>>> 
>>>> An alternative would be a CharSequence.fillInto(char[])
>>>> 
>>>> BTW wouldn't it be great if char[] implemented CharSequence?
>>>> 
>>>> Regards
>>>> Bernd
>>>> --
>>>> http://bernd.eckenfels.net
>>>> From Win 10 Mobile
>>>> 
>>>> From: Fabian Lange
>>> 
>>> 
>> 
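
For reference, here is a minimal sketch of the kind of prototype the thread is
asking about: copying a CharSequence into a Writer through a small, fixed-size
buffer instead of going through toString(). It is not JDK code and not a
proposal. The class name CharSequenceCopy, the helper names getChars and write,
and the chunk size of 1024 are all assumptions made up for illustration. Only
the getChars methods of String, StringBuilder and StringBuffer are existing
API. Whether this actually beats toString() is exactly what benchmarks would
have to show.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper, not part of the JDK.
final class CharSequenceCopy {
    // Arbitrary chunk size chosen for the sketch; a real value would need measuring.
    private static final int CHUNK = 1024;

    private CharSequenceCopy() {}

    // Copies csq[srcBegin, srcEnd) into dst starting at dstBegin.
    // Uses the bulk getChars of the known JDK types and falls back to charAt.
    static void getChars(CharSequence csq, int srcBegin, int srcEnd,
                         char[] dst, int dstBegin) {
        if (csq instanceof String) {
            ((String) csq).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else if (csq instanceof StringBuilder) {
            ((StringBuilder) csq).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else if (csq instanceof StringBuffer) {
            ((StringBuffer) csq).getChars(srcBegin, srcEnd, dst, dstBegin);
        } else {
            // Generic path: one charAt call (and bounds check) per character.
            for (int i = srcBegin; i < srcEnd; i++) {
                dst[dstBegin++] = csq.charAt(i);
            }
        }
    }

    // Writes the whole sequence in chunks, so at most CHUNK chars are
    // allocated regardless of how large csq is.
    static void write(Writer out, CharSequence csq) throws IOException {
        int len = csq.length();
        char[] buf = new char[Math.min(len, CHUNK)];
        for (int off = 0; off < len; off += buf.length) {
            int n = Math.min(buf.length, len - off);
            getChars(csq, off, off + n, buf, 0);
            out.write(buf, 0, n);
        }
    }
}
```

Note that the sketch reads the sequence in several steps, so unlike toString()
it does not take a single snapshot of a sequence that is being modified
concurrently.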



More information about the core-libs-dev mailing list