RFC: AbstractStringBuilder sharing its buffer with String

Remi Forax forax at univ-mlv.fr
Sat Nov 28 14:15:43 UTC 2015


I agree with Peter.

Also, Ivan, one goal of JEP 280 is to be able to avoid the creation of an intermediary char/byte buffer when doing string concatenation so when the implementation of JEP 280 will be integrated it should speedup your benchmark. 

cheers,
Rémi

----- Mail original -----
> De: "Peter Levart" <peter.levart at gmail.com>
> À: "Ivan Gerasimov" <ivan.gerasimov at oracle.com>, core-libs-dev at openjdk.java.net
> Envoyé: Vendredi 27 Novembre 2015 16:15:40
> Objet: Re: RFC: AbstractStringBuilder sharing its buffer with String
> 
> 
> 
> On 11/27/2015 01:39 PM, Ivan Gerasimov wrote:
> > Hello!
> >
> > Prior to Java5, StringBuffer used to be able to share its internal
> > char[] buffer with the String, returned with toString().
> > With introducing of the new StringBuilder class this functionality was
> > removed.
> >
> > It seems tempting to reintroduce this feature now in
> > AbstractStringBuilder.
> > The rationale here is that StringBuilder already provides a way of
> > accepting the hint about the result's size.
> > The constructor with the explicitly specified capacity is used for
> > increasing the efficiency of strings concatenations.
> > Optimizing this case by avoiding additional memory allocation and
> > copying looks sensible to me.
> >
> > Here's the draft webrev
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/00/webrev/
> 
> It is tempting, yes. But I think there was a reason that this was
> dropped when StringBuilder was introduced and that reason was
> thread-safety. StringBuilder doesn't use any synchronization. If a
> concurrent thread gets a reference to a StringBuilder and executes
> mutating operations while some other thread calls toString() on the same
> StringBuilder, then returned String could appear to be changing it's
> content (be mutable), which can have security implications:
> 
>   413     public String toString() {
>   414         final byte[] value = this.value;
>   415         if (isLatin1()) {
>   416             if ((count << coder) < value.length) {
>   417                 return StringLatin1.newString(value, 0, count);
>   418             }
>   419             shared = true;
>   420             return new String(value, String.LATIN1);
>   421         }
>   422         return StringUTF16.newStringSB(value, 0, count);
>   423     }
> 
> 
>   927     public AbstractStringBuilder replace(int start, int end,
> String str) {
>   928         if (end > count) {
>   929             end = count;
>   930         }
>   931         checkRangeSIOOBE(start, end, count);
>   932         int len = str.length();
>   933         int newCount = count + len - (end - start);
>   934         ensureCapacityInternal(newCount);
>   935         unshareValue();
>   936         shift(end, newCount - count);
>   937         count = newCount;
>   938         putStringAt(start, str);
>   939         return this;
>   940     }
> 
> 
> For example:
> 
> static final StringBuilder sb = new StringBuilder(3).append("abc");
> 
> Thread 1:
> 
> String s = sb.toString();
> System.out.println(s);
> 
> Thread 2:
> 
> sb.replace(0, 3, "def");
> 
> 
> Question: What can Thread 1 print?
> 
> Thread 1 sets shared = true and shares value array with String, but when
> Thread 2 executes replace, it may not yet notice the write from Thread 1
> and may not do anything in unshareValue(), effectively overwriting the
> shared array with "def".
> 
> Answer: I think anything of the following:
> 
> abc
> 
> dbc
> aec
> abf
> 
> dec
> dbf
> aef
> 
> def
> 
> ...and the answer may change over time. Now imagine handing over such
> string to new FileInputStream()...
> 
> 
> Regards, Peter
> 
> 
> >
> > This variant showed a significant speed improvement for the cases,
> > when the the capacity is equal to the result's size (up to +25% to
> > throughput).
> > For the other cases, the difference isn't very clear based on my
> > benchmarks :)
> > But is seems to be small enough, as it only adds a few comparisons,
> > coupled with already relatively heavy operations, like allocation and
> > copying.
> >
> > Here's the benchmark that I've used:
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java
> >
> >
> > And the corresponding graphs.
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png
> >
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png
> >
> > The blue line here stands for the baseline throughput.
> >
> > If people agree, this is a useful addition, a test should also be
> > added, of course.
> >
> > Sincerely yours,
> > Ivan
> >
> 
> 



More information about the core-libs-dev mailing list