RFC: AbstractStringBuilder sharing its buffer with String
Remi Forax
forax at univ-mlv.fr
Sat Nov 28 14:15:43 UTC 2015
I agree with Peter.
Also, Ivan, one goal of JEP 280 is to be able to avoid the creation of an intermediary char/byte buffer when doing string concatenation so when the implementation of JEP 280 will be integrated it should speedup your benchmark.
cheers,
Rémi
----- Mail original -----
> De: "Peter Levart" <peter.levart at gmail.com>
> À: "Ivan Gerasimov" <ivan.gerasimov at oracle.com>, core-libs-dev at openjdk.java.net
> Envoyé: Vendredi 27 Novembre 2015 16:15:40
> Objet: Re: RFC: AbstractStringBuilder sharing its buffer with String
>
>
>
> On 11/27/2015 01:39 PM, Ivan Gerasimov wrote:
> > Hello!
> >
> > Prior to Java5, StringBuffer used to be able to share its internal
> > char[] buffer with the String, returned with toString().
> > With introducing of the new StringBuilder class this functionality was
> > removed.
> >
> > It seems tempting to reintroduce this feature now in
> > AbstractStringBuilder.
> > The rationale here is that StringBuilder already provides a way of
> > accepting the hint about the result's size.
> > The constructor with the explicitly specified capacity is used for
> > increasing the efficiency of strings concatenations.
> > Optimizing this case by avoiding additional memory allocation and
> > copying looks sensible to me.
> >
> > Here's the draft webrev
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/00/webrev/
>
> It is tempting, yes. But I think there was a reason that this was
> dropped when StringBuilder was introduced and that reason was
> thread-safety. StringBuilder doesn't use any synchronization. If a
> concurrent thread gets a reference to a StringBuilder and executes
> mutating operations while some other thread calls toString() on the same
> StringBuilder, then returned String could appear to be changing it's
> content (be mutable), which can have security implications:
>
> 413 public String toString() {
> 414 final byte[] value = this.value;
> 415 if (isLatin1()) {
> 416 if ((count << coder) < value.length) {
> 417 return StringLatin1.newString(value, 0, count);
> 418 }
> 419 shared = true;
> 420 return new String(value, String.LATIN1);
> 421 }
> 422 return StringUTF16.newStringSB(value, 0, count);
> 423 }
>
>
> 927 public AbstractStringBuilder replace(int start, int end,
> String str) {
> 928 if (end > count) {
> 929 end = count;
> 930 }
> 931 checkRangeSIOOBE(start, end, count);
> 932 int len = str.length();
> 933 int newCount = count + len - (end - start);
> 934 ensureCapacityInternal(newCount);
> 935 unshareValue();
> 936 shift(end, newCount - count);
> 937 count = newCount;
> 938 putStringAt(start, str);
> 939 return this;
> 940 }
>
>
> For example:
>
> static final StringBuilder sb = new StringBuilder(3).append("abc");
>
> Thread 1:
>
> String s = sb.toString();
> System.out.println(s);
>
> Thread 2:
>
> sb.replace(0, 3, "def");
>
>
> Question: What can Thread 1 print?
>
> Thread 1 sets shared = true and shares value array with String, but when
> Thread 2 executes replace, it may not yet notice the write from Thread 1
> and may not do anything in unshareValue(), effectively overwriting the
> shared array with "def".
>
> Answer: I think anything of the following:
>
> abc
>
> dbc
> aec
> abf
>
> dec
> dbf
> aef
>
> def
>
> ...and the answer may change over time. Now imagine handing over such
> string to new FileInputStream()...
>
>
> Regards, Peter
>
>
> >
> > This variant showed a significant speed improvement for the cases,
> > when the the capacity is equal to the result's size (up to +25% to
> > throughput).
> > For the other cases, the difference isn't very clear based on my
> > benchmarks :)
> > But is seems to be small enough, as it only adds a few comparisons,
> > coupled with already relatively heavy operations, like allocation and
> > copying.
> >
> > Here's the benchmark that I've used:
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java
> >
> >
> > And the corresponding graphs.
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png
> >
> > http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png
> >
> > The blue line here stands for the baseline throughput.
> >
> > If people agree, this is a useful addition, a test should also be
> > added, of course.
> >
> > Sincerely yours,
> > Ivan
> >
>
>
More information about the core-libs-dev
mailing list