RFC: AbstractStringBuilder sharing its buffer with String
Vitaly Davidovich
vitalyd at gmail.com
Sat Nov 28 19:19:20 UTC 2015
Is that a valid example given StringBuilder isn't threadsafe to begin with?
If the SB instance is shared among threads that perform writes to it
without external synchronization, it's a bug in that code. Am I missing
something?
You'd have to ensure that the returned String is stable and effectively
immutable though, which means the underlying byte[] has to be released by
SB right before it's mutated if it's shared, which I think Ivan's patch is
doing (from a quick glance).
sent from my phone
On Nov 27, 2015 10:16 AM, "Peter Levart" <peter.levart at gmail.com> wrote:
>
>
> On 11/27/2015 01:39 PM, Ivan Gerasimov wrote:
>
>> Hello!
>>
>> Prior to Java5, StringBuffer used to be able to share its internal char[]
>> buffer with the String, returned with toString().
>> With introducing of the new StringBuilder class this functionality was
>> removed.
>>
>> It seems tempting to reintroduce this feature now in
>> AbstractStringBuilder.
>> The rationale here is that StringBuilder already provides a way of
>> accepting the hint about the result's size.
>> The constructor with the explicitly specified capacity is used for
>> increasing the efficiency of strings concatenations.
>> Optimizing this case by avoiding additional memory allocation and copying
>> looks sensible to me.
>>
>> Here's the draft webrev
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/00/webrev/
>>
>
> It is tempting, yes. But I think there was a reason that this was dropped
> when StringBuilder was introduced and that reason was thread-safety.
> StringBuilder doesn't use any synchronization. If a concurrent thread gets
> a reference to a StringBuilder and executes mutating operations while some
> other thread calls toString() on the same StringBuilder, then returned
> String could appear to be changing it's content (be mutable), which can
> have security implications:
>
> 413 public String toString() {
> 414 final byte[] value = this.value;
> 415 if (isLatin1()) {
> 416 if ((count << coder) < value.length) {
> 417 return StringLatin1.newString(value, 0, count);
> 418 }
> 419 shared = true;
> 420 return new String(value, String.LATIN1);
> 421 }
> 422 return StringUTF16.newStringSB(value, 0, count);
> 423 }
>
>
> 927 public AbstractStringBuilder replace(int start, int end, String
> str) {
> 928 if (end > count) {
> 929 end = count;
> 930 }
> 931 checkRangeSIOOBE(start, end, count);
> 932 int len = str.length();
> 933 int newCount = count + len - (end - start);
> 934 ensureCapacityInternal(newCount);
> 935 unshareValue();
> 936 shift(end, newCount - count);
> 937 count = newCount;
> 938 putStringAt(start, str);
> 939 return this;
> 940 }
>
>
> For example:
>
> static final StringBuilder sb = new StringBuilder(3).append("abc");
>
> Thread 1:
>
> String s = sb.toString();
> System.out.println(s);
>
> Thread 2:
>
> sb.replace(0, 3, "def");
>
>
> Question: What can Thread 1 print?
>
> Thread 1 sets shared = true and shares value array with String, but when
> Thread 2 executes replace, it may not yet notice the write from Thread 1
> and may not do anything in unshareValue(), effectively overwriting the
> shared array with "def".
>
> Answer: I think anything of the following:
>
> abc
>
> dbc
> aec
> abf
>
> dec
> dbf
> aef
>
> def
>
> ...and the answer may change over time. Now imagine handing over such
> string to new FileInputStream()...
>
>
> Regards, Peter
>
>
>
>> This variant showed a significant speed improvement for the cases, when
>> the the capacity is equal to the result's size (up to +25% to throughput).
>> For the other cases, the difference isn't very clear based on my
>> benchmarks :)
>> But is seems to be small enough, as it only adds a few comparisons,
>> coupled with already relatively heavy operations, like allocation and
>> copying.
>>
>> Here's the benchmark that I've used:
>>
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java
>>
>> And the corresponding graphs.
>>
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png
>>
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png
>> The blue line here stands for the baseline throughput.
>>
>> If people agree, this is a useful addition, a test should also be added,
>> of course.
>>
>> Sincerely yours,
>> Ivan
>>
>>
>
More information about the core-libs-dev
mailing list