RFC: AbstractStringBuilder sharing its buffer with String

Vitaly Davidovich vitalyd at gmail.com
Mon Nov 30 14:19:49 UTC 2015


Hi Peter,

I see your point about someone deliberately doing this to break String
invariants.  But by this logic, everything should be "threadsafe" in case
someone attempts to break it via concurrency.

sent from my phone
On Nov 28, 2015 7:19 PM, "Peter Levart" <peter.levart at gmail.com> wrote:

>
>
> On 11/28/2015 08:19 PM, Vitaly Davidovich wrote:
>
> Is that a valid example given StringBuilder isn't threadsafe to begin
> with? If the SB instance is shared among threads that perform writes to it
> without external synchronization, it's a bug in that code.  Am I missing
> something?
>
>
> That "bug" might be intentional. Produced to circumvent security check.
> String is trusted to be immutable. For example here:
>
>     public FileInputStream(File file) throws FileNotFoundException {
>         String name = (file != null ? file.getPath() : null);
>         SecurityManager security = System.getSecurityManager();
>         if (security != null) {
>             security.checkRead(name);
>         }
>         if (name == null) {
>             throw new NullPointerException();
>         }
>         if (file.isInvalid()) {
>             throw new FileNotFoundException("Invalid file path");
>         }
>         fd = new FileDescriptor();
>         fd.attach(this);
>         path = name;
>         open(name);
>     }
>
>
> ...the trust is that it doesn't change between calls to:
>
>             security.checkRead(name);
>
> and:
>
>         open(name);
>
>
> You'd have to ensure that the returned String is stable and effectively
> immutable though, which means the underlying byte[] has to be released by
> SB right before it's mutated if it's shared, which I think Ivan's patch is
> doing (from a quick glance).
>
>
> But there's a race in code (even if 'shared' and 'value' were volatile),
> which I think is impossible to fix in StringBuilder without introducing
> locking.
>
> Sequence of operations (referring to quoted example below):
>
> - Thread 2 executes unshareValue() 1st while 'value' is not shared yet,
> which does nothing.
> - Thread 1 sets shared = true and creates a String with 'value' and
> returns it
> - Thread 2 proceeds with mutation of 'value' which is now shared with
> String in thread 1
>
>
>
> It would be possible to create a StringBuilder-like class (even a subclass
> of AbstractStringBuilder) that would be safe to use and would share the
> array with String with minimal overhead for  mutative operations, but such
> class would not be compatible with StringBuilder. For example, using
> standard trick that limits the use of an instance to the thread that
> constructed it:
>
> public class SingleThreadedStringBuilder extends AbstractStringBuilder {
>
>     private Thread constructingThread = Thread.currentThread();
>
>     // and then in each mutative operation...
>
>     @Override
>     public SingleThreadedStringBuilder append(CharSequence s) {
>         if (Thread.currentThread() != constructingThread) throw new
> IllegalStateException();
>         super.append(s);
>         return this;
>     }
>
>     // it could even be a one-shot object, so it could become unusable
> after a call to toString()
>
>     @Override
>     public String toString() {
>         if (Thread.currentThread() != constructingThread) throw new
> IllegalStateException();
>         constructingThread = null;
>
>         // create and return a String, possibly passing the 'value' array
> by reference
>         ...
>     }
>
>
> Such class could be used for string concatenation, but I think JEP280 has
> a better answer for that and more.
>
> Regards, Peter
>
> sent from my phone
> On Nov 27, 2015 10:16 AM, "Peter Levart" <peter.levart at gmail.com> wrote:
>
>>
>>
>> On 11/27/2015 01:39 PM, Ivan Gerasimov wrote:
>>
>>> Hello!
>>>
>>> Prior to Java5, StringBuffer used to be able to share its internal
>>> char[] buffer with the String, returned with toString().
>>> With introducing of the new StringBuilder class this functionality was
>>> removed.
>>>
>>> It seems tempting to reintroduce this feature now in
>>> AbstractStringBuilder.
>>> The rationale here is that StringBuilder already provides a way of
>>> accepting the hint about the result's size.
>>> The constructor with the explicitly specified capacity is used for
>>> increasing the efficiency of strings concatenations.
>>> Optimizing this case by avoiding additional memory allocation and
>>> copying looks sensible to me.
>>>
>>> Here's the draft webrev
>>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/00/webrev/
>>>
>>
>> It is tempting, yes. But I think there was a reason that this was dropped
>> when StringBuilder was introduced and that reason was thread-safety.
>> StringBuilder doesn't use any synchronization. If a concurrent thread gets
>> a reference to a StringBuilder and executes mutating operations while some
>> other thread calls toString() on the same StringBuilder, then returned
>> String could appear to be changing it's content (be mutable), which can
>> have security implications:
>>
>>  413     public String toString() {
>>  414         final byte[] value = this.value;
>>  415         if (isLatin1()) {
>>  416             if ((count << coder) < value.length) {
>>  417                 return StringLatin1.newString(value, 0, count);
>>  418             }
>>  419             shared = true;
>>  420             return new String(value, String.LATIN1);
>>  421         }
>>  422         return StringUTF16.newStringSB(value, 0, count);
>>  423     }
>>
>>
>>  927     public AbstractStringBuilder replace(int start, int end, String
>> str) {
>>  928         if (end > count) {
>>  929             end = count;
>>  930         }
>>  931         checkRangeSIOOBE(start, end, count);
>>  932         int len = str.length();
>>  933         int newCount = count + len - (end - start);
>>  934         ensureCapacityInternal(newCount);
>>  935         unshareValue();
>>  936         shift(end, newCount - count);
>>  937         count = newCount;
>>  938         putStringAt(start, str);
>>  939         return this;
>>  940     }
>>
>>
>> For example:
>>
>> static final StringBuilder sb = new StringBuilder(3).append("abc");
>>
>> Thread 1:
>>
>> String s = sb.toString();
>> System.out.println(s);
>>
>> Thread 2:
>>
>> sb.replace(0, 3, "def");
>>
>>
>> Question: What can Thread 1 print?
>>
>> Thread 1 sets shared = true and shares value array with String, but when
>> Thread 2 executes replace, it may not yet notice the write from Thread 1
>> and may not do anything in unshareValue(), effectively overwriting the
>> shared array with "def".
>>
>> Answer: I think anything of the following:
>>
>> abc
>>
>> dbc
>> aec
>> abf
>>
>> dec
>> dbf
>> aef
>>
>> def
>>
>> ...and the answer may change over time. Now imagine handing over such
>> string to new FileInputStream()...
>>
>>
>> Regards, Peter
>>
>>
>>
>>> This variant showed a significant speed improvement for the cases, when
>>> the the capacity is equal to the result's size (up to +25% to throughput).
>>> For the other cases, the difference isn't very clear based on my
>>> benchmarks :)
>>> But is seems to be small enough, as it only adds a few comparisons,
>>> coupled with already relatively heavy operations, like allocation and
>>> copying.
>>>
>>> Here's the benchmark that I've used:
>>>
>>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java
>>>
>>> And the corresponding graphs.
>>>
>>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png
>>>
>>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png
>>> The blue line here stands for the baseline throughput.
>>>
>>> If people agree, this is a useful addition, a test should also be added,
>>> of course.
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>>
>>
>



More information about the core-libs-dev mailing list