RFC: AbstractStringBuilder sharing its buffer with String
Peter Levart
peter.levart at gmail.com
Mon Nov 30 15:13:17 UTC 2015
On 11/30/2015 03:19 PM, Vitaly Davidovich wrote:
>
> Hi Peter,
>
> I see your point about someone deliberately doing this to break String
> invariants. But by this logic, everything should be "threadsafe" in
> case someone attempts to break it via concurrency.
>
Anything related to security, yes.
Regards, Peter
> sent from my phone
>
> On Nov 28, 2015 7:19 PM, "Peter Levart" <peter.levart at gmail.com
> <mailto:peter.levart at gmail.com>> wrote:
>
>
>
> On 11/28/2015 08:19 PM, Vitaly Davidovich wrote:
>>
>> Is that a valid example given StringBuilder isn't threadsafe to
>> begin with? If the SB instance is shared among threads that
>> perform writes to it without external synchronization, it's a bug
>> in that code. Am I missing something?
>>
>
> That "bug" might be intentional. Produced to circumvent security
> check. String is trusted to be immutable. For example here:
>
> public FileInputStream(File file) throws FileNotFoundException {
> String name = (file != null ? file.getPath() : null);
> SecurityManager security = System.getSecurityManager();
> if (security != null) {
> security.checkRead(name);
> }
> if (name == null) {
> throw new NullPointerException();
> }
> if (file.isInvalid()) {
> throw new FileNotFoundException("Invalid file path");
> }
> fd = new FileDescriptor();
> fd.attach(this);
> path = name;
> open(name);
> }
>
>
> ...the trust is that it doesn't change between calls to:
>
> security.checkRead(name);
>
> and:
>
> open(name);
>
>
>> You'd have to ensure that the returned String is stable and
>> effectively immutable though, which means the underlying byte[]
>> has to be released by SB right before it's mutated if it's
>> shared, which I think Ivan's patch is doing (from a quick glance).
>>
>
> But there's a race in code (even if 'shared' and 'value' were
> volatile), which I think is impossible to fix in StringBuilder
> without introducing locking.
>
> Sequence of operations (referring to quoted example below):
>
> - Thread 2 executes unshareValue() 1st while 'value' is not shared
> yet, which does nothing.
> - Thread 1 sets shared = true and creates a String with 'value'
> and returns it
> - Thread 2 proceeds with mutation of 'value' which is now shared
> with String in thread 1
>
>
>
> It would be possible to create a StringBuilder-like class (even a
> subclass of AbstractStringBuilder) that would be safe to use and
> would share the array with String with minimal overhead for
> mutative operations, but such class would not be compatible with
> StringBuilder. For example, using standard trick that limits the
> use of an instance to the thread that constructed it:
>
> public class SingleThreadedStringBuilder extends
> AbstractStringBuilder {
>
> private Thread constructingThread = Thread.currentThread();
>
> // and then in each mutative operation...
>
> @Override
> public SingleThreadedStringBuilder append(CharSequence s) {
> if (Thread.currentThread() != constructingThread) throw
> new IllegalStateException();
> super.append(s);
> return this;
> }
>
> // it could even be a one-shot object, so it could become
> unusable after a call to toString()
>
> @Override
> public String toString() {
> if (Thread.currentThread() != constructingThread) throw
> new IllegalStateException();
> constructingThread = null;
>
> // create and return a String, possibly passing the
> 'value' array by reference
> ...
> }
>
>
> Such class could be used for string concatenation, but I think
> JEP280 has a better answer for that and more.
>
> Regards, Peter
>
>> sent from my phone
>>
>> On Nov 27, 2015 10:16 AM, "Peter Levart" <peter.levart at gmail.com
>> <mailto:peter.levart at gmail.com>> wrote:
>>
>>
>>
>> On 11/27/2015 01:39 PM, Ivan Gerasimov wrote:
>>
>> Hello!
>>
>> Prior to Java5, StringBuffer used to be able to share its
>> internal char[] buffer with the String, returned with
>> toString().
>> With introducing of the new StringBuilder class this
>> functionality was removed.
>>
>> It seems tempting to reintroduce this feature now in
>> AbstractStringBuilder.
>> The rationale here is that StringBuilder already provides
>> a way of accepting the hint about the result's size.
>> The constructor with the explicitly specified capacity is
>> used for increasing the efficiency of strings concatenations.
>> Optimizing this case by avoiding additional memory
>> allocation and copying looks sensible to me.
>>
>> Here's the draft webrev
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/00/webrev/
>> <http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/00/webrev/>
>>
>>
>> It is tempting, yes. But I think there was a reason that this
>> was dropped when StringBuilder was introduced and that reason
>> was thread-safety. StringBuilder doesn't use any
>> synchronization. If a concurrent thread gets a reference to a
>> StringBuilder and executes mutating operations while some
>> other thread calls toString() on the same StringBuilder, then
>> returned String could appear to be changing it's content (be
>> mutable), which can have security implications:
>>
>> 413 public String toString() {
>> 414 final byte[] value = this.value;
>> 415 if (isLatin1()) {
>> 416 if ((count << coder) < value.length) {
>> 417 return StringLatin1.newString(value, 0,
>> count);
>> 418 }
>> 419 shared = true;
>> 420 return new String(value, String.LATIN1);
>> 421 }
>> 422 return StringUTF16.newStringSB(value, 0, count);
>> 423 }
>>
>>
>> 927 public AbstractStringBuilder replace(int start, int
>> end, String str) {
>> 928 if (end > count) {
>> 929 end = count;
>> 930 }
>> 931 checkRangeSIOOBE(start, end, count);
>> 932 int len = str.length();
>> 933 int newCount = count + len - (end - start);
>> 934 ensureCapacityInternal(newCount);
>> 935 unshareValue();
>> 936 shift(end, newCount - count);
>> 937 count = newCount;
>> 938 putStringAt(start, str);
>> 939 return this;
>> 940 }
>>
>>
>> For example:
>>
>> static final StringBuilder sb = new
>> StringBuilder(3).append("abc");
>>
>> Thread 1:
>>
>> String s = sb.toString();
>> System.out.println(s);
>>
>> Thread 2:
>>
>> sb.replace(0, 3, "def");
>>
>>
>> Question: What can Thread 1 print?
>>
>> Thread 1 sets shared = true and shares value array with
>> String, but when Thread 2 executes replace, it may not yet
>> notice the write from Thread 1 and may not do anything in
>> unshareValue(), effectively overwriting the shared array with
>> "def".
>>
>> Answer: I think anything of the following:
>>
>> abc
>>
>> dbc
>> aec
>> abf
>>
>> dec
>> dbf
>> aef
>>
>> def
>>
>> ...and the answer may change over time. Now imagine handing
>> over such string to new FileInputStream()...
>>
>>
>> Regards, Peter
>>
>>
>>
>> This variant showed a significant speed improvement for
>> the cases, when the the capacity is equal to the result's
>> size (up to +25% to throughput).
>> For the other cases, the difference isn't very clear
>> based on my benchmarks :)
>> But is seems to be small enough, as it only adds a few
>> comparisons, coupled with already relatively heavy
>> operations, like allocation and copying.
>>
>> Here's the benchmark that I've used:
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java
>> <http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/MyBenchmark.java>
>>
>>
>> And the corresponding graphs.
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png
>> <http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test15.png>
>>
>> http://cr.openjdk.java.net/~igerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png
>> <http://cr.openjdk.java.net/%7Eigerasim/XXXXXXX-ASB-shared-value/ASB-Shared-bench-test17.png>
>>
>> The blue line here stands for the baseline throughput.
>>
>> If people agree, this is a useful addition, a test should
>> also be added, of course.
>>
>> Sincerely yours,
>> Ivan
>>
>>
>
More information about the core-libs-dev
mailing list