RFR: 8166842: String.hashCode() has a non-benign data race
Peter Levart
peter.levart at gmail.com
Thu Sep 29 09:31:30 UTC 2016
On 09/29/2016 02:58 AM, Vitaly Davidovich wrote:
> On Wednesday, September 28, 2016, Carsten Varming <cvarming at twitter.com>
> wrote:
>
>> Dear David,
>>
>> See inline.
>>
>> On Wed, Sep 28, 2016 at 7:47 PM, David Holmes <david.holmes at oracle.com
>> <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>> wrote:
>>
>>> On 29/09/2016 3:44 AM, Carsten Varming wrote:
>>>
>>>> Dear Vitaly and David,
>>>>
>>>> Looking at your webrev the original code is:
>>>>
>>>> public int hashCode() {
>>>> if (hash == 0 && value.length > 0) {
>>>> hash = isLatin1() ? StringLatin1.hashCode(value)
>>>> }
>>>> return hash;
>>>> }
>>>>
>>>> There is a constructor:
>>>>
>>>> public String(String original) {
>>>> this.value = original.value;
>>>> this.coder = original.coder;
>>>> this.hash = original.hash;
>>>> }
>>>>
>>>> that might write zero to the mutable field "hash".
>>>>
>>>> The object created by this constructor might be shared using plain reads
>>>> and writes between two threads[1] and the write of 0 in the constructor
>>>> might be interleaved with the reads and write in hashCode. Does this
>>>> capture the problem?
>>>>
>>> Because String has final fields there is a freeze action at the end of
>>> construction so that String instances are always safely published even if
>>> not "safely published".
>>>
>> I always thought that the freeze action only freezes final fields. The
>> hash field in String is not final and example 17.5-1 is applicable as far
>> as I can see (https://docs.oracle.com/javase/specs/jls/se8/html/jls-
>> 17.html#jls-17.5). Has the memory model changed in JDK9 to invalidate
>> example 17.5-1 or I am missing something about String.
>>
> Right - the spec only talks about final fields being ok in such cases. In
> practice, compilers put a freeze action at the end of the constructor and
> non-final fields come along for the ride, AFAIK.
>
> However, I'm not sure this example is all that interesting. Suppose you
> published a String copied from another, but the hash value was reordered
> and didn't appear to the reader. That ought to be fine because hashCode is
> supposed to recompute it. At worst, you perform the same computation
> "unnecessarily".
>
> The possible compiler reordering, as discussed in this thread, within
> hashCode itself is problematic because it can, theoretically, yield 0
> erroneously.
I think Carsten has a point. If we bring this constructor to the picture:
public String(String original) {
this.value = original.value;
this.coder = original.coder;
this.hash = original.hash;
}
and combine it with the hashCode implementation:
public int hashCode() {
if (hash == 0 && value.length > 0) {
hash = isLatin1() ? ...
}
return hash;
}
...then we can get an erroneous result of 0 from hashCode() even without
compiler (or hardware) reordering reads in hashCode():
Possible execution:
Thread 3:
hash = isLatin1() ? ....; // write non-zero
Thread 1:
if (hash == 0 && ...) // false, skip assignment
Thread 2:
this.hash = original.hash; // zero (hash not computed yet in original)
Thread 1:
return hash; // returns zero
This would of course require the reference to String constructed by
Thread 2 to be unsafely published to Thread 1 & 3 before this.hash in
constructor is written, meaning that "freeze" action for final fields
would have to be performed immediately after the write of last final
field (coder). And that's another possibility that JMM allows.
I think that this is enough to apply the fix, thanks.
Regards, Peter
>> Carsten
>>
>>
>>> David
>>>
>>>
>>> [1]: Meaning the is no happens-before relationship established between
>>>> object construction and another thread calling hashCode on the object.
>>>>
>>>> Carsten
>>>>
>>>> On Wed, Sep 28, 2016 at 10:13 AM, Vitaly Davidovich <vitalyd at gmail.com
>>>> <javascript:_e(%7B%7D,'cvml','vitalyd at gmail.com');>
>>>> <mailto:vitalyd at gmail.com
>>>> <javascript:_e(%7B%7D,'cvml','vitalyd at gmail.com');>>> wrote:
>>>>
>>>> On Wednesday, September 28, 2016, David Holmes
>>>> <david.holmes at oracle.com
>>>> <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');> <mailto:
>>>> david.holmes at oracle.com
>>>> <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>>>
>>>> wrote:
>>>>
>>>> > On 28/09/2016 10:44 PM, Peter Levart wrote:
>>>> >
>>>> >> Hi,
>>>> >>
>>>> >> According to discussion here:
>>>> >>
>>>> >> http://cs.oswego.edu/pipermail/concurrency-interest/2016-
>>>> <http://cs.oswego.edu/pipermail/concurrency-interest/2016->
>>>> >> September/015414.html
>>>> >>
>>>> >>
>>>> >> it seems compact strings introduced (at least theoretical)
>>>> non-benign
>>>> >> data race into String.hasCode() method.
>>>> >>
>>>> >> Here is a proposed patch:
>>>> >>
>>>> >> http://cr.openjdk.java.net/~plevart/jdk9-dev/8166842_String
>>>> <http://cr.openjdk.java.net/~plevart/jdk9-dev/8166842_String>.
>>>>
>>>> >> hashCode/webrev.01/
>>>> >>
>>>> >
>>>> > I'm far from convinced that the bug exists - theoretical or
>>>> otherwise -
>>>> > but the "fix" is harmless.
>>>> >
>>>> > When we were working on JSR-133 one of the conceptual models is
>>>> that every
>>>> > write to a variable goes into the set of values that a read may
>>>> potentially
>>>> > return (so no out-of-thin-air for example). happens-before
>>>> establishes
>>>> > constraints on which value can legally be returned - the most
>>>> recent. An
>>>> > additional property was that once a value was returned, a later
>>>> read could
>>>> > not return an earlier value - in essence once a read returns a
>>>> given value,
>>>> > all earlier written values are removed from the set of potential
>>>> values
>>>> > that can be read.
>>>> >
>>>> > Your bug requires that the code act as-if written:
>>>> >
>>>> > int temp = hash;
>>>> > if (temp == 0) {
>>>> > hash = ...
>>>> > }
>>>> > return temp;
>>>>
>>>> It's the other way I think:
>>>>
>>>> int temp = hash; // read 0
>>>> if (hash == 0) // reread a non 0
>>>> hash = temp = ...
>>>> return temp // return 0
>>>>
>>>> It's unlikely but what prohibits that?
>>>>
>>>> >
>>>> > and I do not believe that is allowed.
>>>> >
>>>> > David
>>>> >
>>>> >
>>>> >> For the bug:
>>>> >>
>>>> >> https://bugs.openjdk.java.net/browse/JDK-8166842
>>>> <https://bugs.openjdk.java.net/browse/JDK-8166842>
>>>> >>
>>>> >>
>>>> >>
>>>> >> JDK 8 did not have this problem, so no back-porting necessary.
>>>> >>
>>>> >> Regards, Peter
>>>> >>
>>>> >>
>>>>
>>>> --
>>>> Sent from my phone
>>>>
>>>>
>>>>
More information about the core-libs-dev
mailing list