RFR: 8166842: String.hashCode() has a non-benign data race
Vitaly Davidovich
vitalyd at gmail.com
Thu Sep 29 01:05:32 UTC 2016
On Wednesday, September 28, 2016, David Holmes <david.holmes at oracle.com>
wrote:
> On 28/09/2016 11:24 PM, Peter Levart wrote:
>
>>
>> On 09/28/2016 03:05 PM, David Holmes wrote:
>>
>>> On 28/09/2016 10:44 PM, Peter Levart wrote:
>>>
>>>> Hi,
>>>>
>>>> According to discussion here:
>>>>
>>>> http://cs.oswego.edu/pipermail/concurrency-interest/2016-
>>>> September/015414.html
>>>>
>>>> it seems compact strings introduced (at least theoretical) non-benign
>>>> data race into String.hasCode() method.
>>>>
>>>> Here is a proposed patch:
>>>>
>>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/8166842_String.
>>>> hashCode/webrev.01/
>>>>
>>>>
>>> I'm far from convinced that the bug exists - theoretical or otherwise
>>> - but the "fix" is harmless.
>>>
>>> When we were working on JSR-133 one of the conceptual models is that
>>> every write to a variable goes into the set of values that a read may
>>> potentially return (so no out-of-thin-air for example). happens-before
>>> establishes constraints on which value can legally be returned - the
>>> most recent. An additional property was that once a value was
>>> returned, a later read could not return an earlier value - in essence
>>> once a read returns a given value, all earlier written values are
>>> removed from the set of potential values that can be read.
>>>
>>
>> That would be a nice property, yes.
>>
>>
>>> Your bug requires that the code act as-if written:
>>>
>>> int temp = hash;
>>> if (temp == 0) {
>>> hash = ...
>>> }
>>> return temp;
>>>
>>> and I do not believe that is allowed.
>>>
>>> David
>>>
>>
>> Well, I can't find anything like that in JMM description. Could you
>> point me to it? Above example only reads once from hash. The code in
>> question is this:
>>
>> if (hash == 0) { // 1st read
>> hash = ...
>> }
>> return hash; // 2nd read
>>
>> And the bug requires the code to act like:
>>
>> int temp2 = hash; // 2nd read
>> int temp1 = hash; // 1st read
>> if (temp1 == 0) {
>> return (hash = ...);
>> }
>> return temp2;
>>
>
> Given hash is not a volatile field there is nothing that requires the
> compiler to emit code that reads it twice - of course javac will issue
> getfields as per the source code.
I don't think there's anything preventing it from reading it any number of
times so long as intra-thread semantics are preserved.
>
> So the compiler is of course allowed to read it twice, but you then want
> it to reorder those reads such that the second use uses the value of the
> first read, and the first use uses the value of the second read.
I gave a pseudo example upthread. It would be weird codegen but allowed.
>
> I recall discussions of such perverse compiler behaviour in the past. :(
Be thankful that at least an explicit read into a local is preserved - C++
compilers are allowed to reload in that case too unless the local is marked
volatile.
None of this is an issue when not using data races. If you choose to
engage in a data race, gotta play by the rules :).
>
> Is this allowed?
>>
>
> Sadly yes. What we ended up with in JLS Ch17 permits this - it is a
> variation of the aliasing example in 17.4.
>
> There are places where the formalization of the JMM "threw the baby out
> with the bath water" IMHO. Some of the less formal descriptions made far
> more sense - like the notion of there being a set of values a read may
> return, and how happens-before constrains that, and how once a value is
> returned you cant then "read" an earlier value. <sigh>
>
> Thanks,
> David
>
> Regards, Peter
>>
>>
>>>
>>>> For the bug:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8166842
>>>>
>>>>
>>>>
>>>> JDK 8 did not have this problem, so no back-porting necessary.
>>>>
>>>> Regards, Peter
>>>>
>>>>
>>
--
Sent from my phone
More information about the core-libs-dev
mailing list