RFR(XXS): 8227338: templateInterpreter.cpp: copy_table() needs to be safer

Erik Osterlund erik.osterlund at oracle.com
Sun Jul 7 08:48:15 UTC 2019


Yeah that switch statement code and yet another plain non-volatile load/store loop looks like complete nonsense unfortunately. It should at least use Atomic::load/store.

Fortunately, on x86_64, I believe it will in practice yield word atomic copying anyway by chance. But it should be fixed anyway. *sigh*

The real danger is SPARC though and its BIS instructions. I don’t have the code in front of me, but I really hope not to see that switch statement and non-volatile loop in that pd_disjoint_words_atomic() function.

And I agree that the atomic copying API should be used when we need atomic copying. And if it turns out the implementation of that API is not atomic, it should be fixed in that atomic copying API.

So I think this change looks good. But it looks like we are not done yet. :c

Thanks,
/Erik

> On 7 Jul 2019, at 02:46, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
> 
> Added Erik O. to the "To:" list...
> 
> 
>> On 7/6/19 8:05 PM, Daniel D. Daugherty wrote:
>>> On 7/6/19 6:06 PM, David Holmes wrote:
>>> Hi Dan,
>>> 
>>>> On 6/07/2019 11:53 pm, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>> 
>>>> During the code review for the following fix:
>>>> 
>>>>      JDK-8227117 normal interpreter table is not restored after single stepping with TLH
>>>>      https://bugs.openjdk.java.net/browse/JDK-8227117
>>>> 
>>>> Erik O. noticed a potential race with templateInterpreter.cpp: copy_table()
>>>> depending on C++ compiler optimizations. The following bug is being used
>>>> to fix this issue:
>>>> 
>>>>      JDK-8227338 templateInterpreter.cpp: copy_table() needs to be safer
>>>>      https://bugs.openjdk.java.net/browse/JDK-8227338
>>>> 
>>>> Here's the webrev URL:
>>>> 
>>>> http://cr.openjdk.java.net/~dcubed/8227338-webrev/0_for_jdk14/
>>> 
>>> So the original code uses a loop to copy, while the new code calls Copy::disjoint_words_atomic, but the implementation of that on x64 is just a loop same as the original AFAICS:
>> 
>> Yup. I figure Erik O. will jump in here with his reasoning... :-)
> 
> Thinking about it more... I think the answer is that we are switching to
> calling code that specifies the type of behavior that we need:
> 
>     Copy::disjoint_words_atomic()
> 
> is what we need when we're not at a safepoint. If, down the road, we find
> that the compiler does something with pd_disjoint_words_atomic() that
> breaks our expectation for pd_disjoint_words_atomic(), then we fix that
> version of pd_disjoint_words_atomic() and all the callers will be good
> again...  Or something like that...
> 
> The version in src/hotspot/os_cpu/solaris_x86/copy_solaris_x86.inline.hpp
> happens to be exactly our loop with no switch statement... which is
> particularly funny given Erik's observations about what at least one
> Solaris X64 compiler did to loops...
> 
> Still, I'm just guessing here on a Saturday night... hopefully Erik
> will chime in here...
> 
> Dan
> 
> 
> 
>> 
>> Dan
>> 
>> 
>>> 
>>> static void pd_disjoint_words_atomic(const HeapWord* from, HeapWord* to, size_t count) {
>>> #ifdef AMD64
>>>   switch (count) {
>>>   case 8:  to[7] = from[7];
>>>   case 7:  to[6] = from[6];
>>>   case 6:  to[5] = from[5];
>>>   case 5:  to[4] = from[4];
>>>   case 4:  to[3] = from[3];
>>>   case 3:  to[2] = from[2];
>>>   case 2:  to[1] = from[1];
>>>   case 1:  to[0] = from[0];
>>>   case 0:  break;
>>>   default:
>>>     while (count-- > 0) {
>>>       *to++ = *from++;
>>>     }
>>>     break;
>>>   }
>>> #else
>>> 
>>> David
>>> -----
>>> 
>>>> This fix has been tested via Mach5 Tier[1-3] on Oracle's usual platforms.
>>>> Mach5 tier[4-6] is running now. It has also been tested with the manual
>>>> jdb test from JDK-8227117 using 'release' and 'fastdebug' bits.
>>>> 
>>>> Thanks, in advance, for questions, comments or suggestions.
>>>> 
>>>> Dan
>> 
> 



More information about the hotspot-runtime-dev mailing list