RFR: 8305819: LogConfigurationTest intermittently fails on AArch64 [v2]

Xin Liu xliu at openjdk.org
Fri May 5 18:35:20 UTC 2023


On Fri, 5 May 2023 13:15:33 GMT, Dan Gao <duke at openjdk.org> wrote:

>> LogConfigurationTest*reconfigure*MT* crash intermittently on AArch64.
>> According to the crash log and coredump, we found it crash as follows:
>> 
>> void LogTagSet::log(LogLevelType level, const char* msg) {
>>   LogOutputList::Iterator it = _output_list.iterator(level);
>>   LogDecorations decorations(level, *this, _decorators);
>> 
>>   for (; it != _output_list.end(); it++) {
>>     (*it)->write(decorations, msg);//crash 
>>   }
>> }
>> 
>> In the test, two threads write into the log while another thread dynamically changes the decorators and tags. During this time, the  _output_list will be modified. Because of the relax memory model of aarch64, while adding LogOutputNode to LogOutputList, adding node to list and setting the value of node may be reordered, therefore the read thread may not read the correct value of the node's content. Consequently, storestore memory barrier is needed to ensure the order of writing. 
>> By the way, applying this patch may affect performance.
>> 
>> How to reproduce on Linux aarch64:
>> test case
>> 
>> /* @test
>>  * @library /test/lib
>>  * @modules java.base/jdk.internal.misc
>>  *          java.xml
>>  * @run main/native GTestWrapper --gtest_filter=LogConfigurationTest*reconfigure*MT*
>>  */
>> 
>> Crash may occasionally occur after running continuously for 5000 times.
>
> Dan Gao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   comment fixme

src/hotspot/share/logging/logOutputList.hpp line 52:

> 50:   struct LogOutputNode : public CHeapObj<mtLogging> {
> 51:     LogOutput*      _value;
> 52:     LogOutputNode* volatile _next;

Atomic::load_acquire will cast its argument to const volatile T* no matter what.
 
another reason is that there's no other load of _next field. I don't see any chance compiler elides load instruction for it. 
so I don't think we need 'volatile' qualifier for it. same reasons for  _level_start[LogLevel::Count]

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13421#discussion_r1186393406


More information about the hotspot-runtime-dev mailing list