RFR: 8327647: Occasional SIGSEGV in markWord::displaced_mark_helper() for SPECjvm2008 sunflow [v6]

Coleen Phillimore coleenp at openjdk.org
Mon Apr 29 21:16:07 UTC 2024


On Sun, 28 Apr 2024 07:26:48 GMT, Dean Long <dlong at openjdk.org> wrote:

>> src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1787:
>> 
>>> 1785:   lea(cache, Address(cache, index));
>>> 1786:   // Prevents stale data from being read after the bytecode is patched to the fast bytecode
>>> 1787:   membar(MacroAssembler::LoadLoad);
>> 
>> if we put LoadLoad in here, maybe it is redundant for TemplateTable::patch_bytecode(..)
>> 
>> https://github.com/openjdk/jdk/blob/a08af97fda06e785d1c3a5a17e562e65150bbb07/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp#L192
>> 
>> because there has a Load-acquire in L199.
>> can we change 
>> `load_field_entry(Register cache, Register index, int bcp_offset = 1) ` 
>> to 
>> `load_field_entry(Register cache, Register index, int bcp_offset = 1, bool needLoadLoad = 1) `?
>> 
>> then change L192 like this
>> <pre>
>> @@ -189,7 +189,7 @@ void TemplateTable::patch_bytecode(Bytecodes::Code bc, Register bc_reg,
>>        // additional, required work.
>>        assert(byte_no == f1_byte || byte_no == f2_byte, "byte_no out of range");
>>        assert(load_bc_into_bc_reg, "we use bc_reg as temp");
>> -      __ load_field_entry(temp_reg, bc_reg);
>> +      __ load_field_entry(temp_reg, bc_reg, 1 /*bcp_offset*/, false /*needLoadLoad*/);
>>        if (byte_no == f1_byte) {
>>          __ lea(temp_reg, Address(temp_reg, in_bytes(ResolvedFieldEntry::get_code_offset())));
>>        } else {
>>          __ lea(temp_reg, Address(temp_reg, in_bytes(ResolvedFieldEntry::put_code_offset())));
>>        }                                                                         
>>        // Load-acquire the bytecode to match store-release in ResolvedFieldEntry::fill_in()
>>        __ ldarb(temp_reg, temp_reg);
>> </pre>
>
> I believe patch_bytecode only needs a LoadStore between reading the cache bytecode and patching at bcp[0], so I would agree the LoadLoad in load_field_entry is not needed here.  But the reason is not because we have already load-acquire.  The reason is because there are not two loads that need ordering.  There is only a load and a store.  We can't replace the load-acquire with a LoadLoad, for example.

The LoadLoad in load_field_entry is needed for the second thread that (t2-1) loads the patched bytecode, and then (t2-2) loads the pointer to the cpCache entry for field, then (t2-3) loads the values from the cpCache entry pointer.  The LoadLoad is between t2-2 and t2-3.

The ldarb in patch_bytecode does make the LoadLoad in load_field_entry redundant as pointed out by @sunny868 but seems harmless in terms of performance and correctness, ie. not needing a special parameter. 

The reason that it's in load_field_entry is that there are other callers who do not load acquire the get_code and put_code fields.  So it's safer and less error prone to forget with it in load_field_entry.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18477#discussion_r1583783084


More information about the hotspot-compiler-dev mailing list