RFR: 8300926: Several startup regressions ~6-70% in 21-b6 all platforms [v2]
Coleen Phillimore
coleenp at openjdk.org
Fri Feb 17 20:00:54 UTC 2023
On Fri, 17 Feb 2023 09:25:55 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
>> Hi all, please consider.
>>
>> The original issue occurred when thread 1 asked to deopt nmethod set X and thread 2 asked for the same set or a subset of X.
>> All methods would already be marked, but the actual deoptimization (making them not entrant, patching PCs on stacks, and patching post-call nops) was not done yet, which meant thread 2 could 'pass' thread 1.
>> Most places deopt under Compile_lock, so this is not an issue there, but WB and clearCallSiteContext do not.
>>
>> Since a handshake may take a long time to complete and Compile_lock is used for much more than deopts,
>> the fix in https://bugs.openjdk.org/browse/JDK-8299074 always emitted a handshake, even when everything was already marked, instead of adding Compile_lock to all places.
>>
>> This turned out to be problematic during startup: for example, the number of deopt handshakes in a jetty dry-run startup went from 5 to 39.
>>
>> This fix first adds a barrier that you cannot pass until the requested deopts have happened, and it coalesces the handshakes.
>> Second, it moves the handshake part out of the Compile_lock where possible.
>>
>> This fixes the performance bug and reduces contention on Compile_lock, meaning higher throughput in the compiler and in things such as class loading.
>>
>> It passes t1-t7 with flying colours! t8 has not completed yet, and I'm redoing some testing due to last-minute simplifications.
>>
>> Thanks, Robbin
>
> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
>
> Review fixes
src/hotspot/share/classfile/systemDictionary.cpp line 1505:
> 1503: // Add to systemDictionary - so other classes can see it.
> 1504: // Grabs and releases SystemDictionary_lock
> 1505: update_dictionary(THREAD, k, loader_data);
All these patterns are the same in class loading, except that here we update the SystemDictionary after deoptimizing while holding the Compile_lock, and in your change we update it beforehand.
I was going to suggest a future cleanup of moving add_to_hierarchy and the associated deopt_scope code to InstanceKlass, but now I'm not sure whether this order matters.
Holding the Compile_lock while adding a class to the SystemDictionary, and the code that takes the Compile_lock in ciEnv::get_klass_by_name(), no longer make sense to me. A Klass is read lock-free from the dictionary, so temporarily holding the Compile_lock doesn't seem to do what the comment says:
```
{ // Grabbing the Compile_lock prevents systemDictionary updates
  // during compilations.
```
At any rate, I think the SystemDictionary might need to be updated after deoptimization is done. Can you check this?
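For context, the barrier-plus-coalescing idea in the quoted description can be pictured roughly as below. This is only a minimal standalone sketch under my own assumptions, not the code in the PR: `DeoptBarrier`, `request`, `await_or_run`, and the ticket counters are hypothetical names, and the `handshake` callback stands in for the real deopt handshake. A requester takes a ticket after marking, then either waits for an in-flight handshake or runs one that covers every request marked so far.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a deopt barrier; not HotSpot code.
class DeoptBarrier {
 public:
  // Call after marking nmethods. The returned ticket identifies the request.
  std::uint64_t request() {
    std::lock_guard<std::mutex> guard(lock_);
    return ++requested_;
  }

  // Returns only once a handshake covering `ticket` has completed.
  // Returns true if this caller ran the handshake, false if an earlier
  // handshake had already covered the ticket (coalescing).
  template <typename Handshake>
  bool await_or_run(std::uint64_t ticket, Handshake handshake) {
    std::unique_lock<std::mutex> guard(lock_);
    while (completed_ < ticket) {
      if (in_progress_) {
        // Someone else is handshaking; wait and then re-check coverage.
        cv_.wait(guard);
        continue;
      }
      in_progress_ = true;
      // Cover every request marked so far, not just our own.
      const std::uint64_t covers = requested_;
      guard.unlock();
      handshake();  // the slow part runs outside the lock
      guard.lock();
      completed_ = covers;
      in_progress_ = false;
      ++handshakes_;
      cv_.notify_all();
      return true;
    }
    return false;
  }

  // Number of handshakes actually executed.
  int handshakes() const {
    std::lock_guard<std::mutex> guard(lock_);
    return handshakes_;
  }

 private:
  mutable std::mutex lock_;
  std::condition_variable cv_;
  std::uint64_t requested_ = 0;  // highest ticket handed out
  std::uint64_t completed_ = 0;  // highest ticket covered by a finished handshake
  bool in_progress_ = false;
  int handshakes_ = 0;
};
```

In this sketch nobody returns before its own request has been handled, so a second requester cannot 'pass' the first, and a requester whose ticket is already covered does not trigger another handshake. Where the SystemDictionary update sits relative to `await_or_run` is exactly the ordering question above, and the sketch does not answer it.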
-------------
PR: https://git.openjdk.org/jdk/pull/12585