RFR: JDK-8302736: Major performance regression in Math.log on aarch64

Andrew Haley aph at openjdk.org
Mon May 22 09:43:52 UTC 2023


On Thu, 11 May 2023 01:44:08 GMT, Dean Long <dlong at openjdk.org> wrote:

>> This is day one code for the macOS/Aarch64 port which has been in place for two years. Why is this only now being seen to be a problem?
>> 
>> The high-level placement of these calls was done to stop playing whack-a-mole every time we hit a new failure due to a missing `ThreadWXEnable`. I'm all for placing these where they are actually needed but noone seems to be to able to clearly state/identify exactly where that is in the code. The changes in this PR are pushing it down further, but based on the comments e.g.
>> 
>> // we might modify the code cache via BarrierSetNMethod::nmethod_entry_barrier
>>   MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, thread));
>>   return ConfigT::thaw(thread, (Continuation::thaw_kind)kind);
>> 
>> we are not pushing it down to where it is actually needed. The trade-off of course is that if we push this too far down we may have to execute it far more often and so take a performance hit. So figuring out the optimum placement for these in the call stack seems rather difficult.
>
>> The trade-off of course is that if we push this too far down we may have to execute it far more often and so take a performance hit. So figuring out the optimum placement for these in the call stack seems rather difficult.
> 
> Most code does not care what the WXWrite state is.  We could use an alternative approach where code that needs a particular WXWrite state sets it, but when it is done not change the state back.  So instead of using ThreadWXEnable RAII that resets the state when it goes out of scope, we would use thread->enable_wx(WXWrite) before writing into the code cache and we would use thread->enable_wx(WXExec) when transitioning from _thread_in_vm to _thread_in_Java thread state.  The implementation of enable_wx() already makes redundant state transitions cheap.  This allows us to move the thread->enable_wx(WXWrite) to immediately before the write into the code cache without needing to worry about finding an optimal coarser scope if the code writes into the code cache in multiple places.

> @dean-long and @theRealAph are you ok with this change as a point-fix?

I'm pretty nervous, to be honest. I think it'll work.
Could we add a write-enable to`PcDescCache::add_pc_desc`? I Don't know how often that function is used.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13606#issuecomment-1556893695


More information about the hotspot-dev mailing list