RFR: 8331911: Reconsider locking for recently disarmed nmethods [v3]

Thu Jun 6 11:46:47 UTC 2024

On Thu, 30 May 2024 16:04:04 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>>> It seems fine to me that the GC backends are responsible for checking if the nmethod is disarmed outside the lock. However, we have some callers that now check it redundantly. I think those callers should stop doing that now. Otherwise, this looks good to me.
>> 
>> Thanks for the feedback! Looking around the code, I think there are a few places where we can do more changes.
>> 
>> First, remove check here:
>> https://github.com/openjdk/jdk/blob/bc7d9e3d0bc663bbbeb068889082da4a9f0fa8de/src/hotspot/share/code/nmethod.cpp#L852-L855
>> 
>> This would force us to add the check in super-class implementation here:
>> https://github.com/openjdk/jdk/blob/bc7d9e3d0bc663bbbeb068889082da4a9f0fa8de/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L83
>> 
>> Second, we can remove the check here:
>> https://github.com/openjdk/jdk/blob/bc7d9e3d0bc663bbbeb068889082da4a9f0fa8de/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L175-L181
>> 
>> But it does not seem straightforward, because we currently skip cross-modification fence based on is_armed(...) check. Unfortunately, we cannot easily know if nmethod_entry_barrier acted or not, we only know if method is safe or not.  Can we / should we do these refactoring separately?
>
>> > It seems fine to me that the GC backends are responsible for checking if the nmethod is disarmed outside the lock. However, we have some callers that now check it redundantly. I think those callers should stop doing that now. Otherwise, this looks good to me.
>> 
>> 
>> 
>> Thanks for the feedback! Looking around the code, I think there are a few places where we can do more changes.
>> 
>> 
>> 
>> First, remove check here:
>> 
>> https://github.com/openjdk/jdk/blob/bc7d9e3d0bc663bbbeb068889082da4a9f0fa8de/src/hotspot/share/code/nmethod.cpp#L852-L855
>> 
>> 
>> 
>> This would force us to add the check in super-class implementation here:
>> 
>> https://github.com/openjdk/jdk/blob/bc7d9e3d0bc663bbbeb068889082da4a9f0fa8de/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L83
>> 
>> 
>> 
>> Second, we can remove the check here:
>> 
>> https://github.com/openjdk/jdk/blob/bc7d9e3d0bc663bbbeb068889082da4a9f0fa8de/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L175-L181
>> 
>> 
>> 
>> But it does not seem straightforward, because we currently skip cross-modification fence based on is_armed(...) check. Unfortunately, we cannot easily know if nmethod_entry_barrier acted or not, we only know if method is safe or not.  Can we / should we do these refactoring separately?
> 
> I see your point. However, this PR is refactoring the code to iron out who is responsible for checking is_armed, so I would prefer if we got that right in this PR. We say it should be the backend code doing that, so the callers shouldn't. I agree with all the changes you just listed and if you make them I would be happy.
> 
> Regarding the cross modifying fence, I strongly prefer to not try and be clever. Just run the cross modifying fence unconditionally after calling the backend code. We get there because the barrier was armed anyway.

> @fisk 
> 
> I've addressed the feedback. Can you take a look?
> 
> I did not remove the check [here](https://github.com/openjdk/jdk/blob/5dcb7a627e1cfb360719a25722588180e5de9d09/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L175-L177). Removing this check resulted in time out when `-XX:+DeoptimizeNMethodBarriersALot` flag set as it [executes deoptimization code path](https://github.com/openjdk/jdk/blob/5dcb7a627e1cfb360719a25722588180e5de9d09/src/hotspot/share/gc/shared/barrierSetNMethod.cpp#L200-L203)

I am quite nervous about having that silly optimization there. So I'm going to have to insist on removing it. Perhaps though, it should be fixed as a separate issue. Allow me to explain myself why it makes me nervous.
The nmethod entry barriers guard modifications to instructions and data through a mixed bag of synchronous and asynchronous cross modifying code.
Synchronous cross modifying code is the sane thing to do; we modify instructions, guarded from concurrent execution by a data flag. After modifying the instructions, the data flag is flipped, and observers are allowed to execute the instructions *after executing an instruction cross modification fence*.
On for example AArch64 we only perform synchronous cross modifying code, and the stub has the fencing machinery, so it should be okay.
However, on x86_64, we perform a mix of asynchronous and synchronous cross modifying code. The guard word is the immediate part of a compare instruction. If the new disarmed immediate is observed by concurrent execution, instruction cache coherency guarantees that we will correctly observe the cross modified instructions when they are subsequently executed.
However, when we go into the stub slow path, and check is_armed etc, these are *data reads*. That makes the dance entirely different as it suddenly performs synchronous cross modifying code. If a data read observes that the instructions have been modified, we don't have the same level of guarantees any longer, unless we perform an instruction cross modification fence.
So my concern here, is that the silly optimization to fix some verification code timeout or whatever, is in fact causing a real correctness problem for real release builds, on x86_64. By skipping the cross modification fence we perform an incomplete synchronous instruction cross modification dance that isn't sound.
Having said that, perhaps we should file a separate issue to remove that check, since it seems to fix an actual bug, while I guess this was meant more as an optimization. What do you think?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19285#issuecomment-2152174420