RFR: JDK-8269393: store/load order not preserved when handling memory pool due to weakly ordered memory architecture of aarch64

Fri Oct 6 08:42:46 UTC 2023

On Thu, 5 Oct 2023 11:02:48 GMT, Damon Fenacci <dfenacci at openjdk.org> wrote:

>>> > There is a CodeCache lock, but it seems to be that the memory tracking is accessing data outside the lock. I don't know how that's supposed to work
>>> 
>>> Acquiring a CodeCache lock when getting `used` and `committed` here
>>> 
>>> https://github.com/openjdk/jdk/blob/12de9b0225363377e9a76729b11698221d4f29f2/src/hotspot/share/services/memoryPool.cpp#L181-L182
>>> 
>>> was actually the solution in the first commit [1060bb0](https://github.com/openjdk/jdk/commit/1060bb01f7ad0ca47c38eb0f68b7f2aab6562ac3) but then, asking about the expensiveness of a lock vs storestore/loadload barriers, and the fact that the origin of the issue was actually the order in which store and load actually happen, made me think barriers would be preferable, thing that I now very much doubt.
>>> You are suggesting that it would make more sense (and be cleaner) to acquire a CodeCache lock instead of using barriers, right?
>> 
>> That seems to be the established convention. And I wonder if there are other problems caused by accessing data in this way, outside the CodeCache lock.
>> 
>>> > ```
>>> >   // Caller in JDK is responsible for synchronization -
>>> >   // acquire the lock for this memory pool before calling VM
>>> >   MemoryUsage usage = get_memory_usage();
>>> > ```
>>> > Oops. I can't see the lock being acquired.
>>> 
>>> I'm wondering if there are more places where this would be needed...
>> 
>> The comment makes it pretty clear that `get_memory_usage()`really needs to hold the lock for this memory pool. I very strongly suspect this is the real cause of your problem.
>> 
>>> > What do you mean by "we’d need a reference to be shared between the 2 places "?
>>> 
>>> What I meant is mainly that in this specific case the _storing_ and _loading_ happens in very different places in the code.
>> 
>> Sure, it's not best practice. Better than an (apparently) randomly-placed fence, though.
>
> @theRealAph do you think the current solution (with `CodeCache_lock` acquisition) could be OK?

I think so. I am a bit nervous because `MutexLock` isn't recursive, so if this code is ever called from a region that is already locked it'll fail. But there is an assertion in `MutexLock` that will detect that if it ever happens, so OK.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15819#discussion_r1348408521