RFR: JDK-8269393: store/load order not preserved when handling memory pool due to weakly ordered memory architecture of aarch64
Andrew Haley
aph at openjdk.org
Fri Sep 29 14:43:15 UTC 2023
On Tue, 19 Sep 2023 15:04:27 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> # Issue
>> An intermittent _Memory Pool not found_ error has been noticed when running a few tests (_vmTestbase/vm/mlvm/meth/stress/compiler/deoptimize/Test.java_, _vmTestbase/vm/mlvm/meth/stress/compiler/sequences/Test.java_) on _macosx_aarch64_ (production build) with non-segmented code cache.
>>
>> ## Origin
>> The issue originates from the fact that aarch64 is a weakly ordered memory architecture, i.e. it _permits the observation and completion of memory accesses in a different order from the program order_.
>>
>> More precisely: while calling `CodeHeapPool::get_memory_usage`, the `used` and `committed` variables are retrieved
>> https://github.com/openjdk/jdk/blob/138542de7889e8002df0e15a79e31d824c6a0473/src/hotspot/share/services/memoryPool.cpp#L181-L182
>> and these are computed based on different variables saved in memory in `CodeCache::allocate` (during `heap->allocate` and `heap->expand_by` to be precise).
>> https://github.com/openjdk/jdk/blob/138542de7889e8002df0e15a79e31d824c6a0473/src/hotspot/share/code/codeCache.cpp#L535-L537
>> The problem happens when `heap->expand_by` is called first (which _increases_ `committed`) and `heap->allocate` is then called in a second loop pass (which _increases_ `used`). Although the stores in `CodeCache::allocate` happen in this order, a read in `CodeHeapPool::get_memory_usage` can observe the newly computed value of `used` while `committed` is still "old" (because of ARM’s weak memory ordering). This is a problem, since `committed` must be greater than `used`.
>>
>> # Solution
>>
>> To avoid this situation we must ensure that the values used to calculate `committed` are actually stored before the values used to calculate `used`, and that they are read in the opposite order. To enforce this we acquire `CodeCache_lock` while reading `used` and `committed` in `CodeHeapPool::get_memory_usage` (which should actually be the convention when accessing CodeCache data).
>
> src/hotspot/share/services/memoryPool.cpp line 182:
>
>> 180: MemoryUsage CodeHeapPool::get_memory_usage() {
>> 181: OrderAccess::loadload();
>> 182: size_t used = used_in_bytes();
>
> This doesn't look quite right. A `loadload` controls the ordering between two accesses. If you want to make sure that you don't see an old version of `committed` with a new version of `used` then the `loadload` must be _between_ the two loads.
Also, for clarity, I'd make `used` volatile, and access it with `Atomic::release_store()` and `Atomic::load_acquire()`. That should keep everything straight, assuming there aren't any more ordering failures.
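
To illustrate the two suggestions above, here is a minimal sketch in standard C++ rather than HotSpot code. The `HeapCounters` class, `Snapshot` struct, and their member names are hypothetical, and `std::atomic` stands in for HotSpot's `Atomic` and `OrderAccess` (an acquire fence is somewhat stronger than `loadload`, so the mapping is only approximate). It assumes writers are serialized, for example by a lock. Variant A places the barrier between the two loads; variant B uses a release store paired with an acquire load on `used`.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the code heap counters (not HotSpot code).
// Assumes a single writer at a time, e.g. writers serialized by a lock.
struct Snapshot {
  std::size_t used;
  std::size_t committed;
};

class HeapCounters {
 public:
  // Writer step 1: grow the committed size (relaxed store).
  void expand_by(std::size_t bytes) {
    committed_.store(committed_.load(std::memory_order_relaxed) + bytes,
                     std::memory_order_relaxed);
  }

  // Writer step 2: publish the new used size with release semantics, so the
  // earlier store to committed_ is visible to any reader that observes it.
  void allocate(std::size_t bytes) {
    used_.store(used_.load(std::memory_order_relaxed) + bytes,
                std::memory_order_release);
  }

  // Reader, variant A: relaxed loads with a fence *between* them.
  Snapshot snapshot_with_fence() const {
    Snapshot s;
    s.used = used_.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    s.committed = committed_.load(std::memory_order_relaxed);
    return s;
  }

  // Reader, variant B: acquire load of used_, then load committed_.
  Snapshot snapshot_with_acquire() const {
    Snapshot s;
    s.used = used_.load(std::memory_order_acquire);
    s.committed = committed_.load(std::memory_order_relaxed);
    return s;
  }

 private:
  std::atomic<std::size_t> committed_{0};
  std::atomic<std::size_t> used_{0};
};
```

In both variants the reader loads `used` first, so a reader that sees a new `used` cannot then see a `committed` older than the one stored before it. A fence placed before both loads, as in the patch under review, gives no such guarantee.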
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/15819#discussion_r1330293024