RFR: JDK-8317661: [REDO] store/load order not preserved when handling memory pool due to weakly ordered memory architecture of aarch64

David Holmes dholmes at openjdk.org
Wed Oct 18 05:57:45 UTC 2023


On Wed, 11 Oct 2023 09:56:37 GMT, Damon Fenacci <dfenacci at openjdk.org> wrote:

> # Issue
> An intermittent _Memory Pool not found_ error has been noticed when running  a few tests (_vmTestbase/vm/mlvm/meth/stress/compiler/deoptimize/Test.java_, _vmTestbase/vm/mlvm/meth/stress/compiler/sequences/Test.java_) on _macosx_aarch64_ (production build) with non-segmented code cache.
> 
> ## Origin
> The issue originates from the fact that aarch64 architecture is a weakly ordered memory architecture, i.e. it _permits the observation and completion of memory accesses in a different order from the program order_.
> 
> More precisely: while calling `CodeHeapPool::get_memory_usage`, the `used` and `committed` variables are retrieved
> https://github.com/openjdk/jdk/blob/138542de7889e8002df0e15a79e31d824c6a0473/src/hotspot/share/services/memoryPool.cpp#L181-L182
> and these are computed based on different variables saved in memory in `CodeCache::allocate` (during `heap->allocate` and `heap->expand_by` to be precise) .https://github.com/openjdk/jdk/blob/138542de7889e8002df0e15a79e31d824c6a0473/src/hotspot/share/code/codeCache.cpp#L535-L537
> The problem happens when first `heap->expand_by` gets called (which _increases_ `committed`) and then `heap->allocate` gets called in a second loop pass (which _increases_ `used`). Although stores in `CodeCache::allocate` happen in the this order, when reading from memory in `CodeHeapPool::get_memory_usage` it can happen that `used` has the newly computed value, while `committed` is still "old" (because of ARM’s weak memory order). This is a problem, since `committed` must be > than `used`.
> 
> # Solution
> 
> To avoid this situation we must assure that values used to calculate `committed` are actually saved before the values used to calculate `used` and that the opposite be true for reading. To enforce this we add an `acquire/release` ordering for `CodeHeap::_next_segment` ensuring that when `CodeHeap::allocated_capacity` is called, if `_next_segment` (which is used to calculate `used`) has already been updated in `CodeHeap::allocate`, then all values written before are going to be visible (the ones to calculate `committed`).
> 
> Acquiring a `CodeCache_lock` has been attempted in #15819 but resulted in deadlocks (that seem to be unavoidable).

src/hotspot/share/memory/heap.cpp line 306:

> 304:     block = block_at(_next_segment);
> 305:     block->initialize(number_of_segments);
> 306:     Atomic::release_store(&_next_segment, _next_segment + number_of_segments);

When we do the release here, what previous stores are we trying to order with the reading code below? It is hard to see the complete picture with this code just from the PR changes.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16143#discussion_r1363230970


More information about the hotspot-runtime-dev mailing list