RFR: 8316959: Improve InlineCacheBuffer pending queue management
Dean Long
dlong at openjdk.org
Fri Oct 6 19:48:05 UTC 2023
On Thu, 5 Oct 2023 15:43:07 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
> Hi all,
>
> please review this change that makes enqueuing `CompiledICHolder`s into the `InlineCacheBuffer::_pending_list` lock free. This reduces time clearing IC call sites when unlinking during parallel code cache unloading by 40% (at 16 threads, with 60k unloaded nmethods) affecting G1, Shenandoah.
>
> I do not expect any noticeable difference for other collectors unloading serially (an uncontended mutex should be similar in performance than the CAS).
>
> The implementation is fairly straightforward, some comments:
> * I do not know why the previous implementation used `InlineCacheBuffer_lock` for guarding access to the pending list. This and the other user do not share any of the affected variables.
> * managing the pending count using atomic increments is as problematic wrt to consistency with other code reading from it as before as the readers did not take the lock either. Code review showed that the time of when the code is executed seem to be disjoint enough (i.e. inside/outside safepoint, after synchronization). As mentioned, no difference here.
> * `InlineCacheBuffer::release_pending_icholders` is as unguarded wrt to races with `queue_for_release` as before.
>
> Note that there can be a performance/pause time problem with ZGC wrt to `InlineCacheBuffer::release_pending_icholders`: releasing (a lot of them) in the safepoint used for it can take a few ms alone (In my stress test 6-7ms) which might break some guarantees (otoh it's strictly speaking not the GC taking such a long stw pause). This also did not change to before.
>
> Testing: tier1-5, code unloading stress testing
src/hotspot/share/code/icBuffer.cpp line 270:
> 268: for (;;) {
> 269: icholder->set_next(old);
> 270: CompiledICHolder* cur = Atomic::cmpxchg(&_pending_released, old, icholder, memory_order_relaxed);
I think we need a release barrier between these two writes, and a corresponding acquire barrier on the read side.
Also, how about adding an assert that icholder->next() == nullptr as the first line?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16059#discussion_r1349250449
More information about the hotspot-compiler-dev
mailing list