RFR: 8310239: Add missing cross modifying fence in nmethod entry barriers [v3]

Fri Oct 20 20:00:08 UTC 2023

On 20 Oct 2023, at 9:03, Erik Österlund wrote:

>> In fact, there is a current race in the nmethod entry barriers, where what we are doing violates the AMD APM (cf. APM volume 2 section 7.6.1 https://www.amd.com/system/files/TechDocs/24593.pdf).

Pages 205 and 206 in the AMD doc talk about self-modifying code and
then (what we care about) cross-modifying code (CMC).  The doc then
goes on to discuss asynchronous support for CMC (which is the part we
care most about for high-performance code) and synchronous CMC.

It’s really well written; kudos to AMD.  And it’s friendly to us.
Specifically, they seem to have worked hard to make the instruction
fetcher read in a total store order, respecting the ordering of
writes from whatever gremlin is modifying the code stream.
Also, any derived state (such as decodings of fetched
instructions) is invalidated the right way after I$ changes.

All that makes our job easier: running in the fast lane,
which requires knowing exactly where the boundaries and
limits of the fast lane are, so we don’t fall off the icy cliff
immediately to our left.

By contrast, the Intel SDM, at 9.1.3 (Handling Self- and
Cross-Modifying Code), only covers the synchronous case of
CMC, and is rather short.  I know the Intel architects have
thought about this problem too, of asynchronous CMC.  And we
may have verbally discussed with them rules like the
ones AMD has published.  I don’t recall seeing a write-up
from Intel on this specific subject, though.  Maybe Sandhya
or another Intel person can help me find it?

— John