RFC Literally: 8232782: streamline post-LRB CAS barrier (aarch64)
Nilsen, Kelvin
kdnilsen at amazon.com
Fri May 15 15:05:03 UTC 2020
I've been planning to mimic the changes made with https://bugs.openjdk.java.net/browse/JDK-8222766 (Shenandoah: streamline post-LRB CAS barrier (x86)).
I have run into a few challenges because the AARCH64 variant of cmpxchg_oop takes different arguments than the x86 version. In particular, AARCH64 has acquire, release, weak, and is_cae boolean arguments in place of the single boolean exchange argument. I've been sleuthing through the implementation, but some comments would be very helpful from anyone already familiar with this code, as I am relatively new to both the HotSpot and Shenandoah code bases, and to AARCH64 as well. Are these arguments providing support for ...
different variants of ARM?
different contextual state?
experimentation with different code generation models?
something else?
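For reference, Java's own VarHandle API draws distinctions with similar names, so here is a small sketch of what those look like at the Java level. Whether the AARCH64 cmpxchg_oop arguments actually correspond to these is an unverified assumption on my part and is exactly what I am asking about. The class CasFlavorsSketch, the Cell type, and the method names are made up for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustration only: hypothetical names, not HotSpot code.
final class CasFlavorsSketch {
    /** A single reference field that the VarHandle below operates on. */
    static final class Cell {
        Object ref;
        Cell(Object ref) { this.ref = ref; }
    }

    static final VarHandle REF;
    static {
        try {
            REF = MethodHandles.lookup().findVarHandle(Cell.class, "ref", Object.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Compare-and-set: reports only success or failure. */
    static boolean cas(Cell c, Object expected, Object update) {
        return REF.compareAndSet(c, expected, update);
    }

    /** Compare-and-exchange: returns the value witnessed in the field. */
    static Object cae(Cell c, Object expected, Object update) {
        return REF.compareAndExchange(c, expected, update);
    }

    /** Same exchange, but with acquire-only ordering. */
    static Object caeAcquire(Cell c, Object expected, Object update) {
        return REF.compareAndExchangeAcquire(c, expected, update);
    }

    /** Same exchange, but with release-only ordering. */
    static Object caeRelease(Cell c, Object expected, Object update) {
        return REF.compareAndExchangeRelease(c, expected, update);
    }

    /** Weak CAS may fail spuriously, so callers typically retry in a loop. */
    static boolean weakCasLoop(Cell c, Object expected, Object update) {
        while (true) {
            if (REF.weakCompareAndSet(c, expected, update)) {
                return true;               // installed the update
            }
            if (REF.getVolatile(c) != expected) {
                return false;              // genuine mismatch, not a spurious failure
            }
        }
    }
}
```

If the assembler arguments do mirror these flavors, then acquire and release would select memory ordering, weak would permit spurious failure, and is_cae would choose between a boolean result and the witnessed value.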
By the way, is there a reliable and current document that addresses the "theory of operation" for coherency and synchronization between parallel tasks within the new and improved Shenandoah implementation? I have studied the original Shenandoah design in depth, but I haven't yet found any detailed description of Shenandoah 2 (I have reviewed recent talks with PowerPoint overviews).
My working assumptions are:
1. Multiple concurrent reader threads may discover the same previously uncopied object and decide to copy it. Each one copies the object in parallel and then uses CAS to "mark with forwarding address". The race is resolved in favor of the thread whose CAS succeeds. The other threads abandon their copies and use the address "supplied" by the winning copy.
2. All threads will heal in place the obsolete pointer to the original object. But the healing process for each address also needs a lock, because a parallel mutator thread may overwrite the reference field between the moment when the reader fetches its value and the moment it attempts to replace that value with the relevant forwarding pointer. What does this lock look like to the healing reader? To the mutator? Or is there some argument as to why no locking is necessary here?
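To make the two assumptions concrete, here is a minimal Java sketch of how I picture them. It is a toy model built on AtomicReference, not Shenandoah code: ForwardingSketch, Obj, evacuate, and heal are names invented for illustration. For item 2 it shows one possible lock-free alternative, in which the healing write is itself a CAS against the stale value, so that a concurrent mutator store simply causes the heal to be skipped. Whether Shenandoah actually works this way is the question I am asking.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of the working assumptions; all names are hypothetical.
final class ForwardingSketch {
    /** Toy heap object: a payload plus a forwarding slot (null means "not yet copied"). */
    static final class Obj {
        final int payload;
        final AtomicReference<Obj> forwardee = new AtomicReference<>(null);
        Obj(int payload) { this.payload = payload; }
    }

    /** Assumption 1: copy speculatively, then race to install the forwarding pointer. */
    static Obj evacuate(Obj from) {
        Obj existing = from.forwardee.get();
        if (existing != null) {
            return existing;                       // someone already won the race
        }
        Obj copy = new Obj(from.payload);          // speculative private copy
        // compareAndExchange returns the witnessed value: null if our CAS won,
        // otherwise the copy installed by the winning thread.
        Obj witness = from.forwardee.compareAndExchange(null, copy);
        return witness == null ? copy : witness;   // losers abandon their copy
    }

    /**
     * Assumption 2, lock-free variant: heal the field only if it still holds the
     * stale reference. Returns false if a mutator overwrote the field meanwhile,
     * in which case the mutator's value is left in place.
     */
    static boolean heal(AtomicReference<Obj> field, Obj stale) {
        Obj forwarded = evacuate(stale);
        return field.compareAndSet(stale, forwarded);
    }
}
```

In this model a failed heal is harmless: the field no longer refers to the stale object, so there is nothing left to fix at that address.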
Understanding the planned coordination protocol will help me get the code right for JDK-8232782 and will also help me do a better job of testing it.
Thanks for any help you might be able to provide.