RFR (M) 8235573: Move JFR ObjectSample oop into OopStorage
kim.barrett at oracle.com
Wed Aug 5 06:40:08 UTC 2020
> On Aug 4, 2020, at 5:39 AM, Markus Gronlund <markus.gronlund at oracle.com> wrote:
> About _last_sweep:
> The value, representing the last time the GC passed over the sample instances, is only treated as a hint, used as input to construct a filter when serializing object samples as events along with their associated chains. The filter excludes samples that are short-lived, i.e. having timestamps after the _last_sweep value. _last_sweep is only read by a single thread which at that moment have exclusive access to the sampler instance. It is ok for it to read whatever value leaked to it, in fact it can be advantageous to see an older value as it makes the filter more effective in reducing noise. Therefore, I think it is ok to leave an undecorated access to this variable. Also, there is no ordering constraint, the value is just reset when constructing a new instance, and a race is ok here too.
In the old code, the write of _last_sweep was done during a weak_oops_do
that was called by all GCs during a safepoint, thus providing the needed
synchronization with the Java thread reads.
In the new code, for a GC that does STW reference processing, the GC
callback that updates _last_sweep is from that STW safepoint context, so
still not concurrent with the sampling.
However, for a GC that does concurrent reference processing, the GC callback
is concurrent with mutator threads, so potentially concurrent with sampling.
So we have unsynchronized non-atomic read/write pairs, which is UB. And
that's UB being introduced by this change.
And it's not one of those "undetectable" UB's that we sometimes squint and
pretend isn't happening and hope the compiler never becomes smart enough to
find. The value involved is a 64bit timestamp. If run on a 32bit platform,
word tearing could ensue for the reader.
A mitigating factor is that ZGC doesn't run on 32bit platforms and changing
that is not on the roadmap.
I don't know if Shenandoah has concurrent reference processing (yet), but it
does support 32bit platforms:
Looking at the use of _last_sweep, it's not clear that it really needs to be
a clock value; maybe it could just be a (atomic) counter? But that doesn't
look like a simple change to make.
More information about the hotspot-dev