RFR: 8247281: migrate ObjectMonitor::_object to OopStorage [v2]

Mon Sep 14 14:23:23 UTC 2020

On Mon, 14 Sep 2020 14:15:41 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> @rkennke - Thanks for the review. I believe @fisk is going to address
>> your comments.
>
> Hi Kim,
> 
> Here is a partial reply to your review. Thanks for reviewing!
> Hmm seems like your email was only sent to shenandoah-dev. Not sure if that was intended. I'm not subscribed to that
> mailing list, so I will send my reply through github and hope for the best.
>> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on
>> [shenandoah-dev](mailto:shenandoah-dev at openjdk.java.net):_
>> ------------------------------------------------------------------------------
>> src/hotspot/share/oops/weakHandle.cpp
>> 36 WeakHandle::WeakHandle(OopStorage* storage, oop obj) :
>> 37 _obj(storage->allocate()) {
>> 38 assert(obj != NULL, "no need to create weak null oop");
>> 
>> Please format this differently so the ctor-init-list is more easily
>> distinguished from the body. I don't care that much which of the several
>> alternatives is used.
>> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/objectMonitor.cpp
>> 244 // Check that object() and set_object() are called from the right context:
>> 245 static void check_object_context() {
>> 
>> This seems like checking we would normally only do in a debug build. Is this
>> really supposed to be done in product builds too? (It's written to support
>> that, just wondering if that's really what we want.) Maybe these aren't
>> called very often so it doesn't matter? I also see that guarantee (rather
>> than assert) is used a fair amount in this and related code.
> 
> I don't think I have a preference here. As you say, it seems a bit mixed. I would be okay with both. Do you want them
> to be asserts?
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/objectMonitor.cpp
>> 251 JavaThread* jt = (JavaThread*)self;
>> 
>> Use self->as_Java_thread().
>> 
>> Later: Coleen already commented on this and it's been fixed.
>> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/objectMonitor.cpp
>> 249 guarantee(self->is_Java_thread() || self->is_VM_thread(), "must be");
>> 250 if (self->is_Java_thread()) {
>> 
>> Maybe instead
>> 
>> if (self->is_Java_thread()) {
>> ...
>> } else {
>> guarantee(self->is_VM_thread(), "must be");
>> }
>> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/objectMonitor.cpp
>> Both ObjectMonitor::object() and ObjectMonitor::object_peek() have
>> 265 if (_object.is_null()) {
>> 266 return NULL;
>> 
>> Should we really be calling those functions when in such a state? That seems
>> like it might be a bug in the callers?
>> 
>> OK, I think I see some places where object_peek() might need such protection
>> because of races, in src/hotspot/share/runtime/synchronizer.cpp. And
>> because we don't seem to ever release() the underlying WeakHandles.
>> 
>> 514 if (m->object_peek() == NULL) {
>> 703 if (cur_om->object_peek() == NULL) {
>> 
>> But it still seems like it might be a bug to call object() in such a state.
> 
> The problem is Type Stable Memory (TSM). The idea is that all monitors are allocated in blocks of many (when the object
> they are associated with is not yet known), and are then re-used, switching object association back and forth. They are
> also never freed.  The changes in this RFR were originally tied together with a patch that also removes TSM. But this
> has been split into separate parts by Dan in order to perform the OopStorage migration first, and then removing TSM. So
> in this patch, we are still dealing with block-allocation of monitors before they have an associated object, and
> changing of object association. Therefore, what you are saying is 100% true, but I would like to defer fixing that
> until the follow-up patch that removes TSM. In that patch, the accessors are changed to do exactly what you expect them
> to (resolve/peek), as the ObjectMonitor instances are allocated with a single object association in mind, installed in
> the constructor, and get deleted after deflation, with destuctors run. So I hope we can agree to wait until that
> subsequent patch with addressing such concerns. We thought reviewing the whole thing as a single patch was too much
> clutter.
>> Related, see next comment.
>> 
>> Later: Looks like Coleen questioned this too. I'm not sure I understand
>> Erik's response though. When do we look at monitors that might not be
>> constructed / initialized? That seems like a bad thing to do. Oh, but the
>> whole creation path for OMs is kind of evil right now. I see... And that's
>> planned to be fixed in later work. OK.
> 
> Yes, exactly. It is currently quite evil, but will soon be cleaned up to look quite ordinary.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/synchronizer.cpp
>> 1548 guarantee(m->object_peek() == NULL, "invariant");
>> 
>> Later: But see previous comment. Some of this might be relevat later though.
>> 
>> object_peek() seemed like the wrong operation here. I thought this was
>> attempting to verify that the underlying WeakHandle has been released. But
>> peeking doesn't ensure that. Oh, but we don't actually release() the
>> WeakHandle when we "free" an OM. We're just pushing the OM on the free
>> list. Which means the GC will continue to examine the associated OopStorage
>> entry (and discover it's NULL). So there's some cost to not releasing the
>> WeakHandle when putting an OM on the free list. Of course, it means there
>> won't be any allocations when taking off the free list; I'm not sure which
>> way is better.
> 
> Yeah, so we never really release the monitors, because the whole monitor can't be freed yet in this patch.
> 
>> But this makes me go back and wonder about whether object_peek() should have
>> the _object.is_null() check. After creation it seems like that should never
>> be true.
> 
> Before the switch away from TSM (next patch), we must assume that monitors have not yet been properly initialized, due
> to block allocation of a whole bunch of monitors at a time, before they have been associated with objects. They are
> also exposed to iterators and such, before any object association has been made. But very very soon, that will be a
> memory of the past.
>> ------------------------------------------------------------------------------
>> src/hotspot/share/runtime/synchronizer.cpp
>> 
>> The old code in chk_in_use_entry seems wrong. It checked for a null
>> object() and recorded that as an error. But then it went on and attempted
>> to use it as if it was not null. That's been fixed by the change. However,
>> the change no longer treats a null as an error. Probably this is because
>> it's weak, and so could have become null. But is that really possible for
>> an "in use" monitor?
> 
> Well, the monitors are added to some in-use list at deflation time. The instant after, the monitor could have become
> unused. But it will still be floating around on the in-use list, until deflation time. And yes, now it is suddenly very
> much valid for a monitor that is no longer used (but placed on the in-use list when it was used), to have a NULL
> object, due to GC getting to weak processing clearing the oop, before deflation gets to it.
>> ------------------------------------------------------------------------------
>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/ObjectMonitor.java
>> 94 public OopHandle object() {
>> 95 Address objAddr = addr.getAddressAt(objectFieldOffset);
>> 96 if (objAddr == null) {
>> 97 return null;
>> 98 }
>> 99 return objAddr.getOopHandleAt(0);
>> 
>> How about something a little bit less abstraction smashing?
> 
> This is already the "norm", and the way that all other handle loads look in the SA today, and therefore the SA does not
> support any GC with load barriers properly. While ideally, there would be something equivalent to the Access API in the
> SA to abstract the access details, I think it seems like way beyond the scope of this change to implement something
> like that for the SA, as part of this change.  Again, thanks for reviewing!

> Shenandoah changes are not complete:
> /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp: In member function 'virtual
> void ShenandoahFinalMarkingTask::work(uint)':
> /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:298:31: error: 'oops_do' is
> not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&resolve_mark_cl); ^~~~~~~
> /home/rkennke/src/openjdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp:308:31: error: 'oops_do' is
> not a member of 'ObjectSynchronizer' ObjectSynchronizer::oops_do(&mark_cl);
> ^~~~~~~
> 
> I will have a look into how to resolve this.
> 
> In order to fix this, I need one of two things:
> 
>     * A way to iterate oops of the global-list. Explanation: In Shenandoah's I-U mode, we need to re-mark such ObjectMonitors
>       during final-mark because they may be the only remaining reference to an object on the heap.
>       _or_ (even better)
> 
>     * Call a write-barrier (IN_NATIVE) whenever an ObjectMonitor is moved to the global-list upon thread-destruction. This
>       way we could mark that object concurrently.
> 
> 
> Notice that this does not affect thread-local ObjectMonitors (IOW, most regular ObjectMonitors) because those would be
> scanned while scanning thread stacks. Therefore I'd like to avoid to generally place a barrier in ObjectMonitor.
> See: https://bugs.openjdk.java.net/browse/JDK-8251451
> 
> AFAICT, this may affect ZGC too (not 100% sure about this).

So this change makes the object monitors weak. So you shouldn't have to remark them at all. If they hold the last
reference to the object, then the monitor gets the object cleared as part of normal weak OopStorage reference
processing. Subsequent deflation cycles will detect that the GC has cleared the oop, and reuse the monitor. In other
words, we should just remove the call to the oops_do function, and rely on OopStorage doing its thing. The GC should
not have to care at all about monitors any longer, only about processing weak OopStorages, which is done automatically.
Hope this makes sense.

-------------

PR: https://git.openjdk.java.net/jdk/pull/135