RFR: 8212879: Make JVMTI TagMap table concurrent [v4]

Wed Nov 4 09:36:03 UTC 2020

On Tue, 3 Nov 2020 21:31:35 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> src/hotspot/share/prims/jvmtiTagMap.cpp line 2979:
>> 
>>> 2977: 
>>> 2978: // Concurrent GC needs to call this in relocation pause, so after the objects are moved
>>> 2979: // and have their new addresses, the table can be rehashed.
>> 
>> I think the comment is confusing and wrong. The requirement is that the collector must call this before exposing moved objects to the mutator, and must provide the to-space invariant. (This whole design would not work with the old Shenandoah barriers without additional work. I don't know if tagmaps ever worked at all for them? Maybe they added calls to Access<>::resolve (since happily deceased) to deal with that?)  I also think there are a bunch of missing calls; piggybacking on the num-dead callback isn't correct (see later comment about that).
>
> So the design is that when the oops have new addresses, we set a flag in the table to rehash it.  Not sure why this is wrong and why wouldn't it work for shenandoah? @zhengyu123 ?  When we call WeakHandle.peek()/resolve() after the call, the new/moved oop address should be returned.  Why wouldn't this be the case?

I didn't say it "doesn't work for shenandoah", I said it wouldn't have worked with the old shenandoah barriers without additional work, like adding calls to resolve.  I understand the design intent of notifying the table management that its hash codes are out of date.  And the num-dead callback isn't the right place, since there are num-dead callback invocations that aren't associated with hash code invalidation.  (It's not a correctness wrong, it's a "these things are unrelated and this causes unnecessary work" wrong.)

>> src/hotspot/share/prims/jvmtiTagMap.cpp line 3015:
>> 
>>> 3013:       if (tag_map != NULL && !tag_map->is_empty()) {
>>> 3014:         if (num_dead_entries != 0) {
>>> 3015:           tag_map->hashmap()->unlink_and_post(tag_map->env());
>> 
>> Why are we doing this in the callback, rather than just setting a flag?  I thought part of the point of this change was to get tagmap processing out of GC pauses.  The same question applies to the non-safepoint side.  The idea was to be lazy about updating the tagmap, waiting until someone actually needed to use it.  Or if more prompt ObjectFree notifications are needed then signal some thread (maybe the service thread?) for followup.
>
> The JVMTI code expects the posting to be done quite eagerly presumably during GC, before it has a chance to disable the event or some other such operation.  So the posting is done during the notification because it's as soon as possible.  Deferring to the ServiceThread had two problems.  1. the event came later than the caller is expecting it, and in at least one test the event was disabled before posting, and 2. there's a comment in the code why we can't post events with a JavaThread.  We'd have to transition into native while holding a no safepoint lock (or else deadlock).  The point of making this change was so that the JVMTI table does not need GC code to serially process the table.

I know of nothing that leads to "presumably during GC" being a requirement. Having all pending events of some type occur before that type of event is disabled seems like a reasonable requirement, but just means that event disabling also requires the table to be "up to date", in the sense that any GC-cleared entries need to be dealt with. That can be handled just like other operations that use the table contents, rather than during the GC.  That is, use post_dead_object_on_vm_thread if there are or might be any pending dead objects, before disabling the event.

-------------

PR: https://git.openjdk.java.net/jdk/pull/967