RFR: 8212879: Make JVMTI TagMap table concurrent [v5]

Wed Nov 4 02:19:00 UTC 2020

On Wed, 4 Nov 2020 00:08:10 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> This change turns the HashTable that JVMTI uses for object tagging into a regular Hotspot hashtable - the one in hashtable.hpp with resizing and rehashing.   Instead of pointing directly to oops so that GC has to walk the table to follow oops and then to rehash the table, this table points to WeakHandle.  GC walks the backing OopStorages concurrently.
>> 
>> The hash function for the table is a hash of the lower 32 bits of the address.  A flag is set during GC (gc_notification if in a safepoint, and through a call to JvmtiTagMap::needs_processing()) so that the table is rehashed at the next use.
>> 
>> The gc_notification mechanism of weak oop processing is used to notify Jvmti to post ObjectFree events.  In concurrent GCs there can be a window of time between weak oop marking where the oop is unmarked, so dead (the phantom load in peek returns NULL) but the gc_notification hasn't been done yet.  In this window, a heap walk or GetObjectsWithTags call would not find an object before the ObjectFree event is posted.  This is dealt with in two ways:
>> 
>> 1. In the Heap walk, there's an unconditional table walk to post events if events are needed to post.
>> 2. For GetObjectWithTags, if a dead oop is found in the table and posting is required, we use the VM thread to post the event.
>> 
>> Event posting cannot be done in a JavaThread because the posting needs to be done while holding the table lock, so that the JvmtiEnv state doesn't change before posting is done.  ObjectFree callbacks are limited in what they can do as per the JVMTI Specification.  The allowed callbacks to the VM already have code to allow NonJava threads.
>> 
>> To avoid rehashing, I also tried to use object->identity_hash() but this breaks because entries can be added to the table during heapwalk, where the objects use marking.  The starting markWord is saved and restored.  Adding a hashcode during this operation makes restoring the former markWord (locked, inflated, etc) too complicated.  Plus we don't want all these objects to have hashcodes because locking operations after tagging would have to always use inflated locks.
>> 
>> Much of this change is to remove serial weak oop processing for the weakProcessor, ZGC and Shenandoah.  The GCs have been stress tested with jvmti code.
>> 
>> It has also been tested with tier1-6.
>> 
>> Thank you to Stefan, Erik and Kim for their help with this change.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Code review comments from Kim and Albert.

Hi Coleen,

Wow, there are a lot of simplifications and code removal with this fix!
It looks great in general, just some nits below.
I also wanted to suggest renaming the 'set_needs_processing' to 'set_needs_rehashing'. :)

src/hotspot/share/prims/jvmtiTagMap.hpp:

Nit: Would it better to use a plural form 'post_dead_objects_on_vm_thread'? :
`+  void post_dead_object_on_vm_thread();`

src/hotspot/share/prims/jvmtiTagMap.cpp:

Nit: It'd be nice to add a short comment before the check_hashmap similar to L143 also explaining a difference (does check and post just for one env) with the check_hashmaps_for_heapwalk:
122 void JvmtiTagMap::check_hashmap(bool post_events) {
  . . .
143 // This checks for posting and rehashing and is called from the heap walks.
 144 void JvmtiTagMap::check_hashmaps_for_heapwalk() {

I'm just curious how this fragment was added. Did you get any failures in testing? :
1038   // skip if object is a dormant shared object whose mirror hasn't been loaded
1039   if (obj != NULL &&   obj->klass()->java_mirror() == NULL) {
1040     log_debug(cds, heap)("skipped dormant archived object " INTPTR_FORMAT " (%s)", p2i(obj),
1041                          obj->klass()->external_name());
1042     return;
1043   }

Nit: Can we rename this field to something like '_some_dead_found' or '_dead_found'? :
`1186   bool _some_dead;`

Nit: The lines 2997-3007 and 3009-3019 do the same but in different contexts. 
2996   if (!is_vm_thread) {
2997     if (num_dead_entries != 0) {
2998       JvmtiEnvIterator it;
2999       for (JvmtiEnv* env = it.first(); env != NULL; env = it.next(env)) {
3000         JvmtiTagMap* tag_map = env->tag_map_acquire();
3001         if (tag_map != NULL) {
3002           // Lock each hashmap from concurrent posting and cleaning
3003           tag_map->unlink_and_post_locked();
3004         }
3005       }
3006       // there's another callback for needs_rehashing
3007     }
3008   } else {
3009     assert(SafepointSynchronize::is_at_safepoint(), "must be");
3010     JvmtiEnvIterator it;
3011     for (JvmtiEnv* env = it.first(); env != NULL; env = it.next(env)) {
3012       JvmtiTagMap* tag_map = env->tag_map_acquire();
3013       if (tag_map != NULL && !tag_map->is_empty()) {
3014         if (num_dead_entries != 0) {
3015           tag_map->hashmap()->unlink_and_post(tag_map->env());
3016         }
3017         // Later GC code will relocate the oops, so defer rehashing until then.
3018         tag_map->_needs_rehashing = true;
3019       }
It feels like it can be refactored/simplified, at least, a little bit.
Is it possible to check and just return if (num_dead_entries == 0)?
If not, then, at least, it can be done the same way (except of locking).

Also, can we have just one (static?) lock for the whole gc_notification (not per JVMTI env/JvmtiTagMap)? How much do we win by locking per each env/JvmtiTagMap? Note, that in normal case there is just one agent. It is very rare to have multiple agents requesting object tagging and ObjectFree events. It seems, this can be refactored to more simple code with one function doing work in both contexts.

src/hotspot/share/utilities/hashtable.cpp:

Nit: Need space after the '{' :
  `+const int _small_table_sizes[] = {107, 1009, 2017, 4049, 5051, 10103, 20201, 40423 } ;`

src/hotspot/share/prims/jvmtiTagMapTable.cpp:

Nit: Extra space after assert:
  `119   assert (find(index, hash, obj) == NULL, "shouldn't already be present");`

Thanks,
Serguei

-------------

PR: https://git.openjdk.java.net/jdk/pull/967