RFR: 8212879: Make JVMTI TagMap table concurrent [v4]
Kim Barrett
kbarrett at openjdk.java.net
Wed Nov 4 10:08:00 UTC 2020
On Tue, 3 Nov 2020 21:40:39 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:
>> src/hotspot/share/prims/jvmtiTagMap.cpp line 127:
>>
>>> 125: // The table cleaning, posting and rehashing can race for
>>> 126: // concurrent GCs. So fix it here once we have a lock or are
>>> 127: // at a safepoint.
>>
>> I think this comment and the one below about locking are confused, at least about rehashing. I _think_ this is referring to the concurrent num-dead notification? I've already commented there about it being a problem to do the unlink etc. in the GC pause (see later comment). It also seems like a bad idea to do this here and block progress of a concurrent GC by holding the tagmap lock for a long time, which is another reason for the num-dead notification not to do very much (and not to require a lock that might be held here for a long time).
>
> The comment is trying to describe the situation like:
> 1. mark-end pause (WeakHandle.peek() returns NULL because object A is unmarked)
> 2. safepoint for heap walk
> 2a. Need to post the ObjectFree event for object A before the heap walk, because the heap walk will not find object A.
> 3. gc_notification - would have posted an ObjectFree event for object A if the heap walk hadn't intervened
>
> The check_hashmap() function also checks whether the hash table needs to be rehashed before the next operation that uses the hashtable.
>
> Both operations require the table to be locked.
>
> The unlink and post need to be in a GC pause for the reasons I stated above. The unlink and post were already done in a GC pause, so this isn't worse for any GC. For a concurrent GC the lock can be held while the entries are processed, which would be a delay for applications that have requested a lot of tags, but those applications asked for this and it's no worse than what we had when GC walked this table in safepoints.
For the GCs that call the num-dead notification in a pause it is much worse than what we had. As I pointed out elsewhere, tagmap processing used to be all-in-one: a single serial subtask taken by the first thread that reached it in WeakProcessor processing. Other threads would find that subtask taken and move on to processing other oopstorages in parallel with the tagmap processing. Now everything except the oopstorage-based clearing of dead entries is a single-threaded serial task done by the VMThread after all the parallel WeakProcessor work is done, because that's where the num-dead callbacks are invoked. WeakProcessor's parallel oopstorage processing doesn't have a way for the last thread out of each parallel oopstorage to run the num-dead callbacks. Instead they're left to the end, on the assumption that the callbacks are relatively cheap. But that could still be much worse than the old code, since the tagmap oopstorage could be late in the order of processing, and so still effectively be a serial subtask after all the parallel subtasks are done or mostly done.
-------------
PR: https://git.openjdk.java.net/jdk/pull/967
More information about the hotspot-dev mailing list