RFR: 8212879: Make JVMTI TagMap table concurrent [v4]

Thu Nov 5 14:53:01 UTC 2020

On Wed, 4 Nov 2020 13:22:57 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

>> For the GCs that call the num_dead notification in a pause it is much worse than what we had. As I pointed out elsewhere, it used to be that tagmap processing was all-in-one, as a single serial subtask taken by the first thread that reached it in WeakProcessor processing. Other threads would find that subtask taken and move on to processing oopstores in parallel with the tagmap processing. Now everything except the oopstorage-based clearing of dead entries is a single threaded serial task done by the VMThread, after all the parallel WeakProcessor work is done, because that's where the num-dead callbacks are invoked. WeakProcessor's parallel oopstorage processing doesn't have a way to do the num-dead callbacks by the last thread out of each parallel oopstorage processing. Instead it's left to the end, on the assumption that the callbacks are relatively cheap. But that could still be much worse than the old code, since the tagmap oopstorage could be late in the order of processing, and so still effectively be a serial subtask after all the parallel subtasks are done or mostly done.
>
> Yes, you are right that the processing will be done serially and not by a parallel worker thread. We could spawn a new GC worker thread to process the posts, as you suggest, if we find a customer that has a complaint about the pause time of this processing.

So both before and now, this is a single-threaded task. The difference is that before, that single-threaded task could run in parallel with other tasks. If the table is small, you probably won't notice any difference, as a small table implies there is not much to do. If the table is large, you still probably won't notice any difference, as a large table implies it will dominate the pause with both the old and the new approach. Any difference at all is bounded at 2x processing time, as the task was serial both before and after. Only with a perfectly medium-sized table, where the tag map work takes about as long as the rest of the parallel work, can we at the very worst observe a theoretical 2x worse processing time for this JVMTI table. I think that if we truly cared about this difference, and it was important to keep this code as fast as possible, then we would not have a serial phase for this at all. The fact that this has been serial suggests to me that it is not a critical path, and therefore I don't think optimizing the theoretical max 2x worse processing times for perfectly medium-sized JVMTI tag map tables is worth the hassle. At least I can't see why this would be of any importance.
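
To make the 2x bound concrete, here is a rough back-of-the-envelope model, not a measurement of HotSpot. It assumes that in the old scheme the serial tag map task overlapped fully with the other parallel weak-processing work, so the phase took about max(parallel, tagmap), and that in the new scheme the tag map work runs after the parallel work, so the phase takes parallel + tagmap. The function names and the millisecond figures are made up for illustration.

```python
# Idealized model of the weak-processing phase; all numbers are hypothetical.

def old_pause(parallel_ms, tagmap_ms):
    # Old scheme (assumed): the serial tag map subtask overlaps with the
    # parallel work, so the longer of the two determines the phase time.
    return max(parallel_ms, tagmap_ms)

def new_pause(parallel_ms, tagmap_ms):
    # New scheme (assumed): the tag map work runs on the VMThread after
    # the parallel work has finished, so the two times add up.
    return parallel_ms + tagmap_ms

def slowdown(parallel_ms, tagmap_ms):
    # Ratio of new to old phase time; (a + b) / max(a, b) never exceeds 2.
    return new_pause(parallel_ms, tagmap_ms) / old_pause(parallel_ms, tagmap_ms)

if __name__ == "__main__":
    parallel_ms = 10.0
    for tagmap_ms in (0.1, 10.0, 1000.0):  # small, "medium", large table
        print(f"tagmap={tagmap_ms:7.1f} ms  slowdown={slowdown(parallel_ms, tagmap_ms):.2f}x")
```

Under these assumptions the ratio only approaches 2 when the tag map work and the remaining parallel work take about the same time; for a much smaller or much larger table it stays close to 1.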

-------------

PR: https://git.openjdk.java.net/jdk/pull/967