RFR: 8253064: monitor list simplifications and getting rid of TSM [v2]
David Holmes
david.holmes at oracle.com
Mon Nov 9 06:06:34 UTC 2020
Hi Dan,
On 9/11/2020 1:50 pm, Daniel D.Daugherty wrote:
> On Sun, 8 Nov 2020 21:43:00 GMT, David Holmes <dholmes at openjdk.org> wrote:
>
>>> How about this:
>>> static MonitorList _in_use_list;
>>> // The ratio of the current _in_use_list count to the ceiling is used
>>> // to determine if we are above MonitorUsedDeflationThreshold and need
>>> // to do an async monitor deflation cycle. The ceiling is increased by
>>> // AvgMonitorsPerThreadEstimate when a thread is added to the system
>>> // and is decreased by AvgMonitorsPerThreadEstimate when a thread is
>>> // removed from the system.
>>> // Note: If the _in_use_list max exceeds the ceiling, then
>>> // monitors_used_above_threshold() will use the in_use_list max instead
>>> // of the thread count derived ceiling because we have used more
>>> // ObjectMonitors than the estimated average.
>>> static jint _in_use_list_ceiling;
>>
>> Thanks for the comment. So instead of checking the threshold on each OM allocation we use this averaging technique to estimate the number of monitors in use? Can you explain how this came about rather than the simple/obvious check at allocation time? Thanks.
>
> I'm not sure I understand your question, but let me take a stab at it anyway...
>
> We used to compare the sum of the in-use counts from all the in-use lists
> with the total population of ObjectMonitors. If that ratio was higher than
> MonitorUsedDeflationThreshold, then we would do an async deflation cycle.
> Since we got rid of TSM, we no longer had a population of already allocated
> ObjectMonitors, we had a max value instead. However, when the VM's use
> of ObjectMonitors is first spinning up, the max value is typically very close
> to the in-use count so we would always be asking for an async-deflation
> during that spinning up phase.
>
> I created the idea of a ceiling value that is tied to thread count and the
> AvgMonitorsPerThreadEstimate to replace the population value that we
> used to have. By comparing the in-use count against the ceiling value, we
> no longer exceed the MonitorUsedDeflationThreshold when the VM's use
> of ObjectMonitors is first spinning up so we no longer do async deflations
> continuously during that phase. If the max value exceeds the ceiling value,
> then we're using a LOT of ObjectMonitors and, in that case, we compare
> the in-use count against the max to determine if we're exceeding the
> MonitorUsedDeflationThreshold.
>
> Does this help?
It helps but I'm still wrestling with what MonitorUsedDeflationThreshold
actually means now.
So the existing MonitorUsedDeflationThreshold is used as a measure of
the proportion of monitors actually in-use compared to the number of
monitors pre-allocated. If an inflation request requires a new block to
be allocated and we're above MonitorUsedDeflationThreshold % then a
request for async deflation occurs (when we actually check).
The new code, IIUC, says: let's assume we expect
AvgMonitorsPerThreadEstimate monitors-per-thread. If the number of
monitors in-use is > MonitorUsedDeflationThreshold % of
(AvgMonitorsPerThreadEstimate * number_of_threads), then we request
async deflation.
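
To make sure I have the arithmetic right, here is a minimal stand-alone
sketch of the check as I understand it from the description above. It is
not the code in the PR: the struct, the function name and the parameter
names are all made up for illustration, and the only parts taken from this
thread are the thread-count-derived ceiling, the fall back to the in-use
max when the max exceeds the ceiling, and the percentage comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of the counters discussed in this thread.
struct MonitorStats {
    std::uint64_t in_use;   // current count of in-use ObjectMonitors
    std::uint64_t max;      // highest in-use count seen so far
    std::uint64_t threads;  // number of threads currently in the system
};

// Sketch only: returns true when the in-use count is above
// threshold_percent of the effective ceiling.
inline bool above_threshold_sketch(const MonitorStats& s,
                                   std::uint64_t avg_per_thread_estimate,
                                   std::uint64_t threshold_percent) {
    assert(threshold_percent <= 100);
    // Ceiling derived from the thread count and the per-thread estimate.
    std::uint64_t ceiling = avg_per_thread_estimate * s.threads;
    // If more monitors have been used than the estimate allows for,
    // compare against the max instead.
    ceiling = std::max(ceiling, s.max);
    // Same as in_use / ceiling > threshold_percent / 100, but without
    // integer-division truncation.
    return s.in_use * 100 > threshold_percent * ceiling;
}
```

Taking 90 purely as an example threshold and the 1024 estimate, a single
thread with 100 monitors in use (and a max of 100) sits at roughly 10% of
the 1024 ceiling, so no deflation is requested. Comparing the same 100
against the max alone would give 100%, which is the spin-up problem Dan
described.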
So ... obviously we need some kind of watermark based system for
requesting deflation otherwise there will be far too many deflation
requests. And we also don't want to have to check for exceeding the
threshold on every monitor allocation. So the deflation thread will
wake up periodically and check if the threshold is exceeded.
Okay ... so then it comes down to deciding whether
AvgMonitorsPerThreadEstimate is the best way to establish the watermark
and what the default value should be. This doesn't seem like something
that an application developer could reasonably try to estimate so it is
just going to be a tuning knob they adjust somewhat arbitrarily. I
assume the 1024 default came from tuning something?
Have you looked at the effect on memory use these changes have (i.e. peak
RSS use)? Did your performance measurements look at using different
values? (I can imagine that with enough memory we can effectively
disable deflation and so potentially increase performance. OTOH maybe
deflation is so infrequent it is a non-issue.)
I have to confess that I never really thought about the old set of
heuristics for this, but the fact we're changing the heuristics does
raise a concern about what impact applications may see.
BTW MonitorUsedDeflationThreshold should really be diagnostic not
experimental, as real applications may need to tune it (and people often
don't want to use experimental flags in production as a matter of policy).
Thanks,
David
-----
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/642
>