Cost of single-threaded nmethod hotness updates at each safepoint (in JDK 8)
Srinivas Ramakrishna
ysr1729 at gmail.com
Fri Jul 31 19:07:18 UTC 2015
Hi Vladimir --
Here's a snapshot of the counters:
sun.ci.codeCacheCapacity=251658240
sun.ci.codeCacheMaxCapacity=251658240
sun.ci.codeCacheMethodsReclaimedNum=3450
sun.ci.codeCacheSweepsTotalNum=58
sun.ci.codeCacheSweepsTotalTimeMillis=1111
sun.ci.codeCacheUsed=35888704
Notice that the code cache usage is less than 35 MB of the 240 MB
capacity, yet it seems we have already had 58 sweeps, and safepoint cleanup
says:
[mark nmethods, 0.0165062 secs]
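(For scale: that cache is ~85% free -- (240 - 35) / 240 -- nowhere near the
10% StartAggressiveSweepingAt default, and if those 58 sweeps account for the
1111 ms, that works out to ~19 ms per sweep, the same order as the ~16.5 ms
"mark nmethods" pause above.)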
Even if the two closures do little or no work, the single-threaded walk
over the deep stacks of a thousand threads will cost time for applications with
many threads, and as far as I can tell this is now done at every safepoint
irrespective of sweeper activity. It seems this work should be
rolled up (via a suitable injection) into the GC's thread walks, which
are done in parallel, rather than being done in a separate pre-GC phase (unless I am
missing some reason that the sequencing is necessary, which it doesn't seem
to be here).
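To illustrate the shape I have in mind, here is a minimal standalone sketch
(plain C++ with std::thread, not HotSpot code; FakeThreadStack, walk_stack
and the sizes are made-up stand-ins) of the atomic claim-loop that lets a
gang of workers split per-thread stack walks with no coordination beyond a
single counter:

#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

struct FakeThreadStack {
    int frames;  // stand-in for a Java thread's stack depth
};

static std::atomic<long> frames_walked{0};

// Stand-in for walking one thread's activations and updating nmethod
// hotness counters for each frame.
static void walk_stack(const FakeThreadStack& t) {
    frames_walked.fetch_add(t.frames, std::memory_order_relaxed);
}

int main() {
    // A few thousand "threads" with deep stacks, as in the service above.
    std::vector<FakeThreadStack> threads(4096, FakeThreadStack{512});

    unsigned nworkers = std::thread::hardware_concurrency();
    if (nworkers == 0) nworkers = 4;  // fallback if the count is unknown

    std::atomic<size_t> next{0};
    std::vector<std::thread> gang;
    for (unsigned w = 0; w < nworkers; w++) {
        gang.emplace_back([&] {
            // Claim-loop: each worker atomically grabs the next unwalked
            // stack, so the walks parallelize with no further coordination.
            for (size_t i = next.fetch_add(1); i < threads.size();
                 i = next.fetch_add(1)) {
                walk_stack(threads[i]);
            }
        });
    }
    for (std::thread& t : gang) t.join();

    std::printf("walked %zu stacks (%ld frames) with %u workers\n",
                threads.size(), frames_walked.load(), nworkers);
    return 0;
}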
-- ramki
On Fri, Jul 31, 2015 at 11:48 AM, Srinivas Ramakrishna <ysr1729 at gmail.com>
wrote:
> Hi Vladimir --
>
> I noticed the increase even with Initial and Reserved set to the default
> of 240 MB, but with actual usage much lower (less than a quarter of that).
>
> Look at this code path. Note that it is invoked at every safepoint
> (although the comment says "periodically").
> In the mark_active_nmethods() method, there's a thread iteration in both
> branches of the if. I haven't yet checked which of the two (if either)
> was the culprit here.
>
> // Various cleaning tasks that should be done periodically at safepoints
> void SafepointSynchronize::do_cleanup_tasks() {
>   ....
>   {
>     TraceTime t4("mark nmethods", TraceSafepointCleanupTime);
>     NMethodSweeper::mark_active_nmethods();
>   }
>   ..
> }
>
>
> void NMethodSweeper::mark_active_nmethods() {
>   ...
>   if (!sweep_in_progress()) {
>     _seen = 0;
>     _sweep_fractions_left = NmethodSweepFraction;
>     _current = CodeCache::first_nmethod();
>     _traversals += 1;
>     _total_time_this_sweep = Tickspan();
>
>     if (PrintMethodFlushing) {
>       tty->print_cr("### Sweep: stack traversal %d", _traversals);
>     }
>     Threads::nmethods_do(&mark_activation_closure);
>
>   } else {
>     // Only set hotness counter
>     Threads::nmethods_do(&set_hotness_closure);
>   }
>
>   OrderAccess::storestore();
> }
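>
> For reference, the two closures named above do very little per nmethod.
> Paraphrased from memory of JDK 8's sweeper.cpp (so treat as approximate):
> both reset the nmethod's hotness counter, and the marking one additionally
> flags not-entrant nmethods that still have activations on some stack:
>
>   class MarkActivationClosure : public CodeBlobClosure {
>    public:
>     virtual void do_code_blob(CodeBlob* cb) {
>       nmethod* nm = (nmethod*)cb;
>       nm->set_hotness_counter(NMethodSweeper::hotness_counter_reset_val());
>       // Keep a not-entrant nmethod from being flushed while it is
>       // still on some thread's stack.
>       if (nm->is_not_entrant()) {
>         nm->mark_as_seen_on_stack();
>       }
>     }
>   };
>
>   class SetHotnessClosure : public CodeBlobClosure {
>    public:
>     virtual void do_code_blob(CodeBlob* cb) {
>       ((nmethod*)cb)->set_hotness_counter(
>           NMethodSweeper::hotness_counter_reset_val());
>     }
>   };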
>
> On Fri, Jul 31, 2015 at 11:43 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>> Hi Ramki,
>>
>> Did you fill up the CodeCache? It starts scanning aggressively only when
>> the CodeCache is full:
>>
>>   // Force stack scanning if there is only 10% free space in the code cache.
>>   // We force stack scanning only if the non-profiled code heap gets full,
>>   // since critical allocations go to the non-profiled heap and we must
>>   // make sure that there is enough space.
>>   double free_percent =
>>       1 / CodeCache::reverse_free_ratio(CodeBlobType::MethodNonProfiled) * 100;
>>   if (free_percent <= StartAggressiveSweepingAt) {
>>     do_stack_scanning();
>>   }
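>>
>> (To make the formula concrete: reverse_free_ratio is roughly
>> max_capacity / unallocated_capacity for the given code heap, so
>> free_percent is simply the percentage of that heap still free. For
>> example, with 24 MB free of a 240 MB heap, free_percent =
>> 1 / (240 / 24) * 100 = 10, which is exactly the default
>> StartAggressiveSweepingAt threshold.)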
>>
>> Vladimir
>>
>> On 7/31/15 11:33 AM, Srinivas Ramakrishna wrote:
>>
>>>
>>> Yes.
>>>
>>>
>>> On Fri, Jul 31, 2015 at 11:31 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>>
>>> Ramki, are you running tiered compilation?
>>>
>>> sent from my phone
>>>
>>> On Jul 31, 2015 2:19 PM, "Srinivas Ramakrishna" <ysr1729 at gmail.com> wrote:
>>>
>>>
>>> Hello GC and Compiler teams!
>>>
>>> One of our services that runs with several thousand threads
>>> recently noticed an increase
>>> in safepoint stop times, but not GC times, upon transitioning to
>>> JDK 8.
>>>
>>> Further investigation revealed that most of the delta was
>>> related to the so-called
>>> pre-gc/vmop "cleanup" phase when various book-keeping activities
>>> are performed,
>>> and more specifically in the portion that walks Java thread
>>> stacks single-threaded (!)
>>> and updates the hotness counters for the active nmethods. This
>>> code appears to
>>> be new to JDK 8 (in JDK 7 one would walk the stacks only during
>>> code cache sweeps).
>>>
>>> I have two questions:
>>> (1) has anyone else (typically, I'd expect applications with
>>> many hundreds or thousands of threads)
>>> noticed this regression?
>>> (2) Can we do better, for example, by:
>>> (a) doing these updates by walking thread stacks in
>>> multiple worker threads in parallel, or best of all:
>>> (b) doing these updates when we walk the thread stacks
>>> during GC, and skipping this phase entirely
>>> for non-GC safepoints (with attendant loss in
>>> frequency of this update in low GC frequency
>>> scenarios).
>>>
>>> It seems kind of silly to do GCs with many parallel worker
>>> threads, but to do these thread stack
>>> walks single-threaded when they are embarrassingly parallel (one
>>> could predicate the parallelization
>>> on the measured stack sizes and thread population, if
>>> there were concern about the overhead of
>>> activating and deactivating the thread gangs for the work).
>>>
>>> A followup question: Any guesses as to how code cache
>>> sweep/eviction quality might be compromised if one
>>> were to dispense with these hotness updates entirely (or at a
>>> much reduced frequency), as a temporary
>>> workaround to the performance problem?
>>>
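>>> A hypothetical shape for such a workaround (the flag name
>>> NmethodHotnessUpdateInterval is invented here, and this is an
>>> untested sketch) would be to rate-limit the hotness-only walk in
>>> mark_active_nmethods() rather than remove it:
>>>
>>>   // Hypothetical sketch: refresh hotness counters only every Nth
>>>   // cleanup; safepoint cleanup runs on the VM thread, so a static
>>>   // counter is safe here.
>>>   static int cleanups_since_update = 0;
>>>   if (++cleanups_since_update >= NmethodHotnessUpdateInterval) {
>>>     cleanups_since_update = 0;
>>>     Threads::nmethods_do(&set_hotness_closure);
>>>   }
>>>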
>>> Thoughts/Comments? In particular, has this issue been addressed
>>> perhaps in newer JVMs?
>>>
>>> Thanks for any comments, feedback, pointers!
>>> -- ramki
>>>
>>> PS: for comparison, here's data with +TraceSafepointCleanup from
>>> JDK 7 (first, where this isn't done)
>>> vs JDK 8 (where this is done) with a program that has a few
>>> thousands of threads:
>>>
>>>
>>>
>>> JDK 7:
>>> ..
>>> 2827.308: [sweeping nmethods, 0.0000020 secs]
>>> 2828.679: [sweeping nmethods, 0.0000030 secs]
>>> 2829.984: [sweeping nmethods, 0.0000030 secs]
>>> 2830.956: [sweeping nmethods, 0.0000030 secs]
>>> ..
>>>
>>> JDK 8:
>>> ..
>>> 7368.634: [mark nmethods, 0.0177030 secs]
>>> 7369.587: [mark nmethods, 0.0178305 secs]
>>> 7370.479: [mark nmethods, 0.0180260 secs]
>>> 7371.503: [mark nmethods, 0.0186494 secs]
>>> ..
>>>
>>>
>>>
>