Cost of single-threaded nmethod hotness updates at each safepoint (in JDK 8)
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Jul 31 18:43:12 UTC 2015
Hi Ramki,
Did you fill up the CodeCache? It starts scanning aggressively only when the
CodeCache is full:
  // Force stack scanning if there is only 10% free space in the code cache.
  // We force stack scanning only if the non-profiled code heap gets full, since
  // critical allocations go to the non-profiled heap and we must make sure that
  // there is enough space.
  double free_percent = 1 / CodeCache::reverse_free_ratio(CodeBlobType::MethodNonProfiled) * 100;
  if (free_percent <= StartAggressiveSweepingAt) {
    do_stack_scanning();
  }
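
For a rough feel for the trigger, here is a standalone sketch of the
arithmetic (not HotSpot source; the capacity numbers are made up, and
reverse_free_ratio is approximated as max_capacity / free_space; the
default of StartAggressiveSweepingAt is 10):

  #include <cstdio>

  int main() {
    const double start_aggressive_sweeping_at = 10.0;  // default flag value
    const double max_capacity_kb = 128 * 1024;          // hypothetical 128 MB non-profiled heap
    const double free_kb         = 10 * 1024;           // hypothetical 10 MB still free
    // reverse_free_ratio is roughly max_capacity / free_space, so its
    // reciprocal times 100 gives the free-space percentage.
    const double reverse_free_ratio = max_capacity_kb / free_kb;
    const double free_percent = 1 / reverse_free_ratio * 100;   // ~7.8%
    if (free_percent <= start_aggressive_sweeping_at) {
      std::printf("%.1f%% free -> stack scanning is forced\n", free_percent);
    }
    return 0;
  }
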
Vladimir
On 7/31/15 11:33 AM, Srinivas Ramakrishna wrote:
>
> Yes.
>
>
> On Fri, Jul 31, 2015 at 11:31 AM, Vitaly Davidovich <vitalyd at gmail.com
> <mailto:vitalyd at gmail.com>> wrote:
>
> Ramki, are you running tiered compilation?
>
> sent from my phone
>
> On Jul 31, 2015 2:19 PM, "Srinivas Ramakrishna" <ysr1729 at gmail.com
> <mailto:ysr1729 at gmail.com>> wrote:
>
>
> Hello GC and Compiler teams!
>
>     One of our services that runs with several thousand threads
>     recently noticed an increase in safepoint stop times, but not
>     GC times, upon transitioning to JDK 8.
>
>     Further investigation revealed that most of the delta was
>     related to the so-called pre-GC/vmop "cleanup" phase when
>     various book-keeping activities are performed, and more
>     specifically in the portion that walks Java thread stacks
>     single-threaded (!) and updates the hotness counters for the
>     active nmethods. This code appears to be new to JDK 8 (in
>     JDK 7 one would walk the stacks only during code cache sweeps).
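>
>     For concreteness, here is a rough sketch of the kind of work I
>     mean, with simplified illustrative types rather than the real
>     HotSpot classes (Nmethod, Frame, JavaThread below are stand-ins):
>
>       #include <vector>
>
>       // One thread walks every Java thread's stack and resets the hotness
>       // counter of each compiled method (nmethod) found in an active frame.
>       struct Nmethod    { int hotness; };
>       struct Frame      { Nmethod* nm; };              // simplified activation record
>       struct JavaThread { std::vector<Frame> stack; };
>
>       void mark_active_nmethods(std::vector<JavaThread*>& threads, int reset_value) {
>         for (JavaThread* t : threads) {                // one thread does all of this work
>           for (Frame& f : t->stack) {                  // walk every frame of every stack
>             if (f.nm != nullptr) {
>               f.nm->hotness = reset_value;             // the "mark nmethods" cost at the safepoint
>             }
>           }
>         }
>       }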
>
> I have two questions:
> (1) has anyone else (typically, I'd expect applications with
> many hundreds or thousands of threads)
> noticed this regression?
> (2) Can we do better, for example, by:
> (a) doing these updates by walking thread stacks in
> multiple worker threads in parallel, or best of all:
> (b) doing these updates when we walk the thread stacks
> during GC, and skipping this phase entirely
> for non-GC safepoints (with attendant loss in
> frequency of this update in low GC frequency
> scenarios).
>
>     It seems kind of silly to do GCs with many worker threads, but
>     to do these thread stack walks single-threaded when the work is
>     embarrassingly parallel (one could predicate the parallelization
>     on the measured stack sizes and thread population, if there were
>     concern about the overhead of activating and deactivating the
>     thread gangs for the work).
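>
>     Purely as an illustration of (a) with such a predicate (the
>     cutoff and worker count below are made-up numbers, and the types
>     and mark_active_nmethods reuse the sketch above):
>
>       #include <thread>
>       #include <vector>
>
>       // Split the stack walks across a few workers, but only when the thread
>       // population is large enough to pay for starting and joining the workers.
>       void mark_active_nmethods_parallel(std::vector<JavaThread*>& threads,
>                                          int reset_value,
>                                          size_t min_threads_for_parallel = 512, // hypothetical cutoff
>                                          unsigned num_workers = 8) {            // hypothetical gang size
>         if (threads.size() < min_threads_for_parallel) {
>           mark_active_nmethods(threads, reset_value);  // small populations: stay single-threaded
>           return;
>         }
>         std::vector<std::thread> workers;
>         for (unsigned w = 0; w < num_workers; w++) {
>           workers.emplace_back([&threads, reset_value, num_workers, w]() {
>             // strided partition: worker w handles threads w, w+N, w+2N, ...
>             for (size_t i = w; i < threads.size(); i += num_workers) {
>               for (Frame& f : threads[i]->stack) {
>                 if (f.nm != nullptr) f.nm->hotness = reset_value;
>               }
>             }
>           });
>         }
>         for (std::thread& t : workers) t.join();       // all walks done before the phase ends
>       }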
>
>     A follow-up question: any guesses as to how code cache
>     sweep/eviction quality might be compromised if one were to
>     dispense with these hotness updates entirely (or do them at a
>     much reduced frequency), as a temporary workaround for the
>     performance problem?
>
> Thoughts/Comments? In particular, has this issue been addressed
> perhaps in newer JVMs?
>
> Thanks for any comments, feedback, pointers!
> -- ramki
>
>     PS: for comparison, here's data with +TraceSafepointCleanup from
>     JDK 7 (first, where this isn't done) vs JDK 8 (where this is
>     done) with a program that has a few thousand threads:
>
>
>
> JDK 7:
> ..
> 2827.308: [sweeping nmethods, 0.0000020 secs]
> 2828.679: [sweeping nmethods, 0.0000030 secs]
> 2829.984: [sweeping nmethods, 0.0000030 secs]
> 2830.956: [sweeping nmethods, 0.0000030 secs]
> ..
>
> JDK 8:
> ..
> 7368.634: [mark nmethods, 0.0177030 secs]
> 7369.587: [mark nmethods, 0.0178305 secs]
> 7370.479: [mark nmethods, 0.0180260 secs]
> 7371.503: [mark nmethods, 0.0186494 secs]
> ..
>
>