[10] RFR: 8178387: Reduce memory churn when creating java.lang.invoke entities

Claes Redestad claes.redestad at oracle.com
Mon Apr 10 18:00:39 UTC 2017



On 2017-04-10 19:41, Paul Sandoz wrote:
>> On 10 Apr 2017, at 10:31, Claes Redestad <claes.redestad at oracle.com> wrote:
>> Don't expect too much! These two add up to ~60k fewer bytecodes executed to bootstrap j.l.invoke
>> and run a trivial lambda, which is about 13% of the total and shows up as around (or somewhat less than) a
>> millisecond improvement on my machine (out of the ~19-20ms it now takes to bootstrap and execute the
>> first lambda).
>>
> Keep chipping away, every bit helps :-)
>
> Separately, I am curious how much GC activity occurs at startup before the main method is called. Would it be possible to overlay hotspot activity on the Java flame graph?

Depends on the GC and initial heap sizes, but in the small tests I'm
looking at, G1 has typically not even started a concurrent cycle, so the
choice of GC doesn't matter that much... (G1 has some minor issues shutting
down promptly - I'm seeing delays of up to 15ms for all threads to realize
they should stop what they're doing - which causes some headaches, though.)
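As a quick way to answer the "how much GC before main?" question empirically, one can read the GC MXBeans at the very top of main() in a fresh JVM. This is a minimal sketch of my own (the class name is made up, not something from the patch under review); collection counts accumulated before the first statement of main() are, by definition, startup GC activity:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe: print per-collector GC counts as early as possible
// in main(), before the application does any real work. Any non-zero
// count here happened during JVM/bootstrap startup.
public class GcAtStartup {
    public static void main(String[] args) {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() may return -1 if undefined for a collector
            long count = Math.max(0, gc.getCollectionCount());
            System.out.println(gc.getName() + ": " + count
                    + " collections, " + gc.getCollectionTime() + " ms");
            total += count;
        }
        System.out.println("Collections before main's first statement: " + total);
    }
}
```

On the small tests above, I'd expect that total to be zero with G1 and a default heap.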

... on the other hand I'm seeing a lot of noise on my 2-socket
workstation when not pinning to a single socket, which appears to come
from both JIT threads *and* the main java thread spending more time.
My best guess right now is some interaction between the interpreter and
JIT threads, possibly some contention/sharing effects when installing
compiled methods. I've started asking around for someone who can give
me a tour through that code so I can investigate deeper... :-)

W.r.t. overlaying native and Java flame graphs: it's quite easy to
generate native-only and Java-only startup graphs separately, but
surprisingly(?) hard to create something mixed that captures what happens
during the earliest bootstrap with any precision. There's some wonderful
work out there using agents to map JIT disassembly to perf output and
such, but I'm not sure that approach is feasible for seeing what's
happening during interpreter-heavy/warmup phases.
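For reference, the ~19-20ms first-lambda figure quoted above can be approximated with a crude timing sketch like the one below (this is an illustration, not the measurement methodology used for the numbers in this thread - it only captures the lazy j.l.invoke initialization triggered by the first lambda, not the full VM bootstrap, and must run in a fresh JVM where nothing has touched lambdas yet):

```java
import java.util.function.Supplier;

// Crude sketch: time defining and invoking the very first lambda in a
// fresh JVM. The first lambda call site triggers LambdaMetafactory and
// pulls in much of the java.lang.invoke machinery.
public class FirstLambdaCost {
    public static void main(String[] args) {
        long start = System.nanoTime();
        Supplier<String> s = () -> "hello"; // first lambda in this JVM
        String value = s.get();
        long elapsedNanos = System.nanoTime() - start;
        System.out.println("First lambda took "
                + (elapsedNanos / 1_000_000.0) + " ms (" + value + ")");
    }
}
```

Run standalone (java FirstLambdaCost) rather than under a harness, since any framework code executed first would pre-initialize j.l.invoke and hide the cost.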

Thanks!

/Claes



