GC overhead limit exceeded

Rick Bullotta rick.bullotta at thingworx.com
Thu Jan 9 11:05:34 PST 2014


In Nashorn-land, can you "pre-compile" scripts and re-use the compiled entities in the same way you can with Rhino?  And if so, are you re-using those instances between executions of your cron-like functions in Prudence?
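
A minimal sketch of the pre-compilation pattern in question, assuming the standard javax.script API -- Nashorn's engine, like Rhino's, implements Compilable, so a CompiledScript can be kept and re-evaluated across runs (the class name and script text below are only illustrative):

import javax.script.Bindings;
import javax.script.Compilable;
import javax.script.CompiledScript;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class PrecompileExample {
    public static void main(String[] args) throws Exception {
        // Compile the script source once...
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        CompiledScript compiled =
            ((Compilable) engine).compile("java.lang.System.currentTimeMillis()");

        // ...and re-use the CompiledScript for each run (e.g. per request or
        // per cron tick) instead of re-evaluating the source text every time.
        for (int i = 0; i < 3; i++) {
            Bindings bindings = engine.createBindings();
            System.out.println(compiled.eval(bindings));
        }
    }
}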

-----Original Message-----
From: nashorn-dev-bounces at openjdk.java.net [mailto:nashorn-dev-bounces at openjdk.java.net] On Behalf Of Tal Liron
Sent: Thursday, January 09, 2014 2:01 PM
To: nashorn-dev at openjdk.java.net
Subject: Re: GC overhead limit exceeded

Indeed, scripts are reused in this case, though I can't guarantee that there isn't a bug somewhere on my end.

I'm wondering if it might be triggered by another issue: Prudence supports an internal crontab-like feature (based on cron4j), and these are again Nashorn scripts, run once a minute. You can see them in the log of the example application you downloaded. Then again, the exact same feature is used with Rhino.

Another way I may be able to help: when the errors occur again on my server, I will be happy to provide you with SSH access so you can snoop around. We can also run VisualVM via an SSH tunnel; it should be able to show us exactly which classes are not being GCed. If you think this would be helpful, please email me directly and we can set this up.
However, in my attempts to debug this locally, the heap seems to be behaving well enough.
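
For reference, a minimal in-process sketch for watching class counts and metaspace growth, using only the standard java.lang.management API (the ClassLoadMonitor name and the one-minute interval are illustrative; it would need to run inside the Prudence JVM to be meaningful):

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class ClassLoadMonitor {
    public static void main(String[] args) throws InterruptedException {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        while (true) {
            StringBuilder line = new StringBuilder();
            // A steadily growing loaded count with few unloads would point at
            // generated classes not being collected.
            line.append("loaded=").append(classes.getLoadedClassCount());
            line.append(" unloaded=").append(classes.getUnloadedClassCount());
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                // On HotSpot JDK 8 the class metadata pools are typically
                // named "Metaspace" and "Compressed Class Space".
                if (pool.getName().contains("Metaspace")
                        || pool.getName().contains("Class Space")) {
                    line.append(' ').append(pool.getName()).append('=')
                        .append(pool.getUsage().getUsed() / (1024 * 1024)).append('M');
                }
            }
            System.out.println(line);
            Thread.sleep(60000);
        }
    }
}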

On 01/10/2014 01:38 AM, Hannes Wallnoefer wrote:
> Tal,
>
> I've been throwing requests at the Prudence test app for the last 20 
> minutes or so. I do see that it uses a lot of metaspace, close to 50M 
> in my case. The test app seems to load/unload 2 classes per request 
> with Rhino compared to 4 classes per request with Nashorn, which is 
> probably due to differences in bytecode generation between the two 
> engines.
>
> I don't yet see metaspace usage growing beyond that limit, or 
> generating GC warnings. Maybe I haven't been running it long enough.
>
> I'm wondering if maybe metaspace is tight from the very beginning, and 
> the GC problems are caused by spikes in load (e.g. concurrent requests)?
>
> Also, are you aware of new classes being generated for each request? 
> Are you evaluating script files for each request? It would be more 
> efficient to evaluate the script just once and then reuse it for 
> subsequent requests.
>
> Hannes
>
> On 2014-01-09 17:21, Tal Liron wrote:
>> You may download the latest release of Prudence, run it and bombard 
>> it with hits (use ab or a similar tool):
>>
>> http://threecrickets.com/prudence/download/
>>
>> To get the GC logs, start it like so:
>>
>> JVM_SWITCHES="\
>>     -Xloggc:/full/path/to/logs/gc.log \
>>     -XX:+PrintGCDetails \
>>     -XX:+PrintTenuringDistribution" \
>>     sincerity start prudence
>>
>> To bombard it:
>>
>> ab -n 50000 -c 10 "http://localhost:8080/prudence-example/"
>>
>> Of course, you may also want to restrict the JVM heap size so it will 
>> happen sooner. I think. I actually don't understand JVM 8 GC at all, 
>> but you guys do, so have a go. All I can tell you is that I have a 
>> server running live on the Internet, which I have to restart every 3 
>> days due to this issue.
>>
>> Unfortunately, I don't have an easy way to isolate the problem to 
>> something smaller. However, I would think there's probably an 
>> advantage in using something as big as possible -- you can probably 
>> get very rich dumps of what is polluting the heap.
>>
>>
>> On 01/10/2014 12:00 AM, Marcus Lagergren wrote:
>>> Tal - The GC people 10 meters behind me want to know if you have a 
>>> repro of your full GC to death problem that they can look at?
>>> They're interested.
>>>
>>> /M
>>>
>>> On 09 Jan 2014, at 16:29, Kirk Pepperdine <kirk at kodewerk.com> wrote:
>>>
>>>> Hi Marcus,
>>>>
>>>> Looks like some of the details have been chopped off. Is there a GC 
>>>> log available? If there is a problem with MethodHandle, a workaround 
>>>> might be as simple as expanding perm... but wait, this is metaspace 
>>>> now, and it should grow as long as your system has memory to give to 
>>>> the process. The only thing I can suggest is that the space holding 
>>>> compressed class pointers is a fixed size, and that if Nashorn is 
>>>> loading a lot of classes you might consider making that space larger. 
>>>> Full disclosure, this isn't something I've had a chance to dabble 
>>>> with, but I think there is a flag to control the size of that space. 
>>>> Maybe Colleen can offer better insight.
>>>>
>>>> Regards,
>>>> Kirk
>>>>
>>>> On Jan 9, 2014, at 10:02 AM, Marcus Lagergren 
>>>> <marcus.lagergren at oracle.com> wrote:
>>>>
>>>>> This almost certainly stems from MethodHandle combinators being 
>>>>> implemented as lambda forms in anonymous Java classes. One of the 
>>>>> things being done for 8u20 is to drastically reduce the number of 
>>>>> lambda forms created. I don't know of any workaround at the moment. 
>>>>> CC:ing hotspot-compiler-dev, so the people there can elaborate a bit.
>>>>>
>>>>> /M
>>>>>
>>>>> On 06 Jan 2014, at 06:57, Benjamin Sieffert 
>>>>> <benjamin.sieffert at metrigo.de> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> we have been observing similar symptoms from 7u40 onwards (using 
>>>>>> nashorn-backport with j7 -- j8 has the same problems as 7u40 and 
>>>>>> 7u45...
>>>>>> 7u25 is the last version that works fine) and suspect the cause 
>>>>>> to be the
>>>>>> JSR-292 changes that took place there. IIRC I already asked over 
>>>>>> on their mailing list. Here's the link:
>>>>>> http://mail.openjdk.java.net/pipermail/mlvm-dev/2013-December/005586.html
>>>>>>
>>>>>> The fault might just as well lie with Nashorn, though. It's 
>>>>>> certainly worth investigating.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>> 2014/1/4 Tal Liron <tal.liron at threecrickets.com>
>>>>>>
>>>>>>> Thanks! I didn't know of these. I'm not sure how to read the 
>>>>>>> log, but this doesn't look so good. I get a lot of "allocation 
>>>>>>> failures" that look like
>>>>>>> this:
>>>>>>>
>>>>>>> Java HotSpot(TM) 64-Bit Server VM (25.0-b63) for linux-amd64 JRE (1.8.0-ea-b121), built on Dec 19 2013 17:29:18 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
>>>>>>> Memory: 4k page, physical 2039276k(849688k free), swap 262140k(256280k free)
>>>>>>> CommandLine flags: -XX:InitialHeapSize=32628416 -XX:MaxHeapSize=522054656 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
>>>>>>> 0.108: [GC (Allocation Failure)
>>>>>>> Desired survivor size 524288 bytes, new threshold 7 (max 15)
>>>>>>> [PSYoungGen: 512K->496K(1024K)] 512K->496K(32256K), 0.0013194 secs]
>>>>>>> [Times: user=0.01 sys=0.00, real=0.00 secs]
>>>>>>>
>>>>>>>
>>>>>>> On 01/04/2014 10:02 PM, Ben Evans wrote:
>>>>>>>
>>>>>>>> -Xloggc:<pathtofile> -XX:+PrintGCDetails 
>>>>>>>> -XX:+PrintTenuringDistribution
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Benjamin Sieffert
>>>>>> metrigo GmbH
>>>>>> Sternstr. 106
>>>>>> 20357 Hamburg
>>>>>>
>>>>>> Managing directors: Christian Müller, Tobias Schlottke, Philipp 
>>>>>> Westermeyer. The company is registered with the Registry Court of 
>>>>>> Hamburg, no. HRB 120447.
>>
>
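
A side note on the flag Kirk alludes to above: on JDK 8, I believe the compressed class pointer space can be sized with -XX:CompressedClassSpaceSize, and overall metaspace can be capped with -XX:MaxMetaspaceSize. Following the start command quoted above (the 256m/512m values are only illustrative):

JVM_SWITCHES="\
    -XX:CompressedClassSpaceSize=256m \
    -XX:MaxMetaspaceSize=512m" \
    sincerity start prudence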


