GC overhead limit exceeded

Hannes Wallnoefer hannes.wallnoefer at oracle.com
Thu Jan 9 09:38:35 PST 2014


Tal,

I've been throwing requests at the Prudence test app for the last 20 
minutes or so. I do see that it uses a lot of metaspace, close to 50M in 
my case. The test app seems to load/unload 2 classes per request with 
Rhino compared to 4 classes per request with Nashorn, which is probably 
due to differences in bytecode generation between the two engines.
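
(One way to observe those per-request counts is a class-loading trace, 
e.g. starting the JVM with something like

    java -verbose:class -XX:+TraceClassUnloading ...

where "..." stands for the application's usual arguments, and watching 
the output while issuing single requests.)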

I don't yet see metaspace usage growing beyond that level, or any GC 
warnings being generated. Maybe I haven't been running it long enough.
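
One way to watch metaspace while the load test runs is jstat; in JDK 8 
the MC and MU columns of "jstat -gc" show metaspace capacity and usage 
(<pid> stands for the Prudence JVM's process id, 1000 for the sampling 
interval in milliseconds):

    jstat -gc <pid> 1000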

I'm wondering if maybe metaspace is tight from the very beginning, and 
the GC problems are caused by spikes in load (e.g. concurrent requests)?

Also, are you aware of new classes being generated for each request? Are 
you evaluating script files for each request? It would be more efficient 
to evaluate the script just once and then reuse it for subsequent requests.
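
For example, here is a minimal sketch using the standard javax.script 
API (the engine name and script text are placeholders): compiling once 
and reusing the CompiledScript avoids re-evaluating the source on every 
request.

    import javax.script.*;

    public class CompileOnce {
        public static void main(String[] args) throws ScriptException {
            ScriptEngine engine =
                new ScriptEngineManager().getEngineByName("nashorn");
            // Compile the script source a single time, at startup.
            CompiledScript compiled = ((Compilable) engine)
                .compile("function handle(n) { return n * 2; } handle(21);");
            // Reuse the compiled form for every request instead of
            // calling engine.eval(source) each time.
            for (int request = 0; request < 3; request++) {
                System.out.println(compiled.eval());  // prints 42
            }
        }
    }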

Hannes

On 2014-01-09 17:21, Tal Liron wrote:
> You may download the latest release of Prudence, run it and bombard it 
> with hits (use ab or a similar tool):
>
> http://threecrickets.com/prudence/download/
>
> To get the GC logs, start it like so:
>
> JVM_SWITCHES="-Xloggc:/full/path/to/logs/gc.log \
>     -XX:+PrintGCDetails \
>     -XX:+PrintTenuringDistribution" \
>     sincerity start prudence
>
> To bombard it:
>
> ab -n 50000 -c 10 "http://localhost:8080/prudence-example/"
>
> Of course, you may also want to restrict the JVM heap size so the 
> failure happens sooner. At least I think so: I actually don't understand 
> JVM 8 GC at all, but you guys do, so have a go. All I can tell you is 
> that I have a server running live on the Internet, which I have to 
> restart every 3 days due to this issue.
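>
> For instance, something like this (untested; the limits are 
> illustrative, and -XX:MaxMetaspaceSize is the standard HotSpot flag for 
> capping metaspace) should make the failure show up sooner, reusing the 
> JVM_SWITCHES pattern above:
>
> JVM_SWITCHES="-Xmx256m -XX:MaxMetaspaceSize=64m" \
>     sincerity start prudence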
>
> Unfortunately, I don't have an easy way to isolate the problem to 
> something smaller. However, I would think there's probably an 
> advantage in using something as big as possible -- you can probably 
> get very rich dumps of what is polluting the heap.
>
>
> On 01/10/2014 12:00 AM, Marcus Lagergren wrote:
>> Tal - The GC people 10 meters behind me want to know if you have a 
>> repro of your full-GC-to-death problem that they can look at? They’re 
>> interested.
>>
>> /M
>>
>> On 09 Jan 2014, at 16:29, Kirk Pepperdine <kirk at kodewerk.com> wrote:
>>
>>> Hi Marcus,
>>>
>>> Looks like some of the details have been chopped off. Is there a GC 
>>> log available? If there is a problem with MethodHandle, a workaround 
>>> might be as simple as expanding perm... but wait, this is metaspace 
>>> now, and it should grow as long as your system has memory to give to 
>>> the process. The only thing I can suggest is that the space holding 
>>> compressed class pointers is a fixed size, so if Nashorn is loading a 
>>> lot of classes you might consider making that space larger. Full 
>>> disclosure: this isn’t something I’ve had a chance to dabble with, 
>>> but I think there is a flag to control the size of that space. Maybe 
>>> Colleen can offer better insight.
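>>>
>>> (If it helps, the flag should be -XX:CompressedClassSpaceSize, e.g.
>>>
>>>     java -XX:CompressedClassSpaceSize=2g ...
>>>
>>> to double the 1 GB default -- though I haven't verified this against
>>> the Prudence setup.)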
>>>
>>> Regards,
>>> Kirk
>>>
>>> On Jan 9, 2014, at 10:02 AM, Marcus Lagergren 
>>> <marcus.lagergren at oracle.com> wrote:
>>>
>>>> This almost certainly stems from MethodHandle combinators being 
>>>> implemented as lambda forms compiled to anonymous Java classes. One 
>>>> of the things being done for 8u20 is to drastically reduce the 
>>>> number of lambda forms created. I don’t know of any workaround at 
>>>> the moment. CC:ing hotspot-compiler-dev, so the people there can 
>>>> elaborate a bit.
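>>>>
>>>> A small, untested illustration: running a class like the one below
>>>> with -verbose:class shows anonymous java.lang.invoke.LambdaForm$...
>>>> classes being loaded as the combinators are built (MHDemo is just a
>>>> placeholder name):
>>>>
>>>>     import java.lang.invoke.MethodHandle;
>>>>     import java.lang.invoke.MethodHandles;
>>>>     import java.lang.invoke.MethodType;
>>>>
>>>>     public class MHDemo {
>>>>         public static void main(String[] args) throws Throwable {
>>>>             MethodHandle concat = MethodHandles.lookup().findVirtual(
>>>>                 String.class, "concat",
>>>>                 MethodType.methodType(String.class, String.class));
>>>>             // Binding an argument builds a combinator, backed by
>>>>             // generated LambdaForm classes under the hood.
>>>>             MethodHandle bound =
>>>>                 MethodHandles.insertArguments(concat, 1, "!");
>>>>             System.out.println((String) bound.invokeExact("hello"));
>>>>         }
>>>>     }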
>>>>
>>>> /M
>>>>
>>>> On 06 Jan 2014, at 06:57, Benjamin Sieffert 
>>>> <benjamin.sieffert at metrigo.de> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> we have been observing similar symptoms from 7u40 onwards (using
>>>>> nashorn-backport with Java 7 -- Java 8 has the same problems as
>>>>> 7u40 and 7u45; 7u25 is the last version that works fine) and
>>>>> suspect the cause to be the JSR-292 changes that took place there.
>>>>> IIRC I already asked over on the mlvm-dev mailing list. Here's the
>>>>> link:
>>>>> http://mail.openjdk.java.net/pipermail/mlvm-dev/2013-December/005586.html 
>>>>>
>>>>> The fault might just as well lie with Nashorn, though. It's
>>>>> certainly worth investigating.
>>>>>
>>>>> Regards
>>>>>
>>>>>
>>>>> 2014/1/4 Tal Liron <tal.liron at threecrickets.com>
>>>>>
>>>>>> Thanks! I didn't know of these. I'm not sure how to read the log,
>>>>>> but this doesn't look so good. I get a lot of "allocation
>>>>>> failures" that look like this:
>>>>>>
>>>>>> Java HotSpot(TM) 64-Bit Server VM (25.0-b63) for linux-amd64 JRE
>>>>>> (1.8.0-ea-b121), built on Dec 19 2013 17:29:18 by "java_re" with 
>>>>>> gcc 4.3.0
>>>>>> 20080428 (Red Hat 4.3.0-8)
>>>>>> Memory: 4k page, physical 2039276k(849688k free), swap 
>>>>>> 262140k(256280k
>>>>>> free)
>>>>>> CommandLine flags: -XX:InitialHeapSize=32628416 
>>>>>> -XX:MaxHeapSize=522054656
>>>>>> -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
>>>>>> -XX:+PrintTenuringDistribution -XX:+UseCompressedClassPointers
>>>>>> -XX:+UseCompressedOops -XX:+UseParallelGC
>>>>>> 0.108: [GC (Allocation Failure)
>>>>>> Desired survivor size 524288 bytes, new threshold 7 (max 15)
>>>>>> [PSYoungGen: 512K->496K(1024K)] 512K->496K(32256K), 0.0013194 secs]
>>>>>> [Times: user=0.01 sys=0.00, real=0.00 secs]
>>>>>>
>>>>>>
>>>>>> On 01/04/2014 10:02 PM, Ben Evans wrote:
>>>>>>
>>>>>>> -Xloggc:<pathtofile> -XX:+PrintGCDetails 
>>>>>>> -XX:+PrintTenuringDistribution
>>>>>>>
>>>>>>
>>>>>
>>>>> -- 
>>>>> Benjamin Sieffert
>>>>> metrigo GmbH
>>>>> Sternstr. 106
>>>>> 20357 Hamburg
>>>>>
>>>>> Managing directors: Christian Müller, Tobias Schlottke, Philipp 
>>>>> Westermeyer
>>>>> The company is registered with the commercial register of Hamburg,
>>>>> No. HRB 120447.
>


