Hotspot crash
Volker Simonis
volker.simonis at gmail.com
Tue Feb 19 01:35:44 PST 2013
done: http://hg.openjdk.java.net/ppc-aix-port/jdk7u/raw-file/tip/README-ppc.html#_TOC_ANCHOR_16_
+<h4>Memory requirements</h4>
+
+<p>
+Our VM is currently optimized for server class loads. This means that besides
+the usual Java heap settings, which are controlled through command line options,
+the user has to make sure the environment provides reasonable data segment and
+stack size limits. We recommend setting the stack size limit to 4MB
+(e.g. call '<code>ulimit -s 4000</code>') and the data segment limit to 1GB
+(e.g. call '<code>ulimit -d 1000000</code>'). Higher limits should be OK (the
+current limits can be inspected with '<code>ulimit -a</code>').
+</p>
+
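
For example, a small wrapper script could establish these limits before
kicking off a build (an illustrative sketch only - the actual build
target depends on your setup):

  #!/bin/sh
  # Raise the soft limits for this shell and all child processes.
  # The ulimit values are in KB, matching the README recommendation.
  ulimit -s 4000      # stack size limit: ~4MB
  ulimit -d 1000000   # data segment limit: ~1GB
  ulimit -a           # print the effective limits for verification
  make images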
On Mon, Feb 18, 2013 at 10:08 PM, Steve Poole <spoole at linux.vnet.ibm.com> wrote:
>
> hi Volker - ok that all makes sense :-)
>
> We should keep track of the need to set ulimit -d to something reasonable for Hotspot.
>
> I suppose we could add it to the README-PPC file? (Are there other caveats or other runtime settings that need documenting?)
>
>
> On 14 Feb 2013, at 16:41, Volker Simonis <volker.simonis at gmail.com> wrote:
>
>> Hi Steve,
>>
>> I finally found some time to look at the problem and I managed to
>> actually reproduce it (I just compiled the 730 .java files from the
>> jdk8/langtools build).
>>
>> I really think the problem is the very low limit for the data segment
>> you had. As you can see in the hs_err file:
>>
>> OS:uname:AIX dreyfus 1 7 0008FBB2D900
>> rlimit: STACK 4194304k, CORE 1048575k, NPROC 4096, NOFILE infinity, AS
>> infinity, DATA 131072k
>>
>> it was set to 131MB, which is just not enough for the C2 JIT. Actually
>> the JIT is not using 25GB as you were afraid of. If you look at the
>> core file which will be produced if you reproduce the problem (of
>> course you'll have to set the ulimit for cores to unlimited), you'll
>> notice that it is just about 1.5GB, so the entire JVM process only
>> used that much memory, which I think is not unusual.
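>>
>> A hypothetical session to illustrate (the core file name and
>> location may differ on your system):
>>
>>   ulimit -c unlimited   # allow core files of any size
>>   make images           # or whatever command reproduces the crash
>>   ls -l core*           # inspect the size of the resulting core file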
>>
>> Already increasing the maximal data segment to 150MB will fix the
>> problem, although I'd recommend setting it to at least 1GB if you
>> don't want to set it to unlimited. J9 probably ran with your
>> settings because either 130MB was "just enough" or because it
>> doesn't use malloc to dynamically allocate memory but rather shmget
>> or mmap, which are not accounted for by the 'ulimit -d' setting.
>>
>> So to cut a long story short - I don't think we have a problem in this
>> specific area :)
>>
>> Regards,
>> Volker
>>
>>
>> On Tue, Feb 12, 2013 at 11:05 AM, Steve Poole <spoole at linux.vnet.ibm.com> wrote:
>>> hi Volker.
>>>
>>> The command line can be constructed from the crash data so I did that last night and proved I could reproduce the problem repeatedly.
>>>
>>> Since there was absolutely nothing else running on the system I started from first principles and looked at system settings.
>>> I checked ulimit and saw that -d and -s were not unlimited, so I set them to unlimited and ran it again, and of course it now works.
>>>
>>> I don't normally bother making that sort of ulimit change on a build machine as I've never had the problem with J9 builds :-)
>>>
>>> So I'm still suspicious of the need to do this - not sure where I'm losing 25GB of memory to yet.
>>>
>>> I'm going to see if we can do some basic performance analysis and just make sure we understand what's going on.
>>>
>>>
>>>
>>> On 12 Feb 2013, at 08:32, Volker Simonis <volker.simonis at gmail.com> wrote:
>>>
>>>> Hi Steve,
>>>>
>>>> I didn't dare to say that it is not a HotSpot problem :)
>>>>
>>>> The high load and the fact that only 5GB of 32GB were free seemed
>>>> just strange if the machine is only used for building.
>>>>
>>>> What would be really helpful would be to see the exact invocation
>>>> (i.e. the command line) of the corresponding javac call. I think for
>>>> the new build system you'll have to set the environment variable
>>>> LOG=trace to get the most debug output of the build itself.
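>>>>
>>>> For example (an assumption on my side that you drive the build via
>>>> make as usual):
>>>>
>>>>   make LOG=trace images 2>&1 | tee build-trace.log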
>>>>
>>>>
>>>> On Mon, Feb 11, 2013 at 10:25 PM, Steve Poole <spoole at linux.vnet.ibm.com> wrote:
>>>>> hi Volker - the machine in question is completely dedicated to building OpenJDK and doesn't do anything else.
>>>>> Since it's a single threaded machine, any memory usage is coming (in theory) from the build process.
>>>>>
>>>>> Maybe we really are pushing it too hard :-) The same machine can build the classes using J9 though, so it's a little strange...
>>>>>
>>>>> If you're comfortable that it's not a HotSpot problem I'll do the diagnostics.
>>>>>
>>>>> I'll let you know what I find out.
>>>>>
>>>>>
>>>>>
>>>>> On 11 Feb 2013, at 18:41, Volker Simonis <volker.simonis at gmail.com> wrote:
>>>>>
>>>>>> Hi Steve,
>>>>>>
>>>>>> unfortunately I was quite busy today so I just looked at the crash.
>>>>>>
>>>>>> I agree that the warnings are annoying and we will switch them off
>>>>>> when not running with -XX:+Verbose.
>>>>>> But as far as I can see, they are basically harmless. They appear
>>>>>> because we've switched on compressed pointers by default on AIX and
>>>>>> what happens is that at startup the VM tries to find some "good"
>>>>>> addresses for the allocation of the heap. If this doesn't succeed (and
>>>>>> that's what the warnings basically say), that's no real problem. As I
>>>>>> said, it's annoying and we will switch them off. Until then you can
>>>>>> use the switch "-XX:-UseCompressedOops" to turn compressed pointers
>>>>>> off manually.
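>>>>>>
>>>>>> e.g. (an illustrative invocation - substitute your actual java
>>>>>> command line):
>>>>>>
>>>>>>   java -XX:-UseCompressedOops -version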
>>>>>>
>>>>>> Regarding the actual error I'm a little bit confused. In that special
>>>>>> code path the VM is actually only calling the bare OS "malloc" function,
>>>>>> which seems to be unable to allocate native memory. Does the VM fail
>>>>>> reproducibly with this error? I saw in the hs_err.log file that the
>>>>>> machine is under high load:
>>>>>>
>>>>>> load average:20.07 17.96 17.18
>>>>>>
>>>>>> and the swap memory is nearly exhausted:
>>>>>>
>>>>>>
>>>>>> physical total : 32212254720
>>>>>> physical free : 5296439296
>>>>>> swap total : 536870912
>>>>>> swap free : 483180544
>>>>>>
>>>>>> while the machine still has 5GB of free RAM. Maybe, at the point
>>>>>> where malloc was called, there were other processes running which used
>>>>>> additional memory?
>>>>>>
>>>>>> Would it be possible to carve out the javac command which leads to the
>>>>>> error (i.e. "Compiling 733 files for BUILD_FULL_JAVAC")?
>>>>>>
>>>>>> Otherwise I'll try to reproduce it myself.
>>>>>>
>>>>>> Regards,
>>>>>> Volker
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 10, 2013 at 9:04 PM, Steve Poole <spoole at linux.vnet.ibm.com> wrote:
>>>>>>> hi guys - I'm getting a crash (basically a native out-of-memory) when trying to build JDK8 on AIX with the latest and greatest JDK7u build.
>>>>>>>
>>>>>>> I built the jdk7u one myself according to the published instructions on Volker's cr page.
>>>>>>>
>>>>>>> The crash and log are available here (build log at http://cr.openjdk.java.net/~spoole/log.txt, crash at http://cr.openjdk.java.net/~spoole/hs_err_pid13369358.log).
>>>>>>>
>>>>>>> The machine has 30GB of memory and, what with the number of warnings showing up, I'm assuming it's a bug :-)
>>>>>>>
>>>>>>> Let me know if there is anything I can do to work around this (or if you need more info). I'm trying to use the autoconf process for JDK8 and this is my first major hurdle. Any help would be appreciated!
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Steve
>>>>>>
>>>>>
>>>>
>>>
>>
>