code review (round 1) for memory commit failure fix (8013057)

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue May 28 14:03:16 PDT 2013


On 5/28/13 2:47 PM, Stefan Karlsson wrote:
> On 5/28/13 10:08 PM, Daniel D. Daugherty wrote:
>> Stefan,
>>
>> Thanks for getting back to me! Replies embedded below...
>>
>>
>> On 5/28/13 1:02 PM, Stefan Karlsson wrote:
>>> On 5/28/13 5:55 PM, Daniel D. Daugherty wrote:
>>>> Stefan,
>>>>
>>>> Thanks for the re-review! Replies embedded below.
>>>>
>>>>
>>>> On 5/28/13 2:56 AM, Stefan Karlsson wrote:
>>>>> On 05/24/2013 08:23 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have a revised version of the proposed fix for the following bug:
>>>>>>
>>>>>>     8013057 assert(_needs_gc || 
>>>>>> SafepointSynchronize::is_at_safepoint())
>>>>>>             failed: only read at safepoint
>>>>>>
>>>>>> Here are the (round 1) webrev URLs:
>>>>>>
>>>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8013057-webrev/1-hsx25/
>>>>>> Internal: 
>>>>>> http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/1-hsx25/
>>>>>
>>>>> Your patch exits the VM if we fail to get memory in 
>>>>> os::commit_memory. There are a couple of places where we already 
>>>>> have error messages, what should we do about them? For example:
>>>>>   if (!os::commit_memory((char*)guard_page, _page_size, 
>>>>> _page_size)) {
>>>>>     // Do better than this for Merlin
>>>>>     vm_exit_out_of_memory(_page_size, OOM_MMAP_ERROR, "card table 
>>>>> last card");
>>>>>   }
>>>>
>>>> Actually, my patch only exits the VM on Linux and Solaris only for
>>>> certain mmap() error code values so we shouldn't do anything with
>>>> existing code that exits on os::commit_memory() failures.
>>>>
>>>> For the specific example above, it is in platform independent code in
>>>> src/share/vm/memory/cardTableModRefBS.cpp so we definitely don't want
>>>> to remove that check. That would be bad for non-Linux and non-Solaris
>>>> platforms.
>>>
>>> I talked to the GC team today, and we don't want to loose the hints 
>>> that tell what the memory is allocated for. The code you're 
>>> suggesting will report the following, for the example above, on 
>>> Linux/Solaris:
>>>
>>> +      warning("INFO: os::commit_memory(" PTR_FORMAT ", " SIZE_FORMAT
>>> +              ", " SIZE_FORMAT ", %d) failed; errno=%d", addr, size,
>>> +              alignment_hint, exec, err);
>>> +      vm_exit_out_of_memory(size, OOM_MMAP_ERROR,
>>> +                            "committing reserved memory.");
>>>
>>> and for the other platforms:
>>>
>>>            vm_exit_out_of_memory(_page_size, OOM_MMAP_ERROR, "card table
>>>        last card");
>>>        
>>> We'd like to get the"card table last card"  string into the Linux/Solaris error message, if possible.
>>
>> Well, it is software so just about anything is possible... :-)
>>
>> How about a new optional parameter to os::commit_memory() where an
>> "alt_mesg" can be passed in? If alt_mesg != NULL (and we're on a
>> platform that call vm_exit_out_of_memory() the platform dependent
>> code), we'll use the alt_mesg instead of "committing reserved memory".
>> Does that sound acceptable?
>>
>> Of course, I'll have to revisit and potentially update many more calls
>> to os::commit_memory().
> I need to think about that. I don't think it look that nice if we 
> duplicate that string like:
> if (!os::commit_memory(..., "card table last card")) {
>   vm_exit_out_of_memory(_page_size, OOM_MMAP_ERROR, "card table last 
> card");
> }

Could solve that with a local string...


> Maybe we could have a:
> os::commit_memory_or_exit(..., "card table last card")

So we would have both commit_memory() and commit_memory_or_exit()?
Not liking that... Also that would require more new code on MacOS X
and Windows that we don't currently need.


>
>>
>>
>>>   
>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8013057-webrev/1-hsx25/src/os/linux/vm/os_linux.cpp.frames.html 
>>>>>
>>>>>  Lines 2564-2578:
>>>>>
>>>>> I don't think we want to exit immediately if we can't get large 
>>>>> pages. With this change JVMs will stop to run if to few large 
>>>>> pages have been setup. I think this will affect a lot of users.
>>>>>
>>>>> This code path should probably be handled by:
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007074
>>>>
>>>> I suppose that depends on whether the failure to get large pages
>>>> causes the underlying reservation mapping to get lost.
>>> Yes, we'll loose the reservation but we'll get it back a couple of 
>>> rows bellow in os::commit_memory. That's why 8007074 wasn't found 
>>> immediately.
>>
>> Ummmmm. As soon as the reservation is lost, then something else in
>> the same process can come along a mmap() that memory... It could be
>> a native library over which you have no control...
>
> I agree about the dangers with this code. I'm just concerned that this 
> patch will shutdown JVMs on machines that end using up all large pages 
> or get their large pages fragmented. This will probably make the 
> default Large Pages implementation in HotSpot unusable on Linux.

Hopefully more testing will help me understand more about how things
work on Linux. Right now, I'm having trouble just getting to the
regular failure point let alone a large pages failure...


>
>>
>>
>>>
>>>> It also
>>>> depends on the errno value when large page mmap() fails.
>>>
>>> ENOMEM
>>>
>>>>
>>>> If you prefer, I can back out lines 2606-2621 in the following
>>>> function:
>>>>
>>>> 2592 bool os::pd_commit_memory(char* addr, size_t size, size_t alignment_hint,
>>>> 2593                        bool exec) {
>>>> 2594   if (UseHugeTLBFS && alignment_hint > (size_t)vm_page_size()) {
>>>> 2595     int prot = exec ? PROT_READ|PROT_WRITE|PROT_EXEC : PROT_READ|PROT_WRITE;
>>>> 2596     uintptr_t res =
>>>> 2597       (uintptr_t) ::mmap(addr, size, prot,
>>>> 2598                          MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_HUGETLB,
>>>> 2599                          -1, 0);
>>>> 2600     if (res != (uintptr_t) MAP_FAILED) {
>>>> 2601       if (UseNUMAInterleaving) {
>>>> 2602         numa_make_global(addr, size);
>>>> 2603       }
>>>> 2604       return true;
>>>> 2605     }
>>>> 2606
>>>> 2607     int err = errno;  // save errno from mmap() call above
>>>> 2608     switch (err) {
>>>> 2609     case EBADF:
>>>> 2610     case EINVAL:
>>>> 2611     case ENOTSUP:
>>>> 2612       break;
>>>> 2613
>>>> 2614     default:
>>>> 2615       warning("INFO: os::commit_memory(" PTR_FORMAT ", " SIZE_FORMAT
>>>> 2616               ", " SIZE_FORMAT ", %d) failed; errno=%d", addr, size,
>>>> 2617               alignment_hint, exec, err);
>>>> 2618       vm_exit_out_of_memory(size, OOM_MMAP_ERROR,
>>>> 2619                             "committing reserved memory.");
>>>> 2620       break;
>>>> 2621     }
>>>> 2622     // Fall through and try to use small pages
>>>> 2623   }
>>>> 2624
>>>> 2625   if (pd_commit_memory(addr, size, exec)) {
>>>> 2626     realign_memory(addr, size, alignment_hint);
>>>> 2627     return true;
>>>> 2628   }
>>>> 2629   return false;
>>>> 2630 }
>>>> and then I can leave that potential location for your work on 8007074.
>>>
>>> I think that would be good.
>>
>> Based on your reply above that the reservation can be lost, now
>> I need to be convinced why you think it is OK to leave it for
>> now...
>
> I just see that we have two bad solutions at the moment: one that 
> exits the JVM and the other that _might_ crash the JVM. I think 
> neither is satisfactory for the users.

The current situation is worse. What happens if the transient out of
memory condition doesn't crash the JVM, it just corrupts that JVM in
subtle ways... and the JVM just propagates the corrupted data.... :-(

Dan


>
> StefanK
>>
>>
>>>
>>>>
>>>> Before I back out the code, I would be interested in exercising it
>>>> in a large page config on a Linux machine. Can you give me some info
>>>> about how to enable large pages on Linux?
>>>
>>> I have the instructions in the bug report. Contact me if that 
>>> doesn't work.
>>
>> I thought of that after I sent my reply this morning. I'm doing
>> basic testing on Ron D's Linux machine right now. Large pages
>> will be next...
>>
>> Dan
>>
>>
>>>
>>>>
>>>> I'll have to see about getting access to a Linux machine. I don't
>>>> have one of those in my lab. (More on that below)
>>>>
>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8013057-webrev/1-hsx25/src/os/linux/vm/os_linux.cpp.frames.html 
>>>>>
>>>>>  Lines 2614-2620:
>>>>>
>>>>> Do we ever end up failing like this on Linux? Have you been able 
>>>>> to reproduce this on Linux?
>>>>
>>>> According to this bug:
>>>>
>>>>     JDK-6843484 os::commit_memory() failures are not handled 
>>>> properly on linux
>>>> https://jbs.oracle.com/bugs/browse/JDK-6843484
>>>>
>>>> we do run into this issue on Linux. However, I have not tried my
>>>> reproducer on a Linux machine. I'm a bit worried about doing that
>>>> since I swamped my local Solaris server so bad that I had to power
>>>> cycle it Friday night.
>>>>
>>>> I will investigate getting access to a remote Linux machine and I'll
>>>> check into how they get rebooted, get unstuck, etc... I'm worried
>>>> about screwing up someone else's machine so I'll see about getting
>>>> my own...
>>>
>>> OK.
>>>
>>> StefanK
>>>>
>>>> Dan
>>>>
>>>>
>>>>
>>>>>
>>>>> StefanK
>>>>>
>>>>>>
>>>>>> Testing:
>>>>>> - Aurora Adhoc vm.quick batch for all OSes in the following configs:
>>>>>>   {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>>>> - I've created a standalone Java stress test with a shell script
>>>>>>   wrapper that reproduces the failing code paths on my Solaris X86
>>>>>>   server. This test will not be integrated since running the machine
>>>>>>   out of swap space is very disruptive (crashes the window system,
>>>>>>   causes various services to exit, etc.)
>>>>>>
>>>>>> Gory details are below. As always, comments, questions and
>>>>>> suggestions are welome.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> Gory Details:
>>>>>>
>>>>>> The VirtualSpace data structure is built on top of the ReservedSpace
>>>>>> data structure. VirtualSpace presumes that failed 
>>>>>> os::commit_memory()
>>>>>> calls do not affect the underlying ReservedSpace memory mappings.
>>>>>> That assumption is true on MacOS X and Windows, but it is not true
>>>>>> on Linux or Solaris. The mmap() system call on Linux or Solaris can
>>>>>> lose previous mappings in the event of certain errors. On MacOS X,
>>>>>> the mmap() system call clearly states that previous mappings are
>>>>>> replaced only on success. On Windows, a different set of APIs are
>>>>>> used and they do not document any loss of previous mappings.
>>>>>>
>>>>>> The solution is to implement the proper failure checks in the
>>>>>> os::commit_memory() implementations on Linux and Solaris. On MacOS X
>>>>>> and Windows, no additional checks are needed.
>>>>>>
>>>>>> There is also a secondary change where some of the 
>>>>>> pd_commit_memory()
>>>>>> calls were calling os::commit_memory() instead of calling their 
>>>>>> sibling
>>>>>> os::pd_commit_memory(). This resulted in double NMT tracking so this
>>>>>> has also been fixed. There were also some incorrect mmap)() return
>>>>>> value checks which have been fixed.
>>>>>>
>>>>>> Just to be clear: This fix simply properly detects the "out of swap
>>>>>> space" condition on Linux and Solaris and causes the VM to fail in a
>>>>>> more orderly fashion with a message that looks like this:
>>>>>>
>>>>>> The Java process' stderr will show:
>>>>>>
>>>>>> INFO: os::commit_memory(0xfffffd7fb2522000, 4096, 4096, 0) 
>>>>>> failed; errno=11
>>>>>> #
>>>>>> # There is insufficient memory for the Java Runtime Environment 
>>>>>> to continue.
>>>>>> # Native memory allocation (mmap) failed to map 4096 bytes for 
>>>>>> committing reserved memory.
>>>>>> # An error report file with more information is saved as:
>>>>>> # /work/shared/bugs/8013057/looper.03/hs_err_pid9111.log
>>>>>>
>>>>>> The hs_err_pid file will have the more verbose info:
>>>>>>
>>>>>> #
>>>>>> # There is insufficient memory for the Java Runtime Environment 
>>>>>> to continue.
>>>>>> # Native memory allocation (mmap) failed to map 4096 bytes for 
>>>>>> committing reserved memory.
>>>>>> # Possible reasons:
>>>>>> #   The system is out of physical RAM or swap space
>>>>>> #   In 32 bit mode, the process size limit was hit
>>>>>> # Possible solutions:
>>>>>> #   Reduce memory load on the system
>>>>>> #   Increase physical memory or swap space
>>>>>> #   Check if swap backing store is full
>>>>>> #   Use 64 bit Java on a 64 bit OS
>>>>>> #   Decrease Java heap size (-Xmx/-Xms)
>>>>>> #   Decrease number of Java threads
>>>>>> #   Decrease Java thread stack sizes (-Xss)
>>>>>> #   Set larger code cache with -XX:ReservedCodeCacheSize=
>>>>>> # This output file may be truncated or incomplete.
>>>>>> #
>>>>>> #  Out of Memory Error 
>>>>>> (/work/shared/bug_hunt/hsx_rt_latest/exp_8013057/src/os/s
>>>>>> olaris/vm/os_solaris.cpp:2791), pid=9111, tid=21
>>>>>> #
>>>>>> # JRE version: Java(TM) SE Runtime Environment (8.0-b89) (build 
>>>>>> 1.8.0-ea-b89)
>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM 
>>>>>> (25.0-b33-bh_hsx_rt_exp_8013057_dcu
>>>>>> bed-product-fastdebug mixed mode solaris-amd64 compressed oops)
>>>>>> # Core dump written. Default location: 
>>>>>> /work/shared/bugs/8013057/looper.03/core
>>>>>> or core.9111
>>>>>> #
>>>>>>
>>>>>> You might be wondering why we are assuming that the failed mmap()
>>>>>> commit operation has lost the 'reserved memory' mapping.
>>>>>>
>>>>>>     We have no good way to determine if the 'reserved memory' 
>>>>>> mapping
>>>>>>     is lost. Since all the other threads are not idle, it is 
>>>>>> possible
>>>>>>     for another thread to have 'reserved' the same memory space 
>>>>>> for a
>>>>>>     different data structure. Our thread could observe that the 
>>>>>> memory
>>>>>>     is still 'reserved' but we have no way to know that the 
>>>>>> reservation
>>>>>>     isn't ours.
>>>>>>
>>>>>> You might be wondering why we can't recover from this transient
>>>>>> resource availability issue.
>>>>>>
>>>>>>     We could retry the failed mmap() commit operation, but we would
>>>>>>     again run into the issue that we no longer know which data
>>>>>>     structure 'owns' the 'reserved' memory mapping. In 
>>>>>> particular, the
>>>>>>     memory could be reserved by native code calling mmap() 
>>>>>> directly so
>>>>>>     the VM really has no way to recover from this failure.
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/attachments/20130528/ec77edfa/attachment-0001.html 


More information about the hotspot-runtime-dev mailing list