code review (round 1) for memory commit failure fix (8013057)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue May 28 13:08:06 PDT 2013
Stefan,
Thanks for getting back to me! Replies embedded below...
On 5/28/13 1:02 PM, Stefan Karlsson wrote:
> On 5/28/13 5:55 PM, Daniel D. Daugherty wrote:
>> Stefan,
>>
>> Thanks for the re-review! Replies embedded below.
>>
>>
>> On 5/28/13 2:56 AM, Stefan Karlsson wrote:
>>> On 05/24/2013 08:23 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a revised version of the proposed fix for the following bug:
>>>>
>>>> 8013057 assert(_needs_gc ||
>>>> SafepointSynchronize::is_at_safepoint())
>>>> failed: only read at safepoint
>>>>
>>>> Here are the (round 1) webrev URLs:
>>>>
>>>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8013057-webrev/1-hsx25/
>>>> Internal:
>>>> http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/1-hsx25/
>>>
>>> Your patch exits the VM if we fail to get memory in
>>> os::commit_memory. There are a couple of places where we already
>>> have error messages, what should we do about them? For example:
>>>   if (!os::commit_memory((char*)guard_page, _page_size, _page_size)) {
>>>     // Do better than this for Merlin
>>>     vm_exit_out_of_memory(_page_size, OOM_MMAP_ERROR,
>>>                           "card table last card");
>>>   }
>>
>> Actually, my patch exits the VM only on Linux and Solaris, and only for
>> certain mmap() error code values, so we shouldn't do anything with
>> existing code that exits on os::commit_memory() failures.
>>
>> For the specific example above, it is in platform independent code in
>> src/share/vm/memory/cardTableModRefBS.cpp so we definitely don't want
>> to remove that check. That would be bad for non-Linux and non-Solaris
>> platforms.
>
> I talked to the GC team today, and we don't want to lose the hints
> that tell what the memory is allocated for. The code you're suggesting
> will report the following, for the example above, on Linux/Solaris:
>
> + warning("INFO: os::commit_memory(" PTR_FORMAT ", " SIZE_FORMAT
> + ", " SIZE_FORMAT ", %d) failed; errno=%d", addr, size,
> + alignment_hint, exec, err);
> + vm_exit_out_of_memory(size, OOM_MMAP_ERROR,
> + "committing reserved memory.");
>
> and for the other platforms:
>
>   vm_exit_out_of_memory(_page_size, OOM_MMAP_ERROR,
>                         "card table last card");
>
> We'd like to get the "card table last card" string into the
> Linux/Solaris error message, if possible.
Well, it is software so just about anything is possible... :-)
How about a new optional parameter to os::commit_memory() where an
"alt_mesg" can be passed in? If alt_mesg != NULL (and we're on a
platform that calls vm_exit_out_of_memory() in the platform dependent
code), we'll use the alt_mesg instead of "committing reserved memory".
Does that sound acceptable?
Of course, I'll have to revisit and potentially update many more calls
to os::commit_memory().
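To make the alt_mesg idea concrete, here is a minimal standalone sketch. The helper names (commit_failure_mesg, format_oom_report) are hypothetical and are not part of HotSpot; the sketch only shows how a NULL default could select between the caller's hint and the generic text, and it formats the report line instead of exiting the VM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Generic text used by the platform dependent code in the proposed fix.
static const char* const DEFAULT_COMMIT_MESG = "committing reserved memory.";

// Hypothetical helper: prefer the caller's hint when one is supplied.
const char* commit_failure_mesg(const char* alt_mesg = NULL) {
  return (alt_mesg != NULL) ? alt_mesg : DEFAULT_COMMIT_MESG;
}

// Hypothetical stand-in for the fatal path: builds the report line
// rather than calling vm_exit_out_of_memory().
std::string format_oom_report(size_t size, const char* alt_mesg = NULL) {
  return "Native memory allocation (mmap) failed to map " +
         std::to_string(size) + " bytes for " +
         commit_failure_mesg(alt_mesg);
}
```

With this shape, existing callers that pass no hint keep the generic text, while a caller such as the card table code could pass "card table last card" and have it show up in the Linux/Solaris message.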
>
>>
>>> http://cr.openjdk.java.net/~dcubed/8013057-webrev/1-hsx25/src/os/linux/vm/os_linux.cpp.frames.html
>>>
>>> Lines 2564-2578:
>>>
>>> I don't think we want to exit immediately if we can't get large
>>> pages. With this change JVMs will stop running if too few large pages
>>> have been set up. I think this will affect a lot of users.
>>>
>>> This code path should probably be handled by:
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007074
>>
>> I suppose that depends on whether the failure to get large pages
>> causes the underlying reservation mapping to get lost.
> Yes, we'll lose the reservation but we'll get it back a couple of
> lines below in os::commit_memory. That's why 8007074 wasn't found
> immediately.
Ummmmm. As soon as the reservation is lost, then something else in
the same process can come along and mmap() that memory... It could be
a native library over which you have no control...
>
>> It also
>> depends on the errno value when large page mmap() fails.
>
> ENOMEM
>
>>
>> If you prefer, I can back out lines 2606-2621 in the following
>> function:
>>
>> 2592 bool os::pd_commit_memory(char* addr, size_t size, size_t alignment_hint,
>> 2593 bool exec) {
>> 2594 if (UseHugeTLBFS && alignment_hint > (size_t)vm_page_size()) {
>> 2595 int prot = exec ? PROT_READ|PROT_WRITE|PROT_EXEC : PROT_READ|PROT_WRITE;
>> 2596 uintptr_t res =
>> 2597 (uintptr_t) ::mmap(addr, size, prot,
>> 2598 MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS|MAP_HUGETLB,
>> 2599 -1, 0);
>> 2600 if (res != (uintptr_t) MAP_FAILED) {
>> 2601 if (UseNUMAInterleaving) {
>> 2602 numa_make_global(addr, size);
>> 2603 }
>> 2604 return true;
>> 2605 }
>> 2606
>> 2607 int err = errno; // save errno from mmap() call above
>> 2608 switch (err) {
>> 2609 case EBADF:
>> 2610 case EINVAL:
>> 2611 case ENOTSUP:
>> 2612 break;
>> 2613
>> 2614 default:
>> 2615 warning("INFO: os::commit_memory(" PTR_FORMAT ", " SIZE_FORMAT
>> 2616 ", " SIZE_FORMAT ", %d) failed; errno=%d", addr, size,
>> 2617 alignment_hint, exec, err);
>> 2618 vm_exit_out_of_memory(size, OOM_MMAP_ERROR,
>> 2619 "committing reserved memory.");
>> 2620 break;
>> 2621 }
>> 2622 // Fall through and try to use small pages
>> 2623 }
>> 2624
>> 2625 if (pd_commit_memory(addr, size, exec)) {
>> 2626 realign_memory(addr, size, alignment_hint);
>> 2627 return true;
>> 2628 }
>> 2629 return false;
>> 2630 }
>> and then I can leave that potential location for your work on 8007074.
>
> I think that would be good.
Based on your reply above that the reservation can be lost, now
I need to be convinced why you think it is OK to leave it for
now...
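For reference, the errno handling in the quoted pd_commit_memory() listing reduces to a two-way classification. The function below is a hypothetical standalone restatement of that switch, not code from the webrev:

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper mirroring the switch in the quoted listing:
// EBADF, EINVAL and ENOTSUP fall through to the small page path;
// every other errno value is treated as fatal.
bool large_page_failure_is_recoverable(int err) {
  switch (err) {
    case EBADF:
    case EINVAL:
    case ENOTSUP:
      return true;
    default:
      return false;
  }
}
```

Under this classification ENOMEM, the value Stefan reports for a failed large page mmap(), lands on the fatal side, which is the behavior being questioned for lines 2606-2621.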
>
>>
>> Before I back out the code, I would be interested in exercising it
>> in a large page config on a Linux machine. Can you give me some info
>> about how to enable large pages on Linux?
>
> I have the instructions in the bug report. Contact me if that doesn't
> work.
I thought of that after I sent my reply this morning. I'm doing
basic testing on Ron D's Linux machine right now. Large pages
will be next...
Dan
>
>>
>> I'll have to see about getting access to a Linux machine. I don't
>> have one of those in my lab. (More on that below)
>>
>>
>>> http://cr.openjdk.java.net/~dcubed/8013057-webrev/1-hsx25/src/os/linux/vm/os_linux.cpp.frames.html
>>>
>>> Lines 2614-2620:
>>>
>>> Do we ever end up failing like this on Linux? Have you been able to
>>> reproduce this on Linux?
>>
>> According to this bug:
>>
>> JDK-6843484 os::commit_memory() failures are not handled properly
>> on linux
>> https://jbs.oracle.com/bugs/browse/JDK-6843484
>>
>> we do run into this issue on Linux. However, I have not tried my
>> reproducer on a Linux machine. I'm a bit worried about doing that
>> since I swamped my local Solaris server so bad that I had to power
>> cycle it Friday night.
>>
>> I will investigate getting access to a remote Linux machine and I'll
>> check into how they get rebooted, get unstuck, etc... I'm worried
>> about screwing up someone else's machine so I'll see about getting
>> my own...
>
> OK.
>
> StefanK
>>
>> Dan
>>
>>
>>
>>>
>>> StefanK
>>>
>>>>
>>>> Testing:
>>>> - Aurora Adhoc vm.quick batch for all OSes in the following configs:
>>>> {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>>>> - I've created a standalone Java stress test with a shell script
>>>> wrapper that reproduces the failing code paths on my Solaris X86
>>>> server. This test will not be integrated since running the machine
>>>> out of swap space is very disruptive (crashes the window system,
>>>> causes various services to exit, etc.)
>>>>
>>>> Gory details are below. As always, comments, questions and
>>>> suggestions are welcome.
>>>>
>>>> Dan
>>>>
>>>>
>>>> Gory Details:
>>>>
>>>> The VirtualSpace data structure is built on top of the ReservedSpace
>>>> data structure. VirtualSpace presumes that failed os::commit_memory()
>>>> calls do not affect the underlying ReservedSpace memory mappings.
>>>> That assumption is true on MacOS X and Windows, but it is not true
>>>> on Linux or Solaris. The mmap() system call on Linux or Solaris can
>>>> lose previous mappings in the event of certain errors. On MacOS X,
>>>> the mmap() documentation clearly states that previous mappings are
>>>> replaced only on success. On Windows, a different set of APIs is
>>>> used and they do not document any loss of previous mappings.
>>>>
>>>> The solution is to implement the proper failure checks in the
>>>> os::commit_memory() implementations on Linux and Solaris. On MacOS X
>>>> and Windows, no additional checks are needed.
>>>>
>>>> There is also a secondary change where some of the pd_commit_memory()
>>>> calls were calling os::commit_memory() instead of calling their
>>>> sibling os::pd_commit_memory(). This resulted in double NMT tracking
>>>> so this has also been fixed. There were also some incorrect mmap()
>>>> return value checks which have been fixed.
>>>>
>>>> Just to be clear: This fix simply properly detects the "out of swap
>>>> space" condition on Linux and Solaris and causes the VM to fail in a
>>>> more orderly fashion with a message that looks like this:
>>>>
>>>> The Java process' stderr will show:
>>>>
>>>> INFO: os::commit_memory(0xfffffd7fb2522000, 4096, 4096, 0) failed; errno=11
>>>> #
>>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>>> # Native memory allocation (mmap) failed to map 4096 bytes for committing reserved memory.
>>>> # An error report file with more information is saved as:
>>>> # /work/shared/bugs/8013057/looper.03/hs_err_pid9111.log
>>>>
>>>> The hs_err_pid file will have the more verbose info:
>>>>
>>>> #
>>>> # There is insufficient memory for the Java Runtime Environment to continue.
>>>> # Native memory allocation (mmap) failed to map 4096 bytes for committing reserved memory.
>>>> # Possible reasons:
>>>> # The system is out of physical RAM or swap space
>>>> # In 32 bit mode, the process size limit was hit
>>>> # Possible solutions:
>>>> # Reduce memory load on the system
>>>> # Increase physical memory or swap space
>>>> # Check if swap backing store is full
>>>> # Use 64 bit Java on a 64 bit OS
>>>> # Decrease Java heap size (-Xmx/-Xms)
>>>> # Decrease number of Java threads
>>>> # Decrease Java thread stack sizes (-Xss)
>>>> # Set larger code cache with -XX:ReservedCodeCacheSize=
>>>> # This output file may be truncated or incomplete.
>>>> #
>>>> # Out of Memory Error (/work/shared/bug_hunt/hsx_rt_latest/exp_8013057/src/os/solaris/vm/os_solaris.cpp:2791), pid=9111, tid=21
>>>> #
>>>> # JRE version: Java(TM) SE Runtime Environment (8.0-b89) (build 1.8.0-ea-b89)
>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b33-bh_hsx_rt_exp_8013057_dcubed-product-fastdebug mixed mode solaris-amd64 compressed oops)
>>>> # Core dump written. Default location: /work/shared/bugs/8013057/looper.03/core or core.9111
>>>> #
>>>>
>>>> You might be wondering why we are assuming that the failed mmap()
>>>> commit operation has lost the 'reserved memory' mapping.
>>>>
>>>> We have no good way to determine if the 'reserved memory' mapping
>>>> is lost. Since all the other threads are not idle, it is possible
>>>> for another thread to have 'reserved' the same memory space for a
>>>> different data structure. Our thread could observe that the memory
>>>> is still 'reserved' but we have no way to know that the
>>>> reservation isn't ours.
>>>>
>>>> You might be wondering why we can't recover from this transient
>>>> resource availability issue.
>>>>
>>>> We could retry the failed mmap() commit operation, but we would
>>>> again run into the issue that we no longer know which data
>>>> structure 'owns' the 'reserved' memory mapping. In particular, the
>>>> memory could be reserved by native code calling mmap() directly so
>>>> the VM really has no way to recover from this failure.
>>>
>>
>
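The reserve-then-commit pattern described in the Gory Details can be sketched in isolation as follows. This is a minimal illustration assuming Linux mmap() semantics; the function names and the exact flag choices are assumptions for the example and are not copied from os_linux.cpp.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Reserve address space with no access rights and no swap accounting.
// Returns NULL on failure. (Assumed flags; sketch only.)
char* reserve_sketch(size_t bytes) {
  void* p = ::mmap(NULL, bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return (p == MAP_FAILED) ? NULL : (char*)p;
}

// Commit part of a reservation by mapping over it with MAP_FIXED.
// Returns 0 on success, otherwise the errno saved from mmap().
// After a nonzero return the caller cannot assume that the original
// reservation at addr is still in place.
int commit_sketch(char* addr, size_t bytes) {
  void* p = ::mmap(addr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) {
    return errno;  // save errno before any other call can change it
  }
  return 0;
}

// Reserves four pages, commits the first one, then writes and reads
// a byte through the committed page.
bool reserve_commit_roundtrip() {
  size_t page = (size_t)::sysconf(_SC_PAGESIZE);
  char* base = reserve_sketch(4 * page);
  if (base == NULL) return false;
  bool ok = (commit_sketch(base, page) == 0);
  if (ok) {
    base[0] = 42;
    ok = (base[0] == 42);
  }
  ::munmap(base, 4 * page);
  return ok;
}
```

The point of the sketch is the failure branch in commit_sketch(): because the commit is a MAP_FIXED mapping over the reserved range, a failed call gives no guarantee about what remains at that address, which is why the proposed fix exits rather than retrying.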