HSX-24 Backport Code Review for memory commit failure fix (8013057)

David Holmes david.holmes at oracle.com
Mon Jun 10 22:53:48 PDT 2013


Ship it! :)

Thanks Dan.

David
-----

On 8/06/2013 8:21 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have an HSX-24 backport of the proposed fix for the following bug:
>
>      8013057 assert(_needs_gc || SafepointSynchronize::is_at_safepoint())
>              failed: only read at safepoint
>
> Here are the (round 0) webrev URLs for the HSX-24 backport:
>
> OpenJDK: http://cr.openjdk.java.net/~dcubed/8013057-webrev/0-hsx24/
> Internal: http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/0-hsx24/
>
> Here are the (round 3) webrev URLs for the HSX-25 version for reference:
>
> OpenJDK: http://cr.openjdk.java.net/~dcubed/8013057-webrev/3-hsx25/
> Internal: http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/3-hsx25/
>
> The easiest way to review the backport is to download and compare
> the two patch files with your favorite file comparison tool:
>
> OpenJDK patch links:
> http://cr.openjdk.java.net/~dcubed/8013057-webrev/0-hsx24/8013057_for_hsx24_exp.patch
>
> http://cr.openjdk.java.net/~dcubed/8013057-webrev/3-hsx25/8013057_exp.patch
>
> Internal patch links:
> http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/0-hsx24/8013057_for_hsx24_exp.patch
>
> http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/3-hsx25/8013057_exp.patch
>
>
> For 'diff' oriented people (like me), here's my quick sanity check of
> the two patch files:
>      - diff the patches
>      - strip down to line additions and deletions
>      - strip out the copyright changes
>
> $ diff 8013057-webrev/3-hsx25/8013057_exp.patch \
>      8013057-webrev/0-hsx24/8013057_for_hsx24_exp.patch \
>      | grep '^[<>] [+-] ' | grep -v Copyright
> < +  return false;
>  > +  return false;
> < +    vm_exit_out_of_memory(size, OOM_MMAP_ERROR, mesg);
>  > +    vm_exit_out_of_memory(size, mesg);
> < +    vm_exit_out_of_memory(size, OOM_MMAP_ERROR, "committing reserved memory.");
>  > +    vm_exit_out_of_memory(size, "committing reserved memory.");
> < +    vm_exit_out_of_memory(size, OOM_MMAP_ERROR, mesg);
>  > +    vm_exit_out_of_memory(size, mesg);
> < +    vm_exit_out_of_memory(size, OOM_MMAP_ERROR, mesg);
>  > +    vm_exit_out_of_memory(size, mesg);
> < +    vm_exit_out_of_memory(bytes, OOM_MMAP_ERROR, "committing reserved memory.");
>  > +    vm_exit_out_of_memory(bytes, "committing reserved memory.");
> < +    vm_exit_out_of_memory(bytes, OOM_MMAP_ERROR, mesg);
>  > +    vm_exit_out_of_memory(bytes, mesg);
> < +    vm_exit_out_of_memory(bytes, OOM_MMAP_ERROR, mesg);
>  > +    vm_exit_out_of_memory(bytes, mesg);
> < +    vm_exit_out_of_memory(size, OOM_MMAP_ERROR, mesg);
>  > +    vm_exit_out_of_memory(size, mesg);
> < -        vm_exit_out_of_memory(new_committed.byte_size(), OOM_MMAP_ERROR,
>  > -        vm_exit_out_of_memory(new_committed.byte_size(),
> < -    vm_exit_out_of_memory(_size, OOM_MMAP_ERROR, "Allocator (commit)");
>  > -    vm_exit_out_of_memory(_size, "Allocator (commit)");
> < -    vm_exit_out_of_memory(_page_size, OOM_MMAP_ERROR, "card table last card");
>  > -    vm_exit_out_of_memory(_page_size, "card table last card");
> < -        vm_exit_out_of_memory(new_committed.byte_size(), OOM_MMAP_ERROR,
>  > -        vm_exit_out_of_memory(new_committed.byte_size(),
>
> The majority of the diffs are due to the new OOM failure type parameter
> that was added in HSX-25 only. The "return false" shows up because HSX-24
> has some left-over MacOS X port cruft that was removed in HSX-25, so the
> "return false" looks like it was added in different places.
>
> The remainder is the HSX-25 round 3 code review invite with the
> code blocks updated for HSX-24.
>
>
> Testing:
> - Aurora Adhoc vm.quick batch for all OSes in the following configs:
>    {Client VM, Server VM} x {fastdebug} x {-Xmixed}
> - I've created a standalone Java stress test with a shell script
>    wrapper that reproduces the failing code paths on my Solaris X86
>    server and on Ron's DevOps Linux machine. This test will not be
>    integrated since running the machine out of swap space is very
>    disruptive (crashes the window system, causes various services to
>    exit, etc.)
>
> There are three parts to this fix:
>
> 1) Detect commit memory failures on Linux and Solaris where the
>     previous reservation can be lost and call vm_exit_out_of_memory()
>     to report the resource exhaustion. Add os::commit_memory_or_exit()
>     API to provide more consistent handling of vm_exit_out_of_memory()
>     calls.
> 2) Change existing os::commit_memory() calls to make the executable
>     status of memory more clear; this makes security analysis easier.
> 3) Clean up some platform dependent layer calls that were resulting
>     in extra NMT accounting. Clean up some mmap() return value checks.
>
> Gory details are below. As always, comments, questions and
> suggestions are welcome.
>
> Dan
>
>
> Gory Details:
>
> The VirtualSpace data structure is built on top of the ReservedSpace
> data structure. VirtualSpace presumes that failed os::commit_memory()
> calls do not affect the underlying ReservedSpace memory mappings.
> That assumption is true on MacOS X and Windows, but it is not true
> on Linux or Solaris. The mmap() system call on Linux or Solaris can
> lose previous mappings in the event of certain errors. On MacOS X,
> the mmap() system call clearly states that previous mappings are
> replaced only on success. On Windows, a different set of APIs are
> used and they do not document any loss of previous mappings.
>
> The solution is to implement the proper failure checks in the
> os::commit_memory() implementations on Linux and Solaris. On MacOS X
> and Windows, no additional checks are needed.
>
> During code review round 1, there was a request from the GC team to
> provide an os::commit_memory_or_exit() entry point in order to preserve
> the existing error messages on all platforms. This entry point allows
> code like this:
>
> src/share/vm/gc_implementation/parallelScavenge/cardTableExtension.cpp:
>
>   652       if (!os::commit_memory((char*)new_committed.start(),
>   653                              new_committed.byte_size())) {
>   654         vm_exit_out_of_memory(new_committed.byte_size(),
>   655                               "card table expansion");
>
> to be replaced with code like this:
>
>   652       os::commit_memory_or_exit((char*)new_committed.start(),
>   653                                 new_committed.byte_size(), !ExecMem,
>   654                                 "card table expansion");
>
> All uses of os::commit_memory() have been visited and those locations
> that previously exited on error have been updated to use the new entry
> point. This new entry point cleans up the original call sites and the
> vm_exit_out_of_memory() calls are now consistent on all platforms.
>
> As a secondary change, while visiting all os::commit_memory() calls, I
> also updated them to use the new ExecMem enum in order to make the
> executable status of the memory more clear. Since executable memory can
> be an attack vector, it is prudent to make the executable status of
> memory crystal clear. This also allowed me to remove the default
> executable flag value of 'false'. Now all new uses of commit_memory()
> must be clear about the executable status of the memory.
>
> There are also tertiary changes where some of the pd_commit_memory()
> calls were calling os::commit_memory() instead of calling their sibling
> os::pd_commit_memory(). This resulted in double NMT tracking, which has
> also been fixed. There were also some incorrect mmap() return value
> checks which have been fixed.
>
> Just to be clear: This fix simply detects the "out of swap space"
> condition properly on Linux and Solaris and causes the VM to fail in a
> more orderly fashion with a message that looks like this:
>
> The Java process' stderr will show:
>
> INFO: os::commit_memory(0xfffffd7fb2522000, 4096, 4096, 0) failed; errno=11
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (mmap) failed to map 4096 bytes for committing reserved memory.
> # An error report file with more information is saved as:
> # /work/shared/bugs/8013057/looper.03/hs_err_pid9111.log
>
> The hs_err_pid file will have the more verbose info:
>
> #
> # There is insufficient memory for the Java Runtime Environment to continue.
> # Native memory allocation (mmap) failed to map 4096 bytes for committing reserved memory.
> # Possible reasons:
> #   The system is out of physical RAM or swap space
> #   In 32 bit mode, the process size limit was hit
> # Possible solutions:
> #   Reduce memory load on the system
> #   Increase physical memory or swap space
> #   Check if swap backing store is full
> #   Use 64 bit Java on a 64 bit OS
> #   Decrease Java heap size (-Xmx/-Xms)
> #   Decrease number of Java threads
> #   Decrease Java thread stack sizes (-Xss)
> #   Set larger code cache with -XX:ReservedCodeCacheSize=
> # This output file may be truncated or incomplete.
> #
> #  Out of Memory Error (/work/shared/bug_hunt/hsx_rt_latest/exp_8013057/src/os/solaris/vm/os_solaris.cpp:2791), pid=9111, tid=21
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0-b89) (build 1.8.0-ea-b89)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b33-bh_hsx_rt_exp_8013057_dcubed-product-fastdebug mixed mode solaris-amd64 compressed oops)
> # Core dump written. Default location: /work/shared/bugs/8013057/looper.03/core or core.9111
> #
>
> You might be wondering why we are assuming that the failed mmap()
> commit operation has lost the 'reserved memory' mapping.
>
>      We have no good way to determine if the 'reserved memory' mapping
>      is lost. Since the other threads are not idle, it is possible
>      for another thread to have 'reserved' the same memory space for a
>      different data structure. Our thread could observe that the memory
>      is still 'reserved', but we have no way to know that the
>      reservation isn't ours.
>
> You might be wondering why we can't recover from this transient
> resource availability issue.
>
>      We could retry the failed mmap() commit operation, but we would
>      again run into the issue that we no longer know which data
>      structure 'owns' the 'reserved' memory mapping. In particular, the
>      memory could be reserved by native code calling mmap() directly so
>      the VM really has no way to recover from this failure.
>
> You might be wondering why part of this work is deferred:
>
> 2749     if (!recoverable_mmap_error(err)) {
> 2750       // However, it is not clear that this loss of our reserved mapping
> 2751       // happens with large pages on Linux or that we cannot recover
> 2752       // from the loss. For now, we just issue a warning and we don't
> 2753       // call vm_exit_out_of_memory(). This issue is being tracked by
> 2754       // JBS-8007074.
> 2755       warn_fail_commit_memory(addr, size, alignment_hint, exec, err);
> 2756 //    vm_exit_out_of_memory(size, "committing reserved memory.");
> 2757     }
> 2758     // Fall through and try to use small pages
>
>      When line 2756 is enabled and UseHugeTLBFS is specified, then the
>      VM will exit because no more huge/large pages are available. It is
>      not yet clear that this transition from large to small pages is
>      actually unsafe, but we don't yet have proof that it is safe
>      either. More research will be done via JBS-8007074.
>
> If you've made it this far without falling asleep or drooling on your
> keyboard, I applaud you!


More information about the hotspot-runtime-dev mailing list