code review (round 2) for memory commit failure fix (8013057)

Zhengyu Gu zhengyu.gu at oracle.com
Tue Jun 4 13:31:16 PDT 2013


Dan,

Still good from the NMT perspective.

Question:
os_linux.cpp #2667:
  why keep dead code?


Thanks,

-Zhengyu


On Jun 4, 2013, at 4:19 PM, Daniel D. Daugherty wrote:

> I'm looking for re-reviews from Zhengyu, Stefan, David H and Dmitry S...
> Please chime in on this thread when you get a chance...
> 
> Dan
> 
> 
> On 6/4/13 11:32 AM, Daniel D. Daugherty wrote:
>> Greetings,
>> 
>> I have another revised version of the proposed fix for the following bug:
>> 
>>    8013057 assert(_needs_gc || SafepointSynchronize::is_at_safepoint())
>>            failed: only read at safepoint
>> 
>> Here are the (round 2) webrev URLs:
>> 
>> OpenJDK: http://cr.openjdk.java.net/~dcubed/8013057-webrev/2-hsx25/
>> Internal: http://javaweb.us.oracle.com/~ddaugher/8013057-webrev/2-hsx25/
>> 
>> Testing:
>> - Aurora Adhoc vm.quick batch for all OSes in the following configs:
>>  {Client VM, Server VM} x {fastdebug} x {-Xmixed}
>> - I've created a standalone Java stress test with a shell script
>>  wrapper that reproduces the failing code paths on my Solaris X86
>>  server and on Ron's DevOps Linux machine. This test will not be
>>  integrated since running the machine out of swap space is very
>>  disruptive (crashes the window system, causes various services to
>>  exit, etc.)
>> 
>> There are three parts to this fix:
>> 
>> 1) Detect commit memory failures on Linux and Solaris where the
>>   previous reservation can be lost and call vm_exit_out_of_memory()
>>   to report the resource exhaustion. Add os::commit_memory_or_exit()
>>   API to provide more consistent handling of vm_exit_out_of_memory()
>>   calls.
>> 2) Change existing os::commit_memory() calls to make the executable
>>   status of memory more clear; this makes security analysis easier.
>> 3) Clean up some platform dependent layer calls that were resulting
>>   in extra NMT accounting. Clean up some mmap() return value checks.
>> 
>> Gory details are below. As always, comments, questions and
>> suggestions are welcome.
>> 
>> Dan
>> 
>> 
>> Gory Details:
>> 
>> The VirtualSpace data structure is built on top of the ReservedSpace
>> data structure. VirtualSpace presumes that failed os::commit_memory()
>> calls do not affect the underlying ReservedSpace memory mappings.
>> That assumption is true on MacOS X and Windows, but it is not true
>> on Linux or Solaris. The mmap() system call on Linux or Solaris can
>> lose previous mappings in the event of certain errors. On MacOS X,
>> the mmap() system call clearly states that previous mappings are
>> replaced only on success. On Windows, a different set of APIs is
>> used, and those APIs do not document any loss of previous mappings.
>> 
>> The solution is to implement the proper failure checks in the
>> os::commit_memory() implementations on Linux and Solaris. On MacOS X
>> and Windows, no additional checks are needed.
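>> 
>> To illustrate the underlying mmap() behavior outside of HotSpot, here is
>> a minimal standalone Linux sketch (the names reserve_range() and
>> commit_page() are hypothetical, not HotSpot code). An anonymous
>> PROT_NONE/MAP_NORESERVE mapping stands in for the reservation, and the
>> commit re-maps part of it with MAP_FIXED, which is why a failure at that
>> point may already have destroyed the prior reservation:
>> 
>>     // Illustrative sketch only, not HotSpot source.
>>     #include <sys/mman.h>
>>     #include <cerrno>
>>     #include <cstddef>
>>     #include <cstdio>
>> 
>>     // Model a 'reservation': address space with no backing store.
>>     static char* reserve_range(size_t bytes) {
>>       void* p = mmap(NULL, bytes, PROT_NONE,
>>                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
>>       return (p == MAP_FAILED) ? NULL : (char*)p;
>>     }
>> 
>>     // Model a 'commit': re-map part of the reservation with backing
>>     // store. Because MAP_FIXED replaces whatever was there, a failure
>>     // here can mean the old mapping is already gone, so the caller
>>     // cannot simply carry on as if the reservation were still intact.
>>     static bool commit_page(char* addr, size_t bytes, bool exec) {
>>       int prot = PROT_READ | PROT_WRITE | (exec ? PROT_EXEC : 0);
>>       void* res = mmap(addr, bytes, prot,
>>                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
>>       if (res == MAP_FAILED) {
>>         fprintf(stderr, "commit failed; errno=%d\n", errno);
>>         return false;
>>       }
>>       return true;
>>     }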
>> 
>> During code review round 1, there was a request from the GC team to
>> provide an os::commit_memory_or_exit() entry point in order to preserve
>> the existing error messages on all platforms. This entry point allows
>> code like this:
>> 
>> src/share/vm/gc_implementation/parallelScavenge/cardTableExtension.cpp:
>> 
>> 568       if (!os::commit_memory((char*)new_committed.start(),
>> 569                              new_committed.byte_size())) {
>> 570         vm_exit_out_of_memory(new_committed.byte_size(), OOM_MMAP_ERROR,
>> 571                               "card table expansion");
>> 
>> to be replaced with code like this:
>> 
>> 568       os::commit_memory_or_exit((char*)new_committed.start(),
>> 569                                 new_committed.byte_size(), !ExecMem,
>> 570                                 "card table expansion");
>> 
>> All uses of os::commit_memory() have been visited and those locations
>> that previously exited on error have been updated to use the new entry
>> point. This new entry point cleans up the original call sites and the
>> vm_exit_out_of_memory() calls are now consistent on all platforms.
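>> 
>> For reference, the new entry point presumably has roughly this shape
>> (a sketch inferred from the call site above, not necessarily the exact
>> webrev code):
>> 
>>     // Fold the failure check and the vm_exit_out_of_memory() call into
>>     // one place so every caller reports commit failures the same way.
>>     void os::commit_memory_or_exit(char* addr, size_t bytes,
>>                                    bool executable, const char* mesg) {
>>       if (!os::commit_memory(addr, bytes, executable)) {
>>         vm_exit_out_of_memory(bytes, OOM_MMAP_ERROR, mesg);
>>       }
>>     }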
>> 
>> As a secondary change, while visiting all os::commit_memory() calls, I
>> also updated them to use the new ExecMem enum in order to make the
>> executable status of the memory more clear. Since executable memory can
>> be an attack vector, it is prudent to make the executable status of
>> memory crystal clear. This also allowed me to remove the default
>> executable flag value of 'false'. Now all new uses of commit_memory()
>> must be clear about the executable status of the memory.
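>> 
>> The idea can be sketched like this (the exact HotSpot definition of
>> ExecMem is assumed here):
>> 
>>     // A named boolean constant makes the executable bit visible at
>>     // every call site once the old default argument of 'false' is
>>     // removed.
>>     const bool ExecMem = true;
>> 
>>     //   os::commit_memory(addr, bytes, !ExecMem);  // ordinary data pages
>>     //   os::commit_memory(addr, bytes, ExecMem);   // executable pages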
>> 
>> There are also tertiary changes where some of the pd_commit_memory()
>> calls were calling os::commit_memory() instead of calling their sibling
>> os::pd_commit_memory(). This resulted in double NMT tracking so this
>> has also been fixed. There were also some incorrect mmap() return
>> value checks which have been fixed.
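>> 
>> For reference, one common form of an incorrect mmap() return check
>> (shown here only as an illustration, not necessarily the exact defect
>> fixed in the webrev) is testing against NULL instead of MAP_FAILED:
>> 
>>     // mmap() signals failure with MAP_FAILED, i.e. (void*)-1, not NULL,
>>     // so a NULL test silently accepts a failed mapping.
>>     void* p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
>>                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>     // Wrong:  if (p == NULL)       { /* handle failure */ }
>>     // Right:  if (p == MAP_FAILED) { /* handle failure */ }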
>> 
>> Just to be clear: this fix does nothing more than properly detect the
>> "out of swap space" condition on Linux and Solaris and cause the VM to
>> fail in a more orderly fashion, with a message that looks like this:
>> 
>> The Java process' stderr will show:
>> 
>> INFO: os::commit_memory(0xfffffd7fb2522000, 4096, 4096, 0) failed; errno=11
>> #
>> # There is insufficient memory for the Java Runtime Environment to continue.
>> # Native memory allocation (mmap) failed to map 4096 bytes for committing reserved memory.
>> # An error report file with more information is saved as:
>> # /work/shared/bugs/8013057/looper.03/hs_err_pid9111.log
>> 
>> The hs_err_pid file will have the more verbose info:
>> 
>> #
>> # There is insufficient memory for the Java Runtime Environment to continue.
>> # Native memory allocation (mmap) failed to map 4096 bytes for committing reserved memory.
>> # Possible reasons:
>> #   The system is out of physical RAM or swap space
>> #   In 32 bit mode, the process size limit was hit
>> # Possible solutions:
>> #   Reduce memory load on the system
>> #   Increase physical memory or swap space
>> #   Check if swap backing store is full
>> #   Use 64 bit Java on a 64 bit OS
>> #   Decrease Java heap size (-Xmx/-Xms)
>> #   Decrease number of Java threads
>> #   Decrease Java thread stack sizes (-Xss)
>> #   Set larger code cache with -XX:ReservedCodeCacheSize=
>> # This output file may be truncated or incomplete.
>> #
>> #  Out of Memory Error (/work/shared/bug_hunt/hsx_rt_latest/exp_8013057/src/os/solaris/vm/os_solaris.cpp:2791), pid=9111, tid=21
>> #
>> # JRE version: Java(TM) SE Runtime Environment (8.0-b89) (build 1.8.0-ea-b89)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b33-bh_hsx_rt_exp_8013057_dcubed-product-fastdebug mixed mode solaris-amd64 compressed oops)
>> # Core dump written. Default location: /work/shared/bugs/8013057/looper.03/core or core.9111
>> #
>> 
>> You might be wondering why we are assuming that the failed mmap()
>> commit operation has lost the 'reserved memory' mapping.
>> 
>>    We have no good way to determine if the 'reserved memory' mapping
>>    is lost. Since the other threads are not idle, it is possible
>>    for another thread to have 'reserved' the same memory space for a
>>    different data structure. Our thread could observe that the memory
>>    is still 'reserved' but we have no way to know that the reservation
>>    isn't ours.
>> 
>> You might be wondering why we can't recover from this transient
>> resource availability issue.
>> 
>>    We could retry the failed mmap() commit operation, but we would
>>    again run into the issue that we no longer know which data
>>    structure 'owns' the 'reserved' memory mapping. In particular, the
>>    memory could be reserved by native code calling mmap() directly so
>>    the VM really has no way to recover from this failure.
>> 
>> You might be wondering why part of this work is deferred:
>> 
>> 2654     default:
>> 2655       // Any remaining errors on this OS can cause our reserved mapping
>> 2656       // to be lost. That can cause confusion where different data
>> 2657       // structures think they have the same memory mapped. The worst
>> 2658       // scenario is if both the VM and a library think they have the
>> 2659       // same memory mapped.
>> 2660       //
>> 2661       // However, it is not clear that this loss of our reserved mapping
>> 2662       // happens with large pages on Linux or that we cannot recover
>> 2663       // from the loss. For now, we just issue a warning and we don't
>> 2664       // call vm_exit_out_of_memory(). This issue is being tracked by
>> 2665       // JBS-8007074.
>> 2666       warn_fail_commit_memory(addr, size, alignment_hint, exec, err);
>> 2667 #if 0
>> 2668       vm_exit_out_of_memory(size, OOM_MMAP_ERROR,
>> 2669                             "committing reserved memory.");
>> 2670 #endif
>> 2671       break;
>> 
>>    When lines 2668-2669 are enabled and UseHugeTLBFS is specified,
>>    then the VM will exit because no more huge/large pages are
>>    available. It is not yet clear that this transition from large to
>>    small pages is actually unsafe, but we don't yet have proof that
>>    it is safe either. More research will be done via JBS-8007074.
>> 
>> 
>> 
> 


