code review for memory commit failure fix (8013057)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Thu May 23 09:34:49 PDT 2013
Stefan,
Thanks for the quick review! Replies embedded below...
Other (potential) reviewers, hold off on any more reviews since
I'm going to rework the fix base on Stefan's review...
On 5/22/13 11:23 PM, Stefan Karlsson wrote:
> <snip>
>>
>> You might be wondering why we added os::commit_reserved_memory()
>> instead of changing os::commit_memory().
>>
>> We wanted to limit the potential impact. The new checks are only
>> needed when commiting previously reserved memory on Linux and
>> Solaris. It appears that os::commit_memory() does not require
>> a previous os::reserve_memory() call and in those cases, we
>> didn't want to make the code path unrecoverable.
>
> Do you have an example of when we don't need a os::reserve_memory
> before the os::commit_memory call?
You are correct! I was fooled by some of relative distance between
os::reserve_memory() calls and the subsequent os::commit_memory()
calls.
> On Linux, os::commit_memory uses mmap + MAP_FIXED to commit the
> memory. I thought that os::reserve_memory was needed to guarantee that
> no other part of the JVM/application mapped into the same memory area.
> So, I'm a bit curious to see which call to os::commit_memory does not
> require a previous os::reserve_memory?
The use of MAP_FIXED by all of the os::commit_memory() implementations
should have been a huge warning/clue that I needed to revisit my
conclusion about being able to call commit_memory() without a prior
call to reserve_memory().
I'm ignoring the following code in:
src/os/bsd/vm/os_bsd.cpp:
1982 bool os::pd_commit_memory(char* addr, size_t size, bool exec) {
1983 int prot = exec ? PROT_READ|PROT_WRITE|PROT_EXEC :
PROT_READ|PROT_WRITE;
1984 #ifdef __OpenBSD__
1985 // XXX: Work-around mmap/MAP_FIXED bug temporarily on OpenBSD
1986 return ::mprotect(addr, size, prot) == 0;
1987 #else
1988 uintptr_t res = (uintptr_t) ::mmap(addr, size, prot,
1989 MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0);
1990 return res != (uintptr_t) MAP_FAILED;
1991 #endif
1992 }
The ::mprotect() work around will have the same requirements in that the
memory will need to have been 'reserved' before being 'protected'. Just
for snicks, I also plan to see which code is used on MacOS X...
What this means is that I need to rework the fix since we no longer
need to add the commit_reserved_memory() entry points.
> Browsing through the code, I see a number of places where you have not
> changed os::commit_memory to os::commit_reserved_memory.
>
> For example allocation.inline.hpp:
>
> template <class E, MEMFLAGS F>
> E* ArrayAllocator<E, F>::allocate(size_t length) {
> ...
> _addr = os::reserve_memory(_size, NULL, alignment, F);
> if (_addr == NULL) {
> vm_exit_out_of_memory(_size, OOM_MMAP_ERROR, "Allocator (reserve)");
> }
>
> bool success = os::commit_memory(_addr, _size, false /* executable */);
> if (!success) {
> vm_exit_out_of_memory(_size, OOM_MMAP_ERROR, "Allocator (commit)");
> }
>
> Do these places call os::commit_memory, instead of
> os::commit_reserved_memory, because they have their own error handling?
Exactly right. These places, except for the psVirtualspace.cpp location
that I mumble about below, handle the failure from os::commit_memory().
Of course, after redoing the fix, all callers of os::commit_memory()
will be auto-magically fixed (or broken if I get the wrong...) :-)
Dan
>
> thanks,
> StefanK
>
>>
>> You might be wondering why we are assuming that the failed mmap()
>> commit operation has lost the 'reserved memory' mapping.
>>
>> We have no good way to determine if the 'reserved memory' mapping
>> is lost. Since all the other threads are not idle, it is possible
>> for another thread to have 'reserved' the same memory space for a
>> different data structure. Our thread could observe that the memory
>> is still 'reserved' but we have no way to know that the reservation
>> isn't ours.
>>
>> You might be wondering why we can't recover from this transient
>> resource availability issue.
>>
>> We could retry the failed mmap() commit operation, but we would
>> again run into the issue that we no longer know which data
>> structure 'owns' the 'reserved' memory mapping.
>>
>> It looks like PSVirtualSpace::expand_by(size_t bytes),
>> PSVirtualSpace::expand_into(PSVirtualSpace* other_space, size_t bytes),
>> PSVirtualSpaceHighToLow::expand_by(size_t bytes), and
>> PSVirtualSpaceHighToLow::expand_into(PSVirtualSpace* other_space,
>> size_t bytes) in
>> src/share/vm/gc_implementation/parallelScavenge/psVirtualspace.cpp
>> have the same issue. Why aren't you fixing them?
>>
>> Those functions do indeed appear to be suffering from the same
>> assumption, but those calls are in GC code. A follow-up bug will be
>> filed so that someone on the GC team can properly diagnose, fix and
>> test that code.
>>
>
More information about the hotspot-runtime-dev
mailing list