Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing

David Holmes david.holmes at oracle.com
Wed Dec 12 23:33:46 PST 2012


Hi Bengt,

This has to be one of the absolute best review requests I've ever read 
:) Thank you.

Just a couple of queries.

os_posix.cpp:

Love the diagram :)

I'm assuming that "alignment" has to be >= the underlying page size, and 
in fact must be a multiple of the underlying page size ? (As I assume 
you can only unmap whole numbers of pages). If so does that need to be 
checked somewhere?

In virtualSpace.cpp:

         // Reserve size large enough to do manual alignment and
         // increase size to a multiple of the desired alignment
         size = align_size_up(size, alignment);
!       base = os::reserve_memory_aligned(size, alignment);

The comment doesn't seem necessary now, and the align_size_up seems 
redundant.

Thanks,
David

On 13/12/2012 4:42 PM, Bengt Rutisson wrote:
>
> Hi all,
>
> Could I have a couple of reviews for this change?
>
> http://cr.openjdk.java.net/~brutisso/7173959/webrev.00/
>
> This is for a bug originally reported by the Coherence team:
>
> 7173959 : Jvm crashed during coherence exabus (tmb) testing
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7173959
>
> The original analysis and proposed fix was done by Mikael Gerdin and me
> together. I'll handle the webrev and push since Mikael is on vacation
> starting today. But Mikael did a great job tracking down this very
> difficult bug, so he should have a large part of the credit for this bug
> fix,
>
> Description from the CR:
>
> The reason that we crash is due to how we re-map memory when we want to
> align it for large pages in ReservedSpace::initialize().
>
> Here is what happens:
>
> The reservation of memory is split up to a few steps. When we want a
> chunk of memory with large pages we first just reserve some memory large
> enough for what we need. Then we realize that we want large pages, so we
> want to re-map the reserved memory to use large pages. But since this
> requires that we have a large-page-aligned memory chunk we may have to
> fix the recently reserved memory chunk up a bit.
>
> We do this in ReservedSpace::initialize() by first releasing the memory
> we just reserved. Then requesting more memory than we actually need to
> make sure that we have enough space to do manual large page alignment.
> After we have gotten this memory we figure out a nicely aligned start
> address. We then release the memory again and then reserve just enough
> memory but using the aligned start address as a fixed address to make
> sure that we get the memory we wanted in an aligned way.
>
> This is done in a loop to make sure that we eventually get some memory.
> The interesting code looks like this:
>
> do {
> char* extra_base = os::reserve_memory(extra_size, NULL, alignment);
> if (extra_base == NULL) return;
> // Do manual alignement
> base = (char*) align_size_up((uintptr_t) extra_base, alignment);
> assert(base >= extra_base, "just checking");
> // Re-reserve the region at the aligned base address.
> os::release_memory(extra_base, extra_size); // (1)
> base = os::reserve_memory(size, base); // (2)
> } while (base == NULL);
>
>
> There is a race here between releasing the memory in (1) and
> re-reserving it in (2). But the loop is supposed to handle this race.
>
> The problem is that on posix platforms you can remap the same memory
> area several times. The call in (2) will use mmap with MAP_FIXED. This
> means that the OS will think that you know exactly what you are doing.
> So, if part of the memory has been mapped already by the process it will
> just go ahead and reuse that memory.
>
> This means that if we are having multiple threads that do mmap. We can
> end up with a situation where we release our mapping in (1). Then
> another thread comes in and maps part of the memory that we used to
> have. Then we remap over that memory again in (2) with MAP_FIXED. Now we
> have a situation where two threads in our process have mapped the same
> memory. If both threads try to use it or if one of the threads unmap
> part or all of the memory we will crash.
>
> On posix it is possible to unmap any part of a mapped chunk. So, our
> proposed solution to the race described above is to not unmap all memory
> in (1) but rather just unmap the section at the start and at the end of
> the chunk that we mapped to get alignment. This also removes the need
> for the loop.
>
> However, on Windows you can only unmap _all_ of the memory that you have
> mapped. On the other hand Windows also will not allow you to map over
> other mappings, so the original code is actually safe. If we keep the loop.
>
> So, our solution is to treat this differently on Windows and posix
> platforms.
>
>
> Thanks,
> Bengt


More information about the hotspot-runtime-dev mailing list