Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing

Bengt Rutisson bengt.rutisson at oracle.com
Thu Dec 13 13:23:26 PST 2012


Hi Vitaly,

Thanks for looking at this!

On 12/13/12 2:44 PM, Vitaly Davidovich wrote:
>
> Hi Bengt,
>
> Great write-up! What was in the hs_err file that led you (or Mikael) here?
>

Using the core file we noticed that the hotspot internal data structures 
that keep track of the bitmap sizes did not match the mapped memory 
listed in the hs_err file. We could see that the mapped memory for the 
bitmap was smaller than what hotspot had requested. After convincing 
ourselves that we were passing the correct size to mmap we figured that 
someone must have been unmapping part of the memory that we had mapped.

> One question I have is why alignment is a multiple of page size and 
> not page size? I imagine callers want to reserve X space and have it 
> page aligned up if necessary.  When would someone request multiple 
> page size alignment rather than just increase requested allocation 
> amount and keep alignment at page size? If huge pages are used then 
> vm_allocation_granularity() respects that, right? Or is this for 
> transparent huge page support? Hope that makes sense.
>
One such example is with G1. It wants the heap size to be aligned with 
the region size, which may be a multiple of the page size. Even with 
large pages.

Actually we also need this for our large page support. The ReservedSpace 
doesn't really know of large pages. We use it to map memory but then we 
later on use madvice to make the mapped memory use huge pages. This 
means that at the time we map the memory we can only rely on the small 
page size, but if we know we will use it for large pages we have to make 
sure that the memory is aligned to the large page size. So the alignment 
will be a multiple of the small page size.
>
> The other question is whether it's worth it to check for overflow in 
> the new os_posix function? There are some additions and subtractions 
> on size_t values and maybe on 32 bit some of those adds can overflow a 
> temporary? Specifically, line 118.  Disregard if that's not needed :).
>

Good point. I think this change is not really increasing the risk of an 
overflow compared to the existing code. We should not overflow if we are 
passed sane values. An overflow will indicate that someone higher up was 
putting together a too large size. I think I would rather have the upper 
layers verify that they calculate reasonable values.

Thanks for your feedback!
Bengt

> Thanks
>
> Sent from my phone
>
> On Dec 13, 2012 7:59 AM, "Bengt Rutisson" <bengt.rutisson at oracle.com 
> <mailto:bengt.rutisson at oracle.com>> wrote:
>
>
>     Hi David,
>
>     Updated webrev:
>     http://cr.openjdk.java.net/~brutisso/7173959/webrev.02/
>     <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.02/>
>
>     I moved the alignment of "size" back into
>     ReservedSpace::initialize() since we actually store the size in an
>     instance variable (_size), so I think it is a bit risky to just do
>     the align_up in os::reserve_memory_aligned(). We probably want the
>     instance variables _size and _alignment in ReservedSpace to be
>     consistent.
>
>     I added an assert in os::reserve_memory_aligned() to verify that
>     the size is correctly aligned. I also added the assert you
>     suggested to check that alignment is page size aligned.
>
>     Bengt
>
>     On 12/13/12 11:50 AM, David Holmes wrote:
>
>         On 13/12/2012 8:37 PM, Bengt Rutisson wrote:
>
>
>             Hi again,
>
>             Updated webrev:
>             http://cr.openjdk.java.net/~brutisso/7173959/webrev.01/
>             <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.01/>
>
>             I removed the comment and the alignment.
>
>             But I did not add an assert just yet.
>
>             At the top of ReservedSpace::initialize() we have this code:
>
>             const size_t granularity = os::vm_allocation_granularity();
>             assert((size & (granularity - 1)) == 0,
>             "size not aligned to os::vm_allocation_granularity()");
>             assert((alignment & (granularity - 1)) == 0,
>             "alignment not aligned to os::vm_allocation_granularity()");
>
>
>             Where os::vm_allocation_granularity() is implemented as
>             page size on all
>             platforms except Windows. There we look it up from the
>             Windows API. I
>             assume that is a page size too, but since the Windows code
>             in our patch
>             does not unmap based on the alignment it should be safe
>             either way.
>
>             Is this assert enough or would you like me to add an
>             assert closer to
>             the code block were I did the changes?
>
>
>         As this is a separate method now I think an assert in the
>         method itself would not go astray.
>
>         David
>         -----
>
>             Thanks,
>             Bengt
>
>
>             On 12/13/12 11:02 AM, Bengt Rutisson wrote:
>
>
>                 Hi David,
>
>                 Thanks for looking at this!
>
>                 On 12/13/12 8:33 AM, David Holmes wrote:
>
>                     Hi Bengt,
>
>                     This has to be one of the absolute best review
>                     requests I've ever
>                     read :) Thank you.
>
>
>                 Wow! Thanks! :)
>
>
>                     Just a couple of queries.
>
>                     os_posix.cpp:
>
>                     Love the diagram :)
>
>
>                 It was Mikael's way of making sure we got it right.
>
>                     I'm assuming that "alignment" has to be >= the
>                     underlying page size,
>                     and in fact must be a multiple of the underlying
>                     page size ? (As I
>                     assume you can only unmap whole numbers of pages).
>                     If so does that
>                     need to be checked somewhere?
>
>
>                 Good point. I'll add an assert to check that.
>
>                     In virtualSpace.cpp:
>
>                     // Reserve size large enough to do manual
>                     alignment and
>                     // increase size to a multiple of the desired
>                     alignment
>                     size = align_size_up(size, alignment);
>                     ! base = os::reserve_memory_aligned(size, alignment);
>
>                     The comment doesn't seem necessary now, and the
>                     align_size_up seems
>                     redundant.
>
>
>                 You are right. I'll remove the comment and the alignment.
>
>                 Thanks,
>                 Bengt
>
>
>                     Thanks,
>                     David
>
>                     On 13/12/2012 4:42 PM, Bengt Rutisson wrote:
>
>
>                         Hi all,
>
>                         Could I have a couple of reviews for this change?
>
>                         http://cr.openjdk.java.net/~brutisso/7173959/webrev.00/
>                         <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.00/>
>
>                         This is for a bug originally reported by the
>                         Coherence team:
>
>                         7173959 : Jvm crashed during coherence exabus
>                         (tmb) testing
>                         http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7173959
>
>                         The original analysis and proposed fix was
>                         done by Mikael Gerdin and me
>                         together. I'll handle the webrev and push
>                         since Mikael is on vacation
>                         starting today. But Mikael did a great job
>                         tracking down this very
>                         difficult bug, so he should have a large part
>                         of the credit for this
>                         bug
>                         fix,
>
>                         Description from the CR:
>
>                         The reason that we crash is due to how we
>                         re-map memory when we want to
>                         align it for large pages in
>                         ReservedSpace::initialize().
>
>                         Here is what happens:
>
>                         The reservation of memory is split up to a few
>                         steps. When we want a
>                         chunk of memory with large pages we first just
>                         reserve some memory
>                         large
>                         enough for what we need. Then we realize that
>                         we want large pages,
>                         so we
>                         want to re-map the reserved memory to use
>                         large pages. But since this
>                         requires that we have a large-page-aligned
>                         memory chunk we may have to
>                         fix the recently reserved memory chunk up a bit.
>
>                         We do this in ReservedSpace::initialize() by
>                         first releasing the memory
>                         we just reserved. Then requesting more memory
>                         than we actually need to
>                         make sure that we have enough space to do
>                         manual large page alignment.
>                         After we have gotten this memory we figure out
>                         a nicely aligned start
>                         address. We then release the memory again and
>                         then reserve just enough
>                         memory but using the aligned start address as
>                         a fixed address to make
>                         sure that we get the memory we wanted in an
>                         aligned way.
>
>                         This is done in a loop to make sure that we
>                         eventually get some memory.
>                         The interesting code looks like this:
>
>                         do {
>                         char* extra_base =
>                         os::reserve_memory(extra_size, NULL, alignment);
>                         if (extra_base == NULL) return;
>                         // Do manual alignement
>                         base = (char*) align_size_up((uintptr_t)
>                         extra_base, alignment);
>                         assert(base >= extra_base, "just checking");
>                         // Re-reserve the region at the aligned base
>                         address.
>                         os::release_memory(extra_base, extra_size); // (1)
>                         base = os::reserve_memory(size, base); // (2)
>                         } while (base == NULL);
>
>
>                         There is a race here between releasing the
>                         memory in (1) and
>                         re-reserving it in (2). But the loop is
>                         supposed to handle this race.
>
>                         The problem is that on posix platforms you can
>                         remap the same memory
>                         area several times. The call in (2) will use
>                         mmap with MAP_FIXED. This
>                         means that the OS will think that you know
>                         exactly what you are doing.
>                         So, if part of the memory has been mapped
>                         already by the process it
>                         will
>                         just go ahead and reuse that memory.
>
>                         This means that if we are having multiple
>                         threads that do mmap. We can
>                         end up with a situation where we release our
>                         mapping in (1). Then
>                         another thread comes in and maps part of the
>                         memory that we used to
>                         have. Then we remap over that memory again in
>                         (2) with MAP_FIXED.
>                         Now we
>                         have a situation where two threads in our
>                         process have mapped the same
>                         memory. If both threads try to use it or if
>                         one of the threads unmap
>                         part or all of the memory we will crash.
>
>                         On posix it is possible to unmap any part of a
>                         mapped chunk. So, our
>                         proposed solution to the race described above
>                         is to not unmap all
>                         memory
>                         in (1) but rather just unmap the section at
>                         the start and at the end of
>                         the chunk that we mapped to get alignment.
>                         This also removes the need
>                         for the loop.
>
>                         However, on Windows you can only unmap _all_
>                         of the memory that you
>                         have
>                         mapped. On the other hand Windows also will
>                         not allow you to map over
>                         other mappings, so the original code is
>                         actually safe. If we keep
>                         the loop.
>
>                         So, our solution is to treat this differently
>                         on Windows and posix
>                         platforms.
>
>
>                         Thanks,
>                         Bengt
>
>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/attachments/20121213/857c4c3e/attachment-0001.html 


More information about the hotspot-runtime-dev mailing list