Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing
Bengt Rutisson
bengt.rutisson at oracle.com
Mon Dec 17 00:38:01 PST 2012
Hi Vitaly,
On 12/14/12 3:46 PM, Vitaly Davidovich wrote:
>
> Hi Bengt,
>
> I'd add an assert(begin_offset < end_offset, "sanity") as well but up
> to you. Looks good.
>
Actually begin_offset and end_offset are the number of bytes we want to
umap. So, it is quite possible for end_offset to be less than begin_offset.
Bengt
> Thanks
>
> Sent from my phone
>
> On Dec 14, 2012 3:49 AM, "Bengt Rutisson" <bengt.rutisson at oracle.com
> <mailto:bengt.rutisson at oracle.com>> wrote:
>
>
> Hi again Runtime team,
>
> I need one more review for this change. It is a P1 bug with a
> fairly small change. Any takers?
>
> Latest webrev based on comments from David Holmes and Vitaly
> Davidovich:
> http://cr.openjdk.java.net/~brutisso/7173959/webrev.03/
> <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.03/>
>
> Thanks,
> Bengt
>
>
> On 12/14/12 9:29 AM, Bengt Rutisson wrote:
>
>
> Hi David,
>
> On 12/14/12 1:38 AM, David Holmes wrote:
>
> Hi Bengt,
>
> That all sounds good to me.
>
> With regard to potential overflow perhaps a simple check
> that extra_size does not overflow? If that is true and
> extra_base is not null then I think all the other
> calculations must be valid.
>
>
> Yes, that makes sense. I'll add an overflow check for the
> extra_size.
>
> Thanks,
> Bengt
>
>
> Thanks,
> David
>
> On 13/12/2012 10:58 PM, Bengt Rutisson wrote:
>
>
> Hi David,
>
> Updated webrev:
> http://cr.openjdk.java.net/~brutisso/7173959/webrev.02/ <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.02/>
>
> I moved the alignment of "size" back into
> ReservedSpace::initialize()
> since we actually store the size in an instance
> variable (_size), so I
> think it is a bit risky to just do the align_up in
> os::reserve_memory_aligned(). We probably want the
> instance variables
> _size and _alignment in ReservedSpace to be consistent.
>
> I added an assert in os::reserve_memory_aligned() to
> verify that the
> size is correctly aligned. I also added the assert you
> suggested to
> check that alignment is page size aligned.
>
> Bengt
>
> On 12/13/12 11:50 AM, David Holmes wrote:
>
> On 13/12/2012 8:37 PM, Bengt Rutisson wrote:
>
>
> Hi again,
>
> Updated webrev:
> http://cr.openjdk.java.net/~brutisso/7173959/webrev.01/
> <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.01/>
>
> I removed the comment and the alignment.
>
> But I did not add an assert just yet.
>
> At the top of ReservedSpace::initialize() we
> have this code:
>
> const size_t granularity =
> os::vm_allocation_granularity();
> assert((size & (granularity - 1)) == 0,
> "size not aligned to
> os::vm_allocation_granularity()");
> assert((alignment & (granularity - 1)) == 0,
> "alignment not aligned to
> os::vm_allocation_granularity()");
>
>
> Where os::vm_allocation_granularity() is
> implemented as page size on all
> platforms except Windows. There we look it up
> from the Windows API. I
> assume that is a page size too, but since the
> Windows code in our patch
> does not unmap based on the alignment it
> should be safe either way.
>
> Is this assert enough or would you like me to
> add an assert closer to
> the code block were I did the changes?
>
>
> As this is a separate method now I think an assert
> in the method
> itself would not go astray.
>
> David
> -----
>
> Thanks,
> Bengt
>
>
> On 12/13/12 11:02 AM, Bengt Rutisson wrote:
>
>
> Hi David,
>
> Thanks for looking at this!
>
> On 12/13/12 8:33 AM, David Holmes wrote:
>
> Hi Bengt,
>
> This has to be one of the absolute
> best review requests I've ever
> read :) Thank you.
>
>
> Wow! Thanks! :)
>
>
> Just a couple of queries.
>
> os_posix.cpp:
>
> Love the diagram :)
>
>
> It was Mikael's way of making sure we got
> it right.
>
> I'm assuming that "alignment" has to
> be >= the underlying page size,
> and in fact must be a multiple of the
> underlying page size ? (As I
> assume you can only unmap whole
> numbers of pages). If so does that
> need to be checked somewhere?
>
>
> Good point. I'll add an assert to check that.
>
> In virtualSpace.cpp:
>
> // Reserve size large enough to do
> manual alignment and
> // increase size to a multiple of the
> desired alignment
> size = align_size_up(size, alignment);
> ! base =
> os::reserve_memory_aligned(size,
> alignment);
>
> The comment doesn't seem necessary
> now, and the align_size_up seems
> redundant.
>
>
> You are right. I'll remove the comment and
> the alignment.
>
> Thanks,
> Bengt
>
>
> Thanks,
> David
>
> On 13/12/2012 4:42 PM, Bengt Rutisson
> wrote:
>
>
> Hi all,
>
> Could I have a couple of reviews
> for this change?
>
> http://cr.openjdk.java.net/~brutisso/7173959/webrev.00/
> <http://cr.openjdk.java.net/%7Ebrutisso/7173959/webrev.00/>
>
> This is for a bug originally
> reported by the Coherence team:
>
> 7173959 : Jvm crashed during
> coherence exabus (tmb) testing
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7173959
>
> The original analysis and proposed
> fix was done by Mikael Gerdin
> and me
> together. I'll handle the webrev
> and push since Mikael is on vacation
> starting today. But Mikael did a
> great job tracking down this very
> difficult bug, so he should have a
> large part of the credit for this
> bug
> fix,
>
> Description from the CR:
>
> The reason that we crash is due to
> how we re-map memory when we
> want to
> align it for large pages in
> ReservedSpace::initialize().
>
> Here is what happens:
>
> The reservation of memory is split
> up to a few steps. When we want a
> chunk of memory with large pages
> we first just reserve some memory
> large
> enough for what we need. Then we
> realize that we want large pages,
> so we
> want to re-map the reserved memory
> to use large pages. But since this
> requires that we have a
> large-page-aligned memory chunk we may
> have to
> fix the recently reserved memory
> chunk up a bit.
>
> We do this in
> ReservedSpace::initialize() by
> first releasing the
> memory
> we just reserved. Then requesting
> more memory than we actually
> need to
> make sure that we have enough
> space to do manual large page
> alignment.
> After we have gotten this memory
> we figure out a nicely aligned start
> address. We then release the
> memory again and then reserve just
> enough
> memory but using the aligned start
> address as a fixed address to make
> sure that we get the memory we
> wanted in an aligned way.
>
> This is done in a loop to make
> sure that we eventually get some
> memory.
> The interesting code looks like this:
>
> do {
> char* extra_base =
> os::reserve_memory(extra_size,
> NULL, alignment);
> if (extra_base == NULL) return;
> // Do manual alignement
> base = (char*)
> align_size_up((uintptr_t)
> extra_base, alignment);
> assert(base >= extra_base, "just
> checking");
> // Re-reserve the region at the
> aligned base address.
> os::release_memory(extra_base,
> extra_size); // (1)
> base = os::reserve_memory(size,
> base); // (2)
> } while (base == NULL);
>
>
> There is a race here between
> releasing the memory in (1) and
> re-reserving it in (2). But the
> loop is supposed to handle this race.
>
> The problem is that on posix
> platforms you can remap the same
> memory
> area several times. The call in
> (2) will use mmap with MAP_FIXED.
> This
> means that the OS will think that
> you know exactly what you are
> doing.
> So, if part of the memory has been
> mapped already by the process it
> will
> just go ahead and reuse that memory.
>
> This means that if we are having
> multiple threads that do mmap. We
> can
> end up with a situation where we
> release our mapping in (1). Then
> another thread comes in and maps
> part of the memory that we used to
> have. Then we remap over that
> memory again in (2) with MAP_FIXED.
> Now we
> have a situation where two threads
> in our process have mapped the
> same
> memory. If both threads try to use
> it or if one of the threads unmap
> part or all of the memory we will
> crash.
>
> On posix it is possible to unmap
> any part of a mapped chunk. So, our
> proposed solution to the race
> described above is to not unmap all
> memory
> in (1) but rather just unmap the
> section at the start and at the
> end of
> the chunk that we mapped to get
> alignment. This also removes the need
> for the loop.
>
> However, on Windows you can only
> unmap _all_ of the memory that you
> have
> mapped. On the other hand Windows
> also will not allow you to map over
> other mappings, so the original
> code is actually safe. If we keep
> the loop.
>
> So, our solution is to treat this
> differently on Windows and posix
> platforms.
>
>
> Thanks,
> Bengt
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/attachments/20121217/02ce114a/attachment-0001.html
More information about the hotspot-runtime-dev
mailing list