Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing
Bengt Rutisson
bengt.rutisson at oracle.com
Tue Dec 18 12:29:59 PST 2012
Hi all,
One final update on this. I pushed this change both in to the hotspot-gc
repository and to the hs24 repository:
http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/730cc4ddd550
http://hg.openjdk.java.net/hsx/hsx24/hotspot/rev/e1d9b04b560b
But I forgot to add "Contributed-by" for Mikael Gerdin ( mgerdin). That
should definitely have been the case since he deserves a large part of
the credit for this fix.
Hopefully this email message will be of some help in recording his work
for this fix for future reference.
Bengt
On 12/17/12 8:36 AM, Bengt Rutisson wrote:
>
> Hi Coleen,
>
> Thanks for suggesting the comments. I added them to the os files.
>
>
> David, Coleen, Zhengyu and Vitaly,
>
> Thanks for the reviews! All set to push this now.
>
> Bengt
>
>
> On 12/14/12 4:49 PM, Coleen Phillimore wrote:
>> On 12/14/2012 10:24 AM, Bengt Rutisson wrote:
>>>
>>> Hi Coleen,
>>>
>>> Thanks for looking at this!
>>>
>>> On 12/14/12 2:21 PM, Coleen Phillimore wrote:
>>>> On 12/14/2012 3:48 AM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi again Runtime team,
>>>>>
>>>>> I need one more review for this change. It is a P1 bug with a
>>>>> fairly small change. Any takers?
>>>>>
>>>>> Latest webrev based on comments from David Holmes and Vitaly
>>>>> Davidovich:
>>>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.03/
>>>>
>>>> Hi Bengt, This code looks good to me but I asked Zhengyu to check
>>>> it to make sure it doesn't have any bad NMT interactions.
>>>
>>> Thanks! He just replied.
>>>>
>>>> Also the writeup is great below, but can you add these comments to
>>>> the code in os_posix.cpp. I had to go back to the email to figure
>>>> out why you are doing this.
>>>
>>> I'm not really sure where to put such a comment. I think the writeup
>>> I did mostly makes sense when you compare it to the old code. Also,
>>> the affected code is in the middle of ReservedSpace::initialize(). I
>>> think it would look strange with a large comment there. In the
>>> os_posix/windows files the comment does not make much sense in my
>>> opinion.
>>>
>>> I could add a comment in ReservedSpace::initialize() that refers to
>>> the Jira issue. But I'm not a big fan of such comments.
>>
>> I agree we shouldn't point to bugs in the source code. A comment at
>> the top of the os_posix version of this function that paraphrases
>> below why you are not re-reserving in a loop like windows would be
>> sufficient. That trimming memory at either end is allowed on
>> posix-like os's and is thread-safe, something like:
>>
>> // Multiple threads can race in this code, and can remap over each
>> other with MAP_FIXED, so on posix, unmap the section
>> // at the start and at the end of the chunk that we mapped to get
>> requested alignment.
>>
>> os_windows:
>>
>> // Multiple threads can race in this code but it's not possible to
>> remap with a section of virtual space to get requested alignment,
>> like posix-like os's.
>> // Windows prevents multiple thread from remapping over each other so
>> this loop is thread-safe.
>>
>> I just want something to document why they are different.
>>
>>>
>>>>
>>>> I don't understand why there are racing threads. I thought all
>>>> the heap space was initialized at the start of the vm when it is
>>>> single threaded?
>>>
>>> I would have to look up exactly when we set up the heap to see if we
>>> are single threaded at that point. The issue we saw was with setting
>>> up the GC bitmaps. At that point we are multithreaded and we are for
>>> example mapping memory for the GC worker threads at the same time.
>>
>> I see.
>>
>> Coleen
>>>
>>>
>>> Thanks again for looking at this!
>>> Bengt
>>>
>>>>
>>>> thanks,
>>>> Coleen
>>>>
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>
>>>>> On 12/14/12 9:29 AM, Bengt Rutisson wrote:
>>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> On 12/14/12 1:38 AM, David Holmes wrote:
>>>>>>> Hi Bengt,
>>>>>>>
>>>>>>> That all sounds good to me.
>>>>>>>
>>>>>>> With regard to potential overflow perhaps a simple check that
>>>>>>> extra_size does not overflow? If that is true and extra_base is
>>>>>>> not null then I think all the other calculations must be valid.
>>>>>>
>>>>>> Yes, that makes sense. I'll add an overflow check for the
>>>>>> extra_size.
>>>>>>
>>>>>> Thanks,
>>>>>> Bengt
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>> On 13/12/2012 10:58 PM, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.02/
>>>>>>>>
>>>>>>>> I moved the alignment of "size" back into
>>>>>>>> ReservedSpace::initialize()
>>>>>>>> since we actually store the size in an instance variable
>>>>>>>> (_size), so I
>>>>>>>> think it is a bit risky to just do the align_up in
>>>>>>>> os::reserve_memory_aligned(). We probably want the instance
>>>>>>>> variables
>>>>>>>> _size and _alignment in ReservedSpace to be consistent.
>>>>>>>>
>>>>>>>> I added an assert in os::reserve_memory_aligned() to verify
>>>>>>>> that the
>>>>>>>> size is correctly aligned. I also added the assert you
>>>>>>>> suggested to
>>>>>>>> check that alignment is page size aligned.
>>>>>>>>
>>>>>>>> Bengt
>>>>>>>>
>>>>>>>> On 12/13/12 11:50 AM, David Holmes wrote:
>>>>>>>>> On 13/12/2012 8:37 PM, Bengt Rutisson wrote:
>>>>>>>>>>
>>>>>>>>>> Hi again,
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.01/
>>>>>>>>>>
>>>>>>>>>> I removed the comment and the alignment.
>>>>>>>>>>
>>>>>>>>>> But I did not add an assert just yet.
>>>>>>>>>>
>>>>>>>>>> At the top of ReservedSpace::initialize() we have this code:
>>>>>>>>>>
>>>>>>>>>> const size_t granularity = os::vm_allocation_granularity();
>>>>>>>>>> assert((size & (granularity - 1)) == 0,
>>>>>>>>>> "size not aligned to os::vm_allocation_granularity()");
>>>>>>>>>> assert((alignment & (granularity - 1)) == 0,
>>>>>>>>>> "alignment not aligned to os::vm_allocation_granularity()");
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Where os::vm_allocation_granularity() is implemented as page
>>>>>>>>>> size on all
>>>>>>>>>> platforms except Windows. There we look it up from the
>>>>>>>>>> Windows API. I
>>>>>>>>>> assume that is a page size too, but since the Windows code in
>>>>>>>>>> our patch
>>>>>>>>>> does not unmap based on the alignment it should be safe
>>>>>>>>>> either way.
>>>>>>>>>>
>>>>>>>>>> Is this assert enough or would you like me to add an assert
>>>>>>>>>> closer to
>>>>>>>>>> the code block were I did the changes?
>>>>>>>>>
>>>>>>>>> As this is a separate method now I think an assert in the method
>>>>>>>>> itself would not go astray.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Bengt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/13/12 11:02 AM, Bengt Rutisson wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for looking at this!
>>>>>>>>>>>
>>>>>>>>>>> On 12/13/12 8:33 AM, David Holmes wrote:
>>>>>>>>>>>> Hi Bengt,
>>>>>>>>>>>>
>>>>>>>>>>>> This has to be one of the absolute best review requests
>>>>>>>>>>>> I've ever
>>>>>>>>>>>> read :) Thank you.
>>>>>>>>>>>
>>>>>>>>>>> Wow! Thanks! :)
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Just a couple of queries.
>>>>>>>>>>>>
>>>>>>>>>>>> os_posix.cpp:
>>>>>>>>>>>>
>>>>>>>>>>>> Love the diagram :)
>>>>>>>>>>>
>>>>>>>>>>> It was Mikael's way of making sure we got it right.
>>>>>>>>>>>
>>>>>>>>>>>> I'm assuming that "alignment" has to be >= the underlying
>>>>>>>>>>>> page size,
>>>>>>>>>>>> and in fact must be a multiple of the underlying page size
>>>>>>>>>>>> ? (As I
>>>>>>>>>>>> assume you can only unmap whole numbers of pages). If so
>>>>>>>>>>>> does that
>>>>>>>>>>>> need to be checked somewhere?
>>>>>>>>>>>
>>>>>>>>>>> Good point. I'll add an assert to check that.
>>>>>>>>>>>
>>>>>>>>>>>> In virtualSpace.cpp:
>>>>>>>>>>>>
>>>>>>>>>>>> // Reserve size large enough to do manual alignment and
>>>>>>>>>>>> // increase size to a multiple of the desired alignment
>>>>>>>>>>>> size = align_size_up(size, alignment);
>>>>>>>>>>>> ! base = os::reserve_memory_aligned(size, alignment);
>>>>>>>>>>>>
>>>>>>>>>>>> The comment doesn't seem necessary now, and the
>>>>>>>>>>>> align_size_up seems
>>>>>>>>>>>> redundant.
>>>>>>>>>>>
>>>>>>>>>>> You are right. I'll remove the comment and the alignment.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Bengt
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 13/12/2012 4:42 PM, Bengt Rutisson wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could I have a couple of reviews for this change?
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.00/
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is for a bug originally reported by the Coherence team:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 7173959 : Jvm crashed during coherence exabus (tmb) testing
>>>>>>>>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7173959
>>>>>>>>>>>>>
>>>>>>>>>>>>> The original analysis and proposed fix was done by Mikael
>>>>>>>>>>>>> Gerdin
>>>>>>>>>>>>> and me
>>>>>>>>>>>>> together. I'll handle the webrev and push since Mikael is
>>>>>>>>>>>>> on vacation
>>>>>>>>>>>>> starting today. But Mikael did a great job tracking down
>>>>>>>>>>>>> this very
>>>>>>>>>>>>> difficult bug, so he should have a large part of the
>>>>>>>>>>>>> credit for this
>>>>>>>>>>>>> bug
>>>>>>>>>>>>> fix,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Description from the CR:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The reason that we crash is due to how we re-map memory
>>>>>>>>>>>>> when we
>>>>>>>>>>>>> want to
>>>>>>>>>>>>> align it for large pages in ReservedSpace::initialize().
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is what happens:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The reservation of memory is split up to a few steps. When
>>>>>>>>>>>>> we want a
>>>>>>>>>>>>> chunk of memory with large pages we first just reserve
>>>>>>>>>>>>> some memory
>>>>>>>>>>>>> large
>>>>>>>>>>>>> enough for what we need. Then we realize that we want
>>>>>>>>>>>>> large pages,
>>>>>>>>>>>>> so we
>>>>>>>>>>>>> want to re-map the reserved memory to use large pages. But
>>>>>>>>>>>>> since this
>>>>>>>>>>>>> requires that we have a large-page-aligned memory chunk we
>>>>>>>>>>>>> may
>>>>>>>>>>>>> have to
>>>>>>>>>>>>> fix the recently reserved memory chunk up a bit.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We do this in ReservedSpace::initialize() by first
>>>>>>>>>>>>> releasing the
>>>>>>>>>>>>> memory
>>>>>>>>>>>>> we just reserved. Then requesting more memory than we
>>>>>>>>>>>>> actually
>>>>>>>>>>>>> need to
>>>>>>>>>>>>> make sure that we have enough space to do manual large page
>>>>>>>>>>>>> alignment.
>>>>>>>>>>>>> After we have gotten this memory we figure out a nicely
>>>>>>>>>>>>> aligned start
>>>>>>>>>>>>> address. We then release the memory again and then reserve
>>>>>>>>>>>>> just
>>>>>>>>>>>>> enough
>>>>>>>>>>>>> memory but using the aligned start address as a fixed
>>>>>>>>>>>>> address to make
>>>>>>>>>>>>> sure that we get the memory we wanted in an aligned way.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is done in a loop to make sure that we eventually get
>>>>>>>>>>>>> some
>>>>>>>>>>>>> memory.
>>>>>>>>>>>>> The interesting code looks like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> do {
>>>>>>>>>>>>> char* extra_base = os::reserve_memory(extra_size, NULL,
>>>>>>>>>>>>> alignment);
>>>>>>>>>>>>> if (extra_base == NULL) return;
>>>>>>>>>>>>> // Do manual alignement
>>>>>>>>>>>>> base = (char*) align_size_up((uintptr_t) extra_base,
>>>>>>>>>>>>> alignment);
>>>>>>>>>>>>> assert(base >= extra_base, "just checking");
>>>>>>>>>>>>> // Re-reserve the region at the aligned base address.
>>>>>>>>>>>>> os::release_memory(extra_base, extra_size); // (1)
>>>>>>>>>>>>> base = os::reserve_memory(size, base); // (2)
>>>>>>>>>>>>> } while (base == NULL);
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is a race here between releasing the memory in (1) and
>>>>>>>>>>>>> re-reserving it in (2). But the loop is supposed to handle
>>>>>>>>>>>>> this race.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is that on posix platforms you can remap the
>>>>>>>>>>>>> same memory
>>>>>>>>>>>>> area several times. The call in (2) will use mmap with
>>>>>>>>>>>>> MAP_FIXED.
>>>>>>>>>>>>> This
>>>>>>>>>>>>> means that the OS will think that you know exactly what
>>>>>>>>>>>>> you are
>>>>>>>>>>>>> doing.
>>>>>>>>>>>>> So, if part of the memory has been mapped already by the
>>>>>>>>>>>>> process it
>>>>>>>>>>>>> will
>>>>>>>>>>>>> just go ahead and reuse that memory.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This means that if we are having multiple threads that do
>>>>>>>>>>>>> mmap. We
>>>>>>>>>>>>> can
>>>>>>>>>>>>> end up with a situation where we release our mapping in
>>>>>>>>>>>>> (1). Then
>>>>>>>>>>>>> another thread comes in and maps part of the memory that
>>>>>>>>>>>>> we used to
>>>>>>>>>>>>> have. Then we remap over that memory again in (2) with
>>>>>>>>>>>>> MAP_FIXED.
>>>>>>>>>>>>> Now we
>>>>>>>>>>>>> have a situation where two threads in our process have
>>>>>>>>>>>>> mapped the
>>>>>>>>>>>>> same
>>>>>>>>>>>>> memory. If both threads try to use it or if one of the
>>>>>>>>>>>>> threads unmap
>>>>>>>>>>>>> part or all of the memory we will crash.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On posix it is possible to unmap any part of a mapped
>>>>>>>>>>>>> chunk. So, our
>>>>>>>>>>>>> proposed solution to the race described above is to not
>>>>>>>>>>>>> unmap all
>>>>>>>>>>>>> memory
>>>>>>>>>>>>> in (1) but rather just unmap the section at the start and
>>>>>>>>>>>>> at the
>>>>>>>>>>>>> end of
>>>>>>>>>>>>> the chunk that we mapped to get alignment. This also
>>>>>>>>>>>>> removes the need
>>>>>>>>>>>>> for the loop.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, on Windows you can only unmap _all_ of the memory
>>>>>>>>>>>>> that you
>>>>>>>>>>>>> have
>>>>>>>>>>>>> mapped. On the other hand Windows also will not allow you
>>>>>>>>>>>>> to map over
>>>>>>>>>>>>> other mappings, so the original code is actually safe. If
>>>>>>>>>>>>> we keep
>>>>>>>>>>>>> the loop.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, our solution is to treat this differently on Windows
>>>>>>>>>>>>> and posix
>>>>>>>>>>>>> platforms.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Bengt
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/attachments/20121218/048acdf3/attachment-0001.html
More information about the hotspot-runtime-dev
mailing list