Request for review (s): 7173959 : Jvm crashed during coherence exabus (tmb) testing

Fri Dec 14 00:48:54 PST 2012

Hi again Runtime team,

I need one more review for this change. It is a P1 bug with a fairly 
small change. Any takers?

Latest webrev based on comments from David Holmes and Vitaly Davidovich:
http://cr.openjdk.java.net/~brutisso/7173959/webrev.03/

Thanks,
Bengt

On 12/14/12 9:29 AM, Bengt Rutisson wrote:
>
> Hi David,
>
> On 12/14/12 1:38 AM, David Holmes wrote:
>> Hi Bengt,
>>
>> That all sounds good to me.
>>
>> With regard to potential overflow perhaps a simple check that 
>> extra_size does not overflow? If that is true and extra_base is not 
>> null then I think all the other calculations must be valid.
>
> Yes, that makes sense. I'll add an overflow check for the extra_size.
>
> Thanks,
> Bengt
>
>>
>> Thanks,
>> David
>>
>> On 13/12/2012 10:58 PM, Bengt Rutisson wrote:
>>>
>>> Hi David,
>>>
>>> Updated webrev:
>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.02/
>>>
>>> I moved the alignment of "size" back into ReservedSpace::initialize()
>>> since we actually store the size in an instance variable (_size), so I
>>> think it is a bit risky to just do the align_up in
>>> os::reserve_memory_aligned(). We probably want the instance variables
>>> _size and _alignment in ReservedSpace to be consistent.
>>>
>>> I added an assert in os::reserve_memory_aligned() to verify that the
>>> size is correctly aligned. I also added the assert you suggested to
>>> check that alignment is page size aligned.
>>>
>>> Bengt
>>>
>>> On 12/13/12 11:50 AM, David Holmes wrote:
>>>> On 13/12/2012 8:37 PM, Bengt Rutisson wrote:
>>>>>
>>>>> Hi again,
>>>>>
>>>>> Updated webrev:
>>>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.01/
>>>>>
>>>>> I removed the comment and the alignment.
>>>>>
>>>>> But I did not add an assert just yet.
>>>>>
>>>>> At the top of ReservedSpace::initialize() we have this code:
>>>>>
>>>>> const size_t granularity = os::vm_allocation_granularity();
>>>>> assert((size & (granularity - 1)) == 0,
>>>>> "size not aligned to os::vm_allocation_granularity()");
>>>>> assert((alignment & (granularity - 1)) == 0,
>>>>> "alignment not aligned to os::vm_allocation_granularity()");
>>>>>
>>>>>
>>>>> Where os::vm_allocation_granularity() is implemented as page size 
>>>>> on all
>>>>> platforms except Windows. There we look it up from the Windows API. I
>>>>> assume that is a page size too, but since the Windows code in our 
>>>>> patch
>>>>> does not unmap based on the alignment it should be safe either way.
>>>>>
>>>>> Is this assert enough or would you like me to add an assert closer to
>>>>> the code block were I did the changes?
>>>>
>>>> As this is a separate method now I think an assert in the method
>>>> itself would not go astray.
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>
>>>>> On 12/13/12 11:02 AM, Bengt Rutisson wrote:
>>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> Thanks for looking at this!
>>>>>>
>>>>>> On 12/13/12 8:33 AM, David Holmes wrote:
>>>>>>> Hi Bengt,
>>>>>>>
>>>>>>> This has to be one of the absolute best review requests I've ever
>>>>>>> read :) Thank you.
>>>>>>
>>>>>> Wow! Thanks! :)
>>>>>>
>>>>>>>
>>>>>>> Just a couple of queries.
>>>>>>>
>>>>>>> os_posix.cpp:
>>>>>>>
>>>>>>> Love the diagram :)
>>>>>>
>>>>>> It was Mikael's way of making sure we got it right.
>>>>>>
>>>>>>> I'm assuming that "alignment" has to be >= the underlying page 
>>>>>>> size,
>>>>>>> and in fact must be a multiple of the underlying page size ? (As I
>>>>>>> assume you can only unmap whole numbers of pages). If so does that
>>>>>>> need to be checked somewhere?
>>>>>>
>>>>>> Good point. I'll add an assert to check that.
>>>>>>
>>>>>>> In virtualSpace.cpp:
>>>>>>>
>>>>>>> // Reserve size large enough to do manual alignment and
>>>>>>> // increase size to a multiple of the desired alignment
>>>>>>> size = align_size_up(size, alignment);
>>>>>>> ! base = os::reserve_memory_aligned(size, alignment);
>>>>>>>
>>>>>>> The comment doesn't seem necessary now, and the align_size_up seems
>>>>>>> redundant.
>>>>>>
>>>>>> You are right. I'll remove the comment and the alignment.
>>>>>>
>>>>>> Thanks,
>>>>>> Bengt
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>> On 13/12/2012 4:42 PM, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Could I have a couple of reviews for this change?
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~brutisso/7173959/webrev.00/
>>>>>>>>
>>>>>>>> This is for a bug originally reported by the Coherence team:
>>>>>>>>
>>>>>>>> 7173959 : Jvm crashed during coherence exabus (tmb) testing
>>>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7173959
>>>>>>>>
>>>>>>>> The original analysis and proposed fix was done by Mikael Gerdin
>>>>>>>> and me
>>>>>>>> together. I'll handle the webrev and push since Mikael is on 
>>>>>>>> vacation
>>>>>>>> starting today. But Mikael did a great job tracking down this very
>>>>>>>> difficult bug, so he should have a large part of the credit for 
>>>>>>>> this
>>>>>>>> bug
>>>>>>>> fix,
>>>>>>>>
>>>>>>>> Description from the CR:
>>>>>>>>
>>>>>>>> The reason that we crash is due to how we re-map memory when we
>>>>>>>> want to
>>>>>>>> align it for large pages in ReservedSpace::initialize().
>>>>>>>>
>>>>>>>> Here is what happens:
>>>>>>>>
>>>>>>>> The reservation of memory is split up to a few steps. When we 
>>>>>>>> want a
>>>>>>>> chunk of memory with large pages we first just reserve some memory
>>>>>>>> large
>>>>>>>> enough for what we need. Then we realize that we want large pages,
>>>>>>>> so we
>>>>>>>> want to re-map the reserved memory to use large pages. But 
>>>>>>>> since this
>>>>>>>> requires that we have a large-page-aligned memory chunk we may
>>>>>>>> have to
>>>>>>>> fix the recently reserved memory chunk up a bit.
>>>>>>>>
>>>>>>>> We do this in ReservedSpace::initialize() by first releasing the
>>>>>>>> memory
>>>>>>>> we just reserved. Then requesting more memory than we actually
>>>>>>>> need to
>>>>>>>> make sure that we have enough space to do manual large page
>>>>>>>> alignment.
>>>>>>>> After we have gotten this memory we figure out a nicely aligned 
>>>>>>>> start
>>>>>>>> address. We then release the memory again and then reserve just
>>>>>>>> enough
>>>>>>>> memory but using the aligned start address as a fixed address 
>>>>>>>> to make
>>>>>>>> sure that we get the memory we wanted in an aligned way.
>>>>>>>>
>>>>>>>> This is done in a loop to make sure that we eventually get some
>>>>>>>> memory.
>>>>>>>> The interesting code looks like this:
>>>>>>>>
>>>>>>>> do {
>>>>>>>> char* extra_base = os::reserve_memory(extra_size, NULL, 
>>>>>>>> alignment);
>>>>>>>> if (extra_base == NULL) return;
>>>>>>>> // Do manual alignement
>>>>>>>> base = (char*) align_size_up((uintptr_t) extra_base, alignment);
>>>>>>>> assert(base >= extra_base, "just checking");
>>>>>>>> // Re-reserve the region at the aligned base address.
>>>>>>>> os::release_memory(extra_base, extra_size); // (1)
>>>>>>>> base = os::reserve_memory(size, base); // (2)
>>>>>>>> } while (base == NULL);
>>>>>>>>
>>>>>>>>
>>>>>>>> There is a race here between releasing the memory in (1) and
>>>>>>>> re-reserving it in (2). But the loop is supposed to handle this 
>>>>>>>> race.
>>>>>>>>
>>>>>>>> The problem is that on posix platforms you can remap the same 
>>>>>>>> memory
>>>>>>>> area several times. The call in (2) will use mmap with MAP_FIXED.
>>>>>>>> This
>>>>>>>> means that the OS will think that you know exactly what you are
>>>>>>>> doing.
>>>>>>>> So, if part of the memory has been mapped already by the 
>>>>>>>> process it
>>>>>>>> will
>>>>>>>> just go ahead and reuse that memory.
>>>>>>>>
>>>>>>>> This means that if we are having multiple threads that do mmap. We
>>>>>>>> can
>>>>>>>> end up with a situation where we release our mapping in (1). Then
>>>>>>>> another thread comes in and maps part of the memory that we 
>>>>>>>> used to
>>>>>>>> have. Then we remap over that memory again in (2) with MAP_FIXED.
>>>>>>>> Now we
>>>>>>>> have a situation where two threads in our process have mapped the
>>>>>>>> same
>>>>>>>> memory. If both threads try to use it or if one of the threads 
>>>>>>>> unmap
>>>>>>>> part or all of the memory we will crash.
>>>>>>>>
>>>>>>>> On posix it is possible to unmap any part of a mapped chunk. 
>>>>>>>> So, our
>>>>>>>> proposed solution to the race described above is to not unmap all
>>>>>>>> memory
>>>>>>>> in (1) but rather just unmap the section at the start and at the
>>>>>>>> end of
>>>>>>>> the chunk that we mapped to get alignment. This also removes 
>>>>>>>> the need
>>>>>>>> for the loop.
>>>>>>>>
>>>>>>>> However, on Windows you can only unmap _all_ of the memory that 
>>>>>>>> you
>>>>>>>> have
>>>>>>>> mapped. On the other hand Windows also will not allow you to 
>>>>>>>> map over
>>>>>>>> other mappings, so the original code is actually safe. If we keep
>>>>>>>> the loop.
>>>>>>>>
>>>>>>>> So, our solution is to treat this differently on Windows and posix
>>>>>>>> platforms.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Bengt
>>>>>>
>>>>>
>>>
>