RFR(s): 8023905: Failing to initialize VM with small initial heap when NUMA and large pages are enabled

Fri Aug 26 19:52:10 UTC 2016

Hi Stefan,

On 08/26/2016 04:53 AM, Stefan Johansson wrote:
> Hi Sangheon,
>
> On 2016-08-25 23:02, sangheon wrote:
>> Hi Stefan,
>>
>> On 08/25/2016 07:36 AM, Stefan Johansson wrote:
>>> Hi Sangheon,
>>>
>>> On 2016-08-24 19:53, sangheon wrote:
>>>> Hi Stefan,
>>>>
>>>> Thanks for reviewing this.
>>>>
>>>> On 08/24/2016 06:29 AM, Stefan Johansson wrote:
>>>>> Hi Sangheon,
>>>>>
>>>>> Thanks for looking at this issue.
>>>>>
>>>>> On 2016-08-10 19:28, sangheon wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Can I have some reviews for this change?
>>>>>>
>>>>>> NUMA and large pages are not compatible in Linux as large pages 
>>>>>> cannot uncommit pages(os_linux.cpp:line 4828 [1]). So we use pin 
>>>>>> region for this case. If we succeed to reserve with large pages 
>>>>>> for small initial heap, we will fail when free memory for 
>>>>>> biasing. The reason is that when we initialize NUMA with large 
>>>>>> pages, we change the page size to the default page size if the 
>>>>>> allocated pages are small.
>>>>>>
>>>>>> I am proposing to exit the VM at that time. Adding an exception 
>>>>>> seems not good idea for this small heap which seems not practical 
>>>>>> for NUMA + large page case.
>>>>>> The added test is checking the exit message if both NUMA and 
>>>>>> large pages are supported.
>>>>>>
>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8023905
>>>>>> Webrev: http://cr.openjdk.java.net/~sangheki/8023905/webrev.0
>>>>> A few comments.
>>>>>
>>>>> I agree that having the VM exit is good, but I think the exit 
>>>>> message should include info that large pages caused this. 
>>>>> Something like: "Failed initializing NUMA with large pages. Too 
>>>>> small heap size"
>>>> OK.
>>>> "Failed initializing NUMA with large pages. Too small heap size" 
>>>> seems much better.
>>>>
>>>>>
>>>>> Another thing is the use of #ifdef to make this conditional for 
>>>>> Linux. Is this needed? Isn't the return value for 
>>>>> can_commit_large_page_memory() the conditional we should care about? 
>>>> We can know whether we are using 'pin region' from the return value 
>>>> of can_commit_large_page_memory() and UseLargePages flag.
>>>>
>>>> However the condition of comparison with page size in Linux version 
>>>> of os::pd_free_memory() makes this problem. For other platforms 
>>>> such as Windows, AIX and BSD which have empty os::pd_free_memory(), 
>>>> it doesn't matter. And Solaris doesn't have such condition. This 
>>>> means for other platforms we don't need to exit because of 
>>>> reverting to default page size.
>>>>
>>>> I mean if can_commit_large_page_memory() returns false and 
>>>> UseLargePages enabled, we will try to use pin region. But NUMA 
>>>> could try to use default page size. It is okay in general except on 
>>>> Linux because of above reason.
>>>>
>>>>> Or will we fail some platform too early. If so, we could add 
>>>>> another capability method to the os class and use that to avoid 
>>>>> having the #ifdef in the code.
>>>> I also considered shortly to add a new method to decide.
>>>> And I agree that not using #ifdef is better in general. But I'm not 
>>>> sure for this case as it is too Linux implementation specific. i.e. 
>>>> Linux version is implemented pd_free_memory() to conditionally 
>>>> commit after comparing with page size. If Linux pd_free_memory() 
>>>> becomes blank or the condition is changed, the decision method also 
>>>> should be changed which seems not worth for me. This is why I 
>>>> stopped considering it.
>>>>
>>> Ok, I don't see things changing as a big problem, you could let the 
>>> new capability just return the same as 
>>> can_commit_large_page_memory() for Linux and have the other 
>>> platforms return true. This would have the same maintenance 
>>> requirements as the current solution in my eyes.
>> I think we have to consider not only the maintenance but also the 
>> necessity of the method. I don't think we could re-use it and this is 
>> why I described it as too Linux implementation specific.
>>
>> If you strongly want to introduce a new method, may I ask what is 
>> expected name/roll of the method?
>>
> I see your point and I'll leave it up to a second reviewer to decide 
> which way to go.
OK.

> I was thinking about something like: 
> can_free_parts_of_large_page_memory() or maybe 
> can_split_large_page_memory() but I agree that it's hard to come up 
> with a good name.
I like the former one if we need to add a new one.
I'm not sure for other platforms returning true is correct regarding the 
method's meaning though.

>
>>>
>>>>>
>>>>> Regarding the comment, I'm not sure what you mean by "pin region". 
>>>>> I might be missing something but I think the comment need more 
>>>>> information to be easier to understand.
>>>> Looking at the ReservedHeapSpace::try_reserve_heap() line 314, 
>>>> there is a comment about current allocation.
>>>> // If OS doesn't support demand paging for large page memory, we need
>>>> // to use reserve_memory_special() to reserve and pin the entire 
>>>> region.
>>>>
>>>> And I agree adding more information is better.
>>>> How about this? (I will also update the comment at test)
>>>> -     // If we are using pin region, we cannot change the page size 
>>>> to default size
>>>> -     // as we could free memory which is not expected for pin 
>>>> region in Linux.
>>>>
>>>> +     // If we are using pin region which is reserved and pinned 
>>>> the entire region,
>>>> +     // we cannot change the page size to default size as we could 
>>>> free memory
>>>> +     // which is not expected for pin region in Linux.
>>>>
>>> Ok, I see what you mean. Just never thought about it as "pin region".
>> We are already using the term 'pinned region (not pin region)' 5 
>> times at G1 code.
>>
> That's true but the meaning of that is different and that's why I 
> would like to avoid that wording. From heapRegion.hpp:
>   // A pinned region contains objects which are not moved by garbage 
> collections.
>   // Humongous regions and archive regions are pinned.
>   bool is_pinned() const { return _type.is_pinned(); }
>
> So this refers to G1 heap regions that are pinned because we don't 
> move them during GC. In try_reserve_heap() we refer to the backing 
> memory of the whole heap which is reserved and committed (pinned). So 
> the meaning of pinned in these two cases is very different.
OK, let's avoid using it. As we are talking about it, it is already 
controversial.
I interpreted 'pinned region' as some region which is not moving or 
modifying while you include the job itself too (GC or heap allocation).

>
>>> I would say something like:
>>>     // Changing the page size below can lead to freeing of memory. 
>>> When using large pages
>>>     // and the memory has been both reserved and committed, some 
>>> platforms do not support
>>>     // freeing parts of it. For those platforms we fail initialization.
>>>     if (UseLargePages && !os::can_free_parts_of_large_page_memory()) {
>>>       vm_exit_during_initialization("Failed initializing NUMA. Too 
>>> small heap size");
>>>     }
>>>
>>> What do you think about that?
>> Basically I like above but it would be better to contain 'pinned 
>> region' as there's no reason to avoid using it.
>>
> As stated above, I think we should avoid it in this context since it 
> might be misinterpreted.
OK. But I had to slightly change as current version doesn't have the 
general method.

I changed the message(MSG_EXIT_TOO_SMALL_HEAP) in the test to the 
correct string. When I ran JPRT I was using correct string but uploaded 
the old one.

webrev:
http://cr.openjdk.java.net/~sangheki/8023905/webrev.1
http://cr.openjdk.java.net/~sangheki/8023905/webrev.1_to_0

Thanks,
Sangheon

>
> Thanks,
> Stefan
>> Thanks,
>> Sangheon
>>
>>
>>>
>>> Thanks,
>>> Stefan
>>>> Let me upload webrev after our discussion ended.
>>>>
>>>> Thanks,
>>>> Sangheon
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Stefan
>>>>>
>>>>>> Testing: JPRT, manual test on NUMA + large page supported machine.
>>>>>>
>>>>>> Thanks,
>>>>>> Sangheon
>>>>>>
>>>>>> [1]:
>>>>>> // With SHM and HugeTLBFS large pages we cannot uncommit a page, 
>>>>>> so there's no way
>>>>>> // we can make the adaptive lgrp chunk resizing work. If the user 
>>>>>> specified
>>>>>> // both UseNUMA and UseLargePages (or UseSHM/UseHugeTLBFS) on the 
>>>>>> command line - warn and
>>>>>> // disable adaptive resizing.
>>>>>
>>>>
>>>
>>
>