RFR(s): 8023905: Failing to initialize VM with small initial heap when NUMA and large pages are enabled

Stefan Johansson stefan.johansson at oracle.com
Mon Aug 29 08:31:32 UTC 2016


Hi Sangheon,

On 2016-08-26 21:52, sangheon wrote:
> Hi Stefan,
>
> On 08/26/2016 04:53 AM, Stefan Johansson wrote:
>> Hi Sangheon,
>>
>> On 2016-08-25 23:02, sangheon wrote:
>>> Hi Stefan,
>>>
>>> On 08/25/2016 07:36 AM, Stefan Johansson wrote:
>>>> Hi Sangheon,
>>>>
>>>> On 2016-08-24 19:53, sangheon wrote:
>>>>> Hi Stefan,
>>>>>
>>>>> Thanks for reviewing this.
>>>>>
>>>>> On 08/24/2016 06:29 AM, Stefan Johansson wrote:
>>>>>> Hi Sangheon,
>>>>>>
>>>>>> Thanks for looking at this issue.
>>>>>>
>>>>>> On 2016-08-10 19:28, sangheon wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Can I have some reviews for this change?
>>>>>>>
>>>>>>> NUMA and large pages are not compatible on Linux, as large pages 
>>>>>>> cannot be uncommitted (os_linux.cpp, line 4828 [1]). So we use a pin 
>>>>>>> region in this case. If we succeed in reserving a small initial heap 
>>>>>>> with large pages, we will later fail when freeing memory for 
>>>>>>> biasing. The reason is that when we initialize NUMA with large 
>>>>>>> pages, we change the page size to the default page size if the 
>>>>>>> allocated pages are small.
>>>>>>>
>>>>>>> I am proposing to exit the VM at that point. Adding an exception 
>>>>>>> does not seem like a good idea for such a small heap, which does 
>>>>>>> not seem practical for the NUMA + large pages case.
>>>>>>> The added test checks for the exit message if both NUMA and 
>>>>>>> large pages are supported.
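>>>>>>> In rough shape, the proposed check looks like the following (a 
>>>>>>> sketch only; see the webrev for the actual change):
>>>>>>>
>>>>>>> #ifdef LINUX
>>>>>>>   // A pin region is reserved and committed up front and parts of it
>>>>>>>   // cannot be freed again, so we cannot fall back to the default
>>>>>>>   // page size here; give up cleanly instead.
>>>>>>>   if (UseLargePages && !os::can_commit_large_page_memory()) {
>>>>>>>     vm_exit_during_initialization("Failed initializing NUMA. Too small heap size");
>>>>>>>   }
>>>>>>> #endif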
>>>>>>>
>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8023905
>>>>>>> Webrev: http://cr.openjdk.java.net/~sangheki/8023905/webrev.0
>>>>>> A few comments.
>>>>>>
>>>>>> I agree that having the VM exit is good, but I think the exit 
>>>>>> message should include info that large pages caused this. 
>>>>>> Something like: "Failed initializing NUMA with large pages. Too 
>>>>>> small heap size"
>>>>> OK.
>>>>> "Failed initializing NUMA with large pages. Too small heap size" 
>>>>> seems much better.
>>>>>
>>>>>>
>>>>>> Another thing is the use of #ifdef to make this conditional for 
>>>>>> Linux. Is this needed? Isn't the return value of 
>>>>>> can_commit_large_page_memory() the condition we should care about? 
>>>>> We can tell whether we are using a 'pin region' from the return 
>>>>> value of can_commit_large_page_memory() and the UseLargePages flag.
>>>>>
>>>>> However, it is the page size comparison in the Linux version of 
>>>>> os::pd_free_memory() that causes this problem. For other 
>>>>> platforms such as Windows, AIX and BSD, which have an empty 
>>>>> os::pd_free_memory(), it doesn't matter, and Solaris doesn't have 
>>>>> such a condition. This means that on the other platforms we don't 
>>>>> need to exit when reverting to the default page size.
>>>>>
>>>>> I mean, if can_commit_large_page_memory() returns false and 
>>>>> UseLargePages is enabled, we will try to use a pin region, but NUMA 
>>>>> could still try to use the default page size. That is okay in general, 
>>>>> except on Linux for the reason above.
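>>>>>
>>>>> For reference, the Linux version is roughly of this shape (paraphrased, 
>>>>> not the exact source):
>>>>>
>>>>> void os::pd_free_memory(char *addr, size_t bytes, size_t alignment_hint) {
>>>>>   // Freeing works by committing fresh pages over the region, which is
>>>>>   // not possible with SHM/HugeTLBFS large pages, so only do it for
>>>>>   // small pages or when large pages can be committed on demand.
>>>>>   if (alignment_hint <= (size_t)os::vm_page_size() || can_commit_large_page_memory()) {
>>>>>     commit_memory(addr, bytes, alignment_hint, !ExecMem);
>>>>>   }
>>>>> }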
>>>>>
>>>>>> Or would that make us fail on some platforms too early? If so, we could add 
>>>>>> another capability method to the os class and use that to avoid 
>>>>>> having the #ifdef in the code.
>>>>> I also briefly considered adding a new method to decide this.
>>>>> And I agree that not using #ifdef is better in general. But I'm 
>>>>> not sure about this case, as it is too specific to the Linux 
>>>>> implementation, i.e. the Linux version of pd_free_memory() is 
>>>>> implemented to conditionally commit after comparing with the page 
>>>>> size. If Linux's pd_free_memory() becomes empty or the condition is 
>>>>> changed, the decision method would also have to change, which does 
>>>>> not seem worth it to me. This is why I stopped considering it.
>>>>>
>>>> Ok, I don't see things changing as a big problem; you could let the 
>>>> new capability just return the same as 
>>>> can_commit_large_page_memory() for Linux and have the other 
>>>> platforms return true. This would have the same maintenance 
>>>> requirements as the current solution in my eyes.
>>> I think we have to consider not only the maintenance but also the 
>>> necessity of the method. I don't think we could re-use it, and this 
>>> is why I described it as too specific to the Linux implementation.
>>>
>>> If you strongly want to introduce a new method, may I ask what the 
>>> expected name/role of the method would be?
>>>
>> I see your point and I'll leave it up to a second reviewer to decide 
>> which way to go.
> OK.
>
>> I was thinking about something like: 
>> can_free_parts_of_large_page_memory() or maybe 
>> can_split_large_page_memory() but I agree that it's hard to come up 
>> with a good name.
> I like the former one if we need to add a new one.
> I'm not sure that returning true on the other platforms is correct 
> given the method's meaning, though.
>
Probably not; for platforms that don't support large pages, for example, we 
should probably return false.
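
Something like this is what I had in mind, just as a sketch (the method 
does not exist today, so the name and placement are illustrative):

    // os.hpp: can parts of a large-page-backed, reserved-and-committed
    // region be freed again later?
    static bool can_free_parts_of_large_page_memory();

    // os_linux.cpp: freeing works by re-committing over the region,
    // so forward to the existing capability query.
    bool os::can_free_parts_of_large_page_memory() {
      return can_commit_large_page_memory();
    }

    // Platforms with an empty pd_free_memory() could return true, and
    // platforms without large page support should return false.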
>>
>>>>
>>>>>>
>>>>>> Regarding the comment, I'm not sure what you mean by "pin 
>>>>>> region". I might be missing something but I think the comment 
>>>>>> need more information to be easier to understand.
>>>>> Looking at ReservedHeapSpace::try_reserve_heap(), line 314, 
>>>>> there is a comment about the current allocation:
>>>>> // If OS doesn't support demand paging for large page memory, we need
>>>>> // to use reserve_memory_special() to reserve and pin the entire region.
>>>>>
>>>>> And I agree that adding more information is better.
>>>>> How about this? (I will also update the comment in the test.)
>>>>> -     // If we are using pin region, we cannot change the page size to default size
>>>>> -     // as we could free memory which is not expected for pin region in Linux.
>>>>>
>>>>> +     // If we are using pin region which is reserved and pinned the entire region,
>>>>> +     // we cannot change the page size to default size as we could free memory
>>>>> +     // which is not expected for pin region in Linux.
>>>>>
>>>> Ok, I see what you mean. Just never thought about it as "pin region".
>>> We are already using the term 'pinned region' (not 'pin region') 5 
>>> times in the G1 code.
>>>
>> That's true but the meaning of that is different and that's why I 
>> would like to avoid that wording. From heapRegion.hpp:
>>   // A pinned region contains objects which are not moved by garbage collections.
>>   // Humongous regions and archive regions are pinned.
>>   bool is_pinned() const { return _type.is_pinned(); }
>>
>> So this refers to G1 heap regions that are pinned because we don't 
>> move them during GC. In try_reserve_heap() we refer to the backing 
>> memory of the whole heap which is reserved and committed (pinned). So 
>> the meaning of pinned in these two cases is very different.
> OK, let's avoid using it. The fact that we are discussing it shows that 
> it is already controversial.
> I interpreted 'pinned region' as a region which is not being moved or 
> modified, while you include the operation itself too (GC or heap allocation).
>
Great, thanks.
>>
>>>> I would say something like:
>>>>     // Changing the page size below can lead to freeing of memory. When using large pages
>>>>     // and the memory has been both reserved and committed, some platforms do not support
>>>>     // freeing parts of it. For those platforms we fail initialization.
>>>>     if (UseLargePages && !os::can_free_parts_of_large_page_memory()) {
>>>>       vm_exit_during_initialization("Failed initializing NUMA. Too small heap size");
>>>>     }
>>>>
>>>> What do you think about that?
>>> Basically I like the above, but it would be better to keep 'pinned 
>>> region', as I see no reason to avoid using it.
>>>
>> As stated above, I think we should avoid it in this context since it 
>> might be misinterpreted.
> OK. But I had to change it slightly, as the current version doesn't have 
> the general method.
>
> I changed the message (MSG_EXIT_TOO_SMALL_HEAP) in the test to the 
> correct string. When I ran JPRT I was using the correct string but had 
> uploaded the old one.
>
> webrev:
> http://cr.openjdk.java.net/~sangheki/8023905/webrev.1
> http://cr.openjdk.java.net/~sangheki/8023905/webrev.1_to_0
>
Looks good.

Thanks,
Stefan

> Thanks,
> Sangheon
>
>
>>
>> Thanks,
>> Stefan
>>> Thanks,
>>> Sangheon
>>>
>>>
>>>>
>>>> Thanks,
>>>> Stefan
>>>>> Let me upload an updated webrev once our discussion has ended.
>>>>>
>>>>> Thanks,
>>>>> Sangheon
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Stefan
>>>>>>
>>>>>>> Testing: JPRT, and a manual test on a machine supporting NUMA and large pages.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sangheon
>>>>>>>
>>>>>>> [1]:
>>>>>>> // With SHM and HugeTLBFS large pages we cannot uncommit a page, so there's no way
>>>>>>> // we can make the adaptive lgrp chunk resizing work. If the user specified
>>>>>>> // both UseNUMA and UseLargePages (or UseSHM/UseHugeTLBFS) on the command line - warn and
>>>>>>> // disable adaptive resizing.
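>>>>>>>
>>>>>>> The code after this comment then warns and disables the adaptive 
>>>>>>> resizing flags, roughly like this (paraphrased, not the exact source):
>>>>>>>
>>>>>>> if (UseNUMA && UseLargePages && !can_commit_large_page_memory()) {
>>>>>>>   // (Unless the flags were only set ergonomically, in which case
>>>>>>>   // one of them is simply turned off instead.)
>>>>>>>   warning("UseNUMA is not fully compatible with SHM/HugeTLBFS large pages, "
>>>>>>>           "disabling adaptive resizing");
>>>>>>>   UseAdaptiveSizePolicy = false;
>>>>>>>   UseAdaptiveNUMAChunkSizing = false;
>>>>>>> }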
>>>>>>
>>>>>
>>>>
>>>
>>
>



