RFR(s): 8023905: Failing to initialize VM with small initial heap when NUMA and large pages are enabled
sangheon
sangheon.kim at oracle.com
Mon Aug 29 20:57:44 UTC 2016
Hi Stefan,
On 08/29/2016 01:31 AM, Stefan Johansson wrote:
> Hi Sangheon,
>
> On 2016-08-26 21:52, sangheon wrote:
>> Hi Stefan,
>>
>> On 08/26/2016 04:53 AM, Stefan Johansson wrote:
>>> Hi Sangheon,
>>>
>>> On 2016-08-25 23:02, sangheon wrote:
>>>> Hi Stefan,
>>>>
>>>> On 08/25/2016 07:36 AM, Stefan Johansson wrote:
>>>>> Hi Sangheon,
>>>>>
>>>>> On 2016-08-24 19:53, sangheon wrote:
>>>>>> Hi Stefan,
>>>>>>
>>>>>> Thanks for reviewing this.
>>>>>>
>>>>>> On 08/24/2016 06:29 AM, Stefan Johansson wrote:
>>>>>>> Hi Sangheon,
>>>>>>>
>>>>>>> Thanks for looking at this issue.
>>>>>>>
>>>>>>> On 2016-08-10 19:28, sangheon wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Can I have some reviews for this change?
>>>>>>>>
>>>>>>>> NUMA and large pages are not compatible on Linux, as large pages
>>>>>>>> cannot be uncommitted (os_linux.cpp, line 4828 [1]), so we reserve
>>>>>>>> and pin the entire region in this case. If we succeed in reserving
>>>>>>>> a small initial heap with large pages, we then fail when freeing
>>>>>>>> memory for biasing. The reason is that when we initialize NUMA with
>>>>>>>> large pages, we change the page size to the default page size if
>>>>>>>> the allocated pages are too small.
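>>>>>>>>
>>>>>>>> Roughly, the failing sequence looks like this (a simplified sketch,
>>>>>>>> not the actual HotSpot code; heap_region_size is illustrative):
>>>>>>>>
>>>>>>>>   // NUMA initialization with a small heap reverts to default pages:
>>>>>>>>   size_t page_size = os::large_page_size();
>>>>>>>>   if (heap_region_size < page_size) {
>>>>>>>>     page_size = os::vm_page_size();  // fall back to the default page size
>>>>>>>>   }
>>>>>>>>   // Later, lgrp biasing frees part of the heap, but on Linux the whole
>>>>>>>>   // heap was reserved and pinned with large pages, so freeing a
>>>>>>>>   // sub-region with the default page size is not supported and we fail.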
>>>>>>>>
>>>>>>>> I am proposing to exit the VM at that point. Adding an exception
>>>>>>>> path does not seem like a good idea for such a small heap, which is
>>>>>>>> not practical for the NUMA + large pages case anyway.
>>>>>>>> The added test checks the exit message if both NUMA and large pages
>>>>>>>> are supported.
>>>>>>>>
>>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8023905
>>>>>>>> Webrev: http://cr.openjdk.java.net/~sangheki/8023905/webrev.0
>>>>>>> A few comments.
>>>>>>>
>>>>>>> I agree that having the VM exit is good, but I think the exit
>>>>>>> message should include info that large pages caused this.
>>>>>>> Something like: "Failed initializing NUMA with large pages. Too
>>>>>>> small heap size"
>>>>>> OK.
>>>>>> "Failed initializing NUMA with large pages. Too small heap size"
>>>>>> seems much better.
>>>>>>
>>>>>>>
>>>>>>> Another thing is the use of #ifdef to make this conditional for
>>>>>>> Linux. Is this needed? Isn't the return value for
>>>>>>> can_commit_large_page_memory() the conditional we should care
>>>>>>> about?
>>>>>> We can tell whether we are using a 'pin region' from the return
>>>>>> value of can_commit_large_page_memory() and the UseLargePages flag.
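>>>>>>
>>>>>> Concretely (illustrative only):
>>>>>>
>>>>>>   // The heap is reserved and pinned as one chunk when large pages are
>>>>>>   // requested but the OS cannot commit large page memory on demand.
>>>>>>   bool pinned = UseLargePages && !os::can_commit_large_page_memory();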
>>>>>>
>>>>>> However, it is the page size comparison in the Linux version of
>>>>>> os::pd_free_memory() that causes this problem. For other platforms
>>>>>> such as Windows, AIX and BSD, which have an empty
>>>>>> os::pd_free_memory(), it doesn't matter, and Solaris doesn't have
>>>>>> such a condition. This means that on the other platforms we don't
>>>>>> need to exit when reverting to the default page size.
>>>>>>
>>>>>> In other words, if can_commit_large_page_memory() returns false and
>>>>>> UseLargePages is enabled, we will try to reserve and pin the region,
>>>>>> but NUMA could still try to use the default page size. That is okay
>>>>>> in general, except on Linux for the reason above.
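>>>>>>
>>>>>> To sketch the platform difference (paraphrased, not verbatim; see
>>>>>> os_linux.cpp for the exact condition):
>>>>>>
>>>>>>   // Linux: only acts when the alignment hint indicates large pages,
>>>>>>   // which is exactly the case that breaks for a pinned reservation.
>>>>>>   void os::pd_free_memory(char* addr, size_t bytes, size_t alignment_hint) {
>>>>>>     if (alignment_hint > (size_t)os::vm_page_size()) {
>>>>>>       commit_memory(addr, bytes, alignment_hint, false);
>>>>>>     }
>>>>>>   }
>>>>>>
>>>>>>   // Windows, AIX, BSD: empty, so nothing can go wrong there.
>>>>>>   void os::pd_free_memory(char* addr, size_t bytes, size_t alignment_hint) { }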
>>>>>>
>>>>>>> Or would we fail on some platform too early? If so, we could add
>>>>>>> another capability method to the os class and use that to avoid
>>>>>>> having the #ifdef in the code.
>>>>>> I also briefly considered adding a new method to make this decision,
>>>>>> and I agree that not using #ifdef is better in general. But I'm not
>>>>>> sure about this case, as it is too specific to the Linux
>>>>>> implementation; i.e. the Linux version of pd_free_memory() is
>>>>>> implemented to conditionally commit after comparing against the page
>>>>>> size. If the Linux pd_free_memory() becomes empty or the condition
>>>>>> changes, the decision method would also have to change, which doesn't
>>>>>> seem worthwhile to me. This is why I stopped considering it.
>>>>>>
>>>>> Ok, I don't see things changing as a big problem; you could let the
>>>>> new capability simply return the same as can_commit_large_page_memory()
>>>>> on Linux and have the other platforms return true. This would have the
>>>>> same maintenance requirements as the current solution in my eyes.
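>>>>>
>>>>> For example (a sketch of what I mean, using a placeholder name, not a
>>>>> final patch):
>>>>>
>>>>>   // os_linux.cpp
>>>>>   bool os::can_free_parts_of_large_page_memory() {
>>>>>     return can_commit_large_page_memory();
>>>>>   }
>>>>>
>>>>>   // os_windows.cpp, os_aix.cpp, os_bsd.cpp, os_solaris.cpp
>>>>>   bool os::can_free_parts_of_large_page_memory() {
>>>>>     return true;
>>>>>   }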
>>>> I think we have to consider not only the maintenance cost but also the
>>>> necessity of the method. I don't think we could reuse it, which is why
>>>> I described it as too specific to the Linux implementation.
>>>>
>>>> If you strongly want to introduce a new method, may I ask what the
>>>> expected name/role of the method would be?
>>>>
>>> I see your point and I'll leave it up to a second reviewer to decide
>>> which way to go.
>> OK.
>>
>>> I was thinking about something like:
>>> can_free_parts_of_large_page_memory() or maybe
>>> can_split_large_page_memory() but I agree that it's hard to come up
>>> with a good name.
>> I like the former one if we need to add a new one.
>> I'm not sure that returning true for the other platforms is correct
>> given the method's meaning, though.
>>
> Probably not; for platforms not supporting large pages, for example, we
> should probably return false.
I think I need to clarify a little bit here.
When I said "I'm not sure that returning true for the other platforms is
correct given the method's meaning" in the previous email, part of that
came from your suggestion that "the new capability just return the same as
can_commit_large_page_memory() for Linux and have the other platforms
return true."
What we need at this location in the patch is to distinguish Linux from
the other platforms when large pages are enabled, so that we exit the VM
on Linux only.
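
In other words, the check ends up Linux-specific; something with this
shape (a sketch only, not the exact webrev diff):

  // In the NUMA setup path, after deciding to fall back to the default
  // page size for a too-small heap:
#ifdef LINUX
  if (UseLargePages && !os::can_commit_large_page_memory()) {
    vm_exit_during_initialization(
        "Failed initializing NUMA with large pages. Too small heap size");
  }
#endif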
>>>
>>>>>
>>>>>>>
>>>>>>> Regarding the comment, I'm not sure what you mean by "pin
>>>>>>> region". I might be missing something, but I think the comment
>>>>>>> needs more information to be easier to understand.
>>>>>> Looking at ReservedHeapSpace::try_reserve_heap(), line 314, there
>>>>>> is a comment about the current allocation:
>>>>>> // If OS doesn't support demand paging for large page memory, we need
>>>>>> // to use reserve_memory_special() to reserve and pin the entire region.
>>>>>>
>>>>>> And I agree that adding more information is better.
>>>>>> How about this? (I will also update the comment in the test.)
>>>>>> - // If we are using pin region, we cannot change the page size to default size
>>>>>> - // as we could free memory which is not expected for pin region in Linux.
>>>>>>
>>>>>> + // If we are using pin region, where the entire region is reserved and pinned,
>>>>>> + // we cannot change the page size to default size as we could free memory
>>>>>> + // which is not expected for pin region in Linux.
>>>>>>
>>>>> Ok, I see what you mean. I just never thought about it as "pin region".
>>>> We are already using the term 'pinned region' (not 'pin region') 5
>>>> times in the G1 code.
>>>>
>>> That's true, but the meaning there is different, and that's why I
>>> would like to avoid that wording. From heapRegion.hpp:
>>> // A pinned region contains objects which are not moved by garbage collections.
>>> // Humongous regions and archive regions are pinned.
>>> bool is_pinned() const { return _type.is_pinned(); }
>>>
>>> So this refers to G1 heap regions that are pinned because we don't
>>> move them during GC. In try_reserve_heap() we refer to the backing
>>> memory of the whole heap which is reserved and committed (pinned).
>>> So the meaning of pinned in these two cases is very different.
>> OK, let's avoid using it. The fact that we are discussing it shows it
>> is already controversial.
>> I interpreted 'pinned region' as a region which is not moved or
>> modified, while you include the operation itself too (GC or heap
>> allocation).
>>
> Great, thanks.
>>>
>>>>> I would say something like:
>>>>> // Changing the page size below can lead to freeing of memory. When using
>>>>> // large pages and the memory has been both reserved and committed, some
>>>>> // platforms do not support freeing parts of it. For those platforms we
>>>>> // fail initialization.
>>>>> if (UseLargePages && !os::can_free_parts_of_large_page_memory()) {
>>>>>   vm_exit_during_initialization("Failed initializing NUMA. Too small heap size");
>>>>> }
>>>>>
>>>>> What do you think about that?
>>>> Basically I like the above, but it would be better to keep 'pinned
>>>> region', as I see no reason to avoid using it.
>>>>
>>> As stated above, I think we should avoid it in this context since it
>>> might be misinterpreted.
>> OK. But I had to change it slightly, as the current version doesn't
>> have the general method.
>>
>> I changed the message (MSG_EXIT_TOO_SMALL_HEAP) in the test to the
>> correct string. When I ran JPRT I was using the correct string but had
>> uploaded the old one.
>>
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8023905/webrev.1
>> http://cr.openjdk.java.net/~sangheki/8023905/webrev.1_to_0
>>
> Looks good.
Thanks for the review.
Sangheon
>
> Thanks,
> Stefan
>
>> Thanks,
>> Sangheon
>>
>>
>>>
>>> Thanks,
>>> Stefan
>>>> Thanks,
>>>> Sangheon
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Stefan
>>>>>> Let me upload webrev after our discussion ended.
>>>>>>
>>>>>> Thanks,
>>>>>> Sangheon
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Stefan
>>>>>>>
>>>>>>>> Testing: JPRT, manual test on a machine supporting NUMA + large pages.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sangheon
>>>>>>>>
>>>>>>>> [1]:
>>>>>>>> // With SHM and HugeTLBFS large pages we cannot uncommit a page, so
>>>>>>>> // there's no way we can make the adaptive lgrp chunk resizing work.
>>>>>>>> // If the user specified both UseNUMA and UseLargePages (or
>>>>>>>> // UseSHM/UseHugeTLBFS) on the command line - warn and disable
>>>>>>>> // adaptive resizing.