RFR(s): 8077276: allocating heap with UseLargePages and HugeTLBFS may trash existing memory mappings (linux)
Kirk Pepperdine
kirk at kodewerk.com
Tue May 5 09:02:39 UTC 2015
Hi Stefan,
I figured it must be a known issue but I thought I’d take the opportunity to add what we’re seeing out here in the wild.
Regards,
Kirk
On May 5, 2015, at 10:11 AM, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:
> Hi Kirk,
>
> On 2015-05-05 08:30, Kirk Pepperdine wrote:
>> Hi all,
>>
>> We’ve been having a running discussion on friends at jclarity.com regarding THP on Linux. Our recommendation, and it has been for some time, is to turn THP off, as it can be responsible for some very long pause times. Charlie Hunt is probably the most knowledgeable person you have internally regarding the problem.
>>
>> Here is a GC log entry from a current discussion.
>>
>> 2015-04-03T01:26:22.488-0400: 196943.302: [GC [PSYoungGen: 3890145K->1557919K(5087744K)] 14069848K->12980486K(16664064K), 47.7667670 secs] [Times: user=0.00 sys=584.96, real=47.77 secs]
>
> To be clear, the UseHugeTLBFS flag, which we discuss in this review, does not turn on Transparent Huge Pages.
>
> It's a known issue that THP causes problems:
> https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge
>
> For a short period of time we had UseTransparentHugePages turned on by default, but it was reverted with:
> https://bugs.openjdk.java.net/browse/JDK-8024838 - Significant slowdown due to transparent huge pages
>
> You can check if you have THP by reading this file:
> /sys/kernel/mm/transparent_hugepage/enabled
>
> Thanks,
> StefanK
>
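For reference, a minimal sketch in C of the check described above. It relies on the kernel's convention of marking the active mode in that file with square brackets (for example "always [madvise] never"). The helper names thp_active_mode() and thp_read_mode() are hypothetical and only for illustration; they are not part of HotSpot.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the bracketed (active) mode out of a line such as
 * "always [madvise] never". Returns 0 on success, -1 if no bracketed mode is
 * found or the output buffer is too small. */
int thp_active_mode(const char *line, char *out, size_t out_len)
{
    const char *lb = strchr(line, '[');
    const char *rb = lb ? strchr(lb, ']') : NULL;
    if (lb == NULL || rb == NULL || out_len == 0)
        return -1;

    size_t len = (size_t)(rb - lb - 1);
    if (len >= out_len)
        return -1;

    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}

/* Hypothetical helper: read the sysfs file and report the active mode.
 * Returns -1 if the file cannot be opened (e.g. a kernel built without THP). */
int thp_read_mode(char *out, size_t out_len)
{
    char line[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
    if (f == NULL)
        return -1;

    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    return ok ? thp_active_mode(line, out, out_len) : -1;
}
```

Calling thp_read_mode(buf, sizeof buf) and comparing buf against "never" would tell you whether THP is switched off on that machine.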
>>
>> Kind regards,
>> Kirk
>>
>>
>>
>> On May 4, 2015, at 10:05 PM, Coleen Phillimore <coleen.phillimore at oracle.com> wrote:
>>
>>> Hi Thomas,
>>>
>>> I reviewed the latest version. Thanks to Stefan who reviewed the earlier versions and suggested changes.
>>>
>>> This looks great. I am not an expert at large pages (or even that knowledgeable) but your change addresses the safety concern that you had and the code appears to do what you said.
>>>
>>> Is there a follow-up RFE to clean up the last use of the requested address parameter to os::reserve_memory() and then remove it? The layering of os::reserve_memory() calling os::pd_reserve_memory() is really annoying and does not seem helpful.
>>>
>>> Thanks,
>>> Coleen
>>>
>>> (also, I will sponsor it for you if you send me an hg exported version of your commit)
>>>
>>> On 4/30/15, 3:33 AM, Thomas Stüfe wrote:
>>>> Hi all,
>>>>
>>>> I realize that this patch may look more complicated than it is.
>>>>
>>>> Basically, the problem is that under certain conditions, memory is
>>>> allocated using mmap(addr, MAP_FIXED) for an initial reservation (e.g. java
>>>> heap), which may trash existing mappings.
>>>>
>>>> The fix is basically just to remove the MAP_FIXED flag for the initial
>>>> allocation.
>>>>
>>>> Fix looks more complicated than this because the test functions were
>>>> expanded to add regression tests for this fix.
>>>>
>>>> Please tell me if I should dumb down this fix or add explanations.
>>>>
>>>> Thanks & Kind regards, Thomas
>>>>
>>>>
>>>>
>>>> On Thu, Apr 30, 2015 at 8:39 AM, David Holmes <david.holmes at oracle.com>
>>>> wrote:
>>>>
>>>>> On 30/04/2015 4:59 AM, Thomas Stüfe wrote:
>>>>>
>>>>>> Could I have another reviewer, please, and a sponsor?
>>>>>>
>>>>>> Latest webrev is:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8077276/webrev.05/webrev
>>>>>
>>>>> Sorry not something I can review in depth.
>>>>>
>>>>> David
>>>>>
>>>>>
>>>>>> Thanks!
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>> On Thursday, April 9, 2015, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>> please review this fix to huge page allocation on Linux.
>>>>>>>
>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8077276
>>>>>>> webrev:
>>>>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8077276/webrev.00/webrev/
>>>>>>>
>>>>>>> os::Linux::reserve_memory_special_huge_tlbfs_mixed() first establishes a
>>>>>>> mapping with small pages over the whole requested range, then exchanges
>>>>>>> the parts of the mapping which are aligned to large page size with large
>>>>>>> pages.
>>>>>>>
>>>>>>> Now, when establishing the first mapping, it uses os::reserve_memory()
>>>>>>> with a potentially non-null req_addr. In that case, os::reserve_memory()
>>>>>>> will do a mmap(MAP_FIXED).
>>>>>>>
>>>>>>> This will trash any pre-existing mappings.
>>>>>>>
>>>>>>> Note that I could not deliberately reproduce the error with an unmodified
>>>>>>> VM. But I added a reproduction case which demonstrates that if one were to
>>>>>>> call os::Linux::reserve_memory_special_huge_tlbfs_mixed() with a non-NULL
>>>>>>> req_addr, existing mappings would be trashed. Depending on where we are in
>>>>>>> the address space, we would also overwrite libc structures and crash
>>>>>>> immediately.
>>>>>>>
>>>>>>> The repro case is part of the change, see changes in
>>>>>>> test_reserve_memory_special_huge_tlbfs_mixed(), and can be executed with:
>>>>>>>
>>>>>>> ./images/jdk/bin/java -XX:+UseLargePages -XX:+UseHugeTLBFS -XX:-UseSHM
>>>>>>> -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests
>>>>>>>
>>>>>>> The fix: instead of using os::reserve_memory(),
>>>>>>> os::Linux::reserve_memory_special_huge_tlbfs_mixed() now calls mmap()
>>>>>>> directly with the non-NULL req_addr, but without MAP_FIXED. This means the
>>>>>>> OS will try to allocate at req_addr, but will not trash any existing
>>>>>>> mappings. This also follows the pattern in
>>>>>>> os::Linux::reserve_memory_special_huge_tlbfs_only(), which is a sister
>>>>>>> function of os::Linux::reserve_memory_special_huge_tlbfs_mixed().
>>>>>>>
>>>>>>> Note also the discussion at:
>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-April/017823.html
>>>>>>>
>>>>>>> Thanks for reviewing!
>>>>>>>
>>>>>>> Kind Regards, Thomas
>>>>>>>
>>>>>>>
>>>>>>>
>
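The behaviour the review request relies on, mmap() treating a non-NULL address as a hint unless MAP_FIXED is given, can be sketched in plain C. This is only an illustration under stated assumptions, not the code from the webrev: reserve_at_hint() is a hypothetical name, and releasing the mapping when the kernel places it somewhere else is an assumed policy for the sketch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to reserve `bytes` of address space at `req_addr`
 * WITHOUT MAP_FIXED. The kernel then treats req_addr as a hint only: if the
 * range is already occupied it picks a different address instead of
 * replacing what is mapped there. In that case this sketch gives the
 * mapping back and reports failure by returning NULL. */
void *reserve_at_hint(void *req_addr, size_t bytes)
{
    void *p = mmap(req_addr, bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    if (req_addr != NULL && p != req_addr) {
        /* Landed elsewhere; the existing mapping at req_addr is untouched. */
        munmap(p, bytes);
        return NULL;
    }
    return p;
}
```

With MAP_FIXED added to the flags, the same call would silently replace whatever was mapped at req_addr, which is the trashing described in the bug.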