RFR: 8016155: SIGBUS when running Kitchensink with ParallelScavenge and ParallelOld
Stefan Johansson
stefan.johansson at oracle.com
Tue Aug 27 06:34:11 PDT 2013
Thanks Per and Jon for the reviews.
Stefan
On 2013-08-26 13:40, Stefan Johansson wrote:
> On 2013-08-23 20:58, Jon Masamitsu wrote:
>>
>> On 8/23/2013 5:30 AM, Stefan Johansson wrote:
>>> Hi all,
>>>
>>> I would like some reviews on my fix for bug:
>>> http://bugs.sun.com/view_bug.do?bug_id=8016155
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~sjohanss/8016155/webrev.00
>>>
>>> Summary:
>>> On Linux we have a problem that we hit a SIGBUS when one NUMA node
>>> runs out of large pages but the system as a whole has large pages
>>> left. To avoid this we need to ease the requirement on which node
>>> the memory should be allocated on. This can be done by using the
>>> memory policy MPOL_PREFERRED, which prefers a certain node, instead
>>> of MPOL_BIND, which requires a certain node.
>>
>> With your change what happens when the system as a whole
>> runs out of large pages?
>
> The change doesn't do anything specific for large pages; it just sets
> the memory policy to MPOL_PREFERRED so that we never force the use of
> a NUMA node that can't back the given mapping. Running out of large
> pages is still handled in the same way: we prefer that the memory is
> allocated on a given NUMA node, but if that isn't possible we'll use
> another.
>
> I've verified that this is actually what happens by running SPECjbb
> with an increasing heap and only a few large pages configured on the
> system. When the large pages are all used, we fall back to
> regular-sized pages and everything runs along just fine.
>
> Thanks for highlighting this case, Jon.
>
> Stefan
>
>>
>> Jon
>>
>>>
>>> Testing:
>>> To verify the fix I've run Kitchensink as described in the bug
>>> report, but also done some manual testing. To sanity test
>>> performance I've run SPECjbb2005 with and without UseNUMA before and
>>> after the fix and I haven't seen any problem. I also ran SPECjbb2005
>>> on a system where one NUMA node has been configured with no large
>>> pages while the other has enough for the test. Without the fix this
>>> crashes immediately, but with the fix the results are sane.
>>>
>>> Thanks,
>>> Stefan
>>
>
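
To illustrate the difference between the two policies discussed in this
thread, here is a toy model in Python. It is only a sketch of the
behaviour described above, not the kernel's or HotSpot's implementation:
the names allocate, OutOfPages and free_pages are made up for this
example, the exception merely stands in for the SIGBUS, and the order in
which other nodes are tried is an assumption.

```python
class OutOfPages(Exception):
    """Stands in for the SIGBUS hit when a strictly bound node is empty."""


def allocate(free_pages, node, strict):
    """Take one large page and return the node it came from.

    free_pages maps node id -> number of free large pages (hypothetical).
    strict=True models MPOL_BIND (the node is required);
    strict=False models MPOL_PREFERRED (the node is only preferred).
    Returns None when no large pages are left anywhere, which models
    falling back to regular-sized pages.
    """
    if free_pages.get(node, 0) > 0:
        free_pages[node] -= 1
        return node
    if strict:
        # MPOL_BIND: no other node may be used, so the request fails.
        raise OutOfPages("node %d has no large pages left" % node)
    # MPOL_PREFERRED: try the remaining nodes (order is an assumption).
    for other in sorted(free_pages):
        if free_pages[other] > 0:
            free_pages[other] -= 1
            return other
    return None
```

With free_pages = {0: 0, 1: 2}, a strict request for node 0 raises
OutOfPages, while a preferred request for node 0 is served from node 1.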