RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21]

Thu Jan 11 08:23:40 UTC 2024

On Thu, 28 Dec 2023 09:26:19 GMT, Liming Liu <duke at openjdk.org> wrote:

>> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14).
>> 
>> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported:
>> 
>> <table>
>>   <tr>
>>     <th>Kernel</th>
>>     <th colspan="2"><tt>-XX:-TransparentHugePages</tt></th>
>>     <th colspan="2"><tt>-XX:+TransparentHugePages</tt></th>
>>   </tr>
>>   <tr><td></td><td>Unpatched</td><td>Patched</td><td>Unpatched</td><td>Patched</td></tr>
>>   <tr><td>4.18</td><td>11.30</td><td>11.30</td><td>0.25</td><td>0.25</td></tr>
>>   <tr><td>5.13</td><td>0.22</td><td>0.22</td><td>3.42</td><td>3.42</td></tr>
>>   <tr><td>6.1</td><td>0.27</td><td>0.33</td><td>3.54</td><td>0.33</td></tr>
>> </table>
>
> Liming Liu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use pthread instead

I think this patch makes sense and should be done. Sorry for dropping the ball on this one.

Some suggestions inline.

I also would prefer to have a boolean product-diagnostic linux-only switch to disable the madvise pretouching, just to have something that can be disabled when problems appear at a customer and we want to quickly rule out that pretouching is the culprit. We can remove that switch after some time if no issues appeared.

src/hotspot/share/runtime/os.cpp line 2118:

> 2116:   }
> 2117: }
> 2118: 

I suggest a slightly different flow, similar to how we do things in other areas:

os.hpp

private:
bool pd_pretouch_memory(..);
public:
void pretouch_memory(..);

os.cpp

void os::pretouch_memory(..) {
  // Ask platform first
  if (pd_pretouch_memory(..)) {
    return;
  }
  ... do pretouching by touching
}

then provide a pd_pretouch for every platform; let other platforms be just a noop returning false, on Linux - if THPs are enabled and so forth, do the madvise and return true.

One function less in the os namespace, and we don't call back from a pd_... function into a generic function which is unusual.

test/hotspot/jtreg/runtime/os/TestTransparentHugePageUsage.java line 46:

> 44: import java.util.regex.Matcher;
> 45: import java.util.regex.Pattern;
> 46: import jdk.test.lib.process.ProcessTools;

Please add a comment describing what the test does. E.g. "Tests checks that a pretouched java heap appears to use THPs by checking AnonHugePages in smaps". Feel free to find a better formulation.

Does the test fail without madvise, at least sporadically?

So I wonder whether it would be better to run with SerialGC, to limit the pretouching to one thread. That would increase the time window needed for pretouching and give us a higher chance to observe those small pages that appear before khugepaged gets around merging them into THPs.

-------------

Changes requested by stuefe (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/15781#pullrequestreview-1814890863
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1448437292
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1448462880