RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v29]
Thomas Stuefe
stuefe at openjdk.org
Thu Jan 25 08:30:41 UTC 2024
On Mon, 22 Jan 2024 06:39:52 GMT, Liming Liu <duke at openjdk.org> wrote:
>> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14).
>>
>> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernels 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. The table below records the time spent on the test in *seconds*, measured with `VERBOSE=time`. The patch improves the time when the system call is supported and does no harm when it is not:
>>
>> <table>
>> <tr>
>> <th>Kernel</th>
>> <th colspan="2"><tt>-XX:-TransparentHugePages</tt></th>
>> <th colspan="2"><tt>-XX:+TransparentHugePages</tt></th>
>> </tr>
>> <tr><td></td><td>Unpatched</td><td>Patched</td><td>Unpatched</td><td>Patched</td></tr>
>> <tr><td>4.18</td><td>11.30</td><td>11.30</td><td>0.25</td><td>0.25</td></tr>
>> <tr><td>5.13</td><td>0.22</td><td>0.22</td><td>3.42</td><td>3.42</td></tr>
>> <tr><td>6.1</td><td>0.27</td><td>0.33</td><td>3.54</td><td>0.33</td></tr>
>> </table>
>
> Liming Liu has updated the pull request incrementally with two additional commits since the last revision:
>
> - Use TestThreadGroup
> - Set it as default before parsing
I like this version. Some nits remain. Thank you for your patience.
src/hotspot/os/linux/globals_linux.hpp line 96:
> 94: \
> 95: product(bool, UseMadvPopulateWrite, false, DIAGNOSTIC, \
> 96: "Use MADV_POPULATE_WRITE in os::pd_pretouch_memory.") \
I would make this default true. We need the flag as a fallback mechanism, so that if we encounter problems we can exclude the madvise as a possible cause. But seeing that the perf gains are real and significant, I would enable it by default.
src/hotspot/os/linux/os_linux.cpp line 2972:
> 2970: ", %d) failed; error='%s' (errno=%d)",
> 2971: p2i(first), len, MADV_POPULATE_WRITE,
> 2972: os::strerror(err), err);
What other things can go wrong here besides missing kernel support?
Unconditional log output (with log_warning) is tricky. Many tools parse the JVM output and are thrown off by unexpected content. That's why we restrict log_warning to the small band of "stuff that can go wrong at a customer but it is so severe we really need to tell the customer right now".
Stuff that should never go wrong should be assert()ed, or possibly guarantee()'d.
Stuff that can go wrong but is not as severe should be reported at a lower log level.
In this case, output may get flooded with warnings if we continue running the VM and repeat the pretouch attempts with other areas.
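One way to avoid the flooding described above is to report the failure only once, and at a low level. The following is a minimal standalone sketch, not the code from the PR: `populate_failure_reported`, `claim_first_report` and `report_populate_failure` are hypothetical names, and the `fprintf` call stands in for whatever lower-level logging call the VM would actually use.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical guard: becomes true once the first failure has been reported.
static std::atomic<bool> populate_failure_reported{false};

// Returns true only for the first caller, so at most one message is emitted
// even if several threads fail at the same time.
static bool claim_first_report() {
  return !populate_failure_reported.exchange(true, std::memory_order_relaxed);
}

// Reports a failed MADV_POPULATE_WRITE attempt at most once.
// Assumption: stderr stands in for a low-level (e.g. debug) log channel.
static void report_populate_failure(const void* start, size_t len, int err) {
  if (!claim_first_report()) {
    return;  // already reported; stay silent for later pretouch attempts
  }
  std::fprintf(stderr,
               "[debug] madvise(%p, %zu, MADV_POPULATE_WRITE) failed; "
               "error='%s' (errno=%d)\n",
               start, len, std::strerror(err), err);
}
```

With this shape, repeated pretouch attempts on other areas produce a single line of output instead of one warning per area.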
src/hotspot/os/linux/os_linux.cpp line 4402:
> 4400:
> 4401: // Check the availability of MADV_POPULATE_WRITE.
> 4402: FLAG_SET_DEFAULT(UseMadvPopulateWrite, (::madvise(0, 0, MADV_POPULATE_WRITE) == 0));
Can we delay this to the first attempt? Switch it off if the first attempt returns EINVAL? Every system call saved at startup is good.
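The lazy check suggested here could look roughly like the sketch below. It is a standalone illustration under stated assumptions, not the PR's code: `use_madv_populate_write` is a hypothetical stand-in for the `UseMadvPopulateWrite` flag, `try_populate_write` and `pretouch` are invented helper names, and the manual fallback is simplified. Note that madvise can also return EINVAL for other reasons (for example a misaligned address), so the sketch assumes a page-aligned range.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>

// Older libc headers may lack the constant; 23 is the Linux UAPI value.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical stand-in for the UseMadvPopulateWrite flag; on by default,
// with no probing system call at startup.
static std::atomic<bool> use_madv_populate_write{true};

// Returns true if the range was populated by madvise. Returns false if the
// caller has to fall back to touching the pages itself.
static bool try_populate_write(void* start, size_t len) {
  assert(start != nullptr && len > 0);  // should never go wrong
  if (!use_madv_populate_write.load(std::memory_order_relaxed)) {
    return false;
  }
  if (::madvise(start, len, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  if (errno == EINVAL) {
    // Assumed to mean "advice unknown to this kernel": stop trying.
    use_madv_populate_write.store(false, std::memory_order_relaxed);
  }
  return false;
}

// Pretouches [start, start + len), falling back to one atomic add of zero
// per page when madvise is unavailable or fails.
static void pretouch(void* start, size_t len, size_t page_size) {
  if (try_populate_write(start, len)) {
    return;
  }
  char* p = static_cast<char*>(start);
  for (size_t off = 0; off < len; off += page_size) {
    __atomic_fetch_add(p + off, 0, __ATOMIC_RELAXED);  // GCC/Clang builtin
  }
}
```

On a kernel without support, the first call pays for one failed madvise and every later call skips the system call entirely, so nothing extra runs at startup.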
-------------
Changes requested by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/15781#pullrequestreview-1843071110
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1465987597
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1465998092
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1466001305