RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly

Thomas Stuefe stuefe at openjdk.org
Wed Oct 4 14:10:42 UTC 2023


On Mon, 18 Sep 2023 07:37:26 GMT, Liming Liu <duke at openjdk.org> wrote:

> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14).
> 
> I ran the newly added jtreg test on 64-core Neoverse-N1 machines with kernels 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. The table below records the time spent on the test in *seconds*, measured with `VERBOSE=time`. The patch improves the time when the system call is supported and does no harm when it is not:
> 
> <table>
>   <tr>
>     <th>Kernel</th>
>     <th colspan="2"><tt>-XX:-TransparentHugePages</tt></th>
>     <th colspan="2"><tt>-XX:+TransparentHugePages</tt></th>
>   </tr>
>   <tr><td></td><td>Unpatched</td><td>Patched</td><td>Unpatched</td><td>Patched</td></tr>
>   <tr><td>4.18</td><td>11.30</td><td>11.30</td><td>0.25</td><td>0.25</td></tr>
>   <tr><td>5.13</td><td>0.22</td><td>0.22</td><td>3.42</td><td>3.42</td></tr>
>   <tr><td>6.1</td><td>0.27</td><td>0.33</td><td>3.54</td><td>0.33</td></tr>
> </table>

Side note, does anyone know why we pretouch memory for *explicit* large pages? I would have thought that memory is already online and as "live" as it can get once it is mmapped.

src/hotspot/os/linux/os_linux.cpp line 2839:

> 2837: #ifndef MADV_POPULATE_WRITE
> 2838:   #define MADV_POPULATE_WRITE 23
> 2839: #endif

Suggestion, as a stupid sanity check (we should have done this for other cases too):

#ifndef MADV_POPULATE_WRITE
  #define MADV_POPULATE_WRITE 23
#else
  static_assert(MADV_POPULATE_WRITE == 23);
#endif

src/hotspot/os/linux/os_linux.cpp line 2902:

> 2900:           p2i(first), p2i(last), page_size,
> 2901:           os::strerror(err), err);
> 2902: }

I don't think this breakout is necessary; I'd do it inline in pd_pretouch.

I'm unsure about `warning` here, since it prints warnings by default. When exactly would this fail? Would UL logging be better, or a native OOM error? If I understand the manpage correctly, one possible error scenario is calling this on write-protected memory, which would be a case for an assert.

src/hotspot/os/linux/os_linux.cpp line 2905:

> 2903: 
> 2904: void os::pd_pretouch_memory(void *first, void *last, size_t page_size) {
> 2905:   size_t len = static_cast<char *>(last) - static_cast<char *>(first) + page_size;

Please use `pointer_delta()` and make len const.
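
Roughly what I have in mind, as a standalone sketch. `pointer_delta_bytes` and `pretouch_range_length` are hypothetical names used only so that the snippet compiles on its own; in the patch you would call HotSpot's `pointer_delta()` directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HotSpot's pointer_delta(): byte distance between
// two pointers, asserting that 'left' is not below 'right'.
static inline size_t pointer_delta_bytes(const void* left, const void* right) {
  const uintptr_t l = reinterpret_cast<uintptr_t>(left);
  const uintptr_t r = reinterpret_cast<uintptr_t>(right);
  assert(l >= r && "left must not be below right");
  return static_cast<size_t>(l - r);
}

// 'first' and 'last' are the start addresses of the first and the last page
// to touch, so the length of the range includes one extra page for 'last'.
static inline size_t pretouch_range_length(const void* first, const void* last,
                                           size_t page_size) {
  const size_t len = pointer_delta_bytes(last, first) + page_size;
  return len;
}
```

The assert catches a swapped `first`/`last` in debug builds, which the raw pointer subtraction would silently turn into a huge length.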

src/hotspot/os/linux/os_linux.cpp line 2911:

> 2909:   if (::madvise(first, len, MADV_POPULATE_WRITE) == -1) {
> 2910:     int err = errno;
> 2911:     if (err == EINVAL) { // Not supported

Would be nice to avoid repeated syscalls to madvise if this fails once; no reason to try again, then.
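
One way to do that, as a sketch only: latch the failure in a flag and skip the syscall from then on. `populate_write_unsupported` and `try_populate_write` are hypothetical names, not part of the patch. The sketch treats EINVAL as "not supported", as the patch does; EINVAL can have other causes too (for example an unaligned address), so latching on it is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>

#ifndef MADV_POPULATE_WRITE
  #define MADV_POPULATE_WRITE 23
#endif

// Set once the kernel has rejected MADV_POPULATE_WRITE with EINVAL.
static std::atomic<bool> populate_write_unsupported{false};

// Hypothetical helper: returns true if the range was populated via madvise,
// false if the caller should fall back to touching the pages itself.
static bool try_populate_write(void* first, size_t len) {
  assert(first != nullptr && len > 0);
  if (populate_write_unsupported.load(std::memory_order_relaxed)) {
    return false;  // Rejected before; do not repeat the syscall.
  }
  if (::madvise(first, len, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  if (errno == EINVAL) {
    // Assumed to mean "not supported by this kernel"; remember it.
    populate_write_unsupported.store(true, std::memory_order_relaxed);
  }
  return false;
}
```

Relaxed ordering should be enough here, since the flag only ever goes from false to true and a racing thread at worst issues one extra madvise call.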

src/hotspot/share/runtime/os.cpp line 2108:

> 2106:     // granularity, so we can touch anywhere in a page.  Touch at the
> 2107:     // beginning of each page to simplify iteration.
> 2108:     void* first = align_down(start, page_size);

Minor nit: since you are touching this, could you make it const too (`void* const`)?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1746954033
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1345784162
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1345785187
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1345796964
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1345849056
PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1345853309

