RFR: 8324781: runtime/Thread/TestAlwaysPreTouchStacks.java failed with Expected a higher ratio between stack committed and reserved [v9]

Thomas Stuefe stuefe at openjdk.org
Thu Jun 6 12:40:51 UTC 2024


On Mon, 6 May 2024 03:33:30 GMT, Liming Liu <duke at openjdk.org> wrote:

>> The testcase failed on Oracle CI since JDK-8315923. The root cause is that Oracle CI runs Linux-5.4.17-UEK where the value of MADV_POPULATE_WRITE (23) is used as MADV_DONTEXEC which is not supported by upstream. This PR solves the testcase failure by checking versions of kernels first, and checking the availability of MADV_POPULATE_WRITE when they are not older than 5.14.
>
> Liming Liu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix the wrong condition

Meanwhile, I am warming to the current approach. I understand that it avoids referring to individual downstream vendors, which I agree may be brittle.

My main concern is to prevent future flag mismatches. Therefore, my proposal is to do what this patch does, but in a more generic way. Essentially, we would encode that, for certain flags, we cannot rely on older kernels correctly ignoring them. But we assume that downstream kernel vendors will at least fix conflicts when they merge in flags from mainline. We sacrifice the ability to benefit from vendor-specific backports, but that is the compromise.

The flags I'd like to guard for now are:
1) UEK7: MADV_DONTNEED_LOCKED   -> MADV_DOEXEC
2) UEK7: MADV_COLLAPSE          -> MADV_DONTEXEC
3) UEK6: MADV_POPULATE_READ     -> MADV_DOEXEC
4) UEK6: MADV_POPULATE_WRITE    -> MADV_DONTEXEC

If the vendor keeps up its routine of just shifting the proprietary flags to the end of the numerical MADV range for each new mainline flag, we will continue to have problems and this list may grow.

The mechanism could be very close to what @limingliu-ampere does now, only a tad more generic. E.g.:


bool os::Linux::can_use_madvise_flag(int someflag) {
  // Have a hardcoded array of { flag, kernel version } tuples.
  // Search it for someflag, and if found, return false if the host kernel version is older than the encoded version.
  // Otherwise return true.
}


and then maybe wrap the madvise call with something like this:


bool os::Linux::checked_madvise(..., someflag) {
  assert(can_use_madvise_flag(someflag));
  // call real madvise
}


in addition to something like this in initialization:


if (UseMadvPopulateWrite && ! can_use_madvise_flag(MADV_POPULATE_WRITE)) {
   FLAG_SET_ERGO(UseMadvPopulateWrite, false);
}
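
To make the three fragments above more concrete, here is a minimal standalone sketch of the table lookup. It is not HotSpot code: the names KernelVersion, GuardedFlag, guarded_flags, parse_kernel_release and checked_madvise_flag are made up for illustration, and the host version is passed in as a parameter so the logic can be exercised without calling uname. The flag values and the mainline versions (5.14 for the two POPULATE flags, 5.18 for MADV_DONTNEED_LOCKED, 6.1 for MADV_COLLAPSE) reflect my understanding of the upstream headers and should be double-checked before being encoded anywhere.

```cpp
#include <cassert>
#include <cstdio>

// Advice values as in mainline <asm-generic/mman-common.h>, defined locally
// (hypothetical names) so the sketch also builds against older headers.
constexpr int kMadvPopulateRead   = 22;
constexpr int kMadvPopulateWrite  = 23;
constexpr int kMadvDontneedLocked = 24;
constexpr int kMadvCollapse       = 25;

struct KernelVersion {
  int kmajor;
  int kminor;
};

// One entry per guarded flag: the first mainline kernel that defines it.
struct GuardedFlag {
  int flag;
  KernelVersion min_version;
};

static const GuardedFlag guarded_flags[] = {
  { kMadvPopulateRead,   {5, 14} },
  { kMadvPopulateWrite,  {5, 14} },
  { kMadvDontneedLocked, {5, 18} },  // assumption: verify against upstream
  { kMadvCollapse,       {6, 1}  },  // assumption: verify against upstream
};

static bool is_older(KernelVersion a, KernelVersion b) {
  return a.kmajor < b.kmajor || (a.kmajor == b.kmajor && a.kminor < b.kminor);
}

// Extracts major.minor from a release string such as "5.4.17";
// anything after the minor number is ignored.
static bool parse_kernel_release(const char* release, KernelVersion* out) {
  return sscanf(release, "%d.%d", &out->kmajor, &out->kminor) == 2;
}

// Returns false only if the flag is in the table and the host kernel is
// older than the mainline version that introduced it.
static bool can_use_madvise_flag(int flag, KernelVersion host) {
  for (const GuardedFlag& g : guarded_flags) {
    if (g.flag == flag) {
      return !is_older(host, g.min_version);
    }
  }
  return true;  // unguarded flags are always allowed
}

// Debug-build guard in the spirit of checked_madvise: asserts that the flag
// is usable on this host and hands it back for the real madvise call.
static int checked_madvise_flag(int flag, KernelVersion host) {
  assert(can_use_madvise_flag(flag, host));
  return flag;
}
```

With a host release of "5.4.17", can_use_madvise_flag(kMadvPopulateWrite, host) returns false, so the initialization code would switch UseMadvPopulateWrite off before any madvise call is attempted. Flags that are not in the table are never blocked.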


Do you like this? Does this make sense?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18592#issuecomment-2152304920


More information about the hotspot-gc-dev mailing list