Question: 8324776: How to safely tell if MADV_POPULATE_WRITE is supported

Thomas Stüfe thomas.stuefe at gmail.com
Thu Feb 22 20:35:11 UTC 2024


Hi Patrick,

So, if I understand this correctly, the Oracle kernels break binary
compatibility with mainline kernels. That's not good, especially since
you really don't want something like MADV_DOEXEC to be enabled by accident,
not least from a security standpoint.
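
For reference, the probe under discussion can be sketched in C roughly as
follows. This is only a minimal sketch, not the HotSpot code: the helper name
madvise_advice_accepted is made up for illustration, and the fallback value 23
is the mainline definition quoted further down. The probe only asks whether
the kernel accepts the number, not what the number means, which is why it
gives a misleading answer on a kernel that assigns 23 to something else.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Mainline value since Linux 5.14; older userspace headers may lack it. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: returns 1 if madvise(0, 0, advice) succeeds, i.e. the
 * running kernel accepts this advice number, and 0 otherwise.
 * It cannot tell which meaning the kernel attaches to that number.
 */
static int madvise_advice_accepted(int advice) {
    return madvise(NULL, 0, advice) == 0;
}
```

A caller would use madvise_advice_accepted(MADV_POPULATE_WRITE) as the
support check; on UEK 5.4.17 a positive answer would actually refer to
MADV_DONTEXEC.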

Funny how they just keep moving MADV_DOEXEC/DONTEXEC up to the end of the
numerical range. They quite literally just kick the can down the road :-(

I don't see a good way to solve this apart from starting to add
distro-specific recognition code, e.g. scanning /etc/os-release, which
would be rather annoying. AFAICS, we have avoided that so far.
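
To make the distro-specific route concrete, here is a rough sketch of what
scanning /etc/os-release could look like in C. It is an illustration only,
not existing HotSpot code: the helper names os_release_value and
is_oracle_linux are hypothetical, and the assumption that Oracle Linux
reports ID="ol" should be double-checked on a real system.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find KEY=value (value optionally quoted) in
 * os-release-style text. Returns 1 and copies the value into out on
 * success, 0 if the key is absent.
 */
static int os_release_value(FILE *f, const char *key, char *out, size_t out_len) {
    char line[512];
    size_t key_len = strlen(key);
    if (out_len == 0) return 0;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Require an exact key match, so that "ID" does not match "ID_LIKE". */
        if (strncmp(line, key, key_len) != 0 || line[key_len] != '=') continue;
        char *val = line + key_len + 1;
        val[strcspn(val, "\r\n")] = '\0';          /* drop the line ending */
        size_t n = strlen(val);
        if (n >= 2 && (val[0] == '"' || val[0] == '\'') && val[n - 1] == val[0]) {
            val[n - 1] = '\0';                     /* strip matching quotes */
            val++;
        }
        snprintf(out, out_len, "%s", val);
        return 1;
    }
    return 0;
}

/* Assumption: Oracle Linux identifies itself with ID="ol". */
static int is_oracle_linux(const char *path) {
    FILE *f = fopen(path, "r");
    if (f == NULL) return 0;
    char id[64];
    int found = os_release_value(f, "ID", id, sizeof id);
    fclose(f);
    return found && strcmp(id, "ol") == 0;
}
```

Note that is_oracle_linux("/etc/os-release") would identify the distribution,
not the running kernel, and inside a container the file describes the image
rather than the host, so a check like this could only ever be a heuristic.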

Cheers, Thomas









On Thu, Feb 22, 2024 at 3:55 AM Patrick Zhang OS <
patrick at os.amperecomputing.com> wrote:

>
>
> >> Is this a downstream-only patch specific to the Oracle Linux kernel?
>
> Yes, I think so.
>
> The linux-uek repo [1] says that the uek6/u2 branch is for kernel v5.4
> on Oracle Linux 7 and 8. The header file there contains the definitions
> of 22/23; the commit is [2]. At the same time, mainline Linux's highest
> mode number was 21, so it did not have 22/23. I have attached a PDF
> describing the differences between various kernels, mainline vs. uek, see [3].
>
>
>
> By the way, uek kernel v5.15 uses numbers 24/25 [4] for these two
> customized modes, which creates another conflict with mainline, where the
> same numbers came into use in 5.18-rc1 and 6.1. I would suggest that the
> uek kernel put customized modes into a separate range, e.g., 100+, to
> avoid conflicts.
>
>
>
> [1] https://github.com/oracle/linux-uek/
>
> [2]
> https://github.com/oracle/linux-uek/commit/a91ae4fa327d8957e2f806420b8b835269b85bd4
>
> mm: introduce MADV_DOEXEC
>
> madvise MADV_DOEXEC preserves a memory range across exec.  Initially
> only supported for non-executable, non-stack, anonymous memory.
> MADV_DOEXEC is single-use and after exec madvise must be done again to
> preserve memory for a subsequent exec.
>
> MADV_DONTEXEC reverts the effect of a previous MADV_DOEXEC call and
> undoes the preservation of the range.
>
>
>
> #define MADV_DOEXEC     22      /* do inherit across exec */
> #define MADV_DONTEXEC   23      /* don't inherit across exec */
>
>
>
> [3]
> https://bugs.openjdk.org/secure/attachment/108337/madvise_return_values.pdf,
> at https://bugs.openjdk.org/browse/JDK-8324776
>
> [4]
> https://github.com/oracle/linux-uek/commit/4693c5d9d799eb4803c5afc781cc60e2b645e398
>
>
>
> Regards, Patrick
>
>
>
> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> *Sent:* Thursday, February 22, 2024 0:56
> *To:* Patrick Zhang OS <patrick at os.amperecomputing.com>
> *Cc:* hotspot-dev developers <hotspot-dev at openjdk.java.net>;
> hotspot-gc-dev at openjdk.java.net
> *Subject:* Re: Question: 8324776: How to safely tell if
> MADV_POPULATE_WRITE is supported
>
>
>
> Hi,
>
>
>
> I am trying to understand this better.
>
>
>
> MADV_POPULATE_READ and MADV_POPULATE_WRITE were added to mainline with
>
>
>
> ```
>
>  commit 4ca9b3859dac14bbef0c27d00667bb5b10917adb
>  Author: David Hildenbrand <david at redhat.com>
>  Date:   Wed Jun 30 18:52:28 2021 -0700
>      mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables
>
> ```
>
>
>
> and they used 22/23 as numerical identifiers:
>
>
>
> ```
>
> +#define MADV_POPULATE_READ     22      /* populate (prefault) page tables readable */
> +#define MADV_POPULATE_WRITE    23      /* populate (prefault) page tables writable */
>
> ```
>
>
>
> But I don't see a patch in mainline for MADV_DOEXEC. I do see a very
> lengthy RFC thread that seemed to end in patent disputes (?):
>
>
> https://lore.kernel.org/all/1595869887-23307-1-git-send-email-anthony.yznaga@oracle.com/T/#u
>
>
>
> Is this a downstream-only patch specific to the Oracle Linux kernel?
>
>
>
> Cheers, Thomas
>
>
>
>
>
>
>
> On Wed, Feb 21, 2024 at 4:25 PM Patrick Zhang OS <
> patrick at os.amperecomputing.com> wrote:
>
> Hi,
>
>
>
> I have a question regarding the regression reported by [1] which was
> caused by the commit [2].
>
>
>
> The root cause can be briefly described as follows: MADV_POPULATE_WRITE has
> been supported since Linux 5.14. The suggested way to tell whether this
> madvise mode is valid is to call madvise(0, 0, advice), which “will return
> zero iff advice is supported by the kernel and can be relied on to probe for
> support” [3]. However, Oracle UEKR6 5.4.17 already uses the number 23 for a
> different mode, MADV_DONTEXEC [4]. As a result, the test passes on mainline
> Linux 5.4 but fails on UEK 5.4.17, because madvise(0, 0, 23) gives different
> return values. See details at [5].
>
>
>
> It is unclear how best to fix this inside the JVM. Thanks for any advice.
>
>
>
> [1] https://bugs.openjdk.org/browse/JDK-8324776
> runtime/os/TestTransparentHugePageUsage.java fails with The usage of THP is
> not enough
>
> [2] https://bugs.openjdk.org/browse/JDK-8315923 pretouch_memory by
> atomic-add-0 fragments huge pages unexpectedly
>
> [3] https://www.man7.org/linux/man-pages/man2/madvise.2.html
>
> [4]
> https://yum.oracle.com/repo/OracleLinux/OL8/developer/UEKR6/aarch64/getPackageSource/kernel-uek-5.4.17-2136.327.2.el8uek.src.rpm,
> linux-5.4.17/include/uapi/asm-generic/mman-common.h#L75
>
> [5]
> https://bugs.openjdk.org/browse/JDK-8324776?focusedId=14651275&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14651275
> .
>
>
>
> Regards
>
> Patrick
>
>