8241354: ZGC: fatal error: Failed to get NUMA id due to get_mempolicy operation not permitted(Internet mail)
Stefan Karlsson
stefan.karlsson at oracle.com
Mon Mar 23 09:41:17 UTC 2020
On 2020-03-23 10:06, jiefu(傅杰) wrote:
> Hi StefanK,
>
> Thanks for your review and very nice suggestions.
>
> After more investigation, I found that several NUMA APIs don't work in Docker, such as get_mempolicy, numa_tonode_memory, ...
> So get_mempolicy isn't the only problematic call.
>
> Thomas also reminded me that the other GCs are affected by this issue too.
> So it would be better to fix them together.
>
> What do you think of http://cr.openjdk.java.net/~jiefu/8241354/webrev.02/ ?
numa_available() is a HotSpot wrapper around the numa_available
function. I don't think you should add this kind of logic inside that
function. Could you move it up to libnuma_init instead?
If you intend this to be a generic (non-ZGC) change, then I think it
would be good to create a new RFR and send it to hotspot-dev, so that
the Runtime team and others also see it.
Thanks,
StefanK
>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2020/3/23, 4:43 PM, "Stefan Karlsson" <stefan.karlsson at oracle.com> wrote:
>
> Hi Jie,
>
> On 2020-03-22 14:35, jiefu(傅杰) wrote:
> > Hi Erik,
> >
> > Thanks for your review and valuable comments.
> >
> > Updated: http://cr.openjdk.java.net/~jiefu/8241354/webrev.01/
> >
> > Please review it.
>
> Thanks for providing this patch.
>
> If it is only the get_mempolicy that is problematic, then I wonder if it
> would be better to leave the UseNUMA flag untouched and only turn off
> the ZGC specific NUMA parts. Maybe something like this:
>
> static bool check_get_mempolicy_support() {
>   int dummy = 0;
>   int mode = -1;
>   // Check whether get_mempolicy is supported or not
>   if (ZSyscall::get_mempolicy(&mode, NULL, 0, (void*)&dummy,
>                               MPOL_F_NODE | MPOL_F_ADDR) == -1) {
>     if (!FLAG_IS_DEFAULT(UseNUMA)) {
>       warning("ZGC NUMA support is disabled since get_mempolicy is unsupported.");
>     }
>     return false;
>   }
>
>   return true;
> }
>
> void ZNUMA::initialize_platform() {
>   _enabled = UseNUMA && check_get_mempolicy_support();
> }
>
> An alternative would be to take this a step further (probably as a
> separate RFR) and provide a user friendly output in our -Xlog:gc+init
> output:
>
> [0.015s][info][gc,init] Initializing The Z Garbage Collector
> [0.015s][info][gc,init] Version: 15-internal+0-2020-03-04-0947497.stefank... (fastdebug)
> [0.015s][info][gc,init] NUMA Support: Unsupported <== HERE
> [0.015s][info][gc,init] CPUs: 32 total, 32 available
> [0.015s][info][gc,init] Memory: 128851M
> [0.015s][info][gc,init] Large Page Support: Disabled
> [0.015s][info][gc,init] Medium Page Size: 32M
> [0.015s][info][gc,init] Workers: 20 parallel, 4 concurrent
>
> Borrowing the structure from how UseLargePages is set up and printed:
>
> void ZLargePages::initialize_platform() {
>   if (UseLargePages) {
>     if (UseTransparentHugePages) {
>       _state = Transparent;
>     } else {
>       _state = Explicit;
>     }
>   } else {
>     _state = Disabled;
>   }
> }
>
> const char* ZLargePages::to_string() {
>   switch (_state) {
>   case Explicit:
>     return "Enabled (Explicit)";
>
>   case Transparent:
>     return "Enabled (Transparent)";
>
>   default:
>     return "Disabled";
>   }
> }
>
> Thanks,
> StefanK
>
> >
> > Thanks a lot.
> > Best regards,
> > Jie
> >
> > On 2020/3/22, 4:26 PM, "Erik Österlund" <erik.osterlund at oracle.com> wrote:
> >
> > Hi Jie,
> >
> > It seems to me that if the environment doesn’t supply the required NUMA APIs, then we really should disable UseNUMA instead. I propose we check the availability of the syscall during initialization instead, and switch off all NUMA functionality when appropriate. And we should only print a warning if the user explicitly supplied UseNUMA on the command line.
> >
> > Thanks,
> > /Erik
> >
> > > On 20 Mar 2020, at 13:15, jiefu(傅杰) <jiefu at tencent.com> wrote:
> > >
> > > Hi all,
> > >
> > > JBS: https://bugs.openjdk.java.net/browse/JDK-8241354
> > > Webrev: http://cr.openjdk.java.net/~jiefu/8241354/webrev.00/
> > >
> > > A VM fatal error may be observed if ZGC is used.
> > >
> > > The background is that some of our products run in Docker.
> > > For security reasons, SYS_get_mempolicy is not allowed in Docker.
> > >
> > > It might not be good practice to generate a fatal error when get_mempolicy fails.
> > > What do you think?
> > >
> > > Thanks a lot.
> > > Best regards,
> > > Jie
> >
> >
> >
> >
>
>
>
>