8241354: ZGC: fatal error: Failed to get NUMA id due to get_mempolicy operation not permitted(Internet mail)

Stefan Karlsson stefan.karlsson at oracle.com
Mon Mar 23 09:41:17 UTC 2020


On 2020-03-23 10:06, jiefu(傅杰) wrote:
> Hi StefanK,
>
> Thanks for your review and very nice suggestions.
>
> After more investigation, I found that several NUMA APIs don't work inside Docker, such as get_mempolicy, numa_tonode_memory, ...
> So get_mempolicy isn't the only problematic call.
>
> Thomas also reminded me that the other GCs are affected by this issue too.
> So it would be better to fix them together.
>
> What do you think of http://cr.openjdk.java.net/~jiefu/8241354/webrev.02/ ?

numa_available() is a HotSpot wrapper around the numa_available
function. I don't think you should add this kind of logic inside that
function. Could you move it up to libnuma_init instead?
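
To illustrate the separation, here is a rough standalone sketch, not the actual HotSpot code. NumaProbes, numa_available_wrapper and libnuma_init_sketch are hypothetical names, and the probes are injected so the policy can be exercised without libnuma. The only library fact it relies on is that libnuma's numa_available() returns -1 when NUMA is unavailable.

```cpp
#include <functional>

// Hypothetical stand-ins; none of these are HotSpot's real declarations.
struct NumaProbes {
  std::function<int()> numa_available;       // forwards to libnuma; -1 means unavailable
  std::function<bool()> syscalls_permitted;  // e.g. a get_mempolicy probe
};

// Thin wrapper: forwards the call and applies no policy of its own.
int numa_available_wrapper(const NumaProbes& probes) {
  return probes.numa_available();
}

// Initialization: the place where the "is NUMA really usable?" policy lives.
bool libnuma_init_sketch(const NumaProbes& probes) {
  if (numa_available_wrapper(probes) == -1) {
    return false;  // libnuma itself reports that NUMA is unavailable
  }
  if (!probes.syscalls_permitted()) {
    return false;  // library is present, but the environment blocks the syscalls
  }
  return true;
}
```

The wrapper stays a plain forwarder, and the decision about what to do in a restricted container is made in one place during initialization.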

If you intend this to be a generic (non-ZGC) change, then I think it 
would be good to create a new RFR and send it to hotspot-dev, so that 
the Runtime team and others also see it.

Thanks,
StefanK

>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2020/3/23, 4:43 PM, "Stefan Karlsson" <stefan.karlsson at oracle.com> wrote:
>
>      Hi Jie,
>      
>      On 2020-03-22 14:35, jiefu(傅杰) wrote:
>      > Hi Erik,
>      >
>      > Thanks for your review and valuable comments.
>      >
>      > Updated: http://cr.openjdk.java.net/~jiefu/8241354/webrev.01/
>      >
>      > Please review it.
>      
>      Thanks for providing this patch.
>      
>      If it is only the get_mempolicy that is problematic, then I wonder if it
>      would be better to leave the UseNUMA flag untouched and only turn off
>      the ZGC specific NUMA parts. Maybe something like this:
>      
>      static bool check_get_mempolicy_support() {
>         int dummy = 0;
>         int mode = -1;
>         // Check whether get_mempolicy is supported or not
>         if (ZSyscall::get_mempolicy(&mode, NULL, 0, (void*)&dummy,
>                                     MPOL_F_NODE | MPOL_F_ADDR) == -1) {
>           if (!FLAG_IS_DEFAULT(UseNUMA)) {
>             warning("ZGC NUMA support is disabled since get_mempolicy is unsupported.");
>           }
>           return false;
>         }
>      
>         return true;
>      }
>      
>      void ZNUMA::initialize_platform() {
>         _enabled = UseNUMA && check_get_mempolicy_support();
>      }
>      
>      An alternative would be to take this a step further (probably as a
>      separate RFR) and provide a user friendly output in our -Xlog:gc+init
>      output:
>      
>      [0.015s][info][gc,init] Initializing The Z Garbage Collector
>      [0.015s][info][gc,init] Version: 15-internal+0-2020-03-04-0947497.stefank... (fastdebug)
>      [0.015s][info][gc,init] NUMA Support: Unsupported <== HERE
>      [0.015s][info][gc,init] CPUs: 32 total, 32 available
>      [0.015s][info][gc,init] Memory: 128851M
>      [0.015s][info][gc,init] Large Page Support: Disabled
>      [0.015s][info][gc,init] Medium Page Size: 32M
>      [0.015s][info][gc,init] Workers: 20 parallel, 4 concurrent
>      
>      Borrowing the structure from how UseLargePages is set up and printed:
>      
>      void ZLargePages::initialize_platform() {
>         if (UseLargePages) {
>           if (UseTransparentHugePages) {
>             _state = Transparent;
>           } else {
>             _state = Explicit;
>           }
>         } else {
>           _state = Disabled;
>         }
>      }
>      
>      const char* ZLargePages::to_string() {
>         switch (_state) {
>         case Explicit:
>           return "Enabled (Explicit)";
>      
>         case Transparent:
>           return "Enabled (Transparent)";
>      
>         default:
>           return "Disabled";
>         }
>      }
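
Carried over to NUMA, the same pattern might look roughly like the sketch below. This is only an illustration of the idea, not code from any webrev: NumaState, numa_state_sketch, numa_to_string and numa_init_log_line are hypothetical names, and the two bool inputs stand in for UseNUMA and the result of the get_mempolicy check.

```cpp
#include <string>

// Hypothetical tri-state, mirroring the _state member of ZLargePages.
enum class NumaState { Disabled, Enabled, Unsupported };

// use_numa models the UseNUMA flag; syscall_ok models the outcome of
// probing get_mempolicy during initialization.
NumaState numa_state_sketch(bool use_numa, bool syscall_ok) {
  if (!use_numa) {
    return NumaState::Disabled;
  }
  return syscall_ok ? NumaState::Enabled : NumaState::Unsupported;
}

const char* numa_to_string(NumaState state) {
  switch (state) {
  case NumaState::Enabled:
    return "Enabled";

  case NumaState::Unsupported:
    return "Unsupported";

  default:
    return "Disabled";
  }
}

// Builds the text that would follow the [gc,init] tags in the log.
std::string numa_init_log_line(NumaState state) {
  return std::string("NUMA Support: ") + numa_to_string(state);
}
```

A separate Unsupported state lets the log distinguish "the user did not ask for NUMA" from "NUMA was requested but the environment does not allow it", which a plain boolean cannot express.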
>      
>      Thanks,
>      StefanK
>      
>      >
>      > Thanks a lot.
>      > Best regards,
>      > Jie
>      >
>      > On 2020/3/22, 4:26 PM, "Erik Österlund" <erik.osterlund at oracle.com> wrote:
>      >
>      >      Hi Jie,
>      >
>      >      It seems to me that if the environment doesn’t supply the required NUMA APIs, then we really should disable UseNUMA instead. I propose we check the availability of the syscall during initialization instead, and switch off all NUMA functionality when appropriate. And we should only print a warning if the user explicitly supplied UseNUMA on the command line.
>      >
>      >      Thanks,
>      >      /Erik
>      >
>      >      > On 20 Mar 2020, at 13:15, jiefu(傅杰) <jiefu at tencent.com> wrote:
>      >      >
>      >      > Hi all,
>      >      >
>      >      > JBS:    https://bugs.openjdk.java.net/browse/JDK-8241354
>      >      > Webrev: http://cr.openjdk.java.net/~jiefu/8241354/webrev.00/
>      >      >
>      >      > A VM fatal error may be observed if ZGC is used.
>      >      >
>      >      > The background is that some of our products run in Docker.
>      >      > For security reasons, SYS_get_mempolicy is not allowed in Docker.
>      >      >
>      >      > It might not be good practice to raise a fatal error when get_mempolicy fails.
>      >      > What do you think?
>      >      >
>      >      > Thanks a lot.
>      >      > Best regards,
>      >      > Jie
>      >
>      >
>      >
>      >
>      
>      
>      
>




More information about the hotspot-gc-dev mailing list