RFR: 8241423: NUMA APIs may fail to work in the docker due to operation not permitted

jiefu(傅杰) jiefu at tencent.com
Fri Apr 3 07:58:18 UTC 2020


Hi Bob,

Thanks for your review and helpful comments. 
                    
I'm not a Docker expert.
Apart from the ZGC crash [1], we haven't come across any other problems in Docker.
                    
It seems that this bug has nothing to do with resource limits.
The root cause is that some NUMA-related syscalls are disabled in Docker by default for security reasons.

Please note that we already have a numa_available() check here [2].
However, it fails to detect such cases.
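
To illustrate the kind of check under discussion, here is a minimal sketch that
probes get_mempolicy directly and treats any failure as "NUMA not usable",
including the "operation not permitted" (EPERM) case. It assumes Linux and the
C library's syscall() wrapper. probe_get_mempolicy and numa_syscalls_usable are
hypothetical names, not HotSpot or libnuma functions, and this is not the code
in the webrev.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not part of HotSpot or libnuma).
 * Returns 0 if get_mempolicy succeeds, otherwise the errno it failed with. */
static int probe_get_mempolicy(void) {
  int mode = 0;
  /* Ask only for the calling thread's policy mode: no nodemask, no address,
   * flags 0. This is a read-only query, so it has no side effects. */
  long rc = syscall(SYS_get_mempolicy, &mode, NULL, 0UL, NULL, 0UL);
  return rc == 0 ? 0 : errno;
}

/* Hypothetical helper: any failure (ENOSYS, EPERM, ...) means the NUMA
 * syscalls cannot be relied on in this environment. */
static bool numa_syscalls_usable(void) {
  return probe_get_mempolicy() == 0;
}
```

A caller could disable NUMA support when numa_syscalls_usable() returns false.
The sketch only covers get_mempolicy; mbind and set_mempolicy are not probed
because calling them may change state.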

What do you think?

Thanks a lot.
Best regards,
Jie
                      
[1] https://bugs.openjdk.java.net/browse/JDK-8241354
[2] http://hg.openjdk.java.net/jdk/jdk/file/f50a7df94744/src/hotspot/os/linux/os_linux.cpp#l3182

On 2020/4/3, 3:58 AM, "Bob Vandette" <bob.vandette at oracle.com> wrote:

    Jie,
    
    Before we discuss this specific fix, I’d like to know if you have confirmed that Hotspot’s NUMA
    support actually functions properly when running in containers (with proper privs)?
    
    Also, do the libnuma functions work properly in response to cgroup limitations imposed by docker run --cpuset-mems?
    
    Some of the traditional kernel functions reporting resource limits only report host values and do not
    correctly report limits specified for containers.   To resolve this issue I have added an osContainer
    class to hotspot.  Included in this class is a function that reports the memory nodes available 
    to hotspot when running in a container.   It might be necessary to query this function when
    trying to configure the hotspot NUMA support.
    
    Back to your webrev, is it not possible to get the address of numa_available and
    then try calling it in order to determine whether NUMA can be used?
    
    If it is determined that you don’t have sufficient access, I would suggest disabling UseNUMA
    altogether.
    
    Bob
    
    > On Mar 23, 2020, at 11:58 AM, jiefu(傅杰) <jiefu at tencent.com> wrote:
    > 
    > Hi all,
    > 
    > JBS:    https://bugs.openjdk.java.net/browse/JDK-8241423
    > Webrev: http://cr.openjdk.java.net/~jiefu/8241423/webrev.00/
    > 
    > A VM fatal error may be observed if ZGC is used (see JDK-8241354).
    > The background is that some of our products run in Docker.
    > For security reasons, SYS_get_mempolicy is not allowed by default [1].
    > 
    > At first, we thought it was a ZGC-only problem and filed JDK-8241354.
    > But Thomas reminded me that other collectors are also affected [2].
    > So it would be better to fix them together.
    > 
    > After more investigation, we found that NUMA APIs are actually dependent on several syscalls, such as get_mempolicy, mbind and set_mempolicy.
    > When the required syscalls are unavailable, NUMA APIs fail to work as expected.
    > 
    > The fix is to check whether the required syscalls are available.
    > In theory, all NUMA-related syscalls should be checked.
    > But that seems hard to do because some of them would cause unexpected side effects.
    > To fix our issue, checking get_mempolicy is enough.
    > And, as Per suggested, we can refine this later if it turns out to be a problem [3].
    > 
    > Please review it and give me some advice.
    > 
    > Thanks a lot.
    > Best regards,
    > Jie
    > 
    > [1] https://docs.docker.com/engine/security/seccomp/
    > [2] https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-March/028923.html
    > [3] https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-March/028933.html
    
    
    


