RFR: 8261401: Add sanity check for UseSHM large pages similar to the one used with hugetlb large pages [v2]
    Thomas Stuefe 
    stuefe at openjdk.java.net
       
    Wed Feb 10 18:47:42 UTC 2021
    
    
  
On Wed, 10 Feb 2021 18:44:26 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Thomas review
>>   
>>   Removed check for IPC_LOCK capability. If a large page mapping fails
>>   the errno is already present in the warning printed out. We could
>>   look at improving this to better explain when EPERM vs ENOMEM occurs.
>
> Looks almost good. Minor nit below.
> > Hi Stefan,
> > Testing UseSHM makes definitely sense. Since SysV shm has more restrictions than the mmap API I am not sure that the former works where the latter does not.
> 
> Nice to hear that you agree Thomas :)
> 
> > About your patch: The shmget test is good, I am not sure about the rest.
> > You first attempt to `shmget(IPC_PRIVATE, page_size, SHM_HUGETLB)`. Then you separately scan for CAP_IPC_LOCK. But would shmget(SHM_HUGETLB) not fail with EPERM if CAP_IPC_LOCK is missing? man shmget says:
> > ```
> >        EPERM  The SHM_HUGETLB flag was specified, but the caller was not
> >               privileged (did not have the CAP_IPC_LOCK capability).
> > ```
> > 
> > 
> > So I think querying for EPERM after shmget should be sufficient? If CAP_IPC_LOCK was missing, the shmget should have failed and we should never get to the point of can_lock_memory().
> 
> This was my initial though as well, but it turns out that the capability is only needed to "lock" more memory than the MEMLOCK-limit defines. So the first `shmget()` will succeed as long as there is at least one large page available (and the MEMLOCK-limit is not smaller than the large page size).
> 
> I could of course design the other check to once again call `shmget()`, but with a size larger than the limit. If that check fails with `EPERM` we don't have the needed capability.
Okay, I see what you did now.
> 
> > About the rlimit test: Assuming the man page is wrong and shmget succeeds even without the capability, you only print this limit if can_lock_memory() returned false. The way I understand this is that the limit can still be the limiting factor even with those capabilities in place.
> 
> In old kernels this was the case but according to `man setrlimit` this is no longer the case:
> 
> ```
> RLIMIT_MEMLOCK
>   The maximum number of bytes of memory that may be locked into RAM ...
>   In Linux kernels before 2.6.9, this limit controlled the amount of memory that could be
>   locked by a privileged process. Since Linux 2.6.9, no limits are placed on the amount of 
>   memory that a privileged process may lock, and this limit instead governs the amount 
>   of memory that an unprivileged process may lock. 
> ```
> 
> So if we have the capability the limit will never be a limiting factor.
> 
> > I think it would make more sense to extend the error reporting if later real "shmget" calls fail. Note that later reserve calls can fail for a number of reasons (huge page pool exhausted, RLIMIT_MEMLOCK reached, SHMMAX or SHMALL reached, commit charge reached its limit...) which means that just reporting on RLIMIT_MEMLOCK would be arbitrary. Whether or not more intelligent error reporting makes sense depends also on whether we think we still need the SysV path at all. I personally doubt that this still makes sense.
> 
> I agree that this warning is not covering all bases, but I though it was a good addition compared to just be silent about it. But I like your idea about just doing the simplest sanity check here and then when a mapping goes bade try to figure out why it did.
> 
> In that case I would only include the first `shmget()` in the sanity check and file a separate issue for improving the warning.
This makes sense, especially since you could add that warning to both SysV and mmap paths since they share at least some of the restrictions. Like huge page pool size. It also would mean you don't pay for some of these tests unless needed.
-------------
PR: https://git.openjdk.java.net/jdk/pull/2488
    
    
More information about the hotspot-dev
mailing list