RFR: 8286030: Avoid JVM crash when containers share the same /tmp dir [v2]

Thomas Stuefe stuefe at openjdk.org
Fri Jul 8 07:13:25 UTC 2022


On Thu, 7 Jul 2022 21:21:27 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Some Kubernetes setups share the /tmp directory across multiple containers. On rare occasions, the JVM may crash when it tries to write to `/tmp/hsperfdata_<user>/<pid>` when a process in a separate container decides to do the same thing (because they happen to have the same namespaced pid).
>> 
>> This patch avoids the crash by using `flock()` to allow only one of these processes to write to the file. All other competing processes that fail to grab the lock will give up the file and run with PerfMemory disabled. We will try to enable PerfMemory for the failed processes in a follow-up RFE: [JDK-8289883](https://bugs.openjdk.org/browse/JDK-8289883)
>> 
>> Thanks to Vitaly Davidovich and Nico Williams for coming up with the idea of using `flock()`.
>> 
>> I kept the use of `kill()` for stale file detection to be compatible with older JVMs.
>> 
>> I also took the opportunity to clean up the comments and remove dead code. The old code was using "shared memory resources" which sounds unclear and odd. I changed the terminology to say "shared memory file" instead.
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   @tstuefe comments

src/hotspot/os/posix/perfMemory_posix.cpp line 743:

> 741: 
> 742:     // We now have a file name that converts to a valid integer
> 743:     // that could represent a process id.

"could"? It does, or? Why being ambiguous? (If we change the scheme later, we can be more specific)

src/hotspot/os/posix/perfMemory_posix.cpp line 744:

> 742:     // We now have a file name that converts to a valid integer
> 743:     // that could represent a process id.
> 744:     //

Can you add a short introduction here similar to:

"On Linux, VMs may run in different PID namespaces and therefore their PIDs are not unique. Therefore, we first try to flock() the file ..."

src/hotspot/os/posix/perfMemory_posix.cpp line 759:

> 757:     // signal the process, then the file is assumed to
> 758:     // be stale and is removed because the files for such a
> 759:     // process should be in a different user specific directory.

Please add a short sentence like:

"Finally, if VMs from different containers share /tmp, we cannot reliably determine process liveness using kill. To avoid that, either stop sharing /tmp between containers or only run VMs with the JDK-8286030 fix."

src/hotspot/os/posix/perfMemory_posix.cpp line 766:

> 764:     // The locking protocol works only with JVMs that have the JDK-8286030 fix.
> 765:     // If you are sharing the /tmp difrectory among different containers, do not
> 766:     // use older JVMs that don't have this fix.

Could possible shorten this if you feel the long comment above explains it all already.

src/hotspot/os/posix/perfMemory_posix.cpp line 796:

> 794:     if (!is_locked_by_another_process) {
> 795:       if ((pid == os::current_process_id()) ||
> 796:           (kill(pid, 0) == OS_ERR && (errno == ESRCH || errno == EPERM))) {

Thinking about this, I now was confused. AFAICS this code is only ever called from `mmap_create_shared` with the directory containing our own pid. I do not see this called for PIDs from other processes. Why do we handle that case? Or am I overlooking something?

-------------

PR: https://git.openjdk.org/jdk/pull/9406


More information about the serviceability-dev mailing list