SIGBUS on linux in perfMemory_init

Wed May 4 19:13:02 UTC 2022

On 5/3/2022 8:41 AM, Nico Williams wrote:
> On Fri, Apr 29, 2022 at 09:44:00AM -0400, Vitaly Davidovich wrote:
>> As for possible solutions, would it be possible to use the global PID
>> instead of the namespaced PID to "regain" the uniqueness invariant of the
>> PID? Also, might it make sense to flock() the file to prevent another
>> process from mucking with it?
> My unsolicited, outsider opinions:
>
>   - Sharing /tmp across containers is a Bad Idea (tm).
>
>   - Sharing /tmp across related containers (in a pod) is not _as_ bad an
>     idea.
>
>     (It might be a way to implement some cross-container communications,
>     though it would be better to have an explicit mechanism for that
>     rather than the rather-generic /tmp.)
>
>   - Containerizing apps that *do* communicate over /tmp might be one
>     reason one might configure a shared /tmp in a pod.
>
>     Some support for such a configuration might be needed.
>
>     (Alternatively, pods that share /tmp should also share a PID
>     namespace.)
>
>   - Since there is an option to not have an mmap'ed hsperf file, it might
>     be nice to have an option to use the global PID for naming hsperf
>     files.  Or, better, implement an automatic mechanism for detecting
>     conflict and switching to global PID for naming hsperf files (or
>     switching to anonymous hsperf mmaps).

I think using the PID in the serviceability files is problematic with 
containers. Currently, the files use the PID returned by getpid(2), which 
is the PID within the container's namespace.

Example: if a process has the host PID of 3456 and namespaced PID of 1, 
the files are accessible by the host as:

      /proc/3456/root/tmp/hsperfdata_user/1
      /proc/3456/root/tmp/.java_pid1

and accessible inside the container as

      /tmp/hsperfdata_user/1
      /tmp/.java_pid1

If you run "jps" on the host, you'd get

     3456 MyApp

If you run "jps" inside the container, you'd get

     1 MyApp

If we use some sort of "global PID", we could name the files as

      /proc/3456/root/tmp/hsperfdata_user/3456
      /proc/3456/root/tmp/.java_pid3456

But there are several problems:

- The file is created by the MyApp process. AFAIK there's no way for a 
containerized process to obtain its host PID (i.e., 3456).
- "jps" no longer works inside the container, because 3456 is not a 
valid process in the container.

Also, what happens if we use nested containers? A process's ID could be 
1, 3456, or 7890, depending on which namespace you are looking at. Which 
one should we use?
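
To make the nesting question concrete: on Linux 4.1 and later, the NSpid 
line in /proc/<pid>/status lists one PID per nested namespace, outermost 
first. As far as I know, the leftmost value is relative to the namespace 
of whoever mounted that procfs, so a process reading its own status 
inside a container never sees the host value. A minimal Python sketch of 
reading that line (the function names are illustrative and not taken 
from the JDK):

```python
def parse_nspid(status_text):
    """Return the PIDs on the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            # Fields after the label are tab-separated PIDs, one per level.
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line, e.g. a kernel older than 4.1


def read_nspid(pid):
    """Read and parse the NSpid line of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_nspid(f.read())
```

For the example above, the host would see [3456, 1] for MyApp, while the 
same call made inside the container would return just [1].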

====================================

I think we should understand the requirements. I am guessing they are:

- "jps" should report the process IDs of all the JVM processes that are 
visible to the current user. The PIDs should be in the current namespace.
- The user can use a PID returned by "jps" to attach to the process.

====================================

How about this:

When a JVM starts up, it creates the hsperfdata file with a UUID instead 
of the PID as its name. E.g.,

     /tmp/hsperfdata_user/0da29ace76f76f61

Today, jps reads /proc/*/status to determine the NSpid, so we are doing 
a lot of processing already. Instead, we should read /proc/*/maps and 
scan for something with this pattern:

     7ffb2536d000-7ffb25375000 rw-s .... /tmp/hsperfdata_user/0da29ace76f76f61

If we can find this pattern, we know we have a JVM process that the 
current user can attach to.
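
A rough sketch of that scan, in Python for brevity. This is only an 
illustration of the idea, not how jps works today; the regular 
expression, the hex-only ID format, and the function names are all 
assumptions:

```python
import os
import re

# Assumed shape of the line: a shared, writable file mapping ("rw-s") whose
# path is /tmp/hsperfdata_<user>/<hex id>. The three skipped fields are the
# offset, device, and inode columns of /proc/<pid>/maps.
HSPERF_MAPPING = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+\s+rw-s\s+\S+\s+\S+\s+\S+\s+"
    r"/tmp/hsperfdata_[^/\s]+/(?P<id>[0-9a-f]+)$"
)


def find_hsperf_id(maps_text):
    """Return the hsperfdata ID mapped by a process, or None if there is none."""
    for line in maps_text.splitlines():
        match = HSPERF_MAPPING.match(line.strip())
        if match:
            return match.group("id")
    return None


def scan_proc(proc_root="/proc"):
    """Map each visible PID (in the current namespace) to its hsperfdata ID."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as f:
                hsperf_id = find_hsperf_id(f.read())
        except OSError:
            continue  # process exited, or we are not allowed to read its maps
        if hsperf_id is not None:
            found[int(entry)] = hsperf_id
    return found
```

The PIDs come from the /proc directory names, so they are already in the 
namespace of whoever runs the scan, and being able to read the maps file 
at all doubles as the "visible to the current user" check.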

When attaching to this process, we will use this file instead:

     /proc/$PID/root/tmp/.java_pid0da29ace76f76f61

This should work with

- shared /tmp across containers
- nested containers

>   - In any case, on systems that have a real flock(2), using flock(2) for
>     liveness testing is better than kill(2) with signal 0 -- the latter
>     has false positives, while the former does not [provided O_CLOEXEC is
>     used].
>
>     For this reason, and though I am not too sympathetic to the situation
>     that caused this crash, I believe that it would be better to have
>     some sort of fix for this problem than to declare it a non-problem
>     and not-fix it.
>
>
> I would like to expand on Vitaly's mention of flock(2).  Using the
> global PID would leave the JVM unable to use kill(2) with signal 0 for
> liveness detection during hsperf garbage file collection.  Using kill(2)
> with signal 0 for liveness is not that reliable anyways because of PID
> reuse -- it can have false positives.
>
> A better mechanism for liveness detection would be to have the owning
> JVM take an exclusive (LOCK_EX) flock(2) on the hsperf file at startup,
> and for hsperf garbage file collection to try (LOCK_NB) to get an
> exclusive lock (LOCK_EX) on a candidate hsperf garbage file as a
> liveness detection mechanism.
>
> When using the namespaced PID the kill(2) with signal 0 method of
> liveness detection should still be used for backwards-compatibility in,
> e.g., jvisualvm.
>
> Using flock(2) would be less portable than kill(2) with signal 0, but
> already there is a bunch of Linux-specific code here looking through
> /proc, and Linux does have a real flock(2).
>
> An adaptive, zero-conf hsperf file naming scheme might use the
> namespaced PID if available (i.e., if an exclusive flock(2) could be
> obtained on the file), or the global PID if not, with some indication in
> the file's name of which kind of PID was used.
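
For reference, the locking scheme described in the quoted text could 
look roughly like the following. It is a Python sketch for illustration 
only, not HotSpot code, and the function names are made up. Python opens 
files close-on-exec by default, which covers the O_CLOEXEC proviso:

```python
import fcntl


def hold_owner_lock(path):
    """Owner side: open the file and take an exclusive, non-blocking lock.

    The lock lasts until the returned file object is closed or the
    process exits, so the caller must keep the object alive.
    """
    f = open(path, "r+b")
    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return f


def owner_is_alive(path):
    """Collector side: True if some process still holds the lock on path."""
    with open(path, "rb") as f:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True  # lock is held, so the owner is still running
        return False  # we got the lock, so nobody owns the file any more
```

Unlike kill(2) with signal 0, the result does not depend on any PID, so 
PID reuse and PID namespaces do not affect it.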

Nico, using flock for liveness detection is somewhat off topic. Do you 
want to create an RFE for it and perhaps start a separate discussion?

Thanks
- Ioi

> Cheers,
>
> Nico