SIGBUS on linux in perfMemory_init
Severin Gehwolf
sgehwolf at redhat.com
Fri May 6 08:40:31 UTC 2022
On Thu, 2022-05-05 at 13:48 -0700, Ioi Lam wrote:
>
>
> On 5/3/2022 8:41 AM, Nico Williams wrote:
> > On Fri, Apr 29, 2022 at 09:44:00AM -0400, Vitaly Davidovich wrote:
> > > As for possible solutions, would it be possible to use the global PID
> > > instead of the namespaced PID to "regain" the uniqueness invariant of the
> > > PID? Also, might it make sense to flock() the file to prevent another
> > > process from mucking with it?
> > My unsolicited, outsider opinions:
> >
> > - Sharing /tmp across containers is a Bad Idea (tm).
> >
> > - Sharing /tmp across related containers (in a pod) is not _as_ bad an
> > idea.
> >
> > (It might be a way to implement some cross-container communications,
> > though it would be better to have an explicit mechanism for that
> > rather than the rather-generic /tmp.)
> >
> > - Containerizing apps that *do* communicate over /tmp might be one
> > reason one might configure a shared /tmp in a pod.
> >
> > Some support for such a configuration might be needed.
> >
> > (Alternatively, pods that share /tmp should also share a PID
> > namespace.)
> >
> > - Since there is an option to not have an mmap'ed hsperf file, it might
> > be nice to have an option to use the global PID for naming hsperf
> > files. Or, better, implement an automatic mechanism for detecting
> > conflict and switching to global PID for naming hsperf files (or
> > switching to anonymous hsperf mmaps).
> >
> > - In any case, on systems that have a real flock(2), using flock(2) for
> > liveness testing is better than kill(2) with signal 0 -- the latter
> > has false positives, while the former does not [provided O_CLOEXEC is
> > used].
> >
> > For this reason, and though I am not too sympathetic to the situation
> > that caused this crash, I believe that it would be better to have
> > some sort of fix for this problem than to declare it a non-problem
> > and not-fix it.
> >
> >
> > I would like to expand on Vitaly's mention of flock(2). Using the
> > global PID would leave the JVM unable to use kill(2) with signal 0 for
> > liveness detection during hsperf garbage file collection. Using kill(2)
> > with signal 0 for liveness is not that reliable anyways because of PID
> > reuse -- it can have false positives.
> >
> > A better mechanism for liveness detection would be to have the owning
> > JVM take an exclusive (LOCK_EX) flock(2) on the hsperf file at startup,
> > and for hsperf garbage file collection to try (LOCK_NB) to get an
> > exclusive lock (LOCK_EX) on a candidate hsperf garbage file as a
> > liveness detection mechanism.
> >
> > When using the namespaced PID the kill(2) with signal 0 method of
> > liveness detection should still be used for backwards-compatibility in,
> > e.g., jvisualvm.
> >
> > Using flock(2) would be less portable than kill(2) with signal 0, but
> > already there is a bunch of Linux-specific code here looking through
> > /proc, and Linux does have a real flock(2).
> >
> > An adaptive, zero-conf hsperf file naming scheme might use the
> > namespaced PID if available (i.e., if an exclusive flock(2) could be
> > obtained on the file), or the global PID if not, with some indication in
> > the file's name of which kind of PID was used.
>
> Hi Nico,
>
> I read your message again and now I totally agree with using flock(2) :-)
>
> As you said, we should start with getpid(). That way the behavior is
> compatible with older versions of jcmd tools, especially when Java is
> used outside of containers.
>
> One thing I realized is that if we have a collision, we don't need to
> use a globally unique ID. We just need an ID that's unique in the
> directory being written into.
>
> I think we can do this on the VM side:
>
> String id = getpid();
> while (true) {
>     String file = "/tmp/hsperfdata_" + username() + "/" + id;
>     if (get_exclusive_access(file)) {
>         // I won the contest and
>         // (a) the file didn't exist, or
>         // (b) the file existed but the JVM that used it has died
>         return file;
>     }
>     // Add an "x" here so we don't collide with the getpid() of
>     // another process
>     id = "x" + random();
> }
>
> On the tools side, we can do the pid -> rendezvous file mapping as I
> described in the other e-mail.
If we could limit this special trick to the cases where it's actually
needed, then this would be my preference. For one, it mostly keeps
compatibility with older JVMs, and for another, this isn't a very common
use case, so we shouldn't penalize the 90% of use cases which aren't
affected by this.
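To make the quoted idea concrete, here is a rough sketch in Python (not HotSpot code) of how get_exclusive_access() could be built on a non-blocking exclusive flock(2), with the "x" + random fallback used only when the PID-named file is held by a live owner. The names try_exclusive and claim_perf_file are made up for illustration, and the sketch assumes Linux flock(2) semantics. It ignores the race with a concurrent unlink and does not truncate stale file contents.

```python
import fcntl
import os
import secrets


def try_exclusive(path):
    """Open (or create) path and try a non-blocking exclusive flock(2).

    Returns the open fd on success; the caller keeps it open for as long
    as it wants to be seen as alive. Returns None if another open file
    description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_CLOEXEC, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Somebody else holds the lock, so the owner is still running.
        os.close(fd)
        return None
    return fd


def claim_perf_file(directory, pid=None):
    """Claim a file named after the PID, falling back to "x" + random.

    The fallback only needs to be unique within `directory`, not globally.
    """
    name = str(os.getpid() if pid is None else pid)
    while True:
        path = os.path.join(directory, name)
        fd = try_exclusive(path)
        if fd is not None:
            return path, fd
        # The "x" prefix keeps the name out of the numeric PID space.
        name = "x" + secrets.token_hex(4)
```

In the common case (no collision) the first attempt succeeds and the file keeps its plain PID name, so older tools would not notice a difference.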
On the other hand, 'man proc' tells me this about /proc/*/environ:
"""
This file contains the initial environment that was set when the
currently executing program was started via execve(2). [...]
If, after an execve(2), the process modifies its environment (e.g.,
by calling functions such as putenv(3) or modifying the environ(7)
variable directly), this file will not reflect those changes.
[...]
Permission to access this file is governed by a ptrace access mode
PTRACE_MODE_READ_FSCREDS check; see ptrace(2).
"""
So reliably publishing which file was used will be a challenge. Both
approaches, shared memory mapping and setting the environment, will need
to pass the PTRACE_MODE_READ_FSCREDS check, which I think isn't
generally granted for containers.
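As a small illustration of the first point in the man page excerpt, the following Python sketch (Linux only) sets a variable after startup and then looks for it in /proc/self/environ. The variable name HSPERF_FILE_DEMO and its value are hypothetical and chosen only for this example. Going by the man page text, the late change should not be visible there.

```python
import os


def initial_environ(pid="self"):
    """Return the environment recorded at execve(2) time as a bytes dict."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        raw = f.read()
    # Entries are NUL-separated "NAME=value" byte strings.
    entries = [e for e in raw.split(b"\0") if b"=" in e]
    return dict(e.split(b"=", 1) for e in entries)


# Hypothetical variable a process might set after startup to publish
# the file name it ended up using.
os.environ["HSPERF_FILE_DEMO"] = "/tmp/hsperfdata_user/x1234"

print(b"HSPERF_FILE_DEMO" in initial_environ())
```

Reading another process's environ this way is additionally subject to the ptrace access check quoted above.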
Thanks,
Severin