SIGBUS on linux in perfMemory_init
Ioi Lam
ioi.lam at oracle.com
Thu May 5 20:48:37 UTC 2022
On 5/3/2022 8:41 AM, Nico Williams wrote:
> On Fri, Apr 29, 2022 at 09:44:00AM -0400, Vitaly Davidovich wrote:
>> As for possible solutions, would it be possible to use the global PID
>> instead of the namespaced PID to "regain" the uniqueness invariant of the
>> PID? Also, might it make sense to flock() the file to prevent another
>> process from mucking with it?
> My unsolicited, outsider opinions:
>
> - Sharing /tmp across containers is a Bad Idea (tm).
>
> - Sharing /tmp across related containers (in a pod) is not _as_ bad an
> idea.
>
> (It might be a way to implement some cross-container communications,
> though it would be better to have an explicit mechanism for that
> rather than the rather-generic /tmp.)
>
> - Containerizing apps that *do* communicate over /tmp might be one
> reason one might configure a shared /tmp in a pod.
>
> Some support for such a configuration might be needed.
>
> (Alternatively, pods that share /tmp should also share a PID
> namespace.)
>
> - Since there is an option to not have an mmap'ed hsperf file, it might
> be nice to have an option to use the global PID for naming hsperf
> files. Or, better, implement an automatic mechanism for detecting
> conflict and switching to global PID for naming hsperf files (or
> switching to anonymous hsperf mmaps).
>
> - In any case, on systems that have a real flock(2), using flock(2) for
> liveness testing is better than kill(2) with signal 0 -- the latter
> has false positives, while the former does not [provided O_CLOEXEC is
> used].
>
> For this reason, and though I am not too sympathetic to the situation
> that caused this crash, I believe that it would be better to have
> some sort of fix for this problem than to declare it a non-problem
> and not-fix it.
>
>
> I would like to expand on Vitaly's mention of flock(2). Using the
> global PID would leave the JVM unable to use kill(2) with signal 0 for
> liveness detection during hsperf garbage file collection. Using kill(2)
> with signal 0 for liveness is not that reliable anyways because of PID
> reuse -- it can have false positives.
>
> A better mechanism for liveness detection would be to have the owning
> JVM take an exclusive (LOCK_EX) flock(2) on the hsperf file at startup,
> and for hsperf garbage file collection to try (LOCK_NB) to get an
> exclusive lock (LOCK_EX) on a candidate hsperf garbage file as a
> liveness detection mechanism.
>
> When using the namespaced PID the kill(2) with signal 0 method of
> liveness detection should still be used for backwards-compatibility in,
> e.g., jvisualvm.
>
> Using flock(2) would be less portable than kill(2) with signal 0, but
> already there is a bunch of Linux-specific code here looking through
> /proc, and Linux does have a real flock(2).
>
> An adaptive, zero-conf hsperf file naming scheme might use the
> namespaced PID if available (i.e., if an exclusive flock(2) could be
> obtained on the file), or the global PID if not, with some indication in
> the file's name of which kind of PID was used.

Hi Nico,

I read your message again and now I totally agree with using flock(2) :-)

As you said, we should start with getpid(). That way the behavior is
compatible with older versions of jcmd tools, especially when Java is
used outside of containers.

One thing I realized is that if we have a collision, we don't need to
use a globally unique ID. We just need an ID that's unique in the
directory being written into.

I think we can do this on the VM side:

    String id = getpid();
    while (true) {
        String file = "/tmp/hsperfdata_" + username() + "/" + id;
        if (get_exclusive_access(file)) {
            // I won the contest and
            //   (a) the file didn't exist, or
            //   (b) the file existed but the JVM that used it has died
            return file;
        }
        // Add an "x" here so we don't collide with the getpid() of
        // another process
        id = "x" + random();
    }

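For illustration only, here is a minimal runnable sketch of the loop above,
written in Python because its fcntl.flock() maps directly onto flock(2). It
assumes that get_exclusive_access() means "open the file and take a
non-blocking exclusive flock on it"; the name claim_hsperf_file and the
directory/pid parameters are hypothetical, and this is not how HotSpot
implements it.

```python
import fcntl
import os
import random


def get_exclusive_access(path):
    """Try to claim `path` with a non-blocking exclusive flock(2).

    Returns a file descriptor that holds the lock, or None if some other
    open file description (a live owner) already holds it.
    """
    # O_CLOEXEC keeps the descriptor, and so the lock, out of exec'ed children.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_CLOEXEC, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # EWOULDBLOCK: the file is locked, so its owner is still alive.
        os.close(fd)
        return None
    return fd


def claim_hsperf_file(directory, pid):
    """Start with the pid as the id; on a collision retry with "x" + random."""
    file_id = str(pid)
    while True:
        path = os.path.join(directory, file_id)
        fd = get_exclusive_access(path)
        if fd is not None:
            # The caller keeps fd open for the lifetime of the process;
            # the kernel drops the lock when the process exits.
            return path, fd
        # The "x" prefix keeps the id from colliding with another pid.
        file_id = "x" + str(random.getrandbits(32))
```

Garbage collection of stale files could use the same probe: if
get_exclusive_access() succeeds on a candidate file, no live owner holds it.
The sketch does not handle the race where a file is unlinked between open()
and flock(), which a real implementation would have to consider.
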
On the tools side, we can do the pid -> rendezvous file mapping as I
described in the other e-mail.

Thanks
- Ioi

> Cheers,
>
> Nico