SIGBUS on linux in perfMemory_init

Ioi Lam ioi.lam at oracle.com
Sun May 8 04:55:33 UTC 2022



On 5/6/2022 1:40 AM, Severin Gehwolf wrote:
> On Thu, 2022-05-05 at 13:48 -0700, Ioi Lam wrote:
>>
>> On 5/3/2022 8:41 AM, Nico Williams wrote:
>>> On Fri, Apr 29, 2022 at 09:44:00AM -0400, Vitaly Davidovich wrote:
>>>> As for possible solutions, would it be possible to use the global PID
>>>> instead of the namespaced PID to "regain" the uniqueness invariant of the
>>>> PID? Also, might it make sense to flock() the file to prevent another
>>>> process from mucking with it?
>>> My unsolicited, outsider opinions:
>>>
>>>    - Sharing /tmp across containers is a Bad Idea (tm).
>>>
>>>    - Sharing /tmp across related containers (in a pod) is not _as_ bad an
>>>      idea.
>>>
>>>      (It might be a way to implement some cross-container communications,
>>>      though it would be better to have an explicit mechanism for that
>>>      rather than the rather-generic /tmp.)
>>>
>>>    - Containerizing apps that *do* communicate over /tmp might be one
>>>      reason one might configure a shared /tmp in a pod.
>>>
>>>      Some support for such a configuration might be needed.
>>>
>>>      (Alternatively, pods that share /tmp should also share a PID
>>>      namespace.)
>>>
>>>    - Since there is an option to not have an mmap'ed hsperf file, it might
>>>      be nice to have an option to use the global PID for naming hsperf
>>>      files.  Or, better, implement an automatic mechanism for detecting
>>>      conflict and switching to global PID for naming hsperf files (or
>>>      switching to anonymous hsperf mmaps).
>>>
>>>    - In any case, on systems that have a real flock(2), using flock(2) for
>>>      liveness testing is better than kill(2) with signal 0 -- the latter
>>>      has false positives, while the former does not [provided O_CLOEXEC is
>>>      used].
>>>
>>>      For this reason, and though I am not too sympathetic to the situation
>>>      that caused this crash, I believe that it would be better to have
>>>      some sort of fix for this problem than to declare it a non-problem
>>>      and not-fix it.
>>>
>>>
>>> I would like to expand on Vitaly's mention of flock(2).  Using the
>>> global PID would leave the JVM unable to use kill(2) with signal 0 for
>>> liveness detection during hsperf garbage file collection.  Using kill(2)
>>> with signal 0 for liveness is not that reliable anyways because of PID
>>> reuse -- it can have false positives.
>>>
>>> A better mechanism for liveness detection would be to have the owning
>>> JVM take an exclusive (LOCK_EX) flock(2) on the hsperf file at startup,
>>> and for hsperf garbage file collection to try (LOCK_NB) to get an
>>> exclusive lock (LOCK_EX) on a candidate hsperf garbage file as a
>>> liveness detection mechanism.
>>>
>>> When using the namespaced PID the kill(2) with signal 0 method of
>>> liveness detection should still be used for backwards-compatibility in,
>>> e.g., jvisualvm.
>>>
>>> Using flock(2) would be less portable than kill(2) with signal 0, but
>>> already there is a bunch of Linux-specific code here looking through
>>> /proc, and Linux does have a real flock(2).
>>>
>>> An adaptive, zero-conf hsperf file naming scheme might use the
>>> namespaced PID if available (i.e., if an exclusive flock(2) could be
>>> obtained on the file), or the global PID if not, with some indication in
>>> the file's name of which kind of PID was used.
>> Hi Nico,
>>
>> I read your message again and now I totally agree with using flock(2) :-)
>>
>> As you said, we should start with getpid(). That way the behavior is
>> compatible with older versions of jcmd tools, especially when Java is
>> used outside of containers.
>>
>> One thing I realized is that if we have a collision, we don't need to
>> use a globally unique ID. We just need an ID that's unique in the
>> directory being written into.
>>
>> I think we can do this on the VM side:
>>
>>       String id = getpid();
>>       while (true) {
>>           String file = "/tmp/hsperfdata_" + username() + "/" + id;
>>           if (get_exclusive_access(file)) {
>>               // I won the contest and
>>               // (a) the file didn't exist, or
>>               // (b) the file existed but the JVM that used it has died
>>               return file;
>>           }
>>           // Add an "x" here so we don't collide with the getpid() of
>>           // another process
>>           id = "x" + random();
>>       }
>>
>> On the tools side, we can do the pid -> rendezvous file mapping as I
>> described in the other e-mail.
> If we could limit using this special trick to when it's actually
> needed then this would be my preference. For one, it mostly keeps
> compatibility with older JVMs and for two this isn't a very common use-
> case which would penalize the 90% of use cases which aren't affected by
> this.
>
> On the other hand, 'man proc' tells me this about /proc/*/environ:
>
> """
> This file contains the initial environment that was set when the
> currently executing program was started via execve(2). [...]
>
> If,  after  an  execve(2),  the process modifies its environment (e.g.,
> by calling functions such as putenv(3) or modifying the environ(7)
> variable directly), this file will not reflect those changes.
>
> [...]
>
> Permission to access this file is governed by a ptrace access mode
> PTRACE_MODE_READ_FSCREDS check; see ptrace(2).
> """
>
> So doing the publication of the file that was used in a reliable way
> will be a challenge. Both approaches, shared memory mapping and setting
> the environment will need PTRACE_MODE_READ_FSCREDS which I think isn't
> generally granted for containers.
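
For reference, the get_exclusive_access() step in my pseudocode quoted
above could be approximated as below. This is only a rough sketch in Java
to show the idea; the class and method names (HsperfLock,
getExclusiveAccess) are made up for illustration. Note that it uses
FileChannel.tryLock(), whose underlying OS mechanism is platform-dependent
and is not promised to be flock(2), so the real VM-side code would more
likely call flock(2) with LOCK_EX | LOCK_NB directly from C++.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class HsperfLock {
    /**
     * Tries to take an exclusive advisory lock on the given file without
     * blocking, creating the file if it does not exist yet.
     *
     * Returns the open channel if we got the lock (the lock is held until
     * the channel is closed or the process exits), or null if the lock is
     * already held by someone else.
     */
    public static FileChannel getExclusiveAccess(Path file) throws IOException {
        FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
        boolean locked = false;
        try {
            // tryLock() returns null if another process holds the lock.
            locked = ch.tryLock() != null;
        } catch (OverlappingFileLockException e) {
            // The lock is already held through another channel in this JVM.
        } finally {
            if (!locked) {
                // Caveat: on some systems closing any channel on a file
                // drops all locks this JVM holds on it, so a JVM should
                // only ever try to lock its own hsperf file once.
                ch.close();
            }
        }
        return locked ? ch : null;
    }
}
```

A JVM that gets a non-null channel back would simply keep it open for its
lifetime. A cleanup pass that gets a non-null channel for a file it did not
create knows that the previous owner is gone.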

I think publishing via /proc/*/environ is going to be problematic, but 
using the maps file seems fine. Here are my experiments.

My conclusion is:

If a Java process is visible to "jps" today, "jps" can also read its
/proc/<pid>/maps file.

============== test 1 ================
Docker running with cgroup v1 on Ubuntu 20.04.3 LTS

ubuntu at minikube-cgv1:~$ jps -J-version
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1, mixed mode, sharing)
ubuntu at minikube-cgv1:~$ ps -ef | grep Wait
ubuntu      3490    3280  2 21:34 pts/1    00:00:00 docker run -it --tty=true --rm my-java-app java -cp / Wait
root        3541    3515  2 21:34 pts/0    00:00:00 java -cp / Wait
ubuntu      3598    1247  0 21:34 pts/0    00:00:00 grep --color=auto Wait

ubuntu at minikube-cgv1:~$ jps
3611 Jps
ubuntu at minikube-cgv1:~$ sudo jps
3541 Wait
3630 Jps
ubuntu at minikube-cgv1:~$ wc /proc/3541/maps
wc: /proc/3541/maps: Permission denied
ubuntu at minikube-cgv1:~$ ls /proc/3541/root
ls: cannot access '/proc/3541/root': Permission denied
ubuntu at minikube-cgv1:~$ sudo wc /proc/3541/maps
181 968 12074 /proc/3541/maps
ubuntu at minikube-cgv1:~$ sudo grep hsperf /proc/3541/maps
7f4ebca2b000-7f4ebca33000 rw-s 00000000 00:33 1818692                    /tmp/hsperfdata_root/1

============== test 2 ================
podman rootless + cgroupv2 on Ubuntu 21.10

ubuntu at podman-tester:~$ jps -J-version
openjdk version "17.0.1" 2021-10-19
OpenJDK Runtime Environment (build 17.0.1+12-Ubuntu-121.10)
OpenJDK 64-Bit Server VM (build 17.0.1+12-Ubuntu-121.10, mixed mode, sharing)
ubuntu at podman-tester:~$ ps -ef | grep Wait
ubuntu      1531    1468  0 21:19 pts/0    00:00:01 podman run -it --tty=true --rm my-java-app java -cp / Wait
ubuntu      1571    1556  0 21:19 pts/0    00:00:01 java -cp / Wait
ubuntu      1946    1686  0 21:23 pts/1    00:00:00 grep --color=auto Wait
ubuntu at podman-tester:~$ jps
1778 Jps
1571 Wait
ubuntu at podman-tester:~$ cat /^C
ubuntu at podman-tester:~$ sudo jps
1571 Wait
1805 Jps
ubuntu at podman-tester:~$ grep hsperf /proc/1571/maps
7f88dc40c000-7f88dc414000 rw-s 00000000 08:01 792115                     /tmp/hsperfdata_root/1
ubuntu at podman-tester:~$ sudo grep hsperf /proc/1571/maps
7f88dc40c000-7f88dc414000 rw-s 00000000 08:01 792115                     /tmp/hsperfdata_root/1
ubuntu at podman-tester:~$ ls -l /proc/1571/root/tmp/hsperfdata_root/1
-rw------- 1 ubuntu ubuntu 32768 May  7 21:24 
/proc/1571/root/tmp/hsperfdata_root/1
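
On the tools side, the grep above is easy to do programmatically. Here is
a rough sketch in Java of how a jps-like tool might find the hsperf file
of a process from its maps file. The class and method names (HsperfMaps,
findHsperfPath, resolveViaProcRoot) are invented for illustration and are
not existing JDK code. Going through /proc/<pid>/root to reach the file is
my assumption, based on the "ls -l" in test 2.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

public class HsperfMaps {
    /**
     * Scans the lines of a /proc/<pid>/maps file and returns the pathname
     * of the first shared mapping that lives under an hsperfdata_* directory.
     */
    public static Optional<String> findHsperfPath(List<String> mapsLines) {
        for (String line : mapsLines) {
            // Fields: address perms offset dev inode [pathname]
            String[] f = line.trim().split("\\s+", 6);
            if (f.length == 6
                    && f[1].endsWith("s")               // shared, e.g. "rw-s"
                    && f[5].contains("/hsperfdata_")) {
                return Optional.of(f[5]);
            }
        }
        return Optional.empty();
    }

    /**
     * Maps the path as seen inside the target's mount namespace to a path
     * the tool can open, via /proc/<pid>/root.
     */
    public static Optional<Path> resolveViaProcRoot(long pid) throws IOException {
        Path maps = Path.of("/proc", Long.toString(pid), "maps");
        return findHsperfPath(Files.readAllLines(maps))
                .map(p -> Path.of("/proc", Long.toString(pid), "root" + p));
    }

    public static void main(String[] args) throws IOException {
        long pid = Long.parseLong(args[0]);
        System.out.println(resolveViaProcRoot(pid)
                .map(Path::toString)
                .orElse("no hsperf mapping found for " + pid));
    }
}
```

Reading the maps file is subject to the same permission checks as in the
experiments above, so the tool would see an IOException wherever "wc" got
"Permission denied".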





More information about the hotspot-runtime-dev mailing list