SIGBUS on linux in perfMemory_init (containers)
Ioi Lam
ioi.lam at oracle.com
Fri Apr 29 21:19:00 UTC 2022
On 4/29/2022 1:55 PM, Ioi Lam wrote:
>
>
> On 4/29/2022 6:44 AM, Vitaly Davidovich wrote:
>> Hi all,
>>
>> We've been seeing intermittent SIGBUS failures on linux with jdk11. They
>> all have this distinctive backtrace:
>>
>> C [libc.so.6+0x12944d]
>> V [libjvm.so+0xcca542] perfMemory_init()+0x72
>> V [libjvm.so+0x8a3242] vm_init_globals()+0x22
>> V [libjvm.so+0xedc31d] Threads::create_vm(JavaVMInitArgs*, bool*)+0x1ed
>> V [libjvm.so+0x9615b2] JNI_CreateJavaVM+0x52
>> C [libjli.so+0x49af] JavaMain+0x8f
>> C [libjli.so+0x9149] ThreadJavaMain+0x9
>>
>>
>> Initially, we suspected that /tmp was full but that turned out to not be
>> the case. After a few more instances of the crash and investigation, we
>> believe we know the root cause.
>>
>>
>> The crashing applications are all running in a Kubernetes pod, with each
>> JVM in a separate container:
>>
>>
>> container_type: cgroupv1 (from the hs_err file)
>>
>>
>> /tmp is mounted such that it's shared by multiple containers. Since these
>> JVMs are running in containers, we believe what happens is the namespaced
>> (i.e. per container) PIDs overlap between different containers - 2 JVMs, in
>> separate containers, can end up with the same namespaced PID. Since /tmp
>> is shared, they can now "contend" on the same perfMemory file since those
>> file names are PID based.
>
> Hi Vitaly,
>
> Is there any reason for sharing the same /tmp directory across
> different containers?
>
> Are you using the /tmp/hsperfdata_$USER/<pid> files at all? If not,
> for the time being, you can disable them with the -XX:-UsePerfData flag.
>
> https://bugs.openjdk.java.net/browse/JDK-8255008 has a related proposal:
>
> Java: -Djdk.attach.tmpdir=/container-attachdir
> -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:+StartAttachListener
> Docker: --volume /tmp/container-attachdir:/container-attachdir
>
> In this case, we probably will run into the same UID clash problem as
> well.
>
> Maybe we should have an additional property like
> -Djdk.attach.use.global.pid=true
>
I read the proposal in JDK-8255008 again and realized that the JVM
inside the container doesn't know what its host PID is. The proposal is
to create these files:
$jdk_attach_dir/hsperfdata_{user}/e4f3e2e4fd97:10
$jdk_attach_dir/.java_pid:e4f3e2e4fd97:10
where e4f3e2e4fd97 is the container ID, which is visible in
/etc/hostname from inside the container.
I'll try to implement a prototype for the proposal.
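
To make the naming scheme concrete, here is a minimal sketch of how the current and proposed file names could be built. It is not the prototype itself. The function names, the user name "app", and the second container ID are hypothetical, and taking the container ID from the hostname is an assumption that matches Docker's default behavior.

```python
import os
import socket


def legacy_perf_path(tmpdir: str, user: str, pid: int) -> str:
    """Current layout: the file name is only the (namespaced) PID."""
    return os.path.join(tmpdir, f"hsperfdata_{user}", str(pid))


def proposed_perf_path(attach_dir: str, user: str, container_id: str, pid: int) -> str:
    """Layout described above: <container id>:<pid> inside hsperfdata_{user}."""
    return os.path.join(attach_dir, f"hsperfdata_{user}", f"{container_id}:{pid}")


def proposed_attach_path(attach_dir: str, container_id: str, pid: int) -> str:
    """Layout described above for the attach file: .java_pid:<container id>:<pid>."""
    return os.path.join(attach_dir, f".java_pid:{container_id}:{pid}")


def current_container_id() -> str:
    # Assumption: the hostname is the short container ID (Docker's default).
    return socket.gethostname()


if __name__ == "__main__":
    shared = "/container-attachdir"
    # Two containers, both running a JVM as namespaced PID 10.
    # "0a1b2c3d4e5f" is a made-up container ID for illustration.
    for cid in ("e4f3e2e4fd97", "0a1b2c3d4e5f"):
        print(legacy_perf_path(shared, "app", 10))         # same path both times
        print(proposed_perf_path(shared, "app", cid, 10))  # differs per container
        print(proposed_attach_path(shared, cid, 10))
```

With the legacy layout both containers resolve to the same path in a shared directory; with the container ID in the name they no longer do, while the PID part stays the local one that tools inside the container expect.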
Thanks
- Ioi
>
>>
>> Once multiple JVMs can contend on the same file, a SIGBUS can arise if one
>> JVM has mmap'd the file and another ftruncate()'s it from under it (e.g.
>> https://github.com/openjdk/jdk11/blob/37115c8ea4aff13a8148ee2b8832b20888a5d880/src/hotspot/os/linux/perfMemory_linux.cpp#L909
>> ).
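
The failure mode described here can be reproduced outside the JVM. The following is a minimal sketch, assuming Linux and a local or tmpfs temporary directory. It is not HotSpot code: a forked child stands in for the first JVM, and a second file descriptor on the same path stands in for the second JVM. The function name is hypothetical.

```python
import mmap
import os
import signal
import tempfile


def truncate_under_mapping() -> int:
    """Return the signal number that killed the child, or 0 if it exited normally."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)  # the first "JVM" sizes its file
        pid = os.fork()
        if pid == 0:
            try:
                # Child: map the file shared and writable, like a perf data region.
                region = mmap.mmap(fd, mmap.PAGESIZE)
                # A second opener of the same path truncates the file to zero.
                other = os.open(path, os.O_RDWR)
                os.ftruncate(other, 0)
                # The mapped page now lies beyond end-of-file; touching it
                # is expected to raise SIGBUS and kill this child.
                region[0] = 1
            finally:
                os._exit(0)
        _, status = os.waitpid(pid, 0)
        return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    sig = truncate_under_mapping()
    print("child killed by SIGBUS" if sig == signal.SIGBUS else f"child status signal: {sig}")
```

The access faults because the page being touched no longer has any file content behind it once the file has been shortened.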
>>
>>
>> Is this a known issue? I couldn't find any existing JBS entries or mailing
>> list discussions around this specific circumstance.
>>
>>
>> As for possible solutions, would it be possible to use the global PID
>> instead of the namespaced PID to "regain" the uniqueness invariant of the
>> PID?
>
> I think this needs to be optional. E.g., if you run a tool such as
> "jcmd" inside the same container as the jvm process, the jcmd tool
> would expect the PID to be the local version specific to this container.
>
>> Also, might it make sense to flock() the file to prevent another
>> process from mucking with it?
>>
> That will not solve the fundamental problem where two processes are
> using the same hsperfdata file. One of them would fail to write to it,
> and tools won't be able to monitor both processes.
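
A minimal sketch of the flock() idea and its limitation, assuming Linux flock() semantics, where locks belong to the open file description so two separate opens conflict even within one process. The helper names and the file name "10" are hypothetical; this is not what HotSpot does today.

```python
import fcntl
import os
import tempfile


def try_claim(path: str):
    """Open path and try an exclusive, non-blocking flock; return the fd or None."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Someone else already holds the lock on this file.
        os.close(fd)
        return None
    return fd


def demo():
    """Two claimants for the same PID-named file; report who got it."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "10")  # both "JVMs" have namespaced PID 10
    first = try_claim(path)
    second = try_claim(path)
    result = (first is not None, second is not None)
    for fd in (first, second):
        if fd is not None:
            os.close(fd)
    os.unlink(path)
    os.rmdir(directory)
    return result


if __name__ == "__main__":
    print(demo())
```

The lock would keep the second process from truncating a file that is in use, but that process is then left without a perf data file of its own, which is the problem described above.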
>
> Thanks
> - Ioi
>
>> Happy to provide more info if needed.
>>
>>
>> Thanks
>