<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Hi Ma Zhen,</p>
<p>I have to admit that I mistook sockets for named pipes when I
first thought about this. The problem seems to be
mentioned in the CRIU docs [1]. Since you describe a connection to
a daemon (not a named Unix socket), I guess there would be a path
forward if the socket is idle and stateless: open another
connection to the daemon and inherit it (note that you would
have to use `-XX:CRaCIgnoredFileDescriptors` to prevent the
inherited FDs from being closed automatically).</p>
<p>However, while a technical solution might exist, this is really
not in line with the CRaC philosophy. The process *should* isolate
itself from the rest of the system; from the CRaC POV the inability
to do that in glibc is a deficiency (if there were an API for that,
the native part of CRaC in the JVM would call it). It might be
possible to look up and call `__nss_disable_nscd` early on during
boot, but given that this is not a public API I don't think this
belongs in the general JVM code.</p>
<p>The application has to cooperate with CRaC a bit. Hopefully it is
a reasonable requirement to deploy in an environment that does not
inject FDs into the application the way NSCD does. From what I found
searching around, NSCD doesn't seem to have the best reputation
anyway.</p>
<p>In these cases, I would love to provide more guidance in the
first error the user gets, but IIUC this socket is not easy to
classify as an NSCD socket without tracing its origin.</p>
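<p>To illustrate why the classification is hard: a process can
enumerate its own socket descriptors through `/proc/self/fd`, but
the link target only carries an inode number, not the peer or the
code that opened the socket. The sketch below is plain Java and not
part of CRaC; the names `SocketFdLister`, `listSocketFds` and
`isSocketTarget` are made up for illustration, and it assumes Linux
with procfs mounted.</p>
<pre>
```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not a CRaC API: lists the socket descriptors
// of the current process by reading the links in /proc/self/fd.
class SocketFdLister {

    // Returns one "fd=N socket:[inode]" line per socket descriptor.
    // Returns an empty string where /proc/self/fd is unavailable.
    static String listSocketFds() {
        Path fdDir = Path.of("/proc/self/fd");
        if (!Files.isDirectory(fdDir)) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        try (var entries = Files.newDirectoryStream(fdDir)) {
            for (Path entry : entries) {
                String target;
                try {
                    target = Files.readSymbolicLink(entry).toString();
                } catch (IOException e) {
                    continue; // descriptor was closed while iterating
                }
                if (isSocketTarget(target)) {
                    out.append("fd=").append(entry.getFileName())
                       .append(' ').append(target).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // In procfs, socket descriptors appear as "socket:[inode]".
    static boolean isSocketTarget(String linkTarget) {
        return linkTarget.startsWith("socket:[");
    }

    public static void main(String[] args) {
        System.out.print(listSocketFds());
    }
}
```
</pre>
<p>Calling `listSocketFds()` shortly before the checkpoint shows
which descriptors are sockets, but telling an NSCD-style socket
apart from any other one would still need an external tool such as
lsof or a trace of the socket() call, as in the original report.</p>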
<p>I am glad that the containerized solution works for you.</p>
<p>Radim</p>
<p>[1] <a class="moz-txt-link-freetext" href="https://criu.org/External_UNIX_socket">https://criu.org/External_UNIX_socket</a></p>
<div class="moz-cite-prefix">On 9/16/25 11:42, ma zhen wrote:<br>
</div>
<blockquote type="cite" cite="mid:CA+U33_POrYGhidUtEZzV_6gRm03QaV9z9Oo+1B3vU1Mdzvh6mQ@mail.gmail.com"><br>
<div>
<div dir="ltr">
<div dir="ltr">
<div>Hi Radim,</div>
<div><br>
</div>
<div>Thank you so much for the detailed and insightful
response.</div>
<div><br>
</div>
<div>I followed your advice and set the parameter
`-XX:CRaCAllowedOpenFilePrefixes=socket:`. This
successfully bypassed CRaC's own validation check, and the
log confirmed this with the following message:</div>
<div><br>
</div>
<div>`JVM: FD fd=4 type=socket
path="socket:[11012921],port=29295" OK: allowed in
-XX:CRaCAllowedOpenFilePrefixes`</div>
<div><br>
</div>
<div>However, this then revealed an underlying issue,
seemingly within CRIU. During the dump process, CRIU first
reported:</div>
<div><br>
</div>
<div>`Error (criu/sk-unix.c:865): unix: External socket is
used. Consider using --ext-unix-sk option.`</div>
<div><br>
</div>
<div>Following CRIU's advice, I passed this option using the
`CRAC_CRIU_OPTS` environment variable
(`CRAC_CRIU_OPTS="--ext-unix-sk"`). Unfortunately, this
led to a different error from CRIU:</div>
<div><br>
</div>
<div>`Error (criu/sk-unix.c:871): unix: Can't dump half of
stream unix connection.`</div>
<div><br>
</div>
<div>My initial interpretation of this final error is that
CRIU may have a limitation in handling a process that
holds only one end of an external, stream-oriented
(`SOCK_STREAM`) Unix socket connection. However, I'm not
entirely certain about this, and I plan to look into
CRIU's documentation and code to confirm this behavior.</div>
<div><br>
</div>
<div>In the meantime, I was wondering if this aligns with
your experience? Perhaps you've encountered this specific
CRIU error before.</div>
<div><br>
</div>
<div>This experience certainly reinforces your other
recommendation of running the application in a minimal
container to avoid creating the socket in the first place.
It seems like the most reliable path forward for now.</div>
<div><br>
</div>
<div>Thank you again for your guidance. It's been extremely
valuable.</div>
<div><br>
</div>
<div>Best regards,</div>
<div><br>
</div>
<div>mazhen</div>
</div>
</div>
<br>
<div class="gmail_quote gmail_quote_container">
<div dir="ltr" class="gmail_attr">Radim Vansa <<a href="mailto:rvansa@azul.com" moz-do-not-send="true" class="moz-txt-link-freetext">rvansa@azul.com</a>>
wrote on Mon, Sep 15, 2025 at 23:12:<br>
</div>
<blockquote class="gmail_quote">
Hi Ma Zhen,<br>
<br>
we are aware of a similar issue where an application has <br>
`/var/cache/nscd/passwd` mapped despite not having the
privilege to <br>
open() this file - the application can receive a file
descriptor through <br>
a socket and is then able to mmap it. Another case is files
under <br>
`/var/lib/sss/mc/` opened by getpwuid_r, getpwnam_r,
getgrgid_r, <br>
getgrnam_r or similar functions.<br>
<br>
You're right that File Descriptor Policies cannot be applied
here; these <br>
work on the Java level (the FD must have an associated Java
object).<br>
<br>
There is a VM option `-XX:CRaCAllowedOpenFilePrefixes` that
lets the <br>
checkpoint proceed if an open file's path starts with one of the
given prefixes; in most cases <br>
CRIU can reopen a regular file without issues (and it should
be able to <br>
handle sockets as well). I have not tested whether the path
matching works <br>
with sockets, but it shouldn't be too difficult to fix up for
Unix sockets.<br>
<br>
Besides this, there's no Resource handling on the native level
(you cannot <br>
register a native hook), though it might be possible to find
an open FD <br>
and close it from Java - I wouldn't recommend such a hacky
approach.<br>
<br>
To be honest, on systems where we've encountered this issue
we simply <br>
disabled the NSCD service completely. If you can't control the
environment, <br>
you can run the application in a container that isn't
configured with <br>
these services.<br>
<br>
Cheers,<br>
<br>
Radim<br>
<br>
On 9/15/25 11:29, ma zhen wrote:<br>
><br>
> Hi CRaC developers,<br>
><br>
> I am currently working on adapting a Java application
to support CRaC. <br>
> I've encountered a specific challenge related to a Unix
socket that is <br>
> preventing successful checkpoint creation.<br>
><br>
> During the checkpoint process, I consistently receive a
<br>
> CheckpointOpenSocketException for a specific file
descriptor, which <br>
> lsof identifies as a Unix socket.<br>
><br>
> I have conducted a detailed investigation to trace the
origin of this <br>
> socket and found that it is not created directly by my
Java <br>
> application code. Instead, it is created by the
underlying glibc <br>
> library as part of the Name Service Switch (NSS)
framework. The call <br>
> stack, captured using BCC, clearly shows that the
socket() call <br>
> originates from glibc's __nscd_* functions. This
happens when the JVM <br>
> or application triggers a name service lookup (e.g.,
resolving a user <br>
> ID). In my specific environment, this results in a Unix
socket <br>
> connection from the Java process to the lwsmd daemon
for authentication.<br>
><br>
> Because this socket is created and managed within the
native C <br>
> library, the standard approach of implementing a
Java-level <br>
> org.crac.Resource to close and restore it doesn't seem
applicable, as <br>
> my application code has no direct handle or control
over its lifecycle.<br>
><br>
> I have documented the full analysis, including the
error, lsof output, <br>
> and BCC stack traces, in a detailed write-up which you
can find here:<br>
> <a href="https://github.com/mz1999/blog/blob/master/docs/trace_java_socket_creation-en.md" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">
https://github.com/mz1999/blog/blob/master/docs/trace_java_socket_creation-en.md</a><br>
><br>
> My question is: What is the recommended approach for
handling such <br>
> file descriptors that are opened by underlying native
libraries <br>
> without direct control from the Java application?<br>
><br>
> Are there any existing mechanisms, perhaps through
advanced file <br>
> descriptor policies, or any planned features that might
address this <br>
> common scenario? Or is there another workaround that
the team would <br>
> suggest?<br>
><br>
> Thank you for your time and for developing this
fantastic project. Any <br>
> guidance you can provide would be greatly appreciated.<br>
><br>
> Best regards,<br>
> mazhen<br>
</blockquote>
</div>
</div>
</blockquote>
</body>
</html>