<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p>Hi Ma Zhen,</p>
    <p>you have correctly observed that closing file descriptors is an
      architectural choice rather than a purely technical need. CRIU
      really is capable of restoring the process as-is, since its main
      motivation is migration of running containers. Containers already
      define the filesystem, and the runtime is in control of external
      connections - e.g. CRIU can checkpoint and later restore an open
      socket connection, and the container runtime restores the 'second
      half' of the socket so that the pause is transparent to the
      running process.</p>
    <p>If this is what you want, there's nothing preventing you from
      using CRIU on a Java process manually - at the risk of breaking
      the internal logic of the application. However, the point of CRaC
      is not such a transparent restore: we want to preserve the
      valuable state of the JVM and the application but adapt it to the
      new environment. We want to make a conscious decision about every
      resource external to the process. Being forced to gracefully
      adapt to the restore is a feature.</p>
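    <p>To make that concrete, here is a minimal sketch (not code from
      CRaC itself) of a component making such a conscious decision about
      a file it owns. <code>CheckpointAware</code> is a hypothetical
      stand-in for the CRaC <code>Resource</code> interface, whose real
      callbacks also take a context argument and whose implementations
      have to be registered with a CRaC context; the names
      <code>ReopenableLog</code> and <code>demo.log.path</code> are made
      up for the illustration.</p>
    <pre>
```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReopenDemo {

    // Hypothetical stand-in mirroring the shape of the CRaC Resource
    // callbacks, declared here only so the sketch compiles without CRaC.
    interface CheckpointAware {
        void beforeCheckpoint() throws Exception;
        void afterRestore() throws Exception;
    }

    // A log sink that owns a file descriptor and decides what happens to it.
    static class ReopenableLog implements CheckpointAware {
        private final String pathProperty; // system property naming the file (assumption)
        private Writer out;

        ReopenableLog(String pathProperty) throws IOException {
            this.pathProperty = pathProperty;
            open();
        }

        private void open() throws IOException {
            // Resolve the path on every open: the restored environment may differ.
            Path path = Path.of(System.getProperty(pathProperty));
            out = Files.newBufferedWriter(path, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }

        synchronized void log(String line) throws IOException {
            out.write(line);
            out.write(System.lineSeparator());
            out.flush();
        }

        @Override
        public synchronized void beforeCheckpoint() throws IOException {
            out.close(); // leave no open descriptor behind in the image
            out = null;
        }

        @Override
        public synchronized void afterRestore() throws IOException {
            open(); // adapt to wherever the process has been restored
        }
    }

    public static void main(String[] args) throws Exception {
        Path before = Files.createTempFile("before", ".log");
        Path after = Files.createTempFile("after", ".log");

        System.setProperty("demo.log.path", before.toString());
        ReopenableLog log = new ReopenableLog("demo.log.path");
        log.log("written before the checkpoint");

        log.beforeCheckpoint();
        // ... the checkpoint and the restore would happen here ...
        System.setProperty("demo.log.path", after.toString());
        log.afterRestore();

        log.log("written after the restore");
        System.out.println(Files.readAllLines(after));
    }
}
```
    </pre>
    <p>The file is resolved again on every open, so the restored process
      picks up the location configured in the new environment instead of
      silently writing to whatever the old descriptor pointed at.</p>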
    <p>Yes, we have File Descriptor policies, but they are not a
      solution - they provide a workaround for proofs of concept, until
      some code that you can't easily fix gets updated to support CRaC
      properly. Ideas meet practicality, and you are responsible for
      working out what should be done with each particular external
      resource.</p>
    <p>You're right that at the moment we don't handle JDK Platform
      Logging (or JUL) configured to write to a file, and since that is
      JDK code outside the user's control, it is a bug. We try to fix
      these one by one (PRs are welcome!).<br>
    </p>
    <p>I hope I have provided some insight into these choices - and yes,
      I understand the pain, as we still have many places to fix.</p>
    <p>Cheers, </p>
    <p>Radim<br>
    </p>
    <div class="moz-cite-prefix">On 10. 04. 25 11:30, ma zhen wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:CA+U33_P+7i9X3d31Vfx9AkiYeeuROFBaWcDFPrXN37BY_y2Y9g@mail.gmail.com">
      <div>
        <div dir="ltr">
          <div dir="ltr">
            <div dir="ltr">
              <p class="gmail-ng-star-inserted">
                <span class="gmail-ng-star-inserted">Hi CRaC developers,</span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">I'm currently
                    exploring the integration of CRaC support into our
                    company's middleware products. I'm also very
                    interested in the underlying implementation details
                    of CRaC and have been doing some research into its
                    mechanics.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">As I
                    understand it, CRaC leverages CRIU under the hood
                    for checkpointing and restoring running processes.
                    My research indicates that CRIU itself is capable of
                    handling open file descriptors and established
                    network connections during the checkpoint/restore
                    cycle.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">However, the
                    CRaC API requires developers to explicitly manage
                    these resources, typically by closing them in the </span><span class="gmail-inline-code gmail-ng-star-inserted">beforeCheckpoint()</span><span class="gmail-ng-star-inserted"> and re-establishing
                    them in the </span><span class="gmail-inline-code gmail-ng-star-inserted">afterRestore()</span><span class="gmail-ng-star-inserted">.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">To understand
                    the rationale behind this design choice, I looked
                    into the initial CRaC prototype, specifically the
                    first PR (<a href="https://github.com/openjdk/crac/pull/1" moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/openjdk/crac/pull/1</a></span><span class="gmail-ng-star-inserted">). It appears that
                    even in this early version, the implementation
                    iterated through all process file descriptors during
                    checkpoint. It ignored certain FDs (like those
                    related to classpath files, </span><span class="gmail-inline-code gmail-ng-star-inserted">/dev/random</span><span class="gmail-ng-star-inserted">, </span><span class="gmail-inline-code gmail-ng-star-inserted">/dev/urandom</span><span class="gmail-ng-star-inserted">, and files marked </span><span class="gmail-inline-code gmail-ng-star-inserted">M_PERSISTENT</span><span class="gmail-ng-star-inserted"> - though I'm unclear
                    on the exact meaning of </span><span class="gmail-inline-code gmail-ng-star-inserted">M_PERSISTENT</span><span class="gmail-ng-star-inserted"> in this context). If
                    any other application-opened files remained, the
                    checkpoint process would fail. This suggests the
                    requirement for manual resource management was
                    present from the outset.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">As I'm not
                    deeply familiar with JVM internals, I'm struggling
                    to fully grasp the reasoning. Was this restriction
                    primarily introduced to simplify the initial design
                    and implementation of CRaC within the JVM?</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">I also
                    noticed that current versions of CRaC include File
                    Descriptor Policies. These allow configuring an </span><span class="gmail-inline-code gmail-ng-star-inserted">action:
                    ignore</span><span class="gmail-ng-star-inserted"> for
                    specific file descriptors, effectively delegating
                    their handling to CRIU. This seems to demonstrate
                    that letting CRIU manage certain open files </span><span class="gmail-ng-star-inserted">is</span><span class="gmail-ng-star-inserted"> feasible within the
                    CRaC framework.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">This leads me
                    to wonder: if delegation to CRIU is possible and
                    works (at least for some cases via policies), why
                    isn't relying on CRIU for resource handling the
                    default or more broadly encouraged approach? Why the
                    strict requirement for manual closure and reopening
                    in the general case?</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">For instance,
                    consider using </span><span class="gmail-inline-code gmail-ng-star-inserted">System.getLogger()</span><span class="gmail-ng-star-inserted"> from the JDK
                    Platform Logging API. As application developers, we
                    don't typically manage the underlying file
                    descriptor for the log file directly. To make this
                    work with CRaC, we currently need to identify and
                    configure a File Descriptor Policy for it, which can
                    feel somewhat cumbersome. Wouldn't a smoother
                    experience involve CRaC (perhaps optionally)
                    defaulting to letting CRIU handle such internally
                    managed resources, like those opened by standard JDK
                    libraries?</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">I would
                    appreciate any insights or clarification you could
                    offer on the design philosophy behind CRaC's
                    approach to managing external resources like files
                    and sockets, especially in contrast to CRIU's
                    capabilities.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">Thanks for
                    your time and any insights you can share.</span></span></p>
              <p class="gmail-ng-star-inserted">
                <span><span class="gmail-ng-star-inserted">Best regards,</span></span></p>
              <p class="gmail-ng-star-inserted">
                mazhen</p>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>