[crac] RFR: Persist memory in-JVM [v6]

Anton Kozlov akozlov at openjdk.org
Thu Sep 28 13:06:01 UTC 2023


On Fri, 22 Sep 2023 15:58:19 GMT, Radim Vansa <rvansa at openjdk.org> wrote:

>> This is a WIP for persisting various portions of JVM memory from within the process, rather than leaving that up to the C/R engine (such as CRIU). In the future this will enable us to optimize loading (theoretically leading to faster startup), and to compress and encrypt the data.
>> 
>> At this moment the implementation of persisting thread stacks is in proof-of-concept shape, ~especially waking up the primordial thread includes some hacks. This could be improved by using a custom (global, ideally robust) futex instead of the internal futex used by `pthread_join`.~ Fix already implemented.
>> 
>> ~One of the concerns related to thread stacks is rseq used by glibc; without disabling this CRIU would attempt to 'fix' the rseqs (see `fixup_thread_rseq`) and touch the unmapped memory. CRIU uses special ptrace commands to check the status; I am not aware if it is possible to access this information from within the process using any public API.~ Solved. The JVM forks and the child ptraces the JVM, recording the rseq info. Once we have that, we can unregister the rseq before checkpoint and register it afterwards (here we have the advantage that we know the threads won't be in any critical section as we're in a safepoint).
>> ~Currently this works with `/proc/sys/kernel/yama/ptrace_scope` set to `0`; we should make it work with `1` (default), too.~ Fixed.
>> 
>> Regarding the persistence implementation, currently we store the memory in multiple files; the first block (page-size aligned) contains some validation data and an index mapping memory addresses to file offsets. The way this is implemented requires the size of the index to be known before dumping memory. It might be more convenient (and portable for e.g. network-based storage) to use a single file, and either keep the index in the 'fundamental' memory (C heap), put it at the end of the file, or put it in a separate index file.
>
> Radim Vansa has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 33 commits:
> 
>  - Move MemoryPersister impl to own file
>  - Merge branch 'crac' into persist_memory
>  - Backport of API from future changes for other persistent memory features
>  - Another assembly fix
>  - Don't fork when we're not unregistering rseq
>  - Fix assembly loop
>  - fix whitespaces
>  - fix whitespaces
>  - Address review comments, fix rseq on GLIBC < 2.35
>  - Merge branch 'crac' into persist_memory
>  - ... and 23 more: https://git.openjdk.org/crac/compare/8fcfc112...1a8fc70c

The stack management still looks overcomplicated, but it seems possible to revert to something simpler if it causes trouble.

src/hotspot/cpu/x86/vm_version_x86.cpp line 2809:

> 2807:   // outside CodeCache.
> 2808:   size_t aligned_size = align_up(stub_size, os::vm_page_size());
> 2809:   char *stub_memory = os::reserve_memory(aligned_size, true, mtCode);

Shouldn't this be a global variable, like the previous stub?

src/hotspot/os/linux/crac_linux.cpp line 597:

> 595:     "mov x4, 0\n\t"
> 596:     "mov x5, 0\n\t"
> 597:     "mov x8, %[sysnum]\n\t"

This could be `register volatile asm("..")` for simplicity
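For illustration, one possible reading of this suggestion is GCC/Clang explicit register variables, which bind a C++ local to a named register so the compiler loads it instead of a hand-written `mov`. The sketch below is only that, a sketch: `raw_getpid` and `check_raw_getpid` are hypothetical names, it issues `getpid` rather than the futex call from the patch, and on non-AArch64 hosts it falls back to `syscall()` so it still builds.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper, not part of the patch: issue getpid via a raw
// AArch64 `svc`, letting the compiler place the operands in registers.
static long raw_getpid() {
#if defined(__aarch64__)
  // Explicit register variables (GCC/Clang extension): x8 carries the
  // syscall number, x0 receives the return value.
  register long x8 asm("x8") = SYS_getpid;
  register long x0 asm("x0");
  asm volatile("svc #0" : "=r"(x0) : "r"(x8) : "memory", "cc");
  return x0;
#else
  // Fallback so the sketch compiles on other architectures.
  return syscall(SYS_getpid);
#endif
}

// Sanity check: the raw call agrees with the libc wrapper.
static void check_raw_getpid() {
  assert(raw_getpid() == static_cast<long>(getpid()));
}
```

Whether this fits the futex loop depends on the loop staying entirely inside one asm statement, since register variables are only guaranteed to hold their values at the asm boundary.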

src/hotspot/os/linux/crac_linux.cpp line 599:

> 597:     "mov x8, %[sysnum]\n\t"
> 598:     ".begin: mov x0, %[futex]\n\t"
> 599:     "mov x3, 0\n\t"

mov x3, xzr ?

src/hotspot/os/linux/crac_linux.cpp line 603:

> 601:     "cbnz x0, .end\n\t" // exit the loop on error
> 602:     "mov x3, %[futex]\n\t"
> 603:     "ldr w3, [x3]\n\t"

Suggestion:

    "ldr w3, [%[futex]]\n\t"

?

src/hotspot/os/linux/crac_linux.cpp line 614:

> 612:   while (persist_futex) {
> 613:      syscall(SYS_futex, &persist_futex, FUTEX_WAIT_PRIVATE, 1, nullptr, nullptr, 0);
> 614:   }

Nit: the asm actually implements

    do {
      syscall(SYS_futex, &persist_futex, FUTEX_WAIT_PRIVATE, 1, nullptr, nullptr, 0);
    } while (persist_futex);

rather than the `while` loop quoted above.
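To make the difference between the two loop shapes concrete, here is a minimal Linux-only sketch. The function names and the call counters are hypothetical and exist only for the demonstration; the futex word is a plain `int` read with the GCC/Clang `__atomic_load_n` builtin, which is an assumption and not how `persist_futex` is declared in the patch.

```cpp
#include <cassert>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// `while` form, as in the C fallback: the word is checked before the
// first syscall, so a cleared word costs no syscall at all.
static int wait_while_form(int* word) {
  int calls = 0;
  while (__atomic_load_n(word, __ATOMIC_ACQUIRE)) {
    syscall(SYS_futex, word, FUTEX_WAIT_PRIVATE, 1, nullptr, nullptr, 0);
    ++calls;
  }
  return calls;
}

// `do ... while` form, the shape of the asm loop: the syscall is always
// issued once, and the word is only checked afterwards.
static int wait_do_while_form(int* word) {
  int calls = 0;
  do {
    syscall(SYS_futex, word, FUTEX_WAIT_PRIVATE, 1, nullptr, nullptr, 0);
    ++calls;
  } while (__atomic_load_n(word, __ATOMIC_ACQUIRE));
  return calls;
}

// With the word already cleared, FUTEX_WAIT returns immediately (the
// expected value 1 does not match), so neither call blocks.
static void check_loop_forms() {
  int word = 0;
  assert(wait_while_form(&word) == 0);
  assert(wait_do_while_form(&word) == 1);
}
```

Both forms end up waiting until the word is cleared; they differ only in whether one extra syscall is made when it is already cleared on entry.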

-------------

Marked as reviewed by akozlov (Lead).

PR Review: https://git.openjdk.org/crac/pull/95#pullrequestreview-1648819289
PR Review Comment: https://git.openjdk.org/crac/pull/95#discussion_r1336174099
PR Review Comment: https://git.openjdk.org/crac/pull/95#discussion_r1336208335
PR Review Comment: https://git.openjdk.org/crac/pull/95#discussion_r1338431674
PR Review Comment: https://git.openjdk.org/crac/pull/95#discussion_r1336207444
PR Review Comment: https://git.openjdk.org/crac/pull/95#discussion_r1336190274