Withdrawn: 8329204: Diagnostic command for zeroing unused parts of the heap
duke
duke at openjdk.org
Fri Jul 5 03:25:29 UTC 2024
On Wed, 27 Mar 2024 17:24:34 GMT, Volker Simonis <simonis at openjdk.org> wrote:
> Diagnostic command for zeroing unused parts of the heap
>
> I propose to add a new diagnostic command `System.zero_unused_memory` which zeros out all unused parts of the heap. The name of the command is intentionally GC/heap agnostic because in the future it might be extended to also zero unused parts of the Metaspace and/or CodeCache.
>
> Currently `System.zero_unused_memory` triggers a full GC and afterwards zeros unused parts of the heap. Zeroing can help snapshotting technologies like [CRIU][1] or [Firecracker][2] to shrink the snapshot size of VMs/containers with running JVM processes because pages which only contain zero bytes can be easily removed from the image by making the image *sparse* (e.g. with [`fallocate -p`][3]).
>
> Notice that uncommitting unused heap parts in the JVM doesn't help in the context of virtualization (e.g. KVM/Firecracker): from the host's perspective the uncommitted pages are still dirty, and because they usually contain some non-zero data they can't be easily removed from the snapshot image. More details can be found in my FOSDEM talk ["Zeroing and the semantic gap between host and guest"][4].
>
> Furthermore, removing pages which only contain zero bytes (i.e. "empty pages") from a snapshot image not only decreases the image size but also speeds up the restore process, because empty pages don't have to be read from the image file. Instead, they are backed by the kernel zero page until they are used for the first time. This also decreases the initial memory footprint of a restored process.
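
To make the "empty pages" argument above concrete, here is a minimal sketch that counts all-zero pages in a byte array standing in for a memory image. The class name `ZeroPageScan`, its methods, and the 4 KiB page size are assumptions made for illustration only; they are not part of the pull request, CRIU, Firecracker, or the JDK.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not an API of any real snapshot tool.
final class ZeroPageScan {
    static final int PAGE_SIZE = 4096; // assumed page size

    // Returns true if every byte of the page starting at 'offset' is zero.
    static boolean isZeroPage(byte[] image, int offset, int pageSize) {
        int end = Math.min(offset + pageSize, image.length);
        for (int i = offset; i < end; i++) {
            if (image[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // Counts the pages a snapshot tool could drop by making the image sparse.
    static int countZeroPages(byte[] image, int pageSize) {
        int count = 0;
        for (int offset = 0; offset < image.length; offset += pageSize) {
            if (isZeroPage(image, offset, pageSize)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] image = new byte[4 * PAGE_SIZE];
        // Page 1: unused, but still holding stale non-zero bytes.
        Arrays.fill(image, PAGE_SIZE, 2 * PAGE_SIZE, (byte) 0x5A);
        // Page 3: live data.
        image[3 * PAGE_SIZE + 17] = 1;
        System.out.println(countZeroPages(image, PAGE_SIZE)); // prints 2

        // Zeroing the unused page makes it removable as well.
        Arrays.fill(image, PAGE_SIZE, 2 * PAGE_SIZE, (byte) 0);
        System.out.println(countZeroPages(image, PAGE_SIZE)); // prints 3
    }
}
```

In this toy model the unused page only becomes removable once it has been zeroed, which is the effect the proposed command aims for on the real heap.
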
>
> An additional argument for memory zeroing is security. By zeroing unused heap parts, we can make sure that secrets contained in unreferenced Java objects are deleted. This is currently impossible to achieve from Java: even if a Java program zeroes out arrays with sensitive data after use, it can never guarantee that the GC hasn't already moved the corresponding object, leaving an old, unreferenced copy of that data somewhere in the heap.
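
The application-level wiping referred to above typically looks like the following sketch. The class name `SecretWipe` and the sample password are hypothetical. `Arrays.fill` only overwrites the array at its current location, so any copy left behind by an earlier GC move is out of the program's reach, which is the gap described in the paragraph above.

```java
import java.util.Arrays;

// Hypothetical example class; illustrates the limits of wiping from Java code.
final class SecretWipe {
    // Overwrites the array in place. This cannot touch stale copies that a
    // moving GC may have left elsewhere in the heap before this call.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        try {
            // ... use the password, e.g. hand it to an authentication routine ...
        } finally {
            wipe(password);
        }
        // The reachable copy is cleared; older unreferenced copies may remain.
        System.out.println(password[0] == '\0'); // prints true
    }
}
```
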
>
> A prototype implementation for this proposal for Serial, Parallel, G1 and Shenandoah GC is available in the linked pull request.
>
> [1]: https://criu.org
> [2]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md
> [3]: https://man7.org/linux/man-pages/man1/fallocate.1.html
> [4]: https://fosdem.org/2024/schedule/event/fosdem-2024-3454-zeroing-and-the-semantic-gap-between-host-and-guest/
This pull request has been closed without being integrated.
-------------
PR: https://git.openjdk.org/jdk/pull/18521