RFR: 8341334: CDS: Parallel pretouch and relocation [v6]

Aleksey Shipilev shade at openjdk.org
Tue Nov 5 16:16:32 UTC 2024


On Tue, 5 Nov 2024 14:52:04 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>> 
>>  - Merge branch 'master' into JDK-8341334-cds-parallel-relocation
>>  - Make sure we gracefully shutdown whatever happens, refix shutdown race
>>  - Simpler bitmap distribution
>>  - Capitalize constants
>>  - Do not create worker threads too early: Mac/Windows are not yet ready to use Semaphores
>>  - Don't change the patching order in -ArchiveParallelIteration case
>>  - Flags
>>  - Work
>
> src/hotspot/share/cds/filemap.cpp line 1758:
> 
>> 1756:     char* start = _from + MIN2(_bytes, _bytes * chunk / max_chunks);
>> 1757:     char* end   = _from + MIN2(_bytes, _bytes * (chunk + 1) / max_chunks);
>> 1758:     os::pretouch_memory(start, end);
> 
> What happens if I have many cores and a small memory range? We would have many workers for a potentially smallish total range. Could start-end then end up being tiny? 
> 
> On Linux, we would do madvise MADV_POPULATE_WRITE. Could we end up feeding invalid range lengths to madvise, not page aligned? Or, could it just be inefficient if many threads try to madvise the same overlapping areas (see len calculation in os::pd_pretouch_memory)

Small memory range and lots of workers -- that's an interesting question. I think we want to cap the chunk size from below, at least by `os::page_size()`. Let me see if I can do this without messing things up.
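
To make the idea concrete, here is a minimal sketch of one way to cap the chunk size from below. It is not the code in the PR: `chunk_bounds` and its parameters are hypothetical names, and the page size is passed in explicitly rather than taken from `os::page_size()`. It splits the range in whole pages and uses no more chunks than there are pages, so every active chunk is at least one page long and surplus workers get an empty range.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical helper (not HotSpot code): returns the [start, end) byte
// offsets that worker `chunk` should pretouch within a range of `bytes`.
// Assumes page_size > 0 and that the range starts on a page boundary.
static std::pair<size_t, size_t> chunk_bounds(size_t bytes, size_t chunk,
                                              size_t max_chunks, size_t page_size) {
  // Number of pages covering the range, rounded up.
  size_t pages = (bytes + page_size - 1) / page_size;
  // Never use more chunks than pages, so no chunk is smaller than a page.
  size_t chunks = std::max<size_t>(1, std::min(max_chunks, pages));
  if (chunk >= chunks) {
    return {bytes, bytes};  // surplus worker: empty range
  }
  // Split on page boundaries; only the last chunk may end mid-page.
  size_t start = std::min(bytes, (pages * chunk / chunks) * page_size);
  size_t end   = std::min(bytes, (pages * (chunk + 1) / chunks) * page_size);
  return {start, end};
}
```

With 10000 bytes, 4096-byte pages and 8 workers, only three chunks are non-empty, and all chunk starts are page-aligned, so no two workers touch the same page.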

`os::pretouch_memory` already does `MADV_POPULATE_WRITE` when supported (see JDK-8315923). The key thing for fast, startup-time-sensitive pretouch is to eat the memory faults in multiple threads. It is arguably a kernel "issue" that `MADV_POPULATE_WRITE` is single-threaded, given that the kernel can _probably_ do this with kernel workers like we do here on the JVM side, but that is not something I expect to be available any time soon.
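
As an illustration of "eating the faults in multiple threads", here is a rough standalone sketch. It uses `std::thread` rather than HotSpot worker threads, and `touch_pages` is a hypothetical stand-in for `os::pretouch_memory` that simply rewrites one byte per page. It assumes the range starts on a page boundary and that nothing else writes to the memory while it is being touched.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for os::pretouch_memory: rewrite one byte in each
// page of [start, start + len) so the page fault is taken on this thread.
// The volatile access keeps the compiler from dropping the no-op store.
static void touch_pages(char* start, size_t len, size_t page_size) {
  volatile char* p = start;
  for (size_t off = 0; off < len; off += page_size) {
    p[off] = p[off];
  }
}

// Split [from, from + bytes) into page-aligned slices, one thread per slice.
static void parallel_pretouch(char* from, size_t bytes,
                              size_t workers, size_t page_size) {
  size_t pages = (bytes + page_size - 1) / page_size;
  size_t n = std::max<size_t>(1, std::min(workers, pages));
  std::vector<std::thread> threads;
  for (size_t i = 0; i < n; i++) {
    size_t start = std::min(bytes, (pages * i / n) * page_size);
    size_t end   = std::min(bytes, (pages * (i + 1) / n) * page_size);
    threads.emplace_back(touch_pages, from + start, end - start, page_size);
  }
  for (auto& t : threads) {
    t.join();  // all faults have been taken once every worker returns
  }
}
```

The contents of the range are left unchanged; the only effect is that the pages get faulted in by several threads at once instead of one.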

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21302#discussion_r1829634991

