RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v17]

Thu Dec 28 20:20:01 UTC 2023

On Thu, 14 Dec 2023 11:14:00 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> Liming Liu has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Replace to char* when type casting
>
> test/hotspot/gtest/runtime/test_os_linux.cpp line 377:
> 
>> 375:   EXPECT_TRUE(os::release_memory(heap, 1 * G));
>> 376:   UseTransparentHugePages = useThp;
>> 377: }
> 
> This seems like it's concurrently running `madvise(..., MADV_POPULATE_WRITE)`, correct? This is not what I meant.
> 
> What I meant was having at least 2 threads, where one thread is running `os::pretouch_memory` and another using the memory for something. For example, 1 thread pretouching, the other thread filling out the memory with an incrementing integer array `[0,1,2,3,4,...]`. I think this is what Kim meant also, or am I the one misunderstanding him?

[Sorry, I lost track of this and didn't respond to the earlier comment from
@jdksjolen.]

Yes, that's correct.  The reason for adding the safe-for-concurrent-use
pretouch mechanism was https://bugs.openjdk.org/browse/JDK-8260332.

The idea is that presently, when a thread needs to expand the oldgen, it
pretouches while holding the expansion lock.  Any other threads that also
need the oldgen to be expanded have to wait until the holder of that lock
completes.  Most of the work involved in expansion is quick and short, but not
so much for pretouching.  So it was found that we're sometimes blocking a
bunch of threads for a long-ish time.

The original proposal there was to allow the otherwise waiting threads to
cooperate in the pretouch.  But the protocol involved was complicated and
messy.  A simpler approach was suggested: allow other threads to use the newly
expanded memory concurrently with the expanding thread doing the pretouch.
There's obviously some racing there, with the using threads possibly touching
pages before the pretouching reaches them, but the thinking is that the
pretouch wavefront will likely surge ahead of the using threads.  And if
not, then the using threads are effectively cooperating in the "pretouch".
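
As a rough illustration of that race, here is a standalone sketch (not
HotSpot code) with one thread pretouching by atomic add of 0 while another
fills the same range with `[0,1,2,3,...]`. The names `pretouch`,
`fill_incrementing`, `pretouch_while_in_use` and the 4096-byte page size are
assumptions made for the example. It only checks that the pretouch never
clobbers a value written by the using thread; it does not measure page
faulting, since the zero-initialized allocation already touches the memory.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <thread>

// Assumed page size, expressed in ints; real code would query the OS.
constexpr std::size_t kIntsPerPage = 4096 / sizeof(int);

// Touch one word per page with an atomic add of 0. The add writes back
// whatever value is already there, so a concurrent store is never lost.
void pretouch(std::atomic<int>* base, std::size_t count) {
  for (std::size_t i = 0; i < count; i += kIntsPerPage) {
    base[i].fetch_add(0, std::memory_order_relaxed);
  }
}

// The "using" thread: fills the range with 0, 1, 2, ...
void fill_incrementing(std::atomic<int>* base, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    base[i].store(static_cast<int>(i), std::memory_order_relaxed);
  }
}

// Runs both threads over the same range and returns true if every
// element holds its own index afterwards.
bool pretouch_while_in_use(std::size_t count) {
  // The trailing () zero-initializes the atomics.
  std::unique_ptr<std::atomic<int>[]> mem(new std::atomic<int>[count]());
  std::thread toucher(pretouch, mem.get(), count);
  std::thread user(fill_incrementing, mem.get(), count);
  toucher.join();
  user.join();
  for (std::size_t i = 0; i < count; ++i) {
    if (mem[i].load(std::memory_order_relaxed) != static_cast<int>(i)) {
      return false;
    }
  }
  return true;
}
```

Whichever thread reaches a page first, the final contents are those written
by the using thread, which is the property the concurrent-use pretouch
relies on.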

That approach needed https://bugs.openjdk.org/browse/JDK-8272807 as a building
block.

But I discovered there were a bunch of places with similar problems,
suggesting the need for some more general mechanism. I did a bit of
prototyping in that direction, but got distracted by other work and haven't
gotten back to it. (The idea is to record needed pretouching, deferring it up
the call chain, to a point where other threads are not being blocked waiting
for the expansion operation. A complicating factor is that some of those
places may have multiple distinct memory ranges being allocated and needing
pretouch, all within the same expansion operation.)
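
To make the shape of that idea concrete, here is a minimal standalone sketch
of recording ranges under a lock and touching them only after the lock is
released. It is not the prototype mentioned above; `PendingRange`,
`DeferredPretouch`, `expand`, `expand_then_pretouch` and `expansion_lock` are
hypothetical names, and the touch operation is passed in by the caller rather
than assumed.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical record of one memory range that still needs pretouching.
struct PendingRange {
  char* start;
  std::size_t bytes;
};

// Hypothetical collector handed down the call chain during an expansion.
class DeferredPretouch {
  std::vector<PendingRange> _pending;

 public:
  void record(char* start, std::size_t bytes) {
    _pending.push_back(PendingRange{start, bytes});
  }

  // Apply touch to every recorded range; returns the number of ranges.
  template <typename TouchFn>
  std::size_t run(TouchFn touch) {
    std::size_t n = _pending.size();
    for (const PendingRange& r : _pending) {
      touch(r.start, r.bytes);
    }
    _pending.clear();
    return n;
  }
};

std::mutex expansion_lock;  // stand-in for the real expansion lock

// Stand-in for an expansion that produces two distinct ranges. Only the
// quick bookkeeping happens under the lock; pretouching is just recorded.
void expand(char* data, std::size_t data_bytes,
            char* side_table, std::size_t table_bytes,
            DeferredPretouch& deferred) {
  std::lock_guard<std::mutex> guard(expansion_lock);
  deferred.record(data, data_bytes);
  deferred.record(side_table, table_bytes);
}  // lock released here, before any pretouching

// Caller further up the chain: expand first, then pretouch without the lock.
template <typename TouchFn>
std::size_t expand_then_pretouch(char* data, std::size_t data_bytes,
                                 char* side_table, std::size_t table_bytes,
                                 TouchFn touch) {
  DeferredPretouch deferred;
  expand(data, data_bytes, side_table, table_bytes, deferred);
  return deferred.run(touch);
}
```

Because the touching runs after the lock is dropped, other threads may
already be using the ranges, so the touch passed in would have to be safe for
concurrent use.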

But that approach may interact poorly with the madvise approach. It might be
that the madvise _should_ be done down inside the expansion operation, where
the pretouches currently happen, rather than being deferred up the call chain.
Deferring would let the madvise run concurrently with using threads, which
might introduce the same "shredding" problem the madvise is attempting to fix.
That would be yet another complicating factor that my prototyping didn't
address at all.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1437864972