RFR: 8272807: Permit use of memory concurrent with pretouch

Wed Sep 8 14:07:04 UTC 2021

On Thu, 2 Sep 2021 18:33:56 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Note that this PR replaces the withdrawn https://github.com/openjdk/jdk/pull/5215.
> 
> Please review this change which adds os::touch_memory, which is similar to
> os::pretouch_memory but allows concurrent access to the memory while it is
> being touched.  This is accomplished by using an atomic add of zero as the
> operation for touching the memory, ensuring the virtual location is backed
> by physical memory while not changing any values being read or written by
> other threads.
> 
> While I was there, fixed some other lurking issues in os::pretouch_memory.
> There was a potential overflow in the iteration that has been fixed.  And if
> the range arguments weren't page aligned then the last page might not get
> touched.  The latter was even mentioned in the function's description.  Both
> of those have been fixed by careful alignment and some extra checks.  The
> resulting code is a little more complicated, but more robust and complete.
> 
> Similarly added TouchTask, which is similar to PretouchTask.  Again here,
> there is some cleaning up to avoid potential overflows and such.
> 
> - The chunk size is computed using the page size after possible adjustment
> for UseTransparentHugePages.  We want a chunk size that reflects the actual
> number of touches that will be performed.
> 
> - The chunk claim is now done using a CAS that won't exceed the range end.
> The old atomic-fetch-and-add and check the result, which is performed by
> each worker thread, could lead to overflow.  The old code has a test for
> overflow, but since pointer-arithmetic overflow is UB that's not reliable.
> 
> - The old calculation of num_chunks for parallel touching could also
> potentially overflow.
> 
> Testing:
> mach5 tier1-3

I think it would be prudent to separate this PR into two separate PRs.

1) For the changes to the pretouch loop.

* I'm not sure we need all the added safeguarding, which, as you say, complicate the code. I think a few asserts would be good enough.

I don't think the function needs to:
- work if the caller pass in end < start. Sounds like an assert.
- Safeguard that end + page doesn't overflow. Do we ever hand out pages at the end of the virtual address range? There's a lot of HotSpot code that assumes that we never get that kind of memory, so is it worth protecting against this compared to the extra complexity it gives? Previously, I could take a glance at that function and understand it. Now I need to pause and stare at it a bit.

2) For the new infrastructure.

* I'm not sure we should add TouchTask until we have a concrete use-case (with prototype/PR) for it. Without that, it is hard to gauge the usability of this feature. How do other threads proceed concurrently with the worker threads? What is the impact of using TouchTask? One of the concerns that has been brought up by others is that using a WorkGang will block the rest of the JVM from safepointing while the workers are pre-touching. A use-case would clarify this, I think.

* Is the usage of template specialization really needed? Wouldn't it be sufficient to pass down a bool parameter instead? I doubt we would see any performance difference by doing that.

* It's a little bit confusing that in the context of TouchTask "touch" means "concurrent touch", while in the touch_memory_* it means either concurrent or non-concurrent. Maybe rename TouchTask to something to make this distinction a bit clearer.

-------------

Changes requested by stefank (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5353