RFR: 8271195: Use largest available large page size smaller than LargePageSizeInBytes when available [v10]

Claes Redestad redestad at openjdk.java.net
Wed Feb 23 10:14:56 UTC 2022


On Wed, 23 Feb 2022 08:38:28 GMT, Swati Sharma <duke at openjdk.java.net> wrote:

>> Hi Team,
>> 
>> In this patch I have fixed two issues related to large pages; the following is a summary of the changes:
>> 
>> 1. The patch fixes the existing large page allocation functionality: if a commit over 1GB pages fails, allocation should fall back to the next smaller large page size, i.e. 2MB, whereas currently it falls back to 4KB pages, incurring a significant TLB miss penalty.
>> The patch includes a new JTREG test case covering various scenarios that check correct explicit page allocation according to the 1G, 2M, 4K priority.
>> 2. While attempting a commit over large pages we first reserve the requested bytes in the virtual address space; if the commit to large pages fails, we should un-reserve the entire reservation to avoid leaking virtual address space.
>> 
>> Please find below the performance data with and without the patch for the JMH benchmark included with the patch.
>> 
>> ![image](https://user-images.githubusercontent.com/96874289/152189587-4822a4ca-f5e2-4621-b405-0da941485143.png)
>> 
>> 
>> Please review and provide your valuable comments.
>> 
>> 
>> 
>> Thanks,
>> Swati Sharma
>> Runtime Software Development Engineer 
>> Intel
>
> Swati Sharma has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8271195: TestCase issue resolved.

@albertnetymk on the question whether a microbenchmark like this makes sense: I think it _can_ be useful to add micros that are diagnostics/verification tools for corner case effects, even if they'd be a bad fit for inclusion in automated regression testing. Still, it's important to make sure each such micro has a reasonable runtime, provides a decent signal-to-noise ratio, and has well-documented external constraints. We do not automatically include newly added microbenchmarks in automated testing, so don't worry about this clogging up the CI or weekly perf testing.

Setting up benchmarks that stress large pages is brittle and depends a lot on the system and on things outside the control of the JMH micro itself, so I'd ask for a high-level comment in the microbenchmark that gives a quick rundown of what the benchmark is trying to test, how to configure the system, how you verify it's actually testing the right thing, and in which configurations you expect to see an effect.
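As a side note, the fallback order described in the patch summary can be modeled roughly like this (a hedged sketch with hypothetical names — this is not HotSpot code): pick the largest configured page size that does not exceed the requested size, so a failed 1G request degrades to 2M before 4K.

```java
import java.util.List;

public class PageSizePick {
    // Hypothetical model of the fallback order the patch describes:
    // choose the largest available page size <= the requested size,
    // e.g. 1G -> 2M -> 4K.
    static long pickPageSize(List<Long> available, long requested) {
        long best = 4096L;                       // base page size, always available
        for (long sz : available) {
            if (sz <= requested && sz > best) {
                best = sz;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // 2M and 1G pages configured: a 1G request uses 1G pages
        System.out.println(pickPageSize(List.of(2L << 20, 1L << 30), 1L << 30));
        // only 2M pages configured: a 1G request falls back to 2M
        System.out.println(pickPageSize(List.of(2L << 20), 1L << 30));
    }
}
```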

test/micro/org/openjdk/bench/vm/gc/MicroLargePages.java line 34:

> 32: public class MicroLargePages {
> 33: 
> 34:     @Param({"1073741824"})

Exactly 2^30 elements + array header size means the `long[]` will be a couple of bytes above 1GB. Have you tried the experiment on slightly smaller values?

To make the micro more practical in terms of runtime, it'd be interesting to see whether the effects can be verified using either a smaller `ARRAYSIZE` by default, or by having the loop in `micro_HOP_DIST_4KB` that iterates from `0` to `ARRAYSIZE` only cover a small fraction of the total size.
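Something along these lines, perhaps (an illustrative sketch, not the patch's code — `touchFraction` and `HOP` are hypothetical names): touch one `long` per 4KB page over only a prefix of the array, which still forces a TLB lookup per page touched while keeping the per-invocation work small.

```java
public class HopSketch {
    static final int HOP = 4096 / Long.BYTES;    // one access per 4KB page

    // Walk only 1/fraction of the array, hopping one 4KB page at a time.
    static long touchFraction(long[] arr, int fraction) {
        long sum = 0;
        int limit = arr.length / fraction;       // prefix instead of the full array
        for (int i = 0; i < limit; i += HOP) {
            sum += arr[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] arr = new long[1 << 20];          // 8MB array for the demo
        java.util.Arrays.fill(arr, 1L);
        // touch 1/4 of the array, one access per 512 longs
        System.out.println(touchFraction(arr, 4));
    }
}
```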

test/micro/org/openjdk/bench/vm/gc/MicroLargePages.java line 57:

> 55:     @Benchmark
> 56:     public void micro_HOP_DIST_4KB() {
> 57:         for (int i = 0; i < ITERS ; i++) {

JMH runs as many iterations as it can in the allotted runtime. The `ITERS` loop here seems completely unnecessary and just increases the runtime of each invocation 100x. Having a benchmark iteration take an excessively long time like this also throws off the total runtime estimates.

If you want to make sure the micro runs at least some number of times to capture an effect, you'd be better off increasing the iteration time.
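That is, drop the manual `ITERS` loop and let JMH control the total runtime via the iteration time instead — something like the following fragment (annotation values are illustrative, and `INSTANCE`/`HOP`/`ARRAYSIZE` refer to the benchmark's own fields):

```java
@Benchmark
@Warmup(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 3, time = 5, timeUnit = TimeUnit.SECONDS)
public void micro_HOP_DIST_4KB() {
    // single pass per invocation; JMH repeats it for the configured time
    for (int i = 0; i < ARRAYSIZE; i += HOP) {
        INSTANCE[i] += 1;
    }
}
```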

-------------

PR: https://git.openjdk.java.net/jdk/pull/7326



More information about the hotspot-gc-dev mailing list