RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v16]

Marcus G K Williams github.com+168222+mgkwill at openjdk.java.net
Thu Mar 11 22:35:08 UTC 2021


On Thu, 4 Mar 2021 07:21:37 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>>> > > What do you think? I think this would be a bit easier to read and understand, and we have that clear separation between scanning OS info and deciding what we do with it.
>>> > 
>>> > 
>>> > I think what you propose Thomas looks good. One additional thing to keep in mind and think about here is how we should do the "sanity checking" when allowing multiple large page sizes. I think the best thing would be to sanity check all and if none succeeds disable `UseLargePages`.
>>> 
>>> Oh, sure. I did not make this explicit, but implied it under "post processing and deciding". Presumably in the context of setup_large_page_type().
>>> 
>> Sure, got that, just wanted to highlight that we need to figure out how to handle the sanity check for multiple sizes. Should a size that fails the sanity check be removed from the `_page_sizes` member? Maybe `_page_sizes` should include all page sizes, and then we have an additional member for "usable large page sizes". As I said, not sure how to best handle this.
>> 
>>> > > Still a small nit is that we let the user override the OS info with LargePageSizeInBytes. I would rather have a variable containing unmodified OS info, and a separate variable for whatever we make up. But that's just a small issue.
>>> > 
>>> > 
>>> > I think we need to rethink exactly what `LargePageSizeInBytes` means when allowing multiple large page sizes. I've poked around in this area quite a bit lately and I'm not sure this flag is needed when we scan for available page sizes. But to allow it to go away we would have to change the APIs a bit to start passing down the page size we want to use for a certain mapping rather than using `os::large_page_size()` to get the page size.
>>> 
>>> If we could do without this flag this would be fine for me too. But how would you let the user specify that the VM is to use a different default page size than is set on system level?
>> 
>> I agree, it's not obvious how to make this work in a good way. But using the `os::page_size_for_region*` functions in the upper layers to request a page size could be one solution. But we probably need to have a way to change the "default" value for some cases.
>> 
>> Another thing to think about/discuss is what should be done if a reservation request within the VM for 4G with 1G pages fails: should we fall straight back to 4k pages, should we try 2M pages, or possibly fail hard to show something is probably wrong with the config?
>
>> Hi @kstefanj and @tstuefe . Trying to resolve your comments and working through your suggestions. I will be responding more over the next day or so as I try to implement and understand what you are proposing. Thanks again for your review and suggestions.
> 
> Well, thanks for your patience :)

Hello @tstuefe & @kstefanj. 

I've updated the change to implement your suggestions, all of them hopefully. I'd appreciate any further review and suggestions.

There is one issue with the current code for which I'd like suggestions. Invariably, at some point
`os::page_size_for_region_aligned(bytes, 1)` in `os::Linux::reserve_memory_special_huge_tlbfs_mixed` returns 4096, which of course breaks the assert `assert(large_page_size > (size_t)os::vm_page_size(), "large page size: %ld not larger than small page size: %ld", large_page_size, (size_t)os::vm_page_size());` (line 4059 in os_linux.cpp).

Here is the ps.log:

> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (/home/mgkwill/src/git/jdk/src/hotspot/os/linux/os_linux.cpp:4059), pid=2789288, tid=2789289
> #  assert(large_page_size > (size_t)os::vm_page_size()) failed: large page size: 4096 not larger than small page size: 4096
> #
> # JRE version:  (17.0) (slowdebug build )
> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.mgkwill.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64)
> # Problematic frame:
> # V  [libjvm.so+0xfb660a]  os::Linux::reserve_memory_special_huge_tlbfs_mixed(unsigned long, unsigned long, char*, bool)+0x146
> #
> # Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" (or dumping to /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/scratch/1/core.2789288)
> #
> # An error report file with more information is saved as:
> # /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/scratch/1/hs_err_pid2789288.log
> #
> #
> Aborted (core dumped)
> ps-2789288.log:
> [0.002s][info][pagesize] Available page sizes: 4k, 2M, 1G
> [0.003s][info][pagesize] Available large page sizes: 2M, 1G
> [0.005s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 2097152, for bytes: 251658240
> [0.005s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096
> [0.005s][info][pagesize] Memory: 4k page, physical 131844416k(13735672k free), swap 0k(0k free)
> [0.005s][info][pagesize] 2048k default large page
> [0.005s][info][pagesize] Page Sizes: 4k, 2M, 1G
> [0.005s][info][pagesize] CodeHeap 'non-nmethods':  min=4M max=6M base=0x00007f8edc600000 page_size=2M size=6M
> [0.006s][info][pagesize] CodeHeap 'profiled nmethods':  min=4M max=116M base=0x00007f8edcc00000 page_size=2M size=116M
> [0.007s][info][pagesize] CodeHeap 'non-profiled nmethods':  min=4M max=118M base=0x00007f8ee4000000 page_size=2M size=118M
> [0.023s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 2097152, for bytes: 16202596352
> [0.023s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096
> [0.023s][info][pagesize] Memory: 4k page, physical 131844416k(13735672k free), swap 0k(0k free)
> [0.023s][info][pagesize] 2048k default large page
> [0.023s][info][pagesize] Page Sizes: 4k, 2M, 1G
> [0.023s][info][pagesize] Heap:  min=8M max=15452M base=0x000000043a400000 page_size=2M size=15452M
> [0.023s][info][pagesize] Card Table:  min=31645697B max=31645697B base=0x00007f8ef8919000 page_size=4K size=30908K
> [0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496
> [0.825s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096
> [0.825s][info][pagesize] Memory: 4k page, physical 131844416k(13732388k free), swap 0k(0k free)
> [0.825s][info][pagesize] 2048k default large page
> [0.825s][info][pagesize] Page Sizes: 4k, 2M, 1G

I'm not sure I understand why the last call logs `Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496`, when the two earlier calls returned 2097152.

Any suggestions as to the issue and solution? Once I solve this I will remove the excessive logging.
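
One thing that can be checked outside the VM is the arithmetic in the log. The sketch below is a standalone model, not the HotSpot code: the function name `sketch_page_size_for_region_aligned` and its selection rule (take the largest page size that fits the region at least `min_pages` times and divides the region size evenly, otherwise fall back to the small page size) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical stand-in for an aligned page-size lookup. This is NOT the
// HotSpot implementation; the selection rule below is an assumption.
static const unsigned long long K = 1024ULL;
static const unsigned long long M = K * K;
static const unsigned long long G = K * M;

unsigned long long sketch_page_size_for_region_aligned(unsigned long long bytes,
                                                       unsigned long long min_pages) {
  assert(min_pages > 0);
  const unsigned long long small_page = 4 * K;
  // Candidates, largest first, matching "Available page sizes: 4k, 2M, 1G".
  for (unsigned long long page : {1 * G, 2 * M, small_page}) {
    // Assumed rule: the page must fit min_pages times into the region
    // and the region size must be an exact multiple of the page size.
    if (page <= bytes / min_pages && bytes % page == 0) {
      return page;
    }
  }
  // Nothing matched: fall back to the small page size.
  return small_page;
}
```

Under that assumed rule, the three region sizes from the log (251658240, 16202596352 and 21098496 bytes, each with `min_pages` of 1) give 2097152, 2097152 and 4096. The last size is a multiple of 4096 but leaves a remainder of 126976 bytes modulo 2M, so only the small page size divides it evenly. Whether the real function applies the same alignment rule would need to be confirmed against the source.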

Thanks,
Marcus

-------------

PR: https://git.openjdk.java.net/jdk/pull/1153


