RFR(s): 8077276: allocating heap with UseLargePages and HugeTLBFS may trash existing memory mappings (linux)

Stefan Karlsson stefan.karlsson at oracle.com
Wed Apr 15 11:55:18 UTC 2015


Hi Thomas,

Thanks for fixing this  bug.

Some comments inlined:

On 2015-04-09 17:08, Thomas Stüfe wrote:
> Hi all,
>
> please review this fix to huge page allocation on Linux.
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8077276
> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8077276/webrev.00/webrev/

http://cr.openjdk.java.net/~stuefe/webrevs/8077276/webrev.00/webrev/src/os/linux/vm/os_linux.cpp.udiff.html

+  const int prot = exec ? PROT_READ|PROT_WRITE|PROT_EXEC : PROT_READ|PROT_WRITE;
  
    char* start;
    if (req_addr != NULL) {
      assert(is_ptr_aligned(req_addr, alignment), "Must be");
      assert(is_size_aligned(bytes, alignment), "Must be");
-    start = os::reserve_memory(bytes, req_addr);
-    assert(start == NULL || start == req_addr, "Must be");
+    // do NOT use MAP_FIXED here - do not trash existing mappings.
+    start = (char*) ::mmap(req_addr, bytes, prot,
+      MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0);
+    if (start == MAP_FAILED) {
+      start = NULL;
+    }
+    if (!is_ptr_aligned(start, alignment)) {
+      ::munmap(start, bytes);
+      start = NULL;
+    }


Previously, the small page area was mapped with PROT_NONE, but now you 
change it to RW. Is there a reason why you changed that?


    result = ::mmap(lp_start, lp_bytes, prot,
                    MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED|MAP_HUGETLB,
                    -1, 0);
    if (result == MAP_FAILED) {
-    warn_on_large_pages_failure(req_addr, bytes, errno);
+    warn_on_large_pages_failure(start, bytes, errno);

Shouldn't we report lp_start instead of start here?


http://cr.openjdk.java.net/~stuefe/webrevs/8077276/webrev.00/webrev/raw_files/new/src/os/linux/vm/os_linux.cpp

There are a number of things that I would like to change with the the test.

Can you use mapping == NULL in null checks instead of just if (mapping). 
The same for the asserts.

I think the test will be easier to understand if you get rid of the for 
loop and extract code into helper functions, or do a small amount of 
code duplication:

     // do the test twice: once with a req_addr=NULL, once with a non-NULL
     // req_addr; the latter shall test if we trash existing mappings
     for (int run = 0; run < 2; run ++) {

I think you should fail or exit the test if you can't get the expected 
mapping, instead of continuing and have :

       if (addr) {
         test_log("... mapped [" PTR_FORMAT "-" PTR_FORMAT ").", addr, 
addr + size);
       } else {
         test_log("... failed.");
       }

       if (addr != NULL) {

Thanks,
StefanK

>
> os::Linux::reserve_memory_special_huge_tlbfs_mixed() first establishes a
> mapping with small pages over the whole requested range, then exchanges the
> parts of the mapping which are aligned to large page size with large pages.
>
> Now, when establishing the first mapping, it uses os::reserve_memory() with
> a potentially non-null req_addr. In that case, os::reserve_memory() will do
> a mmap(MAP_FIXED).
>
> This will trash any pre-existing mappings.
>
> Note that I could not willingly reproduce the error with an unmodified VM.
> But I added a reproduction case which demonstrated that if one were to call
> os::Linux::reserve_memory_special_huge_tlbfs_mixed() with a non-NULL
> req_addr, existing mappings would be trashed. Depending on where we are in
> address space, we also would overwrite libc structures and crash
> immediately.
>
> The repro case is part of the change, see changes in
> test_reserve_memory_special_huge_tlbfs_mixed(), and can be executed with:
>
> ./images/jdk/bin/java -XX:+UseLargePages -XX:+UseHugeTLBFS -XX:-UseSHM
> -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests
>
> The fix: instead of using os::reserve_memory(),
> os::Linux::reserve_memory_special_huge_tlbfs_mixed() now calls mmap()
> directly with the non-NULL req_addr, but without MAP_FIXED. This means the
> OS may do its best to allocate at req_addr, but will not trash any
> mappings. This also follows the pattern in
> os::Linux::reserve_memory_special_huge_tlbfs_only(), which is a sister
> function of os::Linux::reserve_memory_special_huge_tlbfs_mixed().
>
> Note also discussion at:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2015-April/017823.html.
>
> Thanks for reviewing!
>
> Kind Regards, Thomas



More information about the hotspot-runtime-dev mailing list