MMap and hugetlbfs

Cleber Muramoto cleber.muramoto at gmail.com
Wed Mar 13 23:09:47 UTC 2024


Are there plans to encapsulate the offset/page-boundary computation in
FileChannel.map for files on a hugetlbfs mount?

The problem is that, unlike files on "regular" file systems with 4K pages,
files on hugetlbfs only accept an *ftruncate* length that is a multiple of
the huge page size. Otherwise the call fails and sets errno to EINVAL, which
the JNI layer translates into an IOException with a not very informative
message: "Invalid argument".

With hugetlbfs, we also have to compute page positions manually, since
FileDispatcher::allocationGranularity() will "always" report 4K, whereas
*mmap* has to be called with an offset aligned to the huge page size, e.g. 2M:

----------
import static java.nio.file.StandardOpenOption.*;

import java.io.*;
import java.lang.foreign.*;
import java.nio.channels.*;
import java.nio.channels.FileChannel.*;
import java.nio.file.*;

public class TestHugeTLBFS {

  static final long REGULAR_PS = 4096L;
  static final long HUGE_PS = 1024 * 1024 * 2L;
  static final StandardOpenOption[] OPTS = { CREATE, WRITE, READ };

  public static void main(String[] args) throws IOException {
    var base = Paths.get("/var/lib/hugetlbfs/global/pagesize-2MB");

    var tmp = Files.createTempFile(base, "test", ".bin");

    var a = Arena.ofShared();

    MemorySegment ms = null;

    try (var fc = FileChannel.open(tmp, OPTS)) {
      // fails in truncate (length is not a multiple of HUGE_PS)
      ms = fc.map(MapMode.READ_WRITE, 0, HUGE_PS - 1, a);
    } catch (IOException e) {
      e.printStackTrace();
    }

    assert ms == null : "Worked?!";

    try (var fc = FileChannel.open(tmp, OPTS)) {
      // fails in mmap (computed offset is 4K aligned, but not 2MB aligned)
      ms = fc.map(MapMode.READ_WRITE, HUGE_PS + 3 * REGULAR_PS,
          HUGE_PS - 3 * REGULAR_PS, a);
    } catch (IOException e) {
      e.printStackTrace();
    }

    assert ms == null : "Worked?!";

    try (var fc = FileChannel.open(tmp, OPTS)) {
      // This works, because the aligned offset ends up 2MB aligned
      ms = fc.map(MapMode.READ_WRITE, HUGE_PS + 19, HUGE_PS - 19, a);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }

    ms.set(ValueLayout.JAVA_INT_UNALIGNED, 0, 42);
    a.close();

    try (var fc = FileChannel.open(tmp, OPTS)) {
      // the first arena was closed above, so map again into a fresh one
      a = Arena.ofShared();
      ms = fc.map(MapMode.READ_WRITE, HUGE_PS + 19, HUGE_PS - 19, a);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }

    assert ms.get(ValueLayout.JAVA_INT_UNALIGNED, 0) == 42;
    a.close();
  }
}
----------
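
The three outcomes above come down to simple arithmetic. As far as I
understand the current implementation, FileChannel.map rounds the requested
position down to the allocation granularity (4K here), hands that rounded
position to *mmap*, and extends the file to position + size when it is
shorter. The sketch below only replays that arithmetic for the three calls.
The class and method names are made up for illustration, and the 4K
granularity is an assumption matching the behaviour described above.

```java
// Illustration only: replays the offset arithmetic that FileChannel.map is
// assumed to perform with a 4K allocation granularity. Names are hypothetical.
final class MapArithmetic {
  static final long REGULAR_PS = 4096L;
  static final long HUGE_PS = 2L * 1024 * 1024;

  /** Offset assumed to be handed to mmap: the position rounded down to 4K. */
  static long mapPosition(long position) {
    return position - (position % REGULAR_PS);
  }

  /** Length assumed to be handed to ftruncate when the file is shorter. */
  static long truncateLength(long position, long size) {
    return position + size;
  }

  /** True if the value is a multiple of the 2MB huge page size. */
  static boolean hugeAligned(long value) {
    return value % HUGE_PS == 0;
  }

  static String describe(long position, long size) {
    return "truncate ok=" + hugeAligned(truncateLength(position, size))
        + ", mmap offset ok=" + hugeAligned(mapPosition(position));
  }

  public static void main(String[] args) {
    // first call: truncate length is HUGE_PS - 1
    System.out.println(describe(0, HUGE_PS - 1));
    // second call: mmap offset is HUGE_PS + 12K
    System.out.println(describe(HUGE_PS + 3 * REGULAR_PS, HUGE_PS - 3 * REGULAR_PS));
    // third call: both values end up 2MB aligned
    System.out.println(describe(HUGE_PS + 19, HUGE_PS - 19));
  }
}
```

Only the third call has both a 2MB-aligned truncate length and a 2MB-aligned
mmap offset, and that happens by coincidence: the 19 stray bytes are absorbed
by the 4K rounding.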

To satisfy the alignment constraints for both offset and length, the offset
has to be rounded down to a page boundary, and the length has to grow to
cover the bytes between the aligned start and the requested offset (and then
be rounded up to a multiple of the page size), more or less like:

MemorySegment map(Path path, long offset, long length, long pageSize) {
    var start = offset;
    var len = length;
    var truncLen = start + len;

    if (offset % pageSize != 0 || truncLen % pageSize != 0) {
      // round down to start of a page: offset -> offset - (offset % pageSize)
      start = alignDown(offset, pageSize);
      var end = offset + length;
      len = end - start;
      truncLen = start + len;

      if (truncLen % pageSize != 0) {
        // round up to a multiple of pageSize:
        // length -> pageSize * (length / pageSize + ((length % pageSize == 0) ? 0 : 1))
        len = alignUp(len, pageSize);
        truncLen = start + len;

        assert truncLen % pageSize == 0 : "Sanity";
      }
    }

    try (var fc = FileChannel.open(path,...)) {
      var segment = fc.map(mode, start, len, Arena.ofSomething());

      if (start != offset || len != length) {
        segment = segment.asSlice(offset - start, length);
      }

      return segment;
    }
}
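
For reference, the alignment part of the sketch above can be isolated into a
small, self-contained helper. This is only one possible shape, not a proposal
for the actual API: AlignedRange and its fields are hypothetical names,
alignDown/alignUp are written out here because the sketch above leaves them
undefined, and the page size is assumed to be a power of two. It always
aligns the start, because the second failing case in the test program has an
aligned end but an unaligned start.

```java
// Hypothetical helper: computes the page-aligned window that covers
// [offset, offset + length). pageSize is assumed to be a power of two.
final class AlignedRange {
  final long start;   // offset rounded down to a page boundary
  final long length;  // window length, always a multiple of pageSize
  final long delta;   // position of the requested offset inside the window

  private AlignedRange(long start, long length, long delta) {
    this.start = start;
    this.length = length;
    this.delta = delta;
  }

  static long alignDown(long value, long pageSize) {
    return value & -pageSize;
  }

  static long alignUp(long value, long pageSize) {
    return (value + pageSize - 1) & -pageSize;
  }

  static AlignedRange of(long offset, long length, long pageSize) {
    if (offset < 0 || length < 0 || pageSize <= 0 || Long.bitCount(pageSize) != 1) {
      throw new IllegalArgumentException("bad offset, length or page size");
    }
    long start = alignDown(offset, pageSize);
    // the end of the window is also the length the file must be truncated to
    long end = alignUp(Math.addExact(offset, length), pageSize);
    return new AlignedRange(start, end - start, offset - start);
  }
}
```

With such a helper, the mapping step would reduce to something like
fc.map(mode, r.start, r.length, arena).asSlice(r.delta, length), where r is
the AlignedRange computed from the caller's offset and length.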

It would be nice to have this somehow handled by FileChannel.

An overload with a user-defined page size might not be ideal, but it might
be the cheapest way to bypass the (incorrect)
FileDispatcher::allocationGranularity() value to enforce correct alignment
constraints.

Regards

More information about the panama-dev mailing list