Memory Mapped Segment with offsets into the underlying file

Johannes Lichtenberger lichtenberger.johannes at gmail.com
Wed Jul 1 22:44:55 UTC 2020


I think I got a more performant version working now:

https://github.com/sirixdb/sirix/tree/master/bundles/sirix-core/src/main/java/org/sirix/io/memorymapped

Now I'm reading and writing with the Foreign Memory API. During writes I
first acquire a memory-mapped segment of Integer.MAX_VALUE bytes. Then I
keep checking whether the size I'm about to write is bigger than the
region I've mapped, doubling the mapped size each time that's needed. I
truncate the file after a commit, when the writer is closed (after the
memory-mapped segment has been closed).
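
For anyone curious, a minimal sketch of that grow-and-truncate scheme
could look roughly like the following, using the three-argument
mapFromPath from the Java 14 incubator API that's also quoted further
down in this thread. The class and method names are made up for
illustration; this is not the actual SirixDB writer, and it assumes that
a READ_WRITE mapping larger than the file grows the file, as
FileChannel.map does:

import jdk.incubator.foreign.MemoryAddress;
import jdk.incubator.foreign.MemoryHandles;
import jdk.incubator.foreign.MemorySegment;

import java.io.IOException;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch only -- not SirixDB's actual writer.
final class GrowingMappedWriter implements AutoCloseable {

    private static final VarHandle LONG_HANDLE =
        MemoryHandles.varHandle(long.class, ByteOrder.nativeOrder());

    private final Path dataFile;
    private MemorySegment segment;
    private long mappedSize;
    private long dataLength; // the "real" length of the written data

    GrowingMappedWriter(final Path dataFile, final long initialSize)
            throws IOException {
        this.dataFile = dataFile;
        this.mappedSize = initialSize;
        // Mapping READ_WRITE with a size larger than the file grows the file.
        this.segment = MemorySegment.mapFromPath(dataFile, mappedSize,
            FileChannel.MapMode.READ_WRITE);
    }

    void appendLong(final long value) throws IOException {
        ensureCapacity(Long.BYTES);
        final MemoryAddress address =
            segment.baseAddress().addOffset(dataLength);
        LONG_HANDLE.set(address, value);
        dataLength += Long.BYTES;
    }

    // Double the mapped region whenever the next write wouldn't fit.
    private void ensureCapacity(final long bytesToWrite) throws IOException {
        if (dataLength + bytesToWrite <= mappedSize) {
            return;
        }
        while (mappedSize < dataLength + bytesToWrite) {
            mappedSize *= 2;
        }
        segment.close(); // unmap before remapping with the larger size
        segment = MemorySegment.mapFromPath(dataFile, mappedSize,
            FileChannel.MapMode.READ_WRITE);
    }

    @Override
    public void close() throws IOException {
        segment.close();
        // Truncate the over-allocated file back to the real data length.
        try (FileChannel channel =
                 FileChannel.open(dataFile, StandardOpenOption.WRITE)) {
            channel.truncate(dataLength);
        }
    }
}

With an initial size of Integer.MAX_VALUE, as described above, the
doubling only kicks in once more than 2 GB of data have been written.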

Thus, importing a 3.8 GB JSON document into SirixDB now takes about 9
minutes from within IntelliJ (the resulting file is about 2.4 GB) on my
machine with a SATA SSD. It will hopefully get even better once I've
fixed a concurrency bug and can fetch the page fragments concurrently and
in parallel, on a PCIe SSD drive or Intel Optane Memory.

When reading the whole file in preorder, that is, fetching more than
310_000_000 nodes, I'm currently down to about 4 minutes vs. around
6:20 min with the RandomAccessFile-based implementation :-)

I also reduced the maximum number of instances to cache in the in-memory
Caffeine caches (basically the simple buffer manager implementation), so
GC pauses are _always_ about 5 ms to 15 ms with G1, without noticeable
peaks. I think it makes sense to use the default G1 here with the
memory-mapped segment, and probably Shenandoah in the other case with a
large heap.
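
For reference, bounding such a Caffeine cache is essentially a one-liner;
the key and value types and the maximum size below are made up, not
SirixDB's actual configuration:

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

final class PageCacheExample {
    // Hypothetical page cache: capping maximumSize keeps the live set
    // small, which is what keeps the G1 pauses short and predictable.
    static final Cache<Long, byte[]> PAGE_CACHE = Caffeine.newBuilder()
        .maximumSize(5_000) // made-up upper bound on cached instances
        .build();
}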

The only thing that might still be missing is sharing a MemorySegment
between threads, while making sure that only ever one thread reads or
writes at a time (but I guess that's something that has to wait for
Java 15 or even later). A basic operation SirixDB offers is a timer-based
auto-commit via a ScheduledExecutorService, which commits in another
thread than the main parent thread.
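
A stripped-down sketch of such a timer-based auto-commit is shown below;
TransactionWriter and commit() are placeholder names, not SirixDB's
actual API:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class AutoCommitExample {

    // Placeholder for whatever transaction/writer abstraction commits.
    interface TransactionWriter {
        void commit();
    }

    static ScheduledExecutorService scheduleAutoCommit(
            final TransactionWriter writer, final long intervalSeconds) {
        final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        // The commit runs on the scheduler's thread, i.e. not on the
        // thread that opened the writer -- which is exactly why a
        // thread-confined MemorySegment can't simply be touched from here.
        scheduler.scheduleAtFixedRate(writer::commit,
            intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}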

kind regards and thanks for all the suggestions
Johannes

On Wed, Jul 1, 2020 at 8:15 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> I believe (as with other things mmap related) you are in OS territory. I
> think the upper limit there is just the virtual addressable space.
>
> Maurizio
> On 01/07/2020 19:03, Johannes Lichtenberger wrote:
>
> Another thing I couldn't get from reading the JavaDocs: what's the biggest
> size I can map on 64-bit systems?
>
> johannes at johannesl3:/opt$ uname -a
> Linux johannesl3 5.3.0-61-generic #55-Ubuntu SMP Fri Jun 19 11:16:34 UTC
> 2020 x86_64 x86_64 x86_64 GNU/Linux
>
> I tried
>
> MemorySegment.mapFromPath(dataFile, Long.MAX_VALUE,
>     FileChannel.MapMode.READ_WRITE);
>
> and some other values. Integer.MAX_VALUE is permitted at least :-)
>
> kind regards
>
> Johannes
>
>
> On Wed, Jul 1, 2020 at 1:43 PM Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com> wrote:
>
>> Hi,
>>
>> On 01/07/2020 11:04, Johannes Lichtenberger wrote:
>> > Hi,
>> >
>> > is it currently possible to specify a start offset somehow to map a
>> > specific region, in addition to the number of bytes to map? I'm using
>> > Java 14 as of now.
>> the Java 15 API will let you do that - it takes both an offset and a
>> length (in bytes).
>> >
>> > As already mentioned, my application is always appending data to a file
>> > and needs to read randomly. For writing I thought I could use a
>> > RandomAccessFile- or Channel-based implementation and only mmap the file
>> > for read-only operations. However, somehow, when I write a long offset
>> > at position 0 and afterwards try to read it via:
>> >
>> > dataFileSegment =
>> >     MemorySegment.mapFromPath(checkNotNull(dataFile),
>> >         dataFile.toFile().length(), FileChannel.MapMode.READ_ONLY);
>> >
>> > final MemoryAddress baseAddress = dataFileSegment.baseAddress();
>> >
>> > uberPageReference.setKey((long) LONG_VAR_HANDLE.get(baseAddress));
>> >
>> >
>> > the key I'm setting seems to be way off. However, it might well be
>> > a bug in my implementation, or the views somehow aren't synchronized
>> > when something is written to the file without the Foreign Memory API,
>> > or it might simply be an anti-pattern.
>>
>> Hard to say from here - while bugs in the API impl are always possible,
>> I'd double check your impl first, since it seems like what you are doing
>> is not trivial - in the sense that you end up with two views of the same
>> file.
>>
>> One thing to check is whether the writes using the regular IO API have
>> been flushed to the file before the file is memory mapped.
>>
>> Other than that your snippet above looks ok, and I don't think it should
>> trigger any issue in the impl.
>>
>> >
>> > Now I thought I could just map a region that starts at the end of
>> > the current file for appending operations and spans maybe 1 GB, for
>> > instance.
>> >
>> > Then I could save the real length of the written data in a field and
>> > adapt it every time. When the mapped segment doesn't have enough space,
>> > a new segment is simply created. Before closing the segment I'd have
>> > to truncate the file to the real length, however.
>> >
>> > Furthermore, I think it would be great if the JVM had support for
>> > madvise, to get rid of the prefetching of pages done by the kernel in my
>> > case (random reads, as it's a tree of tries... but basically as in every
>> > index structure).
>>
>> This was brought up before [1] - in general I think that, instead of
>> adding XYZ feature to the JDK (most of which are going to be heavily
>> OS-dependent), for advanced use cases it would probably be best to just use
>> `mmap` directly, as described in [2]. Of course that will only be
>> possible once the FFI support is in, but we're not too far from that ;-)
>>
>> Maurizio
>>
>> [1] -
>> https://mail.openjdk.java.net/pipermail/panama-dev/2020-April/008680.html
>> [2] -
>> https://gist.github.com/mcimadamore/128ee904157bb6c729a10596e69edffd
>>
>>
>> >
>> > And as suggested I now try to map large regions of the file (or the
>> > whole file), plus setting VarHandles as static final fields :-)
>> >
>> > kind regards
>> > Johannes
>>
>

