Memory Mapped file segment (file is empty)

Johannes Lichtenberger lichtenberger.johannes at gmail.com
Thu Jun 25 15:01:53 UTC 2020


Thanks a lot. I guess I only have two options: either map the whole file
(maybe 10 GB or more, as I'm not storing segments in smaller files, only
ever appending to the same resource-data file) or map rather small
memory regions (depending on the number of records changed, it might only be
a few hundred bytes). However, I think mapping the whole file is okay, but
maybe someone can comment on this.
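
For comparison, here is a minimal sketch of the "small region" option using the long-standing java.nio API rather than the incubating segment API. The class name MappedWindowSketch, the method readWindow and the temp-file contents are assumptions made for illustration and are not taken from SirixDB.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class MappedWindowSketch {

    /** Maps only [offset, offset + length) of the file and copies that window to the heap. */
    static byte[] readWindow(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer window = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] bytes = new byte[length];
            window.get(bytes); // bulk copy instead of a byte-by-byte loop
            return bytes;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the resource-data file.
        Path file = Files.createTempFile("resource-data", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});

        byte[] window = readWindow(file, 4, 3);
        System.out.println(Arrays.toString(window)); // prints [4, 5, 6]
    }
}
```

Whether many small mappings or one large mapping behaves better depends on the access pattern and the operating system, so this only shows the mechanics, not a recommendation.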

Regarding your comment about the loop, I'd have to change the interfaces,
which use DataOutput / DataInput and swap the implementation I guess. That
might be a bigger change. Isn't it possible to read/write a byte-array from
the memory mapped file segment and afterwards deconstruct this byte-array
via classes, which implement the DataOutput / DataInput interfaces? Maybe
it's a stupid idea and way too inefficient -- so I'll just have to refactor
this class for instance (and some others which are way too big):

https://github.com/sirixdb/sirix/blob/master/bundles/sirix-core/src/main/java/org/sirix/node/NodeKind.java

kind regards
Johannes
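
A minimal sketch of the byte-array idea asked about above, assuming the page bytes have already been copied out of the mapped region into an on-heap array. It uses only java.io classes; the class name DataInputOverBytesSketch and the record layout (a long key followed by a UTF string) are hypothetical and not the SirixDB format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DataInputOverBytesSketch {

    /** Serializes a hypothetical record (key + name) through the DataOutput interface. */
    static byte[] serialize(long nodeKey, String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(nodeKey);
            out.writeUTF(name);
        }
        return bytes.toByteArray();
    }

    /** Wraps a byte array (e.g. one copied out of a mapped region) in a DataInput. */
    static DataInput asDataInput(byte[] page) {
        return new DataInputStream(new ByteArrayInputStream(page));
    }

    public static void main(String[] args) throws IOException {
        byte[] page = serialize(42L, "element");
        DataInput in = asDataInput(page);
        System.out.println(in.readLong() + " " + in.readUTF()); // prints 42 element
    }
}
```

This keeps existing DataInput / DataOutput based code unchanged at the cost of one extra copy per page; whether that copy matters would have to be measured.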


Am Do., 25. Juni 2020 um 15:55 Uhr schrieb Maurizio Cimadamore <
maurizio.cimadamore at oracle.com>:

>
> On 25/06/2020 11:02, Johannes Lichtenberger wrote:
>
> Thanks, is it already available in a JDK 14 binary or should I wait until
> October?
>
> You might try here:
>
> https://jdk.java.net/15/
>
> Furthermore, regarding the usage itself, is this okay to get a byte-array
> from the memory mapped file or should I use some sort of MemoryLayout...?
>
> final byte[] page = new byte[dataLength];
> final VarHandle byteVarHandle = MemoryHandles.varHandle(byte.class, ByteOrder.nativeOrder());
> for (int i = 0; i < dataLength; i++) {
>   // 4 (dataLength) + 1 = 5 in offset, as the loop starts with 0 we have to add one to the offset
>   page[i] = (byte) byteVarHandle.get(baseAddress.addOffset(5 + i), i);
> }
>
> For now this is ok, we do not support mapping complex layouts to arrays
> (we might consider it in the future).
>
> Worth mentioning - the above code will not perform very well; it would be
> better to create an indexed var handle from the layout of the element you
> want to access. E.g.
>
> VarHandle intHandle =
> MemoryLayout.ofSequence(MemoryLayouts.JAVA_INT).varHandle(int.class,
> PathElement.sequenceElement());
>
> And then use the handle in a loop, like this:
>
> for (int i = 0; i < n ; i++) {
>    intHandle.get(baseAddress, (long)i);
> }
>
>
> And from what I understand about memory mapped files -- not related to your API -- but in general it makes sense to map the whole "database"-file and not just a small page-portion, right? As it will fetch the page on-demand into the non heap memory location.
>
> In the Java 15 API you can map a smaller portion of a file (by passing an
> offset and a size); this was possible with the ByteBuffer API, but the
> offset parameter was omitted from the Java 14 version of the segment API.
>
> I think when it comes to mapped files, mileage can vary - I believe some
> of my colleagues (and people in this very mailing list) are much more
> knowledgeable than I am when it comes to fine tuning the size of mapped
> memory regions :-)
>
>
> Another thing, which might be nonsense as of now, is this, I guess:
>
>  https://github.com/sirixdb/sirix/blob/7bdbd5d17a034f02902f8f7dd0ef7012d89c81fb/bundles/sirix-core/src/main/java/org/sirix/io/memorymapped/MemoryMappedFileWriter.java#L140
>
>  First, an (on-heap) byte-array is produced, such that a database page-fragment is serialized with all its records to it. Basically, the DataOutput interface is used to serialize/deserialize stuff in-memory, do compression, encryption... Changing this might be more work, and I guess it's okay.
>
> Then, I think I can skip the intermediate ByteBuffer (just copied from the RandomAccessFile based implementation), which currently means copying the byte buffer again to the memory mapped file segment.
>
> Yep - it seems to me that you should be able to use the memory access API
> directly here.
>
> Maurizio
>
>  Kind regards
>
> Johannes
>
>
> Am Do., 25. Juni 2020 um 11:26 Uhr schrieb Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com>:
>
>> Hi Johannes,
>> the specific condition you are talking about has been rectified in the
>> upcoming Java 15.
>>
>> The new code should be doing something like this:
>>
>> ```
>> if (bytesSize < 0) throw new IllegalArgumentException("Requested bytes size must be >= 0.");
>> if (bytesOffset < 0) throw new IllegalArgumentException("Requested bytes offset must be >= 0.");
>> ```
>>
>> So, hopefully, your code should work against the latest version of the API
>>
>> Maurizio
>>
>> On 25/06/2020 08:24, Johannes Lichtenberger wrote:
>> > Hi Paul,
>> >
>> > that's great. I guess usually there's a -dev and a -user mailing list,
>> > that's why I thought it's for internal development topics.
>> >
>> > I'm using AdoptOpenJDK, because I wanted to try Shenandoah, which is at
>> > least not in the Oracle binaries.
>> >
>> > IMPLEMENTOR="AdoptOpenJDK"
>> > IMPLEMENTOR_VERSION="AdoptOpenJDK"
>> > JAVA_VERSION="14.0.1"
>> > JAVA_VERSION_DATE="2020-04-14"
>> >
>> > BTW: My OS is Linux/Ubuntu.
>> >
>> > Kind regards
>> > Johannes
>> >
>> > Am Do., 25. Juni 2020 um 02:04 Uhr schrieb Paul Sandoz <
>> > paul.sandoz at oracle.com>:
>> >
>> >> Hi Johannes,
>> >>
>> >> This is the correct list. Feedback on the API and its usability is very
>> >> important, as well as whether the implementation works as expected.
>> >>
>> >> This was fixed in https://bugs.openjdk.java.net/browse/JDK-8246095
>> >>
>> >>
>> >>
>> https://github.com/openjdk/panama-foreign/blob/foreign-memaccess/src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/MappedMemorySegmentImpl.java#L102
>> >>
>> >> What JDK build are you using?
>> >>
>> >> Paul.
>> >>
>> >> On Jun 24, 2020, at 3:48 PM, Johannes Lichtenberger <
>> >> lichtenberger.johannes at gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I guess that's not the right mailing list to ask for help with the API,
>> >> but maybe you can point me at least somewhere.
>> >>
>> >> Currently I face the problem that I can't get a memory segment on an
>> >> empty file when I want to memory map a file. However, a writer would
>> >> write to the memory mapped file.
>> >>
>> >> I want to be able to swap a rather old RandomAccessFile based
>> >> implementation with a memory mapped file implementation. The two
>> >> classes (actually three, but only the FileReader and FileWriter matter)
>> >> are rather simple.
>> >>
>> >> The old package:
>> >>
>> >>
>> >>
>> https://github.com/sirixdb/sirix/tree/master/bundles/sirix-core/src/main/java/org/sirix/io/file
>> >>
>> >>
>> >> and the new package
>> >>
>> >>
>> >>
>> https://github.com/sirixdb/sirix/tree/master/bundles/sirix-core/src/main/java/org/sirix/io/memorymapped
>> >>
>> >> This line fails if the underlying file size is 0:
>> >>
>> >>
>> >>
>> https://github.com/sirixdb/sirix/blob/7bdbd5d17a034f02902f8f7dd0ef7012d89c81fb/bundles/sirix-core/src/main/java/org/sirix/io/memorymapped/MemoryMappedFileWriter.java#L83
>> >>
>> >> Stacktrace (Precondition fails, so I'm not using the API in the
>> >> intended way):
>> >>
>> >> java.lang.IllegalArgumentException: Requested bytes size must be > 0.
>> >> at jdk.incubator.foreign/jdk.internal.foreign.Utils.makeMappedSegment(Utils.java:140)
>> >> at jdk.incubator.foreign/jdk.incubator.foreign.MemorySegment.mapFromPath(MemorySegment.java:398)
>> >> at org.sirix.io.memorymapped.MemoryMappedFileWriter.<init>(MemoryMappedFileWriter.java:83)
>> >>
>> >>
>> >> So, it might be easiest to write a more or less direct conversion to
>> >> the Foreign Memory API. Maybe someone can help or point me to a helpful
>> >> mailing list/forum/whatever.
>> >>
>> >> Kind regards
>> >> Johannes
>> >>
>> >>
>> >>
>>
>


More information about the panama-dev mailing list