Memory Mapped file segment (file is empty)
Ty Young
youngty1997 at gmail.com
Thu Jun 25 16:03:39 UTC 2020
On 6/25/20 8:53 AM, Maurizio Cimadamore wrote:
>
>> And from what I understand about memory mapped files -- not related
>> to your API -- but in general it makes sense to map the whole
>> "database"-file and not just a small page-portion, right? As it will
>> fetch the page on-demand into the non heap memory location.
>
> In the Java 15 API you can map a smaller portion of a file (by passing
> an offset and a size); this was possible with the ByteBuffer API, but
> that option was partially omitted in the segment API.
>
> I think when it comes to mapped files, mileage can vary - I believe
> some of my colleagues (and people in this very mailing list) are much
> more knowledgeable than I am when it comes to fine tuning the size of
> mapped memory regions :-)
>
FWIW, I'll share some personal experience and info I've found since I've
been working with this too.
According to Wikipedia[1], pages (typically 4096 bytes) are indeed only
loaded into memory once they are accessed:
    A possible benefit of memory-mapped files is a "lazy loading", thus
    using small amounts of RAM even for a very large file. Trying to load
    the entire contents of a file that is significantly larger than the
    amount of memory available can cause severe thrashing as the operating
    system reads from disk into memory and simultaneously writes pages from
    memory back to disk. Memory-mapping may not only bypass the page file
    completely, but also allow smaller page-sized sections to be loaded as
    data is being edited, similarly to demand paging used for programs.
(random platform differences aside)
Going by the code that was linked, I feel like there is a
misunderstanding of the relationship between MemorySegment.mapFromPath
and the file being mapped. The "size" argument has little to do with the
current size of the file; it is the amount of data that will be
accessible through FMA (the Foreign Memory Access API). FMA will grow the
file for you if it is smaller than the size you ask to map.
In other words, if you pass 128 bytes instead of the file's size, the
underlying file will be 128 bytes long, assuming it isn't already larger
than 128 bytes.
You can then expand the file further by calling the factory method again
with a larger size on the same file. All newly added bytes will be
zeroed. If you need to know where to start reading from, you may want to
try reserving the first 8 bytes as a long offset, or keeping an entirely
separate file that holds any important offsets you may need.
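
As a rough sketch of the grow-and-remember-the-offset idea, here is the
same pattern using the standard FileChannel/MappedByteBuffer API instead
of the incubating segment API, so it runs without extra flags. The class
name, the helper names and the "first 8 bytes hold the next free offset"
convention are all made up for illustration. It relies on OpenJDK
extending the file when a READ_WRITE mapping is larger than the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class GrowByMapping {
    // Hypothetical convention: bytes 0..7 store the next free offset as a long.
    static final int HEADER_BYTES = Long.BYTES;

    // Creates an empty temporary file to map.
    static Path tempFile() {
        try {
            return Files.createTempFile("grow", ".bin");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static long fileSize(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Maps the first `size` bytes read-write. If the file is shorter than
    // `size`, OpenJDK extends it; the new bytes read as zero.
    static MappedByteBuffer map(Path path, long size) {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path path = tempFile();

        MappedByteBuffer small = map(path, 128);
        System.out.println("after first map:  " + fileSize(path) + " bytes");

        // Pretend 40 bytes of payload were written after the header.
        small.putLong(0, HEADER_BYTES + 40);
        small.force();

        // Map again with a larger size to grow the file.
        MappedByteBuffer large = map(path, 196);
        System.out.println("after second map: " + fileSize(path) + " bytes");
        System.out.println("next free offset: " + large.getLong(0));
        System.out.println("last new byte:    " + large.get(195));
    }
}
```

Keeping the offset in the file itself means a later, larger mapping can
pick up where the previous one stopped without any state kept elsewhere.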
(side note: it would be nice if an "expand(long bytes)" method were
added to MappedMemorySegment)
The good news is that an expanded MappedMemorySegment has no effect on
any other, smaller MappedMemorySegment instances or their slices. Since
they are all backed by the same file, existing slices continue to work
as expected. *You may want to force() and unload() just to be safe*.
Here is some basic code to show this:
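    import java.io.File;
    import java.nio.channels.FileChannel;
    import jdk.incubator.foreign.MappedMemorySegment;
    import jdk.incubator.foreign.MemorySegment;

    // Start from an empty file every time.
    File file = new File("./test");
    if (file.exists())
        file.delete();
    file.createNewFile();

    // Mapping 128 bytes grows the empty file to 128 bytes.
    MappedMemorySegment segment =
        MemorySegment.mapFromPath(file.toPath(), 0, 128,
            FileChannel.MapMode.READ_WRITE);

    // Mapping the same file with a larger size grows it to 196 bytes.
    // The first segment is unaffected.
    MappedMemorySegment segment2 =
        MemorySegment.mapFromPath(file.toPath(), 0, 196,
            FileChannel.MapMode.READ_WRITE);

    segment2.close();
    segment.close();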
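
For anyone who wants to try the "old mapping and its slices keep working"
behaviour without the incubator module, the following is a minimal sketch
using the standard FileChannel/MappedByteBuffer API as a stand-in. A
ByteBuffer slice plays the role of a segment slice here. The class and
helper names are hypothetical, and it assumes that two shared mappings of
the same file see each other's writes, which holds on the mainstream
platforms.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CoexistingMappings {
    static Path tempFile() {
        try {
            return Files.createTempFile("coexist", ".bin");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Maps the first `size` bytes of the file read-write.
    static MappedByteBuffer map(Path path, long size) {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns a view of bytes [from, to) of the buffer, comparable to
    // slicing a segment. The view shares memory with the original.
    static ByteBuffer slice(ByteBuffer buf, int from, int to) {
        ByteBuffer dup = buf.duplicate();
        dup.position(from);
        dup.limit(to);
        return dup.slice();
    }

    public static void main(String[] args) {
        Path path = tempFile();

        MappedByteBuffer small = map(path, 128);
        ByteBuffer tail = slice(small, 64, 128); // slice of the old mapping

        // Growing the file through a second, larger mapping.
        MappedByteBuffer large = map(path, 196);

        tail.putInt(0, 42);    // write through the old slice (file offset 64)
        large.putInt(128, 7);  // write into the newly added region

        System.out.println("old slice as seen by large mapping: " + large.getInt(64));
        System.out.println("old mapping size still: " + small.capacity());

        // Flush both, in the spirit of "force() just to be safe".
        small.force();
        large.force();
    }
}
```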
The bad news is that you somehow need to manage all this without leaking
on-heap memory or creating a lot of garbage. If you close the older,
smaller MappedMemorySegment then all of its slices will also close but,
again, the larger MappedMemorySegment has access to all the same data.
Managing old vs. new is a bit of a headache and can blow up if you aren't
careful. Avoiding slicing and using VarHandles instead would probably
make things easier for yourself, and VarHandles are faster too.
If the data repeats in a fixed layout such as:
    1. 80-byte string
    2. struct with an int (4 bytes), 4 bytes of padding, and a long (8 bytes)
    3. int (4 bytes)
(total 80 + 16 + 4 = 100 bytes)
You could, I think, then use MemoryHandles.withOffset to create a
VarHandle that would allow you to iterate through these entries. You
could further use VarHandles to reduce the amount of slicing for each
entry. This may require use of unsafe though, I think? Not sure.
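
To make the 100-byte layout concrete, here is a hedged sketch that walks
such entries with VarHandles and no per-entry slicing. It does not use
MemoryHandles.withOffset; it uses the standard
MethodHandles.byteBufferViewVarHandle over a ByteBuffer (which could be a
mapped one), so no unsafe access is involved. The field names, the ASCII
encoding of the string and the little-endian byte order are assumptions
made for the example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class EntryLayout {
    // Byte offsets inside one entry, following the layout listed above.
    static final int NAME_BYTES = 80;      // 1. 80-byte string
    static final int STRUCT_INT = 80;      // 2. int, then 4 bytes of padding
    static final int STRUCT_LONG = 88;     //    8-byte long
    static final int TRAILING_INT = 96;    // 3. int
    static final int ENTRY_BYTES = 100;

    // VarHandles taking (ByteBuffer, byte index). Plain get/set also work
    // at misaligned indexes, which matters because 100 is not a multiple of 8.
    static final VarHandle INT =
        MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);
    static final VarHandle LONG =
        MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    // Writes one entry; names longer than 80 bytes are truncated, shorter
    // ones are zero-padded.
    static void write(ByteBuffer buf, int entry, String name, int a, long b, int c) {
        int base = entry * ENTRY_BYTES;
        byte[] bytes = name.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < NAME_BYTES; i++) {
            buf.put(base + i, i < bytes.length ? bytes[i] : (byte) 0);
        }
        INT.set(buf, base + STRUCT_INT, a);
        LONG.set(buf, base + STRUCT_LONG, b);
        INT.set(buf, base + TRAILING_INT, c);
    }

    static String readName(ByteBuffer buf, int entry) {
        int base = entry * ENTRY_BYTES;
        int len = 0;
        while (len < NAME_BYTES && buf.get(base + len) != 0) {
            len++;
        }
        byte[] bytes = new byte[len];
        for (int i = 0; i < len; i++) {
            bytes[i] = buf.get(base + i);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    static int readStructInt(ByteBuffer buf, int entry) {
        return (int) INT.get(buf, entry * ENTRY_BYTES + STRUCT_INT);
    }

    static long readStructLong(ByteBuffer buf, int entry) {
        return (long) LONG.get(buf, entry * ENTRY_BYTES + STRUCT_LONG);
    }

    static int readTrailingInt(ByteBuffer buf, int entry) {
        return (int) INT.get(buf, entry * ENTRY_BYTES + TRAILING_INT);
    }

    public static void main(String[] args) {
        // A heap buffer keeps the example short; a MappedByteBuffer works the same way.
        ByteBuffer buf = ByteBuffer.allocate(3 * ENTRY_BYTES);
        for (int i = 0; i < 3; i++) {
            write(buf, i, "entry-" + i, i, 1000L + i, -i);
        }
        for (int i = 0; i < 3; i++) {
            System.out.println(readName(buf, i) + " " + readStructInt(buf, i)
                + " " + readStructLong(buf, i) + " " + readTrailingInt(buf, i));
        }
    }
}
```

The only per-entry arithmetic is entry * 100 plus a constant field
offset, which is roughly what a stride-plus-offset VarHandle would
compute for you in the segment API.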
Not a JDK developer or an expert by any means but I hope some of this
helps at least a little.
[1] https://en.wikipedia.org/wiki/Memory-mapped_file#Benefits