<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 27/10/2024 13:33, Johannes
Lichtenberger wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAGXNUvaAjMApy-ZjKgZUukyViZ=sxq0zmwDNbNXRDoLRFQyaBw@mail.gmail.com">
<div dir="auto">
<div dir="ltr">Hello,
<div><br>
</div>
<div>I'm trying to implement something similar to the
BufferManager described in 2.1 in [1] for UmbraDB:</div>
<div><br>
</div>
<div>So I wanted to create the first buffer size class with
fixed-size pages as follows:<br>
<br>
<div style="background-color:rgb(30,31,34);color:rgb(188,190,196)">
<pre style="font-family:'JetBrains Mono',monospace;font-size:9.8pt"><span style="color:rgb(122,126,133)">// Allocate the memory
</span>MemorySegment reservedMemory = bufferManager.allocateMemory(totalMemorySize);
<span style="color:rgb(122,126,133)">// Partition into page-sized chunks
</span>List&lt;MemorySegment&gt; pages = <span style="color:rgb(207,142,109)">new </span>ArrayList&lt;&gt;();
<span style="color:rgb(207,142,109)">for </span>(<span style="color:rgb(207,142,109)">long </span>offset = <span style="color:rgb(42,172,184)">0</span>; offset + pageSize &lt;= totalMemorySize; offset += pageSize) {
    MemorySegment pageSegment = reservedMemory.asSlice(offset, pageSize);
    pages.add(pageSegment);
}</pre>
</div>
</div>
<div>Using `mmap` to create the virtual address mapping for
the big chunk allocation like this: <br>
</div>
<div>
<div style="background-color:rgb(30,31,34);color:rgb(188,190,196)">
<pre style="font-family:'JetBrains Mono',monospace;font-size:9.8pt"><span style="color:rgb(207,142,109)">public </span>MemorySegment <span style="color:rgb(86,168,245)">allocateMemory</span>(<span style="color:rgb(207,142,109)">long </span>size) <span style="color:rgb(207,142,109)">throws </span>Throwable {
<span style="color:rgb(122,126,133)">// Call mmap to reserve virtual memory
</span><span style="color:rgb(122,126,133)"> </span>MemorySegment addr = (MemorySegment) <span style="color:rgb(199,125,187)">mmap</span>.invoke(MemorySegment.<span style="color:rgb(199,125,187);font-style:italic">NULL</span>, <span style="color:rgb(122,126,133)">// Let OS choose the starting address
</span><span style="color:rgb(122,126,133)"> </span>size, <span style="color:rgb(122,126,133)">// Size of the memory to reserve
</span><span style="color:rgb(122,126,133)"> </span><span style="color:rgb(199,125,187);font-style:italic">PROT_READ </span>| <span style="color:rgb(199,125,187);font-style:italic">PROT_WRITE</span>, <span style="color:rgb(122,126,133)">// Read and write permissions
</span><span style="color:rgb(122,126,133)"> </span><span style="color:rgb(199,125,187);font-style:italic">MAP_PRIVATE </span>| <span style="color:rgb(199,125,187);font-style:italic">MAP_ANONYMOUS</span>, <span style="color:rgb(122,126,133)">// Private, anonymous mapping
</span><span style="color:rgb(122,126,133)"> </span>-<span style="color:rgb(42,172,184)">1</span>, <span style="color:rgb(122,126,133)">// No file descriptor
</span><span style="color:rgb(122,126,133)"> </span><span style="color:rgb(42,172,184)">0 </span><span style="color:rgb(122,126,133)">// No offset
</span><span style="color:rgb(122,126,133)"> </span>);
    <span style="color:rgb(207,142,109)">if </span>(addr.address() == -<span style="color:rgb(42,172,184)">1L</span>) { <span style="color:rgb(122,126,133)">// mmap signals failure with MAP_FAILED ((void *) -1), not NULL</span>
        <span style="color:rgb(207,142,109)">throw new </span>OutOfMemoryError(<span style="color:rgb(106,171,115)">"Failed to allocate memory via mmap"</span>);
    }
    <span style="color:rgb(207,142,109)">return </span>addr;
}</pre>
</div>
</div>
<div>First thing I noticed is that I need
addr.reinterpret(size) here</div>
</div>
</div>
</blockquote>
<p>Yes, you get back a zero-length memory segment (as you are
calling a raw mmap downcall method handle), so you need to resize it
with MemorySegment::reinterpret.</p>
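<p>For illustration, here is a minimal sketch of that resize step. It assumes
Linux (the PROT_* and MAP_* constants below are the usual Linux values and
should be checked against your platform headers) and JDK 22 or later. The
class name MmapReserve and the method name reserve are made up for this
example and are not part of any library.</p>

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

final class MmapReserve {
    // Assumption: Linux flag values (x86-64 and AArch64).
    static final int PROT_READ = 0x1;
    static final int PROT_WRITE = 0x2;
    static final int MAP_PRIVATE = 0x02;
    static final int MAP_ANONYMOUS = 0x20;

    // void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
    static final MethodHandle MMAP;

    static {
        Linker linker = Linker.nativeLinker();
        MMAP = linker.downcallHandle(
                linker.defaultLookup().find("mmap").orElseThrow(),
                FunctionDescriptor.of(
                        ValueLayout.ADDRESS,    // returned pointer
                        ValueLayout.ADDRESS,    // addr hint
                        ValueLayout.JAVA_LONG,  // length
                        ValueLayout.JAVA_INT,   // prot
                        ValueLayout.JAVA_INT,   // flags
                        ValueLayout.JAVA_INT,   // fd
                        ValueLayout.JAVA_LONG)); // offset
    }

    static MemorySegment reserve(long size) {
        MemorySegment addr;
        try {
            addr = (MemorySegment) MMAP.invokeExact(
                    MemorySegment.NULL, size,
                    PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS,
                    -1, 0L);
        } catch (Throwable t) {
            throw new IllegalStateException("mmap downcall failed", t);
        }
        // mmap reports failure as MAP_FAILED, i.e. (void *) -1, not as NULL.
        if (addr.address() == -1L) {
            throw new OutOfMemoryError("Failed to allocate memory via mmap");
        }
        // The raw pointer comes back as a zero-length segment; give it its real size.
        return addr.reinterpret(size);
    }
}
```

<p>The segment returned by reserve can then be sliced with asSlice as in
your first snippet. Note that reinterpret is a restricted method, so the
JVM may print a native-access warning unless native access is enabled for
the calling module.</p>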
<p><br>
</p>
<blockquote type="cite" cite="mid:CAGXNUvaAjMApy-ZjKgZUukyViZ=sxq0zmwDNbNXRDoLRFQyaBw@mail.gmail.com">
<div dir="auto">
<div dir="ltr">
<div>, but now I wonder how Arena::allocate is actually
implemented for Linux (calling malloc?). I think the native
memory shouldn't be allocated in this case up until
something is written to the MemorySegment slices, right? Of
course we already have a lot of MemorySegment instances on
the Java heap which are allocated, in comparison to a C or
C++ version.</div>
</div>
</div>
</blockquote>
Arena::allocate in its basic implementation just calls malloc (well,
we do that via Unsafe::allocateMemory, but it's similar). Then we
also reinterpret to the correct size. Of course there could be
smarter allocation strategies, but they can be built on top, by
defining custom arenas (the Arena interface can be implemented
exactly for this purpose). We will investigate better strategies,
esp. for the confined case -- but I think delaying allocation until
the bits are actually accessed (which is kind of what you get with
mmap) might not be a great general strategy.<br>
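<p>As a rough sketch of what "defining a custom arena" can look like, the
following follows the slicing example from the Arena javadoc: one block is
allocated up front and every later allocation is carved out of it. The
class name SlicingArena is illustrative only, and the capacity is whatever
the caller chooses.</p>

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;

final class SlicingArena implements Arena {
    // The backing arena owns the memory and defines the lifetime.
    private final Arena backing = Arena.ofConfined();
    // Hands out consecutive slices of one pre-allocated block.
    private final SegmentAllocator slices;

    SlicingArena(long capacityBytes) {
        slices = SegmentAllocator.slicingAllocator(backing.allocate(capacityBytes));
    }

    @Override
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        // Throws IndexOutOfBoundsException once the block is exhausted.
        return slices.allocate(byteSize, byteAlignment);
    }

    @Override
    public MemorySegment.Scope scope() {
        return backing.scope();
    }

    @Override
    public void close() {
        // Frees the whole block and invalidates every slice at once.
        backing.close();
    }
}
```

<p>Because the backing arena is confined, such an arena can only be used
from the thread that created it.</p>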
<blockquote type="cite" cite="mid:CAGXNUvaAjMApy-ZjKgZUukyViZ=sxq0zmwDNbNXRDoLRFQyaBw@mail.gmail.com">
<div dir="auto">
<div dir="ltr">
<div dir="auto"><br>
</div>
<div dir="auto">Next, I'm not sure if I missed something, but
it seems ridiculously hard to get a file descriptor (the
actual int) for some syscalls, I guess as it's platform
specific, but if I didn't miss something I'd have to use
reflection, right? For instance if you have a FileChannel.</div>
</div>
</div>
</blockquote>
<p>There's this:</p>
<p><a class="moz-txt-link-freetext" href="https://bugs.openjdk.org/browse/JDK-8292771">https://bugs.openjdk.org/browse/JDK-8292771</a></p>
<p>We came close to adding a new restricted method to get the
descriptor. Other workarounds were highlighted in the JBS issue,
but perhaps this is something that can be re-assessed. Perhaps Uwe
or Per can chime in here and say whether this is needed (if not,
we should just close the JBS issue).<br>
</p>
<blockquote type="cite" cite="mid:CAGXNUvaAjMApy-ZjKgZUukyViZ=sxq0zmwDNbNXRDoLRFQyaBw@mail.gmail.com">
<div dir="auto">
<div dir="ltr">
<div dir="auto"><br>
</div>
<div dir="auto">My idea of using something like this is based
on the idea of reducing allocations on the Java heap, as I
described the problem a couple of months ago. Up until now I
never "recycled/reused" pages when a read from disk was
issued and when the page was not cached. I've always created
new instances of these potentially big objects after a disk
read, so in addition I'd implement something to cache and
reuse Java pages (which kind of wrap the MemorySegments):</div>
<div dir="auto"><br>
</div>
<div dir="auto">In addition to the actual native memory I want
to buffer page instances which use the MemorySegments on
top, which can be reclaimed through filling the underlying
MemorySegments with 0 again and some other cleanup stuff. So
essentially I'd never unmap or use madvise(MADV_DONTNEED), as
long as the process is running.</div>
</div>
</div>
</blockquote>
<p>I think this is an area where perhaps Uwe can help? The latest
Lucene is doing something quite similar to what you are trying to
do I believe:</p>
<p><a class="moz-txt-link-freetext" href="https://www.elastic.co/search-labs/blog/lucene-and-java-moving-forward-together">https://www.elastic.co/search-labs/blog/lucene-and-java-moving-forward-together</a></p>
<p>Cheers<br>
Maurizio<br>
</p>
<blockquote type="cite" cite="mid:CAGXNUvaAjMApy-ZjKgZUukyViZ=sxq0zmwDNbNXRDoLRFQyaBw@mail.gmail.com">
<div dir="auto">
<div dir="ltr">
<div dir="auto"><br>
</div>
<div dir="auto">Hope that makes sense and hopefully the
ConcurrentHashMap to retrieve a page with a certain size,
plus taking the first entry from a Deque of free pages...
doesn't add more CPU cycles and synchronization overhead,
but the allocation rate was 2.7 GB for a single txn.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Kind regards</div>
<div dir="auto">Johannes</div>
<div><br>
</div>
<div>[1] <a href="https://db.in.tum.de/~freitag/papers/p29-neumann-cidr20.pdf" rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">https://db.in.tum.de/~freitag/papers/p29-neumann-cidr20.pdf</a></div>
</div>
</div>
</blockquote>
</body>
</html>