<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Hi Johannes,<br>
I'm trying to uplevel as much as possible here. Is this correct:</p>
<p>1. your application, even when using a byte[][] backing storage,
already had an allocation issue<br>
2. it is not clear from the information you shared where this
allocation issue is coming from (it predates memory segments)<br>
3. when you made the switch to use memory segments instead of
byte[][], things got worse, not better.</p>
<p>Does that accurately reflect your case? IMHO, the crux of the
issue is (1)/(2). If there was already some allocation issue in
your application/framework, then adopting memory segments is
unlikely to make that disappear (esp. with the kind of code we're
staring at right now, which I think is allocating _more_ temp
objects in the heap).</p>
<p>You referred to these big traversals several times. What does a
traversal do? In principle, if your data is already in memory, I'd
expect a traversal not to allocate any memory (regardless of the
backing storage being used).</p>
<p>So I feel like I probably don't understand what's going on :-)</p>
<p>It would be beneficial for the discussion to come up with some
simplified model of how the code used to work before (using some
mock pseudo code and data structures), which problems you
identified, and why and how you thought using a memory segment
would improve over that. This might also allow people (other than
me!) to provide more feedback.<br>
</p>
<p>Maurizio<br>
</p>
<div class="moz-cite-prefix">On 16/09/2024 16:23, Johannes
Lichtenberger wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAGXNUva-HXgWf9ukqW1AFCJ_7+R49miYz46hiKC7sJ58cvcivA@mail.gmail.com">
<div dir="auto">
<p dir="ltr">Hi Maurizio,</p>
<p dir="ltr">thanks for all the input AFAICT I'm not using any
JNI...</p>
<p dir="ltr">So, the problem is, that I'm creating too many
allocations (I think in one test it was 2,7Gb/s) and it's much
more with > 1 trx (and by far the most objects allocated
were/are byte-arrays), in the main branch. Thus, I had the
idea to replace this slots byte array of byte arrays with a
single MemorySegment. I think for now it would be even optimal
to use a single on-heap byte-array. </p>
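<p dir="ltr">Just to make the idea concrete, here is a minimal
sketch of the two storage options I'm weighing (all names are made
up for illustration; this is not the actual SirixDB code):</p>
<pre>
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

// Sketch: one contiguous backing store instead of a byte[][] per page.
final class SlotsStorage {
    // Option 1: a single on-heap byte array, viewed as a MemorySegment.
    static MemorySegment onHeap(int capacityInBytes) {
        return MemorySegment.ofArray(new byte[capacityInBytes]);
    }

    // Option 2: a single off-heap segment owned by an arena (freed with the arena).
    static MemorySegment offHeap(Arena arena, long capacityInBytes) {
        return arena.allocate(capacityInBytes);
    }
}
</pre>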
<p dir="ltr">The setSlot method is currently mainly called once
during serialization of the DataRecords during a sync / commit
to disk. Also it's called during deserialization, but even
though slots may be added in random order they are appended to
the MemorySegment. I think that usually records are
added/deleted instead of updated (besides the long "pointers"
to neighbour nodes/records).</p>
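<p dir="ltr">As a rough sketch of what I mean by "appended even
when set in random order" (hypothetical names, just to illustrate
the setSlot idea):</p>
<pre>
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.Arrays;

// Sketch of an append-only slots page: slot indices may arrive in any order,
// but the bytes are always appended at the current write position.
final class SlotsPage {
    private final MemorySegment data;
    private final int[] offsets;   // byte offset of each slot, -1 if unset
    private final int[] lengths;   // length of each slot in bytes
    private int writePos;

    SlotsPage(Arena arena, long capacityInBytes, int maxSlots) {
        this.data = arena.allocate(capacityInBytes);
        this.offsets = new int[maxSlots];
        this.lengths = new int[maxSlots];
        Arrays.fill(offsets, -1);
    }

    void setSlot(int slotIndex, byte[] recordBytes) {
        // Copy the record bytes to the end of the segment and remember where they went.
        MemorySegment.copy(recordBytes, 0, data, ValueLayout.JAVA_BYTE, writePos, recordBytes.length);
        offsets[slotIndex] = writePos;
        lengths[slotIndex] = recordBytes.length;
        writePos += recordBytes.length;
    }

    byte[] getSlot(int slotIndex) {
        return data.asSlice(offsets[slotIndex], lengths[slotIndex]).toArray(ValueLayout.JAVA_BYTE);
    }
}
</pre>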
<p dir="ltr">It's basically a binary encoding for
tree-structured data with fine grained nodes
(firstChild/rightSibling/leftSibling/parebt/lastChild) and the
nodes are stored in a dense trie where the leaf pages hold
mostly 1024 nodes.</p>
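<p dir="ltr">For the fixed "pointer" part of a node, I picture
something along these lines (purely illustrative; the real
encoding also has variable-length payload, type information,
etc.):</p>
<pre>
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.ValueLayout;

// Illustrative layout of the five long "pointers" each node carries.
final class NodeLayouts {
    static final MemoryLayout POINTERS = MemoryLayout.structLayout(
            ValueLayout.JAVA_LONG.withName("parent"),
            ValueLayout.JAVA_LONG.withName("firstChild"),
            ValueLayout.JAVA_LONG.withName("lastChild"),
            ValueLayout.JAVA_LONG.withName("leftSibling"),
            ValueLayout.JAVA_LONG.withName("rightSibling"));
}
</pre>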
<p dir="ltr">Up to a predefined very small threshold N page
fragments are fetched in parallel from disk if thers's no in
memory reference and not found in a Caffeine cache, which are
then combined to a full page, thus setSlot is called for slots
which are not currently set, but are set in the current page
fragment once during reconstruction of the full page.</p>
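<p dir="ltr">In pseudo code the reconstruction is roughly this
(all names are invented for the sake of the example; the real
code checks the in-memory reference first and uses the actual
page types):</p>
<pre>
import com.github.benmanes.caffeine.cache.Cache;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Sketch: fetch up to N page fragments in parallel (unless cached) and combine
// them into a full page by setting the slots that are not yet set.
final class PageReconstruction {
    record FragmentKey(long pageKey, int revision) {}
    record PageFragment(FragmentKey key, byte[] bytes) {}
    record FullPage() {}   // stand-in for the real page type

    static FullPage reconstruct(List&lt;FragmentKey> keys,
                                Cache&lt;FragmentKey, PageFragment> cache,
                                ExecutorService ioPool) {
        var futures = keys.stream()
                .map(key -> {
                    PageFragment cached = cache.getIfPresent(key);   // Caffeine lookup
                    return cached != null
                            ? CompletableFuture.completedFuture(cached)
                            : CompletableFuture.supplyAsync(() -> readFromDisk(key), ioPool);
                })
                .toList();

        FullPage page = new FullPage();
        for (var future : futures) {
            applyFragment(page, future.join());   // calls setSlot for slots not yet set
        }
        return page;
    }

    static PageFragment readFromDisk(FragmentKey key) { return new PageFragment(key, new byte[0]); }
    static void applyFragment(FullPage page, PageFragment fragment) { /* set missing slots */ }
}
</pre>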
<p dir="ltr">So, I assume afterwards they are only ever set in a
single read-write trx per resource and only seldom variable
length data may be adapted. If that's not the case I could
also try to leave some space after each slot, thus that it can
probably grow without having to shift other data or something
like that.</p>
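<p dir="ltr">By "leave some space after each slot" I mean
something as simple as this (the numbers are arbitrary):</p>
<pre>
// Illustrative: reserve some slack after each slot so that small in-place growth
// of variable-length data does not force shifting the data that follows it.
final class SlotPadding {
    static int paddedSlotSize(int recordLengthInBytes) {
        int slack = Math.max(8, recordLengthInBytes / 4);   // arbitrary headroom
        return recordLengthInBytes + slack;
    }
}
</pre>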
<p dir="ltr">At least I think the issue with a much worse
runtime of traversing roughly 310_000_000 nodes in a preorder
traversal (remember that they are stored in pages) currently
switching from 1 to 5 trxs in parallel is due to the objects
allocated (without the MemorySegment):</p>
<p dir="ltr"><a href="https://urldefense.com/v3/__https://github.com/sirixdb/sirix/blob/main/analysis-single-trx.jfr__;!!ACWV5N9M2RV99hQ!PZjliXD6cF77z6VQbG0HoRr9sTYhYMqnXNbcRaPb8CHFWPu8ZR4NLPzxrkm-EjhrU5u33ZhN68JOHuNfBB2iC3SR5yjfWDVmJw$" moz-do-not-send="true">https://github.com/sirixdb/sirix/blob/main/analysis-single-trx.jfr</a><br>
</p>
<p dir="ltr">vs.</p>
<p dir="ltr"><a href="https://urldefense.com/v3/__https://github.com/sirixdb/sirix/blob/main/analysis-5-trxs.jfr__;!!ACWV5N9M2RV99hQ!PZjliXD6cF77z6VQbG0HoRr9sTYhYMqnXNbcRaPb8CHFWPu8ZR4NLPzxrkm-EjhrU5u33ZhN68JOHuNfBB2iC3SR5yhCs_wJyA$" moz-do-not-send="true">https://github.com/sirixdb/sirix/blob/main/analysis-5-trxs.jfr</a><br>
</p>
<p dir="ltr">Andrei Pangin helped a bit analyzing the async
profiler snapshots, as the runtime of 5 trxs in parallel is
almost exactly 4x slower than with a single trx and it's most
probably due to the amount of allocations (even though GC
seems ok).</p>
<p dir="ltr">So all in all I've had a specific runtime
performance problem and (also paging a lot, so I think it
makes sense that it may be due to the allocation rate).</p>
<p dir="ltr">I hope the nodes can simply get a MemorySegment
constructor param in the future instead of a couple of object
delegates... so that I can directly use MemorySegments instead
of having to convert between byte arrays back and forth during
serialization/deserialization. It's even we can get (almost)
rid of the whole step and we gain better data locality.</p>
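<p dir="ltr">What I'm hoping for is roughly this shape (a sketch
only; the offsets and names are invented):</p>
<pre>
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Sketch: a node that reads its fields straight from a slice of the page's
// MemorySegment instead of being deserialized into delegate objects first.
final class SegmentBackedNode {
    private static final long PARENT_OFFSET = 0;
    private static final long FIRST_CHILD_OFFSET = 8;
    private static final long RIGHT_SIBLING_OFFSET = 16;

    private final MemorySegment slot;   // slice of the page segment for this node

    SegmentBackedNode(MemorySegment slot) {
        this.slot = slot;
    }

    long parentKey() {
        return slot.get(ValueLayout.JAVA_LONG_UNALIGNED, PARENT_OFFSET);
    }

    long firstChildKey() {
        return slot.get(ValueLayout.JAVA_LONG_UNALIGNED, FIRST_CHILD_OFFSET);
    }

    long rightSiblingKey() {
        return slot.get(ValueLayout.JAVA_LONG_UNALIGNED, RIGHT_SIBLING_OFFSET);
    }
}
</pre>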
<p dir="ltr">Hope it makes some sense now, but it may also be
worth looking into a single bigger byte array instead of a
MemorySegment (even though I think that off-heap memory usage
might not be a bad idea for a database system).</p>
<p dir="ltr">You may have a quick look into the 2 profiles I
provided...</p>
<p dir="ltr">Kind regards and thanks a lot for your input. If it
may help I can provide a bigger JSON file I used for
importing / the test.</p>
<p dir="ltr">Johannes</p>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">Maurizio Cimadamore <<a href="mailto:maurizio.cimadamore@oracle.com" moz-do-not-send="true" class="moz-txt-link-freetext">maurizio.cimadamore@oracle.com</a>>
schrieb am Mo., 16. Sept. 2024, 12:31:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
On 16/09/2024 11:26, Maurizio Cimadamore wrote:<br>
> I've rarely had these "Evacuation Failure: Pinned" log
entries <br>
> regarding the current "master" branch on Github, but now
it's even worse.<br>
<br>
Zooming in on this aspect: this would suggest that your heap
memory is <br>
being kept "pinned" somewhere.<br>
<br>
Are you, by any chance, using downcall method handles with the
<br>
"critical" Linker option? Or any form of critical JNI?<br>
<br>
It would be interesting (separately from the "architectural"
angle <br>
discussed in my previous reply) to see which method call(s) is
causing <br>
this exactly...<br>
<br>
Cheers<br>
Maurizio<br>
<br>
</blockquote>
</div>
</blockquote>
</body>
</html>