Question: ByteBuffer vs MemorySegment for binary (de)serialization and in-memory buffer pool

Johannes Lichtenberger lichtenberger.johannes at gmail.com
Thu Sep 1 22:13:19 UTC 2022


I think it's a really good idea to use off-heap memory for the Buffer
Manager and the pages holding the stored records. In my case, I'm working
on an immutable, persistent DBMS that currently stores JSON and XML. It
allows only one read-write trx per resource at a time, optionally running
in parallel with N read-only trx bound to specific revisions (in the
relational world, the term for a resource is a relation/table).

During an import of a JSON file of close to 4 GB with intermediate commits,
I found that, depending on the number of records/nodes accumulated in the
trx intent log (more or less a trx-private map) before a commit is issued
(which syncs to disk and removes the pages from the log), the GC runs take
>= 100 ms most of the time. The objects are long-lived and are obviously
promoted to the old gen, which seems to be what takes these >= 100 ms. That
said, I'll have to study how Shenandoah works, but in this case it brings
no advantage regarding latency.

Maybe it would make sense to also store the data in the record instances
off-heap, as Gavin did with his simple Buffer Manager :-) That said,
lowering the maximum number of records after which to commit and sync to
disk also has a tremendous effect, and with Shenandoah the GC times then
at least drop below a few ms.
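
To illustrate what storing record data off-heap could look like, here is a
minimal sketch. It uses a direct ByteBuffer with hardwired offsets rather
than the Foreign Memory API, only so that it runs on older JDKs. The class
name OffHeapRecordPage and the record layout (a long node key, an int
length, then UTF-8 bytes) are assumptions made up for this example; they
are not the layout of my DBMS or of Gavin's Buffer Manager.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical record layout, invented for illustration:
// [0..8) nodeKey (long), [8..12) valueLength (int), [12..) UTF-8 value bytes
final class OffHeapRecordPage {
    private static final int NODE_KEY_OFFSET = 0;
    private static final int VALUE_LENGTH_OFFSET = 8;
    private static final int VALUE_OFFSET = 12;

    private final ByteBuffer page;
    private int writePosition = 0;

    OffHeapRecordPage(int capacityBytes) {
        // allocateDirect places the backing memory outside the Java heap,
        // so the record payload is not copied around by the GC.
        this.page = ByteBuffer.allocateDirect(capacityBytes)
                              .order(ByteOrder.LITTLE_ENDIAN);
    }

    /** Appends a record and returns its byte offset within the page. */
    int append(long nodeKey, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int recordSize = VALUE_OFFSET + bytes.length;
        if (writePosition + recordSize > page.capacity()) {
            throw new IllegalStateException("page full");
        }
        int offset = writePosition;
        // Absolute puts: the buffer's own position is never touched.
        page.putLong(offset + NODE_KEY_OFFSET, nodeKey);
        page.putInt(offset + VALUE_LENGTH_OFFSET, bytes.length);
        // Bulk copy through a duplicate so the shared position stays at 0.
        ByteBuffer view = page.duplicate();
        view.position(offset + VALUE_OFFSET);
        view.put(bytes);
        writePosition += recordSize;
        return offset;
    }

    long nodeKeyAt(int offset) {
        return page.getLong(offset + NODE_KEY_OFFSET);
    }

    String valueAt(int offset) {
        int length = page.getInt(offset + VALUE_LENGTH_OFFSET);
        byte[] bytes = new byte[length];
        ByteBuffer view = page.duplicate();
        view.position(offset + VALUE_OFFSET);
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    boolean isOffHeap() {
        return page.isDirect();
    }
}
```

With something like this, a record instance would only need to hold the
page reference and an int offset, so far fewer long-lived objects would
have to be promoted to the old gen.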

I'm already using the Foreign Memory API, however, to store the data in
memory-mapped files: once the pages (or page fragments) and the records
therein are serialized, they are written to the memory segment after
compression and, hopefully soon, encryption.
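
The serialize, compress, write-to-mapped-file sequence can be sketched
roughly as follows. Note the swap: this sketch uses a MappedByteBuffer
instead of a MemorySegment, purely so that it runs without the Foreign
Memory API. The class MappedPageWriter, its methods, and the on-disk
framing (original length, compressed length, compressed bytes) are
hypothetical and chosen for the example; Deflater stands in for whatever
compression is actually used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper; the framing is [originalLength:int][compressedLength:int][bytes].
final class MappedPageWriter {
    private MappedPageWriter() {}

    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] input, int originalLength) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        byte[] result = new byte[originalLength];
        int filled = 0;
        while (filled < originalLength && !inflater.finished()) {
            int n = inflater.inflate(result, filled, originalLength - filled);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break; // no progress possible, avoid spinning
            }
            filled += n;
        }
        inflater.end();
        if (filled != originalLength) {
            throw new DataFormatException("truncated page");
        }
        return result;
    }

    /** Compresses the serialized page and writes it at fileOffset; returns bytes written. */
    static int writePage(Path file, long fileOffset, byte[] serializedPage) throws IOException {
        byte[] compressed = compress(serializedPage);
        int total = 2 * Integer.BYTES + compressed.length;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE beyond the current size grows the file.
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, fileOffset, total);
            map.putInt(serializedPage.length);
            map.putInt(compressed.length);
            map.put(compressed);
            map.force(); // flush the mapped region to the storage device
        }
        return total;
    }

    /** Reads back and decompresses the page that starts at fileOffset. */
    static byte[] readPage(Path file, long fileOffset) throws IOException, DataFormatException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY,
                    fileOffset, channel.size() - fileOffset);
            int originalLength = map.getInt();
            int compressedLength = map.getInt();
            byte[] compressed = new byte[compressedLength];
            map.get(compressed);
            return decompress(compressed, originalLength);
        }
    }
}
```

An encryption step would slot in between compress and the write, and
between the read and decompress on the way back.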

Kind regards
Johannes



On Thu, Sep 1, 2022 at 22:52, Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

>
> On 01/09/2022 19:26, Gavin Ray wrote:
> > I think this is where my impression of verbosity is coming from, in
> > [1] I've linked a gist of ByteBuffer vs MemorySegment implementation
> > of a page header struct,
> > and it's the layout/varhandles that are the only difference, really.
> >
> Ok, I see what you mean, of course; thanks for the Gist.
>
> In this case I think the instance accessor we added on MemorySegment
> will bring the code more or less to the same shape as what it used to be
> with the ByteBuffer API.
>
> Using var handles is very useful when you want to access elements (e.g.
> structs inside other structs inside arrays) as it takes all the offset
> computation out of the way.
>
> If you're happy enough with hardwired offsets (and I agree that in this
> case things might be good enough), then there's nothing wrong with using
> the ready-made accessor methods.
>
> Maurizio
>
>