Accessing foreign memory that already exists

Antoine Chambille ach at activeviam.com
Tue Mar 31 22:57:34 UTC 2020


Yes exactly. We can handle the in-memory database use case well with a
public API that creates "unsafe", unconfined memory segments. That
jdk.incubator.foreign.Foreign interface would work very well, I hope it
makes it to the final version!

Thanks
-Antoine



On Tue, Mar 31, 2020 at 11:28 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> Thanks for the quick feedback.
>
> Memory segments are, internally, quite configurable, and we are
> trying to decide which bits of configuration to expose.
>
> We will surely expose an unsafe way to make an unconfined segment - that's
> actually already in:
>
>
> https://github.com/openjdk/panama-foreign/blob/cef1b7f746df24dea1c57470ed51403647ebb893/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/Foreign.java#L114
>
> We also plan to have some primitive to take an address that has no segment
> and give it a bounded segment (I have a patch for that in the works).
>
> If possible, I'd like to stop there - from what you say, it seems like the
> two ingredients above should be enough for your need - as you only really
> need the memory access API to replace Unsafe access, not to manage the
> lifecycle of the memory you allocate (I suppose you have your own
> abstractions for that).
>
> Is my understanding correct?
>
> P.S. note that even confined segments implement the Spliterator interface
> as per recent changes, so they are amenable to parallel processing (e.g.
> ForkJoin task).
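
A rough sketch of what processing one memory region with ForkJoin tasks can
look like. The Spliterator support mentioned above belongs to the incubating
API, so this example does not use it: it assumes a direct ByteBuffer as a
stand-in for the segment, and ColumnSum and its threshold are hypothetical.
Only the divide-and-conquer shape is the point.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Hypothetical task: sums the longs stored in [from, to) of an off-heap column. */
final class ColumnSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1 << 14; // elements per leaf task (arbitrary choice)

    private final ByteBuffer column; // stand-in for a memory segment
    private final int from;          // first element index, inclusive
    private final int to;            // last element index, exclusive

    ColumnSum(ByteBuffer column, int from, int to) {
        this.column = column;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                // Absolute reads do not touch the buffer's position,
                // so concurrent tasks can share the same buffer.
                sum += column.getLong(i * Long.BYTES);
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        ColumnSum left = new ColumnSum(column, from, mid);
        left.fork();                                          // run the left half asynchronously
        long right = new ColumnSum(column, mid, to).compute(); // compute the right half here
        return left.join() + right;
    }

    /** Sums every long in the buffer using the common pool. */
    static long sum(ByteBuffer column) {
        int count = column.capacity() / Long.BYTES;
        return ForkJoinPool.commonPool().invoke(new ColumnSum(column, 0, count));
    }
}
```

Each task only reads its own disjoint range, which is why no locking is
needed in this sketch.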
>
> Maurizio
>
> On 31/03/2020 21:58, Antoine Chambille wrote:
>
> Hi Maurizio,
>
> Thank you for the explanations and for your interest.
>
> In short, for the use case of an in-memory analytical database you need a
> specialized memory allocator to manage tables and indexes, and direct
> read/write access to the data, in pure Java, from many concurrent threads.
> It's ok if the memory segments are just façades to the memory and don't
> actually manage it.
>
> To answer your questions directly:
> * you'd like this segment to have a known size
>   -> that would be handy, the segment could be used directly without the
> need of a parent structure to hold the size.
> * you'd like this segment to be closeable, and, upon close() some
> well-known native function in your allocator should be invoked
>   -> indeed that would be the right place to have a "cleaner". not
> mandatory though, it can be done externally.
> * you'd probably like this segment not to be confined
>   -> absolutely! we need massively parallel data access for data loading
> (mount large datasets on demand in the cloud for short lived sessions) and
> for aggregations (interactive query times even on terabytes).
>
>
>
> In a bit more detail:
> Modern analytical databases are based on column stores, including the one
> we develop at ActiveViam that is called ActivePivot. The data is stored in
> binary columns, with a few indexing structures derived from hash tables and
> bitmap indexes. Those data structures are essentially made of big,
> long-lived arrays. To support very large datasets we allocate them
> off-heap, and we use the Java heap for aggregations and calculations.
>
> Currently in ActivePivot the off-heap memory is managed by a SLAB
> allocator, based on mmap, that supports highly concurrent allocations and
> deallocations. It's also NUMA aware, so that during aggregations Java
> threads process the data partitions on the same NUMA node. Java threads
> read and write the data using sun.misc.Unsafe. The data access performance
> is good and predictable, there are no boundary checks. But optimizations
> such as loop unrolling and vectorization that work on java arrays are lost
> with Unsafe. And in many cases (column scans, joins, aggregations) we could
> use the panama Vector API that we also anticipate eagerly, and that would
> not work with Unsafe. For those reasons, we would like to return to the
> ranks and rebase our data access code on memory segments.
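
For readers unfamiliar with the term, a fixed-size slab can be pictured as a
concurrent free list of equally sized blocks. The sketch below is not
ActivePivot's allocator (which is mmap-based and NUMA aware, per the
description above); Slab and its methods are hypothetical names, and plain
byte offsets stand in for real addresses.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical fixed-size slab: hands out byte offsets of equally sized blocks. */
final class Slab {
    private final int blockSize;
    // Lock-free queue of free block offsets, so many threads can allocate and release.
    private final ConcurrentLinkedQueue<Integer> free = new ConcurrentLinkedQueue<>();

    Slab(int blockSize, int blockCount) {
        if (blockSize <= 0 || blockCount < 0) {
            throw new IllegalArgumentException("invalid slab dimensions");
        }
        this.blockSize = blockSize;
        for (int i = 0; i < blockCount; i++) {
            free.add(i * blockSize);
        }
    }

    /** Returns the offset of a free block, or -1 when the slab is exhausted. */
    int allocate() {
        Integer offset = free.poll();
        return offset == null ? -1 : offset;
    }

    /** Returns a block to the free list. This sketch does not detect double frees. */
    void release(int offset) {
        if (offset < 0 || offset % blockSize != 0) {
            throw new IllegalArgumentException("not a block offset: " + offset);
        }
        free.add(offset);
    }
}
```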
>
> Thanks,
> -Antoine
>
>
>
>
>
> On Tue, Mar 31, 2020 at 12:36 PM Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com> wrote:
>
>> Hi Antoine,
>> this is an interesting use case, and one I've been thinking about quite a
>> bit recently, as it comes up with native interop (see below).
>>
>> In general there are two categories of memory addresses: checked ones
>> (the ones with a known segment attached to them) and unchecked ones (the
>> ones with no segment attached to them, or the ones that have the special
>> Nothing segment attached to them).
>>
>> Our policy is that addresses that are not backed by a segment _cannot_
>> be de-referenced. This is how we've been achieving safety for the basic
>> foreign memory access use case that doesn't do native interop. (we're
>> discussing as to whether that's the right default, based on some library
>> porting activity we've been doing recently - but there doesn't seem to be
>> clear evidence pointing one way or another).
>>
>> But there are cases where you might want to take an existing address,
>> which is backed by no existing segment, and attach a segment to it -
>> which will make it fully functional again - this operation is called
>> 'rebasing an address':
>>
>>
>> https://github.com/openjdk/panama-foreign/blob/foreign-abi/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/MemoryAddress.java#L83
>>
>> So, with all this in mind, the goal to do what you want is to be able to
>> (unsafely!) create a memory segment which has roughly the
>> characteristics you need - e.g. given base address and given size. The
>> native interop branch has a useful method for making these unchecked
>> segments:
>>
>>
>> https://github.com/openjdk/panama-foreign/blob/foreign-abi/src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/Foreign.java#L100
>>
>> In other words, let's say you have a long address "addr" and that you
>> want to create a segment around it:
>>
>> 1) create a memory address out of "addr"
>>
>> var base = MemoryAddress.ofLong(addr)
>>
>> 2) create an unchecked segment with right base address and size
>>
>> var segment = Foreign.ofNativeUnchecked(base, size)
>>
>> And voila, you now have a segment for your non-Java generated address.
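
Foreign.ofNativeUnchecked belongs to the incubating foreign-abi branch, so
the sketch below does not call it. It only models what the two steps
achieve: a (base, size) pair becomes a view that rejects out-of-bounds
offsets. BoundedView is a hypothetical name, and a direct ByteBuffer is
assumed as a stand-in for memory that already exists.

```java
import java.nio.ByteBuffer;

/** Hypothetical facade: a base offset plus a size, checked on every access. */
final class BoundedView {
    private final ByteBuffer memory; // stand-in for memory allocated elsewhere
    private final long base;         // start of the region, in bytes
    private final long size;         // length of the region, in bytes

    BoundedView(ByteBuffer memory, long base, long size) {
        if (base < 0 || size < 0 || base > memory.capacity()
                || size > memory.capacity() - base) {
            throw new IllegalArgumentException("region lies outside the backing memory");
        }
        this.memory = memory;
        this.base = base;
        this.size = size;
    }

    long size() {
        return size;
    }

    long getLong(long offset) {
        checkBounds(offset, Long.BYTES);
        return memory.getLong((int) (base + offset));
    }

    void putLong(long offset, long value) {
        checkBounds(offset, Long.BYTES);
        memory.putLong((int) (base + offset), value);
    }

    // Rejects any access that would start before, or end after, the region.
    private void checkBounds(long offset, long length) {
        if (offset < 0 || offset > size - length) {
            throw new IndexOutOfBoundsException(
                    "offset " + offset + " out of bounds for size " + size);
        }
    }
}
```

The view does not own the memory: it is a facade, and freeing the region
remains the job of whoever allocated it.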
>>
>> A few notes:
>>
>> * since the address has been generated by you, when you close this
>> segment, the memory access API won't attempt to do anything fancy here
>> (but it will make all the addresses based on that segment invalid);
>> one option we have discussed here is to add ways to attach custom 'cleanup'
>> functions - I'm a bit skeptical of those, but I can be convinced given
>> the right use cases
>>
>> * the segment will be confined on the calling thread - meaning that it
>> can only be accessed and closed by that thread (as a regular segment)
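
The two notes can be modelled with a small state check, sketched below under
stated assumptions: ConfinedScope is a hypothetical class, not the API's
implementation. Closing only flips a flag, so the underlying memory is left
alone, and every access is refused once the scope is closed or when it comes
from a thread other than the creator.

```java
/** Hypothetical model of a thread-confined, closeable scope. */
final class ConfinedScope implements AutoCloseable {
    private final Thread owner = Thread.currentThread(); // the creating thread
    private boolean closed; // only ever read or written by the owner thread

    /** Throws unless called by the owner thread on a scope that is still open. */
    void checkAccess() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("accessed outside the owner thread");
        }
        if (closed) {
            throw new IllegalStateException("scope is already closed");
        }
    }

    /** Invalidates the scope; the memory it describes is not freed here. */
    @Override
    public void close() {
        checkAccess();
        closed = true;
    }
}
```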
>>
>> I think here we can do things to allow more flexibility - in principle
>> there's some kind of 'unsafe native segment builder' lurking in here
>> which lets you specify:
>>
>> * whether to confine to a thread or not
>> * what the size of the segment is
>> * what is the base address of the segment
>> * whether the resulting segment is closeable (and, if so, if a custom
>> close() action should be provided)
>>
>> My sense is that clients typically will _not_ need all this flexibility.
>> For instance, in the native interop case there are only two cases which
>> seem overwhelmingly common:
>>
>> * I have an unchecked address and I want to give it a size - but I don't
>> want closeability, or confinement - just let me dereference it within
>> some known bounds
>> * I have an unchecked address which I know comes from some 'malloc'
>> call, and I want to attach a full-blown segment to it, and I want the
>> segment::close operation to call free()
>>
>> I guess time will tell whether we need N ad-hoc unsafe factories, or a
>> more flexible builder-based solution.
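
To make the trade-off concrete, here is a minimal sketch of the builder idea
with the two common cases layered on top as factories. Everything in it is
hypothetical (SegmentSpec, Builder, bounded, owned); it only records the
four choices listed above and does not touch native memory.

```java
import java.util.Objects;

/** Hypothetical description of an unsafe native segment; not a Panama API. */
final class SegmentSpec implements AutoCloseable {
    final long baseAddress;
    final long size;
    final boolean confined;
    private final Runnable closeAction; // null means "not closeable"
    private boolean closed;             // not thread-safe; fine for a sketch

    private SegmentSpec(Builder b) {
        this.baseAddress = b.baseAddress;
        this.size = b.size;
        this.confined = b.confined;
        this.closeAction = b.closeAction;
    }

    boolean isCloseable() {
        return closeAction != null;
    }

    /** Runs the custom close action exactly once. */
    @Override
    public void close() {
        if (closeAction == null) {
            throw new UnsupportedOperationException("segment is not closeable");
        }
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        closed = true;
        closeAction.run();
    }

    static final class Builder {
        private long baseAddress;
        private long size;
        private boolean confined = true; // confined by default, as described in the thread
        private Runnable closeAction;

        Builder baseAddress(long address) { this.baseAddress = address; return this; }

        Builder size(long size) {
            if (size < 0) throw new IllegalArgumentException("negative size");
            this.size = size;
            return this;
        }

        Builder confined(boolean confined) { this.confined = confined; return this; }

        Builder onClose(Runnable action) {
            this.closeAction = Objects.requireNonNull(action);
            return this;
        }

        SegmentSpec build() { return new SegmentSpec(this); }
    }

    /** Case 1: give an address a size, with no closeability and no confinement. */
    static SegmentSpec bounded(long address, long size) {
        return new Builder().baseAddress(address).size(size).confined(false).build();
    }

    /** Case 2: a full segment whose close() calls a caller-supplied free function. */
    static SegmentSpec owned(long address, long size, Runnable free) {
        return new Builder().baseAddress(address).size(size).onClose(free).build();
    }
}
```

With only two factories the builder is arguably overkill, which is the
"N ad-hoc factories versus builder" question raised above.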
>>
>> At this point I'd be very interested in what your requirements would be
>> for the segment you create with this unsafe API. My educated guess would
>> be that:
>>
>> * you'd like this segment to have a known size
>> * you'd like this segment to be closeable, and, upon close() some
>> well-known native function in your allocator should be invoked
>> * you'd probably like this segment not to be confined
>>
>> Is my guess correct?
>>
>> Cheers
>> Maurizio
>>
>> On 31/03/2020 09:06, Antoine Chambille wrote:
>> > Hi everyone,
>> >
>> > At ActiveViam we are watching the foreign memory project with eager
>> > anticipation. Thank you for the hard work, looking forward to it!
>> >
>> > One question related to our usage of off-heap memory:
>> >
>> > If some native memory already exists, what is the preferred way to expose
>> > it as a memory segment?
>> >
>> >
>> > Some details about our use case: we make an in-memory database that
>> > delivers interactive queries to many users on terabyte datasets. The
>> > database structures are allocated off-heap, but not with malloc which is a
>> > bottleneck. We developed a highly concurrent, NUMA-Aware SLAB allocator.
>> > This custom memory manager is written almost entirely in Java with just a
>> > few system calls (anonymous mmap, munmap, madvise).
>> >
>> > Cheers,
>> > -Antoine
>>
>
>

-- 
Antoine Chambille
Global Head of Research & Development



More information about the panama-dev mailing list