Ability to extend a MemorySegment

Wed Jan 22 03:37:45 UTC 2020

Ok, fair enough. This wasn't clear at all; as I said, there's probably a 
reason.  I don't think it's a good one, but that's just one opinion.

This is what the top of the project page says:
--

  Project Panama: Interconnecting JVM and native code

We are improving and enriching the connections between the Java™ 
virtual machine and well-defined but “foreign” (non-Java) APIs, 
including many interfaces commonly used by C programmers.

To this end, Project Panama will include most or all of these components:

  * native function calling from JVM (C, C++), specifically per JEP 191
  * native data access from JVM or inside JVM heap
  * new data layouts in JVM heap
  * native metadata definition for JVM
  * header file API extraction tools (see below)
  * native library management APIs
  * native-oriented interpreter and runtime “hooks”
  * class and method resolution “hooks”
  * native-oriented JIT optimizations
  * tooling or wrapper interposition for safety
  * exploratory work with difficult-to-integrate native libraries

--

I think that's a big enough task already without complicating it for 
auxiliary purposes, and the design is more complex because of it.

On 21/1/20 10:28 pm, Maurizio Cimadamore wrote:
>
> On 21/01/2020 11:22, Michael Zucchi wrote:
>> I think MemorySegment is way over-engineered already, adding more 
>> complexity doesn't sound great.  I wish it was just a 1:1 mapping to 
>> a malloc block (or slice thereof), and had no support for arrays or 
>> array bytebuffers. 
>
> Well, you come from the perspective of someone who needs ABI support - 
> fine. But are you saying that, because MemorySegment have to play nice 
> with ABI, it is not important to support other related use cases which 
> have _nothing to do with ABI_ ?
>
> One cool thing you can do with segments is:
>
> * create a memory segment
> * map it into a ByteBuffer
> * use the ByteBuffer as usual
> * close the segment
>

I mean sure.  That's quite a related use for direct buffers and doesn't 
add anything weird to the api, they both use malloc to allocate the 
block and they can both be accessed from native calls.

It's the ability to represent non-malloc memory as memory that's a bit weird.

> This gives you same old good ByteBuffer API (no var handles!) but with 
> deterministic deallocation on top (which is something that Java users 
> have been asking for ages, but that, given restrictions of the BB API 
> we couldn't fully deliver).
>
> Another thing segments allow you to do is to take a big file (over 2G) 
> and map slices of it to different BB, effectively allowing you to use 
> the BB API on stuff that is bigger than 2GB.
>
Well, files are covered fine by FileChannel.map().  Large memory arrays 
aren't, though.
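
To make the FileChannel.map() point concrete, here is a minimal sketch of covering a file with several mapped windows. The class and method names (FileWindows, mapWindow, demo) are made up for illustration; only the java.nio calls are real. Each mapping is still capped at Integer.MAX_VALUE bytes, and it is only released when the buffer is garbage collected, so this does not give deterministic deallocation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; the names are illustrative, not part of any Panama API.
final class FileWindows {

    // Map one window of a file. The length is an int-sized quantity, but the
    // position is a long, so a file larger than 2GB can be covered by
    // mapping several windows at different positions.
    static MappedByteBuffer mapWindow(Path file, long position, int length) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, position, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Tiny demo: write a 16-byte file whose byte i holds the value i, then
    // read file offset 10 through a window that starts at offset 8.
    static byte demo() {
        try {
            Path tmp = Files.createTempFile("windows", ".bin");
            tmp.toFile().deleteOnExit();
            byte[] data = new byte[16];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(tmp, data);
            MappedByteBuffer window = mapWindow(tmp, 8, 8);
            return window.get(2); // window offset 2 == file offset 10
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Here the caller has to track which window an offset falls into; that bookkeeping is the part a single long-indexed segment would remove.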

> What you call over-engineering to me sounds more like you are probably 
> just not interested in the other half of use cases that this API is 
> about. With this I'm not saying the API as is is perfect, but in some 
> of the feedback we have received so far the recurrent theme seems to 
> be "I don't need XYZ, so it is just silly to have it in the API", or 
> "I really badly need XYZ so why didn't you just add it to the API" - 
> and all I'm saying here is that there are principles upon which some 
> of these decisions have been made.
>
Yes, of course. I don't think you guys are just doing it to be difficult 
or making stuff up out of thin air.  As I said, the thinking isn't 
obvious, least of all from the project description. And I know there are 
always trade-offs with any decision.

Not that anybody would read it, but maybe the project needs a FAQ that 
covers these issues.  And if they're constantly being brought up 
whilst asking for feedback, maybe there's a reason.  I know it doesn't 
always appear that way, but the criticism (at least mine) is intended to 
be constructive, or to learn enough to be constructive. I've worked on 
public projects and the experience was miserable enough to break me, so I 
don't envy you in the least.

>> And also that MemoryAddress was literally just an address inside of 
>> its segment - and not an 'offset' which may or may not actually be 
>> an address (with no direct way to find out - from java). 
> If you really want to get a low-level address from a MemoryAddress you 
> can call ForeignUnsafe::getUnsafeOffset(MemoryAddress) and get the 
> 'long' address you need.

Ok, thanks.  Not an obvious place to look, and I presume that also works 
with MemorySegment.allocateNative().  Generally the two kinds of address 
behave differently and look different, and it makes debugging quite 
painful.  I was trying to use offset() to determine if an address was 
actually null, but that did funny things with upcall handles until the 
last change, which turned them into anonymous pointers.  That's just a 
weird gotcha.

>> The need to support this other stuff just makes the api weird and 
>> confusing since the naming conventions don't make sense if it was 
>> just backed by malloc and for all that you can never actually use 
>> these array-based segments with "foreign" functions anyway.
> See above. There's more to memory segments than just ABI.
>> Is this stuff there just as a (clumsy) mechanism to allow for the 
>> general copy() method?
> Please expand a bit more - are you referring to MemoryAddress::copy? 
> Why is it clumsy (same interface as System.arraycopy which has served 
> us well for quite a bit).
Well, it's clumsy in that MemorySegment first needs to support all these 
different backend types (clumsy required design), and then you need to 
create a MemorySegment wrapping one to use it (clumsy to use).

A copy-to-array and copy-from-array (or to/from native) pair would also 
provide such functionality and be more like System.arraycopy (including 
having to be native/internal and not implemented in Java).  Or one could 
just bounce it through a *Buffer view - which is also a bit clumsy 
(particularly with the default network byte order) but it is what it is.
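
As a rough illustration of the "bounce it through a *Buffer view" route, the sketch below copies an int[] into a direct buffer and back. BufferBounce, toNative and fromNative are hypothetical names; the java.nio calls are standard. The byte-order line is the clumsy part: a freshly allocated ByteBuffer is always big-endian, so it has to be switched to the platform order before the view is created if native code is going to read the data.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical helper; the names are illustrative only.
final class BufferBounce {

    // Copy a Java int[] into off-heap memory through an IntBuffer view.
    static ByteBuffer toNative(int[] src) {
        ByteBuffer bytes = ByteBuffer.allocateDirect(src.length * Integer.BYTES)
                // New ByteBuffers start out BIG_ENDIAN; native code usually
                // expects the platform order, so set it before making the view.
                .order(ByteOrder.nativeOrder());
        // The view has its own position, so 'bytes' stays at position 0.
        bytes.asIntBuffer().put(src);
        return bytes;
    }

    // Copy the off-heap contents back into a fresh Java int[].
    static int[] fromNative(ByteBuffer bytes) {
        // The view inherits the byte order currently set on 'bytes'.
        IntBuffer view = bytes.asIntBuffer();
        int[] dst = new int[view.remaining()];
        view.get(dst);
        return dst;
    }
}
```

A round trip such as `BufferBounce.fromNative(BufferBounce.toNative(values))` returns a copy of the original array; forgetting the order() call would still round-trip within Java but would hand byte-swapped data to native code on a little-endian machine.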

>> I'm sure there are reasons it's the way it is but the whole thing is 
>> just so, well, 'foreign', to anything from either C or Java.
>
>> And an even more restricted allocator? Ouch. 
>
> Again, you are jumping ahead and misreading what I said. What I was 
> trying to explain is that there are a number of considerations which 
> might affect the decision of whether something like 'realloc' really 
> belongs to the MemorySegment API such as:
>
> * which allocator segments ended up being based upon
> * the fact that, like it or not, not all segments will be able to 
> support this operation anyway
> * whether it's important enough to deserve a place in the API anyway 
> (seems like, e.g. you and Ty disagree on how much important realloc 
> actually is)
>
> Sure, we can pretend all these choices do not exist - but in reality 
> they do. And again, are you saying "ouch" to having an allocator which 
> will give you faster performances and better scalability than malloc, 
> and better integration with the VM ecosystem as a whole? On what 
> basis? Do you know that real-world projects like e.g. Netty [1] are 
> building their own allocators anyway as they find malloc/free (and, by 
> consequence, ByteBuffer) to be hugely unfeasible for them?
>

If malloc is too slow, you allocate a bigger segment and dole out 
addresses from it, the same solution one uses in C.  Having a stack 
allocator as an option makes some sense - this is both a very common use 
when calling native functions, and seems to be the main "intended" 
use-case of allocateNative() - but such an allocator can be trivially 
implemented atop MemorySegment over malloc.  But having it as the only 
type is just going to be too restrictive - it certainly wouldn't work 
for Twitter's case.
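
A minimal sketch of the "allocate a bigger block and dole out addresses from it" idea follows. It is an assumption-laden stand-in: a direct ByteBuffer plays the role of the one large native block, and BumpAllocator with its methods is a hypothetical name, not anything from the Panama API. The same shape would apply to slices of a MemorySegment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stack-style allocator over a single large block.
final class BumpAllocator {
    private final ByteBuffer block; // stand-in for one big native allocation
    private int top = 0;            // offset of the next free byte

    BumpAllocator(int capacity) {
        block = ByteBuffer.allocateDirect(capacity);
    }

    // Hand out the next 'size' bytes. 'align' must be a power of two and is
    // applied to the offset inside the block, not to the absolute address.
    ByteBuffer allocate(int size, int align) {
        if (size < 0 || align <= 0 || (align & (align - 1)) != 0) {
            throw new IllegalArgumentException("bad size or alignment");
        }
        int start = (top + align - 1) & -align; // round up to the alignment
        if (size > block.capacity() - start) {
            throw new IllegalStateException("block exhausted");
        }
        ByteBuffer view = block.duplicate();
        view.limit(start + size);
        view.position(start);
        top = start + size;
        // slice() resets the order to big-endian, so set it again.
        return view.slice().order(ByteOrder.nativeOrder());
    }

    int used() {
        return top;
    }

    // Stack discipline: remember a point, then free everything after it.
    int mark() {
        return top;
    }

    void release(int mark) {
        if (mark < 0 || mark > top) {
            throw new IllegalArgumentException("invalid mark");
        }
        top = mark;
    }
}
```

Typical use around a native call would be `int m = a.mark();`, a few `a.allocate(...)` calls for temporary arguments, then `a.release(m);`. Nothing here handles individual frees or growth, which is exactly why a stack allocator alone is too restrictive as the only option.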

Things like Netty just demonstrate you can't handle every case 
automatically and applications are free to handle pathological cases 
with novel data structures.  This will always be true.  The issues with 
direct ByteBuffer are well known for some applications and the blog post 
just demonstrates they can be mitigated even within the current JVM.  
From JNI it's trivial to wrap malloc() memory in a ByteBuffer, it's 
just weird that's the only way presently.  And their example shows it 
isn't the native allocation that's the main issue anyway, as it happens 
with Java memory too.

One of the (many) cargo cults for Java for years has been to avoid 
old-generation at all costs, but that post just demonstrates you can't 
rely on the GC in all cases.  Lots of "modern Java" like Streams 
processing also encourages or requires practices that generate more garbage.