Re: Re: Using MemoryAccess with structured MemoryLayout

Thu Feb 25 18:33:09 UTC 2021

Maurizio,

I really appreciate your long reply, and I understand what you mean by "seeing from the other side".

But see, as an application vendor, I am not convinced by the solution you propose for the struct-member problem. Tooling is fine, but it should not be a MUST just to get a solution that is both high-performance AND readable. Really: most coders I know HATE tooling but LOVE coding. A Java core API should stand on its own and not force tooling on its users. This smells badly like javah. In fact, what you describe as "writing your own wrapper", TaggedValues.setValue(segment, 42), looks pretty good, but why should EVERY programmer reinvent the wheel here? From a new API that provides nice wrappers for single fields, I expect an equally nice way to handle structs. So, from a high-level view, what API users expect is NEITHER writing custom wrappers around VarHandle NOR using tooling, but just using a simple, standard API for that:

/*
 * Define a memory layout as a Java view of a C struct.
 * Use the static MemoryAccess methods to push values into the struct members easily and with high performance.
 */
MemoryLayout struct = ...;
MemoryAccess.setInt(structMember, value); // yes, THIS short!

So the easiest way to get the structMember is to get a MemorySegment simply by its NAME.

Hence, what I REALLY miss the most in the Memory Access API is a simple call that returns a named MemorySegment, given a structured root MemorySegment plus a String name -- without using VarHandle or tooling. :-)

-Markus

-----Original Message-----
From: Maurizio Cimadamore [mailto:maurizio.cimadamore at oracle.com]
Sent: Thursday, 25 February 2021 19:13
To: markus at headcrashing.eu; panama-dev at openjdk.java.net
Subject: Re: Re: Using MemoryAccess with structured MemoryLayout

On Thu, 2021-02-25 at 18:31 +0100, Markus KARG wrote:
> Maurizio,
> 
> thank you for your kind answer.
> 
> Yes, indeed I am already using VarHandle currently, but actually I
> like the idea of MemoryAccess more, as the code looks a bit simpler
> to me.
> 
> What I envision is something like doing this instead, as it spares
> one code line (the actual invocation of the VarHandle):
> 
> ```
> MemorySegment valueSegment = taggedValues.memorySegment(
>                                       PathElement.sequenceElement(3),
>                                       PathElement.groupElement("value"));
> MemoryAccess.setInt(valueSegment, someInteger);
> ```
> 
> It would be cool to have this additional possibility, as it makes
> using structs rather simple compared to the VarHandle way.

Hi,
I see that you would like to somehow attach the layout to the segment -
but layouts and segments are orthogonal, and for good reasons. 

First, when accessing a segment you do not always know the layout of
the thing being accessed - in a lot of cases access is much more
ad hoc.

Second, if a notion of layout is always associated with a segment, you
end up in a place where, in order to slice a segment, you probably have
to follow that operation with some kind of "cast" (e.g. where you set
the layout of the slice to something else). We've been there with a
past incarnation of the Panama API, and, while an API like the one you
describe is probably more suited to closely model a C pointer type,
that API is not very "primitive" - meaning that it is quite useless if
you start using a memory segment in a more buffer-like way.

Note that not _all_ the users of the Memory Access API are interested
in native interop - many just want to be able to allocate slabs of
native memory and free them deterministically. So, the more baggage we
add, the more those non-linker use cases become bloated with
unnecessary overhead.

Third, I imagine that you would like a method like this:

MemoryAccess.setIntAtLayout(valueSegment, someInteger, PathElement...)

E.g. you want/need to specify a path into the segment to obtain one of
the leaves (otherwise I don't see how the runtime can infer which
element you wanna access). But here we rub against another big problem:
VarHandle (and MethodHandle) work best when they are _constants_ e.g.
declared as static final variables in your code. When that happens, the
VM is able to inline all the var handle goo away, and optimize the code
enough that accessing a segment in a tight loop will often result in a
sequence of unrolled MOV instructions (in some cases you can even see
auto-vectorization kicking in).

If the VarHandle is not constant - well, none of these optimizations
will occur - meaning that your memory access will easily be 10x slower.
The reason MemoryAccess works is that it relies on a number of
predefined VarHandles which are created as static constants under the
hood, once and for all.

But your API would require a _fresh_ VarHandle to be created on every
call, based on the coordinates passed in. Hence, the var handle would
not be constant, and performance would suffer big time.
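
To make the constant vs. non-constant distinction concrete, here is a
minimal sketch. Note that it does not use the Memory Access API: as a
stand-in it uses the standard MethodHandles.byteBufferViewVarHandle
over a ByteBuffer, and the names HandleConstness, setIntConstant and
setIntFresh are made up for this example. Both methods write the same
bytes; the only difference is where the VarHandle comes from.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; a ByteBuffer view handle stands in for a
// memory-segment var handle.
final class HandleConstness {
    // Created once and stored in a static final field, so the JIT can
    // treat the handle as a constant.
    private static final VarHandle INT_AT =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    // Access through the constant handle.
    static void setIntConstant(ByteBuffer buf, int byteOffset, int value) {
        INT_AT.set(buf, byteOffset, value);
    }

    // Access through a handle that is looked up on every call. The result
    // is the same, but the handle is no longer a constant at the call site.
    static void setIntFresh(ByteBuffer buf, int byteOffset, int value) {
        VarHandle fresh =
                MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());
        fresh.set(buf, byteOffset, value);
    }

    // Reads an int back through the constant handle.
    static int getInt(ByteBuffer buf, int byteOffset) {
        return (int) INT_AT.get(buf, byteOffset);
    }
}
```

An API that takes the path elements as arguments on each call would
look like setIntFresh from the VM's point of view.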

The fine line we're walking in this project is to expose the tools and
knobs which allow clients to perform memory access/foreign function
access in the fastest way we know of that is possible within the
JVM. To do that, sometimes (not always) we have to "look the other way"
when it comes to usability - simply because it would be impossible to
have an API that is both 100% efficient and 100% usable.

The main trick that users can adopt in these cases, is to mediate
access; that is, if there is a particular kind of struct that you want
to operate with, nothing prevents you from declaring _your own_
MemoryAccess-like abstraction that works for specific fields of that
struct - e.g.

TaggedValues.setValue(segment, 42);

TaggedValues will have constant method handles (one for each field),
and a bunch of accessors (a pair for each field). There is nothing
magic in MemoryAccess - it's just shorthand for accessing ubiquitous
primitive types. There's no reason users cannot replicate the same
idiom in their code - so that clients will be _both_ fast AND
usable/readable.
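
As a rough sketch of what such a wrapper could look like (an
assumption-laden illustration, not the actual API): the version below
uses a plain ByteBuffer and the standard
MethodHandles.byteBufferViewVarHandle as a stand-in for a memory
segment and its var handle. The offsets follow the taggedValues layout
quoted further down in this thread (8-bit kind, 24 bits of padding,
32-bit value, 5 elements). The index parameter and the allocate helper
are additions made for this example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical per-struct wrapper; all names here are illustrative.
final class TaggedValues {
    static final int COUNT = 5;               // elements in the sequence
    static final int STRIDE = 8;              // 1 byte kind + 3 bytes padding + 4 bytes value
    private static final int KIND_OFFSET = 0;
    private static final int VALUE_OFFSET = 4;

    // Constant handle for the int field, created once.
    private static final VarHandle VALUE =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    private TaggedValues() {}

    // Allocates zeroed off-heap storage for the whole sequence.
    static ByteBuffer allocate() {
        return ByteBuffer.allocateDirect(COUNT * STRIDE).order(ByteOrder.nativeOrder());
    }

    // byteBufferViewVarHandle has no byte view, so the one-byte field
    // uses the buffer's absolute get/put instead of a handle.
    static byte getKind(ByteBuffer buf, int index) {
        return buf.get(index * STRIDE + KIND_OFFSET);
    }

    static void setKind(ByteBuffer buf, int index, byte kind) {
        buf.put(index * STRIDE + KIND_OFFSET, kind);
    }

    static int getValue(ByteBuffer buf, int index) {
        return (int) VALUE.get(buf, index * STRIDE + VALUE_OFFSET);
    }

    static void setValue(ByteBuffer buf, int index, int value) {
        VALUE.set(buf, index * STRIDE + VALUE_OFFSET, value);
    }
}
```

Client code then reads as TaggedValues.setValue(buf, 3, 42), with the
offsets and the handle hidden behind the accessor pair.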

Of course, when working with bigger libraries, there might be many
structs to work with, and manually defining a "wrapper static class"
for each struct might prove too tedious. But that's why we're investing
in tooling: that's exactly the job that jextract does: it parses a
complex C header and turns it into a bunch of static declarations which
help you access your native API more quickly (as the boilerplate has
been generated for you) and more safely (as the static wrappers will
avoid direct VarHandle usage, which can sometimes be "sharp").

Even at the jextract level, we are aware that some people would expect
an API that is closer to the C world (e.g. a `Pointer` type? Struct
wrappers?) - but again here our approach is to enable people to write
code which targets the library they wanna use quickly (e.g. way faster
than using JNI), but w/o introducing unnecessary translation steps in
the middle - which would make the bindings too slow for some advanced
use cases.

I apologize for the (too) big reply - I hope you find it helpful to
understand the "why not" part of your earlier question.

Cheers
Maurizio

> 
> -Markus
> 
> 
> -----Original Message-----
> From: Maurizio Cimadamore [mailto:maurizio.cimadamore at oracle.com]
> Sent: Thursday, 25 February 2021 17:18
> To: markus at headcrashing.eu; panama-dev at openjdk.java.net
> Subject: Re: Using MemoryAccess with structured MemoryLayout
> 
> Hi Markus,
> to read inside the struct, you can:
> 
> * use the MemoryAccess API - but doing so is limited - e.g.
> MemoryAccess only supports access by physical offset or logical
> index.
> 
> * create your own VarHandle which points to the desired part of the
> layout, and use that
> 
> Here:
> 
> https://download.java.net/java/early_access/jdk16/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/MemoryLayout.html
> 
> More specifically:
> 
> 
> ```
> SequenceLayout taggedValues = MemoryLayout.ofSequence(5,
>     MemoryLayout.ofStruct(
>         MemoryLayout.ofValueBits(8, ByteOrder.nativeOrder()).withName("kind"),
>         MemoryLayout.ofPaddingBits(24),
>         MemoryLayout.ofValueBits(32, ByteOrder.nativeOrder()).withName("value")
>     )
> ).withName("TaggedValues");
> ```
> 
> And
> 
> ```
> VarHandle valueHandle = taggedValues.varHandle(int.class,
>                                                PathElement.sequenceElement(),
>                                                PathElement.groupElement("value"));
> ```
> 
> Cheers
> Maurizio
> 
> 
> On Thu, 2021-02-25 at 17:01 +0100, Markus KARG wrote:
> > On Windows, many API functions take C structs as parameters.
> > 
> > It is rather straightforward to set up a structured MemoryLayout.
> > 
> > In case I want to easily poke bytes into that struct, I'd like to use
> > MemoryAccess.
> > 
> > Unfortunately, there seems to be no EASY / SIMPLE way to write:
> > 
> > MemoryAccess.setIntAt(MEMBER_OF_SUCH_A_STRUCT, VALUE_OF_THAT_MEMBER);
> > 
> > ...or I failed to spot it in the JavaDocs.
> > 
> > Is this possible? If yes, how? If not, why not?
> > 
> > -Markus
> > 
> >  
> >