AW: AW: Using MemoryAccess with structured MemoryLayout
Jorn Vernee
jorn.vernee at oracle.com
Fri Feb 26 10:14:55 UTC 2021
Hi Rémi,
I see trusted record fields are already being put to good use ;)
Using lazily initialized MutableCallSites is an interesting way to fold
multiple access shapes into the same capability object. I thought you'd
end up with a chain of ifs to check the shape at each use-site if you
have multiple different shapes spread over different use-sites, but I
guess the path String having to be constant makes sure that those checks
are folded away as well. Very clever :)
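
For readers who have not used the pattern, here is a minimal sketch of a lazily initialized MutableCallSite. It is not taken from the FastAccess prototype: the class name LazyAccessor, the fixed two-int struct shape, and the use of a ByteBuffer instead of a MemorySegment are all assumptions made so that the snippet runs on a plain JDK without the incubator module. The call site starts out pointing at a slow "init" path, which builds the real accessor on first use and installs it as the new target.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch, not the FastAccess prototype: one lazily linked
// call site per field of an array of { int key; int value; } structs.
final class LazyAccessor {
    private static final int STRUCT_SIZE = 8; // two 32-bit fields (assumed shape)
    private static final VarHandle INT =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());
    private static final MethodType TYPE =
            MethodType.methodType(int.class, ByteBuffer.class, int.class);
    private static final MethodHandle READ;
    private static final MethodHandle INIT;

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            READ = lookup.findStatic(LazyAccessor.class, "read",
                    MethodType.methodType(int.class, int.class, ByteBuffer.class, int.class));
            INIT = lookup.findVirtual(LazyAccessor.class, "init", TYPE);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final MutableCallSite callSite = new MutableCallSite(TYPE);
    private final MethodHandle invoker = callSite.dynamicInvoker();
    private final int fieldOffset;
    private int initCount;

    LazyAccessor(int fieldOffset) {
        this.fieldOffset = fieldOffset;
        // Until the first call, the call site points at the slow path.
        callSite.setTarget(INIT.bindTo(this));
    }

    // Slow path: runs on first use, then replaces itself with the fast path.
    int init(ByteBuffer buffer, int index) throws Throwable {
        initCount++;
        MethodHandle fast = MethodHandles.insertArguments(READ, 0, fieldOffset);
        callSite.setTarget(fast);
        return (int) fast.invokeExact(buffer, index);
    }

    // Fast path: reads the int at the given field offset of element `index`.
    static int read(int fieldOffset, ByteBuffer buffer, int index) {
        return (int) INT.get(buffer, index * STRUCT_SIZE + fieldOffset);
    }

    int get(ByteBuffer buffer, int index) {
        try {
            return (int) invoker.invokeExact(buffer, index);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    int initCount() {
        return initCount;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(10 * STRUCT_SIZE).order(ByteOrder.nativeOrder());
        for (int i = 0; i < 20; i++) {
            buffer.putInt(i * 4, i);
        }
        LazyAccessor key = new LazyAccessor(0);
        System.out.println(key.get(buffer, 2)); // first call goes through init
        System.out.println(key.get(buffer, 3)); // later calls hit the fast path
    }
}
```

The sketch ignores thread safety when swapping the target, and for the JIT to fold anything the accessor itself would have to be reachable from a constant (e.g. a static final field), which is the property the trusted record fields provide in the prototype.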
To me, this validates the work we've done on the memory access API
over the past year and a half, which for a large part was about "finding
the right primitive" to add to the JDK. If the foundation is solid, it
opens up all kinds of possibilities for building other things on top,
such as this FastAccess example.
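
As a point of comparison, the simplest thing one can build on top is the hand-written wrapper idiom discussed further down the thread (TaggedValues.setValue). Below is a rough sketch of that idiom under some loud assumptions: it uses a ByteBuffer and MethodHandles.byteBufferViewVarHandle rather than MemorySegment and MemoryLayout, so that it runs without the incubator module, and the offsets are written out by hand to mirror the "kind" (8 bits), padding (24 bits), "value" (32 bits) struct from the thread. The point it illustrates is only that the VarHandle lives in a static final field.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical wrapper for an array of 5 { byte kind; <pad 3>; int value; }
// structs. Names and offsets are assumptions made for this sketch.
final class TaggedValues {
    static final int COUNT = 5;
    static final int STRUCT_SIZE = 8;
    private static final int KIND_OFFSET = 0;
    private static final int VALUE_OFFSET = 4;

    // Constant handle: declared static final so the JIT can treat it as a constant.
    private static final VarHandle VALUE =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    private TaggedValues() {}

    // Allocates off-heap storage large enough for all COUNT structs.
    static ByteBuffer allocate() {
        return ByteBuffer.allocateDirect(COUNT * STRUCT_SIZE).order(ByteOrder.nativeOrder());
    }

    static byte getKind(ByteBuffer buffer, int index) {
        return buffer.get(index * STRUCT_SIZE + KIND_OFFSET);
    }

    static void setKind(ByteBuffer buffer, int index, byte kind) {
        buffer.put(index * STRUCT_SIZE + KIND_OFFSET, kind);
    }

    static int getValue(ByteBuffer buffer, int index) {
        return (int) VALUE.get(buffer, index * STRUCT_SIZE + VALUE_OFFSET);
    }

    static void setValue(ByteBuffer buffer, int index, int value) {
        VALUE.set(buffer, index * STRUCT_SIZE + VALUE_OFFSET, value);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocate();
        setKind(buffer, 3, (byte) 1);
        setValue(buffer, 3, 42);
        System.out.println(getKind(buffer, 3) + " " + getValue(buffer, 3));
    }
}
```

With the real API the offsets would come from the layout (via a VarHandle derived from the layout path) rather than from hand-written constants.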
It also opens up the opportunity to implement some of these
middle-ground APIs in the JDK, whether it's something like this, or
something like the old Panama binder re-implemented on top of the
current API. There seem to be many things to choose from here. I think
we'll have to see though; each API layer we add requires maintenance,
and it's not worth it if users just end up spinning their own thing in
the end, because they wanted a different API shape.
For now, we're still finalizing the basement :)
Jorn
On 25/02/2021 23:46, Remi Forax wrote:
> [sneaking into this conversation]
>
> While I agree that the state of the basement is now far better with Panama than with JNI,
> I also think you can have a kind of middle-ground API that is based on MemoryLayout but offers a slightly higher-level API than just MemoryAccess.
>
> Something like this: I can describe my layout
>
> SequenceLayout keyValues = MemoryLayout.ofSequence(
>     MemoryLayout.ofStruct(
>         MemoryLayout.ofValueBits(32, nativeOrder()).withName("key"),
>         MemoryLayout.ofValueBits(32, nativeOrder()).withName("value")
>     )
> ).withName("KeyValues");
>
>
> and then create a FastAccess object on that MemoryLayout
>
> private static final FastAccess FAST_ACCESS = FastAccess.of(keyValues);
>
>
> Then I have access to the methods getInt/getLong, ..., setInt, setLong etc., which take a kind of ad hoc DSL that describes an array of PathElement in a more compact way, and in a way that the JIT treats as a constant, so I can write code like this
>
> try (var segment = MemorySegment.allocateNative(400)) {
>     for (int i = 0; i < 100; i++) {
>         MemoryAccess.setIntAtIndex(segment, i, i);
>     }
>
>     assertEquals(4, FAST_ACCESS.getInt(segment, "[].key", 2));
>     assertEquals(3, FAST_ACCESS.getInt(segment, "[].value", 1));
> }
>
> The prototype is here
> https://github.com/forax/panama-fastaccess
>
> Rémi
>
> ----- Original Message -----
>> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
>> To: markus at headcrashing.eu, "panama-dev at openjdk.java.net" <panama-dev at openjdk.java.net>
>> Sent: Thursday, 25 February 2021 22:58:06
>> Subject: Re: AW: AW: Using MemoryAccess with structured MemoryLayout
>> I think I disagree on a couple of points :-)
>>
>> On Thu, 2021-02-25 at 19:33 +0100, Markus KARG wrote:
>>> Maurizio,
>>>
>>> I really appreciate your long reply, indeed, and I understand what
>>> you mean with "seeing from the other side".
>>>
>>> But see, as an application vendor, I am not convinced by the solution
>>> you provide to the struct-member-problem. Tooling is fine, but it
>>> should not be a MUST just to get a high performant AND readable
>>> solution. Really: Most coders I know HATE tooling but LOVE coding. A
>>> Java core API shall in itself be standalone and not force tooling.
>>> This just badly smells like javah.
>> This is one of the points where I (strongly) disagree; in my opinion
>> there is a huge difference between what javah generates and what
>> jextract generates; regardless of whether you love or hate the tools,
>> their options, or the flavor of the code that comes out of them, one
>> thing is _very_ different: javah generates C header files - jextract
>> generates plain Java files. The latter are ready to be included in your
>> repository of choice, you need zero extra work to build them and run
>> them, your IDE can index them, autocompletion works, etc. The same,
>> sadly, cannot be said about what comes out of javah - which forces you
>> to write some C glue code just to be able to call simple functions like
>> getpid.
>>
>> So, I think I cannot agree with you there - yes, they are both tools,
>> and they both generate code, but let's please stop and recognize how
>> useful and handy it is to be able to call a native function without
>> writing a single line of native code!
>>
>> Also, on the topic of coders loving to code, but hating tooling - well,
>> I think there's code and code. No matter how you can improve the API
>> for accessing struct members, there is still a significant amount of
>> information that has to be derived from the header files; if you take a
>> look at libraries like this:
>>
>> http://www.jcuda.org/jcuda/doc/index.html
>>
>> (hat tip to Marco Hutter who has the patience and will power to
>> maintain it :-) )
>>
>> I don't think that many developers would want to write code like this?
>> Which is why JCuda exists in the first place, so that they don't have
>> to! So, yes, tooling has a bad rep, I get it, but problems like these
>> can only be solved with tools.
>>
>> And if you only need to interact with few structs - then, it shouldn't
>> matter too much if it takes some code to get there?
>>
>>
>>> In fact, what you describe with "writing your own wrapper"
>>> TaggedValues.setValue(segment, 42) looks pretty well, but why shall
>>> EVERY programmer reinvent the wheel here? From a new API that
>>> provides nice wrappers for single fields I actually do expect that it
>>> also provides a just-as-nice way for structs, too. So speaking of
>>> what API users do expect, from a high level view, would be NOT
>>> writing custom wrappers around VarHandle, NEITHER using tooling, but
>>> just using a simple, standard API for that:
>>>
>>> /*
>>> * Define a memory layout as a Java-view on a C struct
>>> * Use MemoryAccess static methods to most easily (and in high
>>> * performance) push values into the struct members
>>> */
>>> MemoryLayout struct = ...;
>>> MemoryAccess.setInt(structMember, value); // yes, THIS short!
>> Well, sure, I'd like to be able to write less code, and have it run
>> even faster :-) but from what you write, I'm having trouble picturing
>> what exactly you are proposing.
>>
>> What is structMember? Is it a layout? Is it a segment? Is it both? If
>> it's both, I think I already replied as to why that's not great API-
>> wise. A layout is fundamentally a _static_ abstraction - it exists as
>> defined somewhere (a header file, a protobuf file, somewhere) and
>> there it lies. A memory segment is a _dynamic_ entity: it's allocated,
>> it's freed, it's sliced, it's traversed with a spliterator... layouts are
>> used _at the boundaries_ (e.g. if you need to know how much to
>> allocate) - but a segment is just a bunch of bytes, and that's a big
>> part in what makes the access API efficient and universally applicable.
>>
>> But regardless, I think I've also explained how, from the nitty-gritty
>> performance perspective, the API you propose doesn't really make sense
>> in the JVM we have (where, to be optimized, var handles/method handles
>> have to be constant static fields). It _might_ (or not!) make sense in
>> the VM we'll have 5 years from now, and if it does, rest assured that
>> we'll circle back to this, but that's a story for another day, I think.
>>
>> There's no magic trick we can pull out of the hat here - it's a choice
>> between having a high-level API (which performs horribly) or having a
>> lower-level API which performs well, _and that can be specialized_ for
>> the use cases that you want to work with.
>>
>> I understand you feel that's not enough; we believe that, when combined,
>> the Foreign Memory Access API and the Foreign Linker API
>> significantly enhance Java's ability to interop with native memory and
>> libraries, and to do so in 100% Java code.
>>
>>
>> As a case of "eating your own dog food": our jextract tool works on
>> top of LLVM/libclang; in fact, jextract was built on top of a
>> handwritten JNI port of libclang. Over the last year or so, we
>> replaced this ad-hoc JNI code with 100% auto-generated foreign-linker
>> bindings. To be honest we never looked back. It's (far) easier to
>> maintain, and when libclang gets new goodies, we just run the tool
>> again and commit the sources: all the new functions/structs/constants
>> are there.
>>
>> Even at the beginning, when the performance of the foreign linker API
>> wasn't great (Panama used to be several Xs slower than JNI at calling
>> native functions, and then some more Xs slower when calling Java code
>> back from native), it was still worth it, because the difference in
>> performance was negligible overall compared to how much time we saved
>> by no longer having to maintain that JNI port.
>>
>> Now that the linker API is, performance-wise, on a more solid footing
>> (for downcalls, for upcalls we're about to get significantly faster
>> than JNI [1]), I honestly don't see many reasons as to why one should
>> stick with JNI - other than compatibility (which in some contexts can
>> be a big one, I know). Yes, the new APIs might be a little on the low-
>> level side, but at least you know we tried hard to squeeze every ounce
>> of available power into them :-)
>>
>> Maurizio
>>
>> [1] - https://github.com/openjdk/panama-foreign/pull/457
>>
>>> So the most easy way to get the structMember is to get a
>>> MemorySegment simply by its NAME.
>>> Hence, what I REALLY miss the most in the Memory Access API is a
>>> simple command to get a named MemorySegment from a structured
>>> MemorySegment root plus a String name -- without using VarHandle or
>>> tooling. :-)
>>>
>>> -Markus
>>>
>>>
>>> -----Original Message-----
>>> From: Maurizio Cimadamore [mailto:maurizio.cimadamore at oracle.com]
>>> Sent: Thursday, 25 February 2021 19:13
>>> To: markus at headcrashing.eu; panama-dev at openjdk.java.net
>>> Subject: Re: AW: Using MemoryAccess with structured MemoryLayout
>>>
>>> On Thu, 2021-02-25 at 18:31 +0100, Markus KARG wrote:
>>>> Maurizio,
>>>>
>>>> thank you for your kind answer.
>>>>
>>>> Yes, indeed I am already using VarHandle currently, but actually I
>>>> like the idea of MemoryAccess more, as the code looks a bit simpler
>>>> to me.
>>>>
>>>> What I envision is something like doing this instead, as it spares
>>>> one code line (the actual invocation of the VarHandle):
>>>>
>>>> ```
>>>> MemorySegment valueSegment = taggedValues.memorySegment(
>>>>         PathElement.sequenceElement(3),
>>>>         PathElement.groupElement("value"));
>>>> MemoryAccess.setInt(valueSegment, someInteger);
>>>> ```
>>>>
>>>> It would be cool to have this additional possibility, as it makes
>>>> using structs rather simple compared to the VarHandle way.
>>> Hi,
>>> I see that you would like to somehow attach the layout to the
>>> segment - but layouts and segments are orthogonal, and for good
>>> reasons.
>>>
>>> First, when accessing a segment you might not always know the
>>> layout of the thing being accessed - in a lot of cases access is
>>> much more ad-hoc.
>>>
>>> Second, if a notion of layout is always associated with a segment,
>>> you end up in a place where, in order to slice a segment, you
>>> probably have to follow that operation with some kind of "cast"
>>> (e.g. where you set the layout of the slice to something else).
>>> We've been there with a past incarnation of the Panama API, and,
>>> while an API like the one you describe is probably more suited to
>>> closely model a C pointer type, that API is not very "primitive" -
>>> meaning that it is quite useless if you start using a memory segment
>>> in a more buffer-like way.
>>>
>>> Note that not _all_ the users of the Memory Access API are interested
>>> in native interop - many just want to be able to allocate slabs of
>>> native memory, and free them deterministically. So, the more baggage
>>> we add, the more those non-linker use cases become bloated with
>>> unnecessary overhead.
>>>
>>> Third, I imagine that you would like a method like this:
>>>
>>> MemoryAccess.setIntAtLayout(valueSegment, someInteger, PathElement...)
>>>
>>> E.g. you want/need to specify a path into the segment to obtain one
>>> of the leaves (otherwise I don't see how the runtime can infer which
>>> element you wanna access). But here we rub against another big
>>> problem: VarHandle (and MethodHandle) work best when they are
>>> _constants_, e.g. declared as static final variables in your code.
>>> When that happens, the VM is able to inline all the var handle goo
>>> away, and optimize the code enough that accessing a segment in a
>>> tight loop will often result in a sequence of unrolled MOV
>>> instructions (in some cases you can even see auto-vectorization
>>> kicking in).
>>>
>>> If the VarHandle is not constant - well, none of these optimizations
>>> will occur - meaning that your memory access will easily be 10x
>>> slower. The reason MemoryAccess works is that it works on a number
>>> of predefined VarHandles which are created as static constants under
>>> the hood, once and for all.
>>>
>>> But your API would require a _fresh_ VarHandle to be created on every
>>> call, based on the coordinates passed in. Hence, the var handle would
>>> not be constant, and performance would suffer big time.
>>>
>>> The fine line we're walking in this project is to expose the tools
>>> and the knobs which allow clients to perform memory access/foreign
>>> function access in the fastest possible way we know of/is possible
>>> within the JVM. To do that, sometimes (not always) we have to "look
>>> the other way" when it comes to usability - simply because it would
>>> be impossible to have an API that is both 100% efficient and 100%
>>> usable.
>>>
>>> The main trick that users can adopt in these cases is to mediate
>>> access; that is, if there is a particular kind of struct that you
>>> want to operate with, nothing prevents you from declaring _your own_
>>> MemoryAccess-like abstraction that works for specific fields of that
>>> struct - e.g.
>>>
>>> TaggedValues.setValue(segment, 42);
>>>
>>> TaggedValues will have constant method handles (one for each field),
>>> and a bunch of accessors (a pair for each field). There is nothing
>>> magic in MemoryAccess - it's just shorthand for accessing ubiquitous
>>> primitive types. There's no reason users cannot replicate the same
>>> idiom in their code - so that clients will be _both_ fast AND
>>> usable/readable.
>>>
>>> Of course, when working with bigger libraries, there might be many
>>> structs to work with, and manually defining a "wrapper static class"
>>> for each struct might prove too tedious. But that's why we're
>>> investing in tooling: that's exactly the job that jextract does: it
>>> parses a complex C header and turns it into a bunch of static
>>> declarations which help you access your native API more quickly (as
>>> the boilerplate has been generated for you) and more safely (as the
>>> static wrappers will avoid direct VarHandle usage, which can
>>> sometimes be "sharp").
>>>
>>> Even at the jextract level, we are aware that some people would
>>> expect an API that is closer to the C world (e.g. a `Pointer` type?
>>> Struct wrappers?) - but again here our approach is to enable people
>>> to write code which targets the library they wanna use quickly (e.g.
>>> way faster than using JNI), but w/o introducing unnecessary
>>> translation steps in the middle - which would make the bindings too
>>> slow for some advanced use cases.
>>>
>>> I apologize for the (too) big reply - I hope you find it helpful to
>>> understand the "why not" part of your earlier question.
>>>
>>> Cheers
>>> Maurizio
>>>
>>>> -Markus
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Maurizio Cimadamore [mailto:maurizio.cimadamore at oracle.com]
>>>> Sent: Thursday, 25 February 2021 17:18
>>>> To: markus at headcrashing.eu; panama-dev at openjdk.java.net
>>>> Subject: Re: Using MemoryAccess with structured MemoryLayout
>>>>
>>>> Hi Markus,
>>>> to read inside the struct, you can:
>>>>
>>>> * use the MemoryAccess API - but doing so is limited - e.g.
>>>> MemoryAccess only supports access by physical offset or logical
>>>> index.
>>>>
>>>> * create your own VarHandle which points to the desired part of the
>>>> layout, and use that
>>>>
>>>> Here:
>>>>
>>>> https://download.java.net/java/early_access/jdk16/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/MemoryLayout.html
>>>>
>>>> More specifically:
>>>>
>>>>
>>>> ```
>>>> SequenceLayout taggedValues = MemoryLayout.ofSequence(5,
>>>>     MemoryLayout.ofStruct(
>>>>         MemoryLayout.ofValueBits(8, ByteOrder.nativeOrder()).withName("kind"),
>>>>         MemoryLayout.ofPaddingBits(24),
>>>>         MemoryLayout.ofValueBits(32, ByteOrder.nativeOrder()).withName("value")
>>>>     )
>>>> ).withName("TaggedValues");
>>>>
>>>> ```
>>>>
>>>> And
>>>>
>>>> ```
>>>> VarHandle valueHandle = taggedValues.varHandle(int.class,
>>>>         PathElement.sequenceElement(),
>>>>         PathElement.groupElement("value"));
>>>> ```
>>>>
>>>> Cheers
>>>> Maurizio
>>>>
>>>>
>>>> On Thu, 2021-02-25 at 17:01 +0100, Markus KARG wrote:
>>>>> On Windows, many API functions have C structs as parameters.
>>>>>
>>>>> It is rather straightforward to set up a structured MemoryLayout.
>>>>>
>>>>> In case I want to easily poke bytes into that struct, I'd like to
>>>>> use MemoryAccess.
>>>>>
>>>>> Unfortunately, there seems to be no EASY / SIMPLE way to write:
>>>>>
>>>>> MemoryAccess.setIntAt(MEMBER_OF_SUCH_A_STRUCT, VALUE_OF_THAT_MEMBER);
>>>>>
>>>>> ...or I missed it in the JavaDocs.
>>>>>
>>>>> Is this possible? If yes, how? If not, why not?
>>>>>
>>>>> -Markus
>>>>>
>>>>>