Separation between MemorySegment and MemoryScope
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Mar 26 22:52:19 UTC 2021
On 26/03/2021 22:27, Remi Forax wrote:
> Hi all,
> I've taken a look at the doc about the separation between MemorySegment/MemoryScope (the branch panama-foreign is still building ...)
>
> I like this change, it makes things conceptually cleaner (in my case, being able to close several segments at the same time),
> but at the same time I'm still mourning the fact that, before this change, a MemorySegment was confined to a thread by default, which I really liked.
I see what you mean. But see below.
>
> Thinking a little more about that, I wonder if having a default behavior with respect to the MemoryScope is a good idea,
> I wonder if it would be better to go fully explicit, i.e. to have the methods that create a MemorySegment always take the scope as the last parameter.
> It forces us users to think in terms of scopes instead of using a default which may not be a good one depending on the application/library.
>
> It also has the advantage of cutting the number of methods by half, which is a huge bonus because it keeps things simple.
In a way, I get what you mean. If you look at MemorySegment and you
think: I'm always gonna want deterministic deallocation, the current
state of things may feel odd.
The assumption that drove us into this API shape was that, perhaps,
after the dust settles, deterministic deallocation, while useful in
places, might not need to be "so in your face". In a way, it almost
feels like we want new features to stand out (which is understandable).
But let's look at it from the perspective of a developer who needs to
abandon the ByteBuffer API because they need more than 32 bits of
indexing, or more structured access juice, whatever.
Are we 100% sure that forcing deterministic deallocation (or even
_thinking_ about that possibility, by having to create a scope) is the
right move? Sure, that's explicit, but in a lot of cases the Cleaner is
"just fine" (TM) [in fact ByteBuffer is still used quite a bit].
In other words, we think there's a space where you can use the Foreign
Memory and Linker APIs and simply _not care_ about when things get
deallocated (as long as they do).
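
To make the two styles concrete, here is a minimal sketch of Cleaner-backed
(implicit) release next to an explicit close(). It uses java.lang.ref.Cleaner
directly rather than the segment API discussed in this thread; the
CleanerSketch and Resource classes and the FREED counter are hypothetical
stand-ins, and "freeing" is only simulated by bumping a counter.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerSketch {
    // Counts how many times the (pretend) off-heap memory was released.
    public static final AtomicInteger FREED = new AtomicInteger();
    private static final Cleaner CLEANER = Cleaner.create();

    // Hypothetical stand-in for an off-heap resource; not a real API.
    public static final class Resource implements AutoCloseable {
        // The cleanup action must not reference the Resource itself,
        // otherwise the Resource could never become unreachable.
        private static final class Free implements Runnable {
            @Override public void run() { FREED.incrementAndGet(); }
        }

        private final Cleaner.Cleanable cleanable;

        public Resource() {
            this.cleanable = CLEANER.register(this, new Free());
        }

        // Explicit, deterministic release. Cleanable.clean() runs the
        // action at most once, so a repeated close() is harmless.
        @Override public void close() { cleanable.clean(); }
    }

    public static void main(String[] args) {
        // Explicit style: released when the try block exits.
        try (Resource r = new Resource()) {
            // use r here
        }
        System.out.println("freed after close: " + FREED.get()); // prints "freed after close: 1"

        // Implicit style: never closed; the Cleaner releases it at some
        // point after the object becomes unreachable, with no timing guarantee.
        new Resource();
        System.gc(); // only a hint to the JVM
    }
}
```

The same object supports both styles: callers who care about timing use
try-with-resources, everyone else just lets the reference go.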
Simple proof from a (very) real-world use case: jextract. After moving to
the new API, extracting Windows.h with jextract is 2x faster. I was
pretty sure it was a bug - but then, after verifying, no, it was still
generating the same stuff as before. So what happened? As it goes,
libclang uses structs returned by value _a lot_ (everywhere, almost) -
we wrapped these segments in Java classes and... then we forgot to close
them.
Of course that was sloppy on our part - we were running with massive
memory leaks; but there's also a lesson in that: by enabling implicit
deallocation jextract got 2x faster! Our buggy code was just auto-fixed
(of course after we learned that, we started double checking the various
calls, so we now think it's correct :-)).
So, there you go, a non-trivial piece of code, which is using implicit
deallocation almost everywhere.
Could we make it even better with explicit deallocation? Maybe. Should
we spend time doing so? Probably not, until there's some evidence that
points to the fact that the current scheme is causing issues.
Given the recent patch that we pushed earlier today, which makes
interacting with native libraries even safer with implicit segments, I
think a lot of use cases will be covered just with that, after seeing
what happened with jextract.
As for confinement, again, coming from ByteBuffer (BB), it is too much of a
transition to see your segments confined by default. If you don't want
them to be, then you have to learn about scopes, etc. - but maybe you
only came here because of VarHandles!!!
And, confinement by default had another fatal flaw - in that it was
inconsistent for segments created from arrays and buffers which, after
all, were shared by default. So that's why we made the default shared,
and implicit.
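
For readers new to the term, this is roughly what "confined" means in
practice. The sketch below is plain Java with a hypothetical ConfinedCell
class (not the MemorySegment API, and not how the real implementation checks
ownership): only the creating thread may access it, and any other thread gets
an exception. A shared object would simply skip the owner check.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ConfinementSketch {
    // Hypothetical wrapper: only the thread that created it may touch it.
    public static final class ConfinedCell {
        private final Thread owner = Thread.currentThread();
        private int value;

        private void checkOwner() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("accessed outside owner thread");
            }
        }

        public int get() { checkOwner(); return value; }
        public void set(int v) { checkOwner(); value = v; }
    }

    // Returns true if an access attempted from a second thread was rejected.
    public static boolean rejectedFromOtherThread(ConfinedCell cell) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread t = new Thread(() -> {
            try {
                cell.get();
            } catch (IllegalStateException e) {
                failure.set(e);
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return failure.get() != null;
    }

    public static void main(String[] args) {
        ConfinedCell cell = new ConfinedCell();
        cell.set(42);                                      // fine: owner thread
        System.out.println(cell.get());                    // prints 42
        System.out.println(rejectedFromOtherThread(cell)); // prints true
    }
}
```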
We're not totally closed to making changes to defaults, but I think the
API, as it stands, has a nice "progression" where you can go from "just
using segments as if they were BB" and end up doing very complex stuff
with scopes and allocators - but you only learn what you want to learn,
that's the idea, at least.
>
>
> Another question, at some point, I would like to use a MemorySegment with the Vector API* and it would be very cool if there was a way to deactivate the bound checks only for some MemorySegments, not all.
> Given we have ofNativeRestricted, i wonder if having a noBoundChecks for a particular MemoryScope (composable with the rest) with the same restriction (the property foreign.restricted has to be set) is something possible or not.
There were some ideas for removing bound checks from the everything segment:
https://github.com/openjdk/panama-foreign/pull/431
I don't think progress has been made since, but it's defo something that
can be looked at. Elsewhere Paul said that Vector will look at
integration with MemorySegment, but can only do so after MemorySegment
lands in java.base (aaaah these incubating modules).
Cheers
Maurizio
>
> regards,
> Rémi
>
> * currently I'm using a ByteBuffer (from MemorySegment.asByteBuffer()) but I'm having a hard time getting the JIT to remove the bound checks; it seems that the fact that an index is < buffer.capacity() has no effect on the generated code (perhaps it's because the fields of ByteBuffer are not tagged with @Stable ?).