access modes for pointers and memory regions

John Rose john.r.rose at oracle.com
Tue Jan 22 21:21:07 UTC 2019


`Pointer.AccessMode` tells whether a pointer is allowed to read
and/or write the memory behind the pointer.

http://hg.openjdk.java.net/panama/dev/file/9c4e3cc4ce5e/src/java.base/share/classes/java/foreign/memory/Pointer.java#l233

The access mode of a pointer is not meant to provide unambiguous
information about the memory behind the pointer, but rather to
restrict the operations on that memory *via that pointer*.  The
pointer acts as a capability to read and/or write the memory,
but there may be other capabilities that can do things the
pointer cannot.  Holding a pointer with mode `AccessMode.READ`
does not prevent another pointer from writing the same memory.
I'm belaboring the obvious here, but it's for a good cause…
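
As a rough illustration of the capability idea, here is a minimal sketch that uses `java.nio.ByteBuffer` as a stand-in (an analogy only, not the Panama `Pointer` API).  The class and method names are made up for this example.  A read-only view rejects writes through itself, yet it still observes writes made through the original buffer:

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

// Hypothetical demo class; ByteBuffer stands in for a pointer with an access mode.
final class ReadOnlyViewDemo {
    /** Returns the byte seen through a read-only view after the owner writes. */
    static byte readThroughViewAfterOwnerWrite() {
        ByteBuffer owner = ByteBuffer.allocate(4);       // read/write capability
        ByteBuffer readOnly = owner.asReadOnlyBuffer();  // read-only capability, same memory
        owner.put(0, (byte) 42);                         // write via the other capability
        return readOnly.get(0);                          // the read-only view observes the write
    }

    /** Returns true if a write through the read-only view is rejected. */
    static boolean writeThroughViewIsRejected() {
        ByteBuffer readOnly = ByteBuffer.allocate(4).asReadOnlyBuffer();
        try {
            readOnly.put(0, (byte) 1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }
}
```

The read-only mode here describes the view, not the memory: nothing about `readOnly` tells you whether the bytes can change.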

A `BoundedMemoryRegion` has a mode flag also:

http://hg.openjdk.java.net/panama/dev/file/9081e5f050d7/src/java.base/share/classes/jdk/internal/foreign/memory/BoundedMemoryRegion.java#l46

This mode flag, like the pointer's mode flag, restricts the
capability represented by the memory region.  It is not an
absolute statement about the memory referred to by the
region.  For example, a queue head
might be implemented with a segment of memory, which
one end of the queue writes and the other reads, and
both pointers and memory regions might refer to that
memory, but with disjoint capabilities; one side reads
only and one side writes only, but the memory itself,
apart from any capabilities that access it, is best
described as `AccessMode.READ_WRITE`.

If a memory region is marked as `READ` there is
nothing to prevent other parties, with write permissions,
from writing to the same memory.

This is a useful quality, as in the case of queues,
or with divide-and-conquer parallel algorithms,
where each thread can read a large region of memory
but only one thread may write some assigned sub-region
which the other threads may only read.
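
The divide-and-conquer case might be sketched as follows.  This is only an illustration under assumed names (`SharedRegion`, `DivideAndConquerDemo`, and a plain `int[]` standing in for native memory); it is not taken from the Panama sources.  Each worker gets a capability that can read everything but can write only its own half-open sub-range:

```java
// Hypothetical capability: read anywhere, write only within [writeLo, writeHi).
final class SharedRegion {
    private final int[] memory;   // shared backing memory, absolutely read/write
    private final int writeLo;
    private final int writeHi;

    SharedRegion(int[] memory, int writeLo, int writeHi) {
        this.memory = memory;
        this.writeLo = writeLo;
        this.writeHi = writeHi;
    }

    int get(int i) {
        return memory[i];         // reads are allowed over the whole region
    }

    void set(int i, int v) {
        if (i < writeLo || i >= writeHi) {
            throw new IllegalStateException("no write access at index " + i);
        }
        memory[i] = v;
    }
}

final class DivideAndConquerDemo {
    /** Fills memory[i] = i * i, with each worker writing only its own slice. */
    static int[] fillInParallel(int size, int workers) {
        int[] memory = new int[size];
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            int lo = w * size / workers;
            int hi = (w + 1) * size / workers;
            SharedRegion mine = new SharedRegion(memory, lo, hi);
            threads[w] = new Thread(() -> {
                for (int i = lo; i < hi; i++) {
                    mine.set(i, i * i);
                }
            });
            threads[w].start();
        }
        try {
            for (Thread t : threads) {
                t.join();         // join makes the workers' writes visible to the caller
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return memory;
    }
}
```

Every `SharedRegion` refers to the same read/write memory; only the capabilities differ.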

The absolute status of a segment of memory may be
read/write, read-only, or write-only.  That last
category makes sense only in limited settings, such as
interprocess shared memory.  But the first two are
widespread.

Examples of absolutely read/write memory are obvious,
starting with any Java array or malloc buffer, unless either
is protected somehow against writes.  Standard
examples of read-only memory include the
characters of a Java string or of a C char array
placed in (virtual) ROM.

Thus we may use modes like `AccessMode.READ`
either in an absolute or in a relative sense, and
there is some wiggle room for the same object
having differing modes in the two senses.
Some combinations probably don't make sense,
like a read/write pointer into a read-only memory
segment.

Here are the useful combinations, I think:

R/RO: I can access this immutable segment.
R/RW: I can read but not write this writable segment.
W/RW: I can write but not read this writable segment.
RW/RW: I have full access to this segment.
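
To make the pairing of relative and absolute modes concrete, here is a small sketch that enumerates the combinations.  The enums `Capability` and `SegmentStatus` and the rule in `isUseful` are assumptions made for illustration; they only encode the observation that a capability which includes writing makes no sense on a read-only segment.  (Write-only segments are left out, as in the list above.)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names: the relative mode of a pointer or region...
enum Capability { R, W, RW }

// ...and the absolute status of the memory it refers to.
enum SegmentStatus { RO, RW }

final class ModeCombinations {
    /** A writing capability on a read-only segment is the case that makes no sense. */
    static boolean isUseful(Capability cap, SegmentStatus seg) {
        return seg == SegmentStatus.RW || cap == Capability.R;
    }

    /** Lists the useful combinations as "capability/segment" strings. */
    static List<String> usefulCombinations() {
        List<String> result = new ArrayList<>();
        for (Capability cap : Capability.values()) {
            for (SegmentStatus seg : SegmentStatus.values()) {
                if (isUseful(cap, seg)) {
                    result.add(cap + "/" + seg);
                }
            }
        }
        return result;
    }
}
```

Calling `ModeCombinations.usefulCombinations()` yields the four entries R/RO, R/RW, W/RW, RW/RW, in that order.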

The only combination not representable with
`AccessMode` is the first (R/RO).  One way to
elevate it as an explicit property would be like
this:

```
    enum AccessMode {
        /**
         * A read access mode.
         * May be aliased to writable content.
         */
        READ(1 << 0),
        /**
         * A write access mode.
         * May be aliased to readable content.
         */
        WRITE(1 << 1),
        /**
         * A read access mode.
         * Never aliased to writable content.
         * Only content with this access
         * is guaranteed to be immutable.
         */
        IMMUTABLE(READ.value | (1 << 2)),
        /**
         * A read and write access mode.
         * May be aliased with read or write modes.
         */
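        READ_WRITE(READ.value | WRITE.value);

        // Bit mask: bit 0 = read, bit 1 = write,
        // bit 2 = never aliased to writable content.
        final int value;

        AccessMode(int value) {
            this.value = value;
        }
    }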
```

Or pointers and memory regions could be given
semi-independent queries `isAccessibleFor(m)`
and `memorySegmentIsAccessibleFor(m)`.
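
A minimal sketch of the two-query alternative might look like the following.  Only the two query names come from the text above; the `Mode` enum, the `SketchPointer` class, its constructor check, and `isImmutable` are assumptions for illustration, not the actual Panama types.

```java
// Hypothetical stand-in for AccessMode, with a bit mask per mode.
enum Mode {
    READ(1), WRITE(2), READ_WRITE(3);

    final int bits;

    Mode(int bits) {
        this.bits = bits;
    }

    /** True if this mode grants everything that m asks for. */
    boolean includes(Mode m) {
        return (bits & m.bits) == m.bits;
    }
}

// Hypothetical pointer that carries both a relative and an absolute mode.
final class SketchPointer {
    private final Mode capability;     // what may be done via this pointer
    private final Mode segmentStatus;  // what may be done to the memory by anyone

    SketchPointer(Mode capability, Mode segmentStatus) {
        // Reject nonsensical pairs such as a read/write pointer into read-only memory.
        if (!segmentStatus.includes(capability)) {
            throw new IllegalArgumentException(capability + " exceeds " + segmentStatus);
        }
        this.capability = capability;
        this.segmentStatus = segmentStatus;
    }

    /** Relative query: may this pointer be used for mode m? */
    boolean isAccessibleFor(Mode m) {
        return capability.includes(m);
    }

    /** Absolute query: may anyone access the underlying memory in mode m? */
    boolean memorySegmentIsAccessibleFor(Mode m) {
        return segmentStatus.includes(m);
    }

    /** The memory is immutable exactly when nobody can write it. */
    boolean isImmutable() {
        return !memorySegmentIsAccessibleFor(Mode.WRITE);
    }
}
```

With this shape, R/RO and R/RW pointers answer `isAccessibleFor` identically and differ only in the segment query.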

Here's why I'm shining a light on this corner case:
Unless there is a formalized and explicit indication
of when memory is absolutely read-only, we will
not be able to extend Java's safe publication guarantees
to objects which are created by Panama-based factories.
If you build a data structure and then share it with
other threads, there are several things that can go
badly wrong.  But if your VM is willing to keep track
of which memory segments are shareable, you can
catch bugs which otherwise could go undetected.

Of course, the kind of "keeping track" I mean here
includes the following (none of which are new ideas):

- thread-confined mutable data (scopes, etc.)
- thread-shareable mutable data (lockable owner metadata)
- thread-shareable frozen data (strings, ROM, etc.)
- defensive copies of input data

The last one, defensive copies, should be part of the
API of pointers.  A defensive copy should convert
a mutable segment into a freshly copied immutable
or confined private copy.  (There are two ways to
do defensive copies:  We might call them "clone"
and "freeze" in the setting of Java objects.)  The
defensive copy should be idempotent on frozen
objects, so that repeated freezing is only as expensive
as the making of the first copy.
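
The "clone" and "freeze" flavors could be sketched like this.  `ByteSegment` and all of its methods are hypothetical names over a plain `byte[]`, chosen for illustration; the linked sketch may differ.  The point to notice is that `freeze()` returns the receiver when it is already frozen, so repeated freezing costs one copy in total:

```java
// Hypothetical segment type over a private byte[]; not the Panama API.
final class ByteSegment {
    private final byte[] data;
    private final boolean frozen;

    private ByteSegment(byte[] data, boolean frozen) {
        this.data = data;
        this.frozen = frozen;
    }

    /** Creates a mutable segment holding a private copy of the given bytes. */
    static ByteSegment mutableOf(byte... bytes) {
        return new ByteSegment(bytes.clone(), false);
    }

    /** "clone": always a fresh, private, mutable copy. */
    ByteSegment copy() {
        return new ByteSegment(data.clone(), false);
    }

    /** "freeze": a fresh immutable copy, or this segment if it is already frozen. */
    ByteSegment freeze() {
        return frozen ? this : new ByteSegment(data.clone(), true);
    }

    boolean isFrozen() {
        return frozen;
    }

    byte get(int i) {
        return data[i];
    }

    void set(int i, byte v) {
        if (frozen) {
            throw new UnsupportedOperationException("segment is frozen");
        }
        data[i] = v;
    }
}
```

Because a frozen segment's backing array is never shared with a mutable one, later writes to the original do not show through the frozen copy.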

Getting the details right on this will allow native
data to interoperate better with Java's multi-threaded
APIs.  Getting the details right requires a little extra
attention to the difference between a read-only
capability and an absolutely read-only memory segment.

Here's a sketch of the sort of thing I mean.

http://cr.openjdk.java.net/~jrose/panama/immutable-sketch

The sketch also includes better bounds checking for
on-heap memory regions.

— John

