RFR: 8207851 JEP Draft: Support ByteBuffer mapped over non-volatile memory

Stuart Marks stuart.marks at oracle.com
Fri Sep 28 21:14:00 UTC 2018


Hi Francesco,

Thanks for the pointer to the AtomicBuffer stuff. It's quite interesting.

I don't know how directly relevant this JEP is to your work. I guess that's really 
up to you and possibly Andrew Dinn. However, in my thinking, if you have useful 
comments and relevant questions, you're certainly welcome to participate in the 
discussion.

Looking at the AtomicBuffer interface, I see that it supports reading and 
writing of a variety of data items, with a few different memory access modes. 
That reminds me of the VarHandles API. [1] This enables quite a number of 
different operations on a data item somewhere in memory, with a variety of 
memory access modes. What would AtomicBuffer look like if it were to use 
VarHandles? Or would AtomicBuffer be necessary at all if the rest of the library 
were to use VarHandles?

Note that a VarHandle can be used to access an arbitrary item within a region of 
memory, such as an array or a ByteBuffer. [2] An obvious extension to VarHandle 
is to allow a long offset, not just an int offset.
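
As a rough sketch of what that could look like (the class and member names here 
are just invented for illustration), AtomicBuffer-style operations such as a 
volatile int read and an int compare-and-set can already be expressed over a 
ByteBuffer with a view VarHandle:

    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.VarHandle;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    class VarHandleOverBuffer {
        // int view over the bytes of a ByteBuffer; coordinates are (ByteBuffer, int offset)
        static final VarHandle INT_HANDLE =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

        // volatile read of the int at the given byte offset
        // (the offset must be int-aligned for the atomic access modes)
        static int getIntVolatile(ByteBuffer bb, int offset) {
            return (int) INT_HANDLE.getVolatile(bb, offset);
        }

        // atomic compare-and-set of the int at the given byte offset
        static boolean compareAndSetInt(ByteBuffer bb, int offset, int expected, int value) {
            return INT_HANDLE.compareAndSet(bb, offset, expected, value);
        }
    }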

Note also that while many VarHandle methods return Object and take a varargs 
parameter of Object..., this does not imply that primitives are boxed! This is a 
bit of VM magic called "signature polymorphism"; see JVMS 2.9.3 [3].
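
Concretely, in the sketch above the cast on the result is what selects the 
exact signature at the call site, so the value comes back as a primitive 
(buf here being just some ByteBuffer):

    // The (int) cast picks the signature-polymorphic type int; the value is
    // read and returned directly -- no Integer is ever allocated.
    int v = (int) INT_HANDLE.getVolatile(buf, 0);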

s'marks

[1] 
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/invoke/VarHandle.html

[2] 
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/invoke/MethodHandles.html#byteBufferViewVarHandle(java.lang.Class,java.nio.ByteOrder)

[3] https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-2.html#jvms-2.9.3

On 9/28/18 12:38 AM, Francesco Nigro wrote:
> Hi guys!
>
> I'm one of the mentioned devs (like many others) who are using external (and 
> unsafe) APIs for concurrent access to a ByteBuffer's contents, and a developer of a 
> messaging broker's journal
> that would benefit from this JEP :)
> Regarding the concurrent access API, how does this look: 
> https://github.com/real-logic/agrona/blob/master/agrona/src/main/java/org/agrona/concurrent/AtomicBuffer.java?
>
> note:
> I don't know how it's viewed to appear in these discussions without 
> introducing myself, and I hope this isn't off-topic, but both this JEP and the comments 
> around it are so interesting
> that I couldn't resist: I apologize if I'm not respecting some rule here.
>
> Thanks for the hard work,
> Francesco
>
> On Fri, 28 Sep 2018 at 09:21, Peter Levart <peter.levart at gmail.com> wrote:
>
>     Hi Stuart,
>
>     I mostly agree with your assessment about the suitability of the
>     ByteBuffer API for nice multithreaded use. What would such an API look like?
>     I think pretty much like ByteBuffer but without things that mutate
>     mark/position/limit/ByteOrder. A stripped-down ByteBuffer API therefore.
>     That would be in my opinion the most low-level API possible. If you add
>     things to such an API that coordinate multithreaded access to the underlying
>     memory, you are already creating a concurrent data structure for a
>     particular set of use cases, which might not cover all possible use cases
>     or be sub-optimal for some of them. So I think this is better layered on
>     top of such an API, not built into it. Low-level multithreaded access to
>     memory is, in my opinion, always going to be "unsafe" from the standpoint
>     of coordination. It's not only the mark/position/limit/ByteOrder that is
>     not multithreaded-friendly about the ByteBuffer API, but the underlying memory
>     too. It would be nice if mark/position/limit/ByteOrder weren't in the way
>     though.
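>
>     Purely as an illustration of the shape (the name and method set here are
>     invented, not a proposal), such a stripped-down API might consist only of
>     stateless, absolute accessors:
>
>         interface MemoryRegion {
>             long size();
>             byte getByte(long offset);
>             void putByte(long offset, byte value);
>             int getInt(long offset, java.nio.ByteOrder order);
>             void putInt(long offset, int value, java.nio.ByteOrder order);
>             // no mark/position/limit and no implicit ByteOrder state
>         }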
>
>     Regards, Peter
>
>
>     On 09/28/2018 07:51 AM, Stuart Marks wrote:
>>     Hi Andrew,
>>
>>     Let me first say that this issue of "ByteBuffer might not be the right
>>     answer" is something of a digression from the JEP discussion. I think the
>>     JEP should proceed forward using MBB with the API that you and Alan had
>>     discussed previously. At most, the discussion of the "right thing" issue
>>     might affect a side note in the JEP text about possible limitations and
>>     future directions of this effort. However, it's not a blocker to the JEP
>>     making progress as far as I'm concerned.
>>
>>     With that in mind, I'll discuss the issue of multithreaded access to
>>     ByteBuffers and how this bears on whether buffers are or aren't the
>>     "right answer." There are actually several issues that figure into the
>>     "right answer" analysis. In this message, though, I'll just focus on the
>>     issue of multithreaded access.
>>
>>     To recap (possibly for the benefit of other readers) the Buffer class doc
>>     has the following statement:
>>
>>         Buffers are not safe for use by multiple concurrent threads. If a
>>         buffer is to be used by more than one thread then access to the
>>         buffer should be controlled by appropriate synchronization.
>>
>>     Buffers are primarily designed for sequential operations such as I/O or
>>     codeset conversion. Typical buffer operations set the mark, position, and
>>     limit before initiating the operation. If the operation completes
>>     partially -- not uncommon with I/O or codeset conversion -- the position
>>     is updated so that the operation can be resumed easily from where it left
>>     off.
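>>
>>     The usual idiom for writing a buffer to a channel relies on exactly
>>     this: the position advances as bytes are written, so a partial write
>>     can simply be retried. A sketch (buf being a ByteBuffer, ch some
>>     WritableByteChannel):
>>
>>         buf.flip();                 // limit = old position, position = 0
>>         while (buf.hasRemaining()) {
>>             ch.write(buf);          // advances position by the bytes written
>>         }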
>>
>>     The fact that buffers not only contain the data being operated upon but
>>     also mutable state information such as mark/position/limit makes it
>>     difficult to have multiple threads operate on different parts of the same
>>     buffer. Each thread would have to lock around setting the position and
>>     limit and performing the operation, preventing any parallelism. The
>>     typical way to deal with this is to create multiple buffer slices, one
>>     per thread. Each slice has its own mark/position/limit values but shares
>>     the same backing data.
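>>
>>     For illustration (off and len standing for each thread's region):
>>
>>         ByteBuffer dup = buf.duplicate();   // independent mark/position/limit
>>         dup.position(off).limit(off + len);
>>         ByteBuffer slice = dup.slice();     // covers [off, off+len), shares the data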
>>
>>     We can avoid the need for this by adding absolute bulk operations, right?
>>
>>     Let's suppose we were to add something like this (considering ByteBuffer
>>     only, setting the buffer views aside):
>>
>>         get(int srcOff, byte[] dst, int dstOff, int length)
>>         put(int dstOff, byte[] src, int srcOff, int length)
>>
>>     Each thread can perform its operations on a different part of the buffer,
>>     in parallel, without interference from the others. Presumably these
>>     operations don't read or write the mark and position. Oh, wait. The
>>     existing absolute put and get overloads *do* respect the buffer's limit,
>>     so the absolute bulk operations ought to as well. This means they do
>>     depend on shared state. (I guess we could make the absolute bulk ops not
>>     respect the limit, but that seems inconsistent.)
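>>
>>     With those proposed methods, each worker would touch only its own region
>>     of the buffer (base and chunk being per-thread values):
>>
>>         // absolute bulk put: no mark/position update, though it would still
>>         // be bounds-checked against the shared limit
>>         buf.put(base, chunk, 0, chunk.length);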
>>
>>     OK, let's adopt an approach similar to what was described by Peter Levart
>>     a couple messages upthread, where a) there is an initialization step
>>     where various things including the limit are set properly; b) the buffer
>>     is published to the worker threads properly, e.g., using a lock or other
>>     suitable memory operation; and c) all worker threads agree only to use
>>     absolute operations and to avoid relative operations.
>>
>>     Now suppose the threads have completed their work and you want to, say,
>>     write the buffer's contents to a channel. You have to carefully make sure
>>     the threads are all finished and have properly published their results
>>     back to some central thread; that central thread then receives the
>>     results and sets the position and limit, after which it can initiate the
>>     I/O operation.
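>>
>>     In outline (a sketch only; the names and sizes are illustrative, the put
>>     used is the proposed absolute bulk put, and the executor provides the
>>     safe publication for steps (b) and (c)):
>>
>>         ByteBuffer buf = ByteBuffer.allocateDirect(NTHREADS * CHUNK);
>>         buf.clear();                                    // (a) position 0, limit = capacity
>>         ExecutorService pool = Executors.newFixedThreadPool(NTHREADS);
>>         for (int t = 0; t < NTHREADS; t++) {
>>             final int base = t * CHUNK;
>>             pool.submit(() -> {                         // (b) publish buf to the worker
>>                 byte[] chunk = produceChunk(base);      // hypothetical worker computation
>>                 buf.put(base, chunk, 0, chunk.length);  // (c) absolute operations only
>>             });
>>         }
>>         pool.shutdown();
>>         pool.awaitTermination(1, TimeUnit.MINUTES);     // workers' writes now visible here
>>                                                         // (InterruptedException handling elided)
>>         buf.position(0).limit(buf.capacity());          // back to single-threaded relative use
>>         while (buf.hasRemaining()) {
>>             channel.write(buf);                         // central thread does the I/O
>>         }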
>>
>>     This can certainly be made to work.
>>
>>     But note what we just did. We now have an API where:
>>
>>      - there are different "phases", where in one phase all the methods work,
>>     but in another phase only certain methods work (otherwise it breaks
>>     silently);
>>
>>      - you have to carefully control all the code to ensure that the wrong
>>     methods aren't called when the buffer is in the wrong phase (otherwise it
>>     breaks silently); and
>>
>>      - you can't hand off the buffer to a library (3rd party or JDK) without
>>     carefully orchestrating a transition into the right phase (otherwise it
>>     breaks silently).
>>
>>     Frankly, this is pretty crappy. It's certainly possible to work around
>>     it. People do, and it is painful, and they complain about it up and down
>>     all day long (and rightfully so).
>>
>>     Note that this discussion is based primarily on looking at the ByteBuffer
>>     API. I have not done extensive investigation of the impact of the various
>>     buffer views (IntBuffer, LongBuffer, etc.), nor have I looked thoroughly
>>     at the implementations. I have no doubt that we will run into additional
>>     issues when we do those investigations.
>>
>>     If we were designing an API to support multi-threaded access to memory
>>     regions, it would almost certainly look nothing like the buffer API. This
>>     is what Alan means by "buffers might not be the right answer." As things
>>     stand, it appears quite difficult to me to fix the multi-threaded access
>>     problem without turning buffers into something they aren't, or
>>     fragmenting the API in some complex and uncomfortable way.
>>
>>     Finally, note that this is not an argument against adding bulk absolute
>>     operations! I think we should probably go ahead and do that anyway. But
>>     let's not fool ourselves into thinking that bulk absolute operations
>>     solve the multi-threaded buffer access problem.
>>
>>     s'marks
>>
>

