RFR: 8207851 JEP Draft: Support ByteBuffer mapped over non-volatile memory

Fri Sep 28 05:51:45 UTC 2018

Hi Andrew,

Let me first say that this issue of "ByteBuffer might not be the right answer" 
is something of a digression from the JEP discussion. I think the JEP should 
proceed forward using MBB with the API that you and Alan had discussed 
previously. At most, the discussion of the "right thing" issue might affect a 
side note in the JEP text about possible limitations and future directions of 
this effort. However, it's not a blocker to the JEP making progress as far as 
I'm concerned.

With that in mind, I'll discuss the issue of multithreaded access to ByteBuffers 
and how this bears on whether buffers are or aren't the "right answer." There 
are actually several issues that figure into the "right answer" analysis. In 
this message, though, I'll just focus on the issue of multithreaded access.

To recap (possibly for the benefit of other readers), the Buffer class doc has 
the following statement:

     Buffers are not safe for use by multiple concurrent threads. If a buffer
     is to be used by more than one thread then access to the buffer should be
     controlled by appropriate synchronization.

Buffers are primarily designed for sequential operations such as I/O or codeset 
conversion. Typical buffer operations set the mark, position, and limit before 
initiating the operation. If the operation completes partially -- not uncommon 
with I/O or codeset conversion -- the position is updated so that the operation 
can be resumed easily from where it left off.

The fact that buffers not only contain the data being operated upon but also 
mutable state information such as mark/position/limit makes it difficult to have 
multiple threads operate on different parts of the same buffer. Each thread 
would have to lock around setting the position and limit and performing the 
operation, preventing any parallelism. The typical way to deal with this is to 
create multiple buffer slices, one per thread. Each slice has its own 
mark/position/limit values but shares the same backing data.
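
For readers who haven't seen the slice-per-thread idiom, here is a minimal 
sketch of it. The class name, method name, and fill pattern are made up for 
illustration, and it uses a heap buffer only so the result is easy to inspect.

```java
import java.nio.ByteBuffer;

public class SlicePerThread {
    // Fills a shared buffer from nThreads workers, each writing through its own slice.
    static byte[] fill(int chunk, int nThreads) {
        ByteBuffer shared = ByteBuffer.allocate(chunk * nThreads);
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            // duplicate() shares the bytes but has its own mark/position/limit,
            // so carving out the slice never disturbs the state of 'shared'.
            ByteBuffer view = shared.duplicate();
            view.position(i * chunk);
            view.limit((i + 1) * chunk);
            ByteBuffer slice = view.slice();
            byte id = (byte) (i + 1);
            workers[i] = new Thread(() -> {
                while (slice.hasRemaining()) {
                    slice.put(id);   // relative put moves only this slice's position
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();            // join() makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.array();       // heap buffer, so the backing array is accessible
    }
}
```

The slices are created on the launching thread before start(), so each worker 
only ever touches its own position and limit.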

We can avoid the need for this by adding absolute bulk operations, right?

Let's suppose we were to add something like this (considering ByteBuffer only, 
setting the buffer views aside):

     get(int srcOff, byte[] dst, int dstOff, int length)
     put(int dstOff, byte[] src, int srcOff, int length)

Each thread can perform its operations on a different part of the buffer, in 
parallel, without interference from the others. Presumably these operations 
don't read or write the mark and position. Oh, wait. The existing absolute put 
and get overloads *do* respect the buffer's limit, so the absolute bulk 
operations ought to as well. This means they do depend on shared state. (I guess 
we could make the absolute bulk ops not respect the limit, but that seems 
inconsistent.)
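
To make the limit dependency concrete, here is a small single-threaded sketch 
(class and method names are illustrative only). It uses the existing 
single-byte absolute get and put, which are bounds-checked against the limit 
rather than the capacity.

```java
import java.nio.ByteBuffer;

public class LimitCheck {
    // Returns true if an absolute get past the limit is rejected.
    static boolean absoluteGetRejected() {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put(12, (byte) 42);      // absolute put: fine, 12 < limit (16)
        buf.limit(8);                // some other code shrinks the limit
        try {
            buf.get(12);             // absolute get: 12 is now >= limit
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;             // the limit, not the capacity, was checked
        }
    }
}
```

So even an absolute operation reads the limit, which any other holder of the 
same buffer object is free to change.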

OK, let's adopt an approach similar to what was described by Peter Levart a 
couple messages upthread, where a) there is an initialization step where various 
things including the limit are set properly; b) the buffer is published to the 
worker threads properly, e.g., using a lock or other suitable memory operation; 
and c) all worker threads agree only to use absolute operations and to avoid 
relative operations.

Now suppose the threads have completed their work and you want to, say, write 
the buffer's contents to a channel. You have to carefully make sure the threads 
are all finished and have properly published their results back to some central 
thread, have that central thread receive the results and set the position and 
limit, and only then can the central thread initiate the I/O operation.
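
A rough sketch of that protocol, under some assumptions of mine: Thread.start() 
and Thread.join() stand in for the publication steps, the workers use only the 
existing single-byte absolute put, and a channel over a ByteArrayOutputStream 
stands in for the real destination. All names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class PhasedAccess {
    static byte[] run(int chunk, int nThreads) {
        // Phase (a): one thread sets things up; limit == capacity after allocate().
        ByteBuffer buf = ByteBuffer.allocate(chunk * nThreads);

        // Phases (b) and (c): start() publishes buf; workers use absolute ops only.
        Thread[] workers = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            int base = i * chunk;
            byte id = (byte) (i + 1);
            workers[i] = new Thread(() -> {
                for (int j = 0; j < chunk; j++) {
                    buf.put(base + j, id);   // never touches position or mark
                }
            });
            workers[i].start();
        }

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            for (Thread t : workers) {
                t.join();                    // results published back to this thread
            }
            // Back in the "relative" phase: set position and limit, then do the I/O.
            buf.position(0);
            buf.limit(chunk * nThreads);
            try (WritableByteChannel ch = Channels.newChannel(sink)) {
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```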

This can certainly be made to work.

But note what we just did. We now have an API where:

  - there are different "phases", where in one phase all the methods work, but 
in another phase only certain methods work (otherwise it breaks silently);

  - you have to carefully control all the code to ensure that the wrong methods 
aren't called when the buffer is in the wrong phase (otherwise it breaks 
silently); and

  - you can't hand off the buffer to a library (3rd party or JDK) without 
carefully orchestrating a transition into the right phase (otherwise it breaks 
silently).
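
As one illustration of "breaks silently", consider a single-threaded sketch 
where a hypothetical helper (peekFirst, invented here) uses a relative get 
while the rest of the code assumes absolute-only access. Nothing throws. The 
next relative consumer, such as a channel write, just sees one byte fewer.

```java
import java.nio.ByteBuffer;

public class SilentBreak {
    // Hypothetical library helper that reads with a *relative* get.
    static byte peekFirst(ByteBuffer buf) {
        return buf.get();                // advances the shared position by one
    }

    // Returns how many bytes a subsequent relative operation would transfer.
    static int remainingAfter(boolean callHelper) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        for (int i = 0; i < 8; i++) {
            buf.put(i, (byte) i);        // absolute puts leave position at 0
        }
        if (callHelper) {
            peekFirst(buf);              // wrong method for this phase, no error
        }
        return buf.remaining();          // limit - position
    }
}
```

Here remainingAfter(false) is 8 and remainingAfter(true) is 7, with no 
exception or warning to indicate that anything went wrong.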

Frankly, this is pretty crappy. It's certainly possible to work around it. 
People do, and it is painful, and they complain about it up and down all day 
long (and rightfully so).

Note that this discussion is based primarily on looking at the ByteBuffer API. I 
have not done extensive investigation of the impact of the various buffer views 
(IntBuffer, LongBuffer, etc.), nor have I looked thoroughly at the 
implementations. I have no doubt that we will run into additional issues when we 
do those investigations.

If we were designing an API to support multi-threaded access to memory regions, 
it would almost certainly look nothing like the buffer API. This is what Alan 
means by "buffers might not be the right answer." As things stand, it appears 
quite difficult to me to fix the multi-threaded access problem without turning 
buffers into something they aren't, or fragmenting the API in some complex and 
uncomfortable way.

Finally, note that this is not an argument against adding bulk absolute 
operations! I think we should probably go ahead and do that anyway. But let's 
not fool ourselves into thinking that bulk absolute operations solve the 
multi-threaded buffer access problem.

s'marks