RFR: 8207851 JEP Draft: Support ByteBuffer mapped over non-volatile memory

Fri Sep 28 07:21:13 UTC 2018

Hi Stuart,

I mostly agree with your assessment of the suitability of the 
ByteBuffer API for multithreaded use. What would a suitable API look 
like? I think pretty much like ByteBuffer, but without the things that 
mutate mark/position/limit/ByteOrder: a stripped-down ByteBuffer API, in 
other words. That would, in my opinion, be the lowest-level API 
possible. If you add things to such an API that coordinate multithreaded 
access to the underlying memory, you are already creating a concurrent 
data structure for a particular set of use cases, which might not cover 
all possible use cases or might be sub-optimal for some of them. So I 
think coordination is better layered on top of such an API, not built 
into it. Low-level multithreaded access to memory is, in my opinion, 
always going to be "unsafe" from the standpoint of coordination. It is 
not only mark/position/limit/ByteOrder that make the ByteBuffer API 
unfriendly to multithreading; the underlying memory is not 
thread-friendly either. It would be nice if 
mark/position/limit/ByteOrder weren't in the way, though.
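
A rough sketch of what such a stripped-down API could look like, for 
illustration only. The class name FixedRegion and its methods are 
hypothetical and not part of the JDK. It assumes a private duplicate of 
an existing ByteBuffer whose position, limit and byte order are set once 
in the constructor and never changed afterwards, and it exposes only 
absolute accessors. It deliberately does nothing to coordinate access to 
the memory itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical absolute-only view over a ByteBuffer's memory; not a JDK API. */
final class FixedRegion {
    // Private duplicate: its mark/position/limit/order are independent of the
    // source buffer and are never modified after construction.
    private final ByteBuffer buf;

    FixedRegion(ByteBuffer source, ByteOrder order) {
        ByteBuffer d = source.duplicate(); // shares memory, not buffer state
        d.clear();                         // position = 0, limit = capacity
        d.order(order);                    // byte order fixed once, here
        this.buf = d;
    }

    int capacity() { return buf.capacity(); }

    // Absolute accessors only. They read the (now constant) limit for bounds
    // checks but never touch mark or position.
    byte getByte(int index)            { return buf.get(index); }
    void putByte(int index, byte v)    { buf.put(index, v); }
    int  getInt(int index)             { return buf.getInt(index); }
    void putInt(int index, int v)      { buf.putInt(index, v); }

    // No locking or fencing here: callers must still coordinate their
    // accesses to overlapping memory by some other means.

    public static void main(String[] args) {
        FixedRegion r = new FixedRegion(ByteBuffer.allocateDirect(64),
                                        ByteOrder.nativeOrder());
        r.putInt(8, 42);
        System.out.println(r.getInt(8));
    }
}
```

Anything that coordinates threads (locks, atomics, ownership of 
sub-ranges) would be layered on top of a type like this rather than 
built into it.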

Regards, Peter

On 09/28/2018 07:51 AM, Stuart Marks wrote:
> Hi Andrew,
>
> Let me first say that this issue of "ByteBuffer might not be the 
> right answer" is something of a digression from the JEP discussion. I 
> think the JEP should proceed forward using MBB with the API that you 
> and Alan had discussed previously. At most, the discussion of the 
> "right thing" issue might affect a side note in the JEP text about 
> possible limitations and future directions of this effort. However, 
> it's not a blocker to the JEP making progress as far as I'm concerned.
>
> With that in mind, I'll discuss the issue of multithreaded access to 
> ByteBuffers and how this bears on whether buffers are or aren't the 
> "right answer." There are actually several issues that figure into the 
> "right answer" analysis. In this message, though, I'll just focus on 
> the issue of multithreaded access.
>
> To recap (possibly for the benefit of other readers) the Buffer class 
> doc has the following statement:
>
>     Buffers are not safe for use by multiple concurrent threads. If a buffer
>     is to be used by more than one thread then access to the buffer should be
>     controlled by appropriate synchronization.
>
> Buffers are primarily designed for sequential operations such as I/O 
> or codeset conversion. Typical buffer operations set the mark, 
> position, and limit before initiating the operation. If the operation 
> completes partially -- not uncommon with I/O or codeset conversion -- 
> the position is updated so that the operation can be resumed easily 
> from where it left off.
>
> The fact that buffers not only contain the data being operated upon 
> but also mutable state information such as mark/position/limit makes 
> it difficult to have multiple threads operate on different parts of 
> the same buffer. Each thread would have to lock around setting the 
> position and limit and performing the operation, preventing any 
> parallelism. The typical way to deal with this is to create multiple 
> buffer slices, one per thread. Each slice has its own 
> mark/position/limit values but shares the same backing data.
>
> We can avoid the need for this by adding absolute bulk operations, right?
>
> Let's suppose we were to add something like this (considering 
> ByteBuffer only, setting the buffer views aside):
>
>     get(int srcOff, byte[] dst, int dstOff, int length)
>     put(int dstOff, byte[] src, int srcOff, int length)
>
> Each thread can perform its operations on a different part of the 
> buffer, in parallel, without interference from the others. Presumably 
> these operations don't read or write the mark and position. Oh, wait. 
> The existing absolute put and get overloads *do* respect the buffer's 
> limit, so the absolute bulk operations ought to as well. This means 
> they do depend on shared state. (I guess we could make the absolute 
> bulk ops not respect the limit, but that seems inconsistent.)
>
> OK, let's adopt an approach similar to what was described by Peter 
> Levart a couple messages upthread, where a) there is an initialization 
> step where various things including the limit are set properly; b) the 
> buffer is published to the worker threads properly, e.g., using a lock 
> or other suitable memory operation; and c) all worker threads agree 
> only to use absolute operations and to avoid relative operations.
>
> Now suppose the threads have completed their work and you want to, 
> say, write the buffer's contents to a channel. You have to carefully 
> make sure the threads are all finished and properly publish their 
> results back to some central thread, have that central thread receive 
> the results, set the position and limit, after which the central 
> thread can initiate the I/O operation.
>
> This can certainly be made to work.
>
> But note what we just did. We now have an API where:
>
>  - there are different "phases", where in one phase all the methods 
> work, but in another phase only certain methods work (otherwise it 
> breaks silently);
>
>  - you have to carefully control all the code to ensure that the wrong 
> methods aren't called when the buffer is in the wrong phase (otherwise 
> it breaks silently); and
>
>  - you can't hand off the buffer to a library (3rd party or JDK) 
> without carefully orchestrating a transition into the right phase 
> (otherwise it breaks silently).
>
> Frankly, this is pretty crappy. It's certainly possible to work around 
> it. People do, and it is painful, and they complain about it up and 
> down all day long (and rightfully so).
>
> Note that this discussion is based primarily on looking at the 
> ByteBuffer API. I have not done extensive investigation of the impact 
> of the various buffer views (IntBuffer, LongBuffer, etc.), nor have I 
> looked thoroughly at the implementations. I have no doubt that we will 
> run into additional issues when we do those investigations.
>
> If we were designing an API to support multi-threaded access to memory 
> regions, it would almost certainly look nothing like the buffer API. 
> This is what Alan means by "buffers might not be the right answer." As 
> things stand, it appears quite difficult to me to fix the 
> multi-threaded access problem without turning buffers into something 
> they aren't, or fragmenting the API in some complex and uncomfortable 
> way.
>
> Finally, note that this is not an argument against adding bulk 
> absolute operations! I think we should probably go ahead and do that 
> anyway. But let's not fool ourselves into thinking that bulk absolute 
> operations solve the multi-threaded buffer access problem.
>
> s'marks
>
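
For readers who have not used the slice-per-thread workaround that the 
quoted message describes, here is a minimal sketch of it. The class and 
method names (SlicePerThread, sliceOf, fillInParallel) are made up for 
illustration. Each worker gets its own slice, so it has private 
mark/position/limit values over the shared backing data, and the 
central thread uses Thread.join() to make sure the workers are finished 
and their writes are visible before it touches the buffer again.

```java
import java.nio.ByteBuffer;

/** Illustrative only: one private slice per worker over shared backing data. */
final class SlicePerThread {

    /** Returns a slice covering [offset, offset + length) of src's memory. */
    static ByteBuffer sliceOf(ByteBuffer src, int offset, int length) {
        ByteBuffer d = src.duplicate(); // leaves src's own state untouched
        d.clear();
        d.limit(offset + length);
        d.position(offset);
        return d.slice();               // position 0, limit == length
    }

    /** Relative puts are fine here because the slice is private to one thread. */
    static void fill(ByteBuffer slice, byte value) {
        while (slice.hasRemaining()) {
            slice.put(value);
        }
    }

    /** Fills the two halves of shared from two threads, then waits for both. */
    static void fillInParallel(ByteBuffer shared) {
        int half = shared.capacity() / 2;
        ByteBuffer lo = sliceOf(shared, 0, half);
        ByteBuffer hi = sliceOf(shared, half, shared.capacity() - half);
        Thread t1 = new Thread(() -> fill(lo, (byte) 1));
        Thread t2 = new Thread(() -> fill(hi, (byte) 2));
        t1.start();
        t2.start();
        try {
            // join() is the "publish results back" step: after it returns,
            // the workers' writes are visible to this thread.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.allocate(8);
        fillInParallel(shared);
        // Only now does the central thread set position/limit, e.g. for I/O.
        shared.clear();
        System.out.println(shared.remaining());
    }
}
```

Note that the sketch only works because the hand-off points are 
explicit: the slices are created before the workers start, and nothing 
uses the shared buffer until both joins have returned.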
