RFR: 8365017: The SegmentBulkOperations::copy method can be improved using overlaps [v7]

Wed Aug 13 14:37:12 UTC 2025

On Wed, 13 Aug 2025 14:18:42 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> If I see this right, it could be that we duplicate the copy on a byte. If we only copy 2 bytes, then the copy on the second byte is repeated:
>> 
>> src[0] -> dst[0]
>> src[1] -> dst[1]
>> src[1] -> dst[1]
>> 
>> But is that ok?
>> 
>> Assume we have threads 1 and 2, both access memory locations A and B:
>> 
>> // Initial state
>> A = x;
>> B = y;
>> 
>> // Concurrently:
>> 1: B = A;
>> 2: A = B + 1;
>> 
>> What states could be observed at the end? We could look at all permutations of the 4 operations from threads 1 and 2. Let's call the operations 1.1 (1 loads A), 1.2 (1 stores B), 2.1 (2 loads B), 2.2 (2 stores A).
>> Preserving each thread's program order, there are 4! / (2! * 2!) = 6 possible interleavings. Here they are, together with the results for A and B:
>> 
>> 1.1 1.2 2.1 2.2 -> A = x+1; B = x
>> 1.1 2.1 1.2 2.2 -> A = y+1; B = x
>> 1.1 2.1 2.2 1.2 -> A = y+1; B = x
>> 2.1 1.1 1.2 2.2 -> A = y+1; B = x
>> 2.1 1.1 2.2 1.2 -> A = y+1; B = x
>> 2.1 2.2 1.1 1.2 -> A = y+1; B = y+1
>> 
>> Now assume we repeat the copy for thread 1. Thus, this could happen:
>> 
>> 1: B = A;
>> 2: A = B + 1;
>> 1: B = A; // repeated copy
>> 
>> And if it happens in this sequence, we get result:
>> `A = x+1; B = x+1`
>> But this result was not observable before.
>> 
>> That makes me wonder if repeating the copy is really allowed in the Java memory model?
>
>> We have discussed the possibility of threads seeing different values, as in the above example by @eme64. We think this is ok because there are no guarantees of inter-thread visibility for memory segments. Visibility has to be provided externally (e.g., using volatile/CAS operations). There are other cases where we are susceptible to similar problems (e.g., when doing unaligned long access). In short, segments do not fulfill all the aspects of the normal Java memory model (like for arrays).
> 
> Hmm, I see. Is this documented in the `MemorySegment` API? What are all the bad things that can happen?
> - Tearing of unaligned access - can it also tear if the user has ensured alignment?
> - Repeated instructions (like the repeated copy I pointed out above).
> 
> It is of course a little surprising that you lose the Java memory model guarantees of instruction ordering if you wrap an array in a `MemorySegment`.

> 
> This is unavoidable. Consider the case where we have a wrapped `long[]` in a segment, and then we get a new `var s2 = segment.asSlice(1)` on which we operate with long semantics...

Absolutely, we cannot avoid tearing. At least not for unaligned access. But one might hope that aligned accesses would not tear.
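
To make the `asSlice(1)` case concrete, here is a minimal sketch, assuming JDK 22 or later where `java.lang.foreign` is final. The class and method names are made up for illustration. It only shows that the aligned layout is rejected on the shifted slice while the unaligned layout is accepted; it says nothing about whether a concurrent reader could observe a torn value.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration class, not part of the PR.
public class MisalignedSliceSketch {

    // Returns true if an aligned long read on the shifted slice is rejected.
    static boolean alignedGetThrows() {
        MemorySegment segment = MemorySegment.ofArray(new long[2]);
        MemorySegment s2 = segment.asSlice(1); // starts 1 byte into the long[]
        try {
            s2.get(ValueLayout.JAVA_LONG, 0);  // layout requires 8-byte alignment
            return false;
        } catch (IllegalArgumentException e) {
            return true;                       // alignment constraint violated
        }
    }

    // Writes and reads back a long that straddles two array elements.
    static long unalignedRoundTrip(long value) {
        MemorySegment segment = MemorySegment.ofArray(new long[2]);
        MemorySegment s2 = segment.asSlice(1);
        s2.set(ValueLayout.JAVA_LONG_UNALIGNED, 0, value);
        return s2.get(ValueLayout.JAVA_LONG_UNALIGNED, 0);
    }
}
```

The unaligned access spans bytes of both `long` elements of the backing array, which is why no single-element atomicity guarantee of the array can carry over to it.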

I don't know enough about the details here, but it seems to me that repeating instructions is a slightly different category. It does not seem unavoidable. But maybe we are willing to accept it too, in exchange for some more performance.
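
The interleaving argument quoted above can be checked mechanically. The following is only a sketch under the same simplifying assumption as the quoted analysis: each load and store is one atomic step and executions are plain interleavings that preserve program order. The class name and the concrete values `x = 10`, `y = 20` are arbitrary picks for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class, not part of the PR.
public class RepeatedCopyInterleavings {
    static final int X = 10, Y = 20; // initial A = x, B = y

    // Collects all final (A, B) states when thread 1 performs `B = A`
    // `copies` times and thread 2 performs `A = B + 1` once.
    static Set<String> outcomes(int copies) {
        Set<String> results = new TreeSet<>();
        explore(0, 2 * copies, 0, X, Y, 0, 0, results);
        return results;
    }

    // i1/i2: next step of thread 1/2; r1/r2: the threads' local registers.
    static void explore(int i1, int n1, int i2,
                        int a, int b, int r1, int r2, Set<String> out) {
        if (i1 == n1 && i2 == 2) {
            out.add("A=" + a + ",B=" + b);
            return;
        }
        if (i1 < n1) {
            if (i1 % 2 == 0) {
                explore(i1 + 1, n1, i2, a, b, a, r2, out);   // thread 1 loads A
            } else {
                explore(i1 + 1, n1, i2, a, r1, r1, r2, out); // thread 1 stores B
            }
        }
        if (i2 < 2) {
            if (i2 == 0) {
                explore(i1, n1, i2 + 1, a, b, r1, b, out);      // thread 2 loads B
            } else {
                explore(i1, n1, i2 + 1, r2 + 1, b, r1, r2, out); // thread 2 stores A
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("single copy:   " + outcomes(1));
        System.out.println("repeated copy: " + outcomes(2));
    }
}
```

With a single copy, the state `A=11,B=11` (that is, `A = x+1; B = x+1`) never appears; with the copy repeated it does, which is the extra observable outcome in question.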

I suppose in the end we can do whatever we want, as long as we ensure the user can understand what the guarantees are ;)

I quickly scanned the `MemorySegment` documentation, and I could not see anything about the memory model or visibility guarantees. As a user, I would now assume that this means the Java memory model applies, at least as long as I stay away from unaligned accesses.

Well, there is a note for `JAVA_DOUBLE_UNALIGNED`, for example:
`Care should be taken when using unaligned value layouts as they may induce performance and portability issues.`
Why not just mention tearing directly?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26672#issuecomment-3184191184