RFR: 8365017: The SegmentBulkOperations::copy method can be improved using overlaps [v7]
Emanuel Peter
epeter at openjdk.org
Wed Aug 13 12:32:23 UTC 2025
On Wed, 13 Aug 2025 10:08:33 GMT, Per Minborg <pminborg at openjdk.org> wrote:
>> This PR proposes to use overlapping memory areas in `SegmentBulkOperations::copy`, similar to what is proposed for `SegmentBulkOperations::fill` in https://github.com/openjdk/jdk/pull/25383.
>>
>> This PR passes `tier1`, `tier2`, and `tier3`testing on multiple platforms.
>
> Per Minborg has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision:
>
> - Revert comment
> - Revert change in AMSI
> - Back paddle on the branchless overlap method
> - Merge branch 'master' into copy-overlap
> - Fix comment
> - Simplify
> - Update copyright year
> - Add a branchless overlaps method
> - Optimize copy for certain lengths
> - Add methods to ScopedMemoryAccess
> - ... and 2 more: https://git.openjdk.org/jdk/compare/b0f898b5...1a1606e2
If I see this right, it could be that we duplicate the copy on a byte. If we only copy 2 bytes, then the copy on the second byte is repeated:
src[0] -> dst[0]
src[1] -> dst[1]
src[1] -> dst[1]
But is that ok?
Assume we have threads 1 and 2, both access memory locations A and B:
// Initial state
A = x;
B = y;
// Concurrently:
1: B = A;
2: A = B + 1;
What states could be observed at the end? We could look at all permutations of the 4 operations from threads 1 and 2. Let's call the operations 1.1 (1 loads A), 1.2 (1 stores B), 2.1 (2 loads B), 2.2 (2 stores A).
These are the 4! / 4 = 6 possible permutations, together with the results for A and B:
1.1 1.2 2.1 2.2 -> A = x+1; B = x
1.1 2.1 1.2 2.2 -> A = y+1; B = x
1.1 2.1 2.2 1.2 -> A = y+1; B = x
2.1 1.1 1.2 2.2 -> A = y+1; B = x
2.1 1.1 2.2 1.2 -> A = y+1; B = x
2.1 2.2 1.1 1.2 -> A = y+1; B = y+1
Now assume we repeat the copy for thread 1. Thus, this could happen:
1: B = A;
2: A = B + 1;
1: B = A; // repeated copy
And if it happens in this sequence, we get result:
`A = x+1; B = x+1`
But this result was not observable before.
That makes me wonder if repeating the copy is really allowed in the Java memory model?
src/java.base/share/classes/jdk/internal/foreign/SegmentBulkOperations.java line 156:
> 154: // `putByte()` below is enough as 3 is the maximum number of bytes covered
> 155: final byte b = SCOPED_MEMORY_ACCESS.getByte(srcSession, src.unsafeGetBase(), src.unsafeGetOffset() + srcOffset + len - Byte.BYTES);
> 156: SCOPED_MEMORY_ACCESS.putByte(dstSession, dst.unsafeGetBase(), dst.unsafeGetOffset() + dstOffset + len - Byte.BYTES, b);
Are these duplications of operations really ok? Do you have a good argument for that?
-------------
PR Review: https://git.openjdk.org/jdk/pull/26672#pullrequestreview-3115657763
PR Review Comment: https://git.openjdk.org/jdk/pull/26672#discussion_r2273228594
More information about the core-libs-dev
mailing list