RFR: 8349146: [REDO] Implement a better allocator for downcalls [v8]
Shaojin Wen
swen at openjdk.org
Fri May 2 11:33:48 UTC 2025
On Fri, 2 May 2025 11:13:27 GMT, Per Minborg <pminborg at openjdk.org> wrote:
>> This PR is based on the work of @mernst-github and aims to implement an _internal_ thread-local 'stack' allocator, which works like a dynamically sized arena, but with reset functionality to reset the allocated size back to a certain level. The underlying memory could stay around between calls, which could improve performance.
>>
>> Re-allocated segments are not zeroed between allocations.
>
> Per Minborg has updated the pull request incrementally with one additional commit since the last revision:
>
> Cleanup and only create CleanupAction once per thread
src/java.base/share/classes/jdk/internal/foreign/BufferStack.java line 194:
> 192: }
> 193: }
> 194: }
Suggestion:
private record PerThread(ReentrantLock lock,
Arena arena,
SlicingAllocator stack,
CleanupAction cleanupAction) {
@SuppressWarnings("restricted")
@ForceInline
public Arena pushFrame(long size, long byteAlignment) {
boolean needsLock = Thread.currentThread().isVirtual() && !lock.isHeldByCurrentThread();
if (needsLock && !lock.tryLock()) {
// Rare: another virtual thread on the same carrier competed for acquisition.
return Arena.ofConfined();
}
if (!stack.canAllocate(size, byteAlignment)) {
if (needsLock) lock.unlock();
return Arena.ofConfined();
}
long parentOffset = stack.currentOffset();
final MemorySegment frameSegment = stack.allocate(size, byteAlignment);
long topOfStack = stack.currentOffset();
Arena confinedArena = Arena.ofConfined();
// return new Frame(needsLock, size, byte
// The cleanup action will keep the original automatic `arena` (from which
// the reusable segment is first allocated) alive even if this Frame
// becomes unreachable but there are reachable segments still alive.
return new Frame(this, needsLock, parentOffset, topOfStack, confinedArena, new SlicingAllocator(frameSegment.reinterpret(confinedArena, cleanupAction)));
}
static PerThread of(long byteSize, long byteAlignment) {
final Arena arena = Arena.ofAuto();
return new PerThread(new ReentrantLock(),
arena,
new SlicingAllocator(arena.allocate(byteSize, byteAlignment)),
new CleanupAction(arena));
}
private record CleanupAction(Arena arena) implements Consumer<MemorySegment> {
@Override
public void accept(MemorySegment memorySegment) {
Reference.reachabilityFence(arena);
}
}
}
private record Frame(PerThread thead, boolean locked, long parentOffset, long topOfStack, Arena confinedArena, SegmentAllocator frame) implements Arena {
@ForceInline
private void assertOrder() {
if (topOfStack != thead.stack.currentOffset())
throw new IllegalStateException("Out of order access: frame not top-of-stack");
}
@ForceInline
@Override
@SuppressWarnings("restricted")
public MemorySegment allocate(long byteSize, long byteAlignment) {
// Make sure we are on the right thread and not closed
MemorySessionImpl.toMemorySession(confinedArena).checkValidState();
return frame.allocate(byteSize, byteAlignment);
}
@ForceInline
@Override
public MemorySegment.Scope scope() {
return confinedArena.scope();
}
@ForceInline
@Override
public void close() {
assertOrder();
// the Arena::close method is called "early" as it checks thread
// confinement and crucially before any mutation of the internal
// state takes place.
confinedArena.close();
thead.stack.resetTo(parentOffset);
if (locked) {
thead.lock.unlock();
}
}
}
Same as the suggestion above, you changed the code. I updated and remade the suggestion and changed it to two records. What do you think?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24829#discussion_r2071481137
More information about the core-libs-dev
mailing list