RFR: 8315559: Delay TempSymbol cleanup to avoid symbol table churn [v4]
Kim Barrett
kbarrett at openjdk.org
Wed Nov 1 15:27:07 UTC 2023
On Wed, 1 Nov 2023 12:09:05 GMT, Oli Gillespie <ogillespie at openjdk.org> wrote:
>> If you have to add more code to wrap NonblockingQueue, please implement it in a .cpp file. I thought NBQ was sufficient for this. Maybe we want some other data structure for this.
>
> I have attempted to avoid locks if at all possible. Re ABA problems, my reading of the push/pop concern is that the problem occurs if a caller pops a node and then re-uses that same node later by re-pushing it. Something like this:
>
> Queue contains Node1(val=a, next=null).
> 1. Thread 1 (start push) pushes Node2(val=b, next=null)
> 2. Thread 2 pops Node1(val=a, next=null)
> 3. Thread 2 sees that Node1.next is null and reuses the container, making Node1(val=c, next=null)
> 4. Thread 2 (start push) starts to push Node1(val=c, next=null)
> 5. Thread 1 (finish push) sets Node1.next=Node2
> 6. Thread 2 (finish push) sets Node2.next=Node1 <-- circular list
>
> Since my change doesn't reuse nodes, I believe it's safe.
Coleen -
I think NBQ is a reasonable choice for use here. But it's not a complete
solution on its own. It imposes documented requirements on clients. I don't
think we have a different data structure for this purpose (thread-safe FIFO
without locks), so any alternative would need to be invented, and would be
solving the same problem as NBQ and the surrounding client-provided code.
Oliver -
The current usage is not safe. The reuse can occur through the allocator. For
example, one thread starts a pop. Another thread steals that pop, then deletes
the object. Later, an allocation gets a new node at the same address as the
deleted node. That newly allocated node makes its way through the queue to
eventually become visible to that first thread's still in-progress pop. (So
this is an SMR bug. You generally can't delete an object while some other
thread might be looking at it.)
GlobalCounter is not a locking mechanism. It is an RCU-style synchronization
mechanism, so related to but different from RWLocks. In particular, readers
(threads in a critical section) never block due to this mechanism - only
write_synchronize blocks.
A problem with using GlobalCounter in that simplistic way is that once the
queue is "full", the one-in-one-out policy is going to have every allocation
hit GlobalCounter::write_synchronize (a potentially somewhat expensive
operation, since it needs to iterate over all threads), at least until the
queue is bulk drained. Switching over to a one-in-N-out policy could ameliate
that by batching the synchronizes over several nodes, and also remove the need
for complete bulk draining. Have min/max queue size and switching between
insert-only and one-in-N-out policies depending on the current size seems like
a possible solution.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16398#discussion_r1378947580
More information about the hotspot-dev
mailing list