RFR: 8331553: Windows JVM leaks Event and Thread handles when multiple threads are used

Daniel Jeliński djelinski at openjdk.org
Wed Jun 19 11:43:09 UTC 2024


On Tue, 18 Jun 2024 21:01:15 GMT, Daniel Jeliński <djelinski at openjdk.org> wrote:

> We use 2 ParkEvent instances per thread. The ParkEvent objects are never freed, but they are recycled when a thread dies, so the number of live ParkEvent instances is proportional to the maximum number of threads that were live at any time.
> 
> On Windows, the ParkEvent object wraps a kernel Event object. Kernel objects are a limited and costly resource. In this PR, I replace the use of kernel events with user-space synchronization.
> 
> The new implementation uses WaitOnAddress and WakeByAddressSingle methods to implement synchronization. The methods are available since Windows 8. We only support Windows 10 and newer, so OS support should not be a problem.
> 
> WaitOnAddress was observed to return spuriously, so I added the necessary code to recalculate the timeout and continue waiting.
> 
> Tier1-5 tests passed. Performance tests were... inconclusive. For example, `ThreadOnSpinWaitProducerConsumer` reported 30% better results, while `LockUnlock.testContendedLock` results were 50% worse. 
> 
> Thoughts?

As you found out already, the WaitOnAddress implementation is based on a hash table, so access will be slower when many threads are waiting at the same time. The hash table is stored in user space (in the PEB), and the implementation reportedly doesn't require any kernel resources.

I'm not sure what to do with the remaining reference to `WaitForSingleObject`; it explains why we decompose the timeout by pointing to `EventWait`, which no longer exists as far as I could tell. I'll search the history some more before I decide what to do with that comment.
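
For reference, the "recalculate the timeout and continue waiting" logic from the PR description can be sketched roughly as below. This is a minimal, portable illustration, not the code from the PR: `park_with_timeout`, `WaitOnce` and the `signaled` flag are hypothetical names, and the injected `wait_once` callback stands in for a primitive such as WaitOnAddress that may return before the timeout has elapsed.

```cpp
#include <atomic>
#include <chrono>
#include <functional>

using Clock  = std::chrono::steady_clock;
using Millis = std::chrono::milliseconds;

// Hypothetical stand-in for a blocking primitive such as WaitOnAddress:
// it blocks for at most `timeout`, but is allowed to return earlier
// (a spurious wakeup) without the flag having changed.
using WaitOnce = std::function<void(Millis timeout)>;

// Waits until `signaled` becomes 1 or `timeout` elapses.
// Returns true if the signal was observed (and consumes it), false on timeout.
bool park_with_timeout(std::atomic<int>& signaled, Millis timeout,
                       const WaitOnce& wait_once) {
    // Fix the deadline once, so repeated wakeups cannot extend the total wait.
    const Clock::time_point deadline = Clock::now() + timeout;
    for (;;) {
        // Consume the signal if it is set.
        int expected = 1;
        if (signaled.compare_exchange_strong(expected, 0)) {
            return true;
        }
        const Clock::time_point now = Clock::now();
        if (now >= deadline) {
            return false;  // timed out without seeing the signal
        }
        // Recalculate the remaining time after every wakeup. Round a
        // sub-millisecond remainder up to 1 ms to avoid a zero-length wait.
        Millis remaining = std::chrono::duration_cast<Millis>(deadline - now);
        if (remaining.count() == 0) {
            remaining = Millis(1);
        }
        wait_once(remaining);
    }
}
```

The point of the fixed deadline is that a spurious return only costs another loop iteration; the total time spent waiting stays bounded by the original timeout.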

I don't think we care about pre-Win8 OSes at this point. JDK 11 was the last version to support Windows 7, and I guess we won't backport this change there.

The patch replaces the underlying mechanics of ObjectMonitor. The ParkEvent is only used when a thread is blocked on a monitor, so the number of concurrent waits will never be greater than the total number of running threads.

I'll check if I can run Renaissance philosophers. Could you explain how UseHeavyMonitors changes the benchmark?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19778#issuecomment-2178483869

