RFR: JDK-8255885: Metaspace: freelist commit counter is not updated when purging

Coleen Phillimore coleenp at openjdk.java.net
Thu Nov 19 19:35:10 UTC 2020


On Wed, 4 Nov 2020 14:32:44 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Metaspace chunks are uncommitted, under certain conditions, when being returned to the Metaspace by a dying class loader.
> 
> There are two places where this happens:
> 1) when individual chunks are returned (see ChunkManager::return_chunk_locked())
> 2) later, when CLDG::purge()->Metaspace::purge()->ChunkManager::purge() is called - all free chunks are again checked and potentially uncommitted.
> 
> Free chunks are kept in FreeChunkList. FreeChunkList has a counter for words-comitted. That counter gets updated when chunks are added and removed. However, path (2) does not change the list, it just uncommits chunks residing in that list. That passes under the FreeChunkList's radar and now the counter is off.
> 
> -----
> 
> Review note: `ChunkManager` keeps free chunks. It has an instance of `FreeChunkListVector`, which has a series (one per chunk size) of `FreeChunkList` instances. The commit counter value is handed up from `FreeChunkList` to, eventually, `ChunkManager`.
> 
> What changed:
> 
> - I removed the counter (_committed_word_size) from FreeChunkList. This seemed the simplest and safest measure. Keeping this counter up-to-date is a hassle, since chunks may get committed and uncommitted while being in this list.
> 
> - Now we calculate the number of committed words on the fly if asked, by iterating all chunks in the list and accumulating. To reflect this change, I change the name of "FreeChunkList::committed_word_size()" to "FreeChunkList::calc_committed_word_size()". In metaspace, anything named "calc_" is potentially expensive.
> 
> - Same renamings in FreeChunkListVector and ChunkManager::total_committed_word_size() - to ChunkManager::calc_committed_word_size()
> 
> Now these functions always deliver the correct result. However, they are potentially slower and now require the central metaspace lock. That is not a problem: the only user of this function is when collecting the statistics for the "VM.metaspace" jcmd report, and that happens under lock protection and walks all kind of stuff anyway, so walking the freelists won't hurt.
> 
> Just to be sure, I removed the use of these functions from ChunkManager::print_on(), even though that was just a debug function.
> 
> -----
> 
> Tests:
> 
> This patch has been tested in our nightlies for a week now.

This seems good and more accurate to recalculate the size committed in each chunk by iterating through the chunks.  For path 2 in your description, for Metaspace::purge() I thought that chunks are returned to the ChunkFreeList from the ChunkList on each metaspace.  So they should be accounted for.  Or is the issue that committed_words within each chunk varies and remove() only captures a snapshot in time? At any rate, it's good to not have this in the logging and printing if it's going to be expensive but accurate.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/1057


More information about the hotspot-runtime-dev mailing list