RFR: 8251158: Implementation of JEP 387: Elastic Metaspace

Thomas Stüfe thomas.stuefe at gmail.com
Mon Aug 24 06:11:48 UTC 2020


Hi Albert,

On Mon, Aug 24, 2020 at 12:06 AM Albert Yang <albert.m.yang at oracle.com>
wrote:

> Hi Thomas,
>
>  > error handling is just simpler and cleaner if ChunkManager::get_chunk()
> never hands out
> a fully uncommitted chunk in the first place
>
> I see. Then, when the current chunk is salvaged, it must have a positive
> used-word count, as
> the first request will always be satisfied, right? I added `assert(false)`
> in
> `MetaspaceArena::salvage_chunk()`, running `make test
> TEST='gtest:metaspace
> test/hotspot/jtreg/gtest/MetaspaceGtests.java
> test/hotspot/jtreg/runtime/Metaspace/elastic'` doesn't trigger the
> assertion. Is it truly
> unreachable or am I missing some tests?
>
> ```
>    if (c->used_words() == 0) {
>      assert(false, ""); // newly added
>      UL2(trace, "salvage: returning empty chunk " METACHUNK_FORMAT ".",
> METACHUNK_FORMAT_ARGS(c));
>      _chunk_manager->return_chunk(c);
>      return;
>    }
> ```
>

Good catch. I removed this piece of code last week, but that change has not
yet made its way into a new webrev.

summary:     Remove a dead portion of code
http://hg.openjdk.java.net/jdk/sandbox/rev/074a374e9c5c

It was a leftover from the time when ChunkManager::get_chunk() could return
uncommitted chunks; the arena would then commit the chunk, fail, and retire
it. As I said, the way it's done now is just simpler.
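
To make that concrete, here is a minimal toy model (not HotSpot code; all
types, sizes and signatures are invented for illustration) of why the
empty-chunk branch in salvage_chunk() became unreachable once get_chunk()
stopped handing out fully uncommitted chunks:

```c++
// Toy model, not HotSpot code: names, sizes and signatures are invented to
// illustrate the control flow described above.
#include <cassert>
#include <cstddef>
#include <cstdio>

struct Chunk {
  size_t committed_words;   // words backed by committed memory
  size_t used_words;        // words already handed out by the arena
};

struct ChunkManager {
  // Hands out chunks whose payload is already committed far enough for the
  // pending allocation. If committing fails (commit quota reached), it fails
  // here, so the arena never owns a fully uncommitted chunk.
  Chunk* get_chunk(size_t min_committed_words, bool commit_ok) {
    if (!commit_ok) {
      return nullptr;       // quota hit: fail before handing anything out
    }
    return new Chunk{ min_committed_words, 0 };
  }
  void return_chunk(Chunk* c) { delete c; }
};

struct Arena {
  ChunkManager* cm;
  Chunk* current;

  bool allocate(size_t words) {
    if (current == nullptr) {
      current = cm->get_chunk(words, /*commit_ok=*/true);
      if (current == nullptr) {
        return false;       // commit failure is handled before we own a chunk
      }
    }
    if (current->used_words + words > current->committed_words) {
      return false;
    }
    current->used_words += words;   // the first allocation from a chunk always fits
    return true;
  }

  // Retire the current chunk. Because a freshly handed-out chunk always
  // satisfies its first allocation, used_words is > 0 by the time we get here,
  // which is why the "empty chunk" branch could be removed.
  void salvage_chunk() {
    if (current == nullptr) return;
    assert(current->used_words > 0 && "empty-chunk branch is dead");
    cm->return_chunk(current);
    current = nullptr;
  }
};

int main() {
  ChunkManager cm;
  Arena arena{ &cm, nullptr };
  arena.allocate(100);
  arena.salvage_chunk();    // never sees an empty chunk
  printf("ok\n");
  return 0;
}
```

Failing inside get_chunk() keeps all commit-failure handling in one place,
which is the simplification referred to above.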


>
> Memory allocation in metaspace involves both virtual (potentially
> enlarging a chunk)


Enlarging the chunk has nothing to do with reservation. Reservation happens
in a layer below that, in VirtualSpaceList/VirtualSpaceNode. When we
enlarge a chunk, or carve out a new chunk, even a root chunk, the space is
already reserved. MetaspaceArena::allocate() may indirectly cause
additional reservation - rarely - if the underlying VSList is expandable
and the allocation causes creation of a new VSNode.
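
For illustration, a rough toy model of that layering (not the HotSpot
classes; the names only stand in for the ones above): memory is reserved
only when a node is created, while chunk work merely commits more of
already reserved space.

```c++
// Toy layering model, not HotSpot code: the point is only that reservation
// happens when a node is created, never when a chunk is enlarged or carved out.
#include <cstddef>
#include <cstdio>
#include <vector>

struct Node {                   // stands in for a VirtualSpaceNode
  size_t reserved  = 8 * 1024;  // reserved up front when the node is created
  size_t committed = 0;         // grown granule-wise, counted against the commit quota
  size_t used      = 0;
};

struct VirtualSpaceList {       // stands in for an expandable (non-class) VSList
  std::vector<Node> nodes;
  size_t total_reserved = 0;

  Node* current_or_new(size_t need) {
    if (nodes.empty() || nodes.back().used + need > nodes.back().reserved) {
      nodes.push_back(Node());  // the only place where reservation grows
      total_reserved += nodes.back().reserved;
    }
    return &nodes.back();
  }
};

// The arena path: enlarging or carving out a chunk commits more of an already
// reserved node; only rarely, when the list must grow, does it reserve anything.
bool arena_allocate(VirtualSpaceList& vsl, size_t words) {
  Node* n = vsl.current_or_new(words);
  if (n->used + words > n->committed) {
    n->committed = n->used + words;  // commit more granules; this may hit the quota
  }
  n->used += words;
  return true;
}

int main() {
  VirtualSpaceList vsl;
  arena_allocate(vsl, 100);   // reserves the first node, commits 100 words
  arena_allocate(vsl, 200);   // no new reservation, only more commit
  printf("nodes=%zu reserved=%zu\n", vsl.nodes.size(), vsl.total_reserved);
  return 0;
}
```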


> and
> physical (potentially committing memory) address space, and either one
> could fail. The
> second step, if failed, needs to undo the first step as well. Such
> tight-coupling is the
> main reason (to me) to not expose this as two steps to the caller. As I
> read it,
> `MetaspaceArena::allocate()` actually doesn't release the virtual memory
> used in the first
> if the second step fails. Not sure if there are any consequences or the
> severity.
>

Yes, but it is quite irrelevant. This can happen in two cases:

- the commit quota hits us when the underlying VSNode has not been fully
committed yet. For the non-class metaspace this means the delta between
committed and reserved memory for a single MetaspaceContext is smaller
than Settings::virtual_space_node_default_word_size(), which is currently 8MB.
For the compressed class space, whose VSList only has one node which is as
large as CompressedClassSpaceSize, this delta can be as large as
CompressedClassSpaceSize-<what we committed from class space>. However, the
latter case means someone specified a smaller MaxMetaspaceSize than
CompressedClassSpaceSize, and we cannot use up the compressed class space
since the commit quota is too small, which is expected behaviour.

- the commit quota hits us when allocating from non-class metaspace and the
allocation just caused us to create a new VSNode, which then remains
completely uncommitted. In that rare case we are left with a delta of
Settings::virtual_space_node_default_word_size(), so 8MB.

Bottom line: for a correctly sized Metaspace, the delta will be at most 8MB,
which in the big scheme of things does not matter at all. Any attempt to get
rid of it would complicate the code for a minuscule gain in process vsize.
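
As a worked example of those bounds (the flag values below are hypothetical;
the 8MB figure mirrors Settings::virtual_space_node_default_word_size()
mentioned above):

```c++
// Worked example of the worst-case reserved-but-uncommitted delta. The flag
// values are hypothetical; the 8MB node size mirrors the
// Settings::virtual_space_node_default_word_size() figure mentioned above.
#include <algorithm>
#include <cstddef>
#include <cstdio>

int main() {
  const size_t M = 1024 * 1024;

  // Non-class metaspace: at worst one freshly created, fully uncommitted node.
  size_t nonclass_delta = 8 * M;

  // Class space: its single node is as large as CompressedClassSpaceSize, so if
  // the commit quota (MaxMetaspaceSize) is smaller, the remainder can never be
  // committed and stays reserved only.
  size_t ccs = 1024 * M;            // hypothetical CompressedClassSpaceSize
  size_t max_metaspace = 128 * M;   // hypothetical (undersized) MaxMetaspaceSize
  size_t class_delta = ccs - std::min(ccs, max_metaspace);

  printf("non-class worst case: %zu MB, class space worst case: %zu MB\n",
         nonclass_delta / M, class_delta / M);
  return 0;
}
```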

I am not sure I follow your "tight coupling" remark. The coupling between the
subsystems is quite loose.


> I wonder if the problem could be simpler if chunks smaller/larger than the
> commit granule
> are dealt with in two different ways. This way when we deal with chunks
> smaller than the
> commit granule, we know it's already committed.
>

I do not understand what the problem is you want to solve.


>
>  > In other words, chunks can be smaller than a commit granule and that
> makes sense since
> for small loaders we want to share a granule among multiple loaders.
>
> It makes sense. Somehow, I got the impression that MetaspaceArena has a 1:1
> mapping to
> ChunkManager, but actually all arenas share the singleton class/non-class
> chunk manager.
> The diagram in
>
> https://cr.openjdk.java.net/~stuefe/JEP-Improve-Metaspace-Allocator/review-guide/review-guide-1.0.html#2-subsystems
> was quite clear, but I think I ignored it after failing to find
> "SpaceManager" in the
> code. (Maybe the diagram is out of sync with the code?)


Yes, you have an old link. The current review guide version was mentioned
in my original RFR mail:

http://cr.openjdk.java.net/~stuefe/jep387/review/2020-08-11/guide/review-guide.html

When I put out a new webrev, it will come with an updated version of the
review guide. The master data for both are kept at github:
https://github.com/tstuefe/jep387/tree/master/review in case
openjdk.java.net is not reachable.

> Thank you for the explanation,
> which, I believe, should be included in the comments for documentation
> purposes. (I didn't
> find the rationale for using a buddy allocator in the code comments.)
>
>
I can try and further beef up the comments.
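
For illustration, here is a minimal buddy-allocation toy (not HotSpot code;
levels, offsets and sizes are invented) sketching the rationale: chunk sizes
are powers of two, a larger chunk is split into two "buddies" to satisfy a
small request, and two free buddies merge back into their parent. The
splitting is what lets several small loaders share the space of a single root
chunk (and hence a commit granule), while the merging keeps free space in
large contiguous blocks.

```c++
// Minimal buddy-allocation toy, not HotSpot code: it only shows the split and
// merge geometry and ignores commit state and everything else the real chunk
// manager tracks.
#include <cstdint>
#include <cstdio>
#include <set>

struct Buddy {
  static const int kMaxLevel = 4;              // level 0 = root chunk, each level halves the size
  std::set<uint32_t> free_at[kMaxLevel + 1];   // free chunk offsets, per level

  Buddy() { free_at[0].insert(0); }            // start with one free root chunk

  // Allocate a chunk of the requested level, splitting larger chunks as needed.
  bool alloc(int level, uint32_t* out) {
    int l = level;
    while (l >= 0 && free_at[l].empty()) l--;  // find a free chunk of this size or larger
    if (l < 0) return false;
    uint32_t off = *free_at[l].begin();
    free_at[l].erase(free_at[l].begin());
    for (; l < level; l++) {                   // split down, keeping the upper halves free
      free_at[l + 1].insert(off + (1u << (kMaxLevel - (l + 1))));
    }
    *out = off;
    return true;
  }

  // Free a chunk; merge with its buddy for as long as the buddy is also free.
  void free_chunk(int level, uint32_t off) {
    while (level > 0) {
      uint32_t size  = 1u << (kMaxLevel - level);
      uint32_t buddy = off ^ size;
      if (free_at[level].erase(buddy) == 0) break;  // buddy is in use: stop merging
      off &= ~size;                                 // merged chunk starts at the lower buddy
      level--;
    }
    free_at[level].insert(off);
  }
};

int main() {
  Buddy b;
  uint32_t a_off = 0, c_off = 0;
  b.alloc(4, &a_off);       // two tiny chunks for two small loaders,
  b.alloc(4, &c_off);       // carved out of the same root chunk
  printf("loader A at %u, loader B at %u\n", a_off, c_off);
  b.free_chunk(4, a_off);
  b.free_chunk(4, c_off);   // both buddies free again: the root chunk re-forms
  return 0;
}
```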


> Thank you once again for the clarification; I have got a better
> understanding of the patch.
>
>
Sure.

Thanks for the review work,

Thomas



> --
> /Albert
>

