[aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation

Thomas Stüfe thomas.stuefe at gmail.com
Fri May 8 07:33:35 UTC 2020


Hi Ioi,

thanks for taking the time to look at this!

On Fri, May 8, 2020 at 7:37 AM Ioi Lam <ioi.lam at oracle.com> wrote:

> Hi Thomas,
>
> I am running your patch in out CI pipeline now. Some comments:
>
> [1] Do we still need to use _class_space_list? It looks like we just have
> a single range. Is _class_space_list->current_virtual_space() always the
> same VS?
>
>   if (_class_space_list != NULL) {
>     address base =
> (address)_class_space_list->current_virtual_space()->bottom();
>     address top = base + compressed_class_space_size();
>     st->print("Compressed class space mapped at: " PTR_FORMAT "-"
> PTR_FORMAT ", size: " SIZE_FORMAT,
>                p2i(base), p2i(top), top - base);
>
>
Yes it is. Has been bugging me for years too. With ccs, the
VirtualSpaceList is degenerated and only contains one node. It still makes
some sense code wise. But maybe VirtualSpace*List* is a misnomer.

It also ties in into the question whether Metaspace reservations should be
able to grow on demand. Oracle folks I talk to tend to say yes. I
personally think the "Metaspace is infinite don't worry" meme was broken
since ccs was a thing, since CompressedClassSpaceSize is an absolute limit
already.

I may tweak and simplify the code a bit with Elastic Metaspace. Maybe, as
you say, with ccs we can get rid of VirtualSpaceList and deal with the
range directly. Have to check.

[2] Does JDK-8243535 exist with the current jdk/jdk repo?
>
>   // This does currently not work because rs may be the result of a split
> operation
>   // and NMT seems not to be able to handle splits.
>   // See JDK-8243535.
>   // MemTracker::record_virtual_memory_type((address)rs.base(), mtClass);
>
>
Yes. It shows up more rarely. With this patch, it will show up always. I
had a conversation with Zhengyu about this. I think there may be
workarounds, like removing the original mapping in NMT and replacing it
with the two new ones for archive and ccs. But it seems iffy, I'd rather
have this fixed in NMT.

I also think it is not such a big deal, but if you think otherwise, I can
integrate a workaround into this patch.


>
> [3] I think this can be put in header file.
>
> bool Metaspace::class_space_is_initialized() {
>   return _class_space_list != NULL;
> }
>
>
Will do.


> [4] Why does the CCS need to follow UseLargePagesInMetaspace? This is the
> reason for the
>     gap in the following chart:
>
>     // We do this by reserving space for the ccs behind the archives. Note
> however that
>     //  ccs follows a different alignment (Metaspace::reserve_alignment),
> so there may
>     //  be a gap between ccs and cds.
>     // We use a similar layout at runtime, see
> reserve_address_space_for_archives().
>     //
>     //                              +-- SharedBaseAddress (default =
> 0x800000000)
>     //                              v
>     // +-..---------+---------+ ...
> +----+----+----+--------+--------------------+
>     // |    Heap    | Archive |     | MC | RW | RO | [gap]  |    class
> space     |
>     // +-..---------+---------+ ...
> +----+----+----+--------+--------------------+
>     // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>
>
No, nothing to do with large pages. CCS does not have large pages.

The gap can result from different alignment rules between metaspace and
cds. With Elastic Metaspace, metaspace reservations will be aligned to 4M.
So, ccs will have to start at a 4M aligned border.


>
> _reserve_alignment is determined here:
>
> void Metaspace::ergo_initialize() {
>   if (DumpSharedSpaces) {
>     // Using large pages when dumping the shared archive is currently not
> implemented.
>     FLAG_SET_ERGO(UseLargePagesInMetaspace, false);
>   }
>
>   size_t page_size = os::vm_page_size();
>   if (UseLargePages && UseLargePagesInMetaspace) {
>     page_size = os::large_page_size();
>   }
>
>   _commit_alignment  = page_size;
>   _reserve_alignment = MAX2(page_size,
> (size_t)os::vm_allocation_granularity());
>
> But when CDS is enabled, the RS is reserved without large pages, so it
> should be possible to incrementally commit the CCS using just
> os::vm_allocation_granularity().
>
>
Absolutely. CCS does not use large pages for that reason. As I wrote, this
is about the new reservation granularity of 4M.


> [5] I see the aarch64 code in
> Metaspace::reserve_address_space_for_compressed_classes has been changed.
> Is this necessary for this RFE, or is it a separate improvement that can be
> done in a different RFE?
>

Well, it had to be changed somewhat, since the prototype differs. I swapped
Metaspace::reserve_preferred_space() for
Metaspace::reserve_address_space_for_compressed_classes(), which follows a
somewhat different protocol.

That aside, the functionality for AARCH64 should be as unchanged as
possible. I removed the AIX specific coding, since it was subtly wrong, but
that's fine since we are AIX/ppc maintainers.


> [6] For AARCH64, should we skip the part that reserves it right above the
> heap? Or does AARCH64 always allocate the heap such that it's a preferable
> address?
>
>
Yes and yes :)

We do skip that for aarch64 (at least if the address is wrong):

1157     // Try right above the Java heap...
1158     address base = align_up(CompressedOops::end(),
Metaspace::reserve_alignment());
1159     assert(base != NULL, "Sanity");
1160
1161     const size_t size = align_up(CompressedClassSpaceSize,
Metaspace::reserve_alignment());
1162     if (CompressedKlassPointers::is_valid_base(base)) {
1163       rs = ReservedSpace(size, Metaspace::reserve_alignment(),
false /* large */, (char*)base);
1164     }

At 1162, we check if the attach address would be valid for this platform.
If aarch64 says no, we won't reserve there but go straight to the "attach
wherever you think is good" part. Which is implemented
by Metaspace::reserve_address_space_for_compressed_classes(), which on
aarch64 does search a suitable address.



>     // case (b)
>     ReservedSpace rs;
>
>     // Try right above the Java heap...
>     address base = align_up(CompressedOops::end(),
> Metaspace::reserve_alignment());
>     assert(base != NULL, "Sanity");
>
>     const size_t size = align_up(CompressedClassSpaceSize,
> Metaspace::reserve_alignment());
>     if (CompressedKlassPointers::is_valid_base(base)) {
>       rs = ReservedSpace(size, Metaspace::reserve_alignment(), false /*
> large */, (char*)base);
>     }
>
>     // ...failing that, reserve anywhere, but let platform do optimized
> placement:
>     if (!rs.is_reserved()) {
>       rs = Metaspace::reserve_address_space_for_compressed_classes(size);
>     }
>
> [7] argument.cpp checks this:
>
>   if (UseSharedSpaces || DumpSharedSpaces) {
>     if
> (!CompressedKlassPointers::is_valid_base((address)SharedBaseAddress)) {
>       log_warning(cds)("SharedBaseAddress=" PTR_FORMAT " is invalid for
> this platform, option will be ignored.",
>                        p2i((address)SharedBaseAddress));
>       FLAG_SET_DEFAULT(SharedBaseAddress, default_SharedBaseAddress());
>     }
>   }
>
> but initialize_dumptime_shared_and_meta_spaces aligns up the
> SharedBaseAddress before doing the assert.
>
>   void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() {
>     assert(DumpSharedSpaces, "should be called for dump time only");
>
>     const size_t reserve_alignment =
> MetaspaceShared::reserved_space_alignment();
>     char* shared_base = (char*)align_up((char*)SharedBaseAddress,
> reserve_alignment);
>
>   #ifdef _LP64
>     assert(CompressedKlassPointers::is_valid_base((address)shared_base),
> "Sanity");
>
> So theoretically shared_base may no longer be is_valid_base after the
> align_up. It probably won't happen, but I think the code will be much
> clearer tofirst to align_up, then check for is_valid_base and reset to
> default, and finally assert.
>
>
Okay, I'll change that.


> [8] In MetaspaceShared::initialize_dumptime_shared_and_meta_spaces, the
> following block can be simplified:
>
>   if (!_shared_rs.is_reserved()) {
>     // Get a reserved space anywhere if attaching at the SharedBaseAddress
> fails:
>     if (UseCompressedClassPointers) {
>       // If we need a compressed class space too, let the platform handle
> the reservation.
>       _shared_rs =
> Metaspace::reserve_address_space_for_compressed_classes(cds_total);
>     } else {
>       // anywhere is fine.
>       _shared_rs = ReservedSpace(cds_total, reserve_alignment, false /*
> large */, (char*)NULL);
>     }
>   }
>
>    ... if you change the declaration to:
>
>   static ReservedSpace reserve_address_space_for_compressed_classes(size_t
> size) NOT_LP64({ return ReservedSpace();});
>
> [9] I think the #ifdef _LP64 is not necessary:
>
> #ifdef _LP64
>     // Some sanity checks after reserving address spaces for archives and
> class space.
>     assert(archive_space_rs.is_reserved(), "Sanity");
>     if (Metaspace::using_class_space()) {
>       // Class space must closely follow the archive space. Both spaces
> must be aligned correctly.
>       assert(class_space_rs.is_reserved(), "A class space should have been
> reserved");
>       assert(class_space_rs.base() >= archive_space_rs.end(), "class space
> should follow the cds archive space");
>       assert(is_aligned(archive_space_rs.base(),
> MetaspaceShared::reserved_space_alignment()), "Archive space misaligned");
>       assert(is_aligned(class_space_rs.base(),
> Metaspace::reserve_alignment()), "class space misaligned");
>     }
> #endif
>
>
I'll take a look.


> The rest of the code looks OK to me, but I may take a look at it again
> after getting more sleep :-)
>
>
Sure. Thanks man! :)

..Thomas

Thanks
> - Ioi
>
>
>
>
> On 5/7/20 7:21 AM, Thomas Stüfe wrote:
>
> Hi all,
>
> please take a look at the third iteration of this change:
>
>
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.02/webrev/
>
> Changes in this version:
>
> - at the request of Coleen (sorry I did not see your post earlier) I
> removed all platform dependent files and put the platform dependent coding
> back to metaspace.cpp and compressedOops.cpp to ease the review pain. I
> used plain platform defines though (#ifdef AARCH64) instead of hiding them
> behind another macro to make things clearer. Note that I still intent to
> put this code away into the platform corners but will do so in a follow up
> RFE.
>
> - I reinstated, in a fashion, the special handling of reservations on
> Windows. On all platforms we reserve address space to map the archive files
> in with a subsequent mapping operation. However, on Windows, we cannot use
> MapViewOfFile() into an existing mapping. So I remove the mapping again
> before mapping the archives - see comment in code for details.
>
> All CI tests at SAP run through without problems, including on Win64 and
> aarch64, but I would be happy if others were to run test too.
>
> Thank you, Thomas
>
> On Tue, Apr 28, 2020 at 4:54 PM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
>> Hi all,
>>
>> Could I have reviews for the following proposal of reworking cds/class
>> space reservation?
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8243392
>>
>> Webrev:
>> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/
>>
>> (Many thanks to Ioi Lam for so patiently explaining CDS internals to me,
>> and to Andrew Haley and Nick Gasson for help with aarch64!)
>>
>> Reservation of the compressed class space is needlessly complicated and
>> has some minor issues. It can be simplified and made clearer.
>>
>> The complexity stems from the fact that this area lives at the
>> intersection of two to three sub systems, depending on how one counts.
>> Metaspace, CDS, and the platform which may or may not its own view of how
>> to reserve class space. And this code has been growing organically over
>> time.
>>
>> One small example:
>>
>> ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t
>> alignment,
>>                                                  bool large_pages, char
>> *requested_addr,
>>                                                  bool use_requested_addr)
>>
>> which I spent hours decoding, resulting in a very confused mail to
>> hs-runtime and aarch64-port-dev [2].
>>
>> This patch attempts to simplify cds and metaspace setup a bit; to comment
>> implicit knowledge which is not immediately clear; to cleanly abstract
>> platform concerns like optimized class space placement; and to disentangle
>> cds from metaspace to solve issues which may bite us later with Elastic
>> Metaspace [4].
>>
>> ---
>>
>> The main change is the reworked reservation mechanism. This is based on
>> Ioi's proposal [5].
>>
>> When reserving class space, three things must happen:
>>
>> 1) reservation of the space obviously. If cds is active that space must
>> be in the vicinity of cds archives to be covered by compressed class
>> pointer encoding.
>> 2) setting up the internal Metaspace structures atop of that space
>> 3) setting up compressed class pointer encoding.
>>
>> In its current form, Metaspace may or may not do some or all of that in
>> one function
>> (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
>> metaspace_rs, char* requested_addr, address cds_base);) - if cds is active,
>> it will reserve the space for Metaspace and hand it in, otherwise it will
>> create it itself.
>>
>> When discussing this in [2], Ioi proposed to move the reservation of the
>> class space completely out of Metaspace and make it a responsibility of the
>> caller always. This would reduce some complexity, and this patch follows
>> the proposal.
>>
>> I removed
>> Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
>> metaspace_rs, char* requested_addr, address cds_base); and all its sub
>> functions.
>>
>> (1) now has to always be done outside - a ReservedSpace for class space
>> has to be provided by the caller. However, Metaspace now offers a utility
>> function for reserving space at a "nice" location, and explicitly doing
>> nothing else:
>>
>> ReservedSpace
>> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>>
>> this function can be redefined on a platform level for platform optimized
>> reservation, see below for details.
>>
>> (2) is taken care of by a new function,
>> Metaspace::initialize_class_space(ReservedSpace rs)
>>
>> (3) is taken care of a new function
>> CompressedKlassPointers::initialize(), see below for details.
>>
>>
>> So, class space now is set up three explicit steps:
>>
>> - First, reserve a suitable space by however means you want. For
>> convenience you may use
>> Metaspace::reserve_address_space_for_compressed_classes(), or you may roll
>> your own reservation.
>> - Next, tell Metaspace to use that range as backing storage for class
>> space: Metaspace::initialize_class_space(ReservedSpace rs)
>> - Finally, set up encoding. Encoding is independent from the concept of a
>> ReservedSpace, it just gets an address range, see below for details.
>>
>> Separating these steps and moving them out of the responsibility of
>> Metaspace makes this whole thing more flexible; it also removes unnecessary
>> knowledge (e.g. Metaspace does not need to know anything about either ccp
>> encoding or cds).
>>
>> ---
>>
>> How it comes together:
>>
>> If CDS is off, we just reserve a space using
>> Metaspace::reserve_address_space_for_compressed_classes(), initialize it
>> with Metaspace::initialize_class_space(ReservedSpace rs), then set up
>> compressed class pointer encoding covering the range of this class space.
>>
>> If CDS is on (dump time), we reserve large 4G space, either at
>> SharedBaseAddress or using
>> Metaspace::reserve_address_space_for_compressed_classes(); we then split
>> that into 3G archive space and 1G class space; we set up that space with
>> Metaspace as class space; then we set up compressed class pointer encoding
>> covering both archive space and cds.
>>
>> If CDS is on (run time), we reserve a large space, split it into archive
>> space (large enough to hold both archives) and class space, then basically
>> proceed as above.
>>
>> Note that this is almost exactly how things worked before (modulo some
>> minor fixes, e.g. alignment issues), only the code is reformed and made
>> more explicit.
>>
>> ---
>>
>> I moved compressed class pointer setup over to CompressedKlassPointers
>> and changed the interface:
>>
>> -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace
>> metaspace_rs, address cds_base)
>> +void CompressedKlassPointers::initialize(address addr, size_t len);
>>
>> Instead of feeding it a single ReservedSpace, which is supposed to
>> represent class space, and an optional alternate base if cds is on, now we
>> give it just an numeric address range. That range marks the limits to where
>> Klass structures are to be expected, and is the implicit promise that
>> outside that range no Klass structures will exist, so encoding has to cover
>> only this range.
>>
>> This range may contain just the class space; or class space+cds; or
>> whatever allocation scheme we come up with in the future. Encoding does not
>> really care how the memory is organized as long as the input range covers
>> all possible Klass locations. That way we remove knowledge about class
>> space/cds from compressed class pointer encoding.
>>
>> Moving it away from metaspace.cpp into the CompressedKlassPointers class
>> also mirrors CompressedOops::initialize().
>>
>> ---
>>
>> I renamed _narrow_klass_range to just _range, because strictly speaking
>> this is the range un-narrow Klass pointers can have.
>>
>> As for the implementation of CompressedKlassPointers::initialize(address
>> addr, size_t len), I mimicked very closely what happened before, so there
>> should be almost no differences. Since "almost no differences" sounds scary
>> :) here are the differences:
>>
>> - When CDS is active (dump or run time) we now always, unconditionally,
>> set the encoding range to 4G. This fixes a theoretical bug discussed on
>> aarch64-port-dev [1].
>>
>> - When CDS is not active, we set the encoding range to the minimum
>> required length. Before, it was left at its default value of 4G.
>>
>> Both differences only affect aarch64, since they are currently the only
>> one using the range field in CompressedKlassPointers.
>>
>> I wanted to add an assert somewhere to test encoding of the very last
>> address of the CompressedKlassPointers range, again to prevent errors like
>> [3]. But I did not come up with a good place for this assert which would
>> cover also the encoding done by C1/C2.
>>
>> For the same reason I thought about introducing a mode where Klass
>> structures would be allocated in reverse order, starting at the end of the
>> ccs, but again left it out as too big a change.
>>
>> ---
>>
>> OS abstraction: platforms may have restrictions of what constitutes a
>> valid compressed class pointer encoding base. Or if not, they may have at
>> least preferences. There was logic like this in metaspace.cpp, which I
>> removed and cleanly factored out into platform dependent files, giving each
>> platform the option to add special logic.
>>
>> These are two new methods:
>>
>> - bool CompressedKlassPointers::is_valid_base(address p)
>>
>> to let the platform tell you whether it considers p to be a valid
>> encoding base. The only platform having these restrictions currently is
>> aarch64.
>>
>> - ReservedSpace
>> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>>
>> this hands over the process of allocating a range suitable for compressed
>> class pointer encoding to the platform. Most platforms will allocate just
>> anywhere, but some platforms may have a better strategy (e.g. trying low
>> memory first, trying only correctly aligned addresses and so on).
>>
>> Beforehand, this coding existed in a similar form in metaspace.cpp for
>> aarch64 and AIX. For now, I left the AIX part out - it seems only half
>> done, and I want to check further if we even need it, if yes why not on
>> Linux ppc, and C1 does not seem to support anything other than base+offset
>> with shift either, but I may be mistaken.
>>
>> These two methods should give the platform enough control to implement
>> their own scheme for optimized class space placement without bothering any
>> shared code about it.
>>
>> Note about the form, I introduced two new platform dependent files,
>> "metaspace_<cpu>.cpp" and "compressedOops_<cpu>.cpp". I am not happy about
>> this but this seems to be what we generally do in hotspot, right?
>>
>> ---
>>
>> Metaspace reserve alignment vs cds alignment
>>
>> CDS was using Metaspace reserve alignment for CDS internal purposes. I
>> guess this was just a copy paste issue. It never caused problems since
>> Metaspace reserve alignment == page size, but that is not true anymore in
>> the upcoming Elastic Metaspace where reserve alignment will be larger. This
>> causes a number of issues.
>>
>> I separated those two cleanly. CDS now uses
>> os::vm_allocation_granularity. Metaspace::reserve_alignment is only used in
>> those two places where it is needed, when CDS creates the address space for
>> class space on behalf of the Metaspace.
>>
>> ---
>>
>> Windows special handling in CDS
>>
>> To simplify coding I removed the windows specific handling which left out
>> reservation of the archive. This was needed because windows cannot mmap
>> files into reserved regions. But fallback code exists in filemap.cpp for
>> this case which just reads in the region instead of mapping it.
>>
>> Should that turn out to be a performance problem, I will reinstate the
>> feature. But a simpler way would be reserve the archive and later just
>> before mmapping the archive file to release the archive space. That would
>> not only be simpler but give us the best guarantee that that address space
>> is actually available. But I'd be happy to leave that part out completely
>> if we do not see any performance problems on windows x64.
>>
>> ---
>>
>> NMT cannot deal with spaces which are split. This problem manifests in
>> that bookkeeping for class space is done under "Shared Classes", not
>> "Classes" as it should. This problem exists today too at dump time and
>> randomly at run time. But since I simplified the reservation, this problem
>> now shows up always, whether or not we map at the SharedBaseAddress.
>> While I could work around this problem, I'd prefer this problem to be
>> solved at the core, and NMT to have an option to recognize reservation
>> splits. So I'd rather not put a workaround for this into the patch but
>> leave it for fixing as a separate issue. I opened this issue to track it
>> [6].
>>
>> ---
>>
>> Jtreg tests:
>>
>> I expanded the CompressedOops/CompressedClassPointers.java. I also
>> extended them to Windows. The tests now optionally omit strict class space
>> placement tests, since these tests heavily depend on ASLR and were the
>> reason they were excluded on Windows. However I think even without checking
>> for class space placement they make sense, just to see that the VM comes up
>> and lives with the many different settings we can run in.
>>
>> ---
>>
>> Tests:
>>
>> - I ran the patch through Oracles submit repo
>> - I ran tests manually for aarch64, zero, linux 32bit and windows x64
>> - The whole battery of nightly tests at SAP, including ppc, ppcle and
>> aarch64, unfortunately excluding windows because of unrelated errors.
>> Windows x64 tests will be redone tonight.
>>
>>
>> Thank you,
>>
>> Thomas
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
>> [2]
>> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
>> [3] https://bugs.openjdk.java.net/browse/JDK-8193266
>> [4] https://bugs.openjdk.java.net/browse/JDK-8221173
>> [5]
>> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
>> [6] https://bugs.openjdk.java.net/browse/JDK-8243535
>>
>>
>


More information about the aarch64-port-dev mailing list