RFR(M): 8243392: Remodel CDS/Metaspace storage reservation

Ioi Lam ioi.lam at oracle.com
Wed Apr 29 05:14:31 UTC 2020


Hi Thomas,

There are a lot of changes so it will take me a while to go through 
everything. Just some initial comments:

   // User may have specified an invalid base address. Should we ignore 
it or assert?
guarantee(CompressedKlassPointers::is_valid_base((address)shared_base),
             "SharedBaseAddress: " PTR_FORMAT " is not a valid base.", 
p2i(shared_base));

This will cause the VM to crash. I think it's better (1) exit the VM 
properly with an error code, or (2) override the user's input.

======

Since this is a potentially disruptive change, I want to run it in our 
CI as well. Could you tell me the tip of your repo?

========

For testing the CDS relocation code, I would suggest running:

cd test/hotspot/jtreg
jtreg -javaoption:-XX:+UnlockDiagnosticVMOptions \
       -javaoption:-XX:ArchiveRelocationMode=1 \
       -javaoption:-XX:NativeMemoryTracking=detail
       :hotspot_cds_relocation

This will place the CCS at random locations picked by the OS.

========

metaspace.cpp:

If your intention is to "shake things up a little", it's not a good idea 
to include it in a complex change set. If things indeed go wrong, we 
don't know who caused it (your CCS changes, or old bugs triggered by 
this debug code), and we will end up backing out the entire changeset.

I would suggest putting this in a different RFE, and even push it now.

   // The upcoming Elastic Metaspace will have stricter alignment 
requirements.
   // For debug builds, increase reserve alignment to shake loose errors 
resulting
   // from misusing this alignment.
   // Note: do not increase too much (e.g. not on platforms with 64K 
pages), we do not
   // want to disturb tests requiring precise numbers for metaspace size 
or ccs size.
#ifdef ASSERT
   if (_reserve_alignment == 4 * K) {
     _reserve_alignment *= 4;
   }
#endif


More to come ....

Thanks
- Ioi


On 4/28/20 7:54 AM, Thomas Stüfe wrote:
> Hi all,
>
> Could I have reviews for the following proposal of reworking cds/class 
> space reservation?
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8243392
>
> Webrev: 
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/
>
> (Many thanks to Ioi Lam for so patiently explaining CDS internals to 
> me, and to Andrew Haley and Nick Gasson for help with aarch64!)
>
> Reservation of the compressed class space is needlessly complicated 
> and has some minor issues. It can be simplified and made clearer.
>
> The complexity stems from the fact that this area lives at the 
> intersection of two to three sub systems, depending on how one counts. 
> Metaspace, CDS, and the platform which may or may not its own view of 
> how to reserve class space. And this code has been growing organically 
> over time.
>
> One small example:
>
> ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t 
> alignment,
>                                                  bool large_pages, 
> char *requested_addr,
>                                                  bool use_requested_addr)
>
> which I spent hours decoding, resulting in a very confused mail to 
> hs-runtime and aarch64-port-dev [2].
>
> This patch attempts to simplify cds and metaspace setup a bit; to 
> comment implicit knowledge which is not immediately clear; to cleanly 
> abstract platform concerns like optimized class space placement; and 
> to disentangle cds from metaspace to solve issues which may bite us 
> later with Elastic Metaspace [4].
>
> ---
>
> The main change is the reworked reservation mechanism. This is based 
> on Ioi's proposal [5].
>
> When reserving class space, three things must happen:
>
> 1) reservation of the space obviously. If cds is active that space 
> must be in the vicinity of cds archives to be covered by compressed 
> class pointer encoding.
> 2) setting up the internal Metaspace structures atop of that space
> 3) setting up compressed class pointer encoding.
>
> In its current form, Metaspace may or may not do some or all of that 
> in one function 
> (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace 
> metaspace_rs, char* requested_addr, address cds_base);) - if cds is 
> active, it will reserve the space for Metaspace and hand it in, 
> otherwise it will create it itself.
>
> When discussing this in [2], Ioi proposed to move the reservation of 
> the class space completely out of Metaspace and make it a 
> responsibility of the caller always. This would reduce some 
> complexity, and this patch follows the proposal.
>
> I removed 
> Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace 
> metaspace_rs, char* requested_addr, address cds_base); and all its sub 
> functions.
>
> (1) now has to always be done outside - a ReservedSpace for class 
> space has to be provided by the caller. However, Metaspace now offers 
> a utility function for reserving space at a "nice" location, and 
> explicitly doing nothing else:
>
> ReservedSpace 
> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>
> this function can be redefined on a platform level for platform 
> optimized reservation, see below for details.
>
> (2) is taken care of by a new function, 
> Metaspace::initialize_class_space(ReservedSpace rs)
>
> (3) is taken care of a new function 
> CompressedKlassPointers::initialize(), see below for details.
>
>
> So, class space now is set up three explicit steps:
>
> - First, reserve a suitable space by however means you want. For 
> convenience you may use 
> Metaspace::reserve_address_space_for_compressed_classes(), or you may 
> roll your own reservation.
> - Next, tell Metaspace to use that range as backing storage for class 
> space: Metaspace::initialize_class_space(ReservedSpace rs)
> - Finally, set up encoding. Encoding is independent from the concept 
> of a ReservedSpace, it just gets an address range, see below for details.
>
> Separating these steps and moving them out of the responsibility of 
> Metaspace makes this whole thing more flexible; it also removes 
> unnecessary knowledge (e.g. Metaspace does not need to know anything 
> about either ccp encoding or cds).
>
> ---
>
> How it comes together:
>
> If CDS is off, we just reserve a space using 
> Metaspace::reserve_address_space_for_compressed_classes(), initialize 
> it with Metaspace::initialize_class_space(ReservedSpace rs), then set 
> up compressed class pointer encoding covering the range of this class 
> space.
>
> If CDS is on (dump time), we reserve large 4G space, either at 
> SharedBaseAddress or using 
> Metaspace::reserve_address_space_for_compressed_classes(); we then 
> split that into 3G archive space and 1G class space; we set up that 
> space with Metaspace as class space; then we set up compressed class 
> pointer encoding covering both archive space and cds.
>
> If CDS is on (run time), we reserve a large space, split it into 
> archive space (large enough to hold both archives) and class space, 
> then basically proceed as above.
>
> Note that this is almost exactly how things worked before (modulo some 
> minor fixes, e.g. alignment issues), only the code is reformed and 
> made more explicit.
>
> ---
>
> I moved compressed class pointer setup over to CompressedKlassPointers 
> and changed the interface:
>
> -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace 
> metaspace_rs, address cds_base)
> +void CompressedKlassPointers::initialize(address addr, size_t len);
>
> Instead of feeding it a single ReservedSpace, which is supposed to 
> represent class space, and an optional alternate base if cds is on, 
> now we give it just an numeric address range. That range marks the 
> limits to where Klass structures are to be expected, and is the 
> implicit promise that outside that range no Klass structures will 
> exist, so encoding has to cover only this range.
>
> This range may contain just the class space; or class space+cds; or 
> whatever allocation scheme we come up with in the future. Encoding 
> does not really care how the memory is organized as long as the input 
> range covers all possible Klass locations. That way we remove 
> knowledge about class space/cds from compressed class pointer encoding.
>
> Moving it away from metaspace.cpp into the CompressedKlassPointers 
> class also mirrors CompressedOops::initialize().
>
> ---
>
> I renamed _narrow_klass_range to just _range, because strictly 
> speaking this is the range un-narrow Klass pointers can have.
>
> As for the implementation of 
> CompressedKlassPointers::initialize(address addr, size_t len), I 
> mimicked very closely what happened before, so there should be almost 
> no differences. Since "almost no differences" sounds scary :) here are 
> the differences:
>
> - When CDS is active (dump or run time) we now always, 
> unconditionally, set the encoding range to 4G. This fixes a 
> theoretical bug discussed on aarch64-port-dev [1].
>
> - When CDS is not active, we set the encoding range to the minimum 
> required length. Before, it was left at its default value of 4G.
>
> Both differences only affect aarch64, since they are currently the 
> only one using the range field in CompressedKlassPointers.
>
> I wanted to add an assert somewhere to test encoding of the very last 
> address of the CompressedKlassPointers range, again to prevent errors 
> like [3]. But I did not come up with a good place for this assert 
> which would cover also the encoding done by C1/C2.
>
> For the same reason I thought about introducing a mode where Klass 
> structures would be allocated in reverse order, starting at the end of 
> the ccs, but again left it out as too big a change.
>
> ---
>
> OS abstraction: platforms may have restrictions of what constitutes a 
> valid compressed class pointer encoding base. Or if not, they may have 
> at least preferences. There was logic like this in metaspace.cpp, 
> which I removed and cleanly factored out into platform dependent 
> files, giving each platform the option to add special logic.
>
> These are two new methods:
>
> - bool CompressedKlassPointers::is_valid_base(address p)
>
> to let the platform tell you whether it considers p to be a valid 
> encoding base. The only platform having these restrictions currently 
> is aarch64.
>
> - ReservedSpace 
> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>
> this hands over the process of allocating a range suitable for 
> compressed class pointer encoding to the platform. Most platforms will 
> allocate just anywhere, but some platforms may have a better strategy 
> (e.g. trying low memory first, trying only correctly aligned addresses 
> and so on).
>
> Beforehand, this coding existed in a similar form in metaspace.cpp for 
> aarch64 and AIX. For now, I left the AIX part out - it seems only half 
> done, and I want to check further if we even need it, if yes why not 
> on Linux ppc, and C1 does not seem to support anything other than 
> base+offset with shift either, but I may be mistaken.
>
> These two methods should give the platform enough control to implement 
> their own scheme for optimized class space placement without bothering 
> any shared code about it.
>
> Note about the form, I introduced two new platform dependent files, 
> "metaspace_<cpu>.cpp" and "compressedOops_<cpu>.cpp". I am not happy 
> about this but this seems to be what we generally do in hotspot, right?
>
> ---
>
> Metaspace reserve alignment vs cds alignment
>
> CDS was using Metaspace reserve alignment for CDS internal purposes. I 
> guess this was just a copy paste issue. It never caused problems since 
> Metaspace reserve alignment == page size, but that is not true anymore 
> in the upcoming Elastic Metaspace where reserve alignment will be 
> larger. This causes a number of issues.
>
> I separated those two cleanly. CDS now uses 
> os::vm_allocation_granularity. Metaspace::reserve_alignment is only 
> used in those two places where it is needed, when CDS creates the 
> address space for class space on behalf of the Metaspace.
>
> ---
>
> Windows special handling in CDS
>
> To simplify coding I removed the windows specific handling which left 
> out reservation of the archive. This was needed because windows cannot 
> mmap files into reserved regions. But fallback code exists in 
> filemap.cpp for this case which just reads in the region instead of 
> mapping it.
>
> Should that turn out to be a performance problem, I will reinstate the 
> feature. But a simpler way would be reserve the archive and later just 
> before mmapping the archive file to release the archive space. That 
> would not only be simpler but give us the best guarantee that that 
> address space is actually available. But I'd be happy to leave that 
> part out completely if we do not see any performance problems on 
> windows x64.
>
> ---
>
> NMT cannot deal with spaces which are split. This problem manifests in 
> that bookkeeping for class space is done under "Shared Classes", not 
> "Classes" as it should. This problem exists today too at dump time and 
> randomly at run time. But since I simplified the reservation, this 
> problem now shows up always, whether or not we map at the 
> SharedBaseAddress.
> While I could work around this problem, I'd prefer this problem to be 
> solved at the core, and NMT to have an option to recognize reservation 
> splits. So I'd rather not put a workaround for this into the patch but 
> leave it for fixing as a separate issue. I opened this issue to track 
> it [6].
>
> ---
>
> Jtreg tests:
>
> I expanded the CompressedOops/CompressedClassPointers.java. I also 
> extended them to Windows. The tests now optionally omit strict class 
> space placement tests, since these tests heavily depend on ASLR and 
> were the reason they were excluded on Windows. However I think even 
> without checking for class space placement they make sense, just to 
> see that the VM comes up and lives with the many different settings we 
> can run in.
>
> ---
>
> Tests:
>
> - I ran the patch through Oracles submit repo
> - I ran tests manually for aarch64, zero, linux 32bit and windows x64
> - The whole battery of nightly tests at SAP, including ppc, ppcle and 
> aarch64, unfortunately excluding windows because of unrelated errors. 
> Windows x64 tests will be redone tonight.
>
>
> Thank you,
>
> Thomas
>
> [1] 
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
> [2] 
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
> [3] https://bugs.openjdk.java.net/browse/JDK-8193266
> [4] https://bugs.openjdk.java.net/browse/JDK-8221173
> [5] 
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
> [6] https://bugs.openjdk.java.net/browse/JDK-8243535
>



More information about the hotspot-dev mailing list