[aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation

Thomas Stüfe thomas.stuefe at gmail.com
Sat May 16 08:12:28 UTC 2020


Hi,

On Sat, May 16, 2020 at 12:26 AM <coleen.phillimore at oracle.com> wrote:

>
> Hi, This looks good. The code motion makes sense so this is an improvement.
>
>
thanks!



> On 5/7/20 10:21 AM, Thomas Stüfe wrote:
>
> Hi all,
>
> please take a look at the third iteration of this change:
>
>
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.02/webrev/
>
> Changes in this version:
>
> - at the request of Coleen (sorry I did not see your post earlier) I
> removed all platform dependent files and put the platform dependent coding
> back to metaspace.cpp and compressedOops.cpp to ease the review pain. I
> used plain platform defines though (#ifdef AARCH64) instead of hiding them
> behind another macro to make things clearer. Note that I still intent to
> put this code away into the platform corners but will do so in a follow up
> RFE.
>
>
> Oh, please don't, especially since most platforms are just a
> ReservedSpace(size, align, false, NULL);  The AARCH64 code isn't a big
> intrusion in this code.  Thank you for putting it back. It is easier to
> follow how it is.
>
>
Okay. We'll see how quickly the coding deteriorates again though. We may
want to put extra stuff for ppc64 in too, and/or just AIX. See also:
https://bugs.openjdk.java.net/browse/JDK-8244943
<https://bugs.openjdk.java.net/browse/JDK-8244943?filter=-3> . I'll keep
this off for now to a later point.

>
> - I reinstated, in a fashion, the special handling of reservations on
> Windows. On all platforms we reserve address space to map the archive files
> in with a subsequent mapping operation. However, on Windows, we cannot use
> MapViewOfFile() into an existing mapping. So I remove the mapping again
> before mapping the archives - see comment in code for details.
>
>
> I missed this windows part of the change, where is it?
>

Follow the trail Ioi led with
"MetaspaceShared::use_windows_memory_mapping()".

In original coding this is a complicated dance
between MetaspaceShared::reserve_address_space_for_archives
and FileMapInfo::map_region, the implicit knowledge here is that these
functions are called twice, first time trying to attach to the archive base
address (which is the nice fast case, no relocation) and the second time
trying to attach anywhere.

In MetaspaceShared::reserve_address_space_for_archives, we want to reserve
space for archive and ccs laying one beside the other; later we plan to map
the archive file into the prereserved archive space. But that does not work
on Windows, you cannot map into pre-reserved space; so the first time we do
this we explicitly do *not* reserve the archive space (windows only), just
the class space. Then we attempt to map the archive space into that region
which is not reserved yet but hopefully unpopulated. Since that may not
work, we call MetaspaceShared::reserve_address_space_for_archives a second
time, this time reserving both archive space and css, and this time around
instead of mapping the archive file in Ioi just reads it sequentially. This
is slower than mapping the file, but his reasoning is that this does not
matter since he needs to relocate anyway - different base address - so the
sequential read does not hurt performance.

I think this all makes sense. Its the fastest way of doing this. In my
first attempt at this patch I wanted to simplify things and just always
read sequentially; but that hurt startup time on WIndows when the cds was
big. So I just tried to make the code a bit simpler and more faultproof: in
my version of MetaspaceShared::reserve_address_space_for_archives, I always
unconditionally reserve space for both archive and ccs. Then, returning
from that function, in MetaspaceShared::map_archives(), I unreserve the
archive again on windows, so subsequent mappings to this address can
succeed. I think this makes the code a bit clearer, and an additional
advantage is that since we were able to reserve at that address already
once the chance is very high that mapping of the files some milliseconds
later will succeed. Not guaranteed though, that is why the second time
around we use the original sequential read logic.


>
>
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.02/webrev/src/hotspot/share/memory/metaspace/virtualSpaceList.cpp.udiff.html
>
> Why did you make this change?  Shouldn't the caller align it?
>

Probably yes. I'll remove this again.


>
> Looks good to me.
> Coleen
>
>
Thanks!

I'll post an update in the next days to incorporate yours and Iois feedback
and rebase the patch.

Cheers, Thomas


>
> All CI tests at SAP run through without problems, including on Win64 and
> aarch64, but I would be happy if others were to run test too.
>
> Thank you, Thomas
>
> On Tue, Apr 28, 2020 at 4:54 PM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
>> Hi all,
>>
>> Could I have reviews for the following proposal of reworking cds/class
>> space reservation?
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8243392
>>
>> Webrev:
>> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/
>>
>> (Many thanks to Ioi Lam for so patiently explaining CDS internals to me,
>> and to Andrew Haley and Nick Gasson for help with aarch64!)
>>
>> Reservation of the compressed class space is needlessly complicated and
>> has some minor issues. It can be simplified and made clearer.
>>
>> The complexity stems from the fact that this area lives at the
>> intersection of two to three sub systems, depending on how one counts.
>> Metaspace, CDS, and the platform which may or may not its own view of how
>> to reserve class space. And this code has been growing organically over
>> time.
>>
>> One small example:
>>
>> ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t
>> alignment,
>>                                                  bool large_pages, char
>> *requested_addr,
>>                                                  bool use_requested_addr)
>>
>> which I spent hours decoding, resulting in a very confused mail to
>> hs-runtime and aarch64-port-dev [2].
>>
>> This patch attempts to simplify cds and metaspace setup a bit; to comment
>> implicit knowledge which is not immediately clear; to cleanly abstract
>> platform concerns like optimized class space placement; and to disentangle
>> cds from metaspace to solve issues which may bite us later with Elastic
>> Metaspace [4].
>>
>> ---
>>
>> The main change is the reworked reservation mechanism. This is based on
>> Ioi's proposal [5].
>>
>> When reserving class space, three things must happen:
>>
>> 1) reservation of the space obviously. If cds is active that space must
>> be in the vicinity of cds archives to be covered by compressed class
>> pointer encoding.
>> 2) setting up the internal Metaspace structures atop of that space
>> 3) setting up compressed class pointer encoding.
>>
>> In its current form, Metaspace may or may not do some or all of that in
>> one function
>> (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
>> metaspace_rs, char* requested_addr, address cds_base);) - if cds is active,
>> it will reserve the space for Metaspace and hand it in, otherwise it will
>> create it itself.
>>
>> When discussing this in [2], Ioi proposed to move the reservation of the
>> class space completely out of Metaspace and make it a responsibility of the
>> caller always. This would reduce some complexity, and this patch follows
>> the proposal.
>>
>> I removed
>> Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
>> metaspace_rs, char* requested_addr, address cds_base); and all its sub
>> functions.
>>
>> (1) now has to always be done outside - a ReservedSpace for class space
>> has to be provided by the caller. However, Metaspace now offers a utility
>> function for reserving space at a "nice" location, and explicitly doing
>> nothing else:
>>
>> ReservedSpace
>> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>>
>> this function can be redefined on a platform level for platform optimized
>> reservation, see below for details.
>>
>> (2) is taken care of by a new function,
>> Metaspace::initialize_class_space(ReservedSpace rs)
>>
>> (3) is taken care of a new function
>> CompressedKlassPointers::initialize(), see below for details.
>>
>>
>> So, class space now is set up three explicit steps:
>>
>> - First, reserve a suitable space by however means you want. For
>> convenience you may use
>> Metaspace::reserve_address_space_for_compressed_classes(), or you may roll
>> your own reservation.
>> - Next, tell Metaspace to use that range as backing storage for class
>> space: Metaspace::initialize_class_space(ReservedSpace rs)
>> - Finally, set up encoding. Encoding is independent from the concept of a
>> ReservedSpace, it just gets an address range, see below for details.
>>
>> Separating these steps and moving them out of the responsibility of
>> Metaspace makes this whole thing more flexible; it also removes unnecessary
>> knowledge (e.g. Metaspace does not need to know anything about either ccp
>> encoding or cds).
>>
>> ---
>>
>> How it comes together:
>>
>> If CDS is off, we just reserve a space using
>> Metaspace::reserve_address_space_for_compressed_classes(), initialize it
>> with Metaspace::initialize_class_space(ReservedSpace rs), then set up
>> compressed class pointer encoding covering the range of this class space.
>>
>> If CDS is on (dump time), we reserve large 4G space, either at
>> SharedBaseAddress or using
>> Metaspace::reserve_address_space_for_compressed_classes(); we then split
>> that into 3G archive space and 1G class space; we set up that space with
>> Metaspace as class space; then we set up compressed class pointer encoding
>> covering both archive space and cds.
>>
>> If CDS is on (run time), we reserve a large space, split it into archive
>> space (large enough to hold both archives) and class space, then basically
>> proceed as above.
>>
>> Note that this is almost exactly how things worked before (modulo some
>> minor fixes, e.g. alignment issues), only the code is reformed and made
>> more explicit.
>>
>> ---
>>
>> I moved compressed class pointer setup over to CompressedKlassPointers
>> and changed the interface:
>>
>> -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace
>> metaspace_rs, address cds_base)
>> +void CompressedKlassPointers::initialize(address addr, size_t len);
>>
>> Instead of feeding it a single ReservedSpace, which is supposed to
>> represent class space, and an optional alternate base if cds is on, now we
>> give it just an numeric address range. That range marks the limits to where
>> Klass structures are to be expected, and is the implicit promise that
>> outside that range no Klass structures will exist, so encoding has to cover
>> only this range.
>>
>> This range may contain just the class space; or class space+cds; or
>> whatever allocation scheme we come up with in the future. Encoding does not
>> really care how the memory is organized as long as the input range covers
>> all possible Klass locations. That way we remove knowledge about class
>> space/cds from compressed class pointer encoding.
>>
>> Moving it away from metaspace.cpp into the CompressedKlassPointers class
>> also mirrors CompressedOops::initialize().
>>
>> ---
>>
>> I renamed _narrow_klass_range to just _range, because strictly speaking
>> this is the range un-narrow Klass pointers can have.
>>
>> As for the implementation of CompressedKlassPointers::initialize(address
>> addr, size_t len), I mimicked very closely what happened before, so there
>> should be almost no differences. Since "almost no differences" sounds scary
>> :) here are the differences:
>>
>> - When CDS is active (dump or run time) we now always, unconditionally,
>> set the encoding range to 4G. This fixes a theoretical bug discussed on
>> aarch64-port-dev [1].
>>
>> - When CDS is not active, we set the encoding range to the minimum
>> required length. Before, it was left at its default value of 4G.
>>
>> Both differences only affect aarch64, since they are currently the only
>> one using the range field in CompressedKlassPointers.
>>
>> I wanted to add an assert somewhere to test encoding of the very last
>> address of the CompressedKlassPointers range, again to prevent errors like
>> [3]. But I did not come up with a good place for this assert which would
>> cover also the encoding done by C1/C2.
>>
>> For the same reason I thought about introducing a mode where Klass
>> structures would be allocated in reverse order, starting at the end of the
>> ccs, but again left it out as too big a change.
>>
>> ---
>>
>> OS abstraction: platforms may have restrictions of what constitutes a
>> valid compressed class pointer encoding base. Or if not, they may have at
>> least preferences. There was logic like this in metaspace.cpp, which I
>> removed and cleanly factored out into platform dependent files, giving each
>> platform the option to add special logic.
>>
>> These are two new methods:
>>
>> - bool CompressedKlassPointers::is_valid_base(address p)
>>
>> to let the platform tell you whether it considers p to be a valid
>> encoding base. The only platform having these restrictions currently is
>> aarch64.
>>
>> - ReservedSpace
>> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>>
>> this hands over the process of allocating a range suitable for compressed
>> class pointer encoding to the platform. Most platforms will allocate just
>> anywhere, but some platforms may have a better strategy (e.g. trying low
>> memory first, trying only correctly aligned addresses and so on).
>>
>> Beforehand, this coding existed in a similar form in metaspace.cpp for
>> aarch64 and AIX. For now, I left the AIX part out - it seems only half
>> done, and I want to check further if we even need it, if yes why not on
>> Linux ppc, and C1 does not seem to support anything other than base+offset
>> with shift either, but I may be mistaken.
>>
>> These two methods should give the platform enough control to implement
>> their own scheme for optimized class space placement without bothering any
>> shared code about it.
>>
>> Note about the form, I introduced two new platform dependent files,
>> "metaspace_<cpu>.cpp" and "compressedOops_<cpu>.cpp". I am not happy about
>> this but this seems to be what we generally do in hotspot, right?
>>
>> ---
>>
>> Metaspace reserve alignment vs cds alignment
>>
>> CDS was using Metaspace reserve alignment for CDS internal purposes. I
>> guess this was just a copy paste issue. It never caused problems since
>> Metaspace reserve alignment == page size, but that is not true anymore in
>> the upcoming Elastic Metaspace where reserve alignment will be larger. This
>> causes a number of issues.
>>
>> I separated those two cleanly. CDS now uses
>> os::vm_allocation_granularity. Metaspace::reserve_alignment is only used in
>> those two places where it is needed, when CDS creates the address space for
>> class space on behalf of the Metaspace.
>>
>> ---
>>
>> Windows special handling in CDS
>>
>> To simplify coding I removed the windows specific handling which left out
>> reservation of the archive. This was needed because windows cannot mmap
>> files into reserved regions. But fallback code exists in filemap.cpp for
>> this case which just reads in the region instead of mapping it.
>>
>> Should that turn out to be a performance problem, I will reinstate the
>> feature. But a simpler way would be reserve the archive and later just
>> before mmapping the archive file to release the archive space. That would
>> not only be simpler but give us the best guarantee that that address space
>> is actually available. But I'd be happy to leave that part out completely
>> if we do not see any performance problems on windows x64.
>>
>> ---
>>
>> NMT cannot deal with spaces which are split. This problem manifests in
>> that bookkeeping for class space is done under "Shared Classes", not
>> "Classes" as it should. This problem exists today too at dump time and
>> randomly at run time. But since I simplified the reservation, this problem
>> now shows up always, whether or not we map at the SharedBaseAddress.
>> While I could work around this problem, I'd prefer this problem to be
>> solved at the core, and NMT to have an option to recognize reservation
>> splits. So I'd rather not put a workaround for this into the patch but
>> leave it for fixing as a separate issue. I opened this issue to track it
>> [6].
>>
>> ---
>>
>> Jtreg tests:
>>
>> I expanded the CompressedOops/CompressedClassPointers.java. I also
>> extended them to Windows. The tests now optionally omit strict class space
>> placement tests, since these tests heavily depend on ASLR and were the
>> reason they were excluded on Windows. However I think even without checking
>> for class space placement they make sense, just to see that the VM comes up
>> and lives with the many different settings we can run in.
>>
>> ---
>>
>> Tests:
>>
>> - I ran the patch through Oracles submit repo
>> - I ran tests manually for aarch64, zero, linux 32bit and windows x64
>> - The whole battery of nightly tests at SAP, including ppc, ppcle and
>> aarch64, unfortunately excluding windows because of unrelated errors.
>> Windows x64 tests will be redone tonight.
>>
>>
>> Thank you,
>>
>> Thomas
>>
>> [1]
>> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
>> [2]
>> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
>> [3] https://bugs.openjdk.java.net/browse/JDK-8193266
>> [4] https://bugs.openjdk.java.net/browse/JDK-8221173
>> [5]
>> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
>> [6] https://bugs.openjdk.java.net/browse/JDK-8243535
>>
>>
>


More information about the aarch64-port-dev mailing list