Question about ccs reservation, CDS and aarch64 specifics

Thomas Stüfe thomas.stuefe at gmail.com
Mon Apr 20 11:10:42 UTC 2020


On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam <ioi.lam at oracle.com> wrote:

>
>
> On 4/18/20 12:15 AM, Thomas Stüfe wrote:
>
> Hi Ioi,
>
> I am working on a small patch and have some more questions.
>
> - First, a simple one, in
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the
> space does not have anything to do with metaspace, as you wrote, so the
> alignment could be anything, right?
>
> I think so.
>
> - Out of curiosity, when you pack the different regions
> (DumpRegion::pack) you align the end to page size. Why? Why could the next
> region not simply follow immediately? I looked if any code needs a region
> to be page aligned, but may have missed it.
>
>
> We map RO read-only and MC/RW in read-write. If the regions are not
> aligned, you will have a page that wants half to be read-only and half to
> be read-write.
>
>
Okay. I wondered why page align here and not allocation granularity. Now I
understand. I guess this is also the reason why we could not use large
pages for the archive?

I think this is fine, I did not want to change it. On some platforms we
have 64K (non-large) pages, but even there I think the waste would be
acceptable.
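
To put a number on the waste, here is a minimal sketch of the padding that
page-aligning each region end costs. The helper names (align_up,
pack_padding, max_total_padding) are made up for illustration and are not
the HotSpot functions; the only assumption is that the page size is a power
of two.

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the next multiple of alignment.
// alignment must be a non-zero power of two.
inline uint64_t align_up(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Padding added after a region so that the next region starts on its own
// page and can therefore get its own protection (read-only vs. read-write).
inline uint64_t pack_padding(uint64_t region_end, uint64_t page_size) {
  return align_up(region_end, page_size) - region_end;
}

// Upper bound on the total padding: every region wastes strictly less
// than one page.
inline uint64_t max_total_padding(unsigned regions, uint64_t page_size) {
  return regions * (page_size - 1);
}
```

With three regions (MC, RW, RO) and 64K pages, the worst case stays just
under 192K, which matches the "acceptable" estimate above.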


> I guess we can adjust the mapping to be more lenient (if a page wants half
> read-write, we map it read-write), but that's not done today.
>
>
> - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :
>
> I assume this code has to work for all three cases, right?
> 1) lp32
> 2) lp64 with UseCompressedClassPointers
> 3) lp64 without UseCompressedClassPointers
>
> If yes, does the setting for UseCompressedClassPointers have to be the
> same at run time?
>
>
> Yes. The value of UseCompressedOops and UseCompressedClassPointers must be
> the same between dump time and run time.
>
>
>
> In this layout:
>   // On 64-bit VM, the heap and class space layout will be the same as if
>   // you're running in -Xshare:on mode:
>   //
>   //                              +-- SharedBaseAddress (default = 0x800000000)
>   //                              v
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>   //
>
> Why does the class space have to follow mc+rw+ro? Could it come before?
>
>
> Compressed klass pointers are stored in archived objects. If the class
> space is now lower than SharedBaseAddress, you will need to rebase all of
> the compressed klass pointers. This is not efficient and will slow down
> start-up.
>
>
Well, could SharedBaseAddress not point to the start of the ccs?

  // +-- SharedBaseAddress (default = 0x800000000)
  // v
  // +--------------------+---------------+----+----+----+
  // |    class space     | ..gap maybe.. | MC | RW | RO |
  // +--------------------+---------------+----+----+----+

You'd then need to make sure that the relative offset of MC to
SharedBaseAddress is the same at dump time and at runtime. Is my
understanding correct? I am not saying I want to do this, I am just trying
to understand the way ccs archive allocation works.
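
To illustrate why the offset relative to the base is what matters, here is
a minimal sketch of a base-plus-shift encoding. NarrowKlassEncoding and its
members are hypothetical names, not the HotSpot classes; the sketch only
assumes that a narrow klass value is the shifted distance from the encoding
base and must fit into 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a compressed klass pointer encoding.
struct NarrowKlassEncoding {
  uint64_t base;   // encoding base, e.g. SharedBaseAddress
  unsigned shift;  // log2 of the alignment of Klass structures

  // Compress an address: store only the shifted distance from the base.
  uint32_t encode(uint64_t klass_addr) const {
    assert(klass_addr >= base);
    uint64_t offset = klass_addr - base;
    assert((offset & ((uint64_t(1) << shift) - 1)) == 0); // suitably aligned
    assert((offset >> shift) <= UINT32_MAX);              // inside the range
    return static_cast<uint32_t>(offset >> shift);
  }

  // Decompress: the result depends only on the base in use right now.
  uint64_t decode(uint32_t narrow) const {
    return base + (static_cast<uint64_t>(narrow) << shift);
  }
};
```

A value encoded at dump time against one base decodes correctly against a
different base at runtime as long as the target sits at the same offset from
that base. Nothing in the narrow value records where the ccs was.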


>
>
> Actually, does it have to be in the same space at all, or could it live
> somewhere completely different?
>
>
> It can be higher. You just need to ensure that the distance between
> SharedBaseAddress to the end of the class space is within max compressed
> klass space size.
>
> But, I am wondering why you're asking this :-)
>
>
I am trying to understand the allocation and which restrictions apply where.
We have at least three parties, cds, metaspace and the underlying platform,
all with their own subtleties of how the memory should be allocated:
- metaspace will in the near future want a larger alignment than what cds
uses for reservation.
- platforms like aarch64 and maybe ppc want the compressed class base to
look a certain way.

Part of my confusion was that I always thought of
CompressedKlassPointers::base() as basically the same as the start of the
ccs (apart from possibly being zero in zero-based mode), but that is
obviously not true since CDS exists. So what I wrote first:

"Metaspace::reserve_preferred_space.. Despite its generic-sounding name,
these functions can only be used to allocate ccs."

is actually not fully correct. In reality this space is to be used to
allocate memory to house Klass structures so that their pointers are
compressible, so the reserved start address has to be compatible with that.
But, e.g., that start address does not have to be aligned to
Metaspace::reserve_alignment().

In both cds dump and runtime case, the ccs is carved from the end part of
the reserved space. Only that split point, and the size of that second
part, have to be aligned to Metaspace::reserve_alignment().
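
As a minimal sketch of that carving, under the assumption that only the
split point and the ccs size need the larger alignment: Range, Split and
carve_ccs are hypothetical names, and reserve_alignment is passed in as a
plain number rather than taken from Metaspace.

```cpp
#include <cassert>
#include <cstdint>

// alignment must be a non-zero power of two for both helpers.
inline uint64_t align_down(uint64_t value, uint64_t alignment) {
  return value & ~(alignment - 1);
}
inline uint64_t align_up(uint64_t value, uint64_t alignment) {
  return align_down(value + alignment - 1, alignment);
}

struct Range {
  uint64_t start;
  uint64_t size;
  uint64_t end() const { return start + size; }
};

struct Split {
  Range archive;  // front part: MC/RW/RO, start need not be reserve-aligned
  Range ccs;      // tail part: class space, start and size reserve-aligned
};

// Carve the class space from the end part of one contiguous reservation.
inline Split carve_ccs(Range reserved, uint64_t archive_bytes,
                       uint64_t reserve_alignment) {
  uint64_t split = align_up(reserved.start + archive_bytes, reserve_alignment);
  assert(split <= reserved.end());
  uint64_t ccs_size = align_down(reserved.end() - split, reserve_alignment);
  return Split{ Range{reserved.start, split - reserved.start},
                Range{split, ccs_size} };
}
```

The start of the reservation is left untouched here, so it can be merely
page-aligned while the ccs still satisfies the stricter alignment.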

Were we to allocate ccs first and put the archives behind it, this would
simplify some matters, but only minor points. I think the way it works now
is okay. I will try to disentangle it a bit in the way you proposed.


> To ask in a more precise way: I understand that both the mc+rw+ro archives
> and the ccs have to live in an area encompassed by the compressed class
> pointers encoding scheme. I wonder whether there are any restrictions
> beyond that.
>
> Could there be a gap between archives and ccs?
>
> Yes
>
> Can the order be reversed?
>
> No.
>
> Do the relative positions between archives and ccs have to be the same
> between dump time and runtime?
>
> No. All the pointers stored inside CDS point to inside of the MC/RW/RO
> regions, so it doesn't retain any knowledge of where the CCS was at dump
> time.
>
>
Clear answers, thank you!

..Thomas


>
> Thanks
> - Ioi
>
>
>
> Thanks!
>
> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>>
>>
>> On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>>
>> Hi Ioi,
>>
>> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>> (I suppose you mean "compressed class space" by "ccs" :-)
>>>
>>>
>> Yes, I think I stole this from Stefan Karlsson :)
>>
>>
>>> <snip>
>>>
>>
>>
>>> I am not even sure if case (C) can happen at all.
>>>
>>> I admit that I've been guilty of making the interface even more
>>> complicated
>>> with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
>>> (Relocate the CDS archive if it cannot be mapped to the
>>> requested address). Looks like now is a good time to clean up.
>>>
>>>
>> The coding has been complicated to begin with, and then it usually only
>> gets worse since no-one has time for a revamp :( A clean up would be very
>> helpful.
>>
>> One reason I am looking at this code now, besides the aarch64 problem, is
>> that I am trying to disentangle CDS from Metaspace, especially the alignment
>> policy. Remember, I tried to tackle this last summer? But it keeps biting
>> me. For such a small problem this is weirdly complicated.
>>
>>
>>> One thing that can be cleaned up is the call to
>>> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>>
>>> (a) when CDS is enabled:
>>>
>>>     Metaspace::global_initialize()
>>>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>>        -> ... MetaspaceShared::map_archives()
>>>          -> ... reserve the space, eventually calling Metaspace::reserve_space
>>>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>>
>>> (b) when CDS is disabled
>>>
>>>     Metaspace::global_initialize()
>>>     -> allocate_metaspace_compressed_klass_ptrs
>>>        -> (if cds is not enabled) Metaspace::reserve_space()
>>>
>>>
>>> In case (b), we should first reserve the space, and then call into
>>> allocate_metaspace_compressed_klass_ptrs. This will simplify the
>>> arguments of allocate_metaspace_compressed_klass_ptrs, and will also
>>> limit the variations of calls to Metaspace::reserve_space(). I think
>>> this will make it possible to drop the use_requested_addr argument and
>>> rely simply on (requested_addr != NULL)
>>>
>>>
>> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to
>> Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>>
>> This would melt down
>> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just "initialize
>> compressed class space from a pre-arranged ReservedSpace, and set up base +
>> shift".
>>
>> We could probably rename that thing
>> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>>
>> We even could move set_narrow_klass_base_and_shift() out of
>> Metaspace::set_up_compressed_klass_space, then it becomes a series of three
>> simple operations:
>> 1) obtain a ReservedSpace however you see fit
>> 2) register it with Metaspace as address space for ccs,
>> 3) set_narrow_klass_base_and_shift. We would not have to hand down
>> cds_base to Metaspace, only for it to be used as base address
>> in set_narrow_klass_base_and_shift.
>>
>>
>> Yes, that seems the right thing to do. That will hopefully make the
>> aarch64 initialization code a little simpler as well.
>>
>> One question which came to me today was:
>>
>> In AppCDS, DynamicArchiveBuilder::do_it() calls
>> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have
>> anything to do with ccs? Don't they just need some space to dump into? Hope
>> that question is not dumb.
>>
>> Do you mean:
>>
>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>> -> MetaspaceShared::reserve_shared_space
>>     -> Metaspace::reserve_space
>>
>> That's not necessary. When I wrote the code I thought
>> Metaspace::reserve_space was a general function for reserving spaces :-)
>> but as you said, this function is probably intended only for initializing
>> the CCS.
>>
>> Thanks
>> - Ioi
>>
>> Thanks, Thomas
>>
>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> Does that make sense? In other words, if the whole point of
>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>> to find a good address", would it not make sense to just try a low
>>> address as part of the try-addresses-loop?
>>>
>>> We certainly don't want to have to use a dedicated heapbase register
>>> or a shift. Just give us a multiple of 4*G and we're happy.
>>>
>>>
>>>
>>>
>>
>


More information about the hotspot-runtime-dev mailing list