RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v9]

Tue Feb 17 07:43:18 UTC 2026

On Mon, 16 Feb 2026 07:10:09 GMT, Xue-Lei Andrew Fan <xuelei at openjdk.org> wrote:

>> **Summary**
>> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding.
>> 
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>> 
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>> 
>> 
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 × 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>> 
>> Current:   address = base + offset_bytes           (max ~2GB)
>> Proposed:  address = base + (offset_units << 3)    (max 32GB)
>> 
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero – we're wasting them!
>> 
>> Current byte offset (aligned to 8 bytes):
>>   0x00001000  =  0000 0000 0000 0000 0001 0000 0000 0|000
>>                                                       └── Always 000!
>> 
>> Scaled offset (shift=3):
>>   0x00000200  =  Same address, but stored in 29 bits instead of 32
>>                  Frees up 3 bits → 8x larger range!
>> 
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>> 
>> **Test**
>> All tier1 and tier2 tests passed.  No visible performance impact.  Local benchmark shows significant performance improvement for CDS, Dynamic CDS and AOT Cache archive loading, with huge archive size (>2GB).
>> 
>> Archive:
>>   - 300000 simple classes
>>   - 2000 mega-classes
>>   - 5000 FieldObject classes
>>   - Total: 307000 classes
>> 
>> AOT Cache:
>>   Times (wall):      create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
>>   Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>>   
>> Static CDS:
>>   Times (wall):      create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
>>   Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1...
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
> 
>   miss update for FileMapInfo

src/hotspot/share/cds/aotCompressedPointers.hpp line 44:

> 42:   //
> 43:   // Note: This encoding is ONLY for compact hashtable values. General pointer serialization
> 44:   // (WriteClosure/ReadClosure::do_ptr) uses raw byte offsets without scaling.

We actually use `narrowPtr` in many other places, such as in [RunTimeLambdaProxyClassKey](https://github.com/openjdk/jdk/blob/03703f347df7d3507ffeaf45e32be8bec6403b7d/src/hotspot/share/cds/lambdaProxyClassDictionary.hpp#L135-L142). Usually we do that to reduce footprint and reduce runtime pointer patching.

I think when storing an offset into the AOT cache, we should always use `narrowPtr` for uniformity. A "raw" offset such as the one returned by `ArchiveBuilder::any_to_offset()` should only be used for internal operations while building the AOT cache.

I have a [patch](https://github.com/iklam/jdk/commit/6d6b9332a5d7c18374d2d13f72a4cc00479afafd) that fixes two places (that I missed in https://github.com/openjdk/jdk/pull/29590):
- Make sure `CompactHashtableWriter::_compact_buckets` is 8-byte aligned on x64.
- Fix the decoding of vtables in the serviceability agent.
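
For reference, here is a minimal standalone sketch of the scaled-offset encoding described in the PR summary (byte offsets stored as 8-byte units in a 32-bit field). The names `kOffsetShift`, `kMaxOffsetBytes`, `encode_offset` and `decode_offset` are hypothetical and chosen only for illustration; this is not the actual `narrowPtr` code in HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not the HotSpot declarations.
constexpr unsigned kOffsetShift = 3;  // each unit represents 8 bytes
constexpr uint64_t kMaxOffsetBytes = (uint64_t{1} << 32) << kOffsetShift;
static_assert(kMaxOffsetBytes == 32ULL * 1024 * 1024 * 1024,
              "2^32 units of 8 bytes cover 32GB");

// Encode a byte offset (relative to the archive base) as a 32-bit scaled offset.
// The offset must be 8-byte aligned, so the low 3 bits carry no information.
inline uint32_t encode_offset(uint64_t offset_bytes) {
  assert((offset_bytes & ((uint64_t{1} << kOffsetShift) - 1)) == 0 &&
         "offset must be 8-byte aligned");
  assert(offset_bytes < kMaxOffsetBytes && "offset out of encodable range");
  return static_cast<uint32_t>(offset_bytes >> kOffsetShift);
}

// Decode a 32-bit scaled offset back into a byte offset.
// Widen to 64 bits before shifting so the high bits are not lost.
inline uint64_t decode_offset(uint32_t offset_units) {
  return static_cast<uint64_t>(offset_units) << kOffsetShift;
}
```

With this scheme, the byte offset `0x1000` from the PR description is stored as `0x200`. Any offset that is not 8-byte aligned cannot round-trip, which is why the alignment of structures like `_compact_buckets` matters.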

src/hotspot/share/cds/archiveBuilder.cpp line 327:

> 325:   // On 32-bit: use 256MB + AOT code size due to limited virtual address space.
> 326:   size_t buffer_size = LP64_ONLY(AOTCompressedPointers::MaxMetadataOffsetBytes)
> 327:                        NOT_LP64(256 * M + AOTCodeCache::max_aot_code_size());

When `CompactObjectHeaders` is enabled, we should reserve a smaller size to avoid an assertion failure.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2815417672
PR Review Comment: https://git.openjdk.org/jdk/pull/29494#discussion_r2815424299