RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3]
Johan Sjölen
jsjolen at openjdk.org
Tue Feb 3 09:55:58 UTC 2026
On Mon, 2 Feb 2026 22:02:10 GMT, Xue-Lei Andrew Fan <xuelei at openjdk.org> wrote:
>> **Summary**
>> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding.
>>
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>>
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>>
>>
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 × 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>>
>> Current: address = base + offset_bytes (max ~2GB)
>> Proposed: address = base + (offset_units << 3) (max 32GB)
>>
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero – we're wasting them!
>>
>> Current byte offset (aligned to 8 bytes):
>> 0x00001000 = 0000 0000 0000 0000 0001 0000 0000 0|000
>> └── Always 000!
>>
>> Scaled offset (shift=3):
>> 0x00000200 = Same address, but stored in 29 bits instead of 32
>> Frees up 3 bits → 8x larger range!
>>
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>>
>> **Test**
>> All tier1 and tier2 tests passed, with no visible performance regression. A local benchmark shows significant archive-loading improvements for CDS, Dynamic CDS, and the AOT Cache when the archive is huge (>2GB).
>>
>> Archive:
>> - 300000 simple classes
>> - 2000 mega-classes
>> - 5000 FieldObject classes
>> - Total: 307000 classes
>>
>> AOT Cache:
>> Times (wall): create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
>> Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>>
>> Static CDS:
>> Times (wall): create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
>> Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1...
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
>
> add hotspot_resourcehogs_no_cds test group
This is a drive-by comment.
You are going to have to track whether a value is "offset-scaled" or "raw"; using an `enum class` as a strong typedef removes the risk of silently mixing the two, since the compiler will reject accidental conversions.
```c++
enum class archive_offset : uintx {};

template <typename T> static T offset_to_archived_address(archive_offset offset_units) {
  assert(offset_units != archive_offset(0), "sanity");
  uintx offset_bytes = ((uintx)offset_units) << MetadataOffsetShift;
  T p = (T)(SharedBaseAddress + offset_bytes);
  assert(Metaspace::in_aot_cache(p), "must be");
  return p;
}
```
This might be overkill, but I thought it prudent to float the idea.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3840272170
More information about the hotspot-dev
mailing list