RFR: 8376125: Out of memory in the CDS archive error with lot of classes [v3]

Thomas Stuefe stuefe at openjdk.org
Tue Feb 3 07:23:08 UTC 2026


On Mon, 2 Feb 2026 22:02:10 GMT, Xue-Lei Andrew Fan <xuelei at openjdk.org> wrote:

>> **Summary**
>> This change extends the CDS/AOT archive size limit from 2GB to 32GB by using scaled offset encoding.
>> 
>> **Problem**
>> Applications with a large number of classes (e.g., 300,000+) can exceed the current 2GB archive size limit, causing archive creation to fail with:
>> 
>> [error][aot] Out of memory in the CDS archive: Please reduce the number of shared classes.
>> 
>> 
>> **Solution**
>> Instead of storing raw byte offsets in u4 fields (limited to ~2GB), we now store scaled offset units where each unit represents 8 bytes (OFFSET_SHIFT = 3). This allows addressing up to 32GB (2^32 × 8 bytes) while maintaining backward compatibility with the existing u4 offset fields.
>> 
>> Current:   address = base + offset_bytes           (max ~2GB)
>> Proposed:  address = base + (offset_units << 3)    (max 32GB)
>> 
>> All archived objects are guaranteed to be 8-byte aligned. This means the lower 3 bits of any valid byte offset are always zero – we're wasting them!
>> 
>> Current byte offset (aligned to 8 bytes):
>>   0x00001000  =  0000 0000 0000 0000 0001 0000 0000 0|000
>>                                                       └── Always 000!
>> 
>> Scaled offset (shift=3):
>>   0x00000200  =  Same address, but stored in 29 bits instead of 32
>>                  Frees up 3 bits → 8x larger range!
>> 
>> By storing `offset_bytes >> 3` instead of `offset_bytes`, we use all 32 bits of the u4 field to represent meaningful data, extending the addressable range from 2GB to 32GB.
>> 
>> **Test**
>> All tier1 and tier2 tests passed.  No visible performance regression for normal-sized archives.  A local benchmark shows a significant loading-time improvement for CDS, dynamic CDS, and AOT cache archives when the archive is huge (>2GB).
>> 
>> Archive:
>>   - 300000 simple classes
>>   - 2000 mega-classes
>>   - 5000 FieldObject classes
>>   - Total: 307000 classes
>> 
>> AOT Cache:
>>   Times (wall):      create=250020ms verify=2771ms baseline=15470ms perf_with_aot=2388ms
>>   Times (classload): verify=965ms baseline=14771ms perf_with_aot=969ms
>>   
>> Static CDS:
>>   Times (wall):      create=161859ms verify=2055ms baseline=15592ms perf_with_cds=1996ms
>>   Times (classload): verify=1027ms baseline=14852ms perf_with_cds=1...
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add hotspot_resourcehogs_no_cds test group

Looking more closely at the issue and the provided example, the classes seem to be both numerous and monstrous. How realistic is this scenario? Such objects would pose other challenges too, e.g. for the GC.

We can, and should, certainly make the dividing line between CDS and class space more fluid to allow for a larger CDS at the cost of the class space. As @iklam wrote, 3.5 GB, possibly even more, can be done.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29494#issuecomment-3839553187


More information about the hotspot-dev mailing list