RFR: 8352075: Perf regression accessing fields [v21]
Coleen Phillimore
coleenp at openjdk.org
Thu Jun 5 20:40:57 UTC 2025
On Tue, 3 Jun 2025 07:16:47 GMT, Radim Vansa <rvansa at openjdk.org> wrote:
>> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 trying to reduce the performance regression in some scenarios introduced in https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and memory consumption it is a (better) alternative to https://github.com/openjdk/jdk/pull/24713 .
>>
>> This PR optimizes local field lookup in classes with more than 16 fields; rather than sequentially iterating through all fields during lookup we sort the fields based on the field name. The stream includes extra table after the field information: for field at position 16, 32 ... we record the (variable-length-encoded) offset of the field info in this stream. On field lookup, rather than iterating through all fields, we iterate through this table, resolve names for given fields and continue field-by-field iteration only after the last record (hence at most 16 fields).
>>
>> In classes with <= 16 fields this PR reduces the memory consumption by 1 byte that was left with value 0 at the end of stream. In classes with > 16 fields we add extra 4 bytes with offset of the table, and the table contains one varint for each 16 fields. The terminal byte is not used either.
>>
>> My measurements on the attached reproducer
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>> Time (mean ± σ): 51.3 ms ± 2.8 ms [User: 44.7 ms, System: 13.7 ms]
>> Range (min … max): 45.1 ms … 53.9 ms 100 runs
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>> Time (mean ± σ): 78.2 ms ± 1.0 ms [User: 74.6 ms, System: 17.3 ms]
>> Range (min … max): 73.8 ms … 79.7 ms 100 runs
>>
>> (the jdk25-master above already contains JDK-8353175)
>>
>> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
>> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>> Time (mean ± σ): 38.5 ms ± 0.5 ms [User: 34.4 ms, System: 17.3 ms]
>> Range (min … max): 37.7 ms … 42.1 ms 100 runs
>>
>> While https://github.com/openjdk/jdk/pull/24713 returned the performance to previous levels, this PR improves it by 25% compared to JDK 17 (which does not contain the regression)! This time, the undisclosed production-grade reproducer shows even higher improvement:
>>
>> JDK 17: 1.6 s
>> JDK 21 (no patches): 22 s
>> JDK25-master: 12.3 s
>> JDK25-this-pr: 0.5 s
>
> Radim Vansa has updated the pull request incrementally with three additional commits since the last revision:
>
> - Moved jtreg test
> - Improved documentation
> - Fix coding style (asterisk placement)
With a lot of reading, this looks reasonable, but I still have many questions.
I'm also testing this with tier1-7.
src/hotspot/share/oops/fieldInfo.cpp line 137:
> 135: int position;
> 136: } field_pos_t;
> 137: field_pos_t* positions = nullptr;
This is unused.
src/hotspot/share/oops/fieldInfo.cpp line 164:
> 162: r.read_field_counts(&java_fields, &injected_fields);
> 163: assert(java_fields >= 0, "must be");
> 164: if (java_fields == 0 || fis->length() == 0 || static_cast<uint>(java_fields) < BinarySearchThreshold) {
I don't know why you only sort Java fields and ignore the injected fields. JavaClasses::compute_offsets calls find_local_field, so might not find an injected field, I assume in the java.lang.Class (mirror). Should this sorted cache exclude classes with injected fields? ie if injected_fields > 0?
If you exclude classes with injected fields, you could remove the javaClasses code (and maybe not have to re-sort any fields during dynamic dumping (?))
src/hotspot/share/oops/fieldInfo.cpp line 173:
> 171: PackedTableBuilder builder(fis->length() - 1, java_fields - 1);
> 172:
> 173: Array<u1>* table = MetadataFactory::new_array<u1>(loader_data, java_fields * builder.element_bytes(), CHECK_NULL);
Since this isn't used until you fill the table, can you move this down to just before the 'fill' call?
src/hotspot/share/oops/fieldInfo.cpp line 285:
> 283: FieldInfo fi;
> 284: reader.read_field_info(fi);
> 285: if (fi.field_flags().is_injected()) {
I thought that above, you only process java fields and not the injected fields?
src/hotspot/share/oops/fieldInfo.hpp line 225:
> 223: // Gadget for decoding and reading the stream of field records.
> 224: class FieldInfoReader {
> 225: UNSIGNED5::Reader<const u1*, int> _r;
I don't like this name _r but it's not your change.
src/hotspot/share/oops/fieldInfo.hpp line 238:
> 236:
> 237: private:
> 238: uint32_t next_uint() { return _r.next_uint(); }
Why did you make this change and have the callers expose _r ?
-------------
PR Review: https://git.openjdk.org/jdk/pull/24847#pullrequestreview-2901503494
PR Comment: https://git.openjdk.org/jdk/pull/24847#issuecomment-2946136847
PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129643750
PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129856223
PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129865720
PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2129890059
PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130217134
PR Review Comment: https://git.openjdk.org/jdk/pull/24847#discussion_r2130238145
More information about the hotspot-dev
mailing list