RFR: 8334230: Optimize C2 classes layout
Neethu Prasad
nprasad at openjdk.org
Fri Jul 5 15:01:10 UTC 2024
**Notes**
Rearrange C2 class fields to optimize footprint.
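As a rough illustration of why field order affects footprint, the sketch below uses two hypothetical structs (`PaddedExample` and `PackedExample`, neither of which is a HotSpot class) that hold the same members in different orders. It assumes a conventional ABI where each member is aligned to its own size; exact sizes are platform-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example, not taken from the patch: small fields are
// interleaved with larger ones, so the compiler inserts alignment holes.
struct PaddedExample {
  bool    _flag_a;  // 1 byte, usually followed by a hole before the pointer
  void*   _ptr;
  bool    _flag_b;  // 1 byte, usually followed by a hole before the int
  int32_t _count;
};

// Same members, ordered from largest alignment to smallest, so the only
// unused bytes left are tail padding.
struct PackedExample {
  void*   _ptr;
  int32_t _count;
  bool    _flag_a;
  bool    _flag_b;
};

// Bytes of the object that belong to no member (holes plus tail padding).
// Valid only for the two example structs above, which share these members.
template <typename T>
constexpr std::size_t wasted_bytes() {
  return sizeof(T) - (sizeof(void*) + sizeof(int32_t) + 2 * sizeof(bool));
}

// On a typical LP64 target this is 16 versus 24 bytes; the check itself
// only requires that reordering never makes the object larger.
static_assert(sizeof(PackedExample) <= sizeof(PaddedExample),
              "reordered layout should not be larger");
```

Calling `wasted_bytes<PaddedExample>()` and `wasted_bytes<PackedExample>()` gives a quick before/after comparison similar in spirit to the holes and padding that pahole reports.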
**Verification**
1. Ran tier2_compiler, hotspot_compiler, tier 1 & tier 2 tests.
2. Ran pahole on a 64-bit machine after the reordering and verified that the holes are gone and the total size of each class is reduced.
| Class | Size (bytes) | Cachelines | Sum Members | Holes | Sum holes | Last Cacheline (bytes) | Padding (bytes) |
| ----- | ------------ | ---------- | ----------- | ----- | --------- | ---------------------- | --------------- |
| ArrayPointer | 56 -> 48 | 1 -> 1 | 45 -> 0 | 2 -> 0 | 11 -> 0 | 56 -> 48 | 0 -> 3 |
| CallJavaNode | 152 -> 144 | 3 -> 3 | 12 -> 0 | 1 -> 0 | 5 -> 0 | 24 -> 16 | 7 -> 4 |
| C2Access | 56 -> 48 | 1 -> 1 | 42 -> 0 | 1 -> 0 | 7 -> 0 | 56 -> 48 | 7 -> 6 |
| VectorSet | 32 -> 24 | 1 -> 1 | 24 -> 0 | 1 -> 0 | 8 -> 0 | 32 -> 24 | 1 -> 1 |
class ArrayPointer {
const class Node * _pointer; /* 0 8 */
const class Node * _base; /* 8 8 */
const jlong _constant_offset; /* 16 8 */
const class Node * _int_offset; /* 24 8 */
const class GrowableArray<Node*> * _other_offsets; /* 32 8 */
const jint _int_offset_shift; /* 40 4 */
const bool _is_valid; /* 44 1 */
public:
/* size: 48, cachelines: 1, members: 7 */
/* padding: 3 */
/* last cacheline: 48 bytes */
};
class CallJavaNode : public CallNode {
public:
/* class CallNode <ancestor>; */ /* 0 128 */
protected:
/* --- cacheline 2 boundary (128 bytes) --- */
class ciMethod * _method; /* 128 8 */
bool _optimized_virtual; /* 136 1 */
bool _method_handle_invoke; /* 137 1 */
bool _override_symbolic_info; /* 138 1 */
bool _arg_escape; /* 139 1 */
public:
protected:
public:
/* size: 144, cachelines: 3, members: 6 */
/* padding: 4 */
/* last cacheline: 16 bytes */
/* BRAIN FART ALERT! 144 bytes != 12 (member bytes) + 0 (member bits) + 0 (byte holes) + 0 (bit holes), diff = 1024 bits */
};
class C2Access : public StackObj {
public:
/* class StackObj <ancestor>; */ /* 0 0 */
/* XXX last struct has 1 byte of padding */
int ()(void) * * _vptr.C2Access; /* 0 8 */
protected:
DecoratorSet _decorators; /* 8 8 */
class Node * _base; /* 16 8 */
class C2AccessValuePtr & _addr; /* 24 8 */
class Node * _raw_access; /* 32 8 */
enum BasicType _type; /* 40 1 */
uint8_t _barrier_data; /* 41 1 */
public:
protected:
public:
/* size: 48, cachelines: 1, members: 8 */
/* padding: 6 */
/* paddings: 1, sum paddings: 1 */
/* last cacheline: 48 bytes */
};
class VectorSet : public AnyObj {
public:
/* class AnyObj <ancestor>; */ /* 0 0 */
/* XXX last struct has 1 byte of padding */
static const uint word_bits; /* 0 0 */
static const uint bit_mask; /* 0 0 */
uint _size; /* 0 4 */
uint _data_size; /* 4 4 */
uint32_t * _data; /* 8 8 */
class Arena * _set_arena; /* 16 8 */
/* size: 24, cachelines: 1, members: 5, static members: 2 */
/* paddings: 1, sum paddings: 1 */
/* last cacheline: 24 bytes */
};
I wrote a simple program that just assigns an integer value to a variable and observed the following instance counts:
Number of ArrayPointer instances = 58.
Number of C2Access instances = 1390.
Number of CallJavaNode instances = 1626.
58 * 8 bytes + 1390 * 8 bytes + 1626 * 8 bytes = 24,592 bytes (about 24 KB)
That is a saving of at least 24 KB even for this trivial program, and the footprint savings should be larger for more complex programs.
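The arithmetic above can be double-checked with a small sketch. The instance counts and the 8 bytes saved per instance come from the figures in this message; the `ClassSaving` type and `total_bytes_saved` helper are hypothetical names introduced only for illustration. VectorSet is left out because no instance count was reported for it.

```cpp
#include <cstddef>

// Hypothetical record for the check: one row per class measured above.
struct ClassSaving {
  const char* name;
  std::size_t instances;                // instances observed in the test program
  std::size_t bytes_saved_per_instance; // old size minus new size
};

constexpr ClassSaving kSavings[] = {
  {"ArrayPointer", 58,   8},  // 56 -> 48 bytes
  {"C2Access",     1390, 8},  // 56 -> 48 bytes
  {"CallJavaNode", 1626, 8},  // 152 -> 144 bytes
};

// Sum instances * bytes saved over all rows (requires C++14 or later).
constexpr std::size_t total_bytes_saved() {
  std::size_t total = 0;
  for (const ClassSaving& s : kSavings) {
    total += s.instances * s.bytes_saved_per_instance;
  }
  return total;
}

static_assert(total_bytes_saved() == 24592, "about 24 KB in total");
```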
-------------
Commit messages:
- 8334230: Keep constructor order same as before & optimize VectorSet
- 8334230: Optimize C2 classes layout
Changes: https://git.openjdk.org/jdk/pull/19861/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19861&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8334230
Stats: 20 lines in 4 files changed: 8 ins; 8 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/19861.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/19861/head:pull/19861
PR: https://git.openjdk.org/jdk/pull/19861