[master] RFR: Load Klass* from header in interpreter (x86) [v5]
Aleksey Shipilev
shade at openjdk.java.net
Tue Oct 12 12:40:16 UTC 2021
On Wed, 6 Oct 2021 16:48:11 GMT, Roman Kennke <rkennke at openjdk.org> wrote:
>> This implements loading the compressed Klass* from the object header, instead of the Klass* field in the x86 interpreter. It does the fast-path (unlocked object) in assembly, and calls into the runtime to deal with locked objects.
>>
>> This is a proof-of-concept. It is not entirely clear if we can even call into runtime from a couple of places where load_klass() is called, especially in C2, C1 and some stubs. OTOH, we may end up not having to do any of this if we come up with a way to avoid displaced headers altogether.
>>
>> Testing:
>> - [x] tier1 (x86_32,x86_64)
>> - [x] tier2 (x86_32,x86_64)
>> - [x] hotspot_gc (x86_32,x86_64)
>
> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
>
> - Merge branch 'master' into klass-from-header-interpreter
> - Emit null-checks before load_klass with correct offset
> - Use constant instead of literal in xor-test-sequence
> - Update comment about xorb/xorq encoding
> - Load Klass* from header in interpreter (x86)
Looks okay for the experimental code.
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4554:
> 4552: // NOTE: While it would seem nice to use xorb instead (for which we don't have an encoding in our assembler),
> 4553: // the encoding for xorq uses the signed version (0x81/6) of xor, which encodes as compact as xorb would,
> 4554: // and does't make a difference performance-wise.
I can give you `xorb`, if you want, in a subsequent PR.
src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4597:
> 4595: null_check(src, oopDesc::klass_offset_in_bytes());
> 4596: }
> 4597: movptr(dst, Address(src, oopDesc::klass_offset_in_bytes()));
Do we not need to do any fixups for x86_32?
src/hotspot/cpu/x86/macroAssembler_x86.hpp line 342:
> 340:
> 341: // oop manipulations
> 342: void load_klass(Register dst, Register src, Register tmp, bool null_check_src);
I think if you define `bool null_check_src = false`, then the significant part of the changes would go away. It might not be a good idea when upstreaming this, but it would probably simplify the merges quite a bit.
src/hotspot/cpu/x86/templateTable_x86.cpp line 4185:
> 4183: Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
> 4184: __ load_klass(rdx, rdx, tmp_load_klass, false);
> 4185: __ jmp(resolved);
Have you run into troubles with `jmpb` short branch here?
-------------
Marked as reviewed by shade (Committer).
PR: https://git.openjdk.java.net/lilliput/pull/15
More information about the lilliput-dev
mailing list