LRB and 32-bit compressed oops
Roland Westrelin
rwestrel at redhat.com
Fri Mar 29 08:32:35 UTC 2019
> Run with -Xmx20g, thus enabling compressed oops, you shall see this:
>
> [Verified Entry Point]
> 6.94% 0x00007f60c0497050: mov %eax,-0x14000(%rsp)
> 5.80% 0x00007f60c0497057: push %rbp
> 0.30% 0x00007f60c0497058: sub $0x10,%rsp
> 11.81% 0x00007f60c049705c: mov 0xc(%rsi),%r11d
> 0.82% 0x00007f60c0497060: mov %r11,%r9
> 0.48% 0x00007f60c0497063: shl $0x3,%r9
> .......................... LRB fastpath check ..........................
> 5.29% 0x00007f60c0497067: testb $0x1,0x20(%r15)
> 5.49% ╭ 0x00007f60c049706c: jne 0x00007f60c0497086
> .........│......... LRB fastpath ends, store to %r9 follows ............
> 0.87% │↗ ↗↗ 0x00007f60c049706e: movl $0x2a,0xc(%r9)
> 7.59% ││ ││ 0x00007f60c0497076: add $0x10,%rsp
> 6.12% ││ ││ 0x00007f60c049707a: pop %rbp
> 1.01% ││ ││ 0x00007f60c049707b: mov 0x108(%r15),%r10
> 0.63% ││ ││ 0x00007f60c0497082: test %eax,(%r10)
> 6.73% ││ ││ 0x00007f60c0497085: retq
> ---------││-││----------- LRB midpath starts --------------------------
> .........│|.|│............ checking in-cset ...........................
> ↘│ ││ 0x00007f60c0497086: mov %r9,%r10
> │ ││ 0x00007f60c0497089: shr $0x17,%r10
> │ ││ 0x00007f60c049708d: movabs $0x7f60d00919f0,%r8
> │ ││ 0x00007f60c0497097: cmpb $0x0,(%r8,%r10,1)
> ╰ ││ 0x00007f60c049709c: je 0x00007f60c049706e
> ............││............ checking is-forwarded ......................
> ││ 0x00007f60c049709e: mov -0x8(%r12,%r11,8),%r9
> ││ 0x00007f60c04970a3: lea (%r12,%r11,8),%r10
> ││ 0x00007f60c04970a7: cmp %r10,%r9
> ╰│ 0x00007f60c04970aa: jne 0x00007f60c049706e
> .............│............... slow path call ..........................
> │ 0x00007f60c04970ac: mov %r9,%rdi
> │ 0x00007f60c04970af: movabs $0x7f60d7775030,%r10
> │ 0x00007f60c04970b9: callq *%r10
> │ 0x00007f60c04970bc: mov %rax,%r9
> ╰ 0x00007f60c04970bf: jmp 0x00007f60c049706e
So why not store the forwarding pointer compressed? Decoding would then
happen after the LRB. So this code:
0.82% 0x00007f60c0497060: mov %r11,%r9
0.48% 0x00007f60c0497063: shl $0x3,%r9
would fold into the following access:
0.87% │↗ ↗↗ 0x00007f60c049706e: movl $0x2a,0xc(%r9)
and would be essentially free. I suppose this:
↘│ ││ 0x00007f60c0497086: mov %r9,%r10
│ ││ 0x00007f60c0497089: shr $0x17,%r10
could be adjusted so there's no need to decode the value here. And
decoding here:
││ 0x00007f60c049709e: mov -0x8(%r12,%r11,8),%r9
is already folded into the forwarding pointer access.
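
To make the arithmetic behind the two points above concrete, here is a small sketch. It is not HotSpot code. It assumes a zero heap base and a shift of 3 (suggested by the bare `shl $0x3`), a region size of 2**23 bytes (suggested by `shr $0x17`), and the 0xc field offset from the store in the listing. All function names are made up for illustration.

```python
# Sketch only: models the address arithmetic, not the actual JIT output.
# Assumed values, read off the listing: zero heap base, oop shift 3,
# region shift 0x17, field offset 0xc.
OOP_SHIFT = 3
REGION_SHIFT = 0x17
FIELD_OFFSET = 0xC


def decode(narrow, heap_base=0):
    """Turn a compressed oop into a full address."""
    return heap_base + (narrow << OOP_SHIFT)


def field_address_decoded_first(narrow):
    """What the listing does today: decode (mov + shl), then 0xc(%r9)."""
    oop = decode(narrow)
    return oop + FIELD_OFFSET


def field_address_folded(narrow, heap_base=0):
    """Same address as one base + index*8 + disp operand, no separate shl."""
    return heap_base + narrow * 8 + FIELD_OFFSET


def region_index_from_oop(oop):
    """In-cset table index as computed today: decoded address >> 0x17."""
    return oop >> REGION_SHIFT


def region_index_from_narrow(narrow):
    """Adjusted check: shift the compressed value by 0x17 - 3 instead."""
    return narrow >> (REGION_SHIFT - OOP_SHIFT)
```

With a zero base, `region_index_from_narrow(n)` and `region_index_from_oop(decode(n))` agree for every non-negative `n`, since shifting left by 3 and then right by 23 is the same as shifting right by 20. A non-zero heap base would need the table bias adjusted as well, which this sketch does not cover.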
I suppose this would also help the other case you mention, where decoding
is a no-op.
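
As a rough illustration of why the shift would stop mattering to the barrier, here is a toy model of an LRB that works on compressed values only and leaves decoding to the access that follows. The dictionaries, the `in_cset` and `slow_path` callbacks, and the function names are all hypothetical stand-ins, not Shenandoah internals.

```python
# Toy model, not Shenandoah code: the barrier sees only compressed values.
def lrb_narrow(narrow, fwd, in_cset, slow_path):
    """Return the compressed oop to use for the access that follows.

    fwd maps a compressed oop to its compressed forwarding pointer
    (an object that is not forwarded maps to itself).
    """
    if not in_cset(narrow):        # mid-path: not in collection set
        return narrow
    forwardee = fwd[narrow]        # forwarding pointer, stored compressed
    if forwardee != narrow:        # already forwarded: use the copy
        return forwardee
    return slow_path(narrow)       # not yet forwarded: call the runtime


def store_field(heap, narrow, offset, value, shift):
    """The access decodes as part of computing its address."""
    heap[(narrow << shift) + offset] = value
```

In this model the barrier returns the same compressed value whether the shift is 3 or 0; only `store_field` cares about the shift.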
Roland.