RFR(M): 8145521: Use trampolines for i2i and i2c entries in Methods that are stored in CDS archive
Ioi Lam
ioi.lam at oracle.com
Thu Feb 25 18:03:33 UTC 2016
On 2/22/16 12:50 PM, Dean Long wrote:
> Hi Ioi, comments inlined below.
>
> On 2/20/2016 5:22 PM, Ioi Lam wrote:
>> Hi Dean,
>>
>> Thanks for raising this issue.
>>
>> My first impression is this probably shouldn't matter (or somehow we
>> can get away with this).
>>
>> Today, the CDS archive already contains executable code in the "misc
>> code" section. On Linux/x64, this is typically in the address range
>> 0x802400000-0x802409000. The code is used to dynamically patch the
>> C++ vtables of Metadata types that are stored in the CDS archive. So
>> even today, there's a chance that the suspended PC lands in this
>> section.
>>
>
> OK, so you're saying that the trampolines don't make things any worse :-)
Well, theoretically the i2i and c2i entries would be executed more
frequently than the vtable patching code, so if there was a problem, it
could be aggravated.
>
>> The code is generated inside MetaspaceShared::generate_vtable_methods
>> and looks like this when you run with -Xshare:on:
>>
>> (gdb) x/5i 0x802400000
>> 0x802400000: mov $0x0,%eax
>> 0x802400005: jmpq 0x8024084d0
>> 0x80240000a: mov $0x1,%eax
>> 0x80240000f: jmpq 0x8024084d0
>> 0x802400014: mov $0x2,%eax
>> ....
>> (gdb) x/16i 0x8024084d0
>> 0x8024084d0: push %rsi
>> 0x8024084d1: push %rdi
>> 0x8024084d2: mov %rax,%rdi
>> 0x8024084d5: shr $0x8,%rdi
>> 0x8024084d9: shl $0x3,%rdi
>> 0x8024084dd: movabs $0x802000000,%rsi
>> 0x8024084e7: add %rdi,%rsi
>> 0x8024084ea: mov (%rsi),%rsi
>> 0x8024084ed: pop %rdi
>> 0x8024084ee: mov %rsi,(%rdi)
>> 0x8024084f1: and $0xff,%rax
>> 0x8024084f8: shl $0x3,%rax
>> 0x8024084fc: add %rsi,%rax
>> 0x8024084ff: pop %rsi
>> 0x802408500: mov (%rax),%rax
>> 0x802408503: jmpq *%rax
>>
>> In JDK9, the profiler takes a sample by first calling into
>> JavaThread::pd_get_top_frame_for_profiling:
>>
>> (gdb) where
>> #0 frame::safe_for_sender (this=0x7ffec63895f0, thread=0x7ffff08ce000)
>> #1 0x00007ffff69d72a7 in JavaThread::pd_get_top_frame
>> (this=0x7ffff08ce000, fr_addr=0x7ffec63897b0,
>> ucontext=0x7ffec6287d00, isInJava=true)
>> #2 0x00007ffff69d70be in JavaThread::pd_get_top_frame_for_profiling
>> (this=0x7ffff08ce000, fr_addr=0x7ffec63897b0, ucontext=0x7ffec6287d00,
>> isInJava=true)
>>
>> I tried manually setting frame::_pc to 0x802400000, and it would
>> cause frame::safe_for_sender() to return false, and thus would
>> prevent the profiler from doing any stack walking.
>>
>
> Was the reason safe_for_sender() failed something that we can rely on
> 100%? I looked at safe_for_sender, and it's not obvious why it would
> fail, unless the frame pointer was invalid, and I think that depends
> on the platform and flags like PreserveFramePointer. How about filing
> a separate bug to track this issue?
>
OK, I fill a new bug https://bugs.openjdk.java.net/browse/JDK-8150663 to
track this.
Thanks
- Ioi
> dl
>
>> Also, both the CDS vtable patching code and the i2i trampolines are
>> tail calls, so you will never find a PC in them except for the
>> top-most frame. This means the check in JavaThread::pd_get_top_frame
>> is enough. There's no need to do additional checks after we have
>> started stack walking.
>>
>> I think the chance of landing in the CDS vtable patching methods, or
>> in the trampolines, is pretty small, so that shouldn't bother the
>> sampling profiler too much.
>>
>> Anyway, as part of this bug, we should add a regression test with the
>> profiler enabled, and keep calling a method in the CDS archive in a
>> tight loop (and manually disable compilation of that method using
>> command-line options). The test should never crash, and if possible,
>> it should also check that we don't have too many "unknown" samples.
>>
>> Thanks
>> - Ioi
>>
>>
>> On 2/19/16 4:39 PM, Dean Long wrote:
>>> We have profiling code that will suspend a thread at random points
>>> and try to walk the stack.
>>> I think frame::sender() will get confused if you happen to catch a
>>> thread in the trampoline,
>>> and frame::_pc is in metadata and not the code cache.
>>>
>>> dl
>>>
>>> On 2/19/2016 2:19 PM, Calvin Cheung wrote:
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8145221
>>>> webrev: http://cr.openjdk.java.net/~ccheung/8145221/webrev.00/
>>>>
>>>> This optimization reduces the size of the RW region of the CDS
>>>> archive. It also reduces the amount of pages in the RW region that
>>>> are actually written into during runtime.
>>>>
>>>> The original prototype on the x86_64 platform was done by Ioi
>>>> (ioi.lam at oracle.com).
>>>> I helped porting it to other platforms.
>>>> Special thanks to Goetz (goetz.lindenmaier at sap.com) who provided
>>>> the changes to the ppc platform.
>>>>
>>>> Testing:
>>>> JPRT with testset hotspot
>>>> maunal testing on X86_64, x86_32 and solaris/sparcv9 with
>>>> -XX:+PrintAssembly -XX:+PrintInterpreter
>>>> built on the Zero platform
>>>> tested on the openjdk aarch64 platform
>>>> refworkload on various platform (please refer to bug report for
>>>> data)
>>>>
>>>> thanks,
>>>> Calvin
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list