RFR(M): 8145221: Use trampolines for i2i and i2c entries in Methods that are stored in CDS archive

Calvin Cheung calvin.cheung at oracle.com
Fri Mar 18 17:53:30 UTC 2016


Hi Jiangli,

Thanks for your quick review.

Let me try to answer the remaining questions below.

On 3/17/16, 10:13 PM, Ioi Lam wrote:
> Hi Jiangli,
>
> Since I wrote some of the original code, I'll try to answer some of 
> the questions below
>
> On 3/17/16 9:37 PM, Jiangli Zhou wrote:
>> Hi Calvin,
>>
>> The changes look good overall. Following are my comments and questions.
>>
>> - src/os_cpu/bsd_x86/vm/thread_bsd_x86.cpp
>> - src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.cpp
>> - src/os_cpu/linux_sparc/vm/thread_linux_sparc.cpp
>> - src/os_cpu/linux_x86/vm/thread_linux_x86.cpp
>> - src/os_cpu/solaris_sparc/vm/thread_solaris_sparc.cpp
>> - src/os_cpu/solaris_x86/vm/thread_solaris_x86.cpp
>> - src/os_cpu/windows_x86/vm/thread_windows_x86.cpp
>>    Please place the new CDS-only code under #if INCLUDE_CDS.
>>
>>   It would be a good idea to assert the address (current pc) is 
>> within the generated trampoline entry buffer.
> I am not sure what this means. Could you explain?
>>
>> - src/share/vm/interpreter/abstractInterpreter.cpp
>>    It’s not necessary to put '#include for 
>> memory/metaspaceShared.hpp' under #if INCLUDE_CDS.
I'll fix that.
>>
>> - src/share/vm/memory/metaspaceShared.cpp
>>    I see you changed ‘read_only’ to false for ‘mc’ region. Is it 
>> possible to place the trampoline code buffer in the ‘md’ region, so 
>> the ‘mc’ region does not need to be writable? It’s good to keep the 
>> ‘mc’ as read only so it would be easier to detect any ‘accidental’ 
>> writes to the region due to future changes. The ‘md’ can be made 
>> executable.
But putting code in the md region and making md region executable is 
confusing too.
I'd prefer leaving it as is for now.
>>
>> - src/share/vm/oops/method.cpp
>>    Method::unlink_method() is only called from 
>> Method::remove_unshareable_info(), which happens at dump time only. 
>> Why is ‘if (!DumpSharedSpaces)’ check needed in Method::unlink_method()?
> We can change the code to
>
>    assert(DumpSharedSpaces, "...");
>    // Set the values to what they should be at run time. Note that
>    // this Method can no longer be executed during dump time.
>    _i2i_entry = Interpreter::entry_for_cds_method(this);
>    _from_interpreted_entry = _i2i_entry;
>
> I think unlink_method used to be called during run time as well. See 
> this assert later in unlink_method
>
>   // In case of DumpSharedSpaces, _method_data should always be NULL.
>   //
>   // During runtime (!DumpSharedSpaces), when we are cleaning a
>   // shared class that failed to load, this->link_method() may
>   // have already been called (before an exception happened), so
>   // this->_method_data may not be NULL.
>   assert(!DumpSharedSpaces || _method_data == NULL, "unexpected method 
> data?");
>
> So we need to remove the "!DumpSharedSpaces ||" and fix the comments.
>> - src/share/vm/oops/method.hpp
>>    In set_interpreter_entry(), when _i2i_entry and 
>> _from_interpreted_entry are not the same as ‘entry’ for shared method?
>
> The purpose of the check is to avoid writing the same value into the 
> shared Method (which would cause the pages to be copied-on-write, even 
> if the value does not change).
>
>  475   void set_interpreter_entry(address entry) {
>  476     if (_i2i_entry != entry) {
>  477       _i2i_entry = entry;
>  478     }
>  479     if (_from_interpreted_entry != entry) {
>  480       _from_interpreted_entry = entry;
>  481     }
>  482   }
>
>> - src/share/vm/runtime/sharedRuntime.cpp
>>      Please add a comment for the change at line 2537
>> 2537     if (!DumpSharedSpaces && 
>> CodeCacheExtensions::skip_compiler_support()) {
I believe this is for handling the case when compiler is disabled during 
dumping.
I'll work with Ioi to come up with some comments.
>> - src/share/vm/oops/constMethod.hpp
>>
>>      ConstMethod does not seem to be the right place for 
>> ‘_adapter_trampoline’ since it’s set at runtime.
>
> The pointer ConstMethod::_adapter_trampoline never changes at run 
> time. Only the content pointed to by _adapter_trampoline can change.
>
> Thanks
> - Ioi
>> Do you have estimate about the size increase with the generated 
>> trampolines for the shared methods?
IIRC, the CDSAdapterHandlerEntry::init() is being called around 670 
times during dump time using the default classlist. So for 64-bit, the 
size increase for md is around 50k.

thanks,
Calvin
>>
>> Thanks,
>> Jiangli
>>
>>> On Mar 16, 2016, at 10:36 PM, Calvin Cheung 
>>> <calvin.cheung at oracle.com> wrote:
>>>
>>> Dean, Ioi,
>>>
>>> Thanks for the review and discussions.
>>>
>>> Here's an updated webrev:
>>> http://cr.openjdk.java.net/~ccheung/8145221/webrev.01/
>>>
>>> Changes in this webrev:
>>> - in thread_*.cpp, added check in JavaThread::pd_get_top_frame(). 
>>> Return if it is in the middle of a trampoline call.
>>> - moved the guarantee from metaspaceShared_sparc.cpp to the common 
>>> metaspaceShared.cpp;
>>> - changed the MIN_SHARED_MISC_DATA_SIZE and 
>>> MIN_SHARED_MISC_CODE_SIZE and related comment in metaspaceShared.hpp
>>>
>>> The last 2 changes are required due to the test failure with the 
>>> hotspot/test/runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java 
>>> test.
>>>
>>> Testing:
>>>     JPRT with testset hotspot
>>>     jtreg tests under hotspot/runtime on all platforms
>>>
>>> thanks,
>>> Calvin
>>>
>>> p.s. I corrected the bug id in the subject line from 8145521 to 8145221
>>>
>>>
>>> On 2/25/16, 10:03 AM, Ioi Lam wrote:
>>>>
>>>> On 2/22/16 12:50 PM, Dean Long wrote:
>>>>> Hi Ioi, comments inlined below.
>>>>>
>>>>> On 2/20/2016 5:22 PM, Ioi Lam wrote:
>>>>>> Hi Dean,
>>>>>>
>>>>>> Thanks for raising this issue.
>>>>>>
>>>>>> My first impression is this probably shouldn't matter (or somehow 
>>>>>> we can get away with this).
>>>>>>
>>>>>> Today, the CDS archive already contains executable code in the 
>>>>>> "misc code" section. On Linux/x64, this is typically in the 
>>>>>> address range 0x802400000-0x802409000. The code is used to 
>>>>>> dynamically patch the C++ vtables of Metadata types that are 
>>>>>> stored in the CDS archive. So even today, there's a chance that 
>>>>>> the suspended PC lands in this section.
>>>>>>
>>>>> OK, so you're saying that the trampolines don't make things any 
>>>>> worse :-)
>>>> Well, theoretically the i2i and c2i entries would be executed more 
>>>> frequently than the vtable patching code, so if there was a 
>>>> problem, it could be aggravated.
>>>>>> The code is generated inside 
>>>>>> MetaspaceShared::generate_vtable_methods and looks like this when 
>>>>>> you run with -Xshare:on:
>>>>>>
>>>>>> (gdb) x/5i 0x802400000
>>>>>>    0x802400000:    mov    $0x0,%eax
>>>>>>    0x802400005:    jmpq   0x8024084d0
>>>>>>    0x80240000a:    mov    $0x1,%eax
>>>>>>    0x80240000f:    jmpq   0x8024084d0
>>>>>>    0x802400014:    mov    $0x2,%eax
>>>>>>    ....
>>>>>> (gdb) x/16i 0x8024084d0
>>>>>>    0x8024084d0:    push   %rsi
>>>>>>    0x8024084d1:    push   %rdi
>>>>>>    0x8024084d2:    mov    %rax,%rdi
>>>>>>    0x8024084d5:    shr    $0x8,%rdi
>>>>>>    0x8024084d9:    shl    $0x3,%rdi
>>>>>>    0x8024084dd:    movabs $0x802000000,%rsi
>>>>>>    0x8024084e7:    add    %rdi,%rsi
>>>>>>    0x8024084ea:    mov    (%rsi),%rsi
>>>>>>    0x8024084ed:    pop    %rdi
>>>>>>    0x8024084ee:    mov    %rsi,(%rdi)
>>>>>>    0x8024084f1:    and    $0xff,%rax
>>>>>>    0x8024084f8:    shl    $0x3,%rax
>>>>>>    0x8024084fc:    add    %rsi,%rax
>>>>>>    0x8024084ff:    pop    %rsi
>>>>>>    0x802408500:    mov    (%rax),%rax
>>>>>>    0x802408503:    jmpq   *%rax
>>>>>>
>>>>>> In JDK9, the profiler takes a sample by first calling into 
>>>>>> JavaThread::pd_get_top_frame_for_profiling:
>>>>>>
>>>>>> (gdb) where
>>>>>> #0  frame::safe_for_sender (this=0x7ffec63895f0, 
>>>>>> thread=0x7ffff08ce000)
>>>>>> #1  0x00007ffff69d72a7 in JavaThread::pd_get_top_frame 
>>>>>> (this=0x7ffff08ce000, fr_addr=0x7ffec63897b0, 
>>>>>> ucontext=0x7ffec6287d00, isInJava=true)
>>>>>> #2  0x00007ffff69d70be in 
>>>>>> JavaThread::pd_get_top_frame_for_profiling (this=0x7ffff08ce000, 
>>>>>> fr_addr=0x7ffec63897b0, ucontext=0x7ffec6287d00,
>>>>>>     isInJava=true)
>>>>>>
>>>>>> I tried manually setting frame::_pc to 0x802400000, and it would 
>>>>>> cause frame::safe_for_sender() to return false, and thus would 
>>>>>> prevent the profiler from doing any stack walking.
>>>>>>
>>>>> Was the reason safe_for_sender() failed something that we can rely 
>>>>> on 100%?  I looked at safe_for_sender, and it's not obvious why it 
>>>>> would fail, unless the frame pointer was invalid, and I think that 
>>>>> depends on the platform and flags like PreserveFramePointer.  How 
>>>>> about filing a separate bug to track this issue?
>>>>>
>>>> OK, I fill a new bug 
>>>> https://bugs.openjdk.java.net/browse/JDK-8150663 to track this.
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>> dl
>>>>>
>>>>>> Also, both the CDS vtable patching code and the i2i trampolines 
>>>>>> are tail calls, so you will never find a PC in them except for 
>>>>>> the top-most frame. This means the check in 
>>>>>> JavaThread::pd_get_top_frame is enough. There's no need to do 
>>>>>> additional checks after we have started stack walking.
>>>>>>
>>>>>> I think the chance of landing in the CDS vtable patching methods, 
>>>>>> or in the trampolines, is pretty small, so that shouldn't bother 
>>>>>> the sampling profiler too much.
>>>>>>
>>>>>> Anyway, as part of this bug, we should add a regression test with 
>>>>>> the profiler enabled, and keep calling a method in the CDS 
>>>>>> archive in a tight loop (and manually disable compilation of that 
>>>>>> method using command-line options). The test should never crash, 
>>>>>> and if possible, it should also check that we don't have too many 
>>>>>> "unknown" samples.
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>>>>>
>>>>>> On 2/19/16 4:39 PM, Dean Long wrote:
>>>>>>> We have profiling code that will suspend a thread at random 
>>>>>>> points and try to walk the stack.
>>>>>>> I think frame::sender() will get confused if you happen to catch 
>>>>>>> a thread in the trampoline,
>>>>>>> and frame::_pc is in metadata and not the code cache.
>>>>>>>
>>>>>>> dl
>>>>>>>
>>>>>>> On 2/19/2016 2:19 PM, Calvin Cheung wrote:
>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8145221
>>>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/8145221/webrev.00/
>>>>>>>>
>>>>>>>> This optimization reduces the size of the RW region of the CDS 
>>>>>>>> archive. It also reduces the amount of pages in the RW region 
>>>>>>>> that are actually written into during runtime.
>>>>>>>>
>>>>>>>> The original prototype on the x86_64 platform was done by Ioi 
>>>>>>>> (ioi.lam at oracle.com).
>>>>>>>> I helped porting it to other platforms.
>>>>>>>> Special thanks to Goetz (goetz.lindenmaier at sap.com) who 
>>>>>>>> provided the changes to the ppc platform.
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>     JPRT with testset hotspot
>>>>>>>>     maunal testing on X86_64, x86_32 and solaris/sparcv9 with 
>>>>>>>> -XX:+PrintAssembly -XX:+PrintInterpreter
>>>>>>>>     built on the Zero platform
>>>>>>>>     tested on the openjdk aarch64 platform
>>>>>>>>     refworkload on various platform (please refer to bug report 
>>>>>>>> for data)
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>> Calvin
>


More information about the hotspot-runtime-dev mailing list