RFR: 8178287: AsyncGetCallTrace fails to traverse valid Java stacks [v4]

Ludovic Henry ludovic at datadoghq.com
Fri Aug 20 12:59:43 UTC 2021


[Sending again now with the email address properly registered to the
mailing lists]

Hello,

As it's been quite a while already, I wanted to nudge this RFR forward. Who
would be best qualified to review this PR?

Thank you!
Ludovic

On Fri, Aug 20, 2021 at 8:49 AM Ludovic Henry <ludovic at datadoghq.com> wrote:

> Hello,
>
> As it's been quite a while already, I wanted to nudge this RFR forward.
> Who would be best qualified to review this PR?
>
> Thank you!
> Ludovic
>
> On Wed, Aug 18, 2021 at 9:21 PM Marcus Hirt <hirt at openjdk.java.net> wrote:
>
>> On Mon, 19 Jul 2021 09:25:59 GMT, Ludovic Henry <luhenry at openjdk.org>
>> wrote:
>>
>> >> When the signal sent for AsyncGetCallTrace or JFR would land on a
>> runtime stub (like arraycopy), a vtable stub, or the prolog of a compiled
>> method,  it wouldn't be able to detect the sender (caller) frame for
>> multiple reasons. This patch fixes these cases through adding
>> CodeBlob-specific frame parser which are in the best position to know how a
>> frame is setup.
>> >>
>> >> The following examples have been profiled with honest-profiler which
>> uses `AsyncGetCallTrace`.
>> >>
>> >> # `Prof1`
>> >>
>> >> public class Prof1 {
>> >>
>> >>     public static void main(String[] args) {
>> >>         StringBuilder sb = new StringBuilder();
>> >>         for (int i = 0; i < 1000000; i++) {
>> >>             sb.append("ab");
>> >>             sb.delete(0, 1);
>> >>         }
>> >>         System.out.println(sb.length());
>> >>     }
>> >> }
>> >>
>> >>
>> >> - Baseline:
>> >>
>> >> Flat Profile (by method):
>> >>         (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
>> >>         (t  0.5,s  0.2) Prof1::main
>> >>         (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
>> >>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >>         (t  0.0,s  0.0)
>> java.lang.AbstractStringBuilder::ensureCapacityInternal
>> >>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
>> >>         (t  0.0,s  0.0) java.lang.String::getBytes
>> >>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> >>         (t  0.0,s  0.0) java.lang.StringBuilder::delete
>> >>         (t  0.2,s  0.0) java.lang.StringBuilder::append
>> >>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
>> >>         (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> >>
>> >> - With `StubRoutinesBlob::FrameParser`:
>> >>
>> >> Flat Profile (by method):
>> >>         (t 98.7,s 98.7)
>> java.lang.AbstractStringBuilder::ensureCapacityInternal
>> >>         (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
>> >>         (t 99.8,s  0.2) Prof1::main
>> >>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >>         (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
>> >>         (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
>> >>         (t 98.8,s  0.0) java.lang.StringBuilder::append
>> >>         (t  0.9,s  0.0) java.lang.StringBuilder::delete
>> >>
>> >>
>> >> # `Prof2`
>> >>
>> >> import java.util.function.Supplier;
>> >>
>> >> public class Prof2 {
>> >>
>> >>     public static void main(String[] args) {
>> >>         var rand = new java.util.Random(0);
>> >>         Supplier[] suppliers = {
>> >>                 () -> 0,
>> >>                 () -> 1,
>> >>                 () -> 2,
>> >>                 () -> 3,
>> >>         };
>> >>
>> >>         long sum = 0;
>> >>         for (int i = 0; i >= 0; i++) {
>> >>             sum += (int)suppliers[i % suppliers.length].get();
>> >>         }
>> >>     }
>> >> }
>> >>
>> >>
>> >> - Baseline:
>> >>
>> >> Flat Profile (by method):
>> >>         (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
>> >>         (t 39.2,s 35.2) Prof2::main
>> >>         (t  1.4,s  1.4) Prof2::lambda$main$3
>> >>         (t  1.0,s  1.0) Prof2::lambda$main$2
>> >>         (t  0.9,s  0.9) Prof2::lambda$main$1
>> >>         (t  0.7,s  0.7) Prof2::lambda$main$0
>> >>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >>         (t  0.0,s  0.0) java.lang.Thread::exit
>> >>         (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
>> >>         (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
>> >>         (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
>> >>         (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
>> >>
>> >>
>> >> - With `VtableBlob::FrameParser` and `nmethod::FrameParser`:
>> >>
>> >> Flat Profile (by method):
>> >>         (t 74.1,s 70.3) Prof2::main
>> >>         (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
>> >>         (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
>> >>         (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>> >>         (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>> >>         (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
>> >>         (t  1.2,s  1.2) Prof2::lambda$main$2
>> >>         (t  0.9,s  0.9) Prof2::lambda$main$3
>> >>         (t  0.9,s  0.9) Prof2::lambda$main$1
>> >>         (t  0.7,s  0.7) Prof2::lambda$main$0
>> >>         (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >
>> > Ludovic Henry has updated the pull request with a new target base due
>> to a merge or a rebase. The incremental webrev excludes the unrelated
>> changes brought in by the merge/rebase. The pull request contains eight
>> additional commits since the last revision:
>> >
>> >  - Remove FrameParser and the required allocation
>> >
>> >    The need for the allocation would be it non async-safe. However,
>> >    AsyncGetCallTrace is async-safe and thus can't allow for allocations.
>> >  - Merge branch 'master' of https://github.com/openjdk/jdk into
>> fix-8178287
>> >  - Fix comments
>> >  - Disable checks in FrameParser when known to be safe
>> >  - Allow AsyncGetCallTrace and JFR to unwind stack from vtable stub
>> >
>> >    The program is the following:
>> >
>> >    ```
>> >    import java.util.function.Supplier;
>> >
>> >    public class Prof2 {
>> >
>> >        public static void main(String[] args) {
>> >            var rand = new java.util.Random(0);
>> >            Supplier[] suppliers = {
>> >                    () -> 0,
>> >                    () -> 1,
>> >                    () -> 2,
>> >                    () -> 3,
>> >            };
>> >
>> >            long sum = 0;
>> >            for (int i = 0; i >= 0; i++) {
>> >                sum += (int)suppliers[i % suppliers.length].get();
>> >            }
>> >        }
>> >    }
>> >    ```
>> >
>> >    The results are as follows:
>> >
>> >    - Baseline (from previous commit):
>> >
>> >    Flat Profile (by method):
>> >                (t 39.3,s 39.3) AGCT::Unknown Java[ERR=-5]
>> >                (t 40.3,s 36.1) Prof2::main
>> >                (t  6.4,s  5.3) Prof2$$Lambda$28.0x0000000800081000::get
>> >                (t  6.1,s  5.1) Prof2$$Lambda$29.0x0000000800081220::get
>> >                (t  6.0,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>> >                (t  6.1,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>> >                (t  1.1,s  1.1) Prof2::lambda$main$2
>> >                (t  1.1,s  1.1) Prof2::lambda$main$0
>> >                (t  1.0,s  1.0) Prof2::lambda$main$1
>> >                (t  0.9,s  0.9) Prof2::lambda$main$3
>> >                (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >
>> >    - With unwind from vtable stub
>> >
>> >    Flat Profile (by method):
>> >            (t 74.1,s 70.3) Prof2::main
>> >            (t  6.5,s  5.5) Prof2$$Lambda$29.0x0000000800081220::get
>> >            (t  6.6,s  5.4) Prof2$$Lambda$28.0x0000000800081000::get
>> >            (t  5.7,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>> >            (t  5.9,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>> >            (t  4.9,s  4.9) AGCT::Unknown Java[ERR=-5]
>> >            (t  1.2,s  1.2) Prof2::lambda$main$2
>> >            (t  0.9,s  0.9) Prof2::lambda$main$3
>> >            (t  0.9,s  0.9) Prof2::lambda$main$1
>> >            (t  0.7,s  0.7) Prof2::lambda$main$0
>> >            (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >
>> >    We attribute the vtable stub to the caller and not the callee, which
>> is
>> >    already an improvement from the existing case.
>> >  - Allow AsyncGetCallTrace and JFR to unwind stack from nmethod's prolog
>> >
>> >    When sampling hits the prolog of a method, Hotspot assumes it's
>> unable
>> >    to parse the frame. This change allows to parse such frame on x86 by
>> >    specializing which instruction it's hitting in the prolog.
>> >
>> >    The results are as follows:
>> >
>> >    - Baseline:
>> >
>> >    Flat Profile (by method):
>> >            (t 60.7,s 60.7) AGCT::Unknown Java[ERR=-5]
>> >            (t 39.2,s 35.2) Prof2::main
>> >            (t  1.4,s  1.4) Prof2::lambda$main$3
>> >            (t  1.0,s  1.0) Prof2::lambda$main$2
>> >            (t  0.9,s  0.9) Prof2::lambda$main$1
>> >            (t  0.7,s  0.7) Prof2::lambda$main$0
>> >            (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >            (t  0.0,s  0.0) java.lang.Thread::exit
>> >            (t  0.9,s  0.0) Prof2$$Lambda$2.0x0000000800c00c28::get
>> >            (t  1.0,s  0.0) Prof2$$Lambda$3.0x0000000800c01000::get
>> >            (t  1.4,s  0.0) Prof2$$Lambda$4.0x0000000800c01220::get
>> >            (t  0.7,s  0.0) Prof2$$Lambda$1.0x0000000800c00a08::get
>> >
>> >    - With incomplete frame parsing:
>> >
>> >    Flat Profile (by method):
>> >            (t 39.3,s 39.3) AGCT::Unknown Java[ERR=-5]
>> >            (t 40.3,s 36.1) Prof2::main
>> >            (t  6.4,s  5.3) Prof2$$Lambda$28.0x0000000800081000::get
>> >            (t  6.1,s  5.1) Prof2$$Lambda$29.0x0000000800081220::get
>> >            (t  6.0,s  5.0) Prof2$$Lambda$27.0x0000000800080c28::get
>> >            (t  6.1,s  5.0) Prof2$$Lambda$26.0x0000000800080a08::get
>> >            (t  1.1,s  1.1) Prof2::lambda$main$2
>> >            (t  1.1,s  1.1) Prof2::lambda$main$0
>> >            (t  1.0,s  1.0) Prof2::lambda$main$1
>> >            (t  0.9,s  0.9) Prof2::lambda$main$3
>> >            (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >            (t  0.0,s  0.0) java.util.Locale::getInstance
>> >            (t  0.0,s  0.0) AGCT::Not walkable Java[ERR=-6]
>> >            (t  0.0,s  0.0)
>> jdk.internal.loader.BuiltinClassLoader::loadClassOrNull
>> >            (t  0.0,s  0.0) java.lang.ClassLoader::loadClass
>> >            (t  0.0,s  0.0) sun.net.util.URLUtil::urlNoFragString
>> >            (t  0.0,s  0.0) java.lang.Class::forName0
>> >            (t  0.0,s  0.0) java.util.Locale::initDefault
>> >            (t  0.0,s  0.0)
>> jdk.internal.loader.BuiltinClassLoader::loadClass
>> >            (t  0.0,s  0.0) jdk.internal.loader.URLClassPath::getLoader
>> >            (t  0.0,s  0.0) jdk.internal.loader.URLClassPath::getResource
>> >            (t  0.0,s  0.0) java.lang.String::toLowerCase
>> >            (t  0.0,s  0.0) sun.launcher.LauncherHelper::loadMainClass
>> >            (t  0.0,s  0.0) sun.launcher.LauncherHelper::checkAndLoadMain
>> >            (t  0.0,s  0.0) java.util.Locale::<clinit>
>> >            (t  0.0,s  0.0)
>> jdk.internal.loader.BuiltinClassLoader::findClassOnClassPathOrNull
>> >            (t  0.0,s  0.0)
>> jdk.internal.loader.ClassLoaders$AppClassLoader::loadClass
>> >            (t  0.0,s  0.0) java.lang.Class::forName
>> >
>> >    The program is as follows:
>> >
>> >    ```
>> >    import java.util.function.Supplier;
>> >
>> >    public class Prof2 {
>> >
>> >        public static void main(String[] args) {
>> >            var rand = new java.util.Random(0);
>> >            Supplier[] suppliers = {
>> >                    () -> 0,
>> >                    () -> 1,
>> >                    () -> 2,
>> >                    () -> 3,
>> >            };
>> >
>> >            long sum = 0;
>> >            for (int i = 0; i >= 0; i++) {
>> >                sum += (int)suppliers[i % suppliers.length].get();
>> >            }
>> >        }
>> >    }
>> >    ```
>> >
>> >    We see that the results are particularely useful in this case as the
>> >    methods are very short (it only returns an integer), and the
>> probability
>> >    of hitting the prolog is then very high.
>> >  - Allow AsyncGetCallTrace and JFR to walk a stub frame
>> >
>> >    When the signal sent for AsyncGetCallTrace or JFR would land on a
>> stub
>> >    (like arraycopy), it wouldn't be able to detect the sender (caller)
>> >    frame because `_cb->frame_size() == 0`.
>> >
>> >    Because we fully control how the prolog and epilog of stub code is
>> >    generated, we know there are two cases:
>> >    1. A stack frame is allocated via macroAssembler->enter(), and
>> consists
>> >    in `push rbp; mov rsp, rbp;`.
>> >    2. No stack frames are allocated and rbp is left unchanged and rsp is
>> >    decremented with the `call` instruction that push the return `pc` on
>> the
>> >    stack.
>> >
>> >    For case 1., we can easily know the sender frame by simply looking at
>> >    rbp, especially since we know that all stubs preserve the frame
>> pointer
>> >    (on x86 at least).
>> >
>> >    For case 2., we end up returning the sender's sender, but that
>> already
>> >    gives us more information than what we have today.
>> >
>> >    The results are as follows:
>> >
>> >    - Baseline:
>> >
>> >    Flat Profile (by method):
>> >            (t 99.4,s 99.4) AGCT::Unknown Java[ERR=-5]
>> >            (t  0.5,s  0.2) Prof1::main
>> >            (t  0.2,s  0.2) java.lang.AbstractStringBuilder::append
>> >            (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >            (t  0.0,s  0.0)
>> java.lang.AbstractStringBuilder::ensureCapacityInternal
>> >            (t  0.0,s  0.0) java.lang.AbstractStringBuilder::shift
>> >            (t  0.0,s  0.0) java.lang.String::getBytes
>> >            (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> >            (t  0.0,s  0.0) java.lang.StringBuilder::delete
>> >            (t  0.2,s  0.0) java.lang.StringBuilder::append
>> >            (t  0.0,s  0.0) java.lang.AbstractStringBuilder::delete
>> >            (t  0.0,s  0.0) java.lang.AbstractStringBuilder::putStringAt
>> >
>> >    - With StubRoutinesBlob::FrameParser
>> >
>> >    Flat Profile (by method):
>> >            (t 98.7,s 98.7)
>> java.lang.AbstractStringBuilder::ensureCapacityInternal
>> >            (t  0.9,s  0.9) java.lang.AbstractStringBuilder::delete
>> >            (t 99.8,s  0.2) Prof1::main
>> >            (t  0.1,s  0.1) AGCT::Unknown not Java[ERR=-3]
>> >            (t  0.0,s  0.0) AGCT::Unknown Java[ERR=-5]
>> >            (t 98.8,s  0.0) java.lang.AbstractStringBuilder::append
>> >            (t 98.8,s  0.0) java.lang.StringBuilder::append
>> >            (t  0.9,s  0.0) java.lang.StringBuilder::delete
>> >
>> >    The program is as follows:
>> >
>> >    ```
>> >    public class Prof1 {
>> >
>> >        public static void main(String[] args) {
>> >            StringBuilder sb = new StringBuilder();
>> >            for (int i = 0; i < 1000000; i++) {
>> >                sb.append("ab");
>> >                sb.delete(0, 1);
>> >            }
>> >            System.out.println(sb.length());
>> >        }
>> >    }
>> >    ```
>> >
>> >    We now account for the arraycopy stub which is called by
>> >    AbstractStringBuilder::ensureCapacityInternal. It was previously
>> ignored
>> >    because it would not know how to parse the frame for the arraycopy
>> stub
>> >    and would fall in the AGCT::Unknown Java[ERR=-5] section.
>> >
>> >    However, it still isn't perfect since it doesn't point to the
>> arraycopy stub
>> >    directly.
>> >  - Extract sender frame parsing to CodeBlock::FrameParser
>> >
>> >    Whether and how a frame is setup is controlled by the code generator
>> >    for the specific CodeBlock. The CodeBlock is then in the best place
>> to know how
>> >    to parse the sender's frame from the current frame in the given
>> CodeBlock.
>> >
>> >    This refactoring proposes to extract this parsing out of `frame` and
>> into a
>> >    `CodeBlock::FrameParser`. This FrameParser is then specialized in
>> the relevant
>> >    inherited children of CodeBlock.
>> >
>> >    This change is to largely facilitate adding new supported cases for
>> JDK-8252417
>> >    like runtime stubs.
>>
>> Ok. Do you know who should be the primary reviewer? Rickard Bäckman? Nils
>> Eliasson?
>>
>> -------------
>>
>> PR: https://git.openjdk.java.net/jdk/pull/4436
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20210820/0f42f19a/attachment-0001.htm>


More information about the serviceability-dev mailing list