[PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output

Schmidt, Lutz lutz.schmidt at sap.com
Mon May 20 08:22:30 UTC 2019


Hi Vladimir, 

I appreciate the time you spent making this change "push-worthy". Thank you for reviewing it!

I am now hoping for a second reviewer to set aside some time and have a look. 

Thanks,
Lutz

On 17.05.19, 16:48, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:

    Good.
    
    Thanks
    Vladimir
    
    > On May 17, 2019, at 6:19 AM, Schmidt, Lutz <lutz.schmidt at sap.com> wrote:
    > 
    > Hi Vladimir, 
    > here is what I changed to overcome the ZERO build issues:
    > 
    > ----------- 8< ----------------
    > diff -r c0568c492760 src/hotspot/cpu/zero/assembler_zero.hpp
    > --- a/src/hotspot/cpu/zero/assembler_zero.hpp    Fri May 17 14:11:44 2019 +0200
    > +++ b/src/hotspot/cpu/zero/assembler_zero.hpp    Fri May 17 14:14:06 2019 +0200
    > @@ -37,6 +37,12 @@
    > 
    >  public:
    >   void pd_patch_instruction(address branch, address target, const char* file, int line);
    > +
    > +  //---<  calculate length of instruction  >---
    > +  static unsigned int instr_len(unsigned char *instr) { return 1; }
    > +
    > +  //---<  longest instructions  >---
    > +  static unsigned int instr_maxlen() { return 1; }
    > };
    > 
    > class MacroAssembler : public Assembler {
    > diff -r c0568c492760 src/hotspot/share/compiler/abstractDisassembler.cpp
    > --- a/src/hotspot/share/compiler/abstractDisassembler.cpp    Fri May 17 14:11:44 2019 +0200
    > +++ b/src/hotspot/share/compiler/abstractDisassembler.cpp    Fri May 17 14:14:06 2019 +0200
    > @@ -61,6 +61,9 @@
    > bool AbstractDisassembler::_show_bytes  = false; // set "true" to see what's in memory bit by bit
    >                                                  // might prove cumbersome because instr_len is hard to find on x86
    > #endif
    > +#if defined(ZERO)
    > +bool AbstractDisassembler::_show_bytes  = false; // set "true" to see what's in memory bit by bit
    > +#endif
    > 
    > // Return #bytes printed. Callers may use that for output alignment.
    > // Print instruction address, and offset from blob begin.
    > ----------- >8 ----------------
    > 
    > This delta is contained as the only change in the new webrev#04 which is based on the current (13:10 GMT) jdk/jdk repo:
    > https://cr.openjdk.java.net/~lucy/webrevs/8213084.04/
    > 
    > Regards,
    > Lutz
    > 
    > 
    > On 16.05.19, 21:40, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    > 
    >    I am not sure about exact parameters but I see build testing uses next:
    > 
    >    configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc
    > 
    >    Vladimir
    > 
    >>    On 5/16/19 12:22 PM, Schmidt, Lutz wrote:
    >> Hi Vladimir,
    >> 
    >> thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"?
    >> 
    >> I will prepare a new webrev with just these two additions as delta. But it'll be not before Friday morning, my time.
    >> 
    >> Thanks,
    >> Lutz
    >> 
    >> On 16.05.19, 20:38, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >> 
    >>     linux-x64-zero build is broke:
    >> 
    >>     workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler'
    >>           int instr_size_in_bytes = Assembler::instr_len(pos);
    >>                                                ^~~~~~~~~
    >>     Other builds and testing are good.
    >> 
    >>     Thanks,
    >>     Vladimir
    >> 
    >>>     On 5/16/19 9:47 AM, Vladimir Kozlov wrote:
    >>> Nice.
    >>> 
    >>> I submitted our tier1-3 testing.
    >>> 
    >>> Thanks,
    >>> Vladimir
    >>> 
    >>>> On 5/16/19 2:55 AM, Schmidt, Lutz wrote:
    >>>> Hi Vladimir,
    >>>> 
    >>>> sorry for the delayed reaction on your comments.
    >>>> 
    >>>>  - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list
    >>>> of modified files.
    >>>> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In
    >>>> addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last
    >>>> night).
    >>>>  - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter
    >>>> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore.
    >>>>  - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that.
    >>>>  - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/
    >>>> 
    >>>> ~thartmann:
    >>>> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it
    >>>> work reliably. I would like to keep it in as a reminder and a template to build on.
    >>>> 
    >>>> Thanks,
    >>>> Lutz
    >>>> 
    >>>> On 10.05.19, 23:16, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >>>> 
    >>>>     Hi Lutz,
    >>>>     My comments are inlined below.
    >>>>>     On 5/10/19 8:44 AM, Schmidt, Lutz wrote:
    >>>>> Thank you, Vladimir!
    >>>>> Please find my comments inline and let me know what you think.
    >>>>> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/
    >>>>     Found one more I missed last time:
    >>>>     assembler_s390.hpp: still singed return (on other platforms it was converted to unsigned):
    >>>>       static int instr_len(unsigned char *instr);
    >>>>> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a
    >>>> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there
    >>>> were no serious clashes) and sent a short note.
    >>>>     NP. Please, provide final webrev when you can so that I can run these changes through our testing to
    >>>>     make sure no issues are present (especially in builds).
    >>>>> Regards,
    >>>>> Lutz
    >>>>> 
    >>>>> On 09.05.19, 21:30, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >>>>> 
    >>>>>     Hi Lutz,
    >>>>> 
    >>>>>     Thank you for doing this great work.
    >>>>> 
    >>>>>     I have just small comments:
    >>>>> 
    >>>>>     x86_64.ad - empty change.
    >>>>> File contains whitespace changes for formatting. Not visible in webrev.
    >>>>     Okay.
    >>>>> 
    >>>>>     nmethod.cpp - LUCY?
    >>>>> 
    >>>>>     +        st->print_cr("LUCY: NULL-oop");
    >>>>>     +   tty->print("LUCY NULL-oop");
    >>>>> Oops. Leftover debugging output. Removed. Reads "NULL-oop" now.
    >>>>     Okay.
    >>>>> 
    >>>>>     nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'.
    >>>>> Changed.
    >>>>> 
    >>>>>     vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value().
    >>>>> Changed.
    >>>>> 
    >>>>>     disassembler.* - LUCY_OBSOLETE?
    >>>>> 
    >>>>>     +#if defined(LUCY_OBSOLETE)  // Used in SAPJVM only
    >>>>> This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction
    >>>> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it
    >>>> out. But now I did. It's gone.
    >>>>     Okay.
    >>>>> 
    >>>>>     compilerDefinitions.hpp - I don't see where tier_digit() is used.
    >>>>> I'm surprised myself. Introduced it and then made it obsolete. It's gone.
    >>>>> 
    >>>>>     disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You
    >>>>>     need to check for invalid value and print help output in such case - it will be very useful if you
    >>>>>     forgot a value spelling. Also add line for 'help' value.
    >>>>> 
    >>>>> The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings__. Actually,
    >>>> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is
    >>>> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers)
    >>>> like. Not changed as of now. Awaiting your input.
    >>>>     It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is
    >>>>     useless because PrintAssemblyOptions is used only for disassembler and there are no values which
    >>>>     don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix).
    >>>>     And an other commented new line:
    >>>>     +  // ost->print_cr("PrintAssemblyOptions='%s'", options());
    >>>>> 
    >>>>> Printing help text: There is an option (hsdis-help) to request help text printout. >
    >>>>> Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found -
    >>>> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more
    >>>> intelligence. I can do that, but would like to do it in a separate effort. Your opinion?
    >>>>     Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use
    >>>>     if-else or switch without complicating the code.
    >>>>     I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use
    >>>>     here? I don't think next comment is correct for ccstr type:
    >>>>     http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190
    >>>>     It would be nice to fix it but you can do it later if you don't want to add more changes.
    >>>>> 
    >>>>>     Do you need next commented lines:
    >>>>> 
    >>>>>     disassembler.cpp -
    >>>>>     +//  ptrdiff_t     _offset;
    >>>>> Deleted.
    >>>>> 
    >>>>>     +//      Output suppressed because it messes up disassembly.
    >>>>>     +//      output()->print_cr("[Disassembling for mach='%s']", (const char*)arg);
    >>>>> Uncommented, would like to keep it. Made the if condition permanently false.
    >>>>> 
    >>>>>     disassembler_s390.cpp -
    >>>>>     +//    st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize);
    >>>>> Deleted.
    >>>>> 
    >>>>>     compile.cpp -
    >>>>>     +//  st->print("#  ");  _tf->dump_on(st);  st->cr();
    >>>>> Uncommented.
    >>>>> 
    >>>>> 
    >>>>>     abstractDisassembler.cpp -
    >>>>>     //                  st->print("0x%016lx", *((julong*)here));
    >>>>>                        st->print("0x%016lx", *((uintptr_t*)here));
    >>>>>     //                  st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4)));
    >>>>> Commented lines are gone.
    >>>>> 
    >>>>>     abstractDisassembler.cpp - may be explicit cast (byte*)?:
    >>>>> 
    >>>>>              st->print("%2.2x", *byte);
    >>>>>              st->print("%2.2x", *pos);
    >>>>>                      st->print("0x%02x", *here);
    >>>>> Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com,
    >>>> std::byte is a C++17 feature. We are not there yet.
    >>>>     okay
    >>>>> 
    >>>>>     PTR64_FORMAT ?:
    >>>>>                        st->print("0x%016lx", *((uintptr_t*)here));
    >>>>> I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters.
    >>>> Changed other occurrences, so I changed this one as well.
    >>>>     Thanks,
    >>>>     Vladimir
    >>>>> 
    >>>>> 
    >>>>>     Thanks,
    >>>>>     Vladimir
    >>>>> 
    >>>>>>     On 5/8/19 8:31 AM, Schmidt, Lutz wrote:
    >>>>>> Dear Community,
    >>>>>> 
    >>>>>> may I please request comments and reviews for this change? Thank you!
    >>>>>> 
    >>>>>> I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The
    >>>> code which constitutes this patch was not changed. Here's the webrev link:
    >>>>>> https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/
    >>>>>> 
    >>>>>> Regards,
    >>>>>> Lutz
    >>>>>> 
    >>>>>> On 11.04.19, 23:24, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
    >>>>>> 
    >>>>>>     Dear All,
    >>>>>> 
    >>>>>>     this topic was discussed back in Nov/Dec 2018:
    >>>>>>     http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html
    >>>>>>     http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html
    >>>>>> 
    >>>>>>     Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable.
    >>>>>>     The result was mixed, some pro, some con. I let the input from back then influence my work of the
    >>>> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general
    >>>> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the
    >>>> list (with default settings):
    >>>>>> 
    >>>>>>     PrintAssemblyOptions help:
    >>>>>>       hsdis-print-raw       test plugin by requesting raw output (deprecated)
    >>>>>>       hsdis-print-raw-xml   test plugin by requesting raw xml (deprecated)
    >>>>>>       hsdis-print-pc        turn off PC printing (on by default) (deprecated)
    >>>>>>       hsdis-print-bytes     turn on instruction byte output (deprecated)
    >>>>>> 
    >>>>>>       hsdis-show-pc            toggle printing current pc,        currently ON
    >>>>>>       hsdis-show-offset        toggle printing current offset,    currently OFF
    >>>>>>       hsdis-show-bytes         toggle printing instruction bytes, currently OFF
    >>>>>>       hsdis-show-data-hex      toggle formatting data as hex,     currently ON
    >>>>>>       hsdis-show-data-int      toggle formatting data as int,     currently OFF
    >>>>>>       hsdis-show-data-float    toggle formatting data as float,   currently OFF
    >>>>>>       hsdis-show-structs       toggle compiler data structures,   currently OFF
    >>>>>>       hsdis-show-comment       toggle instruction comments,       currently OFF
    >>>>>>       hsdis-show-block-comment toggle block comments,             currently OFF
    >>>>>>       hsdis-align-instr        toggle instruction alignment,      currently OFF
    >>>>>> 
    >>>>>>     Finally, I have pushed my changes to a state where I can dare to request your comments and reviews.
    >>>> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes.
    >>>> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like
    >>>> a plan? Here is what you get:
    >>>>>> 
    >>>>>>     The machine code generated by the JVM can be printed in three different formats:
    >>>>>>      - Hexadecimal.
    >>>>>>        This is basically a hex dump of the memory range containing the code.
    >>>>>>        This format is always available (PRODUCT and not-PRODUCT builds), regardless
    >>>>>>        of the availability of a disassembler library. It applies to all sorts of
    >>>>>>        code, be it blobs, stubs, compiled nmethods, ...
    >>>>>>        This format seems useless at first glance, but it is not. In an upcoming,
    >>>>>>        separate enhancement, the JVM will be made capable of reading files
    >>>>>>        containing such code blocks and disassembling them post mortem. The most
    >>>>>>        prominent example is an hs_err* file.
    >>>>>>      - Disassembled.
    >>>>>>        This is an assembly listing of the instructions as found in the memory range
    >>>>>>        occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable
    >>>>>>        disassembler library (hsdis-<platform>.so) must be available at runtime.
    >>>>>>        Most often, that will only be the case in test environments. If no disassembler
    >>>>>>        library is available, hexadecimal output is used as fallback.
    >>>>>>      - OptoAssembly.
    >>>>>>        This is a meta code listing created only by the C2 compiler. As it is somewhat
    >>>>>>        closer to the Java code, it may be helpful in linking assembly code to Java code.
    >>>>>> 
    >>>>>>     All three formats can be merged with additional information, most prominently compiler-internal
    >>>> "knowledge" about blocks, related bytecodes, statistics counters, and much more.
    >>>>>> 
    >>>>>>     Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes,
    >>>> dependencies, exception handlers, are printed to aid in debugging.
    >>>>>> 
    >>>>>>     The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support
    >>>> OptoAssembly output. Data structures are unavailable as well.
    >>>>>> 
    >>>>>>     So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you
    >>>> an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here:
    >>>>>>       http://cr.openjdk.java.net/~lucy/webrevs/8213084/
    >>>>>> 
    >>>>>>     OptoAssembly output for reference (always on with PrintAssembly):
    >>>>>>     =================================================================
    >>>>>> 
    >>>>>>     036     B2: #   out( B7 B3 ) <- in( B1 )  Freq: 1
    >>>>>>     036     movl    RBP, [RSI + #12 (8-bit)]        # compressed ptr ! Field: java/lang/String.value
    >>>> (constant)
    >>>>>>     039     movl    R11, [RBP + #12 (8-bit)]        # range
    >>>>>>     03d     NullCheck RBP
    >>>>>> 
    >>>>>>     03d     B3: #   out( B6 B4 ) <- in( B2 )  Freq: 0.999999
    >>>>>>     03d     cmpl    RDX, R11        # unsigned
    >>>>>>     040     jnb,us  B6  P=0.000000 C=5375.000000
    >>>>>> 
    >>>>>>     PrintAssembly with no disassembler library available:
    >>>>>>     =====================================================
    >>>>>> 
    >>>>>>     [Code]
    >>>>>>     [Entry Point]
    >>>>>>       0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090
    >>>>>>     [Verified Entry Point]
    >>>>>>       0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3
    >>>>>>       0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c
    >>>>>>       0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6
    >>>>>>       0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4
    >>>>>>     [Exception Handler]
    >>>>>>       0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff
    >>>>>>     [End]
    >>>>>> 
    >>>>>>     PrintAssembly with minimal verbosity:
    >>>>>>     =====================================
    >>>>>> 
    >>>>>>       0x00007f0434b89bd6:   mov    0xc(%rsi),%ebp
    >>>>>>       0x00007f0434b89bd9:   mov    0xc(%rbp),%r11d
    >>>>>>       0x00007f0434b89bdd:   cmp    %r11d,%edx
    >>>>>>       0x00007f0434b89be0:   jae    0x00007f0434b89c0e
    >>>>>> 
    >>>>>>     PrintAssembly (previous plus code offsets from code begin):
    >>>>>>     ===========================================================
    >>>>>> 
    >>>>>>       0x00007f63c11d7956 (+0x36):   mov    0xc(%rsi),%ebp
    >>>>>>       0x00007f63c11d7959 (+0x39):   mov    0xc(%rbp),%r11d
    >>>>>>       0x00007f63c11d795d (+0x3d):   cmp    %r11d,%edx
    >>>>>>       0x00007f63c11d7960 (+0x40):   jae    0x00007f63c11d798e
    >>>>>> 
    >>>>>>     PrintAssembly (previous plus block comments):
    >>>>>>     ===========================================================
    >>>>>> 
    >>>>>>     ;; B2: #       out( B7 B3 ) <- in( B1 )  Freq: 1
    >>>>>>       0x00007f48211d76d6 (+0x36):   mov    0xc(%rsi),%ebp
    >>>>>>       0x00007f48211d76d9 (+0x39):   mov    0xc(%rbp),%r11d
    >>>>>>      ;; B3: #       out( B6 B4 ) <- in( B2 )  Freq: 0.999999
    >>>>>>       0x00007f48211d76dd (+0x3d):   cmp    %r11d,%edx
    >>>>>>       0x00007f48211d76e0 (+0x40):   jae    0x00007f48211d770e
    >>>>>> 
    >>>>>>     PrintAssembly (previous plus instruction comments):
    >>>>>>     ===========================================================
    >>>>>> 
    >>>>>>     ;; B2: #       out( B7 B3 ) <- in( B1 )  Freq: 1
    >>>>>>       0x00007fc3e11d7a56 (+0x36):   mov    0xc(%rsi),%ebp           ;*getfield value {reexecute=0
    >>>> rethrow=0 return_oop=0}
    >>>>>>                                                                     ; - java.lang.String::charAt at 8
    >>>> (line 702)
    >>>>>>       0x00007fc3e11d7a59 (+0x39):   mov    0xc(%rbp),%r11d          ; implicit exception: dispatches to
    >>>> 0x00007fc3e11d7a9e
    >>>>>>      ;; B3: #       out( B6 B4 ) <- in( B2 )  Freq: 0.999999
    >>>>>>       0x00007fc3e11d7a5d (+0x3d):   cmp    %r11d,%edx
    >>>>>>       0x00007fc3e11d7a60 (+0x40):   jae    0x00007fc3e11d7a8e
    >>>>>> 
    >>>>>>     For completeness, here are the links to
    >>>>>>     Bug:    https://bugs.openjdk.java.net/browse/JDK-8213084
    >>>>>>     Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/
    >>>>>> 
    >>>>>>     But please, as mentioned above, first focus on the output. The nitty details of the implementation
    >>>> I would like to discuss after the output format has received some support.
    >>>>>> 
    >>>>>>     Thank you so much for your time!
    >>>>>>     Lutz
    >>>>>> 
    >>>>>> 
    >>>>>> 
    >>>>>> 
    >>>>> 
    >>>>> 
    >>>> 
    >>>> 
    >> 
    >> 
    > 
    > 
    
    



More information about the hotspot-compiler-dev mailing list