[PING] RFR(M): 8207343: Automate vtable/itable stub size calculation

Schmidt, Lutz lutz.schmidt at sap.com
Wed Aug 22 16:30:30 UTC 2018


Hi Vladimir, 
how about this:

+#if defined(PRODUCT)
+  // These values are good for the PRODUCT case (no tracing).
+  static const int first_vtableStub_size =  64;
+  static const int first_itableStub_size = 256;
+#else
+  // These values are good for the non-PRODUCT case (when tracing can be switched on).
+  // Here is a list of observed worst-case values:
+  //               vtable  itable
+  // aarch64:
+  // arm:
+  // ppc (linux, BE): 404     288
+  // ppc (linux, LE): 356     276
+  // ppc (AIX):       416     296
+  // s390x:           408     256
+  // Solaris-sparc:   660     324
+  // Solaris-x86:     689     328
+  // x86 (Linux):     670     309
+  // x86 (MacOS):     670     309
+  static const int first_vtableStub_size = 1024;
+  static const int first_itableStub_size =  512;
+#endif

This is included in the webrev (Revision 03, replaced in-place). The superfluous comment lines have been removed as well. 
http://cr.openjdk.java.net/~lucy/webrevs/8207343.03/

Regards,
Lutz

On 22.08.18, 17:46, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:

    Yes, I agree, it will simplify code. But add comment explaining why such difference between product and debug build. May 
    be show platforms with worst case sizes.
    
    Thanks,
    Vladimir
    
    On 8/21/18 11:39 PM, Schmidt, Lutz wrote:
    > Thank you very much, Vladimir!
    > 
    > One quick add'l question: could you live with the #if defined(PRODUCT) variant? I'm a strong KISS (Keep It Simple and Stupid) proponent.
    > 
    > If yes - wonderful. If no - I'd accept that.
    > 
    > Thanks,
    > Lutz
    > 
    > On 22.08.18, 04:11, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    > 
    >      Very nice.
    >      
    >      You left commented code in vtableStubs_x86_32.cpp and vtableStubs_x86_64.cpp
    >      
    >      +//  const int slop32     = (vtable_index == 0) ? 4 :     // code size change with transition from 8-bit to 32-bit
    >      constant (@index == 16).
    >      +//                         (vtable_index < 16) ? 3 : 0;  // index == 0 generates even shorter code.
    >      
    >      If comments in these lines are useful - use them. But remove these lines. Otherwise they could be confusing.
    >      
    >      Thanks,
    >      Vladimir
    >      
    >      
    >      On 8/21/18 2:36 PM, Schmidt, Lutz wrote:
    >      > No worries, Vladimir!
    >      > I did not intend to add new flags. The vm has plenty, I believe.
    >      >
    >      > Please refer to the attached file to see exactly what changed from iteration 02 to 03. Here is a summary:
    >      >    -  _vtab_stub_size and _itab_stub_size were moved to the private section of class VtableStubs.
    >      >    -  same is true for methods code_size_limit() and check_and_set_size_limit().
    >      >    -  class VtableStub and class VtableStubs swapped position in the header file to make the compilers happy.
    >      >    -  minor adjustments in all platform files due to the move of code_size_limit().
    >      >    -  there are now separate initial sizes for vtable and itable stubs.
    >      >    -  the sizes are not fixed but calculated, depending on the actual flag setting.
    >      >
    >      > What is missing? Help from arm/aarch64 people to produce real-world values for arm and aarch64. I do not have access to machinery to create the data myself. I believe the effective default are also suitable for these platform, but would like to be sure. The data is easy to produce:
    >      >    -  run any workload that creates vtable and itable stubs, using the parameters
    >      >       -XX:{+|-}CountCompiledCalls -XX:{+|-}DebugVtables -Xlog:vtablestubs=Trace. (all 4 combinations)
    >      >    -  pipe the output into grep table\ #
    >      >    -  the output of the grep is what I need to fill the table.
    >      >
    >      > What I did not do and don't want to do: make the size estimates platform-specific. I would rather condense all that size-calculation stuff into
    >      >    #if defined(PRODUCT)
    >      >      static int first_vtableStub_size =  64;
    >      >      static int first_itableStub_size = 256;
    >      >    #else
    >      >      static int first_vtableStub_size = 1024;
    >      >      static int first_itableStub_size =  512;
    >      >    #endif
    >      >
    >      > The full webrev (iteration #3) can be found at: http://cr.openjdk.java.net/~lucy/webrevs/8207343.03/
    >      >
    >      > Looking forward to your comments. And to comments/opinions/reviews from others. And to help from ARM experts.
    >      >
    >      > Thanks,
    >      > Lutz
    >      >
    >      > On 18.08.18, 18:52, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >
    >      >      Please, don't add new flags.
    >      >
    >      >      If you mean to use CountCompiledCalls or DebugVtables (or other existing flags) to calculate firstStub_size I agree with it.
    >      >
    >      >      Thanks,
    >      >      Vladimir
    >      >
    >      >      On 8/18/18 6:42 AM, Schmidt, Lutz wrote:
    >      >      > Hi Vladimir,
    >      >      >
    >      >      > you are right, the size estimate for the first stub is wildly off. But:
    >      >      >    o  it's just the first stub. All subsequent ones use a much better estimate.
    >      >      >    o  for x86 and with "-XX:+CountCompiledCalls -XX:+DebugVtables" I get
    >      >      >       [3.716s][trace][vtablestubs] vtable #3 at 0x000000010b4f8a30: size: 671, estimate: 1024, slop area: 353
    >      >      >       [3.809s][trace][vtablestubs] itable #5 at 0x000000010b4f9100: size: 305, estimate: 1024, slop area: 719
    >      >      >    o  I wanted to get rid of the platform-specific size estimates (less locations that need maintenance and can become outdated)
    >      >      >
    >      >      > What can we do? I suggest initializing firstStub_size depending on the PRODUCT flag. In product builds, neither CountCompiledCalls nor DebugVtables will generate code. Furthermore, I could introduce first_vtableStub_size and first_itableStub_size to even better adapt the initial sizes.
    >      >      >
    >      >      > Could you live with that? To find good sizing, I'll have to run some tests. That will happen Monday.
    >      >      >
    >      >      > The fields you mention should not be public. I will change that. Maybe I can even move code_size_limit() and check_and_set_size_limit() to class VtableStubs.
    >      >      >
    >      >      > Regards,
    >      >      > Lutz
    >      >      >
    >      >      >
    >      >      > On 17.08.18, 23:24, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >      >
    >      >      >      Finally looked through changes - good idea.
    >      >      >
    >      >      >      Looking only on SPARC and x86 code.
    >      >      >
    >      >      >      Why you dropped pd_code_size_limit() as firstStub_size estimate (may be with additional delta (say 25%))? It is much
    >      >      >      better then 1024 based on your results:
    >      >      >
    >      >      >      [4.075s][trace][vtablestubs] vtable #3 at 0x00000001199b8630: size: 29, estimate: 1024, slop area: 995
    >      >      >
    >      >      >      Why next fields are public?:
    >      >      >
    >      >      >      +  static int _vtab_stub_size;
    >      >      >      +  static int _itab_stub_size;
    >      >      >
    >      >      >      Otherwise look good.
    >      >      >
    >      >      >      Thanks,
    >      >      >      Vladimir
    >      >      >
    >      >      >      On 8/17/18 7:23 AM, Schmidt, Lutz wrote:
    >      >      >      > Hi,
    >      >      >      >
    >      >      >      > I have uploaded a new webrev which contains the modifications discussed below, plus some minor tweaks. In detail:
    >      >      >      >    o  Using UL instead of PrintMiscellaneous in share/code/vtableStubs.cpp
    >      >      >      >    o  removed a leftover comment line in share/code/vtableStubs.hpp
    >      >      >      >    o  added a new LOG_TAG(vtablestubs)
    >      >      >      >    o  increased expected len of call_VM in vtableStubs_x86*.cpp (had a failing test on MacOS)
    >      >      >      >
    >      >      >      > Please find the new webrev here: http://cr.openjdk.java.net/~lucy/webrevs/8207343.02/
    >      >      >      >
    >      >      >      > Any volunteers out there for a second review?
    >      >      >      >
    >      >      >      > Thanks,
    >      >      >      > Lutz
    >      >      >      >
    >      >      >      >
    >      >      >      > On 16.08.18, 18:47, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >      >      >
    >      >      >      >      Yes, it is better. And numbers are reasonable now ;)
    >      >      >      >
    >      >      >      >      Thanks,
    >      >      >      >      Vladimir
    >      >      >      >
    >      >      >      >      On 8/16/18 12:53 AM, Schmidt, Lutz wrote:
    >      >      >      >      > Sorry for spamming: forgot to attach the output file.
    >      >      >      >      > Lutz
    >      >      >      >      >
    >      >      >      >      > On 16.08.18, 09:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
    >      >      >      >      >
    >      >      >      >      >      Hi Vladimir,
    >      >      >      >      >
    >      >      >      >      >      I have reformatted the trace output a bit, see attached file. Do you like it better now? And the printed actual size was plain wrong, it is now calculated as (pc - code_begin).
    >      >      >      >      >
    >      >      >      >      >      Any other comments?
    >      >      >      >      >      Thanks,
    >      >      >      >      >      Lutz
    >      >      >      >      >
    >      >      >      >      >      On 15.08.18, 18:25, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >      >      >      >
    >      >      >      >      >          It looks good to me only one suggestion is to add 'size' to be clear what these numbers mean:
    >      >      >      >      >
    >      >      >      >      >          vtable #3 at 0x0000000116e85c30[1024], size estimate 1024, left over: 353
    >      >      >      >      >
    >      >      >      >      >          Also looking on numbers and it strange. Size estimate 'stub_length' matches actual size (code_end - entry_point) then
    >      >      >      >      >          why left over?
    >      >      >      >      >
    >      >      >      >      >          Thanks,
    >      >      >      >      >          Vladimir
    >      >      >      >      >
    >      >      >      >      >          On 8/15/18 5:55 AM, Schmidt, Lutz wrote:
    >      >      >      >      >          > Hi Vladimir,
    >      >      >      >      >          > what do you think about this change (see attachment) to convert tracing output to UL? I will create a new webrev once I have included your (expected) comments.
    >      >      >      >      >          > Regards, Lutz
    >      >      >      >      >          >
    >      >      >      >      >          > On 15.08.18, 00:34, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
    >      >      >      >      >          >
    >      >      >      >      >          >      No hurry, Vladimir!
    >      >      >      >      >          >      In the meantime, I will have a look at the PrintMiscellaneous to UL conversion. In the light of previous discussions, this might help making some people happy. __
    >      >      >      >      >          >      Thanks,
    >      >      >      >      >          >      Lutz
    >      >      >      >      >          >
    >      >      >      >      >          >      On 15.08.18, 00:25, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >      >      >      >          >
    >      >      >      >      >          >          On 8/14/18 12:46 AM, Schmidt, Lutz wrote:
    >      >      >      >      >          >          > Hi Vladimir,
    >      >      >      >      >          >          >
    >      >      >      >      >          >          > the answer is simple: No, I did not. The proposed change is basically a generalized, harmonized variant of a modification we have been using at SAP for many years.
    >      >      >      >      >          >
    >      >      >      >      >          >          Okay. Give me time to look and test changes.
    >      >      >      >      >          >
    >      >      >      >      >          >          What do you think about using Xlog (UL) instead of PrintMiscellaneous in bookkeeping()?
    >      >      >      >      >          >
    >      >      >      >      >          >          >
    >      >      >      >      >          >          > Your idea is interesting, though. I (still) do not completely like the implementation currently in RFR. The code size variance caused by data only known at runtime (constants, offsets, for example) makes the generators ugly.
    >      >      >      >      >          >          >
    >      >      >      >      >          >          > Using a temp buffer adds other complexities. What about relocations, for example. Without having had a deeper look, I expect the effort to be "considerable".
    >      >      >      >      >          >
    >      >      >      >      >          >          Yes, it is not simple change.
    >      >      >      >      >          >
    >      >      >      >      >          >          Thanks,
    >      >      >      >      >          >          Vladimir
    >      >      >      >      >          >
    >      >      >      >      >          >          >
    >      >      >      >      >          >          > Thanks,
    >      >      >      >      >          >          > Lutz
    >      >      >      >      >          >          >
    >      >      >      >      >          >          > On 14.08.18, 00:49, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
    >      >      >      >      >          >          >
    >      >      >      >      >          >          >      Hi Lutz,
    >      >      >      >      >          >          >
    >      >      >      >      >          >          >      Did you consider to generate these stubs in temp buffer before publishing them in CodeCache as we do
    >      >      >      >      >          >          >      for nmethods?
    >      >      >      >      >          >          >
    >      >      >      >      >          >          >      Thanks,
    >      >      >      >      >          >          >      Vladimir
    >      >      >      >      >          >          >
    >      >      >      >      >          >          >      On 8/13/18 3:52 AM, Schmidt, Lutz wrote:
    >      >      >      >      >          >          >      > Dear Community,
    >      >      >      >      >          >          >      > are there any praises, objections, questions, comments, or even reviews for this change?
    >      >      >      >      >          >          >      > Thanks for considering!
    >      >      >      >      >          >          >      > Lutz
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      > On 13.08.18, 12:47, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      Hi Boris,
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      back from vacation I'd like to elaborate a bit more on your comments.
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      Based on your input, I have changed the description in VtableStub::pd_code_alignment() to read
    >      >      >      >      >          >          >      >        "ARM32 cache line size is not an architected constant. We just align on word size."
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      The description for aarch64 was adapted accordingly:
    >      >      >      >      >          >          >      >        "aarch64 cache line size is not an architected constant. We just align on 4 bytes (instruction size)."
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      With respect to the variable name, I just harmonized the naming across all platforms. I would appreciate if you could live with it. Another option would be to just return a value, without using any variable name. I don't like that too much, but if the community prefers it...
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      I have updated http://cr.openjdk.java.net/~lucy/webrevs/8207343.01 in-place with just the two comment lines modified.
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      Thank you,
    >      >      >      >      >          >          >      >      Lutz
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >      On 07.08.18, 22:12, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >          Hi Boris,
    >      >      >      >      >          >          >      >          thanks for looking at this. I will respond to your comments in more detail next week. I'm on vacation this week.
    >      >      >      >      >          >          >      >          Thanks,
    >      >      >      >      >          >          >      >          Lutz
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >          On 05.08.18, 19:08, "Boris Ulasevich" <boris.ulasevich at bell-sw.com> wrote:
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >              Hi Lutz,
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >                 I have run jtreg with your change and do not see new fails (test that
    >      >      >      >      >          >          >      >              fails on aarch64 is excluded for 32 bit platforms).
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >                 I am OK with your change (I'm not a reviewer). But description and
    >      >      >      >      >          >          >      >              variable name in VtableStub::pd_code_alignment function looks strange
    >      >      >      >      >          >          >      >              for me. Raspberry Pi2 ARM1176JZF-S processor has a cache line length of
    >      >      >      >      >          >          >      >              32 bytes, and, as I know, icache line size is not a constant for ARM32
    >      >      >      >      >          >          >      >              architecture.
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >              regards,
    >      >      >      >      >          >          >      >              Boris
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >              On 02.08.2018 15:02, Schmidt, Lutz wrote:
    >      >      >      >      >          >          >      >              > Hi Zhongwei,
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              > thank you for testing aarch64. Given my lack of expertise, I am surprised there are only those two issues.
    >      >      >      >      >          >          >      >              > Ad 1.: fixed. The declaration just a few lines above was forgotten to adapt.
    >      >      >      >      >          >          >      >              > Ad 2.: adapted. I pushed the estimate to 152 bytes. I expected this value would require some adjustment.
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              > Please find a new webrev with the changes at http://cr.openjdk.java.net/~lucy/webrevs/8207343.01
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              > Best Regards,
    >      >      >      >      >          >          >      >              > Lutz
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              > On 02.08.18, 04:45, "Zhongwei Yao" <Zhongwei.Yao at arm.com> wrote:
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      Hi, Lutz,
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      I have tested it on aarch64 by running jtreg tests. And find two tiny issues in vtableStubs_aarch64.cpp's VtableStubs::create_itable_stub function:
    >      >      >      >      >          >          >      >              >      1. typecheckSize on line 212 is not defined.
    >      >      >      >      >          >          >      >              >      2. estimate on line 211 is not large enough, I get 148 in gc/g1/TestFromCardCacheIndex.java case. Here is the assertion failure from that case:
    >      >      >      >      >          >          >      >              >          assert(slop_delta >= 0) failed: itable #3: Code size estimate (140) for lookup_interface_method too small, required: 148
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      However, I don't have tested the modification in vtableStubs_arm.cpp due to I don't have an arm32 environment at hand.
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      --
    >      >      >      >      >          >          >      >              >      Best regards,
    >      >      >      >      >          >          >      >              >      Zhongwei
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      ________________________________________
    >      >      >      >      >          >          >      >              >      From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> on behalf of Schmidt, Lutz <lutz.schmidt at sap.com>
    >      >      >      >      >          >          >      >              >      Sent: Monday, July 30, 2018 3:57:05 PM
    >      >      >      >      >          >          >      >              >      To: hotspot-compiler-dev at openjdk.java.net
    >      >      >      >      >          >          >      >              >      Subject: RFR(M): 8207343: Automate vtable/itable stub size calculation
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      Dear all,
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      may I please request reviews for this change:
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      Bug:    https://bugs.openjdk.java.net/browse/JDK-8207343
    >      >      >      >      >          >          >      >              >      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8207343.00/
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      With this change, I try to get rid of the a-priory size guessing for vtable and itable stubs. Please refer to the bug description for all the details. I didn't want to duplicate that text.
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      ARM and AARCH64 help requested!
    >      >      >      >      >          >          >      >              >      The edits in vtableStubs_aarch64.cpp and vtableStubs_arm.cpp are made blindfolded. I am neither an ARM expert nor do I have build or test hardware available. I would be very grateful if one of the ARM gurus could please fill in for me.
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >      Thank you!
    >      >      >      >      >          >          >      >              >      Lutz
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >              >
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >      >
    >      >      >      >      >          >          >
    >      >      >      >      >          >          >
    >      >      >      >      >          >
    >      >      >      >      >          >
    >      >      >      >      >          >
    >      >      >      >      >          >
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >      >
    >      >      >      >
    >      >      >      >
    >      >      >
    >      >      >
    >      >
    >      >
    >      
    > 
    



More information about the hotspot-compiler-dev mailing list