[PING] RFR(M): 8207343: Automate vtable/itable stub size calculation

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Aug 22 15:46:59 UTC 2018


Yes, I agree, it will simplify code. But add comment explaining why such difference between product and debug build. May 
be show platforms with worst case sizes.

Thanks,
Vladimir

On 8/21/18 11:39 PM, Schmidt, Lutz wrote:
> Thank you very much, Vladimir!
> 
> One quick add'l question: could you live with the #if defined(PRODUCT) variant? I'm a strong KISS (Keep It Simple and Stupid) proponent.
> 
> If yes - wonderful. If no - I'd accept that.
> 
> Thanks,
> Lutz
> 
> On 22.08.18, 04:11, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> 
>      Very nice.
>      
>      You left commented code in vtableStubs_x86_32.cpp and vtableStubs_x86_64.cpp
>      
>      +//  const int slop32     = (vtable_index == 0) ? 4 :     // code size change with transition from 8-bit to 32-bit
>      constant (@index == 16).
>      +//                         (vtable_index < 16) ? 3 : 0;  // index == 0 generates even shorter code.
>      
>      If comments in these lines are useful - use them. But remove these lines. Otherwise they could be confusing.
>      
>      Thanks,
>      Vladimir
>      
>      
>      On 8/21/18 2:36 PM, Schmidt, Lutz wrote:
>      > No worries, Vladimir!
>      > I did not intend to add new flags. The vm has plenty, I believe.
>      >
>      > Please refer to the attached file to see exactly what changed from iteration 02 to 03. Here is a summary:
>      >    -  _vtab_stub_size and _itab_stub_size were moved to the private section of class VtableStubs.
>      >    -  same is true for methods code_size_limit() and check_and_set_size_limit().
>      >    -  class VtableStub and class VtableStubs swapped position in the header file to make the compilers happy.
>      >    -  minor adjustments in all platform files due to the move of code_size_limit().
>      >    -  there are now separate initial sizes for vtable and itable stubs.
>      >    -  the sizes are not fixed but calculated, depending on the actual flag setting.
>      >
>      > What is missing? Help from arm/aarch64 people to produce real-world values for arm and aarch64. I do not have access to machinery to create the data myself. I believe the effective default are also suitable for these platform, but would like to be sure. The data is easy to produce:
>      >    -  run any workload that creates vtable and itable stubs, using the parameters
>      >       -XX:{+|-}CountCompiledCalls -XX:{+|-}DebugVtables -Xlog:vtablestubs=Trace. (all 4 combinations)
>      >    -  pipe the output into grep table\ #
>      >    -  the output of the grep is what I need to fill the table.
>      >
>      > What I did not do and don't want to do: make the size estimates platform-specific. I would rather condense all that size-calculation stuff into
>      >    #if defined(PRODUCT)
>      >      static int first_vtableStub_size =  64;
>      >      static int first_itableStub_size = 256;
>      >    #else
>      >      static int first_vtableStub_size = 1024;
>      >      static int first_itableStub_size =  512;
>      >    #endif
>      >
>      > The full webrev (iteration #3) can be found at: http://cr.openjdk.java.net/~lucy/webrevs/8207343.03/
>      >
>      > Looking forward to your comments. And to comments/opinions/reviews from others. And to help from ARM experts.
>      >
>      > Thanks,
>      > Lutz
>      >
>      > On 18.08.18, 18:52, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >
>      >      Please, don't add new flags.
>      >
>      >      If you mean to use CountCompiledCalls or DebugVtables (or other existing flags) to calculate firstStub_size I agree with it.
>      >
>      >      Thanks,
>      >      Vladimir
>      >
>      >      On 8/18/18 6:42 AM, Schmidt, Lutz wrote:
>      >      > Hi Vladimir,
>      >      >
>      >      > you are right, the size estimate for the first stub is wildly off. But:
>      >      >    o  it's just the first stub. All subsequent ones use a much better estimate.
>      >      >    o  for x86 and with "-XX:+CountCompiledCalls -XX:+DebugVtables" I get
>      >      >       [3.716s][trace][vtablestubs] vtable #3 at 0x000000010b4f8a30: size: 671, estimate: 1024, slop area: 353
>      >      >       [3.809s][trace][vtablestubs] itable #5 at 0x000000010b4f9100: size: 305, estimate: 1024, slop area: 719
>      >      >    o  I wanted to get rid of the platform-specific size estimates (less locations that need maintenance and can become outdated)
>      >      >
>      >      > What can we do? I suggest initializing firstStub_size depending on the PRODUCT flag. In product builds, neither CountCompiledCalls nor DebugVtables will generate code. Furthermore, I could introduce first_vtableStub_size and first_itableStub_size to even better adapt the initial sizes.
>      >      >
>      >      > Could you live with that? To find good sizing, I'll have to run some tests. That will happen Monday.
>      >      >
>      >      > The fields you mention should not be public. I will change that. Maybe I can even move code_size_limit() and check_and_set_size_limit() to class VtableStubs.
>      >      >
>      >      > Regards,
>      >      > Lutz
>      >      >
>      >      >
>      >      > On 17.08.18, 23:24, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >      >
>      >      >      Finally looked through changes - good idea.
>      >      >
>      >      >      Looking only on SPARC and x86 code.
>      >      >
>      >      >      Why you dropped pd_code_size_limit() as firstStub_size estimate (may be with additional delta (say 25%))? It is much
>      >      >      better then 1024 based on your results:
>      >      >
>      >      >      [4.075s][trace][vtablestubs] vtable #3 at 0x00000001199b8630: size: 29, estimate: 1024, slop area: 995
>      >      >
>      >      >      Why next fields are public?:
>      >      >
>      >      >      +  static int _vtab_stub_size;
>      >      >      +  static int _itab_stub_size;
>      >      >
>      >      >      Otherwise look good.
>      >      >
>      >      >      Thanks,
>      >      >      Vladimir
>      >      >
>      >      >      On 8/17/18 7:23 AM, Schmidt, Lutz wrote:
>      >      >      > Hi,
>      >      >      >
>      >      >      > I have uploaded a new webrev which contains the modifications discussed below, plus some minor tweaks. In detail:
>      >      >      >    o  Using UL instead of PrintMiscellaneous in share/code/vtableStubs.cpp
>      >      >      >    o  removed a leftover comment line in share/code/vtableStubs.hpp
>      >      >      >    o  added a new LOG_TAG(vtablestubs)
>      >      >      >    o  increased expected len of call_VM in vtableStubs_x86*.cpp (had a failing test on MacOS)
>      >      >      >
>      >      >      > Please find the new webrev here: http://cr.openjdk.java.net/~lucy/webrevs/8207343.02/
>      >      >      >
>      >      >      > Any volunteers out there for a second review?
>      >      >      >
>      >      >      > Thanks,
>      >      >      > Lutz
>      >      >      >
>      >      >      >
>      >      >      > On 16.08.18, 18:47, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >      >      >
>      >      >      >      Yes, it is better. And numbers are reasonable now ;)
>      >      >      >
>      >      >      >      Thanks,
>      >      >      >      Vladimir
>      >      >      >
>      >      >      >      On 8/16/18 12:53 AM, Schmidt, Lutz wrote:
>      >      >      >      > Sorry for spamming: forgot to attach the output file.
>      >      >      >      > Lutz
>      >      >      >      >
>      >      >      >      > On 16.08.18, 09:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>      >      >      >      >
>      >      >      >      >      Hi Vladimir,
>      >      >      >      >
>      >      >      >      >      I have reformatted the trace output a bit, see attached file. Do you like it better now? And the printed actual size was plain wrong, it is now calculated as (pc - code_begin).
>      >      >      >      >
>      >      >      >      >      Any other comments?
>      >      >      >      >      Thanks,
>      >      >      >      >      Lutz
>      >      >      >      >
>      >      >      >      >      On 15.08.18, 18:25, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >      >      >      >
>      >      >      >      >          It looks good to me only one suggestion is to add 'size' to be clear what these numbers mean:
>      >      >      >      >
>      >      >      >      >          vtable #3 at 0x0000000116e85c30[1024], size estimate 1024, left over: 353
>      >      >      >      >
>      >      >      >      >          Also looking on numbers and it strange. Size estimate 'stub_length' matches actual size (code_end - entry_point) then
>      >      >      >      >          why left over?
>      >      >      >      >
>      >      >      >      >          Thanks,
>      >      >      >      >          Vladimir
>      >      >      >      >
>      >      >      >      >          On 8/15/18 5:55 AM, Schmidt, Lutz wrote:
>      >      >      >      >          > Hi Vladimir,
>      >      >      >      >          > what do you think about this change (see attachment) to convert tracing output to UL? I will create a new webrev once I have included your (expected) comments.
>      >      >      >      >          > Regards, Lutz
>      >      >      >      >          >
>      >      >      >      >          > On 15.08.18, 00:34, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>      >      >      >      >          >
>      >      >      >      >          >      No hurry, Vladimir!
>      >      >      >      >          >      In the meantime, I will have a look at the PrintMiscellaneous to UL conversion. In the light of previous discussions, this might help making some people happy. __
>      >      >      >      >          >      Thanks,
>      >      >      >      >          >      Lutz
>      >      >      >      >          >
>      >      >      >      >          >      On 15.08.18, 00:25, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >      >      >      >          >
>      >      >      >      >          >          On 8/14/18 12:46 AM, Schmidt, Lutz wrote:
>      >      >      >      >          >          > Hi Vladimir,
>      >      >      >      >          >          >
>      >      >      >      >          >          > the answer is simple: No, I did not. The proposed change is basically a generalized, harmonized variant of a modification we have been using at SAP for many years.
>      >      >      >      >          >
>      >      >      >      >          >          Okay. Give me time to look and test changes.
>      >      >      >      >          >
>      >      >      >      >          >          What do you think about using Xlog (UL) instead of PrintMiscellaneous in bookkeeping()?
>      >      >      >      >          >
>      >      >      >      >          >          >
>      >      >      >      >          >          > Your idea is interesting, though. I (still) do not completely like the implementation currently in RFR. The code size variance caused by data only known at runtime (constants, offsets, for example) makes the generators ugly.
>      >      >      >      >          >          >
>      >      >      >      >          >          > Using a temp buffer adds other complexities. What about relocations, for example. Without having had a deeper look, I expect the effort to be "considerable".
>      >      >      >      >          >
>      >      >      >      >          >          Yes, it is not simple change.
>      >      >      >      >          >
>      >      >      >      >          >          Thanks,
>      >      >      >      >          >          Vladimir
>      >      >      >      >          >
>      >      >      >      >          >          >
>      >      >      >      >          >          > Thanks,
>      >      >      >      >          >          > Lutz
>      >      >      >      >          >          >
>      >      >      >      >          >          > On 14.08.18, 00:49, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>      >      >      >      >          >          >
>      >      >      >      >          >          >      Hi Lutz,
>      >      >      >      >          >          >
>      >      >      >      >          >          >      Did you consider to generate these stubs in temp buffer before publishing them in CodeCache as we do
>      >      >      >      >          >          >      for nmethods?
>      >      >      >      >          >          >
>      >      >      >      >          >          >      Thanks,
>      >      >      >      >          >          >      Vladimir
>      >      >      >      >          >          >
>      >      >      >      >          >          >      On 8/13/18 3:52 AM, Schmidt, Lutz wrote:
>      >      >      >      >          >          >      > Dear Community,
>      >      >      >      >          >          >      > are there any praises, objections, questions, comments, or even reviews for this change?
>      >      >      >      >          >          >      > Thanks for considering!
>      >      >      >      >          >          >      > Lutz
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      > On 13.08.18, 12:47, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      Hi Boris,
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      back from vacation I'd like to elaborate a bit more on your comments.
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      Based on your input, I have changed the description in VtableStub::pd_code_alignment() to read
>      >      >      >      >          >          >      >        "ARM32 cache line size is not an architected constant. We just align on word size."
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      The description for aarch64 was adapted accordingly:
>      >      >      >      >          >          >      >        "aarch64 cache line size is not an architected constant. We just align on 4 bytes (instruction size)."
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      With respect to the variable name, I just harmonized the naming across all platforms. I would appreciate if you could live with it. Another option would be to just return a value, without using any variable name. I don't like that too much, but if the community prefers it...
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      I have updated http://cr.openjdk.java.net/~lucy/webrevs/8207343.01 in-place with just the two comment lines modified.
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      Thank you,
>      >      >      >      >          >          >      >      Lutz
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >      On 07.08.18, 22:12, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >          Hi Boris,
>      >      >      >      >          >          >      >          thanks for looking at this. I will respond to your comments in more detail next week. I'm on vacation this week.
>      >      >      >      >          >          >      >          Thanks,
>      >      >      >      >          >          >      >          Lutz
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >          On 05.08.18, 19:08, "Boris Ulasevich" <boris.ulasevich at bell-sw.com> wrote:
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >              Hi Lutz,
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >                 I have run jtreg with your change and do not see new fails (test that
>      >      >      >      >          >          >      >              fails on aarch64 is excluded for 32 bit platforms).
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >                 I am OK with your change (I'm not a reviewer). But description and
>      >      >      >      >          >          >      >              variable name in VtableStub::pd_code_alignment function looks strange
>      >      >      >      >          >          >      >              for me. Raspberry Pi2 ARM1176JZF-S processor has a cache line length of
>      >      >      >      >          >          >      >              32 bytes, and, as I know, icache line size is not a constant for ARM32
>      >      >      >      >          >          >      >              architecture.
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >              regards,
>      >      >      >      >          >          >      >              Boris
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >              On 02.08.2018 15:02, Schmidt, Lutz wrote:
>      >      >      >      >          >          >      >              > Hi Zhongwei,
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              > thank you for testing aarch64. Given my lack of expertise, I am surprised there are only those two issues.
>      >      >      >      >          >          >      >              > Ad 1.: fixed. The declaration just a few lines above was forgotten to adapt.
>      >      >      >      >          >          >      >              > Ad 2.: adapted. I pushed the estimate to 152 bytes. I expected this value would require some adjustment.
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              > Please find a new webrev with the changes at http://cr.openjdk.java.net/~lucy/webrevs/8207343.01
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              > Best Regards,
>      >      >      >      >          >          >      >              > Lutz
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              > On 02.08.18, 04:45, "Zhongwei Yao" <Zhongwei.Yao at arm.com> wrote:
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      Hi, Lutz,
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      I have tested it on aarch64 by running jtreg tests. And find two tiny issues in vtableStubs_aarch64.cpp's VtableStubs::create_itable_stub function:
>      >      >      >      >          >          >      >              >      1. typecheckSize on line 212 is not defined.
>      >      >      >      >          >          >      >              >      2. estimate on line 211 is not large enough, I get 148 in gc/g1/TestFromCardCacheIndex.java case. Here is the assertion failure from that case:
>      >      >      >      >          >          >      >              >          assert(slop_delta >= 0) failed: itable #3: Code size estimate (140) for lookup_interface_method too small, required: 148
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      However, I don't have tested the modification in vtableStubs_arm.cpp due to I don't have an arm32 environment at hand.
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      --
>      >      >      >      >          >          >      >              >      Best regards,
>      >      >      >      >          >          >      >              >      Zhongwei
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      ________________________________________
>      >      >      >      >          >          >      >              >      From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> on behalf of Schmidt, Lutz <lutz.schmidt at sap.com>
>      >      >      >      >          >          >      >              >      Sent: Monday, July 30, 2018 3:57:05 PM
>      >      >      >      >          >          >      >              >      To: hotspot-compiler-dev at openjdk.java.net
>      >      >      >      >          >          >      >              >      Subject: RFR(M): 8207343: Automate vtable/itable stub size calculation
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      Dear all,
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      may I please request reviews for this change:
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      Bug:    https://bugs.openjdk.java.net/browse/JDK-8207343
>      >      >      >      >          >          >      >              >      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8207343.00/
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      With this change, I try to get rid of the a-priory size guessing for vtable and itable stubs. Please refer to the bug description for all the details. I didn't want to duplicate that text.
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      ARM and AARCH64 help requested!
>      >      >      >      >          >          >      >              >      The edits in vtableStubs_aarch64.cpp and vtableStubs_arm.cpp are made blindfolded. I am neither an ARM expert nor do I have build or test hardware available. I would be very grateful if one of the ARM gurus could please fill in for me.
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >      Thank you!
>      >      >      >      >          >          >      >              >      Lutz
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >              >
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >      >
>      >      >      >      >          >          >
>      >      >      >      >          >          >
>      >      >      >      >          >
>      >      >      >      >          >
>      >      >      >      >          >
>      >      >      >      >          >
>      >      >      >      >
>      >      >      >      >
>      >      >      >      >
>      >      >      >      >
>      >      >      >
>      >      >      >
>      >      >
>      >      >
>      >
>      >
>      
> 


More information about the hotspot-compiler-dev mailing list