[PING] RFR(M): 8207343: Automate vtable/itable stub size calculation
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Aug 22 15:46:59 UTC 2018
Yes, I agree, it will simplify the code. But please add a comment explaining why there is such a difference between product and debug builds. Maybe
also show the platforms with the worst-case sizes.
Thanks,
Vladimir
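
For illustration only (not taken from the webrev): a minimal sketch of how the #if defined(PRODUCT) variant quoted below could carry such a comment. The four constants and their values are the ones proposed in the quoted mail; the helper first_stub_size() is hypothetical, and the worst-case numbers in the comment are the x86 trace values quoted further down in this thread.

```cpp
// Standalone sketch. In HotSpot, PRODUCT is defined by the build for product builds.
//
// Product builds: neither CountCompiledCalls nor DebugVtables generates code,
// so small initial estimates are enough for the first stub.
// Non-product builds: with both flags enabled, x86 was reported in this thread
// with a vtable stub of 671 bytes and an itable stub of 305 bytes, hence the
// larger initial estimates.
#if defined(PRODUCT)
constexpr int first_vtableStub_size = 64;
constexpr int first_itableStub_size = 256;
#else
constexpr int first_vtableStub_size = 1024;
constexpr int first_itableStub_size = 512;
#endif

// Hypothetical helper (an assumption, not part of the change under review):
// selects the initial estimate for the kind of stub being generated.
constexpr int first_stub_size(bool is_vtable_stub) {
  return is_vtable_stub ? first_vtableStub_size : first_itableStub_size;
}
```

Only the very first stub of each kind would use these values; according to the thread, all subsequent stubs use a much better estimate.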
On 8/21/18 11:39 PM, Schmidt, Lutz wrote:
> Thank you very much, Vladimir!
>
> One quick add'l question: could you live with the #if defined(PRODUCT) variant? I'm a strong KISS (Keep It Simple and Stupid) proponent.
>
> If yes - wonderful. If no - I'd accept that.
>
> Thanks,
> Lutz
>
> On 22.08.18, 04:11, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>
> Very nice.
>
> You left commented code in vtableStubs_x86_32.cpp and vtableStubs_x86_64.cpp
>
> +// const int slop32 = (vtable_index == 0) ? 4 : // code size change with transition from 8-bit to 32-bit constant (@index == 16).
> +// (vtable_index < 16) ? 3 : 0; // index == 0 generates even shorter code.
>
> If the comments in these lines are useful, keep them as comments, but remove the commented-out code itself. Otherwise it could be confusing.
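
For readers of the archive, the commented-out expression above can be written as a small standalone function. This is only a sketch. The values 4, 3 and 0 and the thresholds are copied from the quoted lines; reading them as "bytes saved compared to the 32-bit constant form" is my interpretation of the two comments, not something the webrev states.

```cpp
// Standalone sketch of the commented-out slop32 expression; not HotSpot code.
// Assumption: the result is the number of bytes by which the generated code
// is shorter than an estimate that assumes a 32-bit constant.
constexpr int slop32(int vtable_index) {
  return (vtable_index == 0) ? 4   // index == 0 generates even shorter code
       : (vtable_index < 16) ? 3   // constant still fits the 8-bit form
       : 0;                        // from index 16 on, the 32-bit form is used
}
```

For example, slop32(15) yields 3 and slop32(16) yields 0, which is the code size change at index 16 that the first comment mentions.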
>
> Thanks,
> Vladimir
>
>
> On 8/21/18 2:36 PM, Schmidt, Lutz wrote:
> > No worries, Vladimir!
> > I did not intend to add new flags. The vm has plenty, I believe.
> >
> > Please refer to the attached file to see exactly what changed from iteration 02 to 03. Here is a summary:
> > - _vtab_stub_size and _itab_stub_size were moved to the private section of class VtableStubs.
> > - same is true for methods code_size_limit() and check_and_set_size_limit().
> > - class VtableStub and class VtableStubs swapped position in the header file to make the compilers happy.
> > - minor adjustments in all platform files due to the move of code_size_limit().
> > - there are now separate initial sizes for vtable and itable stubs.
> > - the sizes are not fixed but calculated, depending on the actual flag setting.
> >
> > What is missing? Help from arm/aarch64 people to produce real-world values for arm and aarch64. I do not have access to machinery to create the data myself. I believe the effective defaults are also suitable for these platforms, but would like to be sure. The data is easy to produce:
> > - run any workload that creates vtable and itable stubs, using the parameters
> > -XX:{+|-}CountCompiledCalls -XX:{+|-}DebugVtables -Xlog:vtablestubs=Trace. (all 4 combinations)
> > - pipe the output into grep table\ #
> > - the output of the grep is what I need to fill the table.
> >
> > What I did not do and don't want to do: make the size estimates platform-specific. I would rather condense all that size-calculation stuff into
> > #if defined(PRODUCT)
> > static int first_vtableStub_size = 64;
> > static int first_itableStub_size = 256;
> > #else
> > static int first_vtableStub_size = 1024;
> > static int first_itableStub_size = 512;
> > #endif
> >
> > The full webrev (iteration #3) can be found at: http://cr.openjdk.java.net/~lucy/webrevs/8207343.03/
> >
> > Looking forward to your comments. And to comments/opinions/reviews from others. And to help from ARM experts.
> >
> > Thanks,
> > Lutz
> >
> > On 18.08.18, 18:52, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> >
> > Please, don't add new flags.
> >
> > If you mean to use CountCompiledCalls or DebugVtables (or other existing flags) to calculate firstStub_size I agree with it.
> >
> > Thanks,
> > Vladimir
> >
> > On 8/18/18 6:42 AM, Schmidt, Lutz wrote:
> > > Hi Vladimir,
> > >
> > > you are right, the size estimate for the first stub is wildly off. But:
> > > o it's just the first stub. All subsequent ones use a much better estimate.
> > > o for x86 and with "-XX:+CountCompiledCalls -XX:+DebugVtables" I get
> > > [3.716s][trace][vtablestubs] vtable #3 at 0x000000010b4f8a30: size: 671, estimate: 1024, slop area: 353
> > > [3.809s][trace][vtablestubs] itable #5 at 0x000000010b4f9100: size: 305, estimate: 1024, slop area: 719
> > > o I wanted to get rid of the platform-specific size estimates (fewer locations that need maintenance and can become outdated)
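
As a worked example of the numbers in the two trace lines above: the slop area is the estimate minus the actual size. The sketch below is standalone; the names slop_area() and trace_line() are hypothetical, and the format string only mimics the quoted log output (without timestamp, tags and address).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: unused bytes left when a stub of 'size' bytes is
// generated into a buffer reserved with 'estimate' bytes.
int slop_area(int size, int estimate) {
  assert(estimate >= size);  // an estimate that is too small is an error
  return estimate - size;
}

// Hypothetical helper: formats the size part of a trace line in the style
// of the quoted log output.
std::string trace_line(const char* kind, int number, int size, int estimate) {
  char buf[128];
  std::snprintf(buf, sizeof(buf),
                "%s #%d: size: %d, estimate: %d, slop area: %d",
                kind, number, size, estimate, slop_area(size, estimate));
  return std::string(buf);
}
```

With the values above, slop_area(671, 1024) gives 353 and slop_area(305, 1024) gives 719, matching the trace.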
> > >
> > > What can we do? I suggest initializing firstStub_size depending on the PRODUCT flag. In product builds, neither CountCompiledCalls nor DebugVtables will generate code. Furthermore, I could introduce first_vtableStub_size and first_itableStub_size to even better adapt the initial sizes.
> > >
> > > Could you live with that? To find good sizing, I'll have to run some tests. That will happen Monday.
> > >
> > > The fields you mention should not be public. I will change that. Maybe I can even move code_size_limit() and check_and_set_size_limit() to class VtableStubs.
> > >
> > > Regards,
> > > Lutz
> > >
> > >
> > > On 17.08.18, 23:24, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> > >
> > > Finally looked through changes - good idea.
> > >
> > > > Looking only at the SPARC and x86 code.
> > >
> > > > Why did you drop pd_code_size_limit() as the firstStub_size estimate (maybe with an additional delta, say 25%)? It is much
> > > > better than 1024 based on your results:
> > >
> > > [4.075s][trace][vtablestubs] vtable #3 at 0x00000001199b8630: size: 29, estimate: 1024, slop area: 995
> > >
> > > > Why are the following fields public?
> > >
> > > + static int _vtab_stub_size;
> > > + static int _itab_stub_size;
> > >
> > > > Otherwise looks good.
> > >
> > > Thanks,
> > > Vladimir
> > >
> > > On 8/17/18 7:23 AM, Schmidt, Lutz wrote:
> > > > Hi,
> > > >
> > > > I have uploaded a new webrev which contains the modifications discussed below, plus some minor tweaks. In detail:
> > > > o Using UL instead of PrintMiscellaneous in share/code/vtableStubs.cpp
> > > > o removed a leftover comment line in share/code/vtableStubs.hpp
> > > > o added a new LOG_TAG(vtablestubs)
> > > > o increased expected len of call_VM in vtableStubs_x86*.cpp (had a failing test on MacOS)
> > > >
> > > > Please find the new webrev here: http://cr.openjdk.java.net/~lucy/webrevs/8207343.02/
> > > >
> > > > Any volunteers out there for a second review?
> > > >
> > > > Thanks,
> > > > Lutz
> > > >
> > > >
> > > > On 16.08.18, 18:47, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> > > >
> > > > Yes, it is better. And numbers are reasonable now ;)
> > > >
> > > > Thanks,
> > > > Vladimir
> > > >
> > > > On 8/16/18 12:53 AM, Schmidt, Lutz wrote:
> > > > > Sorry for spamming: forgot to attach the output file.
> > > > > Lutz
> > > > >
> > > > > On 16.08.18, 09:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
> > > > >
> > > > > Hi Vladimir,
> > > > >
> > > > > I have reformatted the trace output a bit, see attached file. Do you like it better now? And the printed actual size was plain wrong, it is now calculated as (pc - code_begin).
> > > > >
> > > > > Any other comments?
> > > > > Thanks,
> > > > > Lutz
> > > > >
> > > > > On 15.08.18, 18:25, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> > > > >
> > > > > > It looks good to me. My only suggestion is to add 'size' to make clear what these numbers mean:
> > > > >
> > > > > vtable #3 at 0x0000000116e85c30[1024], size estimate 1024, left over: 353
> > > > >
> > > > > > Also, looking at the numbers, something seems strange. The size estimate 'stub_length' matches the actual size (code_end - entry_point), so
> > > > > > why is there anything left over?
> > > > >
> > > > > Thanks,
> > > > > Vladimir
> > > > >
> > > > > On 8/15/18 5:55 AM, Schmidt, Lutz wrote:
> > > > > > Hi Vladimir,
> > > > > > what do you think about this change (see attachment) to convert tracing output to UL? I will create a new webrev once I have included your (expected) comments.
> > > > > > Regards, Lutz
> > > > > >
> > > > > > On 15.08.18, 00:34, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
> > > > > >
> > > > > > No hurry, Vladimir!
> > > > > > > > In the meantime, I will have a look at the PrintMiscellaneous to UL conversion. In the light of previous discussions, this might help make some people happy.
> > > > > > Thanks,
> > > > > > Lutz
> > > > > >
> > > > > > On 15.08.18, 00:25, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> > > > > >
> > > > > > On 8/14/18 12:46 AM, Schmidt, Lutz wrote:
> > > > > > > Hi Vladimir,
> > > > > > >
> > > > > > > the answer is simple: No, I did not. The proposed change is basically a generalized, harmonized variant of a modification we have been using at SAP for many years.
> > > > > >
> > > > > > Okay. Give me time to look and test changes.
> > > > > >
> > > > > > What do you think about using Xlog (UL) instead of PrintMiscellaneous in bookkeeping()?
> > > > > >
> > > > > > >
> > > > > > > Your idea is interesting, though. I (still) do not completely like the implementation currently in RFR. The code size variance caused by data only known at runtime (constants, offsets, for example) makes the generators ugly.
> > > > > > >
> > > > > > > Using a temp buffer adds other complexities. What about relocations, for example. Without having had a deeper look, I expect the effort to be "considerable".
> > > > > >
> > > > > > > Yes, it is not a simple change.
> > > > > >
> > > > > > Thanks,
> > > > > > Vladimir
> > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Lutz
> > > > > > >
> > > > > > > On 14.08.18, 00:49, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> > > > > > >
> > > > > > > Hi Lutz,
> > > > > > >
> > > > > > > > Did you consider generating these stubs in a temp buffer before publishing them in the CodeCache, as we do
> > > > > > > > for nmethods?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Vladimir
> > > > > > >
> > > > > > > On 8/13/18 3:52 AM, Schmidt, Lutz wrote:
> > > > > > > > Dear Community,
> > > > > > > > are there any praises, objections, questions, comments, or even reviews for this change?
> > > > > > > > Thanks for considering!
> > > > > > > > Lutz
> > > > > > > >
> > > > > > > > On 13.08.18, 12:47, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
> > > > > > > >
> > > > > > > > Hi Boris,
> > > > > > > >
> > > > > > > > back from vacation I'd like to elaborate a bit more on your comments.
> > > > > > > >
> > > > > > > > Based on your input, I have changed the description in VtableStub::pd_code_alignment() to read
> > > > > > > > "ARM32 cache line size is not an architected constant. We just align on word size."
> > > > > > > >
> > > > > > > > The description for aarch64 was adapted accordingly:
> > > > > > > > "aarch64 cache line size is not an architected constant. We just align on 4 bytes (instruction size)."
> > > > > > > >
> > > > > > > > With respect to the variable name, I just harmonized the naming across all platforms. I would appreciate if you could live with it. Another option would be to just return a value, without using any variable name. I don't like that too much, but if the community prefers it...
> > > > > > > >
> > > > > > > > I have updated http://cr.openjdk.java.net/~lucy/webrevs/8207343.01 in-place with just the two comment lines modified.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Lutz
> > > > > > > >
> > > > > > > > On 07.08.18, 22:12, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
> > > > > > > >
> > > > > > > > Hi Boris,
> > > > > > > > thanks for looking at this. I will respond to your comments in more detail next week. I'm on vacation this week.
> > > > > > > > Thanks,
> > > > > > > > Lutz
> > > > > > > >
> > > > > > > > On 05.08.18, 19:08, "Boris Ulasevich" <boris.ulasevich at bell-sw.com> wrote:
> > > > > > > >
> > > > > > > > Hi Lutz,
> > > > > > > >
> > > > > > > > I have run jtreg with your change and do not see new failures (the test that
> > > > > > > > fails on aarch64 is excluded for 32 bit platforms).
> > > > > > > >
> > > > > > > > I am OK with your change (I'm not a reviewer). But the description and
> > > > > > > > variable name in the VtableStub::pd_code_alignment function look strange
> > > > > > > > to me. Raspberry Pi2 ARM1176JZF-S processor has a cache line length of
> > > > > > > > 32 bytes, and, as far as I know, icache line size is not a constant for the ARM32
> > > > > > > > architecture.
> > > > > > > >
> > > > > > > > regards,
> > > > > > > > Boris
> > > > > > > >
> > > > > > > > On 02.08.2018 15:02, Schmidt, Lutz wrote:
> > > > > > > > > Hi Zhongwei,
> > > > > > > > >
> > > > > > > > > thank you for testing aarch64. Given my lack of expertise, I am surprised there are only those two issues.
> > > > > > > > > Ad 1.: fixed. I had forgotten to adapt the declaration just a few lines above.
> > > > > > > > > Ad 2.: adapted. I pushed the estimate to 152 bytes. I expected this value would require some adjustment.
> > > > > > > > >
> > > > > > > > > Please find a new webrev with the changes at http://cr.openjdk.java.net/~lucy/webrevs/8207343.01
> > > > > > > > >
> > > > > > > > > Best Regards,
> > > > > > > > > Lutz
> > > > > > > > >
> > > > > > > > > On 02.08.18, 04:45, "Zhongwei Yao" <Zhongwei.Yao at arm.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Lutz,
> > > > > > > > >
> > > > > > > > > > I have tested it on aarch64 by running jtreg tests and found two tiny issues in vtableStubs_aarch64.cpp's VtableStubs::create_itable_stub function:
> > > > > > > > > 1. typecheckSize on line 212 is not defined.
> > > > > > > > > > 2. estimate on line 211 is not large enough; I get 148 in the gc/g1/TestFromCardCacheIndex.java case. Here is the assertion failure from that case:
> > > > > > > > > assert(slop_delta >= 0) failed: itable #3: Code size estimate (140) for lookup_interface_method too small, required: 148
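
To make the failing check concrete, here is a minimal standalone sketch, assuming slop_delta is simply the estimate minus the number of bytes actually required (the webrev may compute it differently). The function names are hypothetical; the numbers 140 and 148 come from the assertion message above, and 152 is the adjusted estimate mentioned in the reply.

```cpp
// Hypothetical helpers, not HotSpot code.
// Assumption: slop_delta = estimate - required, so a negative value means
// the estimate for the code sequence was too small.
constexpr int slop_delta(int estimate, int required) {
  return estimate - required;
}

constexpr bool estimate_sufficient(int estimate, int required) {
  return slop_delta(estimate, required) >= 0;
}
```

Under that assumption, an estimate of 140 against a requirement of 148 is 8 bytes short and trips the assertion, while 152 leaves 4 bytes of slack.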
> > > > > > > > >
> > > > > > > > > > However, I have not tested the modification in vtableStubs_arm.cpp because I don't have an arm32 environment at hand.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best regards,
> > > > > > > > > Zhongwei
> > > > > > > > >
> > > > > > > > > ________________________________________
> > > > > > > > > From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> on behalf of Schmidt, Lutz <lutz.schmidt at sap.com>
> > > > > > > > > Sent: Monday, July 30, 2018 3:57:05 PM
> > > > > > > > > To: hotspot-compiler-dev at openjdk.java.net
> > > > > > > > > Subject: RFR(M): 8207343: Automate vtable/itable stub size calculation
> > > > > > > > >
> > > > > > > > > Dear all,
> > > > > > > > >
> > > > > > > > > may I please request reviews for this change:
> > > > > > > > >
> > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8207343
> > > > > > > > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8207343.00/
> > > > > > > > >
> > > > > > > > > > With this change, I try to get rid of the a priori size guessing for vtable and itable stubs. Please refer to the bug description for all the details. I didn't want to duplicate that text.
> > > > > > > > >
> > > > > > > > > ARM and AARCH64 help requested!
> > > > > > > > > The edits in vtableStubs_aarch64.cpp and vtableStubs_arm.cpp are made blindfolded. I am neither an ARM expert nor do I have build or test hardware available. I would be very grateful if one of the ARM gurus could please fill in for me.
> > > > > > > > >
> > > > > > > > > Thank you!
> > > > > > > > > Lutz
> > > > > > > > >