Performance of Scope.getSymbolsByName()

Ron Shapiro ronshapiro at google.com
Thu May 30 23:09:17 UTC 2019


Do you have any other comments on the first webrev? Is there anything else
we need to do to submit that?

On Wed, May 22, 2019 at 11:00 AM Ron Shapiro <ronshapiro at google.com> wrote:

> I think there still would be benefit in that as well, as I'm seeing that
> come up in other contexts (as referenced in the bug).
>
> On Wed, May 22, 2019 at 9:06 AM Jan Lahoda <jan.lahoda at oracle.com> wrote:
>
>> Sorry, I was too busy last few days.
>>
>> I was peeking at some possible improvements, but I think I like Ron's
>> first (.00) patch - that caches what can be cached nicely.
>>
>> Looking at the testcase generated by Ron's reproducer using:
>> python test.py 7
>>
>> and the (biggest) size of the outcome of:
>> types.membersClosure(site, false).getSymbolsByName(sym.name, cf)
>>
>> seems to be 13700 elements - which means the Scope lookup and iteration
>> runs ~13700 times, so avoiding these additional lookup costs seems like
>> a clear win.
>>
>> I have an idea that might speed up the iterations through deeply nested
>> CompoundScopes, although the effect of that in combination with Ron's is
>> going to be fairly limited, if any, I think.
>>
>> Jan
>>
>> On 22. 05. 19 12:24, Maurizio Cimadamore wrote:
>> > This doesn't work. You are basically relying on the order in which
>> > symbols are entered in the members closure scope.
>> >
>> > In simple example like these:
>> >
>> > class A {
>> >      int m(List<String> l) {return 0;}
>> > }
>> >
>> > class B extends A {
>> >     int m(List<Integer> l) {return 0;}
>> > }
>> >
>> >
>> > The logic you proposed will not work. That's because we first see B::m
>> -
>> > and 'symbolByName' is empty at that stage; so we add it there. Then we
>> > do another round and see A::m - but we don't really look there - given
>> > that we first check to see if the symbol we're considering (sym) is
>> > override-equivalent with B::m (the only symbol in symbolByName). And
>> > that happens to be the case, since they are the same symbol. So we exit
>> > the loop, w/o having found any clash.
>> >
>> > In other words, symbolByName would need to also contain A::m for the
>> > code to see the clash - but that never happens; by the time A::m is
>> > added, is already too late.
>> >
>> >
>> > I think caching the result of
>> >
>> > types.membersClosure(site, false).getSymbolsByName(sym.name, cf)
>> >
>> > is a good measure.
>> >
>> > I'm a bit surprised that iteration is so slow (membersClosure is slow
>> to
>> > set up, but once you do it the results are cached). So, rather than
>> > tweaking the algorithm, I think it'd be better to investigate the
>> reason
>> > was to why asking a compound scope iterator is so slow, which then
>> would
>> > yield dividends for the rest of the code as well.
>> >
>> > Maurizio
>> >
>> >
>> > On 21/05/2019 21:21, Maurizio Cimadamore wrote:
>> >>
>> >> I see what you have done - I have to think about it a bit to see if I
>> >> can come up with some counter example.
>> >>
>> >> Thanks
>> >> Maurizio
>> >>
>> >> On 21/05/2019 17:39, Ron Shapiro wrote:
>> >>> Are the checks of the inner loop symmetrical?
>> >>>
>> >>> Currently it's checking m_i against (m_0..n - m_i ). This second
>> >>> webrev below would check it against just (m_0..i-1 ), which albeit
>> >>> still n^2, it divides by a factor of 2.
>> >>>
>> >>> (sorry if the subscripting here doesn't display correctly)
>> >>>
>> >>> http://cr.openjdk.java.net/~ronsh/8224161/webrev.01/
>> >>>
>> >>> This feels conceptually logical to me, but I'm not seeing a
>> >>> measurable change by it. It looks a little bit cleaner to me, but I'm
>> >>> fine with either webrev given the benefits they both bring.
>> >>>
>> >>> I can take a look in another thread about speeding up CompoundScope
>> >>> iteration.
>> >>>
>> >>> On Tue, May 21, 2019 at 8:05 AM Maurizio Cimadamore
>> >>> <maurizio.cimadamore at oracle.com
>> >>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>> >>>
>> >>>
>> >>>     On 21/05/2019 12:16, Ron Shapiro wrote:
>> >>>>     I still think that something to optimize the actual ScopeImpl
>> >>>>     Iterable is a worthwhile endeavor, as that would alleviate the
>> >>>>     need to materialize here (and solve hopefully the other issues
>> >>>>     I'm seeing), but I was having trouble figuring out how to do
>> >>>>     that. This may be a good interim option without much cost.
>> >>>
>> >>>     Sure - I'm not opposed to optimizing the iteration process - I
>> >>>     was expressing my skepticism w.r.t. making checkOverrideClash
>> >>>     simpler/non quadratic.
>> >>>
>> >>>     Maurizio
>> >>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>     On Tue, May 21, 2019, 5:59 AM Maurizio Cimadamore
>> >>>>     <maurizio.cimadamore at oracle.com
>> >>>>     <mailto:maurizio.cimadamore at oracle.com>> wrote:
>> >>>>
>> >>>>         I think your fix is a good one. We spent some cycles
>> >>>>         optimizing this, a bit odd we have missed this :-)
>> >>>>
>> >>>>         I'm very skeptical you can collapse into a single loop, as
>> >>>>         this implement the logic in JLS 8.4.8.3 [1] which, as you
>> >>>>         can see, is inherently quadratic (for each method, we have
>> >>>>         to scan all methods with same name in supertypes to see if
>> >>>>         there is an override clash). The algorithm that was there
>> >>>>         before wasn't - and it lead to the wrong answers in tricky
>> >>>>         cases - so while you can get 80% there with a non-quadratic
>> >>>>         algorithm, you will miss issues by doing so.
>> >>>>
>> >>>>         One thing that would help would be, instead, to limit the
>> >>>>         analysis only in cases where it adds value - for instance,
>> >>>>         if your hierarchy is just non-generic classes (as in your
>> >>>>         example), then there's no way for you to accidentally
>> >>>>         override a 'bridge' method, since no bridges will be
>> >>>>         generated! But when looking at this, I couldn't find great
>> >>>>         ways to detect these conditions w/o spending more time than
>> >>>>         the check itself.
>> >>>>
>> >>>>         [1] -
>> >>>>
>> https://docs.oracle.com/javase/specs/jls/se12/html/jls-8.html#jls-8.4.8.3
>> >>>>
>> >>>>         Maurizio
>> >>>>
>> >>>>         On 20/05/2019 21:58, Ron Shapiro wrote:
>> >>>>>         In the real world example, I'm seeing the 40s that was
>> >>>>>         previously spent in Check.checkOverrideClashes drop to to
>> >>>>>         9.5s when using this patch. Of that 9.5, 8.9 is spent in
>> >>>>>         iterating through the CompoundIterator and calling
>> >>>>>         getSymbolsByName.
>> >>>>>
>> >>>>>         On Mon, May 20, 2019 at 4:34 PM Ron Shapiro
>> >>>>>         <ronshapiro at google.com <mailto:ronshapiro at google.com>>
>> wrote:
>> >>>>>
>> >>>>>             This patch, which materializes the duplicate outer and
>> >>>>>             inner Iterables first into a list. It removes the
>> >>>>>             entire section of the CompoundIterator iteration from
>> >>>>>             the profile.
>> >>>>>
>> >>>>>             webrev:
>> >>>>>
>> http://cr.openjdk.java.net/~ronsh/8224161/webrev.00/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java.sdiff.html
>> >>>>>
>> >>>>>             I'm not sure it's the absolutely correct solution as it
>> >>>>>             possibly masks an underlying issue.
>> >>>>>
>> >>>>>             I'm still seeing some time spent in
>> >>>>>             MethodSymbol.overrides, Types.isSubSignature, and
>> >>>>>             Types.memberType, all of which happen in the inner
>> >>>>>             loop. If we can remove those and collapse the nested
>> >>>>>             loops into one, then this solution isn't necessary and
>> >>>>>             it would also solve that performance issue.
>> >>>>>
>> >>>>>             On Fri, May 17, 2019 at 5:55 PM Ron Shapiro
>> >>>>>             <ronshapiro at google.com <mailto:ronshapiro at google.com>>
>> >>>>>             wrote:
>> >>>>>
>> >>>>>                 I still have more to investigate to fully wrap my
>> >>>>>                 head around it, but I finally found a sample
>> >>>>>                 program that exhibits this. Filed a bug here:
>> >>>>>                 https://bugs.openjdk.java.net/browse/JDK-8224161
>> >>>>>
>> >>>>>                 On Fri, May 17, 2019 at 11:21 AM Jan Lahoda
>> >>>>>                 <jan.lahoda at oracle.com
>> >>>>>                 <mailto:jan.lahoda at oracle.com>> wrote:
>> >>>>>
>> >>>>>                     Hi Ron,
>> >>>>>
>> >>>>>                     I am afraid it is hard to guess what is the
>> >>>>>                     problem without some
>> >>>>>                     testcase. So, at least to me, having a sample
>> >>>>>                     would be helpful.
>> >>>>>
>> >>>>>                     Thanks,
>> >>>>>                          Jan
>> >>>>>
>> >>>>>                     On 17. 05. 19 0:41, Ron Shapiro wrote:
>> >>>>>                     > Hi,
>> >>>>>                     >
>> >>>>>                     > I'm observing a particularly bizarre
>> >>>>>                     compilation. It's a single file
>> >>>>>                     > with annotation processing, and the type that
>> >>>>>                     is being compiled and
>> >>>>>                     > processed has ~1000 declared and inherited
>> >>>>>                     methods combined. The total
>> >>>>>                     > compilation is 3 minutes, but 65% of the
>> >>>>>                     entire compilation is spent in
>> >>>>>                     > 3 methods:
>> >>>>>                     >
>> >>>>>
>>  Check.checkOverrideClashes(), Resolve.findInheritedMemberType(),
>> >>>>>
>> >>>>>                     > and Resolve.findField().
>> >>>>>                     >
>> >>>>>                     > Looking at profiles, it looks like
>> >>>>>                     getSymbolsByName() is the major
>> >>>>>                     > culprit here. I initially thought the reason
>> >>>>>                     was that there were far too
>> >>>>>                     > many overloads (this type had >600
>> >>>>>                     overloads...) and that that was
>> >>>>>                     > causing a bad regression for the
>> >>>>>                     pseudo-hashmap that ScopeImpl uses.
>> >>>>>                     > However, renaming the methods did not
>> >>>>>                     alleviate the build pain and these
>> >>>>>                     > methods continue to be taking long amounts of
>> >>>>>                     time.
>> >>>>>                     >
>> >>>>>                     > I was wondering what could be done to improve
>> >>>>>                     the performance of this
>> >>>>>                     > code. It seemed to me that something like a
>> >>>>>                     Map<Name, List<Symbol>>
>> >>>>>                     > could be a reasonable+modern replacement for
>> >>>>>                     this table, which would
>> >>>>>                     > naturally have a fast getSymbolsForName()
>> >>>>>                     implementation. I'm having
>> >>>>>                     > some trouble implementing it correctly, and I
>> >>>>>                     believe it's partially
>> >>>>>                     > related to not fully understanding some of
>> >>>>>                     the semantics of the class.
>> >>>>>                     >
>> >>>>>                     > Does what I wrote make sense to anyone, and
>> >>>>>                     maybe spark a lightbulb?
>> >>>>>                     >
>> >>>>>                     > I'm trying to put together a repro in case
>> >>>>>                     that helps, but I'm not 100%
>> >>>>>                     > sure I even understand what the regression
>> >>>>>                     case is.
>> >>>>>                     >
>> >>>>>                     > Thanks for you help!
>> >>>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20190530/0b00c16f/attachment-0001.html>


More information about the compiler-dev mailing list