Performance of Scope.getSymbolsByName()

Jan Lahoda jan.lahoda at oracle.com
Wed Jun 12 06:44:39 UTC 2019


Hi Ron,

I ran the tests, I think it looks OK.
It is this patch, right:
http://cr.openjdk.java.net/~ronsh/8224161/webrev.00/

Do you need a sponsor? Maurizio, any comments?

Thanks,
     Jan

On 11. 06. 19 20:25, Ron Shapiro wrote:
> Is there a way I can run those? Any way I can help get this in?
> 
> On Thu, Jun 6, 2019 at 10:36 AM Jan Lahoda <jan.lahoda at oracle.com 
> <mailto:jan.lahoda at oracle.com>> wrote:
> 
>     Hi Ron,
> 
>     I think we should run it through the perf tests, just to be sure (and I
>     didn't have time for that yet), but otherwise, I think that simple
>     change looks useful to me.
> 
>     Thanks,
>           Jan
> 
>     On 06. 06. 19 3:41, Ron Shapiro wrote:
>      > Friendly ping - is the first webrev ready for submission?
>      >
>      > On Thu, May 30, 2019, 7:09 PM Ron Shapiro <ronshapiro at google.com
>     <mailto:ronshapiro at google.com>
>      > <mailto:ronshapiro at google.com <mailto:ronshapiro at google.com>>> wrote:
>      >
>      >     Do you have any other comments on the first webrev? Is there
>      >     anything else we need to do to submit that?
>      >
>      >     On Wed, May 22, 2019 at 11:00 AM Ron Shapiro
>     <ronshapiro at google.com <mailto:ronshapiro at google.com>
>      >     <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>>> wrote:
>      >
>      >         I think there still would be benefit in that as well, as I'm
>      >         seeing that come up in other contexts (as referenced in
>     the bug).
>      >
>      >         On Wed, May 22, 2019 at 9:06 AM Jan Lahoda
>      >         <jan.lahoda at oracle.com <mailto:jan.lahoda at oracle.com>
>     <mailto:jan.lahoda at oracle.com <mailto:jan.lahoda at oracle.com>>> wrote:
>      >
>      >             Sorry, I was too busy last few days.
>      >
>      >             I was peeking at some possible improvements, but I
>     think I
>      >             like Ron's
>      >             first (.00) patch - that caches what can be cached
>     nicely.
>      >
>      >             Looking at the testcase generated by Ron's reproducer
>     using:
>      >             python test.py 7
>      >
>      >             and the (biggest) size of the outcome of:
>      >             types.membersClosure(site,
>     false).getSymbolsByName(sym.name <http://sym.name>
>      >             <http://sym.name>, cf)
>      >
>      >             seems to be 13700 elements - which means the Scope lookup
>      >             and iteration
>      >             runs ~13700 times, so avoiding these additional
>     lookup costs
>      >             seems like
>      >             a clear win.
>      >
>      >             I have an idea that might speed up the iterations through
>      >             deeply nested
>      >             CompoundScopes, although the effect of that in
>     combination
>      >             with Ron's is
>      >             going to be fairly limited, if any, I think.
>      >
>      >             Jan
>      >
>      >             On 22. 05. 19 12:24, Maurizio Cimadamore wrote:
>      >              > This doesn't work. You are basically relying on
>     the order
>      >             in which
>      >              > symbols are entered in the members closure scope.
>      >              >
>      >              > In simple example like these:
>      >              >
>      >              > class A {
>      >              >      int m(List<String> l) {return 0;}
>      >              > }
>      >              >
>      >              > class B extends A {
>      >              >     int m(List<Integer> l) {return 0;}
>      >              > }
>      >              >
>      >              >
>      >              > The logic you proposed will not work. That's
>     because we
>      >             first see B::m -
>      >              > and 'symbolByName' is empty at that stage; so we
>     add it
>      >             there. Then we
>      >              > do another round and see A::m - but we don't
>     really look
>      >             there - given
>      >              > that we first check to see if the symbol we're
>      >             considering (sym) is
>      >              > override-equivalent with B::m (the only symbol in
>      >             symbolByName). And
>      >              > that happens to be the case, since they are the same
>      >             symbol. So we exit
>      >              > the loop, w/o having found any clash.
>      >              >
>      >              > In other words, symbolByName would need to also
>     contain
>      >             A::m for the
>      >              > code to see the clash - but that never happens; by the
>      >             time A::m is
>      >              > added, is already too late.
>      >              >
>      >              >
>      >              > I think caching the result of
>      >              >
>      >              > types.membersClosure(site,
>      >             false).getSymbolsByName(sym.name <http://sym.name>
>     <http://sym.name>, cf)
>      >              >
>      >              > is a good measure.
>      >              >
>      >              > I'm a bit surprised that iteration is so slow
>      >             (membersClosure is slow to
>      >              > set up, but once you do it the results are
>     cached). So,
>      >             rather than
>      >              > tweaking the algorithm, I think it'd be better to
>      >             investigate the reason
>      >              > was to why asking a compound scope iterator is so
>     slow,
>      >             which then would
>      >              > yield dividends for the rest of the code as well.
>      >              >
>      >              > Maurizio
>      >              >
>      >              >
>      >              > On 21/05/2019 21:21, Maurizio Cimadamore wrote:
>      >              >>
>      >              >> I see what you have done - I have to think about it a
>      >             bit to see if I
>      >              >> can come up with some counter example.
>      >              >>
>      >              >> Thanks
>      >              >> Maurizio
>      >              >>
>      >              >> On 21/05/2019 17:39, Ron Shapiro wrote:
>      >              >>> Are the checks of the inner loop symmetrical?
>      >              >>>
>      >              >>> Currently it's checking m_i against (m_0..n - m_i ).
>      >             This second
>      >              >>> webrev below would check it against
>     just (m_0..i-1 ),
>      >             which albeit
>      >              >>> still n^2, it divides by a factor of 2.
>      >              >>>
>      >              >>> (sorry if the subscripting here doesn't display
>     correctly)
>      >              >>>
>      >              >>> http://cr.openjdk.java.net/~ronsh/8224161/webrev.01/
>      >              >>>
>      >              >>> This feels conceptually logical to me, but I'm not
>      >             seeing a
>      >              >>> measurable change by it. It looks a little bit
>     cleaner
>      >             to me, but I'm
>      >              >>> fine with either webrev given the benefits they
>     both bring.
>      >              >>>
>      >              >>> I can take a look in another thread about
>     speeding up
>      >             CompoundScope
>      >              >>> iteration.
>      >              >>>
>      >              >>> On Tue, May 21, 2019 at 8:05 AM Maurizio Cimadamore
>      >              >>> <maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>
>      >             <mailto:maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>>
>      >              >>> <mailto:maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>
>      >             <mailto:maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>>>> wrote:
>      >              >>>
>      >              >>>
>      >              >>>     On 21/05/2019 12:16, Ron Shapiro wrote:
>      >              >>>>     I still think that something to optimize the
>      >             actual ScopeImpl
>      >              >>>>     Iterable is a worthwhile endeavor, as that
>     would
>      >             alleviate the
>      >              >>>>     need to materialize here (and solve
>     hopefully the
>      >             other issues
>      >              >>>>     I'm seeing), but I was having trouble
>     figuring out
>      >             how to do
>      >              >>>>     that. This may be a good interim option without
>      >             much cost.
>      >              >>>
>      >              >>>     Sure - I'm not opposed to optimizing the
>     iteration
>      >             process - I
>      >              >>>     was expressing my skepticism w.r.t. making
>      >             checkOverrideClash
>      >              >>>     simpler/non quadratic.
>      >              >>>
>      >              >>>     Maurizio
>      >              >>>
>      >              >>>>
>      >              >>>>
>      >              >>>>
>      >              >>>>     On Tue, May 21, 2019, 5:59 AM Maurizio
>     Cimadamore
>      >              >>>>     <maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>
>      >             <mailto:maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>>
>      >              >>>>     <mailto:maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>
>      >             <mailto:maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>>>> wrote:
>      >              >>>>
>      >              >>>>         I think your fix is a good one. We
>     spent some
>      >             cycles
>      >              >>>>         optimizing this, a bit odd we have
>     missed this :-)
>      >              >>>>
>      >              >>>>         I'm very skeptical you can collapse into a
>      >             single loop, as
>      >              >>>>         this implement the logic in JLS 8.4.8.3 [1]
>      >             which, as you
>      >              >>>>         can see, is inherently quadratic (for each
>      >             method, we have
>      >              >>>>         to scan all methods with same name in
>      >             supertypes to see if
>      >              >>>>         there is an override clash). The algorithm
>      >             that was there
>      >              >>>>         before wasn't - and it lead to the wrong
>      >             answers in tricky
>      >              >>>>         cases - so while you can get 80% there
>     with a
>      >             non-quadratic
>      >              >>>>         algorithm, you will miss issues by
>     doing so.
>      >              >>>>
>      >              >>>>         One thing that would help would be,
>     instead,
>      >             to limit the
>      >              >>>>         analysis only in cases where it adds
>     value -
>      >             for instance,
>      >              >>>>         if your hierarchy is just non-generic
>     classes
>      >             (as in your
>      >              >>>>         example), then there's no way for you to
>      >             accidentally
>      >              >>>>         override a 'bridge' method, since no
>     bridges
>      >             will be
>      >              >>>>         generated! But when looking at this, I
>      >             couldn't find great
>      >              >>>>         ways to detect these conditions w/o
>     spending
>      >             more time than
>      >              >>>>         the check itself.
>      >              >>>>
>      >              >>>>         [1] -
>      >              >>>>
>      >
>     https://docs.oracle.com/javase/specs/jls/se12/html/jls-8.html#jls-8.4.8.3
>      >              >>>>
>      >              >>>>         Maurizio
>      >              >>>>
>      >              >>>>         On 20/05/2019 21:58, Ron Shapiro wrote:
>      >              >>>>>         In the real world example, I'm seeing
>     the 40s
>      >             that was
>      >              >>>>>         previously spent in
>      >             Check.checkOverrideClashes drop to to
>      >              >>>>>         9.5s when using this patch. Of that
>     9.5, 8.9
>      >             is spent in
>      >              >>>>>         iterating through the CompoundIterator and
>      >             calling
>      >              >>>>>         getSymbolsByName.
>      >              >>>>>
>      >              >>>>>         On Mon, May 20, 2019 at 4:34 PM Ron
>     Shapiro
>      >              >>>>>         <ronshapiro at google.com
>     <mailto:ronshapiro at google.com>
>      >             <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>> <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>
>      >             <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>>>> wrote:
>      >              >>>>>
>      >              >>>>>             This patch, which materializes the
>      >             duplicate outer and
>      >              >>>>>             inner Iterables first into a list. It
>      >             removes the
>      >              >>>>>             entire section of the CompoundIterator
>      >             iteration from
>      >              >>>>>             the profile.
>      >              >>>>>
>      >              >>>>>             webrev:
>      >              >>>>>
>      >
>     http://cr.openjdk.java.net/~ronsh/8224161/webrev.00/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java.sdiff.html
>      >              >>>>>
>      >              >>>>>             I'm not sure it's the absolutely
>     correct
>      >             solution as it
>      >              >>>>>             possibly masks an underlying issue.
>      >              >>>>>
>      >              >>>>>             I'm still seeing some time spent in
>      >              >>>>>             MethodSymbol.overrides,
>      >             Types.isSubSignature, and
>      >              >>>>>             Types.memberType, all of which
>     happen in
>      >             the inner
>      >              >>>>>             loop. If we can remove those and
>     collapse
>      >             the nested
>      >              >>>>>             loops into one, then this solution
>     isn't
>      >             necessary and
>      >              >>>>>             it would also solve that
>     performance issue.
>      >              >>>>>
>      >              >>>>>             On Fri, May 17, 2019 at 5:55 PM
>     Ron Shapiro
>      >              >>>>>             <ronshapiro at google.com
>     <mailto:ronshapiro at google.com>
>      >             <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>> <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>
>      >             <mailto:ronshapiro at google.com
>     <mailto:ronshapiro at google.com>>>>
>      >              >>>>>             wrote:
>      >              >>>>>
>      >              >>>>>                 I still have more to
>     investigate to
>      >             fully wrap my
>      >              >>>>>                 head around it, but I finally
>     found a
>      >             sample
>      >              >>>>>                 program that exhibits this.
>     Filed a
>      >             bug here:
>      >              >>>>> https://bugs.openjdk.java.net/browse/JDK-8224161
>      >              >>>>>
>      >              >>>>>                 On Fri, May 17, 2019 at 11:21
>     AM Jan
>      >             Lahoda
>      >              >>>>>                 <jan.lahoda at oracle.com
>     <mailto:jan.lahoda at oracle.com>
>      >             <mailto:jan.lahoda at oracle.com
>     <mailto:jan.lahoda at oracle.com>>
>      >              >>>>>                 <mailto:jan.lahoda at oracle.com
>     <mailto:jan.lahoda at oracle.com>
>      >             <mailto:jan.lahoda at oracle.com
>     <mailto:jan.lahoda at oracle.com>>>> wrote:
>      >              >>>>>
>      >              >>>>>                     Hi Ron,
>      >              >>>>>
>      >              >>>>>                     I am afraid it is hard to
>     guess
>      >             what is the
>      >              >>>>>                     problem without some
>      >              >>>>>                     testcase. So, at least to me,
>      >             having a sample
>      >              >>>>>                     would be helpful.
>      >              >>>>>
>      >              >>>>>                     Thanks,
>      >              >>>>>                          Jan
>      >              >>>>>
>      >              >>>>>                     On 17. 05. 19 0:41, Ron
>     Shapiro
>      >             wrote:
>      >              >>>>>                     > Hi,
>      >              >>>>>                     >
>      >              >>>>>                     > I'm observing a particularly
>      >             bizarre
>      >              >>>>>                     compilation. It's a single
>     file
>      >              >>>>>                     > with annotation
>     processing, and
>      >             the type that
>      >              >>>>>                     is being compiled and
>      >              >>>>>                     > processed has ~1000 declared
>      >             and inherited
>      >              >>>>>                     methods combined. The total
>      >              >>>>>                     > compilation is 3
>     minutes, but
>      >             65% of the
>      >              >>>>>                     entire compilation is spent in
>      >              >>>>>                     > 3 methods:
>      >              >>>>>                     >
>      >              >>>>>
>      >             
>       Check.checkOverrideClashes(), Resolve.findInheritedMemberType(),
>      >              >>>>>
>      >              >>>>>                     > and Resolve.findField().
>      >              >>>>>                     >
>      >              >>>>>                     > Looking at profiles, it
>     looks like
>      >              >>>>>                     getSymbolsByName() is the
>     major
>      >              >>>>>                     > culprit here. I initially
>      >             thought the reason
>      >              >>>>>                     was that there were far too
>      >              >>>>>                     > many overloads (this
>     type had >600
>      >              >>>>>                     overloads...) and that
>     that was
>      >              >>>>>                     > causing a bad regression
>     for the
>      >              >>>>>                     pseudo-hashmap that
>     ScopeImpl uses.
>      >              >>>>>                     > However, renaming the
>     methods
>      >             did not
>      >              >>>>>                     alleviate the build pain
>     and these
>      >              >>>>>                     > methods continue to be
>     taking
>      >             long amounts of
>      >              >>>>>                     time.
>      >              >>>>>                     >
>      >              >>>>>                     > I was wondering what
>     could be
>      >             done to improve
>      >              >>>>>                     the performance of this
>      >              >>>>>                     > code. It seemed to me that
>      >             something like a
>      >              >>>>>                     Map<Name, List<Symbol>>
>      >              >>>>>                     > could be a reasonable+modern
>      >             replacement for
>      >              >>>>>                     this table, which would
>      >              >>>>>                     > naturally have a fast
>      >             getSymbolsForName()
>      >              >>>>>                     implementation. I'm having
>      >              >>>>>                     > some trouble implementing it
>      >             correctly, and I
>      >              >>>>>                     believe it's partially
>      >              >>>>>                     > related to not fully
>      >             understanding some of
>      >              >>>>>                     the semantics of the class.
>      >              >>>>>                     >
>      >              >>>>>                     > Does what I wrote make
>     sense to
>      >             anyone, and
>      >              >>>>>                     maybe spark a lightbulb?
>      >              >>>>>                     >
>      >              >>>>>                     > I'm trying to put together a
>      >             repro in case
>      >              >>>>>                     that helps, but I'm not 100%
>      >              >>>>>                     > sure I even understand
>     what the
>      >             regression
>      >              >>>>>                     case is.
>      >              >>>>>                     >
>      >              >>>>>                     > Thanks for you help!
>      >              >>>>>
>      >
> 


More information about the compiler-dev mailing list