Performance of Scope.getSymbolsByName()
Ron Shapiro
ronshapiro at google.com
Thu Jun 6 01:41:48 UTC 2019
Friendly ping - is the first webrev ready for submission?
On Thu, May 30, 2019, 7:09 PM Ron Shapiro <ronshapiro at google.com> wrote:
> Do you have any other comments on the first webrev? Is there anything else
> we need to do to submit that?
>
> On Wed, May 22, 2019 at 11:00 AM Ron Shapiro <ronshapiro at google.com>
> wrote:
>
>> I think there still would be benefit in that as well, as I'm seeing that
>> come up in other contexts (as referenced in the bug).
>>
>> On Wed, May 22, 2019 at 9:06 AM Jan Lahoda <jan.lahoda at oracle.com> wrote:
>>
>>> Sorry, I was too busy last few days.
>>>
>>> I was peeking at some possible improvements, but I think I like Ron's
>>> first (.00) patch - that caches what can be cached nicely.
>>>
>>> Looking at the testcase generated by Ron's reproducer using:
>>> python test.py 7
>>>
>>> and the (biggest) size of the outcome of:
>>> types.membersClosure(site, false).getSymbolsByName(sym.name, cf)
>>>
>>> seems to be 13700 elements - which means the Scope lookup and iteration
>>> runs ~13700 times, so avoiding these additional lookup costs seems like
>>> a clear win.
>>>
>>> I have an idea that might speed up the iterations through deeply nested
>>> CompoundScopes, although the effect of that in combination with Ron's is
>>> going to be fairly limited, if any, I think.
>>>
>>> Jan
>>>
>>> On 22. 05. 19 12:24, Maurizio Cimadamore wrote:
>>> > This doesn't work. You are basically relying on the order in which
>>> > symbols are entered in the members closure scope.
>>> >
>>> > In simple example like these:
>>> >
>>> > class A {
>>> > int m(List<String> l) {return 0;}
>>> > }
>>> >
>>> > class B extends A {
>>> > int m(List<Integer> l) {return 0;}
>>> > }
>>> >
>>> >
>>> > The logic you proposed will not work. That's because we first see B::m
>>> -
>>> > and 'symbolByName' is empty at that stage; so we add it there. Then we
>>> > do another round and see A::m - but we don't really look there - given
>>> > that we first check to see if the symbol we're considering (sym) is
>>> > override-equivalent with B::m (the only symbol in symbolByName). And
>>> > that happens to be the case, since they are the same symbol. So we
>>> exit
>>> > the loop, w/o having found any clash.
>>> >
>>> > In other words, symbolByName would need to also contain A::m for the
>>> > code to see the clash - but that never happens; by the time A::m is
>>> > added, is already too late.
>>> >
>>> >
>>> > I think caching the result of
>>> >
>>> > types.membersClosure(site, false).getSymbolsByName(sym.name, cf)
>>> >
>>> > is a good measure.
>>> >
>>> > I'm a bit surprised that iteration is so slow (membersClosure is slow
>>> to
>>> > set up, but once you do it the results are cached). So, rather than
>>> > tweaking the algorithm, I think it'd be better to investigate the
>>> reason
>>> > was to why asking a compound scope iterator is so slow, which then
>>> would
>>> > yield dividends for the rest of the code as well.
>>> >
>>> > Maurizio
>>> >
>>> >
>>> > On 21/05/2019 21:21, Maurizio Cimadamore wrote:
>>> >>
>>> >> I see what you have done - I have to think about it a bit to see if I
>>> >> can come up with some counter example.
>>> >>
>>> >> Thanks
>>> >> Maurizio
>>> >>
>>> >> On 21/05/2019 17:39, Ron Shapiro wrote:
>>> >>> Are the checks of the inner loop symmetrical?
>>> >>>
>>> >>> Currently it's checking m_i against (m_0..n - m_i ). This second
>>> >>> webrev below would check it against just (m_0..i-1 ), which albeit
>>> >>> still n^2, it divides by a factor of 2.
>>> >>>
>>> >>> (sorry if the subscripting here doesn't display correctly)
>>> >>>
>>> >>> http://cr.openjdk.java.net/~ronsh/8224161/webrev.01/
>>> >>>
>>> >>> This feels conceptually logical to me, but I'm not seeing a
>>> >>> measurable change by it. It looks a little bit cleaner to me, but
>>> I'm
>>> >>> fine with either webrev given the benefits they both bring.
>>> >>>
>>> >>> I can take a look in another thread about speeding up CompoundScope
>>> >>> iteration.
>>> >>>
>>> >>> On Tue, May 21, 2019 at 8:05 AM Maurizio Cimadamore
>>> >>> <maurizio.cimadamore at oracle.com
>>> >>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>> >>>
>>> >>>
>>> >>> On 21/05/2019 12:16, Ron Shapiro wrote:
>>> >>>> I still think that something to optimize the actual ScopeImpl
>>> >>>> Iterable is a worthwhile endeavor, as that would alleviate the
>>> >>>> need to materialize here (and solve hopefully the other issues
>>> >>>> I'm seeing), but I was having trouble figuring out how to do
>>> >>>> that. This may be a good interim option without much cost.
>>> >>>
>>> >>> Sure - I'm not opposed to optimizing the iteration process - I
>>> >>> was expressing my skepticism w.r.t. making checkOverrideClash
>>> >>> simpler/non quadratic.
>>> >>>
>>> >>> Maurizio
>>> >>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Tue, May 21, 2019, 5:59 AM Maurizio Cimadamore
>>> >>>> <maurizio.cimadamore at oracle.com
>>> >>>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>> >>>>
>>> >>>> I think your fix is a good one. We spent some cycles
>>> >>>> optimizing this, a bit odd we have missed this :-)
>>> >>>>
>>> >>>> I'm very skeptical you can collapse into a single loop, as
>>> >>>> this implement the logic in JLS 8.4.8.3 [1] which, as you
>>> >>>> can see, is inherently quadratic (for each method, we have
>>> >>>> to scan all methods with same name in supertypes to see if
>>> >>>> there is an override clash). The algorithm that was there
>>> >>>> before wasn't - and it lead to the wrong answers in tricky
>>> >>>> cases - so while you can get 80% there with a non-quadratic
>>> >>>> algorithm, you will miss issues by doing so.
>>> >>>>
>>> >>>> One thing that would help would be, instead, to limit the
>>> >>>> analysis only in cases where it adds value - for instance,
>>> >>>> if your hierarchy is just non-generic classes (as in your
>>> >>>> example), then there's no way for you to accidentally
>>> >>>> override a 'bridge' method, since no bridges will be
>>> >>>> generated! But when looking at this, I couldn't find great
>>> >>>> ways to detect these conditions w/o spending more time than
>>> >>>> the check itself.
>>> >>>>
>>> >>>> [1] -
>>> >>>>
>>> https://docs.oracle.com/javase/specs/jls/se12/html/jls-8.html#jls-8.4.8.3
>>> >>>>
>>> >>>> Maurizio
>>> >>>>
>>> >>>> On 20/05/2019 21:58, Ron Shapiro wrote:
>>> >>>>> In the real world example, I'm seeing the 40s that was
>>> >>>>> previously spent in Check.checkOverrideClashes drop to to
>>> >>>>> 9.5s when using this patch. Of that 9.5, 8.9 is spent in
>>> >>>>> iterating through the CompoundIterator and calling
>>> >>>>> getSymbolsByName.
>>> >>>>>
>>> >>>>> On Mon, May 20, 2019 at 4:34 PM Ron Shapiro
>>> >>>>> <ronshapiro at google.com <mailto:ronshapiro at google.com>>
>>> wrote:
>>> >>>>>
>>> >>>>> This patch, which materializes the duplicate outer and
>>> >>>>> inner Iterables first into a list. It removes the
>>> >>>>> entire section of the CompoundIterator iteration from
>>> >>>>> the profile.
>>> >>>>>
>>> >>>>> webrev:
>>> >>>>>
>>> http://cr.openjdk.java.net/~ronsh/8224161/webrev.00/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java.sdiff.html
>>> >>>>>
>>> >>>>> I'm not sure it's the absolutely correct solution as it
>>> >>>>> possibly masks an underlying issue.
>>> >>>>>
>>> >>>>> I'm still seeing some time spent in
>>> >>>>> MethodSymbol.overrides, Types.isSubSignature, and
>>> >>>>> Types.memberType, all of which happen in the inner
>>> >>>>> loop. If we can remove those and collapse the nested
>>> >>>>> loops into one, then this solution isn't necessary and
>>> >>>>> it would also solve that performance issue.
>>> >>>>>
>>> >>>>> On Fri, May 17, 2019 at 5:55 PM Ron Shapiro
>>> >>>>> <ronshapiro at google.com <mailto:ronshapiro at google.com>>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>> I still have more to investigate to fully wrap my
>>> >>>>> head around it, but I finally found a sample
>>> >>>>> program that exhibits this. Filed a bug here:
>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8224161
>>> >>>>>
>>> >>>>> On Fri, May 17, 2019 at 11:21 AM Jan Lahoda
>>> >>>>> <jan.lahoda at oracle.com
>>> >>>>> <mailto:jan.lahoda at oracle.com>> wrote:
>>> >>>>>
>>> >>>>> Hi Ron,
>>> >>>>>
>>> >>>>> I am afraid it is hard to guess what is the
>>> >>>>> problem without some
>>> >>>>> testcase. So, at least to me, having a sample
>>> >>>>> would be helpful.
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Jan
>>> >>>>>
>>> >>>>> On 17. 05. 19 0:41, Ron Shapiro wrote:
>>> >>>>> > Hi,
>>> >>>>> >
>>> >>>>> > I'm observing a particularly bizarre
>>> >>>>> compilation. It's a single file
>>> >>>>> > with annotation processing, and the type that
>>> >>>>> is being compiled and
>>> >>>>> > processed has ~1000 declared and inherited
>>> >>>>> methods combined. The total
>>> >>>>> > compilation is 3 minutes, but 65% of the
>>> >>>>> entire compilation is spent in
>>> >>>>> > 3 methods:
>>> >>>>> >
>>> >>>>>
>>> Check.checkOverrideClashes(), Resolve.findInheritedMemberType(),
>>> >>>>>
>>> >>>>> > and Resolve.findField().
>>> >>>>> >
>>> >>>>> > Looking at profiles, it looks like
>>> >>>>> getSymbolsByName() is the major
>>> >>>>> > culprit here. I initially thought the reason
>>> >>>>> was that there were far too
>>> >>>>> > many overloads (this type had >600
>>> >>>>> overloads...) and that that was
>>> >>>>> > causing a bad regression for the
>>> >>>>> pseudo-hashmap that ScopeImpl uses.
>>> >>>>> > However, renaming the methods did not
>>> >>>>> alleviate the build pain and these
>>> >>>>> > methods continue to be taking long amounts of
>>> >>>>> time.
>>> >>>>> >
>>> >>>>> > I was wondering what could be done to improve
>>> >>>>> the performance of this
>>> >>>>> > code. It seemed to me that something like a
>>> >>>>> Map<Name, List<Symbol>>
>>> >>>>> > could be a reasonable+modern replacement for
>>> >>>>> this table, which would
>>> >>>>> > naturally have a fast getSymbolsForName()
>>> >>>>> implementation. I'm having
>>> >>>>> > some trouble implementing it correctly, and I
>>> >>>>> believe it's partially
>>> >>>>> > related to not fully understanding some of
>>> >>>>> the semantics of the class.
>>> >>>>> >
>>> >>>>> > Does what I wrote make sense to anyone, and
>>> >>>>> maybe spark a lightbulb?
>>> >>>>> >
>>> >>>>> > I'm trying to put together a repro in case
>>> >>>>> that helps, but I'm not 100%
>>> >>>>> > sure I even understand what the regression
>>> >>>>> case is.
>>> >>>>> >
>>> >>>>> > Thanks for you help!
>>> >>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20190605/3dc532f3/attachment-0001.html>
More information about the compiler-dev
mailing list