Loading classes with many methods is very expensive

Mon Oct 27 12:45:47 UTC 2014

Hi Aleksey,

On 10/27/2014 12:23 PM, Aleksey Shipilev wrote:
> On 10/27/2014 01:53 AM, Peter Levart wrote:
>> As you might have noticed, the number of methods obtained from patched
>> code differed from original code. I have investigated this and found
>> that original code treats abstract class methods the same as abstract
>> interface methods as far as multiple inheritance is concerned (it keeps
>> them together in the returned array). So I fixed this and here's new
>> webrev which behaves the same as original code:
>>
>> http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/
> This seems to be a candidate fix for
> https://bugs.openjdk.java.net/browse/JDK-8061950, right? If so, please
> assign yourself there. It might be advisable to start the separate RFR
> thread for a fix?

That's exactly the issue I was looking for. I assigned it to myself.

Thanks for looking at the code...

>
> My comments for the first reading of the patch:
>
>   * Why MethodList maintains the linked list? Doesn't O(n)
> MethodList.add() lead to quadratic complexity again? Should we instead
> use the ArrayList<Method> for storing method synonyms?

It's not quadratic, but more O(n*m) where n is the number of methods and 
m is the number of methods with same signature inherited via different 
paths from abstract class and/or (super)interfaces. This is typically 1, 
rarely 2, almost never 3 or more. For example:

abstract class A implements I, J, K {}

interface I {  void m(); }

interface J {  void m(); }

interface K {  void m(); }

Here the linked list with signature 'void m()' will contain 3 elements. 
When a particular method for a particular signature is found in a 
(super)interface, it has to be compared with potentially all methods 
having same signature and modifications performed while traversing are: 
do nothing, remove element, append element. So linked-list is perfect 
for that purpose. LinkedList.add() is only called when an abstract 
superclass brings multiple methods with same signature and all of them 
are inherited by abstract subclass - very very rarely is there more than 
1 method with same signature inherited from superclass.

I merged the functionality of the method-signature-key with the 
linked-list node, holding just a reference to Method object and a link 
to 'next' node. I think this is the most compact temporary datastructure 
representation for that purpose that leverages standard LinkedHashMap.

>
>   * Please convert some of the indexed loops into for-each loops for
> better readability? E.g:
>
>        for (int i = 0; i < intfcsMethods.length; i++) {
>           for (Method m : intfcsMethods[i]) {
>
>    ...to:
>
>        for (Method[] ms : intfcsMethods) {
>           for (Method m : ms) {

Good catch. I'll do this.

>   * I would think the code would be more readable if MethodList.m was
> having a more readable name, e.g. MethodList.base or MethodList.image or
> MethodList.primary?

Just MethodList.method will be ok - it's nothing special with this 
method. It just happens to be 1st in list of methods with same signature.

With introduction of local variables (later down), the conditions will 
be shortened in MethodList.consolidateWith() and short field name is not 
needed.

>
>   * Formatting and logic in MethodList.consolidateWith gives me chills. I
> think at very least introducing variables for m.getDeclaringClass() and
> ml.m.getDeclaringClass() improve readability. Also, moving the comments
> at the end of the line should improve readability and better highlight
> the boolean expression you are spelling out. Below is the
> straight-forward rewrite:
>
>>      MethodList consolidateWith(MethodList ml) {
>>          Method mineM = m;
>>          Method otherM = ml.m;
>>          Class<?> mine = mineM.getDeclaringClass();
>>          Class<?> other = otherM.getDeclaringClass();
>>
>>          if (other == mine                                             // other is same method as mine
>>              || !Modifier.isAbstract(mineM.getModifiers())             // OR, mine is non-abstract method...
>>                  && (other.isAssignableFrom(mine)                      // ...which overrides other
>>                     || other.isInterface() && !mine.isInterface())) {  // ...class method which wins over interface
>>              // just return
>>              return this;
>>          }
>>          if (!Modifier.isAbstract(otherM.getModifiers())               // other is non-abstract method...
>>               && (mine.isAssignableFrom(other)                         // ...which overrides mine
>>                   || mine.isInterface() && !other.isInterface())) {    // ...class method which wins over interface
>>              // knock m out and continue
>>              return (next == null) ? ml : next.consolidateWith(ml);
>>          }
>>
>>          // else leave m in and continue
>>          next = (next == null) ? ml : next.consolidateWith(ml);
>>          return this;
>>      }

Excellent. I'll take your formatting if you don't mind ;-)

>
>    * Should we use just methods.put() here?
>        2851             MethodList ml0 = methods.putIfAbsent(ml, ml);

It really doesn't matter as the exception is thrown if (ml0 != null) and 
LinkedHashMap is forgotten...

if (ml0 == null), put and putIfAbsent are equivalent. I used putIfAbsent 
to make the 3 loops (declared methods, superclass methods, 
(super)interface methods) more uniform.

>
>    * I wonder if the code would be simpler if we push the Map<MethodList,
> MethodList> maintenance to some internal utility class? That would
> probably simplify the loops in privateGetPublicMethods?

I thought MethodList.add()/consolidateWith() were close to those utility 
methods. I could extract the logic in each of 3 loops to 3 void methods 
taking 'methods' Map and a Method instance, but I don't want to pollute 
the method namespace in j.l.Class any more than necessary and making a 
special inner class for that is an overkill. Previously the holder for 
such methods was MethodArray, but it's gone now. I could subclass 
LinkedHashMap if I wanted to avoid an unnecessary indirection, but 
that's not a good practice. It's not that bad as it is - 38 lines of 
code for 3 loops with comments and spacing. I could improve comments 
though...

>
> Thanks,
> -Aleksey.
>