Linear implementation of Tarjan's algorithm

Mon Oct 23 15:58:45 UTC 2017

To have a better idea of the kind of performance issues we are having in 
this area, take a look here:

https://bugs.openjdk.java.net/browse/JDK-8152289

I'm slightly pessimistic that any of the performance fixes discussed 
here in the last few weeks would actually make a difference here - the 
sad reality is that the cost of doing incorporation dominates pretty 
much everything else. And the name of the game to address these kind of 
issues seem to be to attempt to reduce the size of the inference graph 
by finding similarities between variables [1]/minimize number of 
propagation steps as much as possible [2], etc.

[1] - https://bugs.openjdk.java.net/browse/JDK-8046685
[2] - https://bugs.openjdk.java.net/browse/JDK-8067767

Maurizio

On 23/10/17 14:01, B. Blaser wrote:
> I agree with both of you.
>
> To Rémi, It's never easy to choose between readability and absolute
> performance, but I agree an inline rewriting would probably be more
> appropriate here.
>
> To Maurizio, this wasn't intended to be a major optimization but only
> a very small rewriting that might help going faster in some particular
> situations.
>
> Cheers,
> Bernard
>
>
> On 22 October 2017 at 22:14, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>> Another thing I'm concerned here is that we're jumping into performance
>> fixes w/o any evidence of what the bottlenecks really are. I'm all for
>> fixing performance issues (and during JDK 9 we did a lot of work in order to
>> resolve the combinatorial explosion of checks in nested method calls). So, I
>> think that, before we decide to spend significant cycles on this (and other
>> related issues), we should at least come up with reasonable examples which
>> demonstrate the bottleneck. In other words, in a world where the average
>> inference graph has 4-5 nodes, does it really matter much how the contains()
>> method is implemented? And, if you have a 1000 nodes inference-graph, are
>> you sure that the main bottleneck is going to be the contains method? After
>> having profiled MANY javac runs, I have to admit I have never seen that
>> popping up - of course with all the optimizations that went into the
>> compiler recently, things might have changed, but I still think that we
>> should handle performance-related matters on a more data-driven basis.
>>
>> Cheers
>> Maurizio
>>
>>
>>
>> On 22/10/17 19:06, Remi Forax wrote:
>>> Please do not add a new class just for that, for a code as simple as this
>>> one, you can inline every methods.
>>> Also, i think the compiler will create 8 methods for your class HashStack,
>>> you should declare the class and all the method package private (or wait for
>>> the compiler using the 18.3 as bootstrap JDK).
>>>
>>> regards,
>>> Rémi
>>>
>>> ----- Mail original -----
>>>> De: "B. Blaser" <bsrbnd at gmail.com>
>>>> À: "compiler-dev" <compiler-dev at openjdk.java.net>
>>>> Envoyé: Dimanche 22 Octobre 2017 19:36:12
>>>> Objet: Linear implementation of Tarjan's algorithm
>>>> Hi,
>>>>
>>>> As briefly mentioned in [1], the current implementation of Tarjan's
>>>> algorithm performs a linear scan of the stack while iterating on the
>>>> node dependencies which might be avoided...
>>>>
>>>> Next is an attempt to make the whole implementation more linear using
>>>> a 'HashStack' that provides all operations in constant time,
>>>> especially 'contains()'.
>>>>
>>>> What do you think?
>>>>
>>>> Thanks,
>>>> Bernard
>>>>
>>>>
>>>> [1]
>>>> http://mail.openjdk.java.net/pipermail/compiler-dev/2017-October/011226.html
>>>>
>>>>
>>>> diff --git
>>>> a/src/jdk.compiler/share/classes/com/sun/tools/javac/util/GraphUtils.java
>>>> b/src/jdk.compiler/share/classes/com/sun/tools/javac/util/GraphUtils.java
>>>> ---
>>>> a/src/jdk.compiler/share/classes/com/sun/tools/javac/util/GraphUtils.java
>>>> +++
>>>> b/src/jdk.compiler/share/classes/com/sun/tools/javac/util/GraphUtils.java
>>>> @@ -28,6 +28,7 @@
>>>> import java.util.ArrayList;
>>>> import java.util.Collection;
>>>> import java.util.Properties;
>>>> +import java.util.HashSet;
>>>>
>>>> /** <p><b>This is NOT part of any supported API.
>>>>    *  If you write code that depends on this, you do so at your own risk.
>>>> @@ -164,7 +165,28 @@
>>>>           ListBuffer<List<N>> sccs = new ListBuffer<>();
>>>>
>>>>           /** Stack of all reacheable nodes from given root. */
>>>> -        ListBuffer<N> stack = new ListBuffer<>();
>>>> +        HashStack<N> stack = new HashStack<>();
>>>> +        //where
>>>> +        private static class HashStack<N> {
>>>> +            HashSet<N> set = new HashSet<>();
>>>> +            List<N> stack = List.nil();
>>>> +
>>>> +            private void push(N n) {
>>>> +                set.add(n);
>>>> +                stack = stack.prepend(n);
>>>> +            }
>>>> +
>>>> +            private N pop() {
>>>> +                N n = stack.head;
>>>> +                set.remove(n);
>>>> +                stack = stack.tail;
>>>> +                return n;
>>>> +            }
>>>> +
>>>> +            private boolean contains(N n) {
>>>> +                return set.contains(n);
>>>> +            }
>>>> +        }
>>>>
>>>>           private List<? extends List<? extends N>> findSCC(Iterable<?
>>>> extends N> nodes) {
>>>>               for (N node : nodes) {
>>>> @@ -197,7 +219,7 @@
>>>>               n.index = index;
>>>>               n.lowlink = index;
>>>>               index++;
>>>> -            stack.prepend(n);
>>>> +            stack.push(n);
>>>>               n.active = true;
>>>>           }
>>>>
>>>> @@ -205,7 +227,7 @@
>>>>               N n;
>>>>               ListBuffer<N> cycle = new ListBuffer<>();
>>>>               do {
>>>> -                n = stack.remove();
>>>> +                n = stack.pop();
>>>>                   n.active = false;
>>>>                   cycle.add(n);
>>>>                } while (n != v);
>>