Linear implementation of Tarjan's algorithm
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Nov 3 01:58:37 UTC 2017
On 02/11/17 23:26, B. Blaser wrote:
> Of course, it will still require some optimization to be sub-second
> (as you said), but this simple correction could do half of the job.
Hi Bernard,
unfortunately this optimization is unsound - there's no way to prove
that P <: Comparable<P> is the same as P' <: Comparable<P'>, because of
the f-bound. That's why the cache is there - it's not an optimization,
it's something that prevents javac from doing the same check over and over
again.
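To illustrate (a made-up example, not code from the compiler), consider
an f-bounded declaration like the one below: proving P <: Comparable<P>
for a fresh clone P' requires re-proving the very same relationship for
P', so a naive bound comparison would recurse forever.

    // Hypothetical example: an f-bounded type variable - the bound of P
    // mentions P itself, which is what makes the equality check loop
    // without the cache.
    static <P extends Comparable<P>> P max(P a, P b) {
        return a.compareTo(b) >= 0 ? a : b;
    }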
When such a loop occurs, the right thing to do is to say 'false', as it
is not possible to prove that the bounds are indeed the same. Note that
the fact that the two variables are clones is largely irrelevant here -
we clone new tvars every time we check a new instance of the same
generic method call/diamond constructor - so in a way these are all
different variables which just happen to have the same name/same set of
initial (declared) bounds.
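For instance (again a hypothetical example, reusing the max declaration
above), every call site gets its own clones:

    // Each generic call spawns fresh inference variables cloned from P;
    // the clones share the name and the declared bound, but are
    // otherwise unrelated variables.
    Integer i = max(1, 2);        // fresh clone of P, solved to Integer
    String  s = max("a", "b");    // another fresh clone, solved to String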
Moreover, the 'loose' type equality test is an artifact of the past - it
was introduced because the strict behavior (which was the default in 6)
was, well, sometimes too strict - essentially, when type variables
occurring in the bound of another type variable are substituted (e.g.
replacing U in the bound of 'Z extends List<U>'), javac creates a
different type variable with the replaced bound. If you performed the
same substitution twice on the original variable, the resulting
replaced vars would not be deemed 'equals' by javac, which is why this
looser behavior was added. Some more comments can be found here:
http://hg.openjdk.java.net/jdk7/jdk7/langtools/file/tip/src/share/classes/com/sun/tools/javac/code/Types.java#l632
These changes were added as a consequence of the fixes for:
https://bugs.openjdk.java.net/browse/JDK-6729401
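As a toy model of the problem (nothing like javac's actual Types code),
a substitution that allocates a fresh variable each time will
necessarily break reference equality, even when the very same
substitution is performed twice:

    // Toy model, not javac's implementation: subst() returns a brand new
    // type variable object, so two identical substitutions yield
    // instances that are different under reference equality.
    class SubstDemo {
        static final class TypeVar {
            final String name;
            final Object bound;   // stands in for a real type, e.g. List<U>
            TypeVar(String name, Object bound) { this.name = name; this.bound = bound; }
        }

        static TypeVar subst(TypeVar z, Object newBound) {
            return new TypeVar(z.name, newBound);   // fresh instance every time
        }

        public static void main(String[] args) {
            TypeVar z = new TypeVar("Z", "List<U>");
            // the same substitution, performed twice:
            System.out.println(subst(z, "List<String>") == subst(z, "List<String>")); // false
        }
    }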
Now, all this is relatively ancient history, which predates the Java 8
changes - in fact, this 'legacy' loose behavior is very problematic -
it's completely outside the spec and, in the general case with f-bounds,
it cannot even be proven to always complete (which is why the cache is
there). During 8, I sidestepped this issue by adding the 'strict' type
equality check, which should become the 'standard' - but, as there was
only so much code I could rewrite in one shot (:-)), I added the new
routine beside the old one, instead of having the new routine completely
replace the old one (and dealing with the compatibility fallout).
I think the time has come for us to do a bit of spring cleaning in the
area, and get rid of the legacy (and unsound) loose check, and see where
we're at.
That said, speaking of inference performance, if we ever move the
entire code base to use the strict equality check, your proposed
optimization would not even apply - in a way your optimization is
exploiting the fact that this particular check uses the loose check
instead of the strict one - with the strict one, two different type var
instances (even if they are clones with the same bounds) will still be
deemed different.
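In terms of the toy model above (a sketch of the two semantics, not of
javac's implementation), the checks differ roughly like this:

    // Strict check: identity - clones are always different variables.
    static boolean strictSame(TypeVar a, TypeVar b) {
        return a == b;
    }

    // Loose check: fall back to comparing bounds structurally - which is
    // exactly the comparison that can recurse endlessly with f-bounds,
    // hence the cache.
    static boolean looseSame(TypeVar a, TypeVar b) {
        return a == b || a.bound.equals(b.bound);
    }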
I still stand by my earlier analysis that the particular issue
highlighted in 8152289 is not something that needs to be addressed by
looking at the profiler. While you can make the bottleneck smaller and
smaller, you still have an underlying behavior that doesn't scale with
the number of nested calls - with increasing values of N, the number of
checks grows like 10, 100, 1000, 10000, ... - so no matter how fast the
rewritten logic is, you can always add a bunch of new nested calls to
bring the compilation time more or less back to what it was before the
fix.
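For example (a hypothetical stress test in the spirit of the 8152289
reproducers - the actual test case may differ), each extra nesting level
adds another inference context whose variables have to be incorporated
against the enclosing ones:

    class NestedCalls {
        static <T> T id(T t) { return t; }

        public static void main(String[] args) {
            // Each added layer of nesting multiplies the number of
            // incorporation checks, so compile time grows exponentially
            // with N no matter how fast each individual check is.
            String s = id(id(id(id(id("x")))));   // N = 5
            System.out.println(s);
        }
    }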
The real fix is to find a way so that the incorporation machinery
doesn't have to actually perform a number of steps that is exponential
in the number of type variables involved, and to detect patterns
that can be exploited in order to reduce the complexity of the
incorporation step.
Thanks
Maurizio