Complexity reduction when choosing the best inference leaf to be solved

Fri Oct 13 16:51:28 UTC 2017


On 13/10/17 13:31, Vicente Romero wrote:
>
>
> On 10/13/2017 05:08 AM, Maurizio Cimadamore wrote:
>>
>>
>>
>> On 12/10/17 22:27, Vicente Romero wrote:
>>>
>>>
>>> On 10/12/2017 04:16 PM, Maurizio Cimadamore wrote:
>>>>
>>>>
>>>>
>>>> On 12/10/17 20:04, Vicente Romero wrote:
>>>>> This is my interpretation of this problem and how I think we 
>>>>> should discuss if this interpretation is correct, or if there is a 
>>>>> better one, before considering optimizing the current 
>>>>> implementation which doesn't fit very well into any of these 
>>>>> descriptions strategies.
>>>> I think your interpretation of the problem is correct. This is the 
>>>> dual of the max flow problem, with some variations (as you note). 
>>>> I'm not sure about the aggressive strategy you propose; it seems 
>>>> like it could get fooled quite easily if a flow with an high 
>>>> overall cost is 'hidden' behind a leaf with an initial very low 
>>>> cost. As per your point (2), yes it will typically be the case that 
>>>> the var to instantiate is not a leaf, which is why we try to come 
>>>> up with the best set of variables to instantiate eagerly - knowing 
>>>> that each (redundant) eager instantiation could potentially lead to 
>>>> spurious error messages.
>>>>
>>>> That said, this part is outside the spec - as section 18.5.2.2 says 
>>>> this:
>>>>
>>>> "1 A subset of constraints is selected in C, satisfying the 
>>>> property that, for each constraint, no input variable can influence 
>>>> an output variable of another constraint in C. The terms /input 
>>>> variable/ and /output variable/ are defined below. An inference 
>>>> variable α /can influence/ an inference variable β if α depends on 
>>>> the resolution of β (§18.4 
>>>> <https://docs.oracle.com/javase/specs/jls/se9/html/jls-18.html#jls-18.4>), 
>>>> or vice versa; or if there exists a third inference variable γ such 
>>>> that α can influence γ and γ can influence β.
>>>>
>>>> 2 The selected constraint(s) are removed from C.
>>>>
>>>> 3 The input variables α1, ..., αm of all the selected constraint(s) 
>>>> are resolved.""
>>>>
>>>> Here javac is playing some tricks that are totally outside of the 
>>>> spec - note that the spec requires that all input inference 
>>>> variables in the node to be unstuck should be solved at once - but 
>>>> that's not what javac does - as javac calls solveAny to do the 
>>>> heuristic trick:
>>>>
>>>> http://hg.openjdk.java.net/jdk10/master/file/tip/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/DeferredAttr.java#l635
>>>>
>>>> I think that, instead of investing resources on making these tricks 
>>>> more optimal, we should probably try to bring the compiler closer 
>>>> to what the spec has to say in this matter (or at least try and see 
>>>> what breaks, if anything).
>>>
>>> Or this is probably one of those cases in which the spec should 
>>> mimic what javac is doing
>> I was thinking more about it, and  yes, there can be cases where the 
>> spec is overly eager; consider a couple of constraints like C1 and C2:
>>
>>
>> vars(C1) = I1, O1
>> input(C1) = I1
>> output(C1) = O1
>>
>> vars(C2) = I2, O2
>> input(C2) = I2
>> output(C2) = O2
>>
>> T can influence I1
>> U can influence I1
>> V can influence U
>>
>>
>> In this case, it's fine to pick C1 as the first constraint to solve - 
>> as its input variables (I1) do not influence any output var (O1, O2) 
>> of any other constraints.
>>
>> So, now we have to solve all input vars of C2 - namely I1. Which 
>> means, now we have to turn at the 'can-influence' relationship, and 
>> ask 'what are the variables I need to solve in order to get at I1' ? 
>> I believe this question has only one answer that is 'minimal'. In 
>> other words, there should be only a minimal subset of the graph that 
>> goes from one or more leaves and lead to I1. In the example above, 
>> such subset is { T, U, V, I1 }.
>>
>> Of course there are more non-minimal choices:
>>
>> { T, U, V, I1, O1 }
>> { T, U, V, I1, O2 }
>> { T, U, V, I1, O1, O2 }
>>
>> But I think there's only _one_ minimal answer. So, the way I see the 
>> javac implementation, is that the BestLeafSolver is a big workaround 
>> to the fact that we need to be able to solve a 'subset' of the 
>> inference graph. But this can be achieved in many different ways - 
>> and the approach implemented right now (which predates the spec text) 
>> is probably more general than it needs to.
>>
>> I think a very simple BestLeafSolver would simply 'skip' all the 
>> inference graph nodes which cannot influence any of the nodes 
>> containing at least one variable in the target set, and will keep 
>> going until all variables in such set are solved. I don' think such 
>> an algorithm would require any tracking of cost, paths and such.
>
> well I think that there are two parts here, one is finding that 
> minimal set that could be done using a simple flood fill algorithm in 
> the graph, just connectivity matters. But then once we have the 
> subset, we have to find a resolution order right? After the 
> application of the Tarjan algo for strongly connected components, the 
> graph is already topologically sorted. The subset found in step 1 will 
> also be sorted we can follow that order.
Well, yes - basically we need to run Tarjan on a smaller graph.

Maurizio
>
>>
>> Maurizio
>
> Vicente
>
>>>
>>>>
>>>> Cheers
>>>> Maurizio
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20171013/0b4ded31/attachment.html>