Constant propagation in Nashorn compiler pipeline

Attila Szegedi attila.szegedi at oracle.com
Wed Jun 3 15:34:34 UTC 2015


This is to initiate a discussion of (maybe) adding some form of constant propagation into Nashorn’s compiler pipeline. We currently do constant folding as a early phase (Lower), but we don’t have constant propagation. Michael Haupt expressed interest to tackle this, and I thought it’d be best to work out the design kinks here in the open so others can also chime in if they want.

Scope
=====

We’re talking about intra-function constant propagation over non-scoped local variables only. That should not be too hard to do. It could be somewhat similar to how LocalVariableTypesCalculator (LVTC) works, actually. LVTC currently does def-use tracking of type information from assignment to IdentNode objects and propagates the type information into use of IdentNode objects, through their shared Symbol. This is similar, except it could be used to propagate constant values. 

Of course, if we wanted, later we could even extend it to something else, e.g. range analysis, or really any other invariants. It’s just a matter of how much work are we comfortable having the compiler do.

Implementation notes
=================

By the way of implementation, we’d need a top-down (meaning, it operates on enterXxx() methods; possibly multipass for loops, same as LVTC) pass that keeps a Map<Symbol, LiteralNode> and builds another Map<IdentNode, LiteralNode>. After it’s done, we’d need another pass, this time bottom-up (operating on leaveXxx() methods), replacing every IdentNode identified as having constant value with a LiteralNode instead. 

We should actually be even smarter about it, and have the top-down pass do another round of constant folding, using a generic Map<Expression, LiteralNode> instead of Map<IdentNode, LiteralNode> and evaluate expressions on the top-down pass and substitute already known constants opportunistically attempting to evaluate other expressions into constants. E.g. it could then collapse:

    var a = 3; var b = 2; var c = a + b;

into

    var c = 5;

(and eliminate a and b if they aren’t used elsewhere)

We’d still need the bottom-up pass to perform the final replacement of folded constants. As such, the replacement of IdentNode with a LiteralNode would just become a special case of constant folding after propagation :-)

An important element of the solution is to ensure that if at control flow join points a Symbol arrives containing different possible constant values on different branches (or a constant and a non-constant), then its const-ness has to be erased.

Function literals
============

Also, we should figure out a way to represent a “literal” for a function. We currently don’t do that. Constant propagation for function objects isn’t trivial as we can’t just replace every occurrence with a new FunctionNode; on the other hand it’d be nice to have as invocations of constant-propagated (proven never to change) functions could be  performed without a guard, e.g.

function () {
    function f() { … }
    …
    f(); <— “f” can be proven to always point to the function above.
}

Where to put the pass in the pipeline
============================

This pass should likely be done earlier than LVTC, so that we don’t calculate local types for variable reads that were replaced with constants. If all reads of a local variable are replaced with a constant, Nashorn can even completely eliminate the local variable (it currently eliminates never-read locals).

Attila.



More information about the nashorn-dev mailing list