RFR: 8344148: Add an explicit compiler phase for warning generation

Thu Nov 14 10:55:18 UTC 2024

On Wed, 13 Nov 2024 22:25:36 GMT, Archie Cobbs <acobbs at openjdk.org> wrote:

> Please review this patch which does some minor refactoring to the compiler:
> * Create a new `WARN` phase which can be a dedicated home for (new) lint/warning logic
> * Create a new `WarningAnalyzer` singleton whose job is to invoke such lint/warning logic
> * Move `ThisEscapeAnalyzer` out of `Flow` (where it doesn't belong) and into `WarningAnalyzer`
> * Refactor `ThisEscapeAnalyzer` to be a context singleton like all other such classes
> 
> See [JDK-8344148](https://bugs.openjdk.org/browse/JDK-8344148) for details.

Note that javac has another way to perform analysis on code -- that's the `Analyzer` class. I've added that class a while ago, in an attempt to consolidate various "finders" that were scattered across the compiler code base. What is a "finder" ? Well, it's a piece of analysis that takes a piece of existing code, rewrites it in a certain way, and then see if the result preserve some properties of the original tree. Example:

* original tree: `new HashMap<String, Integer>`
* rewritten tree: `new HashMap<>`
* property to preserve: is the type of the `JCNewTree` AST node preserved by the transformation?

While we have been able to move the finders elsewhere and consolidate some of the logic, there is still some coupling between `Attr` and the `Analyzer` class that I'm not too proud if: namely, `Attr` has to "register" trees with `Analizer`, so that these trees can be queued up for further inspection.

If we are to have a dedicated pass for all kind of "pluggable" analyses, then it seems to me that a "finder" is just a special kind of analysis that is carried out - whereas a "linter" is another kind. Both need to process the attributed tree in some way, the difference is only in what they end up producing (a finder results in simplification suggestions, whereas a linter will typically result in new warnings).

Even though the code in `Analyzer` is a bit arcane, I think it highlight a potential scalability issue that we need to be aware of with the new `Warn` step: as the number of linters that go through the new step grows, we are faced with a dilemma: we could fold all the various linter analyses in a single tree scanner -- or we could have oen tree scanner per linter (or somewhere in between). The first is more efficient, but leads to code nightmare (evidence: look at `Lower`). The latter is cleaner to read (each scanner does something different, and we can compose them), but less efficient (the tree need to be scanned N times where N is the number of enabled linters -- for code running with `Xlint:all` that number is rather high).

So, I believe that whatever we do in this area needs to be able to solve this fundamental tension between scalability and maintainability. `Analyzer` went there full blazing with a convoluted mechanism where trees are put in a queue if any of the analyzers is interested -- then each analyzer works off the queue w/o having to process stuff it's not interested in. This was made possible by the simplifying assumption that each "finder" was only interested in a single JCTree kind (which might be too simplistic). Maybe all this complexity is not required, and having one pass per linter will work just fine (and lead to better code than the one in `Analyzer`). But either way, we should explore this space consciously, compare alternatives, assess performance, etc. before we come up with a proposal on how to rearchitect linters.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22088#issuecomment-2476024242