Pluggable Annotation Processing: problems from the IDE (IntelliJ) perspective
Joe Darcy
joe.darcy at oracle.com
Wed Jan 27 22:30:16 UTC 2021
Hi Anna,
Thanks for the comments. For context, can you describe in more detail
how IntelliJ is using javac and any other Java compilers? For example,
is javac being invoked programmatically, or phases of it being
subclassed, is a different compiler infrastructure used for incremental
IDE usage versus generation of class files?
For background, when JSR 269 was being developed, besides javac there
was an independent implementation of the API being done in Eclipse. The
Eclipse implementation was in the context of that IDE and was a
successor to the earlier Eclipse implementation of the apt API. Eclipse
provided incremental running of annotation processors in response to
updated files, etc. My understanding is the apt implementation in
Eclipse had more complete dependency tracking, but the JSR 269 API
provides fuller mechanisms to implement an incremental re-running policy.
As you note, internally javac currently drops the originating elements
information as it is not used in a batch compilation context.
The interfaces for the environment objects, Filer, etc. were designed to
be wrappable, but that is problematic if users do instanceof tests or
rely on other implementation-class functionality.
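For illustration, such a wrapper can be as simple as a forwarding Filer that records the originating elements before delegating. This is a minimal sketch, not code from any real build system; the class and field names are hypothetical:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.annotation.processing.Filer;
import javax.lang.model.element.Element;
import javax.tools.FileObject;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;

/** Hypothetical forwarding Filer that records originatingElements before delegating. */
class RecordingFiler implements Filer {
    private final Filer delegate;
    final List<Element[]> recorded = new ArrayList<>();

    RecordingFiler(Filer delegate) { this.delegate = delegate; }

    @Override
    public JavaFileObject createSourceFile(CharSequence name, Element... originatingElements)
            throws IOException {
        recorded.add(originatingElements.clone()); // capture before javac drops them
        return delegate.createSourceFile(name, originatingElements);
    }

    @Override
    public JavaFileObject createClassFile(CharSequence name, Element... originatingElements)
            throws IOException {
        recorded.add(originatingElements.clone());
        return delegate.createClassFile(name, originatingElements);
    }

    @Override
    public FileObject createResource(JavaFileManager.Location location, CharSequence moduleAndPkg,
            CharSequence relativeName, Element... originatingElements) throws IOException {
        recorded.add(originatingElements.clone());
        return delegate.createResource(location, moduleAndPkg, relativeName, originatingElements);
    }

    @Override
    public FileObject getResource(JavaFileManager.Location location, CharSequence moduleAndPkg,
            CharSequence relativeName) throws IOException {
        return delegate.getResource(location, moduleAndPkg, relativeName);
    }
}
```

Wrapping works mechanically; as noted above, the trouble starts when processor code assumes the concrete implementation class.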
During JSR 269, the utility of having a standard AST API was recognized,
but it was technically infeasible to have an AST-level API that worked
well across two compilers that didn't share the same code base, javac and
ecj in particular.
Thanks,
-Joe
On 1/26/2021 4:05 AM, Anna Kozlova wrote:
> Hi all,
>
> As we lately see the `JSR 269 - Pluggable Annotation Processing -
> Maintenance Review ballot', we would like to share our troubles with
> this API from the IDE perspective. It isn't connected to the latest
> changes in the JSR so I was advised to post our thoughts here.
>
> *Initial setup:*
> IntelliJ IDEA allows incremental compilation, which means that only
> changed code and its dependencies are recompiled instead of the whole
> project (workspace)/module (project). The task is complicated in
> itself, but when people use Annotation Processors it sometimes becomes
> impossible, even though, it seems, we could get all the information we
> need to build the source-output relation and thus enable incremental
> compilation.
>
> *Problem description:*
> When a processor generates a java source code, a bytecode or a
> resource file, it uses "create*" methods from the
> javax.annotation.processing.Filer interface. Every "create*" method
> has a vararg parameter "originatingElements" which are supposed to be
> "type or package or module elements causally associated with the
> creation of this file, may be elided or null". Those elements are
> supposed to be used by the processing environment to register and
> track dependencies between generated classes and existing code
> elements (classes, methods, fields) used by the processor to produce
> the generated code. The default Filer implementation in javac simply
> ignores this data. Internally, javac calls JavaFileManager to actually
> create and store the generated data, but, unfortunately, the
> originatingElements passed by processors are already lost. When
> generating bytecode, javac uses the
> javax.tools.JavaFileManager.getJavaFileForOutput() method and passes
> the "sibling" argument, pointing to the corresponding source file.
> This information is used by our build system to register
> source->output dependencies and facilitate incremental
> compilation. However, if data generation is initiated by an annotation
> processor, the "sibling" parameter is always null. One could expect it
> to point to a source file object containing originatingElements passed
> by the AP.
> (Another problem is that there are multiple originating elements
> potentially corresponding to multiple source files, but there is only
> a single "sibling" reference in the getJavaFileForOutput()
> method.) Because originatingElements are ignored, our build system
> cannot track dependencies between source files and AP-generated code.
> So we should always assume the worst scenario and recompile the whole
> module or a project whenever we detect that generated code is affected.
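The "sibling" mechanism described above can be observed with a small forwarding file manager. The sketch below (class names hypothetical) records the sibling passed for each output file during an ordinary compilation, where it points at the originating source; when an annotation processor initiates the output, the same hook receives null instead:

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import javax.tools.*;

class SiblingDemo {
    /** In-memory Java source for the demo. */
    static class StringSource extends SimpleJavaFileObject {
        final String code;
        StringSource(String name, String code) {
            super(URI.create("string:///" + name + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Forwarding file manager that records the "sibling" passed for each output file. */
    static class SiblingRecorder extends ForwardingJavaFileManager<StandardJavaFileManager> {
        final List<FileObject> siblings = new ArrayList<>();
        SiblingRecorder(StandardJavaFileManager fm) { super(fm); }
        @Override
        public JavaFileObject getJavaFileForOutput(Location location, String className,
                JavaFileObject.Kind kind, FileObject sibling) throws IOException {
            siblings.add(sibling); // null when the output was initiated by an annotation processor
            return super.getJavaFileForOutput(location, className, kind, sibling);
        }
    }

    /** Compiles a trivial class; returns {compilation succeeded, first sibling was non-null}. */
    static boolean[] run() throws IOException {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        try (SiblingRecorder fm = new SiblingRecorder(javac.getStandardFileManager(null, null, null))) {
            List<String> options = List.of("-d", Files.createTempDirectory("sibling-demo").toString());
            boolean ok = javac.getTask(null, fm, null, options, null,
                    List.of(new StringSource("Hello", "class Hello {}"))).call();
            return new boolean[] { ok, !fm.siblings.isEmpty() && fm.siblings.get(0) != null };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```

For a directly compiled class the sibling is the source file object, which is exactly the information that goes missing for processor-generated outputs.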
>
> Without this information, the detection itself is not as reliable as
> it could be. So if a project heavily relies on AP code generation, we
> cannot provide the best incremental compilation experience because of
> the lack of data.
>
> *Current solution:*
>
> The javadoc for create* methods in the Filer interface suggests that
>
> "This information may be used in an incremental environment to
> determine the need to rerun processors or remove generated files.
> Non-incremental environments may ignore the originating element
> information."
>
> In order to get access to originatingElements, our build system has to
> provide its own implementation of Filer interface. We do this by
> wrapping the original Filer implementation with a wrapper that
> registers originatingElements and delegates the call to the original
> Filer implementation. This is in itself a non-trivial task, because
> the API provides no direct way either to access the
> originatingElements or to register a custom Filer implementation.
> Just to wrap the original Filer implementation, we have to
> re-implement the AP discovery logic, and then use the
> JavaCompiler.CompilationTask.setProcessors() method that would
> explicitly configure processor objects. Every Processor object is in
> turn wrapped with our "wrapper", whose only purpose is to make
> sure the AP gets a wrapped Filer and not the original one.
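A minimal sketch of such wrapping, using only the standard javax.annotation.processing interfaces (the class names are hypothetical; a real build system would substitute its own recording Filer in getFiler()):

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import javax.annotation.processing.*;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.AnnotationMirror;
import javax.lang.model.element.Element;
import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.TypeElement;
import javax.lang.model.util.Elements;
import javax.lang.model.util.Types;

/** Hypothetical wrapper: hands the delegate processor a wrapped ProcessingEnvironment. */
class WrappingProcessor implements Processor {
    private final Processor delegate;
    WrappingProcessor(Processor delegate) { this.delegate = delegate; }

    @Override public void init(ProcessingEnvironment env) {
        delegate.init(new WrappedEnvironment(env)); // the AP never sees javac's own env
    }
    @Override public Set<String> getSupportedOptions() { return delegate.getSupportedOptions(); }
    @Override public Set<String> getSupportedAnnotationTypes() { return delegate.getSupportedAnnotationTypes(); }
    @Override public SourceVersion getSupportedSourceVersion() { return delegate.getSupportedSourceVersion(); }
    @Override public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        return delegate.process(annotations, roundEnv);
    }
    @Override public Iterable<? extends Completion> getCompletions(Element e, AnnotationMirror a,
            ExecutableElement m, String userText) {
        return delegate.getCompletions(e, a, m, userText);
    }
}

/** Forwarding ProcessingEnvironment; getFiler() is where a recording Filer would go. */
class WrappedEnvironment implements ProcessingEnvironment {
    private final ProcessingEnvironment delegate;
    WrappedEnvironment(ProcessingEnvironment delegate) { this.delegate = delegate; }
    @Override public Map<String, String> getOptions() { return delegate.getOptions(); }
    @Override public Messager getMessager() { return delegate.getMessager(); }
    @Override public Filer getFiler() { return delegate.getFiler(); } // substitute a wrapped Filer here
    @Override public Elements getElementUtils() { return delegate.getElementUtils(); }
    @Override public Types getTypeUtils() { return delegate.getTypeUtils(); }
    @Override public SourceVersion getSourceVersion() { return delegate.getSourceVersion(); }
    @Override public Locale getLocale() { return delegate.getLocale(); }
}
```

Each WrappingProcessor would then be passed to JavaCompiler.CompilationTask.setProcessors() in place of the discovered processor.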
>
> *Problems with the current solution:*
>
> This approach generally works, but it leads to additional problems on
> the annotation processor side. Unfortunately, many popular processors
> assume that passed objects like ProcessingEnvironment and Filer have
> certain implementations. The processor code may heavily rely on this
> assumption, e.g. cast the passed object to implementation or use
> instanceof checks on it. So if a processor gets a wrapped
> ProcessingEnvironment, it fails to execute further. As a result, the
> user's project just stops compiling.
>
> Another problem caused by this wrapping approach involves javac's
> so-called Tree API. An annotation processor may use the Tree API for
> its internal logic. The only way for the processor to obtain a
> reference to this API facade is to call
> com.sun.source.util.Trees.instance(ProcessingEnvironment). If a
> processor passes the wrapped ProcessingEnvironment object to the
> Trees.instance() method, it won't work, because its implementation
> internally performs an instanceof check itself!
>
> Our current approach is to detect such situations and provide the AP
> developer with hints in the error message. In order to make the
> processor work in our incremental environment, the AP developer has to
> write additional code that "unwraps" the passed ProcessingEnvironment
> object and uses the unwrapped object to initialize the Tree API. Such
> a situation, of course, is far from ideal: AP developers should not
> write code to please an IDE.
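The implementation-class check can be demonstrated without a full compilation round: Trees.instance() is documented to throw IllegalArgumentException for an environment it does not recognize, so any ProcessingEnvironment other than javac's own is rejected. A minimal sketch (the stub class is hypothetical and stands in for a wrapped environment):

```java
import java.util.Locale;
import java.util.Map;
import javax.annotation.processing.Filer;
import javax.annotation.processing.Messager;
import javax.annotation.processing.ProcessingEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.util.Elements;
import javax.lang.model.util.Types;
import com.sun.source.util.Trees;

class TreesCheck {
    /** Stand-in for a wrapped ProcessingEnvironment; lookups are stubbed out. */
    static class StubEnvironment implements ProcessingEnvironment {
        public Map<String, String> getOptions() { return Map.of(); }
        public Messager getMessager() { return null; }
        public Filer getFiler() { return null; }
        public Elements getElementUtils() { return null; }
        public Types getTypeUtils() { return null; }
        public SourceVersion getSourceVersion() { return SourceVersion.latest(); }
        public Locale getLocale() { return Locale.getDefault(); }
    }

    /** Returns true if Trees.instance rejects an environment that is not javac's own class. */
    static boolean rejectsWrappedEnvironment() {
        try {
            Trees.instance(new StubEnvironment());
            return false;
        } catch (IllegalArgumentException e) {
            return true; // javac checks the concrete implementation class of the env
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsWrappedEnvironment());
    }
}
```

This is why unwrapping before calling Trees.instance() is currently the only workaround for processors running under a wrapped environment.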
>
> *What can improve the situation:*
> So it would be great if the following problems were addressed in the
> API: there should be a direct way to access originatingElements.
> Ideally, accessing this data should not require AP developers to
> change their code. If code changes on the AP side are inevitable,
> those changes should be possible without making assumptions about
> interface implementations. There should be a standard way to get
> access to the Tree API as well, or, even better, the original API
> should be extended to make implementing complex processing logic
> possible without the semi-closed Tree API.
>
> Thanks,
> Anna
More information about the compiler-dev mailing list