JDK 13 RFR of JDK-8146726 : Improve AbstractProcessor to issue warnings for repeated information

Wed May 1 21:01:19 UTC 2019

Hi Joe,

Thanks for the explanation of annotations and modules. I'll leave the exact 
circumstances of issuing warnings up to you. Since it's just about issuing 
warning messages, it's reasonable that it doesn't need to be precise.

My concern is that the deduplication+warning code is hard to understand, and 
that someone in the future (maybe you!) will come across it and have a difficult 
time figuring it out; they might also worry about its precise behavior, not 
realizing that the warning policy isn't be precise.

It seems to me like the core of the algorithm can be implemented as follows. 
Given a String[] array, the elements can be deduplicated into a set this way:

     List<String> dups = new ArrayList<>(Arrays.asList(array));
     Set<String> result = new HashSet<>();
     dups.removeIf(result::add);

Now, 'result' contains the deduplicated items and 'dups' is a list (which might 
itself contain duplicates!) of duplicated items. The warning messages, if 
necessary, can then be generated from the 'dups' list after the fact.

I'm not entirely sure of the best way to handle module prefix removal. If the 
index/substring logic were extracted into a message, the initialization of 
'dups' could be replaced with a short stream that calls the module prefix 
removal function from map(). The exact behavior of the for-loop as it stands, 
though, would be difficult to implement this way. Then again, since the warning 
policy doesn't need to be precise, maybe this is ok.

I'll leave it up to you whether you want to continue reworking this code.

s'marks

On 4/30/19 4:21 PM, Joe Darcy wrote:
> Hi Stuart,
> 
> On 4/29/2019 5:14 PM, Stuart Marks wrote:
>> Hi Joe,
>>
>> Some nitpicky comments for you in AbstractProcessor.java. :-)
> 
> 
> Thanks for the careful review.
> 
> 
>>
>> I'm finding the boolean flags to be fairly confusing. Consider the 
>> 'initialized' field. This has a synchronized getter; 
>> getSupportedAnnotationTypes() calls the getter and stores the result in a 
>> local for use within this method. However, arrayToSet() accesses the 
>> 'initialized' field directly, and it's called (unsynchronized) from 
>> getSupportedOptions() as well as getSupportedAnnotationTypes().
>>
>> First, there's a potential memory visibility problem here, if you're concerned 
>> about multithreaded access. (At least some of the code in this class seems to 
>> try to handle thread-safety.) Second, the 'initialized' state is accessed from 
>> two different levels in the method call hierarchy, e.g. 
>> getSupportedAnnotationTypes() and arrayToSet(). This isn't necessarily a 
>> problem, but it does make me pause to try to think about what it means.
> 
> 
> Yes; the is-initialized check should be done using the isInitalized method as 
> done elsewhere in the class; needed update made in
> 
>      http://cr.openjdk.java.net/~darcy/8146726.6/
> 
>>
>> Another flag is 'stripModulePrefixes' (parameter to arrayToSet()) and its 
>> dependent local variable 'stripped' which is set if anything was actually 
>> stripped from this element. These make arrayToSet() a bit of a puzzle for me. 
>> I think the following are true:
>>
>> 1) if we aren't stripping module prefixes, we'll issue a warning on any 
>> duplicates.
>>
>> 2) if we are stripping module prefixes, we won't issue a warning if a prefixes 
>> was stripped from this element.
>>
>> My first inclination is to think that there are really two different 
>> arrayToSet() methods wanted here. The first doesn't strip and may issue 
>> warnings, and the second does strip and doesn't issue warnings. It seems to me 
>> that two separate methods might be simpler than one bigger method with a bunch 
>> of flags.
>>
>> But that's not what the current implementation does. However, the current 
>> implementation doesn't quite make sense to me either. For example, the comment 
>> says that if module names are stripped, warnings aren't issued because they'd 
>> be spurious if the array is ["foo/a.B", "bar/a.B"]. However, I'm not sure this 
>> logic makes sense. It seems to me that if the array is
>>
>>     ["a.B", "foo/a.B"]
>>
>> then no warning is issued, but if the array is
>>
>>     ["foo/a.B", "a.B"]
>>
>> the a warning is issued. Am I reading this correctly, and is this the right 
>> behavior?
>>
>>
> A bit of background on the intention of the AbstractProcessor class. The class 
> was a major usability improvement added in JSR 269 over the apt API . The 
> AbstractProcessor class provides a convenient "JDWIOWTD" - just do what I 
> obviously want to do - approach for common processor tasks. Of most relevance 
> here, it allows annotations on the processor class to succinctly indicate which 
> annotation types and options the processor supports.
> 
> The semantics of the improved warnings in supported options proposed in this 
> change is straightforward: is there are any repeated elements in the array 
> retrieved from the annotation (and a messager to report a warning is available), 
> report that an element is repeated.
> 
> Given some existing functionality to support annotation processing and modules, 
> the improved warnings support for the supported annotation types is more subtle. 
> Before modules, basically an annotation type name was an annotation type name, 
> like "a.B", and they could be compared for duplicates just like options
> 
> The supported annotation type names are a set of strings built from a string 
> array of inputs. Originally pre-modules, the supported syntax of the type names 
> included fully qualified names and some wildcard support. These names could be 
> subject to the same style of duplicate checking as done for option names in this 
> patch. With modules, the recommended syntax for names includes a module prefix, 
> "foo/a.B" as distinct from "bar/a.B". Without a module prefix, "a.B" will match 
> a same-named annotation type in any module.
> 
> One goal of AbstractProcessor in JDK 9 and later is to allow a processor to 
> smoothly support running in an environment with or without modules. Therefore 
> "a.B" retains its meaning by matching an annotation type in that name in any 
> module. Since the recommendation is to include a module name, if the compilation 
> environment doesn't support modules, 
> AbstractProcesso.getSupportedAnnotationTypes drop the module prefix if present. 
> Therefore, if the the annotation information is
> 
>      @SupportedAnnotationTypes({"foo/a.B", "bar/a.B"})
> 
> AbstractProcessor.getSupportedAnnotationTypes treats this as equivalent to
> 
>      @SupportedAnnotationTypes({"a.B", "a.B"})
> 
> which is of course semantically the same as "a.B". Now, in this case I think it 
> is acceptable to not get a warning when such a processor is run in an 
> environment without modules. However, the code as it is currently written will 
> *not* warning for
> 
>      {"foo/a.B", "foo/a.B"}
> 
> when run without modules. That input will get a warning if run *with* module 
> though. The input
> 
>          {"a.B", "a.B"}
> 
> will generate a warning independent of modules being supported.
> 
> Having jtreg @compile statements with different source levels allows this 
> behavior to be tested.
> 
> The current minor complication in the checking loop are in lieu of a more 
> complicated check that would first compare for duplicates on the original set of 
> values and then preform a module stripping step where that was appropriate. I 
> didn't regard the extra warning coverage to be worth the effort, but I'm happy 
> to discuss the matter.
> 
> HTH,
> 
> -Joe
>