End position storage in javac

Jan Lahoda jan.lahoda at oracle.com
Tue Dec 2 15:06:54 UTC 2025


Hi,


Yes, I think it would make sense to look into moving the end positions 
into the trees, as the types of compilations that don't use end 
positions are, I think, getting rarer. I took a peek at the draft PR, 
and overall it seems reasonable to me. (I'd need a more detailed pass to 
fully review, though.)


Thanks,

     Jan


On 11/27/25 14:45, Maurizio Cimadamore wrote:
> Hi Liam, Archie
> I believe this came up also in the recent discussions on lint 
> warnings, where we needed to expand the set of end positions retained 
> by default.
>
> I think the long term solution here is, as you say (and I think Jan 
> supports that too) that end positions should be just stored by default 
> in the trees.
>
> There's a lot of things javac does "its own way" for reasons sometimes 
> good, sometimes less good. For instance, javac had to use its own 
> `List` because when it was written generics were not yet available. 
> That said, one issue with using just a plain j.u.List (like ArrayList) 
> in javac is that (a) javac typically operates on very small lists, 
> where the overhead of array lists might be too big and (b) j.u.List is 
> very bad for recursive algorithms, which is what javac is all about. 
> So, I believe using a custom List impl there seems to be a good trade 
> off.
>
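
To illustrate point (b) above, here is a minimal sketch of an immutable singly linked ("cons") list, where recursion over head and tail is natural and prepending reuses the existing tail. `ConsList` and its methods are hypothetical names chosen for illustration; this is not the code of javac's own `List`.

```java
// Hypothetical minimal cons list for illustration; not javac's List class.
final class ConsList<A> {
    // Single shared sentinel that represents the empty list.
    private static final ConsList<?> EMPTY = new ConsList<Object>(null, null);

    final A head;            // first element (null only in the sentinel)
    final ConsList<A> tail;  // rest of the list (null only in the sentinel)

    private ConsList(A head, ConsList<A> tail) {
        this.head = head;
        this.tail = tail;
    }

    @SuppressWarnings("unchecked")
    static <A> ConsList<A> nil() {
        return (ConsList<A>) EMPTY;
    }

    boolean isEmpty() {
        return this == EMPTY;
    }

    // O(1): allocates one cell and reuses this list as the tail.
    ConsList<A> prepend(A x) {
        return new ConsList<>(x, this);
    }

    // The recursion follows the structure of the list directly.
    static int sum(ConsList<Integer> l) {
        return l.isEmpty() ? 0 : l.head + sum(l.tail);
    }

    public static void main(String[] args) {
        ConsList<Integer> base = ConsList.<Integer>nil().prepend(3).prepend(2);
        ConsList<Integer> a = base.prepend(1);
        ConsList<Integer> b = base.prepend(10);
        System.out.println(sum(a));            // 6
        System.out.println(sum(b));            // 15
        System.out.println(a.tail == b.tail);  // true: both lists share 'base'
    }
}
```

A two-element list here costs two small cells and no backing array, which is the kind of trade-off described for very small lists.
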
> Other areas that are brought up from time to time are:
>
> * use of special data structures for scopes -- why not just maps?
> * use of special data structures for names -- why not just strings?
>
> I believe we did some experiments on the former, and concluded that 
> the javac implementation was still better than a hashmap (as javac's 
> requirements are specialized, and scopes need to be traversed in 
> different ways, and sometimes a new scope needs to be "pushed" on top 
> of an old one -- reusing the underlying entries). For names I'm less 
> sure, but maybe somebody else knows the answer there.
>
> Then there's the general lack of data-orientedness of the javac 
> design. Lots of classes with lots of visitors everywhere, and various 
> ways to query "are you a T". I would like very much, one day, to make 
> the Type/Symbol/Tree hierarchies sealed, and get rid of all the 
> various kinds/tags, etc. and maybe even see if we can get rid of 
> visitors and just use plain code with pattern matching.
>
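
As a rough sketch of the sealed-hierarchy idea above, assuming Java 21 or later: a toy expression tree evaluated with a pattern-matching `switch` instead of a visitor. `Expr`, `Lit`, `Neg`, `Add` and `PatternEval` are hypothetical names; none of this reflects javac's actual Type/Symbol/Tree classes.

```java
// Hypothetical toy tree for illustration; these types are not javac's JCTree hierarchy.
sealed interface Expr permits Lit, Neg, Add {}
record Lit(int value) implements Expr {}
record Neg(Expr operand) implements Expr {}
record Add(Expr left, Expr right) implements Expr {}

final class PatternEval {
    // No visitor and no kind/tag field. Because Expr is sealed, the switch is
    // checked for exhaustiveness, so a new subtype is a compile error here
    // until a case is added for it.
    static int eval(Expr e) {
        return switch (e) {
            case Lit l -> l.value();
            case Neg n -> -eval(n.operand());
            case Add a -> eval(a.left()) + eval(a.right());
        };
    }

    public static void main(String[] args) {
        Expr e = new Add(new Lit(2), new Neg(new Lit(5)));
        System.out.println(eval(e)); // -3
    }
}
```
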
> Maurizio
>
>
> On 26/11/2025 13:01, Liam Miller-Cushon wrote:
>> Hi,
>>
>> I wanted to discuss how javac handles end positions, and get input on 
>> the possibility of having the compiler unconditionally store end 
>> positions in a field on JCTree instead of in a separate map.
>>
>> Currently end position information is not stored by default, but is 
>> enabled in certain modes: if -Xjcov is set, or if the compilation 
>> includes diagnostic listeners, task listeners, or annotation 
>> processors (since they may want end positions).
>>
>> The hash table used to store the end positions was optimized in JDK 9 
>> in JDK-8033287, and there was some related discussion on 
>> compiler-dev@ about the motivation for making end positions optional 
>> at that time.
>>
>> As I understand it, the goal is to save memory in the case where end 
>> positions aren't needed. That savings comes with a trade-off when end 
>> positions are needed, though, since the map is less efficient than 
>> storing the position directly in JCTree.
>>
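
The trade-off described above can be sketched with two hypothetical layouts: a side table keyed by node identity, and an `int` field on the node itself. All names here (`EndPosLayouts`, `PlainNode`, `SideTable`, `NodeWithEnd`, `NOPOS`) are stand-ins for illustration and do not reproduce javac's `JCTree` or `EndPosTable`.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not javac's JCTree or EndPosTable.
final class EndPosLayouts {
    static final int NOPOS = -1; // marker for "no position recorded"

    // Layout 1: nodes carry only a start position; end positions live in a side table.
    static final class PlainNode {
        final int startPos;
        PlainNode(int startPos) { this.startPos = startPos; }
    }

    static final class SideTable {
        private final Map<PlainNode, Integer> ends = new IdentityHashMap<>();

        void store(PlainNode n, int endPos) { ends.put(n, endPos); }

        // Every lookup goes through the map; unknown nodes report NOPOS.
        int endPos(PlainNode n) { return ends.getOrDefault(n, NOPOS); }
    }

    // Layout 2: every node pays for one int, and a lookup is a plain field read.
    static final class NodeWithEnd {
        final int startPos;
        int endPos = NOPOS;
        NodeWithEnd(int startPos) { this.startPos = startPos; }
    }

    public static void main(String[] args) {
        PlainNode p = new PlainNode(10);
        SideTable table = new SideTable();
        table.store(p, 25);
        System.out.println(table.endPos(p));                  // 25
        System.out.println(table.endPos(new PlainNode(10)));  // -1: never stored

        NodeWithEnd n = new NodeWithEnd(10);
        n.endPos = 25;
        System.out.println(n.endPos);                         // 25
    }
}
```

With the side table, nothing is spent when end positions are never stored, but each stored entry costs more than a single field would.
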
>> Today, many invocations of javac will need end position information 
>> (annotation processing is common, and when javac is used 
>> programmatically in IDEs end positions will be enabled). For the 
>> invocations that do not need end positions, typical developer 
>> machines are less memory constrained than they were when the 
>> optimization for end positions was first introduced.
>>
>> Looking at the compilation of java.base, it contains about 3000 files 
>> and creates about 3 million AST nodes, so adding an int field to 
>> JCTree to store end positions would take about 12MB.
>>
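
The 12MB figure above follows from multiplying the approximate node count by the size of an `int`. A small sketch of that arithmetic, with a hypothetical class name; the real per-object cost also depends on JVM object layout and alignment, so this is only a rough estimate.

```java
// Back-of-the-envelope estimate; the node count is the approximate figure quoted above.
final class EndPosFieldCost {
    // Bytes added by one extra int field per node, ignoring alignment padding.
    static long extraBytes(long nodeCount) {
        return nodeCount * Integer.BYTES;
    }

    public static void main(String[] args) {
        long nodes = 3_000_000L; // approximate AST node count for java.base
        long bytes = extraBytes(nodes);
        System.out.printf("%d nodes x %d bytes = %d bytes (~%d MB)%n",
                nodes, Integer.BYTES, bytes, bytes / 1_000_000);
        // prints: 3000000 nodes x 4 bytes = 12000000 bytes (~12 MB)
    }
}
```
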
>> What do you think? Would it make sense to consider adding a field to 
>> JCTree to store end positions, instead of using EndPosTable?
>>
>> I have a draft PR of the approach here: 
>> https://github.com/openjdk/jdk/pull/28506
>>
>> Having end position information always available might enable some 
>> potential improvements to javac. For example, some compilers indicate 
>> a span of source text for some diagnostics, such as the 'range 
>> highlighting' described in these clang docs: 
>> https://clang.llvm.org/diagnostics.html

