RFR 8220792: JavacFileManager.list() performance

Ron Shapiro ronshapiro at google.com
Sun Mar 31 12:04:57 UTC 2019


My initial inclination was to say that you were right, but that this would
have been the case even before my change, since already ArchiveContainer
did nothing to index parent directories of the directories that actually
have files in the archive. But then I did some debugging to see what
actually happens, and it seems that the archive filesystem visits all
parent directories, and thus those are also indexed. Here's one example
from running jtreg:

indexing archivePath: .../asmtools.jar
preVisit: /
preVisit: /org
preVisit: /org/openjdk
preVisit: /org/openjdk/asmtools
preVisit: /org/openjdk/asmtools/util
preVisit: /org/openjdk/asmtools/jdis
preVisit: /org/openjdk/asmtools/jdec
preVisit: /org/openjdk/asmtools/jcoder
preVisit: /org/openjdk/asmtools/jcdec
preVisit: /org/openjdk/asmtools/jasm
preVisit: /org/openjdk/asmtools/common
preVisit: /org/openjdk/asmtools/asmutils
preVisit: /META-INF

I checked the code for ZipFileSystem, and it hard-codes the root directory
as "/" for all zip files no matter the contents.

So for the example above, list("org", recurse=true) should still recur
properly.

On Fri, Mar 29, 2019 at 9:21 PM Jonathan Gibbons <
jonathan.gibbons at oracle.com> wrote:

> Ron,
>
> Although I've not checked in deep detail, I think there may be an inherent
> problem with the caching you are doing, as regards to recursive list.
>
> The list method takes a "recurse" parameter, which can be used to list the
> contents of a package *and its subpackages*.
>
> If you have two containers, one containing package "p", and another
> containing package "p.q", if you try a recursive list of "p", I believe
> your code may miss the contents of "p.q" in the second container. If
> nothing else, this is an important case to be checked and tested. Although
> this feature is not typically used in javac itself, it is used in javadoc,
> which also uses the same file manager.
>
> It may be that the cache needs to be smart enough to handle the case when
> an archive has subdirectories of a package, even if it does not have any
> direct contents of the package.
>
> -- Jon
> On 3/29/19 7:10 AM, Ron Shapiro wrote:
>
> - Added a comment explaining the ordering. LMK if you have a better way of
> explaining it, as I think you grok it better than I do
> - Thanks for the pointer to javacFileManagerOptions. I updated to use that.
> Both are in this updated webrev:
> http://cr.openjdk.java.net/~ronsh/8220792/webrev.02/
>
> - Regarding indexing: I don't think today anything would detect updates to
> a jar file if a new RelativeDirectory was added. Perhaps that's
> intentional, because we don't care/don't even want to try to support
> someone updating their jarfile in the middle of a compilation. I defer to
> others about indexing other containers - it looks like the
> JRTImageContainer depends on the JRTIndex class, which does a lazy
> computation of the index, so it looks like it would be some additional work
> to add that to the index. But perhaps it's not even worth it - IIUC, there
> should only be one JRTImageContainer, and probably a limited amount of
> DirectoryContainers - the performance issue that I was observing was that
> with many jars, each needed to do some work for every call to list. We've
> now reduced O(O(JRTImageContainer = 1) + O(DirectoryContainer = likely to
> be small) + O(ArchiveContainer = possibly large)) per list() call
> to O(O(JRTImageContainer = 1) + O(DirectoryContainer = likely to be
> small) + O(overlapping ArchiveContainers, which is hopefully small))
>
>
>
> On Wed, Mar 27, 2019 at 1:47 PM Jonathan Gibbons <
> jonathan.gibbons at oracle.com> wrote:
>
>>
>> On 3/27/19 9:06 AM, Ron Shapiro wrote:
>> >
>> > > It's not clear to me why you maintain the list of PathAndContainer
>> > as a sorted list, and hence why you need to use java.util.List instead
>> > of javac's List.
>> > There are some tests that ensure that the classpath that javac sees is
>> > equivalent to the classpath the JVM sees vis-a-vis ordering. Without
>> > sorting, javac was seeing a different order, and I presume that is
>> > because the JRTImageContainer and DirectoryContainers could appear
>> > anywhere. Sorting solved that issue.
>>
>> OK, I think I understand the issue and why sorting fixes it.
>>
>> While some tests may check that "the classpath that javac sees is
>> equivalent to the classpath the JVM sees vis-a-vis ordering", that's not
>> the primary issue. The primary issue is that the new cache must preserve
>> the order in which containers are searched on the classpath (i.e just
>> within javac, separate from any part set by the JVM). While this is not
>> so important when looking up classes in "well-behaved" packages, it is
>> important when handling split packages and resources, where the contents
>> in one container may hide the contents in another container.
>>
>> I think this is subtle enough that it is worth a comment in the code,
>> explaining that it is important to preserve the search order used in the
>> uncached Location path.
>>
>> -- Jon
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20190331/79cb9020/attachment.html>


More information about the compiler-dev mailing list