RFR 8220792: JavacFileManager.list() performance
Ron Shapiro
ronshapiro at google.com
Tue Apr 2 23:34:43 UTC 2019
Thanks for such a detailed review!
I noticed one small thing - javacFileManagerOptions is private in the
superclass (BaseFileManager). I changed that reference to
Option.getJavaFileManagerOptions() instead, which is what that field is set
to also. Here's the updated webrev:
http://cr.openjdk.java.net/~ronsh/8220792/webrev.03/
Or, I can alternatively make the field protected.
On Tue, Apr 2, 2019 at 6:31 PM Jonathan Gibbons <jonathan.gibbons at oracle.com>
wrote:
> OK, thanks for looking into this; I also looked at the code and agree with
> your assessment.
>
> One of these days we'll have some really good unit tests for
> JavacFileManager ;-) Until then, this looks good.
>
> -- Jon
> On 03/31/2019 05:04 AM, Ron Shapiro wrote:
>
> My initial inclination was to say that you were right, but that this would
> have been the case even before my change, since already ArchiveContainer
> did nothing to index parent directories of the directories that actually
> have files in the archive. But then I did some debugging to see what
> actually happens, and it seems that the archive filesystem visits all
> parent directories, and thus those are also indexed. Here's one example
> from running jtreg:
>
> indexing archivePath: .../asmtools.jar
> preVisit: /
> preVisit: /org
> preVisit: /org/openjdk
> preVisit: /org/openjdk/asmtools
> preVisit: /org/openjdk/asmtools/util
> preVisit: /org/openjdk/asmtools/jdis
> preVisit: /org/openjdk/asmtools/jdec
> preVisit: /org/openjdk/asmtools/jcoder
> preVisit: /org/openjdk/asmtools/jcdec
> preVisit: /org/openjdk/asmtools/jasm
> preVisit: /org/openjdk/asmtools/common
> preVisit: /org/openjdk/asmtools/asmutils
> preVisit: /META-INF
>
> I checked the code for ZipFileSystem, and it hard-codes the root directory
> as "/" for all zip files no matter the contents.
>
> So for the example above, list("org", recurse=true) should still recur
> properly.
>
> On Fri, Mar 29, 2019 at 9:21 PM Jonathan Gibbons <
> jonathan.gibbons at oracle.com> wrote:
>
>> Ron,
>>
>> Although I've not checked in deep detail, I think there may be an
>> inherent problem with the caching you are doing, as regards to recursive
>> list.
>>
>> The list method takes a "recurse" parameter, which can be used to list
>> the contents of a package *and its subpackages*.
>>
>> If you have two containers, one containing package "p", and another
>> containing package "p.q", if you try a recursive list of "p", I believe
>> your code may miss the contents of "p.q" in the second container. If
>> nothing else, this is an important case to be checked and tested. Although
>> this feature is not typically used in javac itself, it is used in javadoc,
>> which also uses the same file manager.
>>
>> It may be that the cache needs to be smart enough to handle the case when
>> an archive has subdirectories of a package, even if it does not have any
>> direct contents of the package.
>>
>> -- Jon
>> On 3/29/19 7:10 AM, Ron Shapiro wrote:
>>
>> - Added a comment explaining the ordering. LMK if you have a better way
>> of explaining it, as I think you grok it better than I do
>> - Thanks for the pointer to javacFileManagerOptions. I updated to use
>> that.
>> Both are in this updated webrev:
>> http://cr.openjdk.java.net/~ronsh/8220792/webrev.02/
>>
>> - Regarding indexing: I don't think today anything would detect updates
>> to a jar file if a new RelativeDirectory was added. Perhaps that's
>> intentional, because we don't care/don't even want to try to support
>> someone updating their jarfile in the middle of a compilation. I defer to
>> others about indexing other containers - it looks like the
>> JRTImageContainer depends on the JRTIndex class, which does a lazy
>> computation of the index, so it looks like it would be some additional work
>> to add that to the index. But perhaps it's not even worth it - IIUC, there
>> should only be one JRTImageContainer, and probably a limited amount of
>> DirectoryContainers - the performance issue that I was observing was that
>> with many jars, each needed to do some work for every call to list. We've
>> now reduced O(O(JRTImageContainer = 1) + O(DirectoryContainer = likely to
>> be small) + O(ArchiveContainer = possibly large)) per list() call
>> to O(O(JRTImageContainer = 1) + O(DirectoryContainer = likely to be
>> small) + O(overlapping ArchiveContainers, which is hopefully small))
>>
>>
>>
>> On Wed, Mar 27, 2019 at 1:47 PM Jonathan Gibbons <
>> jonathan.gibbons at oracle.com> wrote:
>>
>>>
>>> On 3/27/19 9:06 AM, Ron Shapiro wrote:
>>> >
>>> > > It's not clear to me why you maintain the list of PathAndContainer
>>> > as a sorted list, and hence why you need to use java.util.List instead
>>> > of javac's List.
>>> > There are some tests that ensure that the classpath that javac sees is
>>> > equivalent to the classpath the JVM sees vis-a-vis ordering. Without
>>> > sorting, javac was seeing a different order, and I presume that is
>>> > because the JRTImageContainer and DirectoryContainers could appear
>>> > anywhere. Sorting solved that issue.
>>>
>>> OK, I think I understand the issue and why sorting fixes it.
>>>
>>> While some tests may check that "the classpath that javac sees is
>>> equivalent to the classpath the JVM sees vis-a-vis ordering", that's not
>>> the primary issue. The primary issue is that the new cache must preserve
>>> the order in which containers are searched on the classpath (i.e just
>>> within javac, separate from any part set by the JVM). While this is not
>>> so important when looking up classes in "well-behaved" packages, it is
>>> important when handling split packages and resources, where the contents
>>> in one container may hide the contents in another container.
>>>
>>> I think this is subtle enough that it is worth a comment in the code,
>>> explaining that it is important to preserve the search order used in the
>>> uncached Location path.
>>>
>>> -- Jon
>>>
>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20190402/0dcbb725/attachment-0001.html>
More information about the compiler-dev
mailing list