RFR 8220792: JavacFileManager.list() performance

Ron Shapiro ronshapiro at google.com
Wed Apr 3 00:20:06 UTC 2019


Sounds good. Updated webrev:
http://cr.openjdk.java.net/~ronsh/8220792/webrev.04/

On Tue, Apr 2, 2019 at 8:00 PM Jonathan Gibbons <jonathan.gibbons at oracle.com>
wrote:

> Ron,
>
> In this case, I would just make the field protected.
>
> Arguably BaseFileManager has outlived its usefulness; it was introduced
> when for a while we had two different file managers, one based on
> java.io.File and another based on java.nio.file.Path.  As you see, that
> played out to what we have today, where one file manager supports both. I
> don't foresee any additional need for separate use BaseFileManager, because
> any additional file manager functionality should be based on using nio Path
> and FileSystem.
>
> Bottom line, BaseFileManager and JavacFileManager are joined at the hip,
> and may one day be fused together as one.  So it's reasonable for
> JavacFileManager to use protected fields of BaseFileManager.
>
> -- Jon
>
> On 04/02/2019 04:34 PM, Ron Shapiro wrote:
>
> Thanks for such a detailed review!
>
> I noticed one small thing - javacFileManagerOptions is private in the
> superclass (BaseFileManager). I changed that reference to
> Option.getJavaFileManagerOptions() instead, which is what that field is set
> to also. Here's the updated webrev:
> http://cr.openjdk.java.net/~ronsh/8220792/webrev.03/
>
> Or, I can alternatively make the field protected.
>
> On Tue, Apr 2, 2019 at 6:31 PM Jonathan Gibbons <
> jonathan.gibbons at oracle.com> wrote:
>
>> OK, thanks for looking into this; I also looked at the code and agree
>> with your assessment.
>>
>> One of these days we'll have some really good unit tests for
>> JavacFileManager ;-)  Until then, this looks good.
>>
>> -- Jon
>> On 03/31/2019 05:04 AM, Ron Shapiro wrote:
>>
>> My initial inclination was to say that you were right, but that this
>> would have been the case even before my change, since already
>> ArchiveContainer did nothing to index parent directories of the directories
>> that actually have files in the archive. But then I did some debugging to
>> see what actually happens, and it seems that the archive filesystem visits
>> all parent directories, and thus those are also indexed. Here's one example
>> from running jtreg:
>>
>> indexing archivePath: .../asmtools.jar
>> preVisit: /
>> preVisit: /org
>> preVisit: /org/openjdk
>> preVisit: /org/openjdk/asmtools
>> preVisit: /org/openjdk/asmtools/util
>> preVisit: /org/openjdk/asmtools/jdis
>> preVisit: /org/openjdk/asmtools/jdec
>> preVisit: /org/openjdk/asmtools/jcoder
>> preVisit: /org/openjdk/asmtools/jcdec
>> preVisit: /org/openjdk/asmtools/jasm
>> preVisit: /org/openjdk/asmtools/common
>> preVisit: /org/openjdk/asmtools/asmutils
>> preVisit: /META-INF
>>
>> I checked the code for ZipFileSystem, and it hard-codes the root
>> directory as "/" for all zip files no matter the contents.
>>
>> So for the example above, list("org", recurse=true) should still recur
>> properly.
>>
>> On Fri, Mar 29, 2019 at 9:21 PM Jonathan Gibbons <
>> jonathan.gibbons at oracle.com> wrote:
>>
>>> Ron,
>>>
>>> Although I've not checked in deep detail, I think there may be an
>>> inherent problem with the caching you are doing, as regards to recursive
>>> list.
>>>
>>> The list method takes a "recurse" parameter, which can be used to list
>>> the contents of a package *and its subpackages*.
>>>
>>> If you have two containers, one containing package "p", and another
>>> containing package "p.q", if you try a recursive list of "p", I believe
>>> your code may miss the contents of "p.q" in the second container. If
>>> nothing else, this is an important case to be checked and tested. Although
>>> this feature is not typically used in javac itself, it is used in javadoc,
>>> which also uses the same file manager.
>>>
>>> It may be that the cache needs to be smart enough to handle the case
>>> when an archive has subdirectories of a package, even if it does not have
>>> any direct contents of the package.
>>>
>>> -- Jon
>>> On 3/29/19 7:10 AM, Ron Shapiro wrote:
>>>
>>> - Added a comment explaining the ordering. LMK if you have a better way
>>> of explaining it, as I think you grok it better than I do
>>> - Thanks for the pointer to javacFileManagerOptions. I updated to use
>>> that.
>>> Both are in this updated webrev:
>>> http://cr.openjdk.java.net/~ronsh/8220792/webrev.02/
>>>
>>> - Regarding indexing: I don't think today anything would detect updates
>>> to a jar file if a new RelativeDirectory was added. Perhaps that's
>>> intentional, because we don't care/don't even want to try to support
>>> someone updating their jarfile in the middle of a compilation. I defer to
>>> others about indexing other containers - it looks like the
>>> JRTImageContainer depends on the JRTIndex class, which does a lazy
>>> computation of the index, so it looks like it would be some additional work
>>> to add that to the index. But perhaps it's not even worth it - IIUC, there
>>> should only be one JRTImageContainer, and probably a limited amount of
>>> DirectoryContainers - the performance issue that I was observing was that
>>> with many jars, each needed to do some work for every call to list. We've
>>> now reduced O(O(JRTImageContainer = 1) + O(DirectoryContainer = likely to
>>> be small) + O(ArchiveContainer = possibly large)) per list() call
>>> to O(O(JRTImageContainer = 1) + O(DirectoryContainer = likely to be
>>> small) + O(overlapping ArchiveContainers, which is hopefully small))
>>>
>>>
>>>
>>> On Wed, Mar 27, 2019 at 1:47 PM Jonathan Gibbons <
>>> jonathan.gibbons at oracle.com> wrote:
>>>
>>>>
>>>> On 3/27/19 9:06 AM, Ron Shapiro wrote:
>>>> >
>>>> > > It's not clear to me why you maintain the list of PathAndContainer
>>>> > as a sorted list, and hence why you need to use java.util.List
>>>> instead
>>>> > of javac's List.
>>>> > There are some tests that ensure that the classpath that javac sees
>>>> is
>>>> > equivalent to the classpath the JVM sees vis-a-vis ordering. Without
>>>> > sorting, javac was seeing a different order, and I presume that is
>>>> > because the JRTImageContainer and DirectoryContainers could appear
>>>> > anywhere. Sorting solved that issue.
>>>>
>>>> OK, I think I understand the issue and why sorting fixes it.
>>>>
>>>> While some tests may check that "the classpath that javac sees is
>>>> equivalent to the classpath the JVM sees vis-a-vis ordering", that's
>>>> not
>>>> the primary issue. The primary issue is that the new cache must
>>>> preserve
>>>> the order in which containers are searched on the classpath (i.e just
>>>> within javac, separate from any part set by the JVM). While this is not
>>>> so important when looking up classes in "well-behaved" packages, it is
>>>> important when handling split packages and resources, where the
>>>> contents
>>>> in one container may hide the contents in another container.
>>>>
>>>> I think this is subtle enough that it is worth a comment in the code,
>>>> explaining that it is important to preserve the search order used in
>>>> the
>>>> uncached Location path.
>>>>
>>>> -- Jon
>>>>
>>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/compiler-dev/attachments/20190402/e95141f9/attachment-0001.html>


More information about the compiler-dev mailing list