RFR: 1529: Optimize Census namespace instantiation
Erik Joelsson
erikj at openjdk.org
Wed Aug 10 16:43:56 UTC 2022
On Wed, 10 Aug 2022 15:32:08 GMT, Magnus Ihse Bursie <ihse at openjdk.org> wrote:
>> This patch adds a new factory method `Census::parseNamespace`, which enables initialization of a single Namespace directly from a HostedRepository (without having to clone the census repo to local disk first).
>>
>> The motivation behind this is bot performance. The Census is loaded in many WorkItems. Cloning/fetching the complete census repo to local disk is relatively expensive compared to a couple of REST calls to just read files directly from the remote repo. Especially as our REST client already leverages caching through ETag. This particular optimization can however only be applied to the IssueNotifier in the NotifyBot, where I doubt it will make that much of a difference. The main performance gain is realized in a different bot which currently isn't public (but these gains will impact all Skara users).
>>
>> I chose to at least for now, only apply this for initializing a single Namespace as that can be done by reading only 2 files from the remote, so the performance gain is very clear. Other bots either need to read considerably more files, or have the need to update the census. This optimization idea may be expanded in the future if I see a need and noticeable gains from it.
>>
>> In order to make HostedRepository available to the Census class, I had to get rid of several existing module dependencies to avoid circular dependencies. Fortunately this wasn't too hard. It made no sense to me that `org.openjdk.skara.forge` would require `org.openjdk.skara.census`. There was just one static method `PullRequestUtils` in the forge module that used a reference to the census module, and this static method was only used from one module `org.openjdk.skara.bots.pr`, so I simply moved it to `CheckablePullRequest` which is kind of a utility class in that module. The rest of the offending dependencies were just unused, so I cleaned out a fair few of those.
>>
>> In the `IssueNotifier`, in addition to using the new initialization method, I also added lazy initialization and reusing of the `CensusInstance` class, to avoid re-initializing the same data for every round in loops.
>
> census/src/main/java/module-info.java line 25:
>
>> 23: module org.openjdk.skara.census {
>> 24: requires org.openjdk.skara.xml;
>> 25: requires static org.openjdk.skara.forge;
>
> What does `requires static` mean in a `module-info.java` file?
It's the java modules way of declaring a dependency optional. It's only needed at compile time, but not at runtime.
At least in this case, the dependency is only used in a static method, which means any types from the forge module would only be used if you call that static method, and to be able to do that, you would already have had to load those classes.
I don't think it matters in Skara currently, as I think all modules that require census also require forge, but it feels a bit cleaner to me that pulling in forge is optional here.
-------------
PR: https://git.openjdk.org/skara/pull/1348
More information about the skara-dev
mailing list