RFR: 1801: Exclude some GraphQL calls from RestRequestCache rate limiter

Magnus Ihse Bursie ihse at openjdk.org
Tue Jan 24 16:17:51 UTC 2023


On Tue, 24 Jan 2023 14:41:46 GMT, Erik Joelsson <erikj at openjdk.org> wrote:

> In [SKARA-1731](https://bugs.openjdk.org/browse/SKARA-1731), the best practices regarding non GET requests on Github became better enforced. Unfortunately that has lead to some unexpected performance issues. I believe this is mostly caused by GraphQL requests, which are always POST, being treated as a non GET REST requests by the limiter and limited to one per second. This limitation on a subset of our github queries is currently causing a queue buildup for that lock, which stabilizes to about 1m wait time. This delay is slowing down response times from the bots massively.
> 
> Reading the Integrator best practices document again (https://docs.github.com/en/rest/guides/best-practices-for-integrators?apiVersion=2022-11-28) and the GraphQL resource limitations (https://docs.github.com/en/graphql/overview/resource-limitations) I'm pretty sure you aren't supposed to limit GraphQL to one call per second, but rather just make sure you stay within the points limit. The one per second recommendation is for the REST API.
> 
> I want to make it possible to exclude certain calls from this rate limiter to see if this can alleviate the contention we are currently seeing. I have run with this patch for the pr bot since yesterday afternoon (PT) and it seems to be working well.
> 
> Two PRs were unable to integrate yesterday due to this contention:
> https://github.com/openjdk/jdk/pull/12139
> https://github.com/openjdk/jdk/pull/11922

Basically, what I think I am suggesting, is that instead of a `skipLimit` boolean, we should have a `graphQlPoints`, which we can set to 0 for normal REST calls. And if it is non-zero, the rate limits counts towards the graph QL point limit instead. 

If we know that we stay way below the 5000 per hour, we can skip the actual limitation for now, but that would make it easy to inject such a check in the future. Or, we could do it properly from the start, and add a check that the accumulated QL points do not exceed the limit.

-------------

PR: https://git.openjdk.org/skara/pull/1461


More information about the skara-dev mailing list