RFR: 1406: Two GitHub APIs don't return `patch` field when `per_page` argument exceeds 70 [v2]

Erik Joelsson erikj at openjdk.java.net
Fri Apr 22 16:48:17 UTC 2022

On Fri, 22 Apr 2022 15:31:37 GMT, Guoxiong Li <gli at openjdk.org> wrote:

>> Hi all,
>> When the bot checks whether a backport is clean, it would use the [get-a-commit](https://docs.github.com/en/rest/commits/commits#get-a-commit) api to get the patch files of the original commit and use the [list-pull-requests-files](https://docs.github.com/en/rest/pulls/pulls#list-pull-requests-files) api to get the patch files of the backport PR, and then compare them.
>> But the [get-a-commit](https://docs.github.com/en/rest/commits/commits#get-a-commit) api has the following drawbacks:
>> 1. When the `per_page` argument exceeds 70, the files after 70th don't have the `patch` field. Please see the link [1] with `per_page` as 71, the final file, 71th, doesn't have the `patch` field. And the default `per_page` of this api is **300**, so if we don't provide a `per_page` and the total files number exceeds 70, we get the wrong result without `patch` field as well [2].
>> 2. At most 3000 patch files of a commit are returned by the api. If the files number exceeds 3000, such as [3], the api returns only 3000 files. The class `RestRequest` gets the pagination information duplicately, but finally only get 3000 files. (The test case of this patch can prove it.)
>> 3. If a patch file of the commit is too large (not exactly know the api how to judge "too large"), the api doesn't return the patch file as well. Please use the link [1] and find `modules/javafx.web/src/main/native/Source/ThirdParty/libxml/src/ChangeLog` or `modules/javafx.web/src/main/native/Source/ThirdParty/libxml/src/HTMLparser.c`, you can identify that they don't have the `patch` field.
>> The [list-pull-requests-files](https://docs.github.com/en/rest/pulls/pulls#list-pull-requests-files) api has the same drawbacks, but its default `per_page` is **30** instead of **300**, so the first drawback above doesn't happen in this api **now**.
>> This patch adds the `per_page` argument to these two api so that we can solve the first drawback. Note: considering the default `per_page` of the api [list-pull-requests-files](https://docs.github.com/en/rest/pulls/pulls#list-pull-requests-files) will change in the future (may exceed 70), it is good to also provide a explicit `per_page` for it.
>> I can't solve second and third drawbacks now. A possible alternative is using the media-type [5] `application/vnd.github.VERSION.diff` or `application/vnd.github.VERSION.patch` to get the meta `diff` or `patch` data and parse them. But the current paser classes `GitRawDiffParser` and `UnifiedDiffParser` can't parse such meta diff or patch (the format is not suitable). It needs many changes to the code about vcs, so I don't think it is a short term solution.
>> As a conclusion, this patch can solve the situations provided by kevin in SKARA-1332. But it can't identify the difference of the "too large" files and the files which exceeds 3000. Note: this shortcoming has already existed in SKARA and is not produced by this patch.
>> Best Regards,
>> -- Guoxiong
>> [1] https://api.github.com/repos/openjdk/jfx/commits/b0f2521219efc1b0d0c45088736d5105712bc2c9?&per_page=71
>> [2] https://api.github.com/repos/openjdk/jfx/commits/b0f2521219efc1b0d0c45088736d5105712bc2c9
>> [3] https://git.openjdk.java.net/jfx/commit/6f28d912024495278c4c35ab054bc2aab480b3e4
>> [4] https://api.github.com/repos/openjdk/jfx/commits/6f28d912024495278c4c35ab054bc2aab480b3e4
>> [5] https://docs.github.com/en/rest/overview/media-types#commits-commit-comparison-and-pull-requests
> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision:
>   Add comment.

Thanks, looks good.


Marked as reviewed by erikj (Lead).

PR: https://git.openjdk.java.net/skara/pull/1302

More information about the skara-dev mailing list