RFR: 8320691: Timeout handler on Windows takes 2 hours to complete

Daniel Jeliński djelinski at openjdk.org
Fri Nov 24 09:43:06 UTC 2023


On Fri, 24 Nov 2023 09:17:47 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:

>> test/failure_handler/src/share/conf/windows.properties line 61:
>> 
>>> 59: native.core.app=cdb
>>> 60: native.core.args=-c ".dump /ma core.%p;qd" -p %p
>>> 61: native.core.params.timeout=3600000
>> 
>> Hello Daniel, I found it surprising that this takes 2 hours to complete. The failure handler infrastructure has timeout handling built in, after which it kills the failure handler action (the process). Looking at the value specified here it translates to a timeout of 60 minutes (which is too high by the way). So I looked around in some other files and I think there might be a bug here. In other files (linux.properties and mac.properties), I notice the timeout is specified as:
>> 
>> 
>> native.core.timeout=600000
>> 
>> Notice the absence of "params" part in that key. I wonder if that is playing a role here and whether we should fix this key. While at it, perhaps we should also reduce this timeout (1 hour seems too high). Linux and macOS use a value of `600000`, which is 10 minutes. If Windows requires a few more minutes then that's understandable, but perhaps we should cap it at 30 minutes?
>
>> Notice the absence of "params" part in that key. I wonder if that is playing a role here and whether we should fix this key.
> 
> Actually ignore that part. I had a look at the internal logs that you referenced. It appears that this form of specifying the timeout (through the `params` key) works too. It took 2 hours because that command ran against 2 separate processes and each one timed out after one hour:
> 
> 
> [2023-11-23 21:45:40] [cdb.exe, -c, ".dump, /f, core.12345;qd", -p, 12345] timeout=3600000
> ...
> 0:001> WARNING: tool timed out: killed process after 3600000 ms
> ----------------------------------------
> [2023-11-23 22:45:40] exit code: -2 time: 3600006 ms
> ----------------------------------------
> 
> 
> ----------------------------------------
> [2023-11-23 22:47:36] [cdb.exe, -c, ".dump, /f, core.6789;qd", -p, 6789] timeout=3600000
> ----------------------------------------
> 
> ...
> 0:063> WARNING: tool timed out: killed process after 3600000 ms
> ----------------------------------------
> [2023-11-23 23:47:36] exit code: -2 time: 3599996 ms
> ----------------------------------------
> 
> 
> Edit: ... and you did mention this in the description of the JBS issue. I just overlooked it :)

Good point, 10 minutes should be more than enough. I'll update.
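
As a rough sketch of what the updated entry in windows.properties might look like: this assumes the existing `params` form of the key is kept (the log above shows `timeout=3600000` being picked up from it) and that the Linux/macOS value of `600000` is reused as-is. The final change in the PR may differ.

```properties
native.core.app=cdb
native.core.args=-c ".dump /ma core.%p;qd" -p %p
# Assumption: 600000 ms = 10 minutes, the value used in linux.properties and mac.properties
native.core.params.timeout=600000
```

With this value, the case above where the command times out against 2 processes would take roughly 20 minutes in total instead of 2 hours.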

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16806#discussion_r1404142975

