RFR: 1042: Watchdog causing multiple restarts for mlbridge
Erik Helin
ehelin at openjdk.java.net
Mon May 17 13:04:34 UTC 2021
On Fri, 14 May 2021 20:25:27 GMT, Erik Joelsson <erikj at openjdk.org> wrote:
> When starting certain bots with a fresh scratch area, we currently end up in a restart loop. This is because all the threads immediately get busy cloning repos, which starves the watchdog of pings for longer than the hard-coded 10 minutes. This patch changes the watchdog to use the configuration setting "watchdog" for the restart timeout instead. That value is currently used for a log warning which is also driven by the watchdog, so to still allow separate values, I've introduced a new option "watchdog_warn" which can optionally be set for just the warning part.
>
> In addition to this, I also added a bit more logging to make it easier to follow through logstash when watchdog pings occur, or when a new instance of a bot runner is started. Failures to start due to configuration errors are now also posted using proper logs.
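For readers following along, the two-threshold behaviour described in the quoted text can be sketched roughly as below. This is not the Skara implementation: the class `WatchdogSketch`, its methods, and the `Action` enum are hypothetical names for illustration, and the fallback of the warning threshold to the restart threshold when "watchdog_warn" is unset is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch, not Skara code: a watchdog with separate warn and restart thresholds.
final class WatchdogSketch {
    enum Action { NONE, WARN, RESTART }

    private final Duration restartTimeout; // stands in for the "watchdog" setting
    private final Duration warnTimeout;    // stands in for the optional "watchdog_warn" setting
    private Instant lastPing;

    WatchdogSketch(Duration restartTimeout, Duration warnTimeout, Instant now) {
        this.restartTimeout = restartTimeout;
        // Assumption: when no separate warning value is given, reuse the restart timeout.
        this.warnTimeout = warnTimeout != null ? warnTimeout : restartTimeout;
        this.lastPing = now;
    }

    // Called whenever a worker thread manages to report in.
    void ping(Instant now) {
        lastPing = now;
    }

    // Decide what to do based on how long the watchdog has gone without a ping.
    Action check(Instant now) {
        Duration silence = Duration.between(lastPing, now);
        if (silence.compareTo(restartTimeout) > 0) {
            return Action.RESTART;
        }
        if (silence.compareTo(warnTimeout) > 0) {
            return Action.WARN;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        Instant start = Instant.now();
        WatchdogSketch watchdog =
                new WatchdogSketch(Duration.ofMinutes(30), Duration.ofMinutes(5), start);
        System.out.println(watchdog.check(start.plus(Duration.ofMinutes(10))));
    }
}
```

With a restart timeout longer than the time a fresh clone takes, a starved watchdog would only produce warnings in this sketch instead of triggering a restart loop.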
Looks good. Please see my question/comment in-line.
bots/cli/src/main/java/org/openjdk/skara/bots/cli/BotLauncher.java line 172:
> 170: runnerConfig = BotRunnerConfiguration.parse(jsonConfig, jsonFile.getParent());
> 171: } catch (ConfigurationError configurationError) {
> 172: log.severe("Failed to parse configuration file: " + jsonFile
Will this actually log anything if there is no logging configured? `applyLogging` resets the `LogManager` and the `BotConsoleHandler` is only added if `log.console` is configured. Would it be wise to both print to standard out and log?
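To illustrate the concern, here is a minimal sketch using only `java.util.logging`. The class `ConfigErrorReporting` and its methods are hypothetical and not part of Skara; the logger name and message text are placeholders. After `LogManager.reset()` no handlers remain on any named logger or on the root logger, so a `log.severe` call produces no output unless a handler is added afterwards. The sketch falls back to standard error in that case.

```java
import java.util.logging.LogManager;
import java.util.logging.Logger;

// Hypothetical sketch, not Skara code: make sure a fatal startup error is always visible.
final class ConfigErrorReporting {
    private static final Logger log = Logger.getLogger("example.bots.cli");

    // True if the logger, or any ancestor it delegates to, has at least one handler.
    static boolean hasAnyHandler(Logger logger) {
        for (Logger current = logger; current != null; current = current.getParent()) {
            if (current.getHandlers().length > 0) {
                return true;
            }
            if (!current.getUseParentHandlers()) {
                break; // records are not passed further up the hierarchy
            }
        }
        return false;
    }

    // Log the message; if no handler would publish it, also print it to standard error.
    // Returns true when the fallback print was used.
    static boolean reportFatal(String message) {
        log.severe(message);
        if (!hasAnyHandler(log)) {
            System.err.println(message);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulates a configuration with no console handler: reset removes all handlers.
        LogManager.getLogManager().reset();
        reportFatal("Failed to parse configuration file: <path>");
    }
}
```

Printing unconditionally in addition to logging, as suggested above, would be simpler still; the handler check here only avoids a duplicate line when a console handler is configured.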
-------------
Marked as reviewed by ehelin (Reviewer).
PR: https://git.openjdk.java.net/skara/pull/1157