RFR: 1827: Better handling of Error

Zhao Song zsong at openjdk.org
Wed Mar 22 20:30:25 UTC 2023


On Wed, 22 Mar 2023 20:05:37 GMT, Erik Joelsson <erikj at openjdk.org> wrote:

> After some deliberation, I think this is how we should handle an `Error` being thrown when running a `WorkItem`. 
> 
> We need to attempt to log it and register it in the counters, otherwise it will be very hard to discover the problem. 
> 
> We also need to attempt to return the scratch path to the BotRunner, as well as removing the item from the active set, otherwise the BotRunner will enter a broken state. The drawback of this is that it will allow new instances of the same WorkItem to run, which will likely repeat the Error. I still think that's the better option compared to what we experienced in SKARA-1825.
> 
> I think it's fine to skip scheduling of pending items. When an Error has been thrown in a thread, we should try to minimize any additional work in that thread to only what's necessary. Unless every WorkItem throws an Error, pending items will get scheduled eventually anyway, by some other successful WorkItem. If every WorkItem is throwing an Error, then not scheduling pending items is the least of our problems. By re-throwing the `Error` we make sure the Runnable exists as quickly as possible.

LGTM

bot/src/main/java/org/openjdk/skara/bot/BotRunner.java line 236:

> 234:                 } catch (Error e) {
> 235:                     EXCEPTIONS_COUNTER.labels(item.botName(), item.workItemName(), e.getClass().getName()).inc();
> 236:                     log.log(Level.SEVERE, "Error thrown during item execution: " + e.getMessage(), e);

Maybe we could also log `item` in the message.

-------------

Marked as reviewed by zsong (Committer).

PR Review: https://git.openjdk.org/skara/pull/1490#pullrequestreview-1353426220
PR Review Comment: https://git.openjdk.org/skara/pull/1490#discussion_r1145373732


More information about the skara-dev mailing list