RFR: 8267532: Try/catch block not optimized as expected [v3]

Fri Nov 3 14:13:06 UTC 2023

On Fri, 3 Nov 2023 03:25:18 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

> May I ask if it is possible/preferable to prune code not only when the handler is not taken, but also when it is rarely taken. Thanks.

I spent some time thinking about this during the design process as well. First off, we need to define what 'rarely taken' means. For exception handlers, I think it means they are rarely entered compared to the `try` block they cover. So, we could theoretically have 2 counters per exception handler: one for the `try` block, and one for the exception handler itself. We would profile both, compute the ratio of handler entries to `try` block entries, and apply a heuristic to that ratio to decide what counts as 'rarely taken'.
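To make the two-counter idea concrete, here is a minimal sketch in plain Java. It is not HotSpot code: the class `HandlerProfile`, its methods, and the `maxRatio` threshold are all hypothetical names invented for illustration, and the choice of a simple ratio cutoff is just one possible heuristic.

```java
// Hypothetical illustration only: none of these names exist in HotSpot.
final class HandlerProfile {
    private long tryEntries;      // times the covered try block was entered
    private long handlerEntries;  // times the exception handler was entered

    public void recordTryEntry() { tryEntries++; }

    public void recordHandlerEntry() { handlerEntries++; }

    // Assumed heuristic: the handler is 'rarely taken' if the ratio of
    // handler entries to try-block entries does not exceed maxRatio.
    public boolean isRarelyTaken(double maxRatio) {
        if (tryEntries == 0) {
            // No try-block data recorded: only treat the handler as rare
            // if it was never entered either.
            return handlerEntries == 0;
        }
        return (double) handlerEntries / tryEntries <= maxRatio;
    }
}
```

Note that the result is only meaningful if both counters are updated consistently, which is exactly the synchronization concern with profiling in separate locations.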

However, profiling `try` blocks is tricky. 'Regular' profiling is tied to a particular bytecode, e.g. when we interpret a `goto` we profile using JumpData. The entry of a `try` block, however, could be any bytecode. So, we'd either have to do a dynamic lookup in some table when interpreting any bytecode to see if it's the start of a `try` block and then do the profiling, or we'd not be able to do the profiling in the interpreter at all, and only do it in C1-compiled code when we see that an instruction lies at the start of a `try` block. Of course, we'd have to adjust the profiling of catch blocks as well, since the 2 counters are used in tandem, and be careful that they don't go out of sync, since the profiling happens in separate locations. In other words: implementing this does not seem trivial.
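As a rough picture of the 'dynamic lookup in some table' option, the sketch below builds a set of `try` block start offsets from an exception table and queries it per bytecode index. This is plain Java, not interpreter code: `TryStartTable` and `isTryStart` are hypothetical names, and the `{startPc, endPc, handlerPc}` row layout is a simplification of the class-file exception table (the catch type is omitted).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: not how the HotSpot interpreter is structured.
final class TryStartTable {
    private final Set<Integer> tryStarts = new HashSet<>();

    // Each row is assumed to be {startPc, endPc, handlerPc}.
    public TryStartTable(int[][] exceptionTable) {
        for (int[] entry : exceptionTable) {
            tryStarts.add(entry[0]); // remember where each try block begins
        }
    }

    // The check that would have to run for every interpreted bytecode.
    public boolean isTryStart(int bci) {
        return tryStarts.contains(bci);
    }
}
```

Even with a cheap lookup like this, the check would sit on the path of every interpreted bytecode, which is part of why the option is unattractive.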

The benefit is also not clear to me: In which cases do we expect an exception to be thrown only sometimes? Is it better to deoptimize in that case, or should we just always generate the exception handler? On the other hand, we already have a real-world use case on our hands (FFM API) which we can test against, that is addressed sufficiently by the simpler approach.

Given the complexity & uncertainty around the benefits of using a frequency-based heuristic, I backed off on that idea, and went with the simpler approach implemented by this patch. I feel like this is a good sweet spot in the tradeoff between implementation complexity and benefit. We can always expand the profiling later when we see real-world use cases that would benefit from it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16416#issuecomment-1792503426