Improving the performance of stacktrace generation

Sat Jul 7 15:31:12 PDT 2012

On 07/08/2012 12:03 AM, Charles Oliver Nutter wrote:
> Today I have a new conundrum for you all: I need stack trace
> generation on Hotspot to be considerably faster than it is now.
>
> In order to simulate many Ruby features, JRuby (over)uses Java stack
> traces. We recently (JRuby 1.6, about a year ago) moved to using the
> Java stack trace as the source of our Ruby backtrace information,
> mining out compiled frames and using interpreter markers to peel off
> interpreter frames. The result is that a Ruby trace with mixed
> compiled and interpreted code like this
> (https://gist.github.com/3068210) turns into this
> (https://gist.github.com/3068213). I consider this a great deal better
> than the plain Java trace, and I know other language implementers have
> lamented the verbosity of stack traces coming out of their languages.
>
> The unfortunate thing is that stack trace generation is very expensive
> in the JVM, and in order to generate normal exceptions and emulate
> other features we sometimes generate a lot of them. I think there's
> value in exploring how we can make stack trace generation cheaper at
> the JVM level.
>
> Here are a few cases in Ruby where we need to use Java stack traces to
> provide the same features:
>
> * Exceptions as non-exceptional or moderately-exceptional method results
>
> In this case I'm specifically thinking about Ruby's tendency to
> propagate errno values as exceptions; EAGAIN/EWOULDBLOCK for example
> are thrown from nonblocking IO methods when there's no data available.
>
> You will probably say "that's a horrible use for exceptions" and I
> agree. But there are a couple reasons why it's nicer too:
> - using return value sigils requires you to propagate them back out
> through many levels of calls
> - exception-handling is cleaner in code than having all your errno
> handling logic spliced into regular program flow
>
> In any case, the cost of generating a stack trace for potentially
> every non-blocking IO call is obviously too high. In JRuby, we default
> to having EAGAIN/EWOULDBLOCK exceptions not generate a stack trace,
> and you must pass a flag for them to do so. The justification is that
> these exceptions are almost always used to branch back to the top of a
> nonblocking IO loop, so the backtrace is useless.

I don't see how to do more.
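
For reference, the plain-Java equivalent of skipping the trace is the
Java 7 protected constructor whose last flag turns stack trace capture
off per exception. A minimal sketch, assuming a hypothetical WouldBlock
class (a stand-in, not JRuby's actual exception type):

```java
public class CheapErrno {
    /** Hypothetical stand-in for an EAGAIN-style exception. */
    static class WouldBlock extends RuntimeException {
        WouldBlock(String message, boolean withTrace) {
            // Java 7+ constructor: (message, cause, enableSuppression, writableStackTrace).
            // With writableStackTrace == false no stack trace is captured.
            super(message, null, false, withTrace);
        }
    }

    public static void main(String[] args) {
        WouldBlock cheap = new WouldBlock("EAGAIN", false);
        WouldBlock full = new WouldBlock("EAGAIN", true);
        // The cheap variant carries an empty trace; the full one does not.
        System.out.println(cheap.getStackTrace().length == 0);
        System.out.println(full.getStackTrace().length > 0);
    }
}
```

Overriding fillInStackTrace() to return this has the same effect on
older JDKs.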

>
> * Getting the current or previous method's name/file/line
>
> Ruby supports a number of features that allow you to get basic
> information about the method currently being executed or the method
> that called it. The most general of these features is the "caller"
> method, which provides an array of all method name + file + line that
> would appear in a stack trace at this point. This feature is often
> abused to get only the current or previous frame, and so in Ruby 1.9
> they added __method__ to get the currently-executing method's
> name+file+line.
>
> In both cases, we must generate a full Java trace for these methods
> because the name of a method body is not necessarily statically known.
> We often want only the current frame or the current and previous
> frames, but we pay the cost of generating an entire Java stack trace
> to get them.

You can use Throwable.getStackTraceElement(),
which is package-visible and OpenJDK-specific, but at least
it will be faster on all VMs that use the OpenJDK class library.
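
For comparison, the portable approach pays for the whole trace even
when only one frame is read. A rough sketch of that baseline, using
only public API; the helper names callerFrame and whoAmI are
hypothetical:

```java
public class FrameLookup {
    /** Returns the frame of whoever called this method, by building a full trace first. */
    static StackTraceElement callerFrame() {
        // The whole stack is walked and materialized here, however deep it is.
        StackTraceElement[] trace = new Throwable().getStackTrace();
        // Element 0 is callerFrame itself; element 1 is its caller.
        return trace[1];
    }

    /** Rough analogue of asking for the current method's name. */
    static String whoAmI() {
        return callerFrame().getMethodName();
    }

    public static void main(String[] args) {
        System.out.println(whoAmI()); // prints "whoAmI"
    }
}
```

The cost grows with stack depth even though only trace[1] is used,
which is exactly the overhead a per-frame accessor would avoid.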

>
> * Warnings that actually report the line of code that triggered them
>
> In Ruby, it is possible to generate non-fatal warnings to stderr. In
> many cases, these warnings automatically include the file and line at
> which the triggering code lives. Because the warning logic is
> downstream from the Ruby code, we again must use a full Java stack
> trace to find the most recent (on stack) Ruby frame. This causes
> warnings to be as expensive as regular exceptions.

Please never optimize warnings; they are there to bug users
until they fix the problem. So they should be slow :)

>
> Because the use of frame introspection (in this case through stack
> traces) has largely been ignored on current JVMs, I suspect there's a
> lot of improvement possible. At a minimum, the ability to only grab
> the top N frames from the stack trace could be a great improvement
> (Hotspot even has flags to restrict how large a trace it will
> generate, presumably to avoid the cost of accounting for deep stacks
> and generating traces from them).
>
> Any thoughts on this? Does anyone else have need for lighter-weight
> name/file/line inspection of the call stack?
>
> - Charlie

Rémi