Draft JEP: Efficient Stack Walking API

Remi Forax forax at univ-mlv.fr
Wed Jul 9 14:11:30 UTC 2014


On 07/09/2014 03:42 PM, David M. Lloyd wrote:
> Just want to say that I am also looking forward to progress on this.

so am i :)

Rémi

>
> On 07/09/2014 12:25 AM, Jeremy Manson wrote:
>> Thanks for the response, Mandy.  I'm looking forward to seeing the final
>> version.
>>
>> For CallerFinder, we use reflective goop to get at
>> sun.misc.JavaLangAccess.getStackTraceElement.  It requires us to build a
>> Throwable (with associated stacktrace), but not to generate all of those
>> StackTraceElement[] strings.  This saves a lot of CPU cycles over
>> something like Thread.getStackTrace(), but in this case, cheaper is
>> better, and we'd definitely prefer a better solution.  Functions like
>> Java_java_lang_Throwable_fillInStackTrace are still big cycle consumers.
>>
>> Jeremy
>>
>>
>> On Mon, Jul 7, 2014 at 8:06 PM, Mandy Chung <mandy.chung at oracle.com> wrote:
>>
>>>   Hi Jeremy,
>>>
>>> Thanks for the feedback and the CallerFinder API you have.
>>>
>>>
>>> On 7/7/2014 9:55 AM, Jeremy Manson wrote:
>>>
>>> Hey folks,
>>>
>>>   I don't know if Mandy's draft JEP has gotten any love,
>>>
>>>
>>> The JEP process is in transition to version 2.0.  I hope this JEP will
>>> come out soon.
>>>
>>>
>>>   but this is something that has (in the past) been a major CPU cycle
>>> consumer for us, and we've had to invent / reinvent many wheels to
>>> fix it internally, so we'd love to see a principled solution.
>>>
>>>   A couple of notes:
>>>
>>>   - A large percentage of the time, you just want to find one of:
>>>    1) The direct caller of the method,
>>>    2) The first caller outside a given package.
>>>
>>>
>>> The current thinking is to allow you to find the direct caller as well
>>> as to express a predicate for filtering, which will cover these cases.
>>>
>>>
>>>   We added a CallerFinder API that basically looks like this:
>>>
>>>   // Finds the caller of the invoking method in the current stack that
>>>   // isn't in one of the excluded classes
>>>   public static StackTraceElement findCaller(Class<?>... excludedClasses);
>>>
>>>   // Finds the first caller of a given class
>>>   public static StackTraceElement findCallerOf(Class<?>... classesToFind);
>>>
>>>   This isn't the ideal API (it is more the one that happened to be
>>> convenient when we threw together the class), but it gets the vast
>>> majority of use cases.
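
A minimal sketch of what a `findCaller` with the signature quoted above could look like. This is not Jeremy's implementation: it uses the eager `new Throwable().getStackTrace()` rather than `sun.misc.JavaLangAccess`, so it pays exactly the cost this thread wants to avoid, and it only illustrates the call shape. The class name `CallerFinderSketch` and the nested `DemoLogger` and `DemoApp` classes are hypothetical. It also assumes the VM does not omit frames from the trace.

```java
import java.util.HashSet;
import java.util.Set;

final class CallerFinderSketch {
    private CallerFinderSketch() {}

    // Returns the first frame above the invoking method whose class is not
    // in excludedClasses, or null if every remaining frame is excluded.
    static StackTraceElement findCaller(Class<?>... excludedClasses) {
        Set<String> excluded = new HashSet<>();
        for (Class<?> c : excludedClasses) {
            excluded.add(c.getName());
        }
        // Eager capture of the whole stack (assumption: no frames omitted).
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is findCaller, frames[1] is the invoking method,
        // so the search starts at the invoking method's caller.
        for (int i = 2; i < frames.length; i++) {
            if (!excluded.contains(frames[i].getClassName())) {
                return frames[i];
            }
        }
        return null;
    }

    // Hypothetical logging facade that wants to know who called it.
    static final class DemoLogger {
        static StackTraceElement log() {
            return emit();
        }

        private static StackTraceElement emit() {
            // Skips DemoLogger.log, which sits between emit and the real caller.
            return findCaller(DemoLogger.class);
        }
    }

    // Hypothetical application code calling the facade.
    static final class DemoApp {
        static StackTraceElement run() {
            return DemoLogger.log();
        }
    }
}
```

Calling `CallerFinderSketch.DemoApp.run()` should yield the frame for `DemoApp.run`, because the `DemoLogger.log` frame is filtered out by the exclusion list.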
>>>
>>>
>>> Does it use Thread.getStackTrace() to implement this CallerFinder API?
>>> Thread.getStackTrace and Throwable.getStackTrace both eagerly capture
>>> the entire stack trace, which is expensive.  We want the VM to be able
>>> to capture only the stack frames that the client needs, and the
>>> implementation to be as efficient as possible.
>>>
>>>
>>>   2) Even with a super-efficient stack walker, anyone who uses the
>>> java.util.logging framework pervasively is going to see a lot of CPU
>>> cycles consumed by determining the caller.
>>>
>>>
>>> The current LogRecord implementation calls new Throwable, which has to
>>> pay the cost of capturing the entire stack.
>>>
>>>
>>>    We've had a lot of luck minimizing this by using a bytecode
>>> rewriter to change callers of log(msg) to
>>> log(sourceClass, sourceMethod, msg).  This is almost certainly
>>> something that could be done (even in a principled way!) by the VM;
>>> improvements to CPU usage in such apps have been dramatic.
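
To illustrate the effect of the rewrite described above: in `java.util.logging`, the existing `Logger.logp(Level, String, String, String)` method takes the source class and method explicitly, so the `LogRecord` never has to infer them from the stack. The sketch below shows only that call; it is not the bytecode rewriter itself, and the class name `ExplicitSourceLogging`, the `CapturingHandler` helper, and the example argument values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class ExplicitSourceLogging {
    private ExplicitSourceLogging() {}

    // Hypothetical handler that keeps published records for inspection.
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() {}
        @Override public void close() {}
    }

    // Logs one message with the caller supplied by hand, which is the form a
    // rewriter would substitute for a plain log(msg) call.
    static LogRecord logWithExplicitSource(String sourceClass,
                                           String sourceMethod,
                                           String msg) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false);  // keep the demo output quiet
        logger.setLevel(Level.ALL);
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);

        // The source is passed in, so no stack inspection is needed to find it.
        logger.logp(Level.INFO, sourceClass, sourceMethod, msg);
        return handler.records.get(0);
    }
}
```

For example, `logWithExplicitSource("com.example.Foo", "bar", "hello")` returns a record whose `getSourceClassName()` and `getSourceMethodName()` are the strings that were passed in.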
>>>
>>>
>>> Thanks.  I'll make sure to measure and compare the performance of
>>> java.util.logging using the new stack walk API.  I may also ask for
>>> your help in determining whether you observe a performance difference
>>> between the rewritten bytecode and java.util.logging using the new API.
>>>
>>> Mandy
>>>
>>>
>>>   Jeremy
>>>
>>>
>>>
>>> On Sun, Mar 30, 2014 at 4:02 PM, Mandy Chung <mandy.chung at oracle.com>
>>> wrote:
>>>
>>>> Below is a draft JEP we are considering submitting for JDK 9.
>>>>
>>>> Mandy
>>>>
>>>> ----------------------------
>>>> Title: Efficient API for Stack Walking
>>>>
>>>> Goal
>>>> ----
>>>>
>>>> Define a standard API for stack walking that will be efficient and
>>>> performant.
>>>>
>>>> Non-goal
>>>> --------
>>>>
>>>> It is not a goal for this API to be easy to use via Reflection, for
>>>> example for use in code that is compiled for an older JDK.
>>>>
>>>> Motivation
>>>> ----------
>>>>
>>>> There is no standard API to obtain information about the caller's
>>>> class and traverse the execution stack in a performant way. Existing
>>>> libraries and frameworks such as Log4j and Groovy have to resort to
>>>> using the JDK internal API
>>>> `sun.reflect.Reflection.getCallerClass(int depth)`.
>>>>
>>>> This JEP proposes to define a standard API for stack walking that will
>>>> be efficient and performant, and that will also enable the
>>>> implementation to move the stack walk machinery up from the VM to
>>>> Java and replace the current mechanism of
>>>> `Throwable.fillInStackTrace`.
>>>>
>>>> Description
>>>> -----------
>>>>
>>>> There is no standard API to traverse certain frames on the execution
>>>> stack efficiently and access the Class instance of each frame.
>>>>
>>>> There are APIs that allow access to stack trace information:
>>>>    - `Throwable.getStackTrace()` and `Thread.getStackTrace()`, which
>>>>       return an array of `StackTraceElement` containing the class name
>>>>       and method name of each frame in a stack trace.
>>>>    - `SecurityManager.getClassContext()`, which is a protected method
>>>>       such that only a `SecurityManager` subclass can access the class
>>>>       context.
>>>>
>>>> These APIs require the VM to eagerly capture a snapshot of the entire
>>>> stack trace and return information representing the entire stack.
>>>> There is no way to avoid the cost of examining all frames if
>>>> the caller is only interested in the top few frames on the stack.
>>>> Both `Throwable.getStackTrace()` and `Thread.getStackTrace()` methods
>>>> return an array of `StackTraceElement` that contains the class name and
>>>> method name of a stack frame but not the `Class` instance.
>>>>
>>>> In fact, for applications interested in the entire stack, the
>>>> specification allows a VM implementation to omit some frames in the
>>>> stack for performance.  In other words, `Thread.getStackTrace()` may
>>>> return a partial stack trace.
>>>>
>>>> These APIs do not satisfy the use cases that currently depend on
>>>> the `getCallerClass(int depth)` method, or their performance overhead
>>>> is intolerable.  The use cases include:
>>>>
>>>>    - JDK caller-sensitive APIs look up their immediate caller's class,
>>>>      which is used to determine the behavior of the API.  For example,
>>>>      `Class.forName(String classname)` and
>>>>      `ResourceBundle.getBundle(String rbname)` use the immediate
>>>>      caller's class loader to load a class and a resource bundle
>>>>      respectively.  `Class.getMethod` etc. use the immediate caller's
>>>>      class loader to determine the security checks to be performed.
>>>>
>>>>    - `java.util.logging`, Log4j and the Groovy runtime filter the
>>>>      intermediary stack frames (typically implementation-specific and
>>>>      reflection frames) and find the caller's class to be used by the
>>>>      runtime of such a library or framework.
>>>>
>>>>    - Traverse the entire stack trace or the stack trace of a
>>>>      `Throwable` and obtain additional information about classes for
>>>>      enhanced diagnosability in addition to the class and method name.
>>>>
>>>> This JEP will define a stack walk API that allows laziness and frame
>>>> filtering, and supports short reaches to stop at a frame matching some
>>>> criteria as well as long reaches to traverse the entire stack trace.
>>>> This would need the JVM to provide a flexible mechanism to traverse and
>>>> materialize the specific stack frame information to be used and allow
>>>> efficient lazy access to additional stack frames when required.
>>>> Native JVM transitions should be minimized.
>>>>
>>>> The API will define how it works when running with a security manager,
>>>> allowing access to the `Class` instance of any frame while ensuring
>>>> that security is not compromised.
>>>>
>>>> An example API to walk the stack can be like:
>>>>     Thread.walkStack(Consumer<StackFrameInfo> action, int depthLimit)
>>>>
>>>> that takes a callback to be invoked for each frame traversed.  A
>>>> variant of the walkStack method will take a predicate for stack frame
>>>> filtering.
>>>>
>>>>     Thread.getCaller(Function<StackFrameInfo, R> function)
>>>>     Thread.findCaller(Predicate<StackFrameInfo> predicate,
>>>>                       Function<StackFrameInfo, R> function)
>>>>
>>>> finds the caller frame with or without filtering.
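
The `Thread.walkStack`, `Thread.getCaller` and `Thread.findCaller` names above are only proposals in this draft. As a hedged illustration of the same ideas (lazy traversal, a depth limit, predicate filtering, access to the `Class` of a frame), the sketch below uses `java.lang.StackWalker`, the API that later shipped in JDK 9. The class `StackWalkSketch`, its method names, and the nested `Library` and `App` classes are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class StackWalkSketch {
    private StackWalkSketch() {}

    // RETAIN_CLASS_REFERENCE lets frames expose getDeclaringClass().
    private static final StackWalker WALKER =
        StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    // Depth-limited walk: method names of the top depthLimit frames,
    // starting at the method that called topMethodNames.
    static List<String> topMethodNames(int depthLimit) {
        return WALKER.walk(frames -> frames
            .skip(1)                       // skip topMethodNames itself
            .limit(depthLimit)             // frames are consumed lazily
            .map(StackWalker.StackFrame::getMethodName)
            .collect(Collectors.toList()));
    }

    // Predicate-filtered search: class of the first matching frame above
    // findCaller, stopping as soon as one is found.
    static Optional<Class<?>> findCaller(Predicate<StackWalker.StackFrame> predicate) {
        return WALKER.walk(frames -> frames
            .skip(1)                       // skip findCaller itself
            .filter(predicate)
            .<Class<?>>map(StackWalker.StackFrame::getDeclaringClass)
            .findFirst());
    }

    // Hypothetical library that wants the first caller outside itself.
    static final class Library {
        static Optional<Class<?>> api() {
            return findCaller(f -> f.getDeclaringClass() != Library.class);
        }
    }

    // Hypothetical application code.
    static final class App {
        static Optional<Class<?>> run() {
            return Library.api();
        }

        static List<String> probe() {
            return topMethodNames(1);
        }
    }
}
```

Here `App.run()` should report `App` as the caller of `Library.api()`, and `App.probe()` should return a single-element list containing `probe`.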
>>>>
>>>> Testing
>>>> -------
>>>>
>>>> Unit tests and JCK tests for the new SE API will need to be developed.
>>>> In addition, the performance of the new API for different use cases
>>>> will be assessed.
>>>>
>>>>
>>>> Impact
>>>> ------
>>>>
>>>>    - Performance/scalability: performance measurement shall be
>>>>      performed using micro-benchmarks as well as real world usage of
>>>>      `getCallerClass` replaced with the new API.
>>>>
>>>>    - TCK: New JCK test cases shall be developed.
>>>>
>>>>
>>>
>>>
>



