fast way to infer caller
Jason Mehrens
jason_mehrens at hotmail.com
Thu Apr 7 20:08:20 UTC 2022
Perhaps https://bugs.openjdk.java.net/browse/JDK-4515935 for the MemoryHandler could be used to determine whether StackWalker is fast enough for the lowest rung on the StackWalker performance ladder. Currently the MemoryHandler doesn't infer the caller, and the target handler sees the call site of the thread that triggers the push. In most MemoryHandler use cases, records are loggable but get discarded and never published to the target handler (never formatted nor sent to some data sink). So this is a real-world use case involving only OpenJDK classes.
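A minimal sketch of that use case, using only java.util.logging classes (class names and the buffer size below are just illustrative):

    import java.util.logging.ConsoleHandler;
    import java.util.logging.Level;
    import java.util.logging.Logger;
    import java.util.logging.MemoryHandler;

    public class MemoryHandlerDemo {
        private static final Logger LOGGER =
                Logger.getLogger(MemoryHandlerDemo.class.getName());

        public static void main(String[] args) {
            // Buffer up to 1000 records; only push to the target when a SEVERE record arrives.
            MemoryHandler buffer = new MemoryHandler(new ConsoleHandler(), 1000, Level.SEVERE);
            LOGGER.addHandler(buffer);
            LOGGER.setUseParentHandlers(false);

            // Loggable records that sit in the ring buffer and are mostly discarded,
            // never formatted and never sent to a data sink.
            for (int i = 0; i < 2000; i++) {
                LOGGER.info("routine record " + i);
            }

            // Triggers the push; only now are the surviving records formatted, which is
            // when the lazy caller inference happens -- hence the wrong call site described above.
            LOGGER.severe("something went wrong");
        }
    }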
The Peabody fix I proposed in 2007 was to unconditionally force the caller to be computed prior to adding the LogRecord to the internal data structure. Therefore all loggable records would pay the cost of inferring the caller. The current code is fast and broken (assuming the target formatter is showing the call site), and the correct code will be slower. If I were to redo that patch from Peabody, I would expect the PR review to bring to light a consistent view that StackWalker is fast enough, at least for the OpenJDK logging, effectively defining the minimum performance standard. However, if it raises performance regression concerns, perhaps there is some more work to be done improving StackWalker? :)
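The shape of that fix can also be sketched at the user level, relying on the documented fact that LogRecord.getSourceClassName() forces the lazy caller inference. The wrapper below is only an illustration of the idea (the proposed patch was inside MemoryHandler itself):

    import java.util.logging.Handler;
    import java.util.logging.LogRecord;
    import java.util.logging.MemoryHandler;

    // Illustrative wrapper: forces caller inference on the logging thread, before the
    // record is buffered, so the call site is not computed later at push time.
    public class EagerCallerHandler extends Handler {
        private final MemoryHandler delegate;

        public EagerCallerHandler(MemoryHandler delegate) {
            this.delegate = delegate;
        }

        @Override
        public void publish(LogRecord record) {
            if (delegate.isLoggable(record)) {
                // getSourceClassName() infers the caller now, while the frames of the
                // original logging call are still on the stack.
                record.getSourceClassName();
            }
            delegate.publish(record);
        }

        @Override public void flush() { delegate.flush(); }
        @Override public void close() { delegate.close(); }
    }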
Jason
________________________________________
From: core-libs-dev <core-libs-dev-retn at openjdk.java.net> on behalf of Bernd Eckenfels <ecki at zusammenkunft.net>
Sent: Thursday, April 7, 2022 1:02 PM
To: core-libs-dev at openjdk.java.net
Subject: Re: fast way to infer caller
Some loggers do need to find the location of the log statement (the class and line where the logger is used, not where it is instantiated).
For those, getting the call site is time-critical (it makes loggers more useful) even if they are not in tight performance-critical loops.
But it actually does matter if and how the JVM optimizes such introspection: if the JVM can inline (and maybe even constant-fold via an intrinsic) the StackWalker, it would benefit such use cases just as well.
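A rough sketch of what such call-site lookup looks like with StackWalker (the com.example.logging package is just a stand-in for a logging framework's own packages):

    package com.example.logging;  // hypothetical logging framework package

    import java.util.Optional;

    final class LogCallSite {
        private static final StackWalker WALKER = StackWalker.getInstance();

        // Returns the frame of the log statement itself: the first frame that does not
        // belong to the logging framework (class, method and line number are available).
        static Optional<StackWalker.StackFrame> callSite() {
            return WALKER.walk(frames -> frames
                    .dropWhile(f -> f.getClassName().startsWith("com.example.logging"))
                    .findFirst());
        }
    }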
--
https://bernd.eckenfels.net
________________________________
From: core-libs-dev <core-libs-dev-retn at openjdk.java.net> on behalf of Michael Kuhlmann <jdk at fiolino.de>
Sent: Thursday, April 7, 2022 7:55:16 PM
To: core-libs-dev at openjdk.java.net <core-libs-dev at openjdk.java.net>
Subject: Re: fast way to infer caller
On 4/7/22 19:27, Kasper Nielsen wrote:
>>
>> nope, see my previous mail to Ceki, the VM is cheating here if it can
>> inline the call to MethodHandles.lookup()
>>
>
> Does how the VM cheats really matter? The fact is that the code in the JDK
> can get the calling class and implement something like MethodHandles.lookup()
> so it takes ~3 ns. If you implemented something like a lookup class as a
> normal user, your best bet would be StackWalker.getCallerClass() and you
> would end up with something that is at least two orders of magnitude slower.
> That is probably not an issue for most use cases. But for some, it might be
> a bit of a steep cost.
>
> /Kasper
Hi Kasper,
sorry to jump in here as an uninvolved onlooker, but I can't resist.
I really don't see why this should matter. Getting the caller class is a
rare edge case that you only do in exceptional situations; most often
it's for debugging or something similar.
What users are really interested in is high performance for the standard
cases. Implementing a specific optimization in HotSpot to gain a few
milliseconds is the last thing I would expect from the JVM developers.
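For what it's worth, the two mechanisms being contrasted look roughly like this (the class below is purely illustrative):

    import java.lang.invoke.MethodHandles;

    final class Callers {
        private static final StackWalker WALKER =
                StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

        // Caller-sensitive JDK API: returns the class containing this call site.
        // This is the cheap path discussed above, which the VM may optimize.
        static Class<?> hereViaLookup() {
            return MethodHandles.lookup().lookupClass();
        }

        // What code outside the JDK uses to find the class that called it;
        // RETAIN_CLASS_REFERENCE is required for getCallerClass().
        static Class<?> callerViaStackWalker() {
            return WALKER.getCallerClass();
        }
    }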
I also don't understand why someone would instantiate a logger during
performance-critical procedures. In more than 20 years of Java
development, I've never seen the need to create a logger on the fly.
They are *always* assigned to static final variables, or at least to
predefined pools. Everything else would be just wrong: to instantiate a
logger, you have to fetch at least the log level definition from some
configuration source, and this can never be fast. At least not so fast
that we're talking about nanoseconds here.
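For concreteness, the usual idiom looks like this (the class name is made up):

    import java.util.logging.Level;
    import java.util.logging.Logger;

    class OrderService {
        // Created once when the class is initialized, so the configuration lookup
        // (levels, handlers) is paid up front rather than in any hot path.
        private static final Logger LOGGER = Logger.getLogger(OrderService.class.getName());

        void placeOrder(String id) {
            // Guard so the message is only built when FINE is actually enabled.
            if (LOGGER.isLoggable(Level.FINE)) {
                LOGGER.fine("placing order " + id);
            }
            // ... actual work ...
        }
    }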
All logging implementations I know of (and all that make sense) are
highly optimized for log throughput; this can only be achieved by
preprocessing during initialization, which is why initialization is slow.
But that doesn't matter because, as said, you should create logger
instances beforehand anyway.
Sorry for the rant, but I really don't see the use case here.