nano cost for simple virtual calls and object casts
Andy Nuss
andrew_nuss at yahoo.com
Wed Sep 5 10:03:53 PDT 2012
That's a good idea. I did take great care with my 2 important benchmarks, but how do I post the code in a readable form, given that each one is about 100 lines? By the way, the issue is with my own generic utility classes. Each generic class that is part of the bottleneck is instantiated with many different T's, so I would be surprised if HotSpot could avoid the cast and the vtable call for them, because doing so would mean incredible code bloat. Is that a good assumption?
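On the code-bloat question, it may help to recall that Java generics are compiled by erasure: javac emits one class per generic type, and the cast appears in the caller's bytecode rather than inside the generic class. The sketch below illustrates only that language-level fact; the `Node` and `ErasureDemo` names are hypothetical and are not the utility classes discussed in this thread. Whether the JIT later removes the cast depends on what it can prove at each call site after inlining.

```java
// Hypothetical generic node, standing in for a generic utility class.
final class Node<T> {
    final T value;
    Node<T> next;

    Node(T value) {
        this.value = value;
    }
}

final class ErasureDemo {
    // Compiled once; after erasure the return type is Object.
    static <T> T head(Node<T> n) {
        return n.value;
    }

    // Returns true when two instantiations share one runtime class.
    static boolean sameRuntimeClass() {
        Node<String> s = new Node<>("a");
        Node<Integer> i = new Node<>(1);
        return s.getClass() == i.getClass();
    }

    public static void main(String[] args) {
        Node<String> s = new Node<>("a");
        // The cast lives here, in the caller: javac emits a checkcast to String.
        String v = head(s);
        System.out.println(v + " " + sameRuntimeClass());
    }
}
```

So at the bytecode level there is no per-T copy of the class. Any specialization would happen only in the JIT, typically when the generic method is inlined into a particular caller.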
________________________________
From: Krystal Mok <rednaxelafx at gmail.com>
To: Andy Nuss <andrew_nuss at yahoo.com>
Cc: "hotspot-compiler-dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
Sent: Wednesday, September 5, 2012 9:21 AM
Subject: Re: nano cost for simple virtual calls and object casts
Hi Andy,
You may be relying on microbenchmarks, and tiny little details in these microbenchmarks may actually have "unexpected" impact on the result. So it'd really be better if you provide the actual test cases for others to be able to explain what's happening.
You may want to read this page for a guide on microbenchmarks [1], and other pages in the same wiki to get an idea of what HotSpot is already doing. The JIT compilers in HotSpot, especially the server compiler, try hard to optimize virtual call sites; if a virtual call site isn't really polymorphic, it'd be as fast as a static call site; ditto for casts.
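To make the "not really polymorphic" point concrete, here is a minimal sketch with hypothetical types (`Transition`, `AddChar`, `XorChar` are illustrative names, not code from this thread). The single call `t.step(...)` in `run` is one virtual call site. If a benchmark only ever passes one receiver class to it, the JIT can treat the site as monomorphic and possibly inline it; if the same loop is fed several receiver classes, the measured cost can differ noticeably. This is one of the details a microbenchmark can get wrong without meaning to.

```java
// Hypothetical types for illustration only.
abstract class Transition {
    abstract int step(int state, char c);
}

final class AddChar extends Transition {
    @Override
    int step(int state, char c) {
        return state + c;
    }
}

final class XorChar extends Transition {
    @Override
    int step(int state, char c) {
        return state ^ c;
    }
}

final class CallSiteDemo {
    // One virtual call site; its profile depends on which receivers reach it.
    static int run(Transition t, String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            state = t.step(state, input.charAt(i));
        }
        return state;
    }

    public static void main(String[] args) {
        // Only AddChar reaches the call site here: a monomorphic profile.
        System.out.println(run(new AddChar(), "ab"));
        // Passing a second receiver class makes the same site see two types.
        System.out.println(run(new XorChar(), "ab"));
    }
}
```

The sketch makes no timing claims. It only shows the shape of the code whose call-site profile is worth checking before trusting a per-call number.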
Regards,
Kris
[1]: https://wikis.oracle.com/display/HotSpotInternals/MicroBenchmarks
On Thu, Sep 6, 2012 at 12:00 AM, Andy Nuss <andrew_nuss at yahoo.com> wrote:
Hi,
>
>
>I am writing an automaton for a specialized regex/parser-generator scripting grammar. Even though java.util.regex and antlr are great, I'm doing something a little different, with different matching powers. I wrote an experimental version of my automaton, with the same essential code in both C++ and Java, and its execution speed per character, not counting overhead, was 3x to 6x faster in the C++ version.
>
>
>For many reasons, as with Antlr, this matching engine has to be written in Java, and therefore so do its automata; writing a native function for just the automata is not really a good solution overall.
>
>
>So I was trying to get at the root cause of the slowness in the HotSpot-compiled Java version. I wrote a careful test for the nanosecond overhead of "virtual" calls to what would otherwise be simple inlineable methods. That came out to about 0.5 nanos (0.7 for interface calls). I also profiled several kinds of object casts, and those cost about 0.4 nanos. As a baseline, I used an iterative loop that copies very simple array elements, and the copy cost was about 0.5 nanos per object.
>
>
>As it relates to my data structures in the automata execution function bottleneck, each character transition can involve many "vtable" calls and many "casts". The "casts" are in my case principally due to the use of generics, and in particular my several generic linked lists on elem T. All these half-nano costs add up and slow down the engine in Java relative to the C++ version, where static_casts are like nops and I think simple virtual calls are much faster. (As to arrays, there's not much that can be done.) My to-do is therefore to redesign a little to ensure that there are no virtual calls for what would otherwise be inlineable methods. For the many types of linked lists used, I have no choice but to manually "bloat" the linked list methods out of List<T> and into each list class.
>
>
>Question: does casting really have to be so expensive in Java, and why? Is there any way to reduce the overhead of simple "virtual" calls on abstract classes, possibly at the expense of interface calls?
>
>Andy
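The manual "bloat" described in the original message, moving list methods out of a generic List<T> into a per-element-type class, might look roughly like the sketch below. All names (`State`, `GenericList`, `StateList`, `ListDemo`) are hypothetical stand-ins, not the actual classes from the engine. The generic version relies on an implicit checkcast at each caller that reads a `State`; the hand-specialized version has statically typed fields and needs no cast.

```java
// Hypothetical element type.
final class State {
    final int id;

    State(int id) {
        this.id = id;
    }
}

// Generic list: after erasure, peek() returns Object and callers cast.
final class GenericList<T> {
    private static final class Cell<E> {
        final E value;
        Cell<E> next;

        Cell(E value) {
            this.value = value;
        }
    }

    private Cell<T> head;

    void push(T v) {
        Cell<T> c = new Cell<>(v);
        c.next = head;
        head = c;
    }

    T peek() {
        return head == null ? null : head.value;
    }
}

// Hand-specialized list: fields are typed State, so no cast is needed.
final class StateList {
    private static final class Cell {
        final State value;
        Cell next;

        Cell(State value) {
            this.value = value;
        }
    }

    private Cell head;

    void push(State v) {
        Cell c = new Cell(v);
        c.next = head;
        head = c;
    }

    State peek() {
        return head == null ? null : head.value;
    }
}

final class ListDemo {
    static int viaGeneric() {
        GenericList<State> list = new GenericList<>();
        list.push(new State(7));
        return list.peek().id; // implicit checkcast to State in bytecode
    }

    static int viaSpecialized() {
        StateList list = new StateList();
        list.push(new State(7));
        return list.peek().id; // no cast: peek() already returns State
    }
}
```

Both variants behave identically. Whether the specialized one is measurably faster depends on whether the JIT would have eliminated the cast anyway, which is something only a careful benchmark of the real code can answer.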