nano cost for simple virtual calls and object casts

Krystal Mok rednaxelafx at gmail.com
Wed Sep 5 10:10:28 PDT 2012


You could try hosting services like Pastebin [1] or GitHub [2] or the like.

- Kris

[1]: http://pastebin.com/
[2]: https://github.com/

On Thu, Sep 6, 2012 at 1:03 AM, Andy Nuss <andrew_nuss at yahoo.com> wrote:

> That's a good idea. I did take great care with my 2 important benchmarks,
> but how do I post the code in a readable form, given that each one is
> about 100 lines?  By the way, the issue is with my own generic utility
> classes.  Each generic class that is part of the bottleneck is
> instantiated with many different T's, so I would be surprised if the
> compiler could avoid the cast and the vtable call, because doing so would
> mean incredible code bloat.  Is that a good assumption?
>
>
>   ------------------------------
> *From:* Krystal Mok <rednaxelafx at gmail.com>
> *To:* Andy Nuss <andrew_nuss at yahoo.com>
> *Cc:* "hotspot-compiler-dev at openjdk.java.net" <
> hotspot-compiler-dev at openjdk.java.net>
> *Sent:* Wednesday, September 5, 2012 9:21 AM
> *Subject:* Re: nano cost for simple virtual calls and object casts
>
> Hi Andy,
>
> You may be relying on microbenchmarks, and tiny little details in these
> microbenchmarks may actually have "unexpected" impact on the result. So
> it'd really be better if you provide the actual test cases for others to be
> able to explain what's happening.
>
> You may want to read this page for a guide on microbenchmarks [1], and
> other pages in the same wiki to get an idea of what HotSpot is already
> doing. The JIT compilers in HotSpot, especially the server compiler, try
> hard to optimize virtual call sites; if a virtual call site isn't really
> polymorphic, it'd be as fast as a static call site; ditto for casts.
>
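As a rough illustration of what "not really polymorphic" means, here is a
minimal sketch. It is not code from this thread; the names CallSiteSketch,
Node, Leaf, Branch, totalWeight and fill are made up for the example. The
single n.weight() call site sees only Leaf receivers in the first run and two
receiver classes in the second. When a call site has only ever seen one
receiver class, the JIT can usually inline the target behind a cheap type
check; how it treats the mixed case depends on the profile and the JVM version.

```java
public class CallSiteSketch {
    // Hypothetical types, invented for this illustration.
    static abstract class Node { abstract int weight(); }
    static final class Leaf extends Node { int weight() { return 3; } }
    static final class Branch extends Node { int weight() { return 4; } }

    // The n.weight() call below is a single virtual call site. HotSpot
    // profiles the receiver types that actually reach it at run time.
    static long totalWeight(Node[] nodes) {
        long total = 0;
        for (Node n : nodes) {
            total += n.weight();
        }
        return total;
    }

    // mixed == false: only Leaf receivers.
    // mixed == true: Leaf and Branch alternate, starting with Leaf.
    static Node[] fill(int count, boolean mixed) {
        Node[] nodes = new Node[count];
        for (int i = 0; i < count; i++) {
            nodes[i] = (mixed && i % 2 == 1) ? new Branch() : new Leaf();
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(totalWeight(fill(4, false))); // prints 12
        System.out.println(totalWeight(fill(4, true)));  // prints 14
    }
}
```

Note that both runs share the same call site inside totalWeight, so once
Branch has been seen the site's profile is no longer monomorphic for the rest
of that JVM's lifetime. That is one of the "tiny little details" that can
skew a microbenchmark.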
> Regards,
> Kris
>
> [1]:  https://wikis.oracle.com/display/HotSpotInternals/MicroBenchmarks
>
> On Thu, Sep 6, 2012 at 12:00 AM, Andy Nuss <andrew_nuss at yahoo.com> wrote:
>
> Hi,
>
> I am writing an automaton for a specialized regex/parser-generator
> scripting grammar.  Even though java.util.regex and Antlr are great, I'm
> doing something a little different, with different matching powers.  I
> wrote an experimental version of my automaton, with essentially the same
> code in both C++ and Java, and the execution speed per character, not
> counting overhead, was 3x to 6x faster in the C++ version.
>
> For many reasons, as with Antlr, this matching engine has to be written
> in Java, and so do its automata; writing a native function for just the
> automata is not really a good solution overall.
>
> So I was trying to get at the root cause of slowness in the java hotspot
> compiled version.  I wrote a careful test for the nanosecond overhead of
> "virtual" calls to what would otherwise be simple inlineable methods.  That
> came out to about 0.5 nanos (0.7 for interface calls).  Also, I profiled
> several kinds of object casts, and that was 0.4 nanos.  By the way, I used
> an iterative loop copying very simple array elems as a baseline, and the
> array elem copy cost was about 0.5 nanos per object.
>
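The benchmark code itself was not posted, so the following is only a sketch of
the general shape such a per-call measurement might take, not a reconstruction
of it. All names (NanoCostSketch, Op, AddOne, run, nanosPerCall) are
hypothetical. It warms up the measured method first so the timed run is more
likely to execute compiled code, and it checks the loop's result so the work
cannot be thrown away entirely.

```java
public class NanoCostSketch {
    // Hypothetical abstract class with one small, inlineable method.
    static abstract class Op { abstract int apply(int x); }
    static final class AddOne extends Op { int apply(int x) { return x + 1; } }

    // The loop under test: one virtual call per iteration.
    static int run(Op op, int iterations) {
        int acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc = op.apply(acc);
        }
        return acc;
    }

    // Times one run and returns the average nanoseconds per call.
    static double nanosPerCall(Op op, int iterations) {
        long start = System.nanoTime();
        int result = run(op, iterations);
        long elapsed = System.nanoTime() - start;
        // Checking the result keeps the loop from being dead code. A JIT may
        // still simplify a loop this trivial, so treat the number as indicative.
        if (result != iterations) {
            throw new AssertionError("unexpected result: " + result);
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        Op op = new AddOne();
        // Warm-up: give the JIT a chance to compile run() before timing it.
        for (int i = 0; i < 20; i++) {
            run(op, 100000);
        }
        System.out.printf("%.3f ns per call%n", nanosPerCall(op, 10000000));
    }
}
```

Figures from a harness like this vary with the JVM, its flags, and the
hardware. A loop this simple may also be inlined or otherwise reduced by the
compiler, in which case the result says little about the cost of a virtual
call in real code.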
> As it relates to my data structures in the automata execution function
> bottleneck, there are for each character transition possibly many "vtable"
> calls and possibly many "casts".  The "casts" are in my case principally
> due to the use of generics, and in particular my several generic linked
> lists on elem T.  All these half nano costs add up, and slow down the
> engine in java relative to the C++ version where static_casts are like nops
> and I think simple virtual calls are much faster.  (As to arrays, there's
> not much that can be done.)  My todo is therefore, redesign a little to
> ensure that there are no virtual calls for what would otherwise be
> inlineable methods.  For the many types of linked lists used, I have no
> choice but to manually "bloat" the linked list methods out of List<T> and
> into each list class.
>
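To show where the casts in generic code come from, here is a small sketch. It
is not the actual list classes from this thread; ErasureSketch, State, Cell,
StateCell and the helper methods are invented for illustration. Java generics
are erased, so Cell<T> stores its value as a plain Object reference and javac
emits a checkcast wherever the value is used as a State. The hand-specialized
StateCell has a typed field and needs no cast. Erasure also means there is
only one Cell class at run time regardless of how many T's it is used with.

```java
public class ErasureSketch {
    // Hypothetical element type.
    static final class State {
        final int id;
        State(int id) { this.id = id; }
    }

    // Generic list cell: after erasure, 'value' has static type Object.
    static final class Cell<T> {
        final T value;
        final Cell<T> next;
        Cell(T value, Cell<T> next) { this.value = value; this.next = next; }
    }

    // Hand-specialized cell: 'value' is typed, so reads need no cast.
    static final class StateCell {
        final State value;
        final StateCell next;
        StateCell(State value, StateCell next) { this.value = value; this.next = next; }
    }

    static int sumGeneric(Cell<State> head) {
        int sum = 0;
        for (Cell<State> c = head; c != null; c = c.next) {
            sum += c.value.id; // javac inserts a checkcast to State here
        }
        return sum;
    }

    static int sumSpecialized(StateCell head) {
        int sum = 0;
        for (StateCell c = head; c != null; c = c.next) {
            sum += c.value.id; // no cast in the bytecode
        }
        return sum;
    }

    // Builders for lists holding states with ids 1..n.
    static Cell<State> genericList(int n) {
        Cell<State> head = null;
        for (int i = 1; i <= n; i++) head = new Cell<State>(new State(i), head);
        return head;
    }

    static StateCell specializedList(int n) {
        StateCell head = null;
        for (int i = 1; i <= n; i++) head = new StateCell(new State(i), head);
        return head;
    }

    public static void main(String[] args) {
        System.out.println(sumGeneric(genericList(4)));         // prints 10
        System.out.println(sumSpecialized(specializedList(4))); // prints 10
    }
}
```

Running javap -c on the compiled class shows the checkcast in sumGeneric and
none in sumSpecialized. Whether that instruction costs anything measurable
after JIT compilation is exactly what a careful benchmark would have to
establish.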
> Question: does casting really have to be so expensive in Java, and why?  Is
> there any way to reduce the overhead of simple "virtual" calls of abstract
> classes, possibly at the expense of interface calls?
>
> Andy
>
>
>
>
>