Automatic Closure Specialization

Wed Oct 1 13:41:56 UTC 2014

Hey,

I have to admit I'm not a huge fan of the current system for explicitly specializing closures (removing virtual calls). 

Here are a few problems I see with the current solution:
1. The specialization macros are obviously not pretty
2. It's awkward to have to remember to explicitly list the closure as specialized in the macros
3. There are plenty of composite closure types out there, i.e. closures built from other closures. For example, one closure filters a memory range and, if the filter passes, invokes the actual closure. Today that inner call is virtual even though the composite structure is completely known at the call site.
4. Each closure has to provide do_oop, do_oop_v and do_oop_nv for both oop types, plus a do_oop_work to join them. Yuck! Asserts try to check that the _v and _nv methods do the same thing, to guard against programmer mistakes.

With my alternative template magic solution:
1. You won't have to explicitly specialize the closure types you want - they are automatically specialized unless the contrary is explicitly stated.
2. Parameterized composite closure types can be used without unnecessary virtual call overheads.
3. Only a single do_oop (do_metadata etc) member function is needed, and hence no need to put asserts trying to keep _v and _nv synchronized.
4. It is backward compatible and does not require major refactoring; we could transition into this system step by step. The two systems can even co-exist.
5. It supports an interface where OopClosure, rather than ExtendedOopClosure, is the interface to oop_iterate. It uses SFINAE to send metadata info to the closure only if the derived type is an ExtendedOopClosure; otherwise it sends only the oops (do_oop). (Of course I can remove this if it's unwanted and we intentionally don't want to support oop_iterate(OopClosure*).)
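
A minimal sketch of the SFINAE dispatch in point 5, assuming C++11's std::enable_if and std::is_base_of (the actual patch may well use hand-rolled equivalents instead). Oop, Klass, maybe_do_klass, iterate_object and the two counting closures are hypothetical stand-ins made up for illustration, not HotSpot code:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the real HotSpot types.
struct Oop {};
struct Klass {};

class OopClosure {
public:
  virtual ~OopClosure() {}
  virtual void do_oop(Oop** p) = 0;
};

class ExtendedOopClosure : public OopClosure {
public:
  virtual void do_klass(Klass* k) = 0;
};

// Selected only when ClosureType derives from ExtendedOopClosure:
// the closure is told about the metadata.
template <typename ClosureType>
typename std::enable_if<std::is_base_of<ExtendedOopClosure, ClosureType>::value>::type
maybe_do_klass(ClosureType* cl, Klass* k) {
  cl->do_klass(k);
}

// Selected for every other closure type: metadata is silently skipped.
template <typename ClosureType>
typename std::enable_if<!std::is_base_of<ExtendedOopClosure, ClosureType>::value>::type
maybe_do_klass(ClosureType*, Klass*) {}

// Hypothetical iteration over an object with a single oop field.
template <typename ClosureType>
void iterate_object(ClosureType* cl, Klass* k, Oop** field) {
  assert(cl != nullptr);
  maybe_do_klass(cl, k);  // overload chosen at compile time
  cl->do_oop(field);
}

// A plain OopClosure: only ever sees oops.
struct CountOops : OopClosure {
  int oops = 0;
  void do_oop(Oop**) override { ++oops; }
};

// An ExtendedOopClosure: sees oops and metadata.
struct CountOopsAndKlasses : ExtendedOopClosure {
  int oops = 0;
  int klasses = 0;
  void do_oop(Oop**) override { ++oops; }
  void do_klass(Klass*) override { ++klasses; }
};
```

Passing a CountOops to iterate_object compiles even though it has no do_klass member, because the overload that calls do_klass is removed from consideration for that type.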

For the interested reader, this is how the old system worked:
The ugly macros generate overloads of oop_iterate on oopDesc, which use a virtual call to the Klass (also using macro-generated overloads) to figure out where the oops are and then call the closure. Because the closure is invoked from inside that virtual call, there is no room for template magic: template member functions can't be virtual in C++.

And this is how my system solves this:
A template oop_iterate member function (with the closure type as parameter) uses a virtual call to the Klass, but only to acquire information about where oops can be found (and NOT to call the actual closure too). It then uses static template polymorphism (the CRTP idiom) to invoke the do_oop method of the corresponding derived closure type (without virtual calls). This requires support from the Klass implementations. I currently support object arrays and normal instances. If a Klass implementation does not support the new scheme, it simply reverts to a normal virtual call like before.
As a bonus I made a new include file, utilities/templateIdioms.hpp, with some template magic that I needed and found missing, and which could likely be used in more places in the future.
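
To make the scheme above a bit more concrete, here is a rough sketch of the idea, not the actual patch. Everything in it is a simplified assumption: oop is a plain void*, the hypothetical Klass::oop_field_indices() stands in for "tell me where the oops are", and StaticOopClosure, Object, CountNonNull and FilterRangeClosure are names invented for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef void* oop;  // hypothetical stand-in for the real oop type

// The virtual call only reports where the oops are; it never calls a closure.
class Klass {
public:
  virtual ~Klass() {}
  virtual const std::vector<std::size_t>& oop_field_indices() const = 0;
};

class InstanceKlass : public Klass {
  std::vector<std::size_t> _indices;
public:
  explicit InstanceKlass(const std::vector<std::size_t>& indices) : _indices(indices) {}
  const std::vector<std::size_t>& oop_field_indices() const override { return _indices; }
};

// CRTP base: the derived closure type is known statically, so do_oop
// is bound at compile time and needs no virtual dispatch.
template <typename Derived>
class StaticOopClosure {
public:
  void dispatch_do_oop(oop* p) { static_cast<Derived*>(this)->do_oop(p); }
};

struct Object {
  const Klass* klass;
  oop* fields;
};

template <typename ClosureType>
void oop_iterate(Object& obj, StaticOopClosure<ClosureType>* cl) {
  // One virtual call per object, to learn the layout.
  const std::vector<std::size_t>& idx = obj.klass->oop_field_indices();
  for (std::size_t i = 0; i < idx.size(); ++i) {
    cl->dispatch_do_oop(&obj.fields[idx[i]]);  // statically bound
  }
}

// A leaf closure with a single, non-virtual do_oop.
class CountNonNull : public StaticOopClosure<CountNonNull> {
public:
  int count;
  CountNonNull() : count(0) {}
  void do_oop(oop* p) { if (*p != nullptr) ++count; }
};

// A composite closure: filters on a memory range [lo, hi), then forwards
// to the inner closure. Inner is a template parameter, so the inner call
// is statically bound as well. lo and hi must point into the same array.
template <typename Inner>
class FilterRangeClosure : public StaticOopClosure<FilterRangeClosure<Inner> > {
  Inner* _inner;
  oop* _lo;
  oop* _hi;
public:
  FilterRangeClosure(Inner* inner, oop* lo, oop* hi) : _inner(inner), _lo(lo), _hi(hi) {
    assert(lo <= hi);
  }
  void do_oop(oop* p) { if (p >= _lo && p < _hi) _inner->do_oop(p); }
};
```

With this shape, the only virtual call left per object is the layout query on the Klass; the closure body, including the filter-then-forward composite, is visible to the compiler at the call site.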

Would this change be interesting for the GC group? If so, I could prepare a patch (and perhaps add support to the other Klass implementations). :)
I would also need some help to check if this works on your wide range of platforms and compilers etc (only checked the assembly output for my own setup).

Cheers!

/Erik