Automatic Closure Specialization

Erik Österlund erik.osterlund at
Wed Oct 8 13:11:41 UTC 2014

Hi Stefan,

On 08 Oct 2014, at 14:30, Stefan Karlsson <stefan.karlsson at<mailto:stefan.karlsson at>> wrote:

Hi Erik,

On 2014-10-08 13:20, Erik Österlund wrote:

So here in this email is attached a patch-file containing my changes since I don't have access to the cr-server.
If anyone could help me make a RFR with a bug-ID with this changeset, that would be great.


I ran this through JPRT and got this failure on Solaris x86:

"hotspot/src/share/vm/memory/specialized_oop_closures.hpp", line 222: Warning: Identifier expected instead of "}".
1 Warning(s) detected.
gmake[8]: *** [abstractCompiler.o] Error 2
gmake[7]: *** [the_vm] Error 2
gmake[6]: *** [fastdebug] Error 2
gmake[5]: *** [generic_build2] Error 2
gmake[4]: *** [fastdebug] Error 2

Here's a new take where I declare my enum on that line with a compiler helmet on. The new archive contains the _incremental and _full patch to fix this issue.
There's no way I can run JPRT myself is there?

Thank you,



== Implementation Summary ==

=== Dispatching To Closures ===
All dispatching to OopClosure go through an all static class OopClosureDispatcher. It uses SFINAE to find the most appropriate member function to call by checking certain conditions of the OopClosureType. For any template parameterized OopClosureType calls to member function X in {do_oop, do_metadata, do_klass, do_class_loader_data} it will attempt the following in this order: 1) Check if the OopClosureType is manually unspecialized for whatever reason (currently only used by the abstract classes OopClosure and ExtendedOopClosure), in that case use virtual call. 2) Check if OopClosureType (and not a parent) declares X_nv, then use a non-virtual call to it. Otherwise 3) check if OopClosureType (and not a parent) declares X, then use a non-virtual call to X (hence removing need to define both X and X_nv any more, but still being backward compatible). Otherwise 3) use a normal virtual call to X.

The reason why checking that candidates for non-virtual calls are declared in the concrete OopClosureType and not a parent is to be safe rather than sorry in detecting "abstract" base classes that are never used by themselves. However their derived types are used, but the type passed in to oop_iterate is not the derived closure type but the abstract one. So in summary we make sure we make non-virtual calls to the derived closure types and not the base types.

=== Dispatching To Klass ===
Two mechanisms are used for retaining the OopClosureType when dispatching to it using a specific Klass.
Attempt one hopes there is a separation of concern between finding oops and dispatching to OopClosureType. A virtual call is made to the Klass to either a) return back the oop maps or b) identify which Klass it is with a "DispatchTag".
If oop maps are returned (InstanceKlass, OopArrayKlass, TypeArrayKlass) then they are iterated over from oop_iterate where all the type info about the closure is still known. If this is not supported (InstanceRefKlass, InstanceMirrorKlass, InstanceClassLoaderKlass), then the dispatch tag is used to call an inline template method defined in brand new Instance*Klass.inline.hpp files.

The new methods in Instance*Klass.inline.hpp are (non-virtual) template variants of what the old oop_oop_iterate macros did. They use SFINAE to check if the OopClosureType expects metadata related calls too (if it's an ExtendedOopClosure) otherwise does nothing. Note that oop_iterate now takes an arbitrary type instead of ExtendedOopClosure, but it uses SFINAE to force template instantiations only if it's a subtype of ExtendedOopClosure and otherwise generate compiler errors and similarly on oop_iterate_no_header, specialization is used and it explicitly checks the template parameter is of type OopClosure. If it's an ExtendedOopClosure, it sends the first ever composite closure type NoHeaderOopClosure<OopClosureType> to the dispatch system so that metadata-related calls are never called in the first place. :)



On 01 Oct 2014, at 17:02, Erik Österlund <erik.osterlund at<mailto:erik.osterlund at>> wrote:

On 01 Oct 2014, at 16:11, Stefan Karlsson <stefan.karlsson at<mailto:stefan.karlsson at>> wrote:

On 2014-10-01 15:41, Erik Österlund wrote:

I have to admit I'm not a huge fan of the current system for explicitly specializing closures (removing virtual calls).

Here's a couple of problems I see with the current solution:
1. The specialization macros are obviously not pretty
2. It's awkward to have to remember to explicitly list the closure as specialized in the macros
3. There are plenty of composite closure types out there. What I mean by this is when closures are combined, e.g. one closure is used to filter a memory range, and if passing the filter, it will invoke the actual closure, currently resulting in a virtual call even though the composite structure is completely known at the call site.
4. Each closure has to have like do_oop, do_oop_v, do_oop_nv, for both oop types and then a do_oop_work for joining them. Yuck! Asserts try to check that the _v and _nv methods do the same thing to combat programmer mistakes.

With my alternative template magic solution:
1. You won't have to explicitly specialize wanted closure types - they are automatically specialized unless the contrary is explicitly stated.
2. Parameterized composite closure types can be used without unnecessary virtual call overheads.
3. Only a single do_oop (do_metadata etc) member function is needed, and hence no need to put asserts trying to keep _v and _nv synchronized.
4. It is backward compatible and does not require major refactoring; could transition into this system step by step. The two systems can even co-exist.
5. It supports an interface where OopClosure is the interface to oop_iterate, rather than ExtendedOopClosure. It uses SFINAE to send metadata info to the closure only if the derived type is an ExtendedOopClosure, otherwise it simply sends the oops (do_oop) only. (of course I can remove this if it's unwanted and we intentionally don't want to support oop_iterate(OopClosure*) )

For the interested reader, this is how the old system worked:
The ugly macros generate overloads of oop_iterate on oopDesc which uses a virtual call to the Klass (also using macro generated overloads) to figure out where the oops are and then call the closure. This step with the virtual call to the Klass to call the closure removes any potential for template magic because template member functions can't be virtual in C++.

And this is how my system solves this:
A template oop_iterate (with closure type as parameter) member function uses a virtual call to the Klass, but only to acquire information where oops can be found (and NOT to call the actual closure too). It then uses static template polymorphism (CRTP idiom) to invoke the do_oop method of the corresponding derived closure types (without virtual calls). This required support from the Klass implementations. I currently support object arrays and normal instances. If the Klass implementation does not support this new scheme, it simply reverts to a normal virtual call like before.
As a bonus I made a new include file in utilities/templateIdioms.hpp with some template magic I needed and which I was missing but could likely be used in more places in the future.

Would this change be interesting for the GC group? In that case I could prepare a patch (and perhaps add support to the other Klass implementations). :)
I would also need some help to check if this works on your wide range of platforms and compilers etc (only checked the assembly output for my own setup).

Yes, please provide a patch!

I've had a patch to start solving some of these problems for a long time:<>

one problem that I haven't solved in my patch is how to pass down the OopClosureType past the Klass::oop_iterate virtual calls. See oop.inline.hpp:
template <bool nv, typename OopClosureType>
inline int oopDesc::oop_iterate(OopClosureType* blk) {
 CONCRETE_KLASS_DO_AND_RETURN(klass(), oop_oop_iterate<nv>(this,blk));

and klass.inline.hpp for the dispatch code:

+#define CONCRETE_KLASS_DO_AND_RETURN(the_klass, todo) \
+  do { \
+    switch ((the_klass)->dispatch_tag()) { \
+    case Klass::_instance:                         return InstanceKlass::cast(the_klass)->todo; \
+    case Klass::_instance_ref:                  return InstanceRefKlass::cast(the_klass)->todo; \
+    case Klass::_instance_mirror:            return InstanceMirrorKlass::cast(the_klass)->todo; \
+    case Klass::_instance_class_loader: return InstanceClassLoaderKlass::cast(the_klass)->todo; \
+    case Klass::_obj_array:                        return ObjArrayKlass::cast(the_klass)->todo; \
+    case Klass::_type_array:                      return TypeArrayKlass::cast(the_klass)->todo; \
+    default: fatal("Incorrect dispatch index"); return 0; \
+    }   \
+  } while (false)

Yeah, I was faced with the same problem. It's impossible in C++ to pass template arguments to virtual member functions. (Note however that it's possible to have template parameterized /types/ with virtual member functions, so with your choice of using enums with a dispatch tag I think it should be possible to have a KlassDispatchProxy<ClosureType> cl; cl.get_and_dispatch(obj); where KlassDispatchProxy type (rather than Klass type) is picked using the enum and get_and_dispatch is a virtual method for the parameterized type).

However, my choice was to instead recognize finding oop maps and dispatching to the closure as two fundamentally separate concerns. So instead I implemented a new virtual Klass member function to only get the oop maps (if supported) of an oop (for InstanceKlass simply return the array of oop map blocks, for ObjArrayKlass simply return an oop map dummy with the interval of oops for the array). This way, I can get the oop maps with a virtual call without having to know the ClosureType needed for dispatching, and then when I have the oop maps, simply dispatc (partially using SFINAE to dispatch the stuff the closure expects to receive; depending on if it's an ExtendedOopClosure or not).

I'm glad this is something useful. Currently I only enable this for InstanceKlass and ObjArrayKlass, otherwise it reverts to standard virtual calls. Will prepare a patch with a few more Klass implementations to make it a bit more complete. :)





-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>
-------------- next part --------------
A non-text attachment was scrubbed...
Type: application/zip
Size: 12323 bytes
URL: <>

More information about the hotspot-gc-dev mailing list