RFR: 8201436: Replace oop_ps_push_contents with oop_iterate and closure

Thu Oct 4 16:20:15 UTC 2018

Hi!

This is the first enhancement in a series of three to remove Parallel GC 
specific behaviour from shared code. I will post all three enhancements 
to the mailing list.

JDK-8201436 Replace oop_ps_push_contents with oop_iterate and closure
             https://bugs.openjdk.java.net/browse/JDK-8201436
JDK-8211446 Replace oop_pc_follow_contents with oop_iterate and closure
             https://bugs.openjdk.java.net/browse/JDK-8211446
JDK-8211447 Replace oop_pc_update_pointers with oop_iterate and closure
             https://bugs.openjdk.java.net/browse/JDK-8211447

The webrevs are to be applied in the same order as in the above list 
although logically they are quite independent of each other.

The aim is to remove Parallel GC specific behaviour by removing 
oop_ps_push_contents, oop_pc_follow_contents and oop_pc_update_pointers 
from the *Klass hierarchy in oops/*Klass.hpp. Instead of using the old 
dynamic dispatch on the Klass object, the generic oop_iterate* (used by 
all other GCs) is used instead.

Improvements include:
- removing GC specific implementation from shared code. Better 
separation, less #ifdefs
- reusing the same iterators everywhere, using the same mechanism to 
iterate in the different GCs
- reduction in code

Performance:
Most of the benchmarks looks good, they show about the same performance 
before and after the changes. I have had serious problems with the Derby 
benchmark that did not like the way the generic reference handling was 
visiting objects. Changing the order in oop_oop_iterate_reverse 
introduced regressions in G1. Therefore I have added new 
oop_oop_iterate_reverse specialisations to speed up Parallel (reduce 
regression) without harming G1.

Ideas of things to do after this commit series:
- minor improvements in oop_oop_iterate_reverse (removing return of size 
that is not used anywhere)
- remove dynamic dispatch call to check if klass is of typeArray "kind" 
(!obj->klass()->is_typeArray_klass()). The empty action on type arrays 
is already handled and this extra check seems mostly confusing and 
possibly a small unnecessary cost.
- possibly move creation of push closure out of push_contents, and by 
that win some performance (the new closure now has to set a reference 
processor)

Enhancement:
https://bugs.openjdk.java.net/browse/JDK-8201436

Webrev:
http://cr.openjdk.java.net/~lkorinth/8201436/00/

Testing:
- all patches build on linux.
- mach5 (linux, windows and mac) tier1,tier2,tier3,tier4,tier5 has 
passed with all three patches applied
- aurora tests have been run and seem to indicate no change in general, 
and only a small regression in Derby. The last (Derby) run showed a 
young collection regression of 1.02% on Linux and 2.76% improvement on 
windows, but including experience from earlier runs, my estimate is that 
it is in reality a bit worse and that we have a _slight_ but acceptable 
regression in Derby.

Thanks, Leo