RFR (L): 8015774: Add support for multiple code heaps
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Oct 11 10:38:07 PDT 2013
This looks acceptable.
Thanks,
Vladimir
On 10/11/13 7:00 AM, Tobias Hartmann wrote:
> Thanks again for the review.
>
> I added a NonMethodCodeHeapSize flag, because it still wasn't clear what
> to do if the user increases or decreases the default
> ReservedCodeCacheSize without specifying the sizes of the heaps
> individually. Now the default NonMethodCodeHeapSize is used and the
> remaining size is assigned 1/3 to the profiled heap and 2/3 to the
> non-profiled heap. This way it is more transparent, because default
> values for all three heaps are available and can also be set by the
> user. I also added value checks to arguments.cpp.
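>
> (For illustration only -- a minimal sketch of that sizing logic, not the
> actual arguments.cpp code; the local variable names are made up:)
>
>   // If the user only changes ReservedCodeCacheSize, keep the default
>   // NonMethodCodeHeapSize and split the remainder 1/3 : 2/3.
>   size_t non_method_size   = NonMethodCodeHeapSize;
>   size_t remaining         = ReservedCodeCacheSize - non_method_size;
>   size_t profiled_size     = remaining / 3;              // 1/3 -> profiled heap
>   size_t non_profiled_size = remaining - profiled_size;  // ~2/3 -> non-profiled heap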
>
> Please find the new webrev at:
> http://cr.openjdk.java.net/~anoll/8015774/webrev.03/
>
> See comments inline:
>
> 2013/10/11 Vladimir Kozlov <vladimir.kozlov at oracle.com>
>
>
> You need to place "while (next == NULL" between next lines:
>
> _seen++;
> ! _current_nmethod = next;
>
> otherwise you will pass incorrect _current_type to process_nmethod().
>
> Done.
>
> I would prefer to have names similar to flags:
>
> "non-profiling nmethods"
> "profiling nmethods"
>
>
> I changed the names to "Non-profiled nmethods" and "Profiled nmethods".
>
> Regards,
> Tobias
>
>
> On 10/10/13 8:36 AM, Tobias Hartmann wrote:
>
> Vladimir, thanks again for the quick review!
>
> Please find the new webrev at:
> http://cr.openjdk.java.net/~anoll/8015774/webrev.02/
>
> See comments inline:
>
> 2013/10/10 Vladimir Kozlov <vladimir.kozlov at oracle.com>
>
>
>
> And you totally removed the code in the sweeper which enables
> compilation again.
>
> This is bad. PSR runs FA in a VM environment with only a few
> threads (2), and with this the C2 compiler threads will continue to
> consume resources even when there is no space to store the produced
> code. The output will be stormed by "CodeHeap for %s is full."
> messages.
>
> It does not even decay invocation counters, because the next code is
> not executed:
> if (!should_compile_new_jobs()) {
>   CompilationPolicy::policy()->delay_compilation(method());
>   return NULL;
> }
>
> The next code is also not executed. As a result you continue to do only
> partial sweeps when the code cache (some heaps) is full:
>
> 266   if (!CompileBroker::should_compile_new_jobs()) {
> 267     // If we have turned off compilations we might as well do full sweeps
> 268     // in order to reach the clean state faster. Otherwise the sleeping compiler
> 269     // threads will slow down sweeping.
> 270     _invocations = 1;
> 271   }
>
>
> The "CodeHeap for %s is full warning" is printed only once for each
> CodeHeap, so in fact we print less warnings than the old
> implementation
> does.
> Because we check in the common transition functions of the threshold
> policies (SimpleThresholdPolicy::common and
> AdvancedThresholdPolicy::__common) if the CodeHeap for the next
> compilation level is available and remain at the current level
> if it is
> full, the compiler threads do not produce code if a heap is full:
>
> if
> (CodeCache::is_full(CodeCache:__:get_code_blob_type(next___level)))
> {
> // The CodeHeap for next_level is full, stay at current level
> return cur_level;
> }
>
> They only consume resources to repeatedly check if the CodeHeap
> for the
> next compilation level is now available. Benchmarking showed
> that this
> does not negatively affect performance. You are right with the not
> executed code parts and the partial sweeps. Although this can be
> adapted
> easily I followed your suggestions and reverted the changes. As
> before,
> compilation is now disabled as long as a code heap is full and
> enabled
> after sweeping.
>
> I still think that it would be better to disable compilation for
> CodeHeaps individually (as this was done implicitly by my
> implementation), because we could still compile into the other heap.
> Maybe this can be introduced in a future patch.
>
> I agree, this part is still a little bit unclear. This is partly
> due to the fact that I'm still not sure how to set the
> size of the non-method heap. On my machine around 3.5 MB are
> sufficient for all benchmarks (debug/product). The jprt
> tests seem to need more, that's why it is temporarily set to
> 8 MB. I have to evaluate that in more detail. Probably we
> can set it depending on build/runtime settings and
> operating system.
>
>
> [Non]ProfiledCodeHeapSize are already platform-specific flags. You
> don't need to do anything special.
>
>
> Yes, I meant that I still have to figure out the specific values for
> each platform.
>
> One option would be to create a new heap for all blobs needed by
> the VM to start up. This heap's size could be set to
> CodeCacheMinimumUseSpace. All non-method blobs created by the
> running Java application would then be put into the non-method
> heap.
>
>
> CodeCacheMinimumUseSpace is already used to run with only the
> interpreter. You can't change its meaning. You need to make sure
> your new code allows running with such a size.
>
> I mean:
>
> size_t non_method_size = ReservedCodeCacheSize -
> (ProfiledCodeHeapSize + NonProfiledCodeHeapSize);
>
>
> I implemented it as you suggested. By default
> NonProfiledCodeHeapSize is approximately ProfiledCodeHeapSize * 2.
> I will set the platform-specific default values such that the
> non_method_size is big enough to run the interpreter (currently it
> is set to 4 MB).
>
> My question was: why do you need the Profiled heap for C1 and
> Zero? Because this is what you get if COMPILER2 is not defined
> (Client or Zero VM).
>
>
> You are perfectly right. The profiled heap is not needed if
> COMPILER2 is
> not defined. I changed it accordingly.
>
> I'm not sure if I understood your point correctly. At the moment
> we delay compilation for the corresponding compilation
> level if a heap is full. In practice (at least for the
> benchmarks I ran) the second heap then fills up really fast. Your
> suggestion to allocate CodeBlobs in another heap somewhat
> "breaks" the design, because a client of the CodeCache could
> then encounter non-profiled, e.g. C2-compiled, methods while
> iterating over the profiled heap. This is not really a
> problem at the moment, since no code relies on the fact that the
> non-profiled heap _only_ contains C2 methods (as long
> as it contains only _methods_).
> However, I think that the second suggestion, to move the boundary
> between the heaps, is better. One could then increase
> the non-profiled heap and decrease the profiled heap size if the
> first is full. Over time the non-profiled heap would
> probably grow, whereas the profiled heap would shrink.
>
> Because I suspect this to be another big change, I would
> suggest implementing this behavior in a second patch
> and first finishing the current changes.
>
>
> Agree that this is for another change. But you need to design the
> current layout to allow doing that (moving the boundary) in the future.
>
> To move the boundary we have to adapt the VirtualSpace used by the
> CodeHeaps to support "expanding into" another VirtualSpace, as is
> already implemented in class PSVirtualSpace (::expand_into) used by
> the parallel scavenge GC. Having this, we can define two adjacent
> VirtualSpaces for the profiled and non-profiled heaps, one of which
> grows high to low, and move the boundary in case one is full. Similar
> functionality is implemented in class AdjoiningVirtualSpaces (also
> used by the parallel scavenge GC). The current design of the code
> cache shouldn't be affected much.
>
> I will look into this more thoroughly and implement the missing
> functionality in VirtualSpace and CodeHeap.
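>
> (Just to sketch the idea -- a toy model of a movable boundary between two
> adjacent spaces, not the actual VirtualSpace/PSVirtualSpace interface:)
>
>   struct AdjacentSpaces {
>     char* _low;       // start of the profiled space
>     char* _boundary;  // profiled space ends here, non-profiled space starts here
>     char* _high;      // end of the non-profiled space
>
>     // Hand 'bytes' from the top of the profiled space to the non-profiled
>     // space by moving the boundary down (assumes that range is unused).
>     bool shrink_profiled(size_t bytes) {
>       if ((size_t)(_boundary - _low) < bytes) return false;  // not enough room
>       _boundary -= bytes;
>       return true;
>     }
>   };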
>
> The whole print code is this:
>
> + warning("CodeHeap for %s is full. Compilation for these methods has been delayed.",
>           CodeCache::get_heap_name(code_blob_type));
> + warning("Try increasing the code cache size using -XX:ReservedCodeCacheSize=");
>
> You print the specific code heap name in the first message (I agree
> with Christian that the names are not good). You need to additionally
> print the corresponding flag in the next message, because increasing
> ReservedCodeCacheSize will not help the NonProfiled heap if you don't
> increase NonProfiledCodeHeapSize.
>
> Done (names were already changed in the last webrev).
>
> About the changes in sweeper.cpp:
> I think you accidentally removed the collection of freed_memory in
> sweep_code_cache().
>
>
> Right, I reverted that.
>
> And you can simplify the code (no need for the external loop
> "while (_current_type <= ...)") if you move "while (_current_nmethod
> == NULL ..." inside the inner loop after "_current_nmethod = next".
>
>
> Done. Didn't notice that, thanks for the remark.
>
>
> Regards,
> Tobias
>
> thanks,
> Vladimir
>
> 2013/10/4 Christian Thalinger <christian.thalinger at oracle.com>
>
>
> Some general comments:
>
> Don't use T1, T2, … to name the tiers because SPARC
> processors are named like this. It's confusing. Use
> tier 1,
> tier 2, … instead. The compilation policies
> actually talk
> about tier levels...
>
>
> I changed the names of the code heaps and comments
> accordingly.
>
> I was hoping that we can have a looser definition of
> the code heaps, so we can easily add more. What this
> patch does is a good start, but all the hardcoded values and
> fields need to go at a later point in time, in my opinion.
>
>
> To improve the adding of new heaps I completely removed the
> direct dependencies on code heaps from the serviceability
> agent, dtrace and the pstack library, especially the
> _heap_non_method, _heap_non_profiled and _heap_profiled
> fields in the code cache. The GrowableArray<CodeHeap*>* is now
> the only field used to reference the code heaps. This
> allows adding new heaps to the cache without the need to adapt
> references. An exception is the dtrace ustack helper
> script (jhelper.d). Because the D language does not allow loops,
> one has to define a probe for each code heap. For now,
> the script supports up to 5 code heaps, but more can be added by
> simply adding new probes to the script (the comment in line
> 183 of jhelper.d describes this).
>
> To summarize, if I want to add a new code heap to the cache, all
> I have to do is:
> - define a new CodeBlobType (and add it to IsMethodPredicate if
>   it is a method type)
> - create the heap in CodeCache::initialize_heaps() and define
>   default sizes
> [- define heap availability in CodeCache::heap_available]
> [- define the compilation level -> code blob type translation in
>   CodeCache::get_code_blob_type]
>
> The last two steps are optional. CodeBlobs created with the new
> CodeBlobType are then allocated in the corresponding heap; a rough
> sketch of what this amounts to is shown below.
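>
> (Roughly, and with invented names -- the member names, add_heap() and
> SpecialCodeHeapSize below are illustrative, not part of the webrev:)
>
>   // codeBlob.hpp (sketch): one more blob type next to the existing ones
>   struct CodeBlobType {
>     enum {
>       MethodNonProfiled = 0,  // tier 1 and 4 methods (including native methods)
>       MethodProfiled    = 1,  // tier 2 and 3 methods with profile information
>       NonMethod         = 2,  // buffers, adapters and runtime stubs
>       MethodSpecial     = 3   // hypothetical new nmethod category
>     };
>   };
>
>   // codeCache.cpp, CodeCache::initialize_heaps() (sketch):
>   //   add_heap(CodeBlobType::MethodSpecial, "Special nmethods", SpecialCodeHeapSize);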
>
> I'm not completely sure what you mean by "hardcoded" values.
> Do you mean the default values for
> NonProfiledCodeHeapSize and ProfiledCodeHeapSize? Of course we
> could compute them implicitly as 1/3 and 2/3 of the
> ReservedCodeCacheSize (as is now done if the user does not
> provide a value), without defining them as VM parameters,
> but otherwise we have to define default values for them.
>
> src/share/vm/code/codeBlob.hpp:
>
> +// CodeBlob Types
> +// Used in the CodeCache to assign CodeBlobs to different CodeHeaps
> +enum CodeBlobType {
> +  btMethodNoProfile = 0,  // T1 and T4 methods (including native methods)
> +  btMethodProfile   = 1,  // T2 and T3 methods with profile information
> +  btNonMethod       = 2   // Non-methods like Buffers, Adapters and Runtime Stubs
> +};
>
> The bt prefix is already overloaded (e.g. BasicType).
> Generally speaking, I never liked the fact that enums don't
> have their name as scope. Maybe we can agree on doing
> something like this:
>
> struct CodeBlobType {
>   enum Enum {
>     NonMethod = 2
>   };
> };
>
> void foo() {
>   int a = CodeBlobType::NonMethod;
>   int b = (CodeBlobType::Enum) 1;
> }
>
> Or we move to C++11 :-)
>
> http://www.stroustrup.com/C++11FAQ.html#enum
>
>
> Yes, it is probably clearer to include the scope. I followed
> your suggestion and changed it to a struct with an
> anonymous enum.
>
> src/share/vm/code/codeCache.cpp:
>
> +#define FOR_ALL_HEAPS(index) for (int index = 0; index < _heaps->length(); ++index)
>
> I'm one of the people who started to hate macros (for
> various reasons). Although I see the point in defining
> these
> I'd rather go without them. Is there a way we can
> use STL
> vectors with our allocators? Or we could write our own
> iterator class which iterates either over
> GrowableArrays in
> general or the code heaps in particular.
>
>
> I perfectly agree with you that macros should be used only if
> necessary and to improve readability in special cases. I
> checked the option of using STL vectors, but since the STL is
> not used in the code yet and I'm not sure if they can be
> used with the HotSpot allocators, I think it is better to work
> with our own iterator classes. I therefore defined the custom
> iterator classes GrowableArrayIterator and
> GrowableArrayFilterIterator. They implement the STL input
> iterator interface without actually inheriting from it
> (since #include <iterator> seems not to work on Solaris).
> The filter iterator is used to iterate only over those code
> heaps satisfying a given predicate, for example those heaps
> containing nmethods. This makes the code a lot more readable
> and reduces the number of macros. I think the remaining four
> macros, namely FOR_ALL_HEAPS and FOR_ALL_METHOD_HEAPS to
> iterate over the heaps and FOR_ALL_BLOBS and
> FOR_ALL_ALIVE_BLOBS to iterate over the blobs, are justified,
> because they are simple (only declaring the iterators) and
> make the code more readable.
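>
> (For reference, the gist of the filter iterator in a simplified,
> self-contained form -- not the actual GrowableArrayFilterIterator code:)
>
>   // Visits only the elements of [begin, end) that satisfy a predicate,
>   // e.g. E = CodeHeap* with a predicate like IsMethodPredicate.
>   template <typename E, typename Pred>
>   class FilterIterator {
>     E*   _cur;
>     E*   _end;
>     Pred _pred;
>     void skip_filtered() { while (_cur != _end && !_pred(*_cur)) ++_cur; }
>    public:
>     FilterIterator(E* begin, E* end, Pred pred)
>       : _cur(begin), _end(end), _pred(pred) { skip_filtered(); }
>     bool done() const      { return _cur == _end; }
>     E&   operator*() const { return *_cur; }
>     void next()            { ++_cur; skip_filtered(); }
>   };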
>
> src/share/vm/memory/heap.hpp:
>
> + const char* getName() const { return _name; }
>
> This should either be get_name() or name().
>
> I renamed it to name().
>
> Thanks!
>
> Best regards
> Tobias
>
>
> On 10/2/13 7:06 AM, Tobias Hartmann wrote:
>
> Hi,
>
> please review the following change.
>
> bug:
> https://bugs.openjdk.java.net/browse/JDK-8015774
> webrev:
> http://cr.openjdk.java.net/~anoll/8015774/webrev.00/
>
> This change implements support for multiple code heaps in the code
> cache. The interface of the code cache was changed accordingly and
> references from other components of the VM were adapted. This
> includes the indirect references from:
> - the Serviceability Agent: vmStructs and the Java code cache
>   interface (sun.jvm.hotspot.code.CodeCache)
> - the dtrace ustack helper script (jhelper.d)
> - the pstack support library libjvm_db.c
>
> Currently the code cache contains the following three code heaps,
> each of which contains CodeBlobs of a specific type:
> - Non-profiled methods: nmethods that are not profiled, i.e., those
>   compiled at tier 1 or 4, and native methods
> - Profiled methods: nmethods that are profiled, i.e., those compiled
>   at tier 2 or 3
> - Non-methods: non-methods like Buffers, Adapters and Runtime Stubs
>
> By default, 2/3 of the ReservedCodeCacheSize is used for the
> non-profiled heap and 1/3 is used for the profiled heap. Experiments
> with a small code cache size have shown that this configuration
> performs best (see [1]). Sizes can be configured using the
> NonProfiledCodeHeapSize and ProfiledCodeHeapSize parameters. For now
> the non-method heap has a fixed size of 8 MB. More tests have to be
> performed to determine reasonable default values for different build
> and runtime configurations. It would also be possible to add an
> additional heap for the CodeBlobs needed during VM startup (adapters,
> stubs, interpreter, ...) and use the non-method heap for non-method
> blobs allocated during runtime.
>
> The main benefit of this allocation is that functions operating on
> nmethods can now be optimized to only iterate over CodeBlobs on the
> nmethod heaps, avoiding a full scan of the code cache including the
> non-methods. Performance evaluation shows that this results in a
> great reduction of the sweeping time. Further, it is now possible to
> extend the sweeper to selectively sweep nmethods of different
> compilation levels. Because it is more expensive to recompile a
> highly optimized method, it makes sense to first sweep those compiled
> at a lower level. Possible future optimizations may also include
> multithreaded sweeping and separation of nmethod code and metadata.
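>
> (Conceptually -- with stand-in names, this is not the actual sweeper
> code -- the sweep then only visits the nmethod heaps:)
>
>   // Sketch: scan only heaps that hold nmethods; the non-method heap is
>   // skipped entirely. contains_nmethods() is a made-up predicate.
>   for (int i = 0; i < _heaps->length(); i++) {
>     CodeHeap* heap = _heaps->at(i);
>     if (!heap->contains_nmethods()) continue;
>     // ... sweep the nmethods of 'heap' ...
>   }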
>
> The code was tested using Nashorn + Octane, DaCapo,
> SPECjvm2008 and jprt.
>
> Benchmark results:
> - [2] shows the score of the old and new version running Nashorn with
>   the Octane benchmark at different code cache sizes (10 runs each;
>   sizes < 32 MB crash with the old version). A performance
>   improvement of around 15% is achieved.
> - [3] shows the sweep time of the NMethodSweeper during runs of the
>   Octane benchmark with different code cache sizes (10 runs each).
>   The time is greatly reduced with the new version.
> - [4] shows the time ratio (TOld / TNew - 1) of the DaCapo benchmarks
>   with a code cache size of 64 MB (30 runs with 20 warmups each).
>   With the time differences being smaller than the confidence
>   intervals it is not possible to determine a difference between the
>   two versions.
> - Multiple runs of the SPECjvm2008 benchmark with different code
>   cache sizes did not show a difference between the two versions.
>   This is probably due to the fact that not much code is compiled at
>   runtime compared to Nashorn with the Octane benchmark.
>
> The error bars at each data point show the 95% confidence interval.
>
> [1] https://bugs.openjdk.java.net/secure/attachment/16372/OctaneRatio.png
> [2] https://bugs.openjdk.java.net/secure/attachment/16374/OctaneCCSizes_new.png
> [3] https://bugs.openjdk.java.net/secure/attachment/16373/OctaneSweepTime_new.png
> [4] https://bugs.openjdk.java.net/secure/attachment/16371/DaCapoCCSizes_64.png
>
> Thanks and best regards,
>
> Tobias
>
>
>
>
More information about the hotspot-compiler-dev
mailing list