From y.s.ramakrishna at oracle.com Mon Aug 1 18:35:33 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Mon, 01 Aug 2011 11:35:33 -0700 Subject: Review request: 7072527 CMS: JMM GC counters overcount in some cases In-Reply-To: <4E333E8D.7070005@oracle.com> References: <4E316E5D.50703@oracle.com> <4E31FEF1.8060205@oracle.com> <4E3207F9.6020805@oracle.com> <4E32E168.7000908@oracle.com> <4E333E8D.7070005@oracle.com> Message-ID: <4E36F1F5.50606@oracle.com> Looks good to me; ship it! -- ramki On 7/29/2011 4:13 PM, Kevin Walls wrote: > > Actually I missed the other comment that we could go further and > remove the unneeded constructor. > > http://cr.openjdk.java.net/~kevinw/7072527/webrev.01/ > > Thanks > Kevin > > On 29/07/11 17:35, Kevin Walls wrote: >> Hi -- >> >> I got this ready in a webrev: >> >> http://cr.openjdk.java.net/~kevinw/7072527/webrev.00/ >> >> The new test is passing, as are the old ones. 8-) >> >> Also, this change means that a CMS cycle ending in concurrent mode >> failure now counts as one collection. One could argue that either >> way (should it be 2 collections?) - but I'm thinking if we are >> counting "completed" collections then we are now counting correctly! >> >> Thanks >> Kevin >> >> >> On 29/07/11 02:08, Y. S. Ramakrishna wrote: >>> I filed: 7072527 CMS: JMM GC counters overcount in some cases >>> >>> On 07/28/11 17:29, Y. S. Ramakrishna wrote: >>>> Hi Kevin -- >>>> >>>> thanks for jumping on this! More inline below ... >>>> >>>> On 07/28/11 09:33, Krystal Mok wrote: >>>>> Hi Kevin, >>>>> >>>>> Thank you for taking care of this, and it's good to see the >>>>> problem is verified. >>>>> >>>>> I think whether or not the suggested fix is sufficient depends on >>>>> what paths can reach CMSCollector::do_compaction_work(). If all >>>>> paths that can reach CMSCollector::do_compaction_work() come from >>>>> GenCollectedHeap::do_collection(), then the fix should be good to >>>>> go. Otherwise it'll need a better workaround. 
>>>>> >>>>> I believe all concurrent mode failures/interrupts (which includes >>>>> the System.gc() case) does come from >>>>> GenCollectedHeap::do_collection(), but I'm not exactly sure about >>>>> this, could anybody please clarify on it? >>>> >>>> Yes, i believe this is indeed the case, and my browsing of the >>>> code using cscope seemed to confirm that belief. >>>> >>>> More below ... >>>> >>>>> >>>>> Regards, >>>>> Kris Mok >>>>> >>>>> On Thu, Jul 28, 2011 at 10:12 PM, Kevin Walls >>>>> > wrote: >>>>> >>>>> __ >>>>> Hi -- >>>>> >>>>> 6580448 was marked as a duplicate of 6581734, which fixed the >>>>> fact >>>>> that CMS collections were just not counted at all - with CMS, >>>>> only a >>>>> stop the world full gc would be counted in the stats. >>>>> >>>>> But looks like you're right... Here is a quick variation of the >>>>> testcase from 6581734 which shows the same thing, and this >>>>> verifies >>>>> the same, and is solved by ExplicitGCInvokesConcurrent. If >>>>> there is >>>>> no other feedback I can test if the removal of the >>>>> TraceCMSMemoryManagerStats() call in >>>>> CMSCollector::do_compaction_work is all we need... >>>> >>>> >>>> Kevin, yes, it would be great if you could verify this and push the >>>> fix. >>>> I am not sure if the push would need to wait for the signing of OCA >>>> from Kris, but best to check with Those Who Would Know Such Things. >>>> >>>> Since the original CR has been closed, i'll open one momentarily and >>>> can make you RE (if that's OK with you). I'll be happy to serve as >>>> reviewer of the change. >>>> >>>> As regards the jstat counter reporting two pauses per concurrent >>>> CMS cycle, I am of two minds on what the original intention >>>> was. I'd have originally regarded the double increment as a >>>> bug, but as you state it is really two pauses, even if part of >>>> a single cycle. And it makes sense to count them as two. 
I >>>> agree that this should be documented and left alone, given >>>> how long we have had this behaviour, and the alternative >>>> (of counting cycles, rather than pauses) may be no better >>>> (or arguably worse). There's actually an open CR for this >>>> which we can redirect into a CR to update the relevant documentation. >>>> >>>> -- ramki >>>> >>>>> >>>>> Regards >>>>> Kevin >>>>> >>>>> >>>>> /* >>>>> * Copyright (c) 2011, Oracle and/or its affiliates. All rights >>>>> reserved. >>>>> * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. >>>>> * >>>>> * This code is free software; you can redistribute it and/or >>>>> modify it >>>>> * under the terms of the GNU General Public License version 2 >>>>> only, as >>>>> * published by the Free Software Foundation. >>>>> * >>>>> * This code is distributed in the hope that it will be >>>>> useful, but >>>>> WITHOUT >>>>> * ANY WARRANTY; without even the implied warranty of >>>>> MERCHANTABILITY or >>>>> * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General >>>>> Public License >>>>> * version 2 for more details (a copy is included in the LICENSE >>>>> file that >>>>> * accompanied this code). >>>>> * >>>>> * You should have received a copy of the GNU General Public >>>>> License >>>>> version >>>>> * 2 along with this work; if not, write to the Free Software >>>>> Foundation, >>>>> * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. >>>>> * >>>>> * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA >>>>> 94065 USA >>>>> * or visit www.oracle.com if you need >>>>> additional information or have any >>>>> * questions. 
>>>>> */ >>>>> >>>>> /* >>>>> * @test TestFullGCCount.java >>>>> * @bug >>>>> * @summary >>>>> * @run main/othervm -XX:+UseConcMarkSweepGC TestFullGCCount >>>>> * >>>>> */ >>>>> import java.util.*; >>>>> import java.lang.management.*; >>>>> >>>>> >>>>> >>>>> public class TestFullGCCount { >>>>> >>>>> private String poolName = "CMS"; >>>>> private String collectorName = "ConcurrentMarkSweep"; >>>>> >>>>> public static void main(String [] args) { >>>>> >>>>> TestFullGCCount t = null; >>>>> if (args.length==2) { >>>>> t = new TestFullGCCount(args[0], args[1]); >>>>> } else { >>>>> System.out.println("Defaulting to monitor CMS pool and collector."); >>>>> t = new TestFullGCCount(); >>>>> } >>>>> t.run(); >>>>> >>>>> } >>>>> >>>>> public TestFullGCCount(String pool, String collector) { >>>>> poolName = pool; >>>>> collectorName = collector; >>>>> } >>>>> public TestFullGCCount() { >>>>> } >>>>> >>>>> public void run() { >>>>> >>>>> int count = 0; >>>>> int iterations = 20; >>>>> long counts[] = new long[iterations]; >>>>> boolean diffAlways2 = true; // assume we will fail >>>>> >>>>> for (int i=0; i<iterations; i++) { >>>>> System.gc(); >>>>> counts[i] = checkStats(); >>>>> if (i>0) { >>>>> if (counts[i] - counts[i-1] != 2) { >>>>> diffAlways2 = false; >>>>> } >>>>> } >>>>> } >>>>> if (diffAlways2) { >>>>> throw new RuntimeException("FAILED: difference in count is always 2."); >>>>> } >>>>> System.out.println("Passed."); >>>>> } >>>>> >>>>> private long checkStats() { >>>>> long count = 0; >>>>> List<MemoryPoolMXBean> pools = >>>>> ManagementFactory.getMemoryPoolMXBeans(); >>>>> List<GarbageCollectorMXBean> collectors = >>>>> ManagementFactory.getGarbageCollectorMXBeans(); >>>>> for (int i=0; i<collectors.size(); i++) { >>>>> GarbageCollectorMXBean collector = collectors.get(i); >>>>> String name = collector.getName(); >>>>> if (name.contains(collectorName)) { >>>>> System.out.println(name + ": collection count = " >>>>> + collector.getCollectionCount()); >>>>> count = collector.getCollectionCount(); >>>>> } >>>>> } >>>>> return count; 
>>>>> } >>>>> >>>>> } >>>>> >>>>> >>>>> On 27/07/11 17:12, Krystal Mok wrote: >>>>>> Hi all, >>>>>> >>>>>> I've been looking at a strange inconsistency in the full GC counts >>>>>> recorded by jvmstat and JMM counters. I'd like to know which >>>>>> ones >>>>>> of the following behaviors are by design, which ones are >>>>>> bugs, and >>>>>> which ones are just my misunderstanding. I apologize for >>>>>> making a >>>>>> short story long... >>>>>> >>>>>> ===================================================== >>>>>> >>>>>> The counters involved: >>>>>> >>>>>> * A jvmstat counter named "sun.gc.collector.1.invocations" keeps >>>>>> track of the number of pauses that occurred as a result of a major >>>>>> collection. It is used by utilities such as jstat as the >>>>>> source of >>>>>> "FGC" (full collection count), and the old gen collection >>>>>> count in >>>>>> Visual GC. It's updated by a TraceCollectorStats object. >>>>>> * A JMM counter, GCMemoryManager::_num_collections, keeps >>>>>> track of >>>>>> the number of collections that have ended. This counter is >>>>>> used as >>>>>> HotSpot's implementation of the JMX >>>>>> GarbageCollectorMXBean.getCollectionCount(). It's updated by >>>>>> either a TraceMemoryManagerStats object or a >>>>>> TraceCMSMemoryManagerStats object. >>>>>> >>>>>> To show the situation, I've made a screenshot of a VisualVM >>>>>> and a >>>>>> JConsole running side by side, both monitoring the >>>>>> VisualVM's >>>>>> GC stats: >>>>>> >>>>>> http://dl.iteye.com/upload/attachment/524811/913cb0e1-7add-3ac0-a718-24ca705cad22.png >>>>>> >>>>>> (I'll upload the screenshot to somewhere else if anybody >>>>>> can't see it) >>>>>> The VisualVM instance is running on JDK6u26, with ParNew+CMS. >>>>>> In the screenshot, Visual GC reports that the old gen collection >>>>>> count is 20, while JConsole reports 10. 
>>>>>> >>>>>> I see that there was this bug: >>>>>> 6580448: CMS: Full GC collection count mismatch between >>>>>> GarbageCollectorMXBean and jvmstat (VisualGC) >>>>>> I don't think the current implementation has a bug in the sense >>>>>> that the two counters don't report the same number. >>>>>> >>>>>> This behavior seems reasonable, but the naming of the value in >>>>>> these tools are confusing: both tools say "collections", but >>>>>> apparently the number in Visual GC means "number of pauses" >>>>>> where >>>>>> as the number in JConsole means "number of collection cycles". >>>>>> It'd be great if the difference could be documented >>>>>> somewhere, if >>>>>> that's the intended behavior. >>>>>> >>>>>> And then the buggy behavior. Code demo posted on gist: >>>>>> https://gist.github.com/1106263 >>>>>> Starting from JDK6u23, when using CMS without >>>>>> ExplicitGCInvokesConcurrent, System.gc() (or >>>>>> Runtime.getRuntime().gc(), or MemoryMXBean.gc() via JMX) would >>>>>> make the JMM GC counter increment by 2 per invocation, while the >>>>>> jvmstat counter is only incremented by 1. I believe the >>>>>> latter is >>>>>> correct and the former needs some fixing. >>>>>> >>>>>> ===================================================== >>>>>> >>>>>> My understanding of the behavior shown above: >>>>>> >>>>>> 1. The concurrent GC part: >>>>>> >>>>>> There are 2 pauses in a CMS concurrent GC cycle, one in the >>>>>> initial mark phase, and one in the final remark phase. >>>>>> To trigger a concurrent GC cycle, the CMS thread wakes up >>>>>> periodically to see if it shouldConcurrentCollect(), and >>>>>> trigger a >>>>>> cycle when the predicate returned true, or goes back to sleep if >>>>>> the predicate returned false. The whole concurrent GC cycle >>>>>> doesn't go through GenCollectedHeap::do_collection(). 
>>>>>> >>>>>> The jvmstat counter for old gen pauses is updated in >>>>>> CMSCollector::do_CMS_operation(CMS_op_type op), which exactly >>>>>> covers both pause phases. >>>>>> >>>>>> The JMM counter, however, is updated in the concurrent sweep >>>>>> phase, CMSCollector::sweep(bool asynch), if there was no >>>>>> concurrent mode failure; or it is updated in >>>>>> CMSCollector::do_compaction_work(bool clear_all_soft_refs) in >>>>>> case >>>>>> of a bailout due to concurrent mode failure (advertised as so in >>>>>> the code comments). So that's an increment by 1 per >>>>>> concurrent GC >>>>>> cycle, which does reflect the "number of collection cycles", >>>>>> fair >>>>>> enough. >>>>>> >>>>>> So far so good. >>>>>> >>>>>> 2. The System.gc() part: >>>>>> >>>>>> Without ExplicitGCInvokesConcurrent set, System.gc() does a >>>>>> stop-the-world full GC, which consists of only one pause, so >>>>>> "number of pauses" would equal "number of collections" in >>>>>> this case. >>>>>> It will go into GenCollectedHeap::do_collection(); both the >>>>>> jvmstat and the JMM GC counter gets incremented by 1 here, >>>>>> >>>>>> TraceCollectorStats tcs(_gens[i]->counters()); >>>>>> TraceMemoryManagerStats tmms(_gens[i]->kind()); >>>>>> >>>>>> But, drilling down into: >>>>>> _gens[i]->collect(full, do_clear_all_soft_refs, size, is_tlab); >>>>>> >>>>>> That'll eventually go into: >>>>>> CMSCollector::acquire_control_and_collect(bool full, bool >>>>>> clear_all_soft_refs) >>>>>> >>>>>> System.gc() is user requested so that'll go further into >>>>>> mark-sweep-compact: >>>>>> CMSCollector::do_compaction_work(bool clear_all_soft_refs) >>>>>> And here, it increments the JMM GC counter again (remember it >>>>>> was >>>>>> in the concurrent GC path too, to handle bailouts), even though >>>>>> this is still in the same collection. This leads to the "buggy >>>>>> behavior" mentioned earlier. 
>>>>>> >>>>>> The JMM GC counter wasn't added to CMS until this fix got in: >>>>>> 6581734: CMS Old Gen's collection usage is zero after GC >>>>>> which is >>>>>> incorrect >>>>>> >>>>>> The code added to CMSCollector::do_compaction_work() works >>>>>> fine in >>>>>> the concurrent GC path, but interacts badly with the original >>>>>> logic in GenCollectedHeap::do_collection(). >>>>>> >>>>>> ===================================================== >>>>>> >>>>>> I thought all concurrent mode failures/interrupts come from >>>>>> GenCollectedHeap::do_collection(). If that's the case, then it >>>>>> seems unnecessary to update the JMM GC counter in >>>>>> CMSCollector::do_compaction_work(); simply removing it should >>>>>> fix >>>>>> the problem. >>>>>> >>>>>> With that, I'd propose the following (XS) change: (diff >>>>>> against HS20) >>>>>> >>>>>> diff -r f0f676c5a2c6 >>>>>> >>>>>> src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp >>>>>> >>>>>> --- >>>>>> >>>>>> a/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp >>>>>> >>>>>> Tue Mar 15 19:30:16 2011 -0700 >>>>>> +++ >>>>>> >>>>>> b/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp >>>>>> >>>>>> Thu Jul 28 00:02:41 2011 +0800 >>>>>> @@ -2022,9 +2022,6 @@ >>>>>> >>>>>> _intra_sweep_estimate.padded_average()); >>>>>> } >>>>>> >>>>>> - { >>>>>> - TraceCMSMemoryManagerStats(); >>>>>> - } >>>>>> GenMarkSweep::invoke_at_safepoint(_cmsGen->level(), >>>>>> ref_processor(), clear_all_soft_refs); >>>>>> #ifdef ASSERT >>>>>> >>>>>> The same goes for the changes in: >>>>>> 7036199: Adding a notification to the implementation of >>>>>> GarbageCollectorMXBeans >>>>>> >>>>>> ===================================================== >>>>>> >>>>>> P.S. Is there an "official" name for the counters that I >>>>>> referred >>>>>> to as "jvmstat counters" above? Is it just jvmstat, or >>>>>> PerfData or >>>>>> HSPERFDATA? 
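The overcount discussed in this thread can be observed from inside a running JVM with a small probe along the following lines. This is only a sketch under stated assumptions: it needs Java 8 or later (for the diamond operator and Map.getOrDefault), and the class name GcCountDelta and the helpers snapshot() and delta() are hypothetical names chosen for illustration; they are not taken from the thread, the gist, or the JDK. It reads GarbageCollectorMXBean.getCollectionCount() for every collector before and after each System.gc() call and prints the per-collector difference.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
public class GcCountDelta {

    // Record the current JMM collection count of every collector, keyed by name.
    static Map<String, Long> snapshot() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    // Per-collector difference between two snapshots (after minus before).
    // A collector missing from "before" is treated as having a count of 0.
    static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> diff = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long prev = before.getOrDefault(e.getKey(), 0L);
            diff.put(e.getKey(), e.getValue() - prev);
        }
        return diff;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            Map<String, Long> before = snapshot();
            System.gc();
            Map<String, Long> after = snapshot();
            System.out.println("System.gc() #" + i + " -> " + delta(before, after));
        }
    }
}
```

On a JVM affected by the behavior described above (CMS without ExplicitGCInvokesConcurrent), the old gen collector's difference would be expected to show 2 per explicit GC rather than 1; on other collectors or JVM versions the numbers may differ, so the probe prints the deltas instead of asserting a value.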
>>>>> >>>>> >> > From john.cuthbertson at oracle.com Tue Aug 2 18:16:56 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Tue, 02 Aug 2011 18:16:56 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7068240: G1: Long "parallel other time" and "ext root scanning" when running specific benchmark Message-ID: <20110802181658.AA8EA478CB@hg.openjdk.java.net> Changeset: 14a2fd14c0db Author: johnc Date: 2011-08-01 10:04 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/14a2fd14c0db 7068240: G1: Long "parallel other time" and "ext root scanning" when running specific benchmark Summary: In root processing, move the scanning of the reference processor's discovered lists to before RSet updating and scanning. When scanning the reference processor's discovered lists, use a buffering closure so that the time spent copying any reference object is correctly attributed. Also removed a couple of unused and irrelevant timers. Reviewed-by: ysr, jmasa ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp From tony.printezis at oracle.com Tue Aug 2 19:13:36 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 02 Aug 2011 15:13:36 -0400 Subject: CRR (M): 7059019: G1: add G1 support to the SA Message-ID: <4E384C60.6090603@oracle.com> Hi all, This is a webrev with the changes to add G1 support to the Serviceability Agent: http://cr.openjdk.java.net/~tonyp/7059019/webrev.0/ I already got a couple of preliminary reviews on it, so I think one extra would be sufficient. 
Tony From john.cuthbertson at oracle.com Tue Aug 2 22:41:33 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Tue, 02 Aug 2011 22:41:33 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7069863: G1: SIGSEGV running SPECjbb2011 and -UseBiasedLocking Message-ID: <20110802224135.C2BC7478D7@hg.openjdk.java.net> Changeset: 6aa4feb8a366 Author: johnc Date: 2011-08-02 12:13 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6aa4feb8a366 7069863: G1: SIGSEGV running SPECjbb2011 and -UseBiasedLocking Summary: Align the reserved size of the heap and perm to the heap region size to get a preferred heap base that is aligned to the region size, and call the correct heap reservation constructor. Also add a check in the heap reservation code that the reserved space starts at the requested address (if any). Reviewed-by: kvn, ysr ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/runtime/virtualspace.cpp From kevin.walls at oracle.com Wed Aug 3 11:56:38 2011 From: kevin.walls at oracle.com (Kevin Walls) Date: Wed, 03 Aug 2011 12:56:38 +0100 Subject: Review request: 7072527 CMS: JMM GC counters overcount in some cases In-Reply-To: <4E36F1F5.50606@oracle.com> References: <4E316E5D.50703@oracle.com> <4E31FEF1.8060205@oracle.com> <4E3207F9.6020805@oracle.com> <4E32E168.7000908@oracle.com> <4E333E8D.7070005@oracle.com> <4E36F1F5.50606@oracle.com> Message-ID: <4E393776.6080100@oracle.com> Thanks Ramki, and thanks for chasing up and clarifying the OCA point. Kevin On 01/08/11 19:35, Ramki Ramakrishna wrote: > Looks good to me; ship it! > > -- ramki From john.cuthbertson at oracle.com Wed Aug 3 17:44:49 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 03 Aug 2011 10:44:49 -0700 Subject: RFR(M/L): 6484982: G1: process references during evacuation pauses In-Reply-To: <4E03B1A8.7020703@oracle.com> References: <4E03B1A8.7020703@oracle.com> Message-ID: <4E398911.1070307@oracle.com> Hi Everyone, A new webrev incorporating some feedback from Ramki can be found at: http://cr.openjdk.java.net/~johnc/6484982/webrev.1/ Thanks, JohnC On 06/23/11 14:35, John Cuthbertson wrote: > Hi Everyone, > > I would like to get a couple of volunteers to review the code changes > for this CR - the webrev can be found at > http://cr.openjdk.java.net/~johnc/6484982/webrev.0/ > > Summary: > G1 now contains 2 instances of the reference processor class - one for > concurrent marking and the other for STW GCs (both full and > incremental evacuation pauses). For evacuation pauses, during object > scanning and RSet scanning I embed the STW reference processor into > the OopClosures used to scan objects. This causes reference objects to > be 'discovered' by the reference processor. Towards the end of the > evacuation pause (just prior to retiring the the GC alloc regions) I > have added the code to process these discovered reference objects, > preserving (and copying) referent objects (and their reachable graphs) > as appropriate. The code that does this makes extensive use of the > existing copying oop closures and the G1ParScanThreadState structure > (to handle to-space allocation). 
> > The code changes also include a couple of fixes that were exposed by > the reference processing: > * In satbQueue.cpp, the routine > SATBMarkQueueSet::par_iterate_closure_all_threads() was claiming all > JavaThreads (giving them one parity value) but skipping the VMThread. > In a subsequent call to Thread::possibly_parallel_oops_do, the Java > threads were successfully claimed but the VMThread was not. This could > cause the VMThread's handle area to be skipped during the root scanning. > * There were a couple of assignments to the discovered field of > Reference objects that were not guarded by _discovery_needs_barrier > resulting in the G1 C++ write-barrier to dirty the card spanning the > Reference object's discovered field. This was causing the card table > verification (during card table clearing) to fail. > * There were also a couple of assignments of NULL to the next field > of Reference objects causing the same symptom. > > Testing: The GC test suite (32/64 bit) (+UseG1GC, +UseG1GC > +ExplicitGCInvokesConcurrent, +UseG1GC > InitiatingHeapOccupancyPercent=5, +UseG1GC +ParallelRefProcEnabled), > KitchenSink (48 hour runs with +UseG1GC, +UseG1GC > +ExplicitGCInvokesConcurrent), OpenDS (+UseG1GC, +UseG1GC > +ParallelRefProcEnabled), nsk GC and compiler tests, and jprt. Testing > was conducted with the _is_alive_non_header field in the STW ref > procssor both cleared and set (when cleared, more reference objects > are 'discovered'). > > Thanks, > > JohnC From y.s.ramakrishna at oracle.com Wed Aug 3 18:07:52 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Wed, 03 Aug 2011 11:07:52 -0700 Subject: understanding GC logs In-Reply-To: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04E82@exnjmb89.nam.nsroot.net> References: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04E82@exnjmb89.nam.nsroot.net> Message-ID: <4E398E78.3000408@oracle.com> On 8/3/2011 10:45 AM, Darji, Kinnari wrote: > > Hello GC team, > > What does this all different time mean? 
Can someone please clarify? > > What is the time application when application stops? > > [GC 9768.668: [ParNew > ^^^^^^ JVM timestamp (seconds since start of JVM) at start of GC operation) > > 3746 Desired survivor size 10878976 bytes, new threshold 4 (max 4) > > 3747 - age 1: 594288 bytes, 594288 total > > 3748 - age 2: 2369912 bytes, 2964200 total > > 3749 - age 3: 2877584 bytes, 5841784 total > > 3750 - age 4: 3075264 bytes, 8917048 total > > 3751 : 182066K->12384K(191744K), 0.0089120 secs] > 2755986K->2586303K(10710272K), 0.0092180 secs] > ^^^^^^^^ ^^^^^^^ Duration of Scavenge Duration of whole GC operation (includes scavenge) > > [Times: user=0.09 sys=0.00, real=0.01 secs] > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Process virtual user and system times, and real (elapsed) time during GC operation. The time for which the application threads were stopped is about 9.2 ms. -- ramki -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Wed Aug 3 18:36:17 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Wed, 03 Aug 2011 11:36:17 -0700 Subject: understanding GC logs In-Reply-To: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04F84@exnjmb89.nam.nsroot.net> References: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04E82@exnjmb89.nam.nsroot.net> <4E398E78.3000408@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691E04F84@exnjmb89.nam.nsroot.net> Message-ID: <4E399521.2080805@oracle.com> On 8/3/2011 11:18 AM, Darji, Kinnari wrote: > > Thanks Ramki > > So If I look at logs starting [GC and real times, that should be > almost application STW time. Am I correct? > yes. Except that the real time in that display has a resolution of 10 ms only. (Thus the 9.2 ms looked like 0.01 s below, i think.) 
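To make the rounding concrete, here is a minimal Java sketch (not JVM code; the class and method names are made up for illustration). It pulls the whole-operation duration out of the log line quoted below and formats it to two decimals of a second. It only mimics a 10 ms display resolution; how the JVM actually derives the real= value is not shown here and is not assumed.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class GcPauseExample {
    // Matches ", <seconds> secs]" durations; the leading ", " keeps the
    // "real=0.01 secs]" field of the [Times: ...] block from matching.
    private static final Pattern SECS = Pattern.compile(", (\\d+\\.\\d+) secs\\]");

    /** Returns the last ", N secs]" duration on the line in milliseconds, or -1 if none. */
    static double wholePauseMillis(String line) {
        Matcher m = SECS.matcher(line);
        double last = -1.0;
        while (m.find()) {
            last = Double.parseDouble(m.group(1)) * 1000.0;
        }
        return last;
    }

    /** Formats a pause in seconds with two decimals, i.e. at 10 ms resolution. */
    static String asRealField(double millis) {
        return String.format(Locale.ROOT, "%.2f", millis / 1000.0);
    }

    public static void main(String[] args) {
        String line = ": 182066K->12384K(191744K), 0.0089120 secs] "
                    + "2755986K->2586303K(10710272K), 0.0092180 secs]";
        double ms = wholePauseMillis(line);
        System.out.println(ms + " ms, displayed at 10 ms resolution as " + asRealField(ms) + " s");
    }
}
```

The last duration on the line is the one for the whole GC operation (it includes the scavenge), which is why the sketch keeps the last match rather than the first.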
But yes, that's the STW time. One caveat though -- this only lists STW ops attributed to GC. More generally, you would want to use +PrintSafepointStatistics to see all STW operations (and details thereof), including of course the GC ops (which are usually the most common type of STW op, but by no means the only type). -- ramki > Thank you > > Kinnari > > *From:*Ramki Ramakrishna [mailto:y.s.ramakrishna at oracle.com] > *Sent:* Wednesday, August 03, 2011 2:08 PM > *To:* Darji, Kinnari [ICG-IT] > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: understanding GC logs > > > > On 8/3/2011 10:45 AM, Darji, Kinnari wrote: > > Hello GC team, > > What does this all different time mean? Can someone please clarify? > > What is the time application when application stops? > > [GC 9768.668: [ParNew > > ^^^^^^ JVM timestamp (seconds since start of JVM) at start > of GC operation) > > 3746 Desired survivor size 10878976 bytes, new threshold 4 (max 4) > > 3747 - age 1: 594288 bytes, 594288 total > > 3748 - age 2: 2369912 bytes, 2964200 total > > 3749 - age 3: 2877584 bytes, 5841784 total > > 3750 - age 4: 3075264 bytes, 8917048 total > > 3751 : 182066K->12384K(191744K), 0.0089120 secs] > 2755986K->2586303K(10710272K), 0.0092180 secs] > > ^^^^^^^^ > ^^^^^^^ > Duration of > Scavenge Duration of whole GC > operation > > (includes scavenge) > > [Times: user=0.09 sys=0.00, real=0.01 secs] > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Process virtual user and system > times, and real (elapsed) time during GC operation. > > The time for which the application threads were stopped is about 9.2 ms. > > -- ramki > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From igor.veresov at oracle.com Wed Aug 3 20:55:17 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 03 Aug 2011 13:55:17 -0700 Subject: Fwd: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 In-Reply-To: <4E39B231.5070606@oracle.com> References: <4E39B231.5070606@oracle.com> Message-ID: <4E39B5B5.1040509@oracle.com> Resending to the right list. -------- Original Message -------- Subject: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 Date: Wed, 03 Aug 2011 13:40:17 -0700 From: Igor Veresov To: hotspot-compiler-dev It seems that madvise(MADV_FREE) breaks pages reservation semantics of the the underlying segment. With tight memory constraints this would cause a race for pages and a segfault if the JVM louses. The solution is to revert back to the previous implementation of os::free_memory() that used mmap(). Webrev: http://cr.openjdk.java.net/~iveresov/7060842/webrev.00/ Tested is gc test suite. igor From y.s.ramakrishna at oracle.com Wed Aug 3 21:33:05 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Wed, 03 Aug 2011 14:33:05 -0700 Subject: Fwd: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 In-Reply-To: <4E39B5B5.1040509@oracle.com> References: <4E39B231.5070606@oracle.com> <4E39B5B5.1040509@oracle.com> Message-ID: <4E39BE91.9000001@oracle.com> Yikes... a Linux man page i found does not say anything about swap reservation: *MADV_DONTNEED* Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources associated with it.) 
Subsequent accesses of pages in this range will succeed, but will result either in re-loading of the memory contents from the underlying mapped file (see *mmap*()) or zero-fill-on-demand pages for mappings without an underlying file. Is there an understanding that this is a Linux bug? Or is "resources" above open to interpretation, including swap reservation? (Andrew Haley?) Anyway, your change looks good, if only to get back to more reliable operation under tight memory situations? PS: Igor, what does "louses" below (in yr email) mean in this context? Reviewed! -- ramki On 8/3/2011 1:55 PM, Igor Veresov wrote: > Resending to the right list. > > -------- Original Message -------- > Subject: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS > running SPECjvm2008 > Date: Wed, 03 Aug 2011 13:40:17 -0700 > From: Igor Veresov > To: hotspot-compiler-dev > > It seems that madvise(MADV_FREE) breaks pages reservation semantics of > the the underlying segment. With tight memory constraints this would > cause a race for pages and a segfault if the JVM louses. The solution is > to revert back to the previous implementation of os::free_memory() that > used mmap(). > > Webrev: http://cr.openjdk.java.net/~iveresov/7060842/webrev.00/ > > Tested is gc test suite. > > igor -------------- next part -------------- An HTML attachment was scrubbed... URL: From igor.veresov at oracle.com Wed Aug 3 21:47:07 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 03 Aug 2011 14:47:07 -0700 Subject: Fwd: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 In-Reply-To: <4E39BE91.9000001@oracle.com> References: <4E39B231.5070606@oracle.com> <4E39B5B5.1040509@oracle.com> <4E39BE91.9000001@oracle.com> Message-ID: <4E39C1DB.4090703@oracle.com> On 8/3/11 2:33 PM, Ramki Ramakrishna wrote: > Yikes... a Linux man page i found does not say anything about swap > reservation: > > *MADV_DONTNEED* > Do not expect access in the near future. 
(For the time being, the > application is finished with the given range, so the kernel can free > resources associated with it.) Subsequent accesses of pages in this > range will succeed, but will result either in re-loading of the > memory contents from the underlying mapped file (see *mmap*()) or > zero-fill-on-demand pages for mappings without an underlying file. That's MADV_DONTNEED, we tried to use MADV_FREE, but there's nothing about it's semantics either with regard to reservations. > > Is there an understanding that this is a Linux bug? Or is "resources" > above open to > interpretation, including swap reservation? (Andrew Haley?) > I don't know if this is intentional. madvise() has notoriously different interpretations of its options under different OSes. > Anyway, your change looks good, if only to get back to more reliable > operation > under tight memory situations? Right. Thank you! > > PS: Igor, what does "louses" below (in yr email) mean in this context? Eh, that's a typo, I meant "looses". Thanks! igor > > Reviewed! > -- ramki > > > On 8/3/2011 1:55 PM, Igor Veresov wrote: >> Resending to the right list. >> >> -------- Original Message -------- >> Subject: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS >> running SPECjvm2008 >> Date: Wed, 03 Aug 2011 13:40:17 -0700 >> From: Igor Veresov >> To: hotspot-compiler-dev >> >> It seems that madvise(MADV_FREE) breaks pages reservation semantics of >> the the underlying segment. With tight memory constraints this would >> cause a race for pages and a segfault if the JVM louses. The solution is >> to revert back to the previous implementation of os::free_memory() that >> used mmap(). >> >> Webrev: http://cr.openjdk.java.net/~iveresov/7060842/webrev.00/ >> >> Tested is gc test suite. 
>> >> igor From igor.veresov at oracle.com Wed Aug 3 21:47:28 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 03 Aug 2011 14:47:28 -0700 Subject: review(XS): 7060836: RHEL 5.5 and 5.6 should support UseNUMA Message-ID: <4E39C1F0.2010405@oracle.com> There are popular linux distros that have a kernel that supports the sched_get_cpu() syscall but don't yet have a wrapper for it in libc. The solution is to implement the wrapper inside of the VM and call it if the libc version is not available. Contributed-by: Andrew John Hughes Webrev: http://cr.openjdk.java.net/~iveresov/7060836/webrev.00/ Tested with gc test suite. From y.s.ramakrishna at oracle.com Wed Aug 3 21:47:51 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Wed, 03 Aug 2011 14:47:51 -0700 Subject: PrintSafepointStatistics (was Re: understanding GC logs) In-Reply-To: <21ED8E3420CDB647B88C7F80A7D64DAC0691EC78E3@exnjmb89.nam.nsroot.net> References: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04E82@exnjmb89.nam.nsroot.net> <4E398E78.3000408@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691E04F84@exnjmb89.nam.nsroot.net> <4E399521.2080805@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691EC78E3@exnjmb89.nam.nsroot.net> Message-ID: <4E39C207.1050500@oracle.com> Hi Kinnari -- hs14, which you are on, is rather old (current dev is hs22; latest public is hs21). Is it possible that you could switch to a more recent JDK? If that's not possible, send me an hs_err file and I can get a ticket opened for you via the usual support channels. If the problem occurs with a recent hs21 or hs22, we can certainly take a look here. In either case, I have modified the subject line for relevance to the issue at hand, and also cross-posted to hsotspot-runtime-dev at o.j.n where PrintSafepointStatistics expertise resides. -- ramki On 8/3/2011 2:40 PM, Darji, Kinnari wrote: > > Hi Ramki, > > Not sure what's the problem. 
The process dies with following when I > have +PrintSafepointStatistics > > java version "1.6.0_16" > > Java(TM) SE Runtime Environment (build 1.6.0_16-b01) > > Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode) > > vmop_name [threads: total initially_running > wait_to_block] [time: spin block sync] [vmop_time time_elapsed] > page_trap_count > > no vm operation [ 7 1 > 1] [ 0 0 0] [ 0 0] 0 > > Polling page always armed > > 0 VM operations coalesced during safepoint > > Maximum sync time 0 ms > > ~ > > Can you please help? > > Thank you > > Kinnari > > *From:*Ramki Ramakrishna [mailto:y.s.ramakrishna at oracle.com] > *Sent:* Wednesday, August 03, 2011 2:36 PM > *To:* Darji, Kinnari [ICG-IT] > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: understanding GC logs > > > > On 8/3/2011 11:18 AM, Darji, Kinnari wrote: > > Thanks Ramki > > So If I look at logs starting [GC and real times, that should be > almost application STW time. Am I correct? > > > yes. Except that the real time in that display has a resolution of 10 > ms only. > (Thus the 9.2 ms looked like 0.01 s below, i think.) > > But yes, that's the STW time. > > One caveat though -- this only lists STW ops attributed to GC. > More generally, you would want to use +PrintSafepointStatistics to > see all STW operations (and details thereof), including of course the > GC ops (which are usually the most common type of STW op, but by > no means the only type). > > -- ramki > > > Thank you > > Kinnari > > *From:*Ramki Ramakrishna [mailto:y.s.ramakrishna at oracle.com] > *Sent:* Wednesday, August 03, 2011 2:08 PM > *To:* Darji, Kinnari [ICG-IT] > *Cc:* hotspot-gc-use at openjdk.java.net > > *Subject:* Re: understanding GC logs > > > > On 8/3/2011 10:45 AM, Darji, Kinnari wrote: > > Hello GC team, > > What does this all different time mean? Can someone please clarify? > > What is the time application when application stops? 
> > [GC 9768.668: [ParNew > > ^^^^^^ JVM timestamp (seconds since start of JVM) at start > of GC operation) > > > 3746 Desired survivor size 10878976 bytes, new threshold 4 (max 4) > > 3747 - age 1: 594288 bytes, 594288 total > > 3748 - age 2: 2369912 bytes, 2964200 total > > 3749 - age 3: 2877584 bytes, 5841784 total > > 3750 - age 4: 3075264 bytes, 8917048 total > > 3751 : 182066K->12384K(191744K), 0.0089120 secs] > 2755986K->2586303K(10710272K), 0.0092180 secs] > > ^^^^^^^^ > ^^^^^^^ > Duration of > Scavenge Duration of whole GC > operation > > (includes scavenge) > > > [Times: user=0.09 sys=0.00, real=0.01 secs] > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Process virtual user and system > times, and real (elapsed) time during GC operation. > > The time for which the application threads were stopped is about 9.2 ms. > > -- ramki > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Wed Aug 3 21:58:35 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Wed, 03 Aug 2011 14:58:35 -0700 Subject: Fwd: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 In-Reply-To: <4E39C1DB.4090703@oracle.com> References: <4E39B231.5070606@oracle.com> <4E39B5B5.1040509@oracle.com> <4E39BE91.9000001@oracle.com> <4E39C1DB.4090703@oracle.com> Message-ID: <4E39C48B.4000604@oracle.com> On 8/3/2011 2:47 PM, Igor Veresov wrote: > On 8/3/11 2:33 PM, Ramki Ramakrishna wrote: >> Yikes... a Linux man page i found does not say anything about swap >> reservation: >> >> *MADV_DONTNEED* >> Do not expect access in the near future. (For the time being, the >> application is finished with the given range, so the kernel can free >> resources associated with it.) 
Subsequent accesses of pages in this >> range will succeed, but will result either in re-loading of the >> memory contents from the underlying mapped file (see *mmap*()) or >> zero-fill-on-demand pages for mappings without an underlying file. > > That's MADV_DONTNEED, we tried to use MADV_FREE, but there's nothing > about it's semantics either with regard to reservations. Yr webrev shows DONT_NEED as the previous version in Linux. (Perhaps you are confusing with Solaris where you do an MADV_FREE perhaps?) >> >> Is there an understanding that this is a Linux bug? Or is "resources" >> above open to >> interpretation, including swap reservation? (Andrew Haley?) >> > > I don't know if this is intentional. madvise() has notoriously > different interpretations of its options under different OSes. yeah, perhaps we want the Unix committees of the 1980s back again? (careful what we wish for?) > ... >> >> PS: Igor, what does "louses" below (in yr email) mean in this context? > > Eh, that's a typo, I meant "looses". duh, me! :-) -- ramki From igor.veresov at oracle.com Wed Aug 3 22:09:24 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 03 Aug 2011 15:09:24 -0700 Subject: Fwd: review(XXS): 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 In-Reply-To: <4E39C48B.4000604@oracle.com> References: <4E39B231.5070606@oracle.com> <4E39B5B5.1040509@oracle.com> <4E39BE91.9000001@oracle.com> <4E39C1DB.4090703@oracle.com> <4E39C48B.4000604@oracle.com> Message-ID: <4E39C714.8060906@oracle.com> On 8/3/11 2:58 PM, Ramki Ramakrishna wrote: > > > On 8/3/2011 2:47 PM, Igor Veresov wrote: >> On 8/3/11 2:33 PM, Ramki Ramakrishna wrote: >>> Yikes... a Linux man page i found does not say anything about swap >>> reservation: >>> >>> *MADV_DONTNEED* >>> Do not expect access in the near future. (For the time being, the >>> application is finished with the given range, so the kernel can free >>> resources associated with it.) 
Subsequent accesses of pages in this >>> range will succeed, but will result either in re-loading of the >>> memory contents from the underlying mapped file (see *mmap*()) or >>> zero-fill-on-demand pages for mappings without an underlying file. >> >> That's MADV_DONTNEED, we tried to use MADV_FREE, but there's nothing >> about it's semantics either with regard to reservations. > > Yr webrev shows DONT_NEED as the previous version in Linux. (Perhaps you > are confusing with Solaris > where you do an MADV_FREE perhaps?) Yes, of course, that was Solaris talking. :) Sorry. igor From y.s.ramakrishna at oracle.com Wed Aug 3 23:30:31 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Wed, 03 Aug 2011 16:30:31 -0700 Subject: review(XS): 7060836: RHEL 5.5 and 5.6 should support UseNUMA In-Reply-To: <4E39C1F0.2010405@oracle.com> References: <4E39C1F0.2010405@oracle.com> Message-ID: <4E39DA17.40401@oracle.com> Reviewed! -- ramki On 8/3/2011 2:47 PM, Igor Veresov wrote: > There are popular linux distros that have a kernel that supports the > sched_get_cpu() syscall but don't yet have a wrapper for it in libc. > The solution is to implement the wrapper inside of the VM and call it > if the libc version is not available. > > Contributed-by: Andrew John Hughes > > Webrev: http://cr.openjdk.java.net/~iveresov/7060836/webrev.00/ > > Tested with gc test suite. From igor.veresov at oracle.com Wed Aug 3 23:44:57 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 03 Aug 2011 16:44:57 -0700 Subject: review(XS): 7060836: RHEL 5.5 and 5.6 should support UseNUMA In-Reply-To: <4E39DA17.40401@oracle.com> References: <4E39C1F0.2010405@oracle.com> <4E39DA17.40401@oracle.com> Message-ID: <4E39DD79.50805@oracle.com> Thanks, Ramki! igor On 8/3/11 4:30 PM, Ramki Ramakrishna wrote: > Reviewed! 
> > -- ramki > > On 8/3/2011 2:47 PM, Igor Veresov wrote: >> There are popular linux distros that have a kernel that supports the >> sched_get_cpu() syscall but don't yet have a wrapper for it in libc. >> The solution is to implement the wrapper inside of the VM and call it >> if the libc version is not available. >> >> Contributed-by: Andrew John Hughes >> >> Webrev: http://cr.openjdk.java.net/~iveresov/7060836/webrev.00/ >> >> Tested with gc test suite. From y.s.ramakrishna at oracle.com Thu Aug 4 17:12:20 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 04 Aug 2011 10:12:20 -0700 Subject: PrintSafepointStatistics (was Re: understanding GC logs) In-Reply-To: <21ED8E3420CDB647B88C7F80A7D64DAC0691EC81CA@exnjmb89.nam.nsroot.net> References: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04E82@exnjmb89.nam.nsroot.net> <4E398E78.3000408@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691E04F84@exnjmb89.nam.nsroot.net> <4E399521.2080805@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691EC78E3@exnjmb89.nam.nsroot.net> <4E39C207.1050500@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691EC81CA@exnjmb89.nam.nsroot.net> Message-ID: <4E3AD2F4.6070308@oracle.com> Hi Kinnari -- On 08/04/11 08:01, Darji, Kinnari wrote: > Hi Ramki, > > I am running jdk-1.6.0_16 and Java HotSpot(TM) 64-Bit Server VM (build > 14.2-b01, mixed mode). I can?t change JDK version. Is there any other > way to have this info printed on GC logs with this JDK version? I'll get you in touch, off-list, with support folk so they can help open a service ticket based on your support contract for the older JDK. As regards having this kind of info printed without recourse to +PrintSafepointStatistics, try -XX:+PrintGCApplicationStoppedTime and -XX:+PrintGCApplicationConcurrentTime, which should give you the times you want, albeit with none of the finer details that +PrintSafepointStatistics would have provided you. 
Here's a description of those flags from globals.hpp:-

  product(bool, PrintGCApplicationConcurrentTime, false,                    \
          "Print the time the application has been running")                \
                                                                            \
  product(bool, PrintGCApplicationStoppedTime, false,                       \
          "Print the time the application has been stopped")                \
                                                                            \

(... basically between safepoints, or at safepoints respectively).

As regards:

>
> Attaching error file..

As I understood you were getting a JVM crash when you used
+PrintSafepointStatistics with 6u16. In that case, the JVM would
typically dump a file named hs_err_.log in the $CWD of your invoking
shell. That's what the support folks would want (along with the core
file, maybe, in some cases). Please send the hs_err_*.log file so I can
provide that to the support folk.

It is possible that someone on the runtime list might already recognize
this problem as one that has since been fixed.

-- ramki

>
> Thank you
>
> Kinnari
>
> *From:* Ramki Ramakrishna [mailto:y.s.ramakrishna at oracle.com]
> *Sent:* Wednesday, August 03, 2011 5:48 PM
> *To:* Darji, Kinnari [ICG-IT]
> *Cc:* 'hotspot-gc-use at openjdk.java.net';
> hotspot-runtime-dev at openjdk.java.net
> *Subject:* PrintSafepointStatistics (was Re: understanding GC logs)
>
> Hi Kinnari -- hs14, which you are on, is rather old (current dev is
> hs22; latest public is hs21).
> Is it possible that you could switch to a more recent JDK? If that's not
> possible,
> send me an hs_err file and I can get a ticket opened for you via the
> usual support
> channels. If the problem occurs with a recent hs21 or hs22, we can certainly
> take a look here. In either case, I have modified the subject line for
> relevance
> to the issue at hand, and also cross-posted to
> hsotspot-runtime-dev at o.j.n
> where PrintSafepointStatistics expertise resides.
>
> -- ramki
>
> On 8/3/2011 2:40 PM, Darji, Kinnari wrote:
>
> Hi Ramki,
>
> Not sure what's the problem.
The process dies with following when I have > +PrintSafepointStatistics > > > > java version "1.6.0_16" > > Java(TM) SE Runtime Environment (build 1.6.0_16-b01) > > Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode) > > vmop_name [threads: total initially_running > wait_to_block] [time: spin block sync] [vmop_time time_elapsed] > page_trap_count > > no vm operation [ 7 1 > 1] [ 0 0 0] [ 0 0] 0 > > > > Polling page always armed > > 0 VM operations coalesced during safepoint > > Maximum sync time 0 ms > > ~ > > > > Can you please help? > > > > Thank you > > Kinnari > > > > *From:* Ramki Ramakrishna [mailto:y.s.ramakrishna at oracle.com] > *Sent:* Wednesday, August 03, 2011 2:36 PM > *To:* Darji, Kinnari [ICG-IT] > *Cc:* hotspot-gc-use at openjdk.java.net > > *Subject:* Re: understanding GC logs > > > > > > On 8/3/2011 11:18 AM, Darji, Kinnari wrote: > > Thanks Ramki > > So If I look at logs starting [GC and real times, that should be almost > application STW time. Am I correct? > > > yes. Except that the real time in that display has a resolution of 10 ms > only. > (Thus the 9.2 ms looked like 0.01 s below, i think.) > > But yes, that's the STW time. > > One caveat though -- this only lists STW ops attributed to GC. > More generally, you would want to use +PrintSafepointStatistics to > see all STW operations (and details thereof), including of course the > GC ops (which are usually the most common type of STW op, but by > no means the only type). > > -- ramki > > > > > > Thank you > > Kinnari > > > > *From:* Ramki Ramakrishna [mailto:y.s.ramakrishna at oracle.com] > *Sent:* Wednesday, August 03, 2011 2:08 PM > *To:* Darji, Kinnari [ICG-IT] > *Cc:* hotspot-gc-use at openjdk.java.net > > *Subject:* Re: understanding GC logs > > > > > > On 8/3/2011 10:45 AM, Darji, Kinnari wrote: > > > > Hello GC team, > > What does this all different time mean? Can someone please clarify? > > What is the time application when application stops? 
> > > > [GC 9768.668: [ParNew > > ^^^^^^ JVM timestamp (seconds since start of JVM) at start of > GC operation) > > > > 3746 Desired survivor size 10878976 bytes, new threshold 4 (max 4) > > 3747 - age 1: 594288 bytes, 594288 total > > 3748 - age 2: 2369912 bytes, 2964200 total > > 3749 - age 3: 2877584 bytes, 5841784 total > > 3750 - age 4: 3075264 bytes, 8917048 total > > 3751 : 182066K->12384K(191744K), 0.0089120 secs] > 2755986K->2586303K(10710272K), 0.0092180 secs] > > ^^^^^^^^ > ^^^^^^^ > Duration of > Scavenge Duration of whole GC > operation > > (includes scavenge) > > > > [Times: user=0.09 sys=0.00, real=0.01 secs] > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Process virtual user and system > times, and real (elapsed) time during GC operation. > > The time for which the application threads were stopped is about 9.2 ms. > > -- ramki > _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Thu Aug 4 18:20:33 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 04 Aug 2011 11:20:33 -0700 Subject: PrintSafepointStatistics (was Re: understanding GC logs) In-Reply-To: <21ED8E3420CDB647B88C7F80A7D64DAC0691F84295@exnjmb89.nam.nsroot.net> References: <21ED8E3420CDB647B88C7F80A7D64DAC0691E04E82@exnjmb89.nam.nsroot.net> <4E398E78.3000408@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691E04F84@exnjmb89.nam.nsroot.net> <4E399521.2080805@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691EC78E3@exnjmb89.nam.nsroot.net> <4E39C207.1050500@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691EC81CA@exnjmb89.nam.nsroot.net> <4E3AD2F4.6070308@oracle.com> <21ED8E3420CDB647B88C7F80A7D64DAC0691F84295@exnjmb89.nam.nsroot.net> Message-ID: <4E3AE2F1.2040401@oracle.com> There should be a header line that tells you what each column represents. 
It was in one of your earlier emails, see below:
(The column header is printed more frequently in later versions
of the JVM making interpretation easier; see also more on that below.)

On 08/04/11 10:34, Darji, Kinnari wrote:
> Ramki,
> I tried it one more time and my process came up fine. I see following logs in console log. Though I don't see anything on verbose GC logs. Is that proper output? If so, how do I interpret following logs?
>
> Deoptimize               [  8  0  0] [ 0 0 0] [  0    0] 0
> Deoptimize               [  9  0  0] [ 0 0 0] [  0   26] 0
> Deoptimize               [  9  0  0] [ 0 0 0] [  0   46] 0
> Deoptimize               [ 13  0  0] [ 0 0 0] [  0  860] 0
> Deoptimize               [ 13  0  0] [ 0 0 0] [  0  714] 0
> Deoptimize               [ 15  0  0] [ 0 0 0] [  0  760] 0
> Deoptimize               [ 15  0  0] [ 0 0 0] [  0    0] 0
> Deoptimize               [ 15  0  0] [ 0 0 0] [  1  310] 0
> GenCollectForAllocation  [ 15  0  0] [ 0 0 0] [ 27  292] 0
> Deoptimize               [ 18  0  0] [ 0 0 0] [  0  821] 0
> EnableBiasedLocking      [ 30  0  0] [ 0 0 0] [  1  355] 0
> BulkRevokeBias           [ 32  1  0] [ 0 0 0] [  3 1905] 0
> RevokeBias               [ 32  0  1] [ 0 0 0] [  3    9] 0
> RevokeBias               [ 31  0  2] [ 0 0 0] [  0    4] 0
> RevokeBias               [ 30  0  0] [ 0 0 0] [  1    1] 0
> RevokeBias               [ 29  0  0] [ 0 0 0] [  2    7] 0
> RevokeBias               [ 28  0  1] [ 0 0 0] [  2    2] 0
> BulkRevokeBias           [ 28  0  1] [ 0 0 0] [  0    2] 0
> RevokeBias               [ 27  1  0] [ 0 0 0] [  1    1] 0
> RevokeBias               [ 25  0  1] [ 0 0 0] [  2    3]
>
> Thank you
> Kinnari
>

This here:-

...
>> vmop_name [threads: total initially_running wait_to_block] [time: spin block sync] [vmop_time time_elapsed] page_trap_count

Unfortunately, this is not the easiest thing to interpret if you
are not familiar with the JVM safepoint protocol details (it's
intended for extreme performance tuning or troubleshooting); not only
that, because the data is printed in "batches", it's less than easy
to align it with GC or other logging. (Later versions (in hs20 or later)
fixed this somewhat by providing a JVM timestamp against each -- whereas
above you need to reconstruct that info from the deltas, which is a pain.)
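For anyone who wants to line these rows up with the header mechanically, a rough Java sketch follows. The field names are taken from the header line quoted above; the class name SafepointRow and the regular expression are hypothetical, written only against the hs14-style output shown in this thread, and no units are assumed for the time columns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one row of +PrintSafepointStatistics output,
// following the header:
//   vmop_name [threads: total initially_running wait_to_block]
//             [time: spin block sync] [vmop_time time_elapsed] page_trap_count
final class SafepointRow {
    private static final Pattern ROW = Pattern.compile(
          "^\\s*(.+?)\\s*"                                   // vmop_name (may contain spaces)
        + "\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]\\s*"     // threads
        + "\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]\\s*"     // spin block sync
        + "\\[\\s*(\\d+)\\s+(\\d+)\\s*\\]\\s*"               // vmop_time time_elapsed
        + "(\\d+)\\s*$");                                    // page_trap_count

    final String vmopName;
    final int totalThreads, initiallyRunning, waitToBlock;
    final long spin, block, sync;
    final long vmopTime, timeElapsed;
    final int pageTrapCount;

    private SafepointRow(Matcher m) {
        vmopName = m.group(1);
        totalThreads = Integer.parseInt(m.group(2));
        initiallyRunning = Integer.parseInt(m.group(3));
        waitToBlock = Integer.parseInt(m.group(4));
        spin = Long.parseLong(m.group(5));
        block = Long.parseLong(m.group(6));
        sync = Long.parseLong(m.group(7));
        vmopTime = Long.parseLong(m.group(8));
        timeElapsed = Long.parseLong(m.group(9));
        pageTrapCount = Integer.parseInt(m.group(10));
    }

    /** Returns the parsed row, or null for headers, truncated rows and other text. */
    static SafepointRow parse(String line) {
        Matcher m = ROW.matcher(line);
        return m.matches() ? new SafepointRow(m) : null;
    }

    public static void main(String[] args) {
        SafepointRow r = parse("GenCollectForAllocation [ 15 0 0] [ 0 0 0] [ 27 292] 0");
        System.out.println(r.vmopName + ": vmop_time=" + r.vmopTime
                + " time_elapsed=" + r.timeElapsed);
    }
}
```

Rows that do not carry all ten fields (such as the last, cut-off RevokeBias row above) are deliberately rejected rather than guessed at.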
So my advice is to use a newer JVM if you can, and if you can't then just rely on the less detailed, but easier to align, +PrintGCApplication{Concurrent,Stopped}Time flags. By the way, if the crash that you reported earlier does not in fact happen, please make sure to tell the support engineering contacts, who may have contacted you off-list, so they can close off any ticket they may have opened for your report. Thanks. -- ramki From john.cuthbertson at oracle.com Thu Aug 4 22:12:27 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 04 Aug 2011 15:12:27 -0700 Subject: How to exercise the GC notification code in gcNotifier.cpp Message-ID: <4E3B194B.3010002@oracle.com> Hi Everyone, After investigating an issue that was reported while running a fairly substantial application, I believe I have found a naked oop in the GC notification code which I think is the cause of the problem. The oop in question is the muKlass oop in createGCInfo in gcNotifier.cpp. This code is executed by the ServiceThread as part of the sendNotification routine which pulls items from some request queue. The items on this queue seem to be added as a result of GC end events being pushed by the MemoryManagerService if the routine GCMemoryManager::is_notification_enabled() returns true. This field is set to true by the routine jmm_SetGCNotificationEnabled in management.cpp. Can anyone tell me how I can explicitly exercise this code? I have tried some things in jconsole and jvisualvm but have so far been unsuccessful.
Thanks, JohnC From rednaxelafx at gmail.com Fri Aug 5 01:41:44 2011 From: rednaxelafx at gmail.com (Krystal Mok) Date: Fri, 5 Aug 2011 09:41:44 +0800 Subject: How to exercise the GC notification code in gcNotifier.cpp In-Reply-To: <4E3B194B.3010002@oracle.com> References: <4E3B194B.3010002@oracle.com> Message-ID: Hi John, If you prefer JConsole for testing interactively, you can explicitly subscribe/unsubscribe to a GarbageCollector notification, like this: http://rednaxelafx.iteye.com/upload/picture/pic/96172/7ad699e0-c8aa-3cce-ad92-0f0226c98fd6.png

1. Select the MBeans tab, then select a java.lang.GarbageCollector
2. Click "Notifications" under the GarbageCollector
3. Click the "Subscribe" button on the bottom right

and then sun.management.GarbageCollectorImpl.addNotificationListener() will be invoked, which in turn will set SetGCNotificationEnabled to true. Unsubscribing is similar. The notification code was added in this rev (or somewhere close...): http://hg.openjdk.java.net/jdk7/hotspot/jdk/rev/5b38ed5f5eb4 Regards, Kris Mok On Fri, Aug 5, 2011 at 6:12 AM, John Cuthbertson < john.cuthbertson at oracle.com> wrote: > Hi Everyone, > > After investigating an issue that was reported while running a fairly > substantial application, I believe I have found naked oop in the GC > notification code which I think is the cause of the problem The oop in > question is the muKlass oop in createGCInfo in gcNotifier.cpp. This code is > executed by the ServiceThread as part of the sendNotification routine which > pulls items from some request queue. The items on this queue seem to be > added as a result of GC end events being pushed by the MemoryManagerService > if the routine GCMemoryManager::is_notification_enabled() returns true. > This field is set to true by the routine jmm_SetGCNotificationEnabled in > management.cpp. Can anyone tell me how I can explicitly exercise this code? > I have tried some things in jconsole and jvisualvm but have so far been > unsuccessful.
> > Thanks, > > JohnC > From eric.caspole at amd.com Fri Aug 5 19:40:38 2011 From: eric.caspole at amd.com (Eric Caspole) Date: Fri, 5 Aug 2011 15:40:38 -0400 Subject: Log Visualization Tools In-Reply-To: References: Message-ID: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> Sometimes I use HPjmeter for plain Xloggc, but I don't think it can do the fancy extra flags either. On Aug 5, 2011, at 1:51 PM, Matt Fowles wrote: > All~ > > What tools do people know of or have for parsing gc logs and > visualizing the results? > > The only thing I can find, GCViewer, (from > http://www.tagtraum.com/gcviewer.html) seems like it has not been > updated for a while and does not parse a lot of more complicated logs > (-XX:+PrintTenuringDistribution or -XX:PrintCMSStatistics=1). > > Are there more tools out there? Are there in house tools that people > are willing to share? > > Matt > From y.s.ramakrishna at oracle.com Fri Aug 5 20:28:03 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Fri, 05 Aug 2011 13:28:03 -0700 Subject: Log Visualization Tools In-Reply-To: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> References: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> Message-ID: <4E3C5253.2050802@oracle.com> Same here -- i sometimes use an internal homegrown awk script to extract the metrics, so they can be massaged into a data file amenable to plotting with gnuplot. That script does not take well to the extra output either, so we usually strip out the extra output and deal with only the more fundamental metrics.
The extra output from the more fancy flags has thus far been consumed only by humans or extracted on an ad-hoc basis into spreadsheets and such. This is clearly not a nice state of affairs. I believe there is work (or plans?) underway for some kind of logging framework into which the JVM will feed its metrics, and hopefully the tooling that consumes those logs will be able to deal with all these issues in a more uniform fashion once and for all. Unfortunately, I have no real details of that work, though... Then there is gchisto which is GC-specific (but which also cannot consume the output from the more fancy flags), but that has taken a back seat as other issues have intervened. In general, until GC logging formats are standardized, tools that consume textual output from the JVM/GC will tend to break unless changes to these text formats are carefully controlled. There has been some talk on and off about trying to standardize those formats, but I am not sure about the status of that. Maybe the logging framework mentioned earlier will provide a superstructure from which such textual standardization will result naturally. -- ramki On 8/5/2011 12:40 PM, Eric Caspole wrote: > Sometimes I use HPjmeter for plain Xloggc, but I don't think it can > do the fancy extra flags either. > > On Aug 5, 2011, at 1:51 PM, Matt Fowles wrote: > >> All~ >> >> What tools do people know of or have for parsing gc logs and >> visualizing the results? >> >> The only thing I can find, GCViewer, (from >> http://www.tagtraum.com/gcviewer.html) seems like it has not been >> updated for a while and does not parse a lot of more complicated logs >> (-XX:+PrintTenuringDistribution or -XX:PrintCMSStatistics=1). >> >> Are there more tools out there? Are there in house tools that people >> are willing to share?
>> >> Matt >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Fri Aug 5 20:33:22 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Fri, 05 Aug 2011 13:33:22 -0700 Subject: Log Visualization Tools In-Reply-To: <4E3C5253.2050802@oracle.com> References: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> <4E3C5253.2050802@oracle.com> Message-ID: <4E3C5392.7080906@oracle.com> Sorry for the noise: my response was sent to the wrong list in error; corrected herewith. -- ramki On 8/5/2011 1:28 PM, Ramki Ramakrishna wrote: > Same here -- i sometimes use an internal homegrown awk script to > extract the > metrics, so they can be massaged into a data file amenable to plotting > with gnuplot. > That script does not take well to the extra output either, so we > usually strip out the > extra output and deal with only the more fundamental metrics only. The > extra output > from the more fancy flags has thus far been consumed only by humans or > extracted on an ad-hoc basis into > spreadsheets and such. This is clearly not a nice state of affairs. I > believe there is > work (or plans?) underway for some kind of logging framework into > which the JVM will feed > its metrics, and hopefully the tooling that consumes those logs will > be able to > deal with all these issues in a more uniform fashion once and for > all.... Unfortunately, > I have no real details of that work, though... > > Then there is gchisto which is GC-specific (but which also cannot > consume the output > from the more fancy flags), but that has been placed on the backseat > as other issues > have intervened. 
> In general, until GC logging formats are standardized, tools that > consume textual > output from the JVM/GC will tend to break unless changes to these text > formats are > carefully controlled. There has been some talk on and off about trying to > standardize those formats, but I am not sure about the status of that. > May be the > logging framework mentioned earlier will provide a superstructure from > which such > textual standardization will result naturally. > > -- ramki > > On 8/5/2011 12:40 PM, Eric Caspole wrote: >> Sometimes I use HPjmeter for plain Xloggc, but I don't think it can >> do the fancy extra flags either. >> >> On Aug 5, 2011, at 1:51 PM, Matt Fowles wrote: >> >>> All~ >>> >>> What tools do people know of or have for parsing gc logs and >>> visualizing the results? >>> >>> The only thing I can find, GCViewer, (from >>> http://www.tagtraum.com/gcviewer.html) seems like it has not been >>> updated for a while and does not parse a lot of more complicated logs >>> (-XX:+PrintTenuringDistribution or -XX:PrintCMSStatistics=1). >>> >>> Are there more tools out there? Are there in house tools that people >>> are willing to share? 
>>> >>> Matt >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Fri Aug 5 21:22:46 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Fri, 05 Aug 2011 14:22:46 -0700 Subject: Log Visualization Tools In-Reply-To: References: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> <4E3C5253.2050802@oracle.com> <4E3C5392.7080906@oracle.com> Message-ID: <4E3C5F26.9090707@oracle.com> Historically, the non-standard ("extra fancy") flag output has not been governed by any rules. The basic logging format however has basically not changed since 1.4.2, as far as i recall. Of course, each collector, typically released in a major new release, has had its little quirks even within the basic format. Unfortunately, the situation has not been governed by any strict rules -- although we have never knowingly introduced changes to formatting in minor releases -- or even major releases -- for fear of breaking existing log-parsing scripts, I am sure some have slipped through. In the absence of a spec for the format, QA has never written tests to protect against inadvertent regressions. (Sorry for talking like that; i know it sounds rather like a lawyer or politician talking when i read that email back! :-) I am hoping things will get better, more standardized, going forward, so people can use tools without fear of them breaking with a new release. 
-- ramki On 8/5/2011 2:06 PM, Matt Fowles wrote: > Ramki~ > > What rules govern the upgrade of logging formats? Can the format only > be changed on major releases or can we just add a flag 'new format' to > minor releases? > > Matt > > On Fri, Aug 5, 2011 at 4:33 PM, Ramki Ramakrishna > wrote: >> Sorry for the noise: my response was sent to the wrong list in error; >> corrected herewith. >> >> -- ramki >> >> On 8/5/2011 1:28 PM, Ramki Ramakrishna wrote: >>> Same here -- i sometimes use an internal homegrown awk script to >>> extract the >>> metrics, so they can be massaged into a data file amenable to plotting >>> with gnuplot. >>> That script does not take well to the extra output either, so we >>> usually strip out the >>> extra output and deal with only the more fundamental metrics only. The >>> extra output >>> from the more fancy flags has thus far been consumed only by humans or >>> extracted on an ad-hoc basis into >>> spreadsheets and such. This is clearly not a nice state of affairs. I >>> believe there is >>> work (or plans?) underway for some kind of logging framework into >>> which the JVM will feed >>> its metrics, and hopefully the tooling that consumes those logs will >>> be able to >>> deal with all these issues in a more uniform fashion once and for >>> all.... Unfortunately, >>> I have no real details of that work, though... >>> >>> Then there is gchisto which is GC-specific (but which also cannot >>> consume the output >>> from the more fancy flags), but that has been placed on the backseat >>> as other issues >>> have intervened. >>> In general, until GC logging formats are standardized, tools that >>> consume textual >>> output from the JVM/GC will tend to break unless changes to these text >>> formats are >>> carefully controlled. There has been some talk on and off about trying to >>> standardize those formats, but I am not sure about the status of that. 
>>> May be the >>> logging framework mentioned earlier will provide a superstructure from >>> which such >>> textual standardization will result naturally. >>> >>> -- ramki >>> >>> On 8/5/2011 12:40 PM, Eric Caspole wrote: >>>> Sometimes I use HPjmeter for plain Xloggc, but I don't think it can >>>> do the fancy extra flags either. >>>> >>>> On Aug 5, 2011, at 1:51 PM, Matt Fowles wrote: >>>> >>>>> All~ >>>>> >>>>> What tools do people know of or have for parsing gc logs and >>>>> visualizing the results? >>>>> >>>>> The only thing I can find, GCViewer, (from >>>>> http://www.tagtraum.com/gcviewer.html) seems like it has not been >>>>> updated for a while and does not parse a lot of more complicated logs >>>>> (-XX:+PrintTenuringDistribution or -XX:PrintCMSStatistics=1). >>>>> >>>>> Are there more tools out there? Are there in house tools that people >>>>> are willing to share? >>>>> >>>>> Matt >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From john.cuthbertson at oracle.com Fri Aug 5 23:13:49 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 05 Aug 2011 16:13:49 -0700 Subject: RFR(XS): 7074579: G1: JVM crash with JDK7 running ATG CRMDemo Fusion App Message-ID: <4E3C792D.4020909@oracle.com> Hi Everyone, Can I have a couple of volunteers look at these 
changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7074579/webrev.0/ The issue was a crash caused by an oop that was naked across a GC. What was happening was that the ServiceThread was attempting to send a GC notification that came from the end of an evacuation pause. (Note that the ServiceThread is a Java thread and the GC notification is sent when the threads are restarted after the safepoint.) The construction of the first of the object arrays (used to pass before and after memory pool information) triggered a full GC which moved the memory usage class. Thus when the attempt to allocate the second object array was made, the variable holding the klass oop was now stale, causing the crash. Although this issue was found when the app was run with G1, the issue is not G1-specific. The solution was to allocate a handle to hold the klass oop and use the de-referenced handle in the allocations. Verified by inserting a full GC after the first array allocation and running the GC notification regression test. Fix was tested by running the regression test with all collectors and monitoring a KitchenSink run with jconsole. Thanks, JohnC From igor.veresov at oracle.com Sat Aug 6 12:00:46 2011 From: igor.veresov at oracle.com (igor.veresov at oracle.com) Date: Sat, 06 Aug 2011 12:00:46 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20110806120053.5F1CF479A0@hg.openjdk.java.net> Changeset: a20e6e447d3d Author: iveresov Date: 2011-08-05 16:44 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 Summary: Use mmap() instead of madvise(MADV_DONTNEED) to uncommit pages Reviewed-by: ysr !
src/os/linux/vm/os_linux.cpp Changeset: 7c2653aefc46 Author: iveresov Date: 2011-08-05 16:50 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7c2653aefc46 7060836: RHEL 5.5 and 5.6 should support UseNUMA Summary: Add a wrapper for sched_getcpu() for systems where libc lacks it Reviewed-by: ysr Contributed-by: Andrew John Hughes ! src/os/linux/vm/os_linux.cpp ! src/os/linux/vm/os_linux.hpp From shrode at subnature.com Sat Aug 6 16:24:36 2011 From: shrode at subnature.com (Jason Schroeder) Date: Sat, 6 Aug 2011 10:24:36 -0600 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: <20110806120053.5F1CF479A0@hg.openjdk.java.net> References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> Message-ID: I thought the benefit of MADV_DONTNEED was to stop Linux from writing dirty pages to disk. On Sat, Aug 6, 2011 at 6:00 AM, wrote: > Changeset: a20e6e447d3d > Author: iveresov > Date: 2011-08-05 16:44 -0700 > URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d > > 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 > Summary: Use mmap() instead of madvise(MADV_DONTNEED) to uncommit pages > Reviewed-by: ysr > > ! src/os/linux/vm/os_linux.cpp > From igor.veresov at oracle.com Sat Aug 6 21:32:23 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Sat, 06 Aug 2011 14:32:23 -0700 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> Message-ID: <4E3DB2E7.9020601@oracle.com> Yes, but not only that. It basically discards the page, to quote from the man page: "Subsequent accesses of pages in this range will succeed, but will result either in re-loading of the memory contents from the underlying mapped file (see mmap()) or zero-fill-on-demand pages for mappings without an underlying file." igor On 8/6/11 9:24 AM, Jason Schroeder wrote: > I thought the benefit of MADV_DONTNEED was to stop Linux from > writing dirty pages to disk.
> > On Sat, Aug 6, 2011 at 6:00 AM, wrote: >> Changeset: a20e6e447d3d >> Author: iveresov >> Date: 2011-08-05 16:44 -0700 >> URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d >> >> 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 >> Summary: Use mmap() instead of madvise(MADV_DONTNEED) to uncommit pages >> Reviewed-by: ysr >> >> ! src/os/linux/vm/os_linux.cpp >> From shrode at subnature.com Sat Aug 6 21:46:07 2011 From: shrode at subnature.com (Jason Schroeder) Date: Sat, 6 Aug 2011 15:46:07 -0600 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: <4E3DB2E7.9020601@oracle.com> References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> <4E3DB2E7.9020601@oracle.com> Message-ID: Why is the MADV_DONTNEED usage being removed? My reading of the kernel sources for swapping indicates the change could/would cause a needless page write. On Sat, Aug 6, 2011 at 3:32 PM, Igor Veresov wrote: > Yes, but not only that. It basically discards the page, to quote from the > man page: "Subsequent accesses of pages in this range will succeed, but will > result either in re-loading of the memory contents from the underlying > mapped file (see mmap()) or zero-fill-on-demand pages for mappings without > an underlying file." > > igor > > On 8/6/11 9:24 AM, Jason Schroeder wrote: >> >> I thought the benefit of MADV_DONTNEED was to stop Linux from >> writing dirty pages to disk. >> >> On Sat, Aug 6, 2011 at 6:00 AM, wrote: >>> >>> Changeset: a20e6e447d3d >>> Author: iveresov >>> Date: 2011-08-05 16:44 -0700 >>> URL: >>> http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d >>> >>> 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 >>> Summary: Use mmap() instead of madvise(MADV_DONTNEED) to uncommit pages >>> Reviewed-by: ysr >>> >>> !
src/os/linux/vm/os_linux.cpp >>> > > From igor.veresov at oracle.com Mon Aug 8 03:03:04 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Sun, 07 Aug 2011 20:03:04 -0700 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> <4E3DB2E7.9020601@oracle.com> Message-ID: <4E3F51E8.5050800@oracle.com> It was removed because it seems to have an unintended side effect: it removes the swap reservation. So imagine that you're running a program on a machine with a heap size very close to the virtual swap size (phys mem + swap). You do an MADV_DONTNEED on some of your memory in order to free pages, but then some other program comes and allocates this freed memory. Now your program will try to touch the freed region at some point and the OS will try to allocate a page but will fail (because there's no memory) and your program will segfault. In other words, the region to which MADV_DONTNEED was applied behaves as if it was mmap'ed with MAP_NORESERVE. This semantics is not specified in the man page, but it seems to behave this way at least on some kernels. Doing mmap over an existing mmap'ed segment has the same effect of freeing the pages but creates a new segment in the middle, thus producing 3 segments. This could have a negative effect on page fault handling speed, and this is why I wanted to try MADV_DONTNEED in the first place, but it didn't work out as you can see from the explanation above. igor On 8/6/11 2:46 PM, Jason Schroeder wrote: > Why is the MADV_DONTNEED usage being removed? My reading > of the kernel sources for swapping indicates the change could/would > cause a needless page write. > > On Sat, Aug 6, 2011 at 3:32 PM, Igor Veresov wrote: >> Yes, but not only that.
It basically discards the page, to quote from the >> man page: "Subsequent accesses of pages in this range will succeed, but will >> result either in re-loading of the memory contents from the underlying >> mapped file (see mmap()) or zero-fill-on-demand pages for mappings without >> an underlying file." >> >> igor >> >> On 8/6/11 9:24 AM, Jason Schroeder wrote: >>> I thought the benefit of MADV_DONTNEED was to stop Linux from >>> writing dirty pages to disk. >>> >>> On Sat, Aug 6, 2011 at 6:00 AM, wrote: >>>> Changeset: a20e6e447d3d >>>> Author: iveresov >>>> Date: 2011-08-05 16:44 -0700 >>>> URL: >>>> http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d >>>> >>>> 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 >>>> Summary: Use mmap() instead of madvise(MADV_DONTNEED) to uncommit pages >>>> Reviewed-by: ysr >>>> >>>> ! src/os/linux/vm/os_linux.cpp >>>> >> From kirk at kodewerk.com Sat Aug 6 19:55:14 2011 From: kirk at kodewerk.com (Charles K Pepperdine) Date: Sat, 6 Aug 2011 21:55:14 +0200 Subject: Log Visualization Tools In-Reply-To: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> References: <956CA137-70AD-40A0-8757-56BE98A429A9@amd.com> Message-ID: Hi Matt, If you send me the GC Log I'll happily analyze it for you. I've got some tooling that is close to release. Alpha should be by end of August. Regards, Kirk On Aug 5, 2011, at 9:40 PM, Eric Caspole wrote: > Sometimes I use HPjmeter for plain Xloggc, but I don't think it can > do the fancy extra flags either. > > On Aug 5, 2011, at 1:51 PM, Matt Fowles wrote: > >> All~ >> >> What tools do people know of or have for parsing gc logs and >> visualizing the results? >> >> The only thing I can find, GCViewer, (from >> http://www.tagtraum.com/gcviewer.html) seems like it has not been >> updated for a while and does not parse a lot of more complicated logs >> (-XX:+PrintTenuringDistribution or -XX:PrintCMSStatistics=1). >> >> Are there more tools out there? 
Are there in house tools that people >> are willing to share? >> >> Matt >> > > From igor.veresov at oracle.com Mon Aug 8 18:42:36 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 08 Aug 2011 11:42:36 -0700 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> Message-ID: <4E402E1C.1010807@oracle.com> Hi, Tom! Sorry it took me so long to get to that. 1. I don't think the new version of flag usage is prudent. The reason I proposed to introduce a new flag for interleaving is that it would make life easier in the future when proper NUMA-aware implementations of GCs are added (G1 would be the most probable candidate). I would propose to still have the UseNUMAInterleaving flag. The usage would be as follows: - If UseNUMA is specified on Windows, that would turn on UseNUMAInterleaving (for the time being, and that behavior would change in the future). - If UseNUMAInterleaving is specified on the command line, you just do the interleaving. If you don't add this flag now, you'll have to do that anyway as soon as NUMA-aware GCs start supporting Windows. 2. I guess the accepted coding convention in hotspot is that "else" should have the closing and opening brackets on one line. 2846 } 2847 else { And in all other places... 3. Did you forget to remove that?
3149 // tty->print("VirtualQuery AllocBase=%p, RegionSize=%Id\n", allocInfo.AllocationBase, allocInfo.RegionSize); 4. Does it make sense to pass UseLargePages and UseNUMAInterleaving to allocate_pages_individually()? They are global variables anyway. 5. What is the typical allocation granularity on windows? Wouldn't that be a problem if we tried to allocate a large heap with small interleaved pages? Have you tried using larger interleaving granularity for modern windows version? Doing a syscall and creating a segment per even a large page seems bit excessive. If you did try that, was there any difference? 6. The usage of "result" doesn't seem right here, did you mean "if (!result) return false;" ? 3129 bool result = VirtualAlloc(addr, bytes, MEM_COMMIT, PAGE_READWRITE) != 0; 3130 if (result == NULL) return false; 7. Wouldn't it be nicer instead of the idiom BOOL ok = SysCall(); if (!ok) return false; just to say if (!SysCall()) return false; ? 8. Instead of introducing a global variable numa_used_node_count, could you implement os::numa_get_groups_num() that was intended to return this number? Also build_numa_used_node_list() seems to have the same functionality as os::numa_get_leaf_groups() was intended to have. Could you implement it and use it instead? Please name function parameters in lower case with words separated with underscores. I know that there are exceptions, especially in os_windows.cpp, but it's better if we stick to the general convention. igor On 5/26/11 4:37 PM, Deneau, Tom wrote: > I have incorporated the change suggested by Paul Hohensee to just use the existing UseNUMA flag rather than introduce a new flag. Please let me know when you think this will be able to be checked in... 
> > The new webrev is at > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/ > > -- Tom Deneau, AMD > > > >> -----Original Message----- >> From: Deneau, Tom >> Sent: Monday, May 16, 2011 12:54 PM >> To: 'hotspot-compiler-dev at openjdk.java.net' >> Subject: Review Request: UseNUMAInterleaving >> >> Please review this patch which adds a new flag called >> UseNUMAInterleaving. This flag provides a subset of the functionality >> provided by UseNUMA, and its main purpose is to provide that subset on >> OSes like Windows which do not support the full UseNUMA functionality. >> In UseNUMA terminology, UseNUMAInterleaved makes all memory >> "numa_global" which is implemented as interleaved. >> >> The situations where this shows the biggest benefits would be: >> * Windows platforms with multiple numa nodes (eg, 4) >> >> * The JVM process is run across all the nodes (not affinitized to one >> node). >> >> * A workload that uses the majority of the cores in the machine, so >> that the heap is being accessed from many cores, including remote >> ones. >> >> * Enough memory per node and a heap size such that the default heap >> placement policy on windows would end up with the heap (or >> nursery) placed on one node. >> >> jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our >> measurements, we have seen some cases where the performance with >> UseNUMAInterleaving was 2.7x vs. the performance without. There were >> gains of varying sizes across all systems. >> >> As currently implemented this flag is ignored on Linux and Solaris >> since they already support the full UseNUMA flag. >> >> The webrev is at >> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.01/ >> >> Summary of changes: >> >> * Other than adding the new UseNUMAInterleaving global flag, all of >> the changes are in src/os/windows/vm/os_windows.cpp >> >> * Some static routines were added to set things up init time. 
These >> * check that the required APIs (VirtualAllocExNuma, >> GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in >> the OS >> >> * build the list of numa nodes on which this process has affinity >> >> * Changes to os::reserve_memory >> * There was already a routine that reserved pages one page at a >> time (used for Individual Large Page Allocation on WS2003). >> This was abstracted to a separate routine, called >> allocate_pages_individually. This gets called both for the >> Individual Large Page Allocation thing mentioned above and for >> UseNUMAInterleaving (for both small and large pages) >> >> * When used for NUMA Interleaving this just goes thru the numa >> node list in a round-robin fashion, using a different one for >> each chunk (with 4K pages, the minimum allocation granularity >> is 64K, with 2M pages it is 1 Page) >> >> * Whether we do just a reserve or a combined reserve/commit is >> determined by the caller of allocate_pages_individually >> >> * When used with large pages, we do a Reserve and Commit at >> the same time which is the way it always worked and the way >> it has to work on windows. >> >> * For small pages, only the reserve is done, the commit will >> come later. (which is the way it worked for >> non-interleaved) >> >> * os::commit_memory changes >> * If UseNUMAIntereaving is true, os::commit_memory has to check >> whether it was being asked to commit memory that might have >> come from multiple Reserve allocations, if so, the commits >> must also be broken up. We don't keep any data structure to >> keep track of this, we just use VirtualQuery which queries the >> properties of a VA range and can tell us how much came from >> one VirtualAlloc call. >> >> I do not have a bug id for this. 
>> >> -- Tom Deneau, AMD From y.s.ramakrishna at oracle.com Mon Aug 8 22:06:52 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Mon, 08 Aug 2011 15:06:52 -0700 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> <4E3DB2E7.9020601@oracle.com> Message-ID: <4E405DFC.6030507@oracle.com> Jason, As Igor noted in his review request, IIUC, MADV_DONTNEED seems to also remove the swap reservations for those pages on Linux, so that if memory is tight, then a reload of the page(s) may not succeed. Since this was being used for pages in Eden, we are not equipped to deal with page reservations disappearing from under us... I couldn't tell from the subsequent discussion whether this is by design in Linux or an inadvertent bug (at least the Linux man page is mum on the swap reservation semantics of MADV_DONTNEED; Solaris does not lose page reservations on an madvise; so we'll have to defer to people familiar with the Linux kernel to interpret intentions and semantics here). -- ramki On 8/6/2011 2:46 PM, Jason Schroeder wrote: > Why is the MADV_DONTNEED usage being removed? My reading > of the kernel sources for swapping indicates the change could/would > cause a needless page write. > > On Sat, Aug 6, 2011 at 3:32 PM, Igor Veresov wrote: >> Yes, but not only that. It basically discards the page, to quote from the >> man page: "Subsequent accesses of pages in this range will succeed, but will >> result either in re-loading of the memory contents from the underlying >> mapped file (see mmap()) or zero-fill-on-demand pages for mappings without >> an underlying file." >> >> igor >> >> On 8/6/11 9:24 AM, Jason Schroeder wrote: >>> I thought the benefit of MADV_DONTNEED was to stop Linux from >>> writing dirty pages to disk.
>>> >>> On Sat, Aug 6, 2011 at 6:00 AM, wrote: >>>> Changeset: a20e6e447d3d >>>> Author: iveresov >>>> Date: 2011-08-05 16:44 -0700 >>>> URL: >>>> http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a20e6e447d3d >>>> >>>> 7060842: UseNUMA crash with UseHugreTLBFS running SPECjvm2008 >>>> Summary: Use mmap() instead of madvise(MADV_DONTNEED) to uncommit pages >>>> Reviewed-by: ysr >>>> >>>> ! src/os/linux/vm/os_linux.cpp >>>> >> From shrode at subnature.com Mon Aug 8 22:14:45 2011 From: shrode at subnature.com (Jason Schroeder) Date: Mon, 8 Aug 2011 16:14:45 -0600 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: <4E405DFC.6030507@oracle.com> References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> <4E3DB2E7.9020601@oracle.com> <4E405DFC.6030507@oracle.com> Message-ID: Thank you all. In my spare time, I am trying to bring back some of the intentions for the book marking collector, and I have spent some time spelunking amongst linux internals, and the patch, and hence the paging behavior, was curious to me. From frederic.parain at oracle.com Tue Aug 9 12:52:57 2011 From: frederic.parain at oracle.com (Frederic Parain) Date: Tue, 09 Aug 2011 14:52:57 +0200 Subject: How to exercise the GC notification code in gcNotifier.cpp In-Reply-To: <4E3B194B.3010002@oracle.com> References: <4E3B194B.3010002@oracle.com> Message-ID: <4E412DA9.7050003@oracle.com> Hi John, I've checked the code using the muKlass oop and I confirm that there's an issue with the muKlass oop. It is used in two successive calls to oopFactory::new_objArray() and if a GC occurs during the first call, the oop could be invalid for the second call. The fix is straightforward (creating and using an instanceKlassHandle instead of the oop), I'll try to push it ASAP. Thanks, Fred PS: I was not able to reproduce the failure either. 
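A toy sketch of why a raw pointer held across a moving collection goes stale while a handle does not, which is the general problem described above. Nothing here is HotSpot code: `ToyHeap`, `Obj` and the index-based `Handle` are hypothetical stand-ins, and the "collection" simply copies every rooted object to a new address and updates the root table.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a heap object; not a HotSpot type.
struct Obj {
    int payload;
};

class ToyHeap {
public:
    // A handle is an index into a root table that the collector keeps up to date.
    using Handle = std::size_t;

    Obj* allocate(int payload) {
        objects_.push_back(std::make_unique<Obj>(Obj{payload}));
        return objects_.back().get();
    }

    Handle make_handle(Obj* o) {
        roots_.push_back(o);
        return roots_.size() - 1;
    }

    // Always re-read the current address through the root table.
    Obj* resolve(Handle h) const { return roots_[h]; }

    // Simulated moving collection: copy each rooted object to a fresh
    // location, update its root entry, then free all the old copies.
    // Any raw Obj* obtained before this call is dangling afterwards.
    void collect() {
        std::vector<std::unique_ptr<Obj>> survivors;
        for (Obj*& root : roots_) {
            survivors.push_back(std::make_unique<Obj>(*root));
            root = survivors.back().get();
        }
        objects_ = std::move(survivors);
    }

private:
    std::vector<std::unique_ptr<Obj>> objects_;
    std::vector<Obj*> roots_;
};
```

In this toy, code that performs two allocations in a row would keep the `Handle` and call `resolve()` before each use instead of caching the raw pointer, which mirrors the idea of holding the klass in a handle rather than a naked oop.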
On 08/ 5/11 12:12 AM, John Cuthbertson wrote: > Hi Everyone, > > After investigating an issue that was reported while running a fairly > substantial application, I believe I have found a naked oop in the GC > notification code which I think is the cause of the problem. The oop in > question is the muKlass oop in createGCInfo in gcNotifier.cpp. This code > is executed by the ServiceThread as part of the sendNotification routine > which pulls items from some request queue. The items on this queue seem > to be added as a result of GC end events being pushed by the > MemoryManagerService if the routine > GCMemoryManager::is_notification_enabled() returns true. This field is > set to true by the routine jmm_SetGCNotificationEnabled in > management.cpp. Can anyone tell me how I can explicitly exercise this > code? I have tried some things in jconsole and jvisualvm but have so far > been unsuccessful. > > Thanks, > > JohnC -- Frederic Parain - Oracle Grenoble Engineering Center - France Phone: +33 4 76 18 81 17 Email: Frederic.Parain at Oracle.com From frederic.parain at oracle.com Tue Aug 9 13:30:19 2011 From: frederic.parain at oracle.com (Frederic Parain) Date: Tue, 09 Aug 2011 15:30:19 +0200 Subject: RFR(XS): 7074579: G1: JVM crash with JDK7 running ATG CRMDemo Fusion App In-Reply-To: <4E3C792D.4020909@oracle.com> References: <4E3C792D.4020909@oracle.com> Message-ID: <4E41366B.8010305@oracle.com> The fix looks good. Fred On 08/ 6/11 01:13 AM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers look at these changes? The webrev can > be found at: http://cr.openjdk.java.net/~johnc/7074579/webrev.0/ > > The issue was a crash caused by an oop that was naked across a GC. What > was happening was that the ServiceThread was attempting to send a GC > notification that came from the end of an evacuation pause. (Note that > the ServiceThread is a Java thread and the GC notification is sent when > the threads are restarted after the safepoint.)
The construction of the > first of the object arrays (used to pass before and memory pool > information) triggered a full GC which moved the memory usage class. > Thus when the attempt to allocate the second object array was made, the > variable holding the klass oop was now stale causing the crash. Although > this issue was found when the app was run with G1, the issue is not G1 > specific. > > The solution was to allocate a handle to hold the klass oop and use the > de-referenced handle in the allocations. > > Verified by inserting a full GC after the first array allocation and > running the GC notification regression test. Fix was tested by running > the regression test with all collectors and monitoring a KitchenSink run > with jconsole. > > Thanks, > > JohnC -- Frederic Parain - Oracle Grenoble Engineering Center - France Phone: +33 4 76 18 81 17 Email: Frederic.Parain at Oracle.com From john.cuthbertson at oracle.com Tue Aug 9 17:13:34 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 09 Aug 2011 10:13:34 -0700 Subject: RFR(XS): 7074579: G1: JVM crash with JDK7 running ATG CRMDemo Fusion App In-Reply-To: <4E41366B.8010305@oracle.com> References: <4E3C792D.4020909@oracle.com> <4E41366B.8010305@oracle.com> Message-ID: <4E416ABE.5020408@oracle.com> Hi Fred, Thanks for the review. JohnC On 08/09/11 06:30, Frederic Parain wrote: > The fix looks good. > > Fred > > On 08/ 6/11 01:13 AM, John Cuthbertson wrote: >> Hi Everyone, >> >> Can I have a couple of volunteers look at these changes? The webrev can >> be found at: http://cr.openjdk.java.net/~johnc/7074579/webrev.0/ >> >> The issue was a crash caused by an oop that was naked across a GC. What >> was happening was that the ServiceThread was attempting to send a GC >> notification that came from the end of an evacuation pause. (Note that >> the ServiceThread is a Java thread and the GC notification is sent when >> the threads are restarted after the safepoint.) 
The construction of the >> first of the object arrays (used to pass before and memory pool >> information) triggered a full GC which moved the memory usage class. >> Thus when the attempt to allocate the second object array was made, the >> variable holding the klass oop was now stale causing the crash. Although >> this issue was found when the app was run with G1, the issue is not G1 >> specific. >> >> The solution was to allocate a handle to hold the klass oop and use the >> de-referenced handle in the allocations. >> >> Verified by inserting a full GC after the first array allocation and >> running the GC notification regression test. Fix was tested by running >> the regression test with all collectors and monitoring a KitchenSink run >> with jconsole. >> >> Thanks, >> >> JohnC > From fweimer at bfk.de Wed Aug 10 15:09:28 2011 From: fweimer at bfk.de (Florian Weimer) Date: Wed, 10 Aug 2011 15:09:28 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: <4E405DFC.6030507@oracle.com> (Ramki Ramakrishna's message of "Mon, 08 Aug 2011 15:06:52 -0700") References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> <4E3DB2E7.9020601@oracle.com> <4E405DFC.6030507@oracle.com> Message-ID: <827h6l8frr.fsf@mid.bfk.de> * Ramki Ramakrishna: > I couldn't tell from the subsequent discussion, if this is by design > in Linux or an inadvertent bug (at least the Linux man page is mum on > the swap reservation semantics of MADV_DONTNEED; Solaris does not lose > page reservations on an madvise; so we'll have to defer to people > familiar with the Linux kernel to interpret intentions and semantics > here). In Linux, Hotspot should not use MAP_NORESERVE, but map with PROT_NONE initially and upgrade that to PROT_READ | PROT_WRITE (using mprotect) as heap usage grows. Then MADV_DONTNEED will do the right thing with regards to swap reservation. IIRC, MAP_NORESERVE results in a SIGSEGV signal which Hotspot cannot handle properly at the moment. 
-- Florian Weimer BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 From John.Coomes at oracle.com Wed Aug 10 18:47:55 2011 From: John.Coomes at oracle.com (John Coomes) Date: Wed, 10 Aug 2011 11:47:55 -0700 Subject: RFR(XS): 7074579: G1: JVM crash with JDK7 running ATG CRMDemo Fusion App In-Reply-To: <4E3C792D.4020909@oracle.com> References: <4E3C792D.4020909@oracle.com> Message-ID: <20034.53851.333550.809588@oracle.com> John Cuthbertson (john.cuthbertson at oracle.com) wrote: > Hi Everyone, > > Can I have a couple of volunteers look at these changes? The webrev can > be found at: http://cr.openjdk.java.net/~johnc/7074579/webrev.0/ > ... > The solution was to allocate a handle to hold the klass oop and use the > de-referenced handle in the allocations. Looks good to me. And thanks for fixing the whitespace and line wrapping. -John From tony.printezis at oracle.com Wed Aug 10 19:34:53 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Wed, 10 Aug 2011 15:34:53 -0400 Subject: CRR (S/M): 7075646: G1: fix inconsistencies in the monitoring data Message-ID: <4E42DD5D.8070206@oracle.com> Hi all, I would like a couple of code reviews for some fixes in the G1 monitoring code: http://cr.openjdk.java.net/~tonyp/7075646/webrev.0/ The main motivation behind these changes is that G1's jstat output has inconsistencies and has been causing a few test failures. Here's a quick summary of the changes: - Reworked the way the capacities of the various spaces are calculated so that only the eden space used counter needs to be updated when a new eden region is allocated. - Now the values of the various sizes that need to be reported are calculated synchronously in all the appropriate places in the code and stored so that they do not need to be recalculated every time they are required. - The jstat counters for the young / old gen capacity are now correctly updated.
- We ensure that when we are reporting a capacity to jstat we artificially pad it so that it's never 0 (as jstat does not handle 0 capacities gracefully). I attached a file that has before / after output comparisons, along with some commentary, for the various jstat GC parameters. Tony

-------------- next part --------------

---------- -gc

S0C     S1C  S0U     S1U       EC      EU       OC       OU       PC      PU  YGC   YGCT  FGC   FGCT    GCT

BEFORE:
0.0  1024.0  0.0     0.0   9216.0  9216.0  30720.0  11566.9  16384.0  2684.0    7  0.209    1  0.019  0.227
0.0  1024.0  0.0  1024.0   3072.0  3072.0  30720.0  12182.3  16384.0  2686.9    8  0.250    1  0.019  0.268

AFTER:
0.0     0.0  0.0     0.0   6144.0  3072.0  26624.0  11294.3  16384.0  2646.3    7  0.163    1  0.018  0.181
0.0     0.0  0.0     0.0  10240.0     0.0  22528.0  12153.0  16384.0  2648.5    8  0.196    1  0.018  0.214

COMMENTS:
EC is now calculated differently (i.e., it's not always == EU)
OC is also calculated differently (it's smaller as the Eden capacity is now larger).

---------- -gccapacity

BEFORE:
NGCMN    NGCMX     NGC  S0C     S1C      EC    OGCMN    OGCMX      OGC       OC    PGCMN    PGCMX      PGC       PC  YGC  FGC
  0.0      0.0     0.0  0.0  1024.0  4096.0  32768.0  65536.0  32768.0  30720.0  16384.0  65536.0  16384.0  16384.0    5    1
  0.0      0.0     0.0  0.0  1024.0  1024.0  32768.0  65536.0  32768.0  30720.0  16384.0  65536.0  16384.0  16384.0    6    1

AFTER:
  0.0  65536.0  6144.0  0.0     0.0  6144.0      0.0  65536.0  26624.0  26624.0  16384.0  65536.0  16384.0  16384.0    5    1
  0.0  65536.0  6144.0  0.0     0.0  6144.0      0.0  65536.0  26624.0  26624.0  16384.0  65536.0  16384.0  16384.0    6    1

COMMENTS:
The minimum generation capacities (NGCMN and OGCMN) are now set to the same value, i.e., 0.
The maximum generation capacities (NGCMX and OGCMX) are also set to the same value which is the maximum heap capacity.
EC and OC are also calculated differently.

---------- -gccause

  S0    S1       E      O      P  YGC   YGCT  FGC   FGCT    GCT  LGCC                 GCC

BEFORE:
 ???  0.00  100.00  29.16  15.93    4  0.137    1  0.017  0.154  G1 Evacuation Pause  No GC
 ???  0.00  100.00  29.16  16.00    5  0.137    1  0.017  0.154  No GC                G1 Evacuation Pause

AFTER:
0.00  0.00   42.86  33.36  15.93    4  0.091    1  0.020  0.111  G1 Evacuation Pause  No GC
0.00  0.00   85.71  33.36  15.94    5  0.091    1  0.020  0.111  No GC                G1 Evacuation Pause

COMMENTS:
Fixed the division-by-zero bug that was causing the missing values in the first column.
The Eden is not shown to always be full given that its capacity is calculated differently.

---------- -gcnew

S0C     S1C  S0U     S1U  TT  MTT  DSS      EC      EU  YGC   YGCT

BEFORE:
0.0  1024.0  0.0  1024.0   1   15  0.0  6144.0  6144.0    3  0.084
0.0  1024.0  0.0     0.0  15   15  0.0  1024.0  1024.0    4  0.116

AFTER:
0.0  1024.0  0.0  1024.0   1   15  0.0  7168.0  5120.0    3  0.088
0.0     0.0  0.0     0.0  15   15  0.0  7168.0     0.0    4  0.136

COMMENTS:
EC is now calculated differently.

---------- -gcnewcapacity

NGCMN    NGCMX     NGC  S0CMX  S0C    S1CMX     S1C     ECMX      EC  YGC  FGC

BEFORE:
  0.0      0.0     0.0    0.0  0.0  65536.0  1024.0  65536.0  4096.0    4    1
  0.0      0.0     0.0    0.0  0.0  65536.0  1024.0  65536.0  6144.0    5    1

AFTER:
  0.0  65536.0  7168.0    0.0  0.0  65536.0     0.0  65536.0  7168.0    4    1
  0.0  65536.0  6144.0    0.0  0.0  65536.0     0.0  65536.0  6144.0    5    1

COMMENTS:
NGCMX is now set to the max heap capacity.
NGC is now updated correctly.
EC is now calculated differently.

---------- -gcold

     PC      PU       OC       OU  YGC  FGC   FGCT    GCT

BEFORE:
16384.0  2636.8  30720.0  10827.7    6    1  0.020  0.214
16384.0  2650.2  30720.0  11577.3    7    1  0.020  0.242

AFTER:
16384.0  2637.8  26624.0  10953.0    6    1  0.023  0.321
16384.0  2644.4  22528.0  11712.2    7    1  0.023  0.336

COMMENTS:
OC is now calculated differently (it's smaller as the Eden capacity is now larger).

---------- -gcoldcapacity

  OGCMN    OGCMX      OGC       OC  YGC  FGC   FGCT    GCT

BEFORE:
32768.0  65536.0  32768.0  30720.0    8    2  0.072  0.350
32768.0  65536.0  32768.0  30720.0    9    2  0.072  0.420

AFTER:
    0.0  65536.0  22528.0  22528.0    8    1  0.018  0.218
    0.0  65536.0  22528.0  22528.0    9    1  0.018  0.253

COMMENTS:
OGCMN is now set to 0, OGC is now updated correctly and matches OC, OC is now calculated differently (it's smaller as the Eden capacity is now larger).

---------- -gcutil

  S0      S1       E      O      P  YGC   YGCT  FGC   FGCT    GCT

BEFORE:
 ???  100.00  100.00  40.87  16.38    8  0.308    1  0.017  0.325
 ???    0.00  100.00   2.71  16.40    8  0.308    2  0.058  0.365

AFTER:
0.00    0.00   80.00  53.93  16.40    8  0.290    1  0.021  0.311
0.00    0.00    6.25   5.08  16.40    8  0.290    2  0.066  0.356

COMMENTS:
Fixed the division-by-zero bug that was causing the missing values in the first column.
The Eden capacity is calculated differently so the eden is not shown to always be 100% full.

----------

From kevin.walls at oracle.com Thu Aug 11 16:49:59 2011 From: kevin.walls at oracle.com (kevin.walls at oracle.com) Date: Thu, 11 Aug 2011 16:49:59 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20110811165003.160DF47AD9@hg.openjdk.java.net> Changeset: 41e6ee74f879 Author: kevinw Date: 2011-08-02 14:37 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/41e6ee74f879 7072527: CMS: JMM GC counters overcount in some cases Summary: Avoid overcounting when CMS has concurrent mode failure. Reviewed-by: ysr Contributed-by: rednaxelafx at gmail.com ! src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp !
src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.hpp + test/gc/7072527/TestFullGCCount.java Changeset: e9db47a083cc Author: kevinw Date: 2011-08-11 14:58 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e9db47a083cc Merge From john.cuthbertson at oracle.com Thu Aug 11 20:48:54 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Thu, 11 Aug 2011 20:48:54 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7074579: G1: JVM crash with JDK7 running ATG CRMDemo Fusion App Message-ID: <20110811204856.EFEA847AE8@hg.openjdk.java.net> Changeset: 87e40b34bc2b Author: johnc Date: 2011-08-11 11:36 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/87e40b34bc2b 7074579: G1: JVM crash with JDK7 running ATG CRMDemo Fusion App Summary: Handlize MemoryUsage klass oop in createGCInfo routine Reviewed-by: tonyp, fparain, ysr, jcoomes ! src/share/vm/services/gcNotifier.cpp From tony.printezis at oracle.com Fri Aug 12 15:20:12 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 12 Aug 2011 11:20:12 -0400 Subject: CRR (S/M): 7075646: G1: fix inconsistencies in the monitoring data In-Reply-To: <4E42DD5D.8070206@oracle.com> References: <4E42DD5D.8070206@oracle.com> Message-ID: <4E4544AC.50200@oracle.com> Hi all, Thanks to Jon Masa for several good suggestions, here's the updated webrev: http://cr.openjdk.java.net/~tonyp/7075646/webrev.1/ Tony Tony Printezis wrote: > Hi all, > > I would like a couple of code reviews for some fixes in the G1 > monitoring code: > > http://cr.openjdk.java.net/~tonyp/7075646/webrev.0/ > > The main motivation behind these changes is that G1's jstat output has > inconsistencies and has been causing a few test failures. Here's a > quick summary of the changes: > > - Reworked the way the capacities of the various spaces are calculated > so that only the eden space used counter needs to be updated when a > new eden region is allocated. 
> - Now the values of the various sizes that need to be reported are > calculated synchronously in all the appropriate places in the code and > stored so that they do not need to be recalculated every time they are > required. > - The jstat counters for the young / old gen capacity are now > correctly updated. > - We ensure that when we are reporting a capacity to jstat we > artficially pad it so that it's never 0 (as jstat does not handle 0 > capacities gracefully). > > I attached a file that has before / after output comparisons, along > with some commentary, for the various jstat GC parameters. > > Tony > From tony.printezis at oracle.com Fri Aug 12 23:47:55 2011 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Fri, 12 Aug 2011 23:47:55 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7039627: G1: avoid BOT updates for survivor allocations and dirty survivor regions incrementally Message-ID: <20110812234759.3FEA047B33@hg.openjdk.java.net> Changeset: f44782f04dd4 Author: tonyp Date: 2011-08-12 11:31 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f44782f04dd4 7039627: G1: avoid BOT updates for survivor allocations and dirty survivor regions incrementally Summary: Refactor the allocation code during GC to use the G1AllocRegion abstraction. Use separate subclasses of G1AllocRegion for survivor and old regions. Avoid BOT updates and dirty survivor cards incrementally for the former. Reviewed-by: brutisso, johnc, ysr ! src/share/vm/gc_implementation/g1/g1AllocRegion.cpp ! src/share/vm/gc_implementation/g1/g1AllocRegion.hpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.inline.hpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp ! src/share/vm/gc_implementation/g1/heapRegion.cpp ! src/share/vm/gc_implementation/g1/heapRegion.hpp ! 
src/share/vm/gc_implementation/g1/heapRegionRemSet.cpp From tony.printezis at oracle.com Tue Aug 16 15:33:03 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 16 Aug 2011 11:33:03 -0400 Subject: CRR (S/M): 7075646: G1: fix inconsistencies in the monitoring data In-Reply-To: <4E4544AC.50200@oracle.com> References: <4E42DD5D.8070206@oracle.com> <4E4544AC.50200@oracle.com> Message-ID: <4E4A8DAF.1040902@oracle.com> Hi all, New webrev with a small fix to resolve an issue I came across during testing (I was forgetting to update the new generation capacity after the eden / survivor capacities had been updated): http://cr.openjdk.java.net/~tonyp/7075646/webrev.2/ If you started looking at the webrev below, the only change is this one (in g1MonitoringSupport.cpp): // Finally, give the rest to the old space... _old_committed += committed; // ..and calculate the young gen committed. _young_gen_committed = _eden_committed + _survivor_committed; Tony Tony Printezis wrote: > Hi all, > > Thanks to Jon Masa for several good suggestions, here's the updated > webrev: > > http://cr.openjdk.java.net/~tonyp/7075646/webrev.1/ > > Tony > > Tony Printezis wrote: >> Hi all, >> >> I would like a couple of code reviews for some fixes in the G1 >> monitoring code: >> >> http://cr.openjdk.java.net/~tonyp/7075646/webrev.0/ >> >> The main motivation behind these changes is that G1's jstat output >> has inconsistencies and has been causing a few test failures. Here's >> a quick summary of the changes: >> >> - Reworked the way the capacities of the various spaces are >> calculated so that only the eden space used counter needs to be >> updated when a new eden region is allocated. >> - Now the values of the various sizes that need to be reported are >> calculated synchronously in all the appropriate places in the code >> and stored so that they do not need to be recalculated every time >> they are required. 
>> - The jstat counters for the young / old gen capacity are now >> correctly updated. >> - We ensure that when we are reporting a capacity to jstat we >> artficially pad it so that it's never 0 (as jstat does not handle 0 >> capacities gracefully). >> >> I attached a file that has before / after output comparisons, along >> with some commentary, for the various jstat GC parameters. >> >> Tony >> > From igor.veresov at oracle.com Tue Aug 16 17:31:36 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 16 Aug 2011 10:31:36 -0700 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets In-Reply-To: <827h6l8frr.fsf@mid.bfk.de> References: <20110806120053.5F1CF479A0@hg.openjdk.java.net> <4E3DB2E7.9020601@oracle.com> <4E405DFC.6030507@oracle.com> <827h6l8frr.fsf@mid.bfk.de> Message-ID: The MAP_NORESERVE calls are supposed to be used only to reserve address space. When the heap gets expanded into this space we do mmap again, this time with the swap reservation (see os::commit_memory()). If that mmap fails, the VM notices that and reacts accordingly (like aborting the expansion attempt). igor On Wednesday, August 10, 2011 at 8:09 AM, Florian Weimer wrote: > * Ramki Ramakrishna: > > > I couldn't tell from the subsequent discussion, if this is by design > > in Linux or an inadvertent bug (at least the Linux man page is mum on > > the swap reservation semantics of MADV_DONTNEED; Solaris does not lose > > page reservations on an madvise; so we'll have to defer to people > > familiar with the Linux kernel to interpret intentions and semantics > > here). > > In Linux, Hotspot should not use MAP_NORESERVE, but map with PROT_NONE > initially and upgrade that to PROT_READ | PROT_WRITE (using mprotect) as > heap usage grows. Then MADV_DONTNEED will do the right thing with > regards to swap reservation. IIRC, MAP_NORESERVE results in a SIGSEGV > signal which Hotspot cannot handle properly at the moment. 
> > -- > Florian Weimer > BFK edv-consulting GmbH http://www.bfk.de/ > Kriegsstraße 100 tel: +49-721-96201-1 > D-76133 Karlsruhe fax: +49-721-96201-99 From bengt.rutisson at oracle.com Tue Aug 16 18:03:56 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 16 Aug 2011 20:03:56 +0200 Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1 Message-ID: <4E4AB10C.8050605@oracle.com> Hi all, Could I have a couple of reviews for this change? Background: G1 was originally designed to be able to run in a non-generational mode. This has not been used for a long time and no testing has been done. Thus, the code has bit rotted. We don't see the need for this feature anymore, so rather than fixing it we should remove it. This is actually a low priority item, but since it will make my next step (supporting young space sizing better) simpler I would like to get this CR out of the way. Webrev: http://cr.openjdk.java.net/~brutisso/6814390c/webrev.01/ CR: 6814390 G1: remove the concept of non-generational G1 http://monaco.us.oracle.com/detail.jsf?cr=6814390 Thanks, Bengt From y.s.ramakrishna at oracle.com Tue Aug 16 20:09:23 2011 From: y.s.ramakrishna at oracle.com (y.s.ramakrishna at oracle.com) Date: Tue, 16 Aug 2011 20:09:23 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20110816200928.27D0947C15@hg.openjdk.java.net> Changeset: ca1f1753c866 Author: andrew Date: 2011-07-28 14:10 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ca1f1753c866 7072341: enable hotspot builds on Linux 3.0 Summary: Add "3" to list of allowable versions Reviewed-by: kamg, chrisphi !
make/linux/Makefile Changeset: 76b1a9420e3d Author: ysr Date: 2011-08-16 08:02 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/76b1a9420e3d Merge From tom.deneau at amd.com Tue Aug 16 20:14:00 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 16 Aug 2011 15:14:00 -0500 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <4E402E1C.1010807@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> Igor -- I am back from vacation, starting to address your comments... Regarding your comment #1 below. You mention "future planned NUMA-aware implementations of GCs". How do these future planned NUMA-aware implementations of GCs differ from today's NUMA-aware GCs? My understanding of the current GCs use of NUMA is that they support numa_global (interleaved) and numa_local (memory pinned to one numa node). In the currently released JVMs on Windows OSes, neither numa_local nor numa_global is implemented. The implementation I proposed in the patch maps both numa_local and numa_global requests to numa_global (on Windows). The reasons for this were: * it was very difficult (if not impossible) to implement the JVM's current numa_local semantics on Windows * in the benchmarks we measured, the extra performance that was left on the table by doing only numa_global and not doing numa_local was only a few percent. Are you saying that in the future numa_local will be supported on Windows, and that even then it might still be advantageous to have a flag (UseNUMAInterleaving) which instead maps all the regions to numa_global? Should this flag be available on all OSes? -- Tom > -----Original Message----- > From: Igor Veresov [mailto:igor.veresov at oracle.com] > Sent: Monday, August 08, 2011 1:43 PM > To: hotspot-gc-dev at openjdk.java.net; Deneau, Tom > Subject: Re: Review Request: UseNUMAInterleaving > > Hi, Tom! 
> > Sorry it took me so long to get to that. > > 1. I don't think the new version of flag usage is prudent. The reason I > proposed to introduce a new flag for interleaving is that it would make > life easier in the future when the proper NUMA-aware implementation of > GCs are added (G1 would be the most probable candidate). I would propose > to still have UseNUMAInterleaving flag. > > The usage would be as follows: > - If UseNUMA is specified on Windows that would turn UseNUMAInterleaving > (for the time being, and that behavior would change in the future). > - If UseNUMAInterleaving is specified on the command line, you just do > the interleaving. If you don't add this flag now, you'll have to do that > anyway as soon as NUMA-aware GCs start supporting windows. > > > igor > > > > On 5/26/11 4:37 PM, Deneau, Tom wrote: > > I have incorporated the change suggested by Paul Hohensee to just use > the existing UseNUMA flag rather than introduce a new flag. Please let me > know when you think this will be able to be checked in... > > > > The new webrev is at > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/ > > > > -- Tom Deneau, AMD > > > > From y.s.ramakrishna at oracle.com Tue Aug 16 21:31:14 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 16 Aug 2011 14:31:14 -0700 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> Message-ID: <4E4AE1A2.2080704@oracle.com> Not answering for Igor here (and i have not looked at yr webrev; just responding to the content of the email messages), but this question came up briefly at lunch today .... On 08/16/11 13:14, Deneau, Tom wrote: ... 
> Are you saying that in the future numa_local will be supported on > Windows, and that even then it might still be advantageous to have a > flag (UseNUMAInterleaving) which instead maps all the regions to > numa_global? Should this flag be available on all OSes? > ... and, FWIW, in my opinion, it would seem to be advantageous and also less confusing to support the same semantics (numa_global) when +UseNUMAInterleaving, uniformly on all OS's, rather than restrict that semantics to just windows. (And yes support that flag even after we extend our implementation to do numa_local on windows in the future...) - ramki > -- Tom > >> -----Original Message----- >> From: Igor Veresov [mailto:igor.veresov at oracle.com] >> Sent: Monday, August 08, 2011 1:43 PM >> To: hotspot-gc-dev at openjdk.java.net; Deneau, Tom >> Subject: Re: Review Request: UseNUMAInterleaving >> >> Hi, Tom! >> >> Sorry it took me so long to get to that. >> >> 1. I don't think the new version of flag usage is prudent. The reason I >> proposed to introduce a new flag for interleaving is that it would make >> life easier in the future when the proper NUMA-aware implementation of >> GCs are added (G1 would be the most probable candidate). I would propose >> to still have UseNUMAInterleaving flag. >> >> The usage would be as follows: >> - If UseNUMA is specified on Windows that would turn UseNUMAInterleaving >> (for the time being, and that behavior would change in the future). >> - If UseNUMAInterleaving is specified on the command line, you just do >> the interleaving. If you don't add this flag now, you'll have to do that >> anyway as soon as NUMA-aware GCs start supporting windows. >> >> >> igor >> >> >> >> On 5/26/11 4:37 PM, Deneau, Tom wrote: >>> I have incorporated the change suggested by Paul Hohensee to just use >> the existing UseNUMA flag rather than introduce a new flag. Please let me >> know when you think this will be able to be checked in...
>>> The new webrev is at >>> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/ >>> >>> -- Tom Deneau, AMD >>> >>> > From john.cuthbertson at oracle.com Wed Aug 17 17:58:00 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 17 Aug 2011 10:58:00 -0700 Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1 In-Reply-To: <4E4AB10C.8050605@oracle.com> References: <4E4AB10C.8050605@oracle.com> Message-ID: <4E4C0128.5090007@oracle.com> Hi Bengt, Looks good to me. You may want to consider removing the CMCheckpointRootsInitialClosure - it's only instantiated in the code you removed from concurrentMarkThread.cpp. JohnC On 08/16/11 11:03, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this change? > > Background: > G1 was originally designed to be able to run in a non-generational > mode. This has not been used for a long time and no testing has been > done. Thus, the code has bit rotted. We don't see the need for this > feature anymore, so rather than fixing it we should remove it. > > This is actually a low priority item, but since it will make my next > step (supporting young space sizing better) simpler I would like to > get this CR out of the way. 
> > Webrev: > http://cr.openjdk.java.net/~brutisso/6814390c/webrev.01/ > > CR: > 6814390 G1: remove the concept of non-generational G1 > http://monaco.us.oracle.com/detail.jsf?cr=6814390 > > Thanks, > Bengt > From john.cuthbertson at oracle.com Wed Aug 17 18:15:13 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 17 Aug 2011 11:15:13 -0700 Subject: RFR(M/L): 6484982: G1: process references during evacuation pauses In-Reply-To: <4E398911.1070307@oracle.com> References: <4E03B1A8.7020703@oracle.com> <4E398911.1070307@oracle.com> Message-ID: <4E4C0531.80307@oracle.com> Hi Everyone, A new webrev for these changes can be found at: http://cr.openjdk.java.net/~johnc/6484982/webrev.2/ The changes in this webrev reverse the order of preserving objects referenced from the concurrent mark ref processor's discovered lists and processing references discovered by the STW ref processor. Preserving the objects referenced from the concurrent mark ref processor's discovered lists comes first so that any object that needs to be copied is done so before reference processing. JohnC On 08/03/11 10:44, John Cuthbertson wrote: > Hi Everyone, > > A new webrev incorporating some feedback from Ramki can be found at: > http://cr.openjdk.java.net/~johnc/6484982/webrev.1/ > > Thanks, > > JohnC > > On 06/23/11 14:35, John Cuthbertson wrote: >> Hi Everyone, >> >> I would like to get a couple of volunteers to review the code changes >> for this CR - the webrev can be found at >> http://cr.openjdk.java.net/~johnc/6484982/webrev.0/ >> >> Summary: >> G1 now contains 2 instances of the reference processor class - one >> for concurrent marking and the other for STW GCs (both full and >> incremental evacuation pauses). For evacuation pauses, during object >> scanning and RSet scanning I embed the STW reference processor into >> the OopClosures used to scan objects. This causes reference objects >> to be 'discovered' by the reference processor. 
Towards the end of the >> evacuation pause (just prior to retiring the GC alloc regions) I >> have added the code to process these discovered reference objects, >> preserving (and copying) referent objects (and their reachable >> graphs) as appropriate. The code that does this makes extensive use >> of the existing copying oop closures and the G1ParScanThreadState >> structure (to handle to-space allocation). >> >> The code changes also include a couple of fixes that were exposed by >> the reference processing: >> * In satbQueue.cpp, the routine >> SATBMarkQueueSet::par_iterate_closure_all_threads() was claiming all >> JavaThreads (giving them one parity value) but skipping the VMThread. >> In a subsequent call to Thread::possibly_parallel_oops_do, the Java >> threads were successfully claimed but the VMThread was not. This >> could cause the VMThread's handle area to be skipped during the root >> scanning. >> * There were a couple of assignments to the discovered field of >> Reference objects that were not guarded by _discovery_needs_barrier, >> causing the G1 C++ write-barrier to dirty the card spanning the >> Reference object's discovered field. This was causing the card table >> verification (during card table clearing) to fail. >> * There were also a couple of assignments of NULL to the next field >> of Reference objects causing the same symptom. >> >> Testing: The GC test suite (32/64 bit) (+UseG1GC, +UseG1GC >> +ExplicitGCInvokesConcurrent, +UseG1GC >> InitiatingHeapOccupancyPercent=5, +UseG1GC +ParallelRefProcEnabled), >> KitchenSink (48 hour runs with +UseG1GC, +UseG1GC >> +ExplicitGCInvokesConcurrent), OpenDS (+UseG1GC, +UseG1GC >> +ParallelRefProcEnabled), nsk GC and compiler tests, and jprt. >> Testing was conducted with the _is_alive_non_header field in the STW >> ref processor both cleared and set (when cleared, more reference >> objects are 'discovered').
>> >> Thanks, >> >> JohnC > From john.coomes at oracle.com Wed Aug 17 20:56:16 2011 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Wed, 17 Aug 2011 20:56:16 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 6791672: enable 1G and larger pages on solaris Message-ID: <20110817205620.B4D1447C72@hg.openjdk.java.net> Changeset: 24cee90e9453 Author: jcoomes Date: 2011-08-17 10:32 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/24cee90e9453 6791672: enable 1G and larger pages on solaris Reviewed-by: ysr, iveresov, johnc ! src/os/solaris/vm/os_solaris.cpp ! src/share/vm/runtime/os.cpp ! src/share/vm/runtime/os.hpp From tom.deneau at amd.com Wed Aug 17 22:51:12 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Wed, 17 Aug 2011 17:51:12 -0500 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <4E402E1C.1010807@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A90186EF950D@SAUSEXMBP01.amd.com> Igor -- Regarding your comment #5 below: > 5. What is the typical allocation granularity on windows? Wouldn't that > be a problem if we tried to allocate a large heap with small interleaved > pages? Have you tried using larger interleaving granularity for modern > windows version? Doing a syscall and creating a segment per even a large > page seems bit excessive. If you did try that, was there any difference? > The allocation granularity for 4K pages on Windows is 64K. (and for 2M pages is 1 page). I didn't do any precise measurements of how long it took to allocate an interleaved heap at this granularity but I didn't perceive startup slowdowns when allocating a 12G heap. I can try to get some actual measurements. I didn't try any different granularities. Do you think it's worth making the granularity a command line parameter? 
-- Tom > -----Original Message----- > From: Igor Veresov [mailto:igor.veresov at oracle.com] > Sent: Monday, August 08, 2011 1:43 PM > To: hotspot-gc-dev at openjdk.java.net; Deneau, Tom > Subject: Re: Review Request: UseNUMAInterleaving > > Hi, Tom! > > Sorry it took me so long to get to that. > > 1. I don't think the new version of flag usage is prudent. The reason I > proposed to introduce a new flag for interleaving is that it would make > life easier in the future when the proper NUMA-aware implementation of > GCs are added (G1 would be the most probable candidate). I would propose > to still have UseNUMAInterleaving flag. > > The usage would be as follows: > - If UseNUMA is specified on Windows that would turn UseNUMAInterleaving > (for the time being, and that behavior would change in the future). > - If UseNUMAInterleaving is specified on the command line, you just do > the interleaving. If you don't add this flag now, you'll have to do that > anyway as soon as NUMA-aware GCs start supporting windows. > > 2. I guess the accepted coding convention in hotspot is that "else" > should have closing and open bracket be on one line. > 2846 } > 2847 else { > And in all other places... > > > 3. Did you forget to remove that? > 3149 // tty->print("VirtualQuery AllocBase=%p, RegionSize=%Id\n", > allocInfo.AllocationBase, allocInfo.RegionSize); > > 4. Does it make sense to pass UseLargePages and UseNUMAInterleaving to > allocate_pages_individually()? They are global variables anyway. > > 5. What is the typical allocation granularity on windows? Wouldn't that > be a problem if we tried to allocate a large heap with small interleaved > pages? Have you tried using larger interleaving granularity for modern > windows version? Doing a syscall and creating a segment per even a large > page seems bit excessive. If you did try that, was there any difference? > > 6. The usage of "result" doesn't seem right here, did you mean "if > (!result) return false;" ? 
> 3129 bool result = VirtualAlloc(addr, bytes, MEM_COMMIT, > PAGE_READWRITE) != 0; > 3130 if (result == NULL) return false; > > 7. Wouldn't it be nicer instead of the idiom > BOOL ok = SysCall(); > if (!ok) return false; > just to say > if (!SysCall()) return false; > ? > > 8. Instead of introducing a global variable numa_used_node_count, could > you implement os::numa_get_groups_num() that was intended to return this > number? > Also build_numa_used_node_list() seems to have the same functionality > as os::numa_get_leaf_groups() was intended to have. Could you implement > it and use it instead? > > Please name function parameters in lower case with words separated with > underscores. I know that there are exceptions, especially in > os_windows.cpp, but it's better if we stick to the general convention. > > > igor > > > > On 5/26/11 4:37 PM, Deneau, Tom wrote: > > I have incorporated the change suggested by Paul Hohensee to just use > the existing UseNUMA flag rather than introduce a new flag. Please let > me know when you think this will be able to be checked in... > > > > The new webrev is at > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/ > > > > -- Tom Deneau, AMD > > > > > > > >> -----Original Message----- > >> From: Deneau, Tom > >> Sent: Monday, May 16, 2011 12:54 PM > >> To: 'hotspot-compiler-dev at openjdk.java.net' > >> Subject: Review Request: UseNUMAInterleaving > >> > >> Please review this patch which adds a new flag called > >> UseNUMAInterleaving. This flag provides a subset of the functionality > >> provided by UseNUMA, and its main purpose is to provide that subset on > >> OSes like Windows which do not support the full UseNUMA functionality. > >> In UseNUMA terminology, UseNUMAInterleaved makes all memory > >> "numa_global" which is implemented as interleaved. 
> >> > >> The situations where this shows the biggest benefits would be: > >> * Windows platforms with multiple numa nodes (eg, 4) > >> > >> * The JVM process is run across all the nodes (not affinitized to > one > >> node). > >> > >> * A workload that uses the majority of the cores in the machine, > so > >> that the heap is being accessed from many cores, including > remote > >> ones. > >> > >> * Enough memory per node and a heap size such that the default > heap > >> placement policy on windows would end up with the heap (or > >> nursery) placed on one node. > >> > >> jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > >> measurements, we have seen some cases where the performance with > >> UseNUMAInterleaving was 2.7x vs. the performance without. There were > >> gains of varying sizes across all systems. > >> > >> As currently implemented this flag is ignored on Linux and Solaris > >> since they already support the full UseNUMA flag. > >> > >> The webrev is at > >> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.01/ > >> > >> Summary of changes: > >> > >> * Other than adding the new UseNUMAInterleaving global flag, all > of > >> the changes are in src/os/windows/vm/os_windows.cpp > >> > >> * Some static routines were added to set things up init time. > These > >> * check that the required APIs (VirtualAllocExNuma, > >> GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > >> the OS > >> > >> * build the list of numa nodes on which this process has > affinity > >> > >> * Changes to os::reserve_memory > >> * There was already a routine that reserved pages one page at a > >> time (used for Individual Large Page Allocation on WS2003). > >> This was abstracted to a separate routine, called > >> allocate_pages_individually. 
This gets called both for the > >> Individual Large Page Allocation thing mentioned above and > for > >> UseNUMAInterleaving (for both small and large pages) > >> > >> * When used for NUMA Interleaving this just goes thru the numa > >> node list in a round-robin fashion, using a different one for > >> each chunk (with 4K pages, the minimum allocation granularity > >> is 64K, with 2M pages it is 1 Page) > >> > >> * Whether we do just a reserve or a combined reserve/commit is > >> determined by the caller of allocate_pages_individually > >> > >> * When used with large pages, we do a Reserve and Commit at > >> the same time which is the way it always worked and the > way > >> it has to work on windows. > >> > >> * For small pages, only the reserve is done, the commit will > >> come later. (which is the way it worked for > >> non-interleaved) > >> > >> * os::commit_memory changes > >> * If UseNUMAIntereaving is true, os::commit_memory has to check > >> whether it was being asked to commit memory that might have > >> come from multiple Reserve allocations, if so, the commits > >> must also be broken up. We don't keep any data structure to > >> keep track of this, we just use VirtualQuery which queries > the > >> properties of a VA range and can tell us how much came from > >> one VirtualAlloc call. > >> > >> I do not have a bug id for this. 
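The round-robin placement described in the summary above can be sketched roughly as follows. This is only an illustration under stated assumptions: `interleave_chunk_bytes`, `plan_interleave`, `Chunk` and `kAllocGranularity` are hypothetical names, not functions from the patch or from HotSpot, and the sketch only plans which node each chunk would go to; it makes no Windows API calls. Sizes are assumed to be multiples of the chunk size in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: the 64K Windows allocation granularity quoted in the thread.
constexpr std::size_t kAllocGranularity = 64 * 1024;

// Size of one interleave chunk: never smaller than the allocation granularity,
// so 4K pages give 64K chunks and 2M pages give one-page chunks.
constexpr std::size_t interleave_chunk_bytes(std::size_t page_bytes) {
  return page_bytes > kAllocGranularity ? page_bytes : kAllocGranularity;
}

// One planned reservation: where it starts, how long it is, which node gets it.
struct Chunk {
  std::size_t offset;
  std::size_t bytes;
  int node;
};

// Split [0, total_bytes) into chunks and hand them to the given NUMA nodes
// in round-robin order. Returns an empty plan if no nodes are available.
std::vector<Chunk> plan_interleave(std::size_t total_bytes,
                                   std::size_t page_bytes,
                                   const std::vector<int>& nodes) {
  std::vector<Chunk> plan;
  if (nodes.empty()) return plan;
  const std::size_t chunk = interleave_chunk_bytes(page_bytes);
  std::size_t i = 0;
  for (std::size_t off = 0; off < total_bytes; off += chunk, ++i) {
    const std::size_t left = total_bytes - off;
    plan.push_back({off, left < chunk ? left : chunk, nodes[i % nodes.size()]});
  }
  return plan;
}
```

In a real implementation each planned chunk would presumably correspond to one per-node reservation call, which is why the number of chunks matters for startup cost.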
> >> > >> -- Tom Deneau, AMD > From igor.veresov at oracle.com Wed Aug 17 23:16:52 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 17 Aug 2011 16:16:52 -0700 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <5EA33A275136844D843B73A29FB9A6A90186EF950D@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF950D@SAUSEXMBP01.amd.com> Message-ID: <3F28B2E2132E42628478AE598EA1C5DD@oracle.com> On Wednesday, August 17, 2011 at 3:51 PM, Deneau, Tom wrote: > Igor -- > > Regarding your comment #5 below: > > > 5. What is the typical allocation granularity on windows? Wouldn't that > > be a problem if we tried to allocate a large heap with small interleaved > > pages? Have you tried using larger interleaving granularity for modern > > windows version? Doing a syscall and creating a segment per even a large > > page seems bit excessive. If you did try that, was there any difference? > > The allocation granularity for 4K pages on Windows is 64K. (and for 2M pages is 1 page). > I didn't do any precise measurements of how long it took to allocate an interleaved heap at this granularity but I didn't perceive startup slowdowns when allocating a 12G heap. I can try to get some actual measurements. I didn't try any different granularities. Do you think it's worth making the granularity a command line parameter? Hm, I don't know, but in your example with the 12G heap, if you have a 64k allocation granularity you'll have to make 196608 syscalls during startup, which seems like a lot. So, yeah, it could make sense to add some parameter that would allow the granularity to be increased and set it to a more sane default value at least for the case of small pages. 
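The 196608 figure above is simply the heap size divided by the chunk size. A small sketch of that arithmetic, assuming one reserve call per chunk (the helper names `chunk_count` and `print_chunk_counts` are made up for illustration and are not HotSpot functions):

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

constexpr std::uint64_t K = 1024;
constexpr std::uint64_t M = 1024 * K;
constexpr std::uint64_t G = 1024 * M;

// Number of per-chunk calls needed to cover heap_bytes, rounding up
// so that a partial trailing chunk still counts as one call.
constexpr std::uint64_t chunk_count(std::uint64_t heap_bytes,
                                    std::uint64_t chunk_bytes) {
  return (heap_bytes + chunk_bytes - 1) / chunk_bytes;
}

// Prints the counts for the 12G heap discussed in the thread.
inline void print_chunk_counts() {
  std::printf("64K chunks: %llu\n",
              static_cast<unsigned long long>(chunk_count(12 * G, 64 * K)));
  std::printf("2M chunks:  %llu\n",
              static_cast<unsigned long long>(chunk_count(12 * G, 2 * M)));
}
```

With these assumptions, 64K chunks give 196608 calls for a 12G heap, while 2M chunks would give 6144, which is the kind of reduction a larger interleaving granularity would buy.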
igor > > -- Tom > > > > > > -----Original Message----- > > From: Igor Veresov [mailto:igor.veresov at oracle.com] > > Sent: Monday, August 08, 2011 1:43 PM > > To: hotspot-gc-dev at openjdk.java.net (mailto:hotspot-gc-dev at openjdk.java.net); Deneau, Tom > > Subject: Re: Review Request: UseNUMAInterleaving > > > > Hi, Tom! > > > > Sorry it took me so long to get to that. > > > > 1. I don't think the new version of flag usage is prudent. The reason I > > proposed to introduce a new flag for interleaving is that it would make > > life easier in the future when the proper NUMA-aware implementation of > > GCs are added (G1 would be the most probable candidate). I would propose > > to still have UseNUMAInterleaving flag. > > > > The usage would be as follows: > > - If UseNUMA is specified on Windows that would turn UseNUMAInterleaving > > (for the time being, and that behavior would change in the future). > > - If UseNUMAInterleaving is specified on the command line, you just do > > the interleaving. If you don't add this flag now, you'll have to do that > > anyway as soon as NUMA-aware GCs start supporting windows. > > > > 2. I guess the accepted coding convention in hotspot is that "else" > > should have closing and open bracket be on one line. > > 2846 } > > 2847 else { > > And in all other places... > > > > > > 3. Did you forget to remove that? > > 3149 // tty->print("VirtualQuery AllocBase=%p, RegionSize=%Id\n", > > allocInfo.AllocationBase, allocInfo.RegionSize); > > > > 4. Does it make sense to pass UseLargePages and UseNUMAInterleaving to > > allocate_pages_individually()? They are global variables anyway. > > > > 5. What is the typical allocation granularity on windows? Wouldn't that > > be a problem if we tried to allocate a large heap with small interleaved > > pages? Have you tried using larger interleaving granularity for modern > > windows version? Doing a syscall and creating a segment per even a large > > page seems bit excessive. 
If you did try that, was there any difference? > > > > 6. The usage of "result" doesn't seem right here, did you mean "if > > (!result) return false;" ? > > 3129 bool result = VirtualAlloc(addr, bytes, MEM_COMMIT, > > PAGE_READWRITE) != 0; > > 3130 if (result == NULL) return false; > > > > 7. Wouldn't it be nicer instead of the idiom > > BOOL ok = SysCall(); > > if (!ok) return false; > > just to say > > if (!SysCall()) return false; > > ? > > > > 8. Instead of introducing a global variable numa_used_node_count, could > > you implement os::numa_get_groups_num() that was intended to return this > > number? > > Also build_numa_used_node_list() seems to have the same functionality > > as os::numa_get_leaf_groups() was intended to have. Could you implement > > it and use it instead? > > > > Please name function parameters in lower case with words separated with > > underscores. I know that there are exceptions, especially in > > os_windows.cpp, but it's better if we stick to the general convention. > > > > > > igor > > > > > > > > On 5/26/11 4:37 PM, Deneau, Tom wrote: > > > I have incorporated the change suggested by Paul Hohensee to just use > > the existing UseNUMA flag rather than introduce a new flag. Please let > > me know when you think this will be able to be checked in... > > > > > > The new webrev is at > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/ > > > > > > -- Tom Deneau, AMD > > > > > > > > > > > > > -----Original Message----- > > > > From: Deneau, Tom > > > > Sent: Monday, May 16, 2011 12:54 PM > > > > To: 'hotspot-compiler-dev at openjdk.java.net (mailto:hotspot-compiler-dev at openjdk.java.net)' > > > > Subject: Review Request: UseNUMAInterleaving > > > > > > > > Please review this patch which adds a new flag called > > > > UseNUMAInterleaving. 
This flag provides a subset of the functionality > > > > provided by UseNUMA, and its main purpose is to provide that subset on > > > > OSes like Windows which do not support the full UseNUMA functionality. > > > > In UseNUMA terminology, UseNUMAInterleaved makes all memory > > > > "numa_global" which is implemented as interleaved. > > > > > > > > The situations where this shows the biggest benefits would be: > > > > * Windows platforms with multiple numa nodes (eg, 4) > > > > > > > > * The JVM process is run across all the nodes (not affinitized to > > one > > > > node). > > > > > > > > * A workload that uses the majority of the cores in the machine, > > so > > > > that the heap is being accessed from many cores, including > > remote > > > > ones. > > > > > > > > * Enough memory per node and a heap size such that the default > > heap > > > > placement policy on windows would end up with the heap (or > > > > nursery) placed on one node. > > > > > > > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > > > > measurements, we have seen some cases where the performance with > > > > UseNUMAInterleaving was 2.7x vs. the performance without. There were > > > > gains of varying sizes across all systems. > > > > > > > > As currently implemented this flag is ignored on Linux and Solaris > > > > since they already support the full UseNUMA flag. > > > > > > > > The webrev is at > > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.01/ > > > > > > > > Summary of changes: > > > > > > > > * Other than adding the new UseNUMAInterleaving global flag, all > > of > > > > the changes are in src/os/windows/vm/os_windows.cpp > > > > > > > > * Some static routines were added to set things up init time. 
> > These > > > > * check that the required APIs (VirtualAllocExNuma, > > > > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > > > > the OS > > > > > > > > * build the list of numa nodes on which this process has > > affinity > > > > > > > > * Changes to os::reserve_memory > > > > * There was already a routine that reserved pages one page at a > > > > time (used for Individual Large Page Allocation on WS2003). > > > > This was abstracted to a separate routine, called > > > > allocate_pages_individually. This gets called both for the > > > > Individual Large Page Allocation thing mentioned above and > > for > > > > UseNUMAInterleaving (for both small and large pages) > > > > > > > > * When used for NUMA Interleaving this just goes thru the numa > > > > node list in a round-robin fashion, using a different one for > > > > each chunk (with 4K pages, the minimum allocation granularity > > > > is 64K, with 2M pages it is 1 Page) > > > > > > > > * Whether we do just a reserve or a combined reserve/commit is > > > > determined by the caller of allocate_pages_individually > > > > > > > > * When used with large pages, we do a Reserve and Commit at > > > > the same time which is the way it always worked and the > > way > > > > it has to work on windows. > > > > > > > > * For small pages, only the reserve is done, the commit will > > > > come later. (which is the way it worked for > > > > non-interleaved) > > > > > > > > * os::commit_memory changes > > > > * If UseNUMAIntereaving is true, os::commit_memory has to check > > > > whether it was being asked to commit memory that might have > > > > come from multiple Reserve allocations, if so, the commits > > > > must also be broken up. We don't keep any data structure to > > > > keep track of this, we just use VirtualQuery which queries > > the > > > > properties of a VA range and can tell us how much came from > > > > one VirtualAlloc call. > > > > > > > > I do not have a bug id for this. 
> > > > > > > > -- Tom Deneau, AMD From igor.veresov at oracle.com Wed Aug 17 23:38:17 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 17 Aug 2011 16:38:17 -0700 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> Message-ID: <247BA26129A14681B03D0856A6FAC69D@oracle.com> On Tuesday, August 16, 2011 at 1:14 PM, Deneau, Tom wrote: > Igor -- > > I am back from vacation, starting to address your comments... > Regarding your comment #1 below. > > You mention "future planned NUMA-aware implementations of GCs". How > do these future planned NUMA-aware implementations of GCs differ from > today's NUMA-aware GCs? My understanding of the current GCs' use of > NUMA is that they support numa_global (interleaved) and numa_local > (memory pinned to one numa node). > > In the currently released JVMs on Windows OSes, neither numa_local nor > numa_global is implemented. The implementation I proposed in the > patch maps both numa_local and numa_global requests to numa_global (on > Windows). The reasons for this were: > > * it was very difficult (if not impossible) to implement the JVM's > current numa_local semantics on Windows It's hard to realize numa_local semantics if you want to minimize the number of memory segments per lgroup. If you're prepared (like in your patch) to have hundreds of thousands of segments, this is not a problem and it's quite easily implementable. The only problem there would be that such a huge number of segments will penalize page fault handling a lot. > * in the benchmarks we measured, the extra performance that was > left on the table by doing only numa_global and not doing > numa_local was only a few percent. > Hm, I have trouble believing that. How did you get such results? What were the experiments?
> Are you saying that in the future numa_local will be supported on > Windows, and that even then it might still be advantageous to have a > flag (UseNUMAInterleaving) which instead maps all the regions to > numa_global? Should this flag be available on all OSes? > Basically yes. And, like Ramki said, it would be nice to support that on other OSes, so that we could at least get interleaving for the collectors that do not explicitly support NUMA. I guess I didn't do that before because the functionality is equivalent to just saying for example on Linux "numactl -i all java ", but since you can't do that on windows (as far as I can see) we could support this flag on unixes as well, which is fairly easy to do: you just have to call os::numa_make_global() for a freshly reserved region. igor > -- Tom > > > -----Original Message----- > > From: Igor Veresov [mailto:igor.veresov at oracle.com] > > Sent: Monday, August 08, 2011 1:43 PM > > To: hotspot-gc-dev at openjdk.java.net (mailto:hotspot-gc-dev at openjdk.java.net); Deneau, Tom > > Subject: Re: Review Request: UseNUMAInterleaving > > > > Hi, Tom! > > > > Sorry it took me so long to get to that. > > > > 1. I don't think the new version of flag usage is prudent. The reason I > > proposed to introduce a new flag for interleaving is that it would make > > life easier in the future when the proper NUMA-aware implementation of > > GCs are added (G1 would be the most probable candidate). I would propose > > to still have UseNUMAInterleaving flag. > > > > The usage would be as follows: > > - If UseNUMA is specified on Windows that would turn UseNUMAInterleaving > > (for the time being, and that behavior would change in the future). > > - If UseNUMAInterleaving is specified on the command line, you just do > > the interleaving. If you don't add this flag now, you'll have to do that > > anyway as soon as NUMA-aware GCs start supporting windows.
> > > > > > igor > > > > > > > > On 5/26/11 4:37 PM, Deneau, Tom wrote: > > > I have incorporated the change suggested by Paul Hohensee to just use > > the existing UseNUMA flag rather than introduce a new flag. Please let me > > know when you think this will be able to be checked in... > > > > > > The new webrev is at > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/ > > > > > > -- Tom Deneau, AMD From tom.deneau at amd.com Wed Aug 17 23:50:28 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Wed, 17 Aug 2011 18:50:28 -0500 Subject: Review Request: UseNUMAInterleaving In-Reply-To: <247BA26129A14681B03D0856A6FAC69D@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A90186EF951E@SAUSEXMBP01.amd.com> Igor -- Comments inline below... > -----Original Message----- > From: Igor Veresov [mailto:igor.veresov at oracle.com] > Sent: Wednesday, August 17, 2011 6:38 PM > To: Deneau, Tom > Cc: hotspot-gc-dev at openjdk.java.net > Subject: Re: Review Request: UseNUMAInterleaving > > On Tuesday, August 16, 2011 at 1:14 PM, Deneau, Tom wrote: > > Igor -- > > > > I am back from vacation, starting to address your comments... > > Regarding your comment #1 below. > > > > You mention "future planned NUMA-aware implementations of GCs". How > > do these future planned NUMA-aware implementations of GCs differ from > > today's NUMA-aware GCs? My understanding of the current GCs use of > > NUMA is that they support numa_global (interleaved) and numa_local > > (memory pinned to one numa node). > > > > In the currently released JVMs on Windows OSes, neither numa_local nor > > numa_global is implemented. The implementation I proposed in the > > patch maps both numa_local and numa_global requests to numa_global (on > > Windows). 
The reasons for this were: > > > > * it was very difficult (if not impossible) to implement the JVM's > > current numa_local semantics on Windows > It's hard to realize numa_local semantics if you want to minimize the > number of memory segments per lgroup. If you're prepared (like in your > patch) to have hundreds of thousands of segments, this is not a problem > and it's quite easily implementable. The only problem there would that > such a huge number of segments will penalize page fault handling a lot. > > * in the benchmarks we measured, the extra performance that was > > left on the table by doing only numa_global and not doing > > numa_local was only a few percent. > > > Hm, I have trouble believing that. How did you get such results? What > were the experiments? As I recall, I took the linux implementation of UseNUMA and forced all the numa_make_local to just call numa_make_global, and then measured the difference between this and regular UseNUMA on jbb2005. > > > Are you saying that in the future numa_local will be supported on > > Windows, and that even then it might still be advantageous to have a > > flag (UseNUMAInterleaving) which instead maps all the regions to > > numa_global? Should this flag be available on all OSes? > > > Basically yes. And like Ramki said it would be nice to support that on > other OSes, so that we could at least get interleaving for the collectors > that do no explicitly support NUMA. I guess I didn't do that before > because the functionality is equivalent to just saying for example on > Linux "numactl -i all java ", but since you can't do that on > windows (as far as I can see) we could support this flag on unixes as > well. Which is fairly easy to do, you just have to call > os::numa_make_global() for a freshly reserved region. > Ah, I had originally thought that this could also be done by just mapping numa_make_local to numa_make_global if the UseNUMAInterleaving flag is set. 
But I think I see your point, that you would also want the interleaving when you're using a non-numa-aware collector.

From igor.veresov at oracle.com Thu Aug 18 01:56:25 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 17 Aug 2011 18:56:25 -0700
Subject: Review Request: UseNUMAInterleaving
In-Reply-To: <5EA33A275136844D843B73A29FB9A6A90186EF951E@SAUSEXMBP01.amd.com>
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF951E@SAUSEXMBP01.amd.com>
Message-ID: <871789DCDC014BD0B971A993EF9A8E5F@oracle.com>

Tom,

I've tried to repeat your experiments and here's what I've got:

SPECjbb2005 on Linux on 4 socket Nehalem, 25G heap, 15G young gen. I did 8 runs of 80 warehouses on 80 hardware threads, peak result was then selected.

* base 481280
* usenuma 661524 (+37%)
* numactl -i all 551175 (+14%)
* usenuma and numa_local to numa_global hack 539724 (+12%)

So, I'd say numa-aware allocator gives +25-27% on top of interleaving with this benchmark. But anyway, interleaving is substantially better than the base case.

igor

From bengt.rutisson at oracle.com Thu Aug 18 07:28:12 2011
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 18 Aug 2011 09:28:12 +0200
Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1
In-Reply-To: <4E4C0128.5090007@oracle.com>
References: <4E4AB10C.8050605@oracle.com> <4E4C0128.5090007@oracle.com>
Message-ID: <4E4CBF0C.9030406@oracle.com>

Hi John,

Thanks for the review!

> Looks good to me. You may want to consider removing the
> CMCheckpointRootsInitialClosure - it's only instantiated in the code
> you removed from concurrentMarkThread.cpp.

Good point. I removed the CMCheckpointRootsInitialClosure class. Here is an updated webrev:
http://cr.openjdk.java.net/~brutisso/6814390c/webrev.02/

Bengt

From y.s.ramakrishna at oracle.com Thu Aug 18 08:10:12 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Thu, 18 Aug 2011 01:10:12 -0700
Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1
In-Reply-To: <4E4CBF0C.9030406@oracle.com>
References: <4E4AB10C.8050605@oracle.com> <4E4C0128.5090007@oracle.com> <4E4CBF0C.9030406@oracle.com>
Message-ID: <4E4CC8E4.4000403@oracle.com>

Hi Bengt -- that should also make the method checkpointRootsInitial() dead, and thus also record_concurrent_mark_init_start() [The clue is the guarantee you deleted in the latter method at line 940 which would be violated in the case of G1Gen because it would be tautologically false now.] I think the dead code contagion does stop there, as far as i could tell from a browse of the code. Perhaps a good IDE may find you a few more dead methods, who knows...

Rest looks good to me.
-- ramki

From bengt.rutisson at oracle.com Thu Aug 18 08:44:12 2011
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 18 Aug 2011 10:44:12 +0200
Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1
In-Reply-To: <4E4CC8E4.4000403@oracle.com>
References: <4E4AB10C.8050605@oracle.com> <4E4C0128.5090007@oracle.com> <4E4CBF0C.9030406@oracle.com> <4E4CC8E4.4000403@oracle.com>
Message-ID: <4E4CD0DC.7030101@oracle.com>

Hi Ramki,

On 2011-08-18 10:10, Y. Srinivas Ramakrishna wrote:
> Hi Bengt -- that should also make the method checkpointRootsInitial()
> dead, and thus also record_concurrent_mark_init_start() [The clue is
> the guarantee you deleted in the latter method at line 940 which would
> be violated in the case of G1Gen because it would be tautologically
> false now.] I think the dead code contagion does stop there, as far as
> i could tell from a browse of the code. Perhaps a good IDE may find you
> a few more dead methods, who knows...

Wow! This change really propagates. Thanks for catching this. I found a couple more methods to delete and one class. Now I hope I got all of the dead code that this change generated.
This is what I deleted this time around:

G1CollectorPolicy::record_concurrent_mark_init_start()
G1CollectorPolicy::record_concurrent_mark_init_end()
ConcurrentMark::checkpointRootsInitial()
G1CollectedHeap::do_sync_mark()
class CMMarkRootsClosure

Here is an updated webrev:
http://cr.openjdk.java.net/~brutisso/6814390c/webrev.03/

Thanks,
Bengt

From tony.printezis at oracle.com Thu Aug 18 16:54:53 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Thu, 18 Aug 2011 12:54:53 -0400
Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1
In-Reply-To: <4E4CD0DC.7030101@oracle.com>
References: <4E4AB10C.8050605@oracle.com> <4E4C0128.5090007@oracle.com> <4E4CBF0C.9030406@oracle.com> <4E4CC8E4.4000403@oracle.com> <4E4CD0DC.7030101@oracle.com>
Message-ID: <4E4D43DD.2040307@oracle.com>

Bengt,

Many thanks for doing this cleanup! The changes look good, but it turns out you can remove a few more things:

g1CollectorPolicy.hpp:

  double _mark_init_start_sec;

  TruncatedSeq* _concurrent_mark_init_times_ms;

  double predict_init_time_ms() {
    return get_new_prediction(_concurrent_mark_init_times_ms);
  }

and their uses.

Can you also maybe rename the following (the "pre" suffix does not make sense any more)?

G1CollectorPolicy::record_concurrent_mark_init_end_pre()

to

G1CollectorPolicy::record_concurrent_mark_init_end()

Thanks,

Tony

From john.cuthbertson at oracle.com Thu Aug 18 18:17:59 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Thu, 18 Aug 2011 11:17:59 -0700
Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures
Message-ID: <4E4D5757.7040004@oracle.com>

Hi Everyone,

Can I have a couple of volunteers review these refactoring changes to the marking code used during evacuation pauses (both initial mark pauses and regular evacuation pauses when marking is active) - the change can be found at http://cr.openjdk.java.net/~johnc/7080389/webrev.0/.

The refactoring changes fix an issue that was seen with the code changes for 6486945. During an initial mark pause, during root scanning, one thread had successfully forwarded an object and had started to copy it. While the object was being copied to its new location, another thread saw that the object had been forwarded and, after checking that the new location was unmarked, successfully marked the new location. The first thread would finish the copying, see that the new location was marked and skip the mark.

The situation I ran into was that I was attempting to obtain the size of the new object just after it was marked (by the thread doing the marking) and the old object had not yet been fully copied to its new location. With these refactoring changes, the thread that successfully forwards an object in the collection set will mark the forwardee after copying - allowing me to safely obtain its size.

Testing: several runs of the GC test suite with a marking threshold of 10 and 20%, Kitchensink, and jprt.
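
The ordering described above (forward, copy, then mark by the same thread) can be sketched roughly as follows. This is only an illustrative model, not the HotSpot code: `Obj`, `evacuate_and_mark`, and the `marked` flag (standing in for a bit in the marking bitmap) are hypothetical names, and the sketch assumes the destination slot has already been allocated.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct Obj {
    std::atomic<Obj*> forwardee{nullptr};  // null until one thread claims the object
    std::atomic<bool> marked{false};       // stands in for a bit in the marking bitmap
    std::size_t size = 0;
    char payload[32] = {};
};

// Try to evacuate `from` into the pre-allocated slot `to` and return the
// forwardee that every thread must use from now on.
Obj* evacuate_and_mark(Obj* from, Obj* to) {
    Obj* expected = nullptr;
    if (from->forwardee.compare_exchange_strong(expected, to)) {
        // Winner of the forwarding race: copy first ...
        to->size = from->size;
        std::memcpy(to->payload, from->payload, sizeof(to->payload));
        // ... and mark only after the copy is complete, so that anyone who
        // observes the mark can safely read the forwardee's size.
        to->marked.store(true, std::memory_order_release);
        return to;
    }
    // Loser: the object is already forwarded. Do not mark here; the winner
    // marks the forwardee once its copy is finished.
    return expected;
}
```

In this model a thread that loses the race never touches the mark, so a set mark always implies a fully copied forwardee.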
Thanks,

JohnC

From tom.deneau at amd.com Thu Aug 18 22:40:26 2011
From: tom.deneau at amd.com (Deneau, Tom)
Date: Thu, 18 Aug 2011 17:40:26 -0500
Subject: Review Request: UseNUMAInterleaving
In-Reply-To: <247BA26129A14681B03D0856A6FAC69D@oracle.com>
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com>
Message-ID: <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com>

Igor --

For the linux/solaris support for UseNUMAInterleaving, you mentioned we can just call os::numa_make_global at the end of os::reserve_memory. Will this only have an effect for those collectors that do not explicitly already support UseNUMA? What would happen if UseNUMAInterleaving is used with a collector that does support UseNUMA? Which collectors do not explicitly support UseNUMA?

-- Tom

From igor.veresov at oracle.com Fri Aug 19 00:00:46 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 18 Aug 2011 17:00:46 -0700
Subject: Review Request: UseNUMAInterleaving
In-Reply-To: <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com>
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com>
Message-ID:

On Thursday, August 18, 2011 at 3:40 PM, Deneau, Tom wrote:
> Igor --
>
> For the linux/solaris support for UseNUMAInterleaving, you mentioned
> we can just call os::numa_make_global at the end of
> os::reserve_memory. Will this only have an effect for those
> collectors that do not explicitly already support UseNUMA? What
> would happen if UseNUMAInterleaving is used with a collector that
> does support UseNUMA? Which collectors do not explicitly support
> UseNUMA?

I don't think anything bad is going to happen the way numa_make_global() is currently implemented on unixes. NUMA-aware collectors will just override this page-allocation policy with something else. So I guess for now we could just turn on UseNUMAInterleaving whenever UseNUMA is specified (but not the other way around) regardless of the collector. We will probably have to change that later when we fully support NUMA on Windows, because numa_make_global() will be much more heavyweight there - you would have to deallocate a chunk and then reallocate it again; but for now let's keep it simple.

And, btw, I think you need to tap not only os::reserve_memory() but also os::attempt_reserve_memory_at(), and os::reserve_memory_special(). Basically everything that is used to allocate memory.
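
The flag relationship suggested here could look roughly like the sketch below. `NumaFlags` and `resolve_numa_flags` are hypothetical names used only for illustration; the real flag processing in HotSpot is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical model of the two command-line flags; not HotSpot code.
struct NumaFlags {
    bool use_numa;
    bool use_numa_interleaving;
};

// UseNUMA turns on UseNUMAInterleaving, but not the other way around.
NumaFlags resolve_numa_flags(NumaFlags requested) {
    if (requested.use_numa) {
        requested.use_numa_interleaving = true;
    }
    return requested;
}
```

Keeping the implication one-directional means that asking only for interleaving never enables the NUMA-aware allocator.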
igor

From igor.veresov at oracle.com Fri Aug 19 00:18:05 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 18 Aug 2011 17:18:05 -0700
Subject: Review Request: UseNUMAInterleaving
In-Reply-To:
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com>
Message-ID: <91928C974B07497184AF80B96F196606@oracle.com>

Actually no, tapping into reserve() functions won't work. I think we should put os::numa_make_global() in os::commit_memory() and os::reserve_memory_special(). The reason is that the mmap inside of commit_memory will override the allocation policy of the underlying segment allocated by reserve_memory(). It would still be OK to use os::commit_memory() to implement os::free_memory() on Linux because in the numa-aware allocator we change the policy afterwards anyway (see MutableNUMASpace::bias_region()), so no problem here.
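
A toy model of the reasoning above, assuming (as the message states) that the mapping made at commit time replaces whatever policy was set on the reserved range. `Region`, `Policy`, `reserve`, and `commit` are hypothetical names, and no real OS calls are made.

```cpp
#include <cassert>

enum class Policy { Default, Interleaved };

// Hypothetical model of a reserved address range; not HotSpot code.
struct Region {
    bool committed = false;
    Policy policy = Policy::Default;
};

// Reserve a range, optionally setting an interleaving policy on it.
Region reserve(bool interleave_on_reserve) {
    Region r;
    if (interleave_on_reserve) r.policy = Policy::Interleaved;
    return r;
}

// Commit the range. Assumption taken from the message: the fresh mapping
// created here discards the policy of the underlying reserved segment,
// so interleaving has to be applied after the mapping is made.
void commit(Region& r, bool interleave_on_commit) {
    r.committed = true;
    r.policy = Policy::Default;
    if (interleave_on_commit) r.policy = Policy::Interleaved;
}
```

Under that assumption, a policy set only at reserve time is lost by the time the memory is usable, which is why the call is moved to the commit path.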
igor

From john.coomes at oracle.com Fri Aug 19 01:24:16 2011
From: john.coomes at oracle.com (john.coomes at oracle.com)
Date: Fri, 19 Aug 2011 01:24:16 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 5 new changesets
Message-ID: <20110819012428.CF13747D53@hg.openjdk.java.net>

Changeset: 3be7439273c5
Author: katleman
Date: 2011-05-25 13:31 -0700
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/3be7439273c5

7044486: open jdk repos have files with incorrect copyright headers, which can end up in src bundles
Reviewed-by: ohair, trims

! agent/src/share/classes/sun/jvm/hotspot/runtime/ServiceThread.java
! make/linux/README
! make/windows/projectfiles/kernel/Makefile
! src/cpu/x86/vm/vm_version_x86.cpp
! src/cpu/x86/vm/vm_version_x86.hpp
! src/os_cpu/solaris_sparc/vm/solaris_sparc.s
! src/share/tools/hsdis/README
! src/share/vm/gc_implementation/g1/heapRegionSet.inline.hpp
! src/share/vm/gc_implementation/parNew/parCardTableModRefBS.cpp
! src/share/vm/utilities/yieldingWorkgroup.cpp

Changeset: 8b135e6129d6
Author: jeff
Date: 2011-05-27 15:01 -0700
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/8b135e6129d6

7045697: JDK7 THIRD PARTY README update
Reviewed-by: lana

!
THIRD_PARTY_README Changeset: 52e4ba46751f Author: kamg Date: 2011-04-12 16:42 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/52e4ba46751f 7020373: JSR rewriting can overflow memory address size variables Summary: Abort if incoming classfile's parameters would cause overflows Reviewed-by: coleenp, dcubed, never ! src/share/vm/oops/generateOopMap.cpp + test/runtime/7020373/Test7020373.sh Changeset: bca686989d4b Author: asaha Date: 2011-06-15 14:59 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/bca686989d4b 7055247: Ignore test of # 7020373 Reviewed-by: dcubed ! test/runtime/7020373/Test7020373.sh Changeset: 337ffef74c37 Author: jeff Date: 2011-06-22 10:10 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/337ffef74c37 7057046: Add embedded license to THIRD PARTY README Reviewed-by: lana ! THIRD_PARTY_README From bengt.rutisson at oracle.com Fri Aug 19 07:47:51 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 19 Aug 2011 09:47:51 +0200 Subject: Request for review (s): 6814390 G1: remove the concept of non-generational G1 In-Reply-To: <4E4D4705.8090003@oracle.com> References: <4E4AB10C.8050605@oracle.com> <4E4C0128.5090007@oracle.com> <4E4CBF0C.9030406@oracle.com> <4E4CC8E4.4000403@oracle.com> <4E4CD0DC.7030101@oracle.com> <4E4D43DD.2040307@oracle.com> <4E4D4705.8090003@oracle.com> Message-ID: <4E4E1527.3050200@oracle.com> Tony, Thanks for the review! Nice that you found even more code to delete! I did the changes you suggested. Here is an updated webrev: http://cr.openjdk.java.net/~brutisso/6814390c/webrev.04/ Thanks, Bengt On 2011-08-18 19:08, Tony Printezis wrote: > Bengt, > > Could I ask a quick favor for an additional (embarrassing) > refactoring? 
Given your changes I don't think the following is needed: > > --- a/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp > +++ b/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp > @@ -2875,13 +2875,6 @@ > // We are doing young collections so reset this. > non_young_start_time_sec = young_end_time_sec; > > - // Note we can use either _collection_set_size or > - // _young_cset_length here > - if (_collection_set_size > 0 && _last_young_gc_full) { > - // don't bother adding more regions... > - goto choose_collection_set_end; > - } > - > if (!full_young_gcs()) { > bool should_continue = true; > NumberSeq seq; > @@ -2915,7 +2908,6 @@ > _should_revert_to_full_young_gcs = true; > } > > -choose_collection_set_end: > stop_incremental_cset_building(); > > > Tony > > On 08/18/2011 12:54 PM, Tony Printezis wrote: >> Bengt, >> >> Many thanks for doing this cleanup! The changes look good, but it >> turns out you can remove a few more things: >> >> g1CollectorPolicy.hpp: >> >> double _mark_init_start_sec; >> TruncatedSeq* _concurrent_mark_init_times_ms; >> double predict_init_time_ms() { >> return get_new_prediction(_concurrent_mark_init_times_ms); >> } >> >> and their uses. >> >> Can you also maybe rename the following (the "pre" suffix does not >> make sense any more)? >> >> G1CollectorPolicy::record_concurrent_mark_init_end_pre() >> >> to >> >> G1CollectorPolicy::record_concurrent_mark_init_end() >> >> Thanks, >> >> Tony >> >> On 08/18/2011 04:44 AM, Bengt Rutisson wrote: >>> >>> Hi Ramki, >>> >>> On 2011-08-18 10:10, Y. Srinivas Ramakrishna wrote: >>>> Hi Bengt -- that should also make the method >>>> checkpointRootsInitial() dead, and >>>> thus also record_concurrent_mark_init_start() [The clue is the >>>> guarantee >>>> you deleted in the latter method at line 940 which would be >>>> violated in the case of G1Gen >>>> because it would be tautologically false now.] 
I think the dead >>>> code contagion >>>> does stop there, as far as i could tell from a browse of the code. >>>> Perhaps a good >>>> IDE may find you a few more dead methods, who knows... >>> >>> Wow! This change really propagates. Thanks for catching this. I >>> found a couple of more methods to delete and one class. Now I hope I >>> got all of the dead code that this change generated. >>> >>> This is what I deleted this time around: >>> >>> G1CollectorPolicy::record_concurrent_mark_init_start() >>> G1CollectorPolicy::record_concurrent_mark_init_end() >>> ConcurrentMark::checkpointRootsInitial() >>> G1CollectedHeap::do_sync_mark() >>> class CMMarkRootsClosure >>> >>> Here is an updated webrev: >>> http://cr.openjdk.java.net/~brutisso/6814390c/webrev.03/ >>> >>> Thanks, >>> Bengt >>> >>>> >>>> Rest looks good to me. >>>> -- ramki >>>> >>>> On 8/18/2011 12:28 AM, Bengt Rutisson wrote: >>>>> >>>>> Hi John, >>>>> >>>>> Thanks for the review! >>>>> >>>>>> Looks good to me. You may want to consider removing the >>>>>> CMCheckpointRootsInitialClosure - it's only instantiated in the >>>>>> code you removed from concurrentMarkThread.cpp. >>>>> >>>>> Good point. I removed the CMCheckpointRootsInitialClosure class. >>>>> Here is an updated webrev: >>>>> >>>>> http://cr.openjdk.java.net/~brutisso/6814390c/webrev.02/ >>>>> >>>>> Bengt >>>>> >>>>>> >>>>>> JohnC >>>>>> >>>>>> On 08/16/11 11:03, Bengt Rutisson wrote: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Could I have a couple of reviews for this change? >>>>>>> >>>>>>> Background: >>>>>>> G1 was originally designed to be able to run in a >>>>>>> non-generational mode. This has not been used for a long time >>>>>>> and no testing has been done. Thus, the code has bit rotted. We >>>>>>> don't see the need for this feature anymore, so rather than >>>>>>> fixing it we should remove it. 
>>>>>>> >>>>>>> This is actually a low priority item, but since it will make my >>>>>>> next step (supporting young space sizing better) simpler I would >>>>>>> like to get this CR out of the way. >>>>>>> >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~brutisso/6814390c/webrev.01/ >>>>>>> >>>>>>> CR: >>>>>>> 6814390 G1: remove the concept of non-generational G1 >>>>>>> http://monaco.us.oracle.com/detail.jsf?cr=6814390 >>>>>>> >>>>>>> Thanks, >>>>>>> Bengt >>>>>>> >>>>>> >>>>> >>>> >>> From bluedavy at gmail.com Fri Aug 19 08:57:10 2011 From: bluedavy at gmail.com (BlueDavy Lin) Date: Fri, 19 Aug 2011 16:57:10 +0800 Subject: why this young gc is so slow? Message-ID: hi! CPU: 16 core, Physical Memory: 24G JDK: 6u23 || 6u21 startup options: -Xmn1g -Xms4g -Xmx4g some gc log: 2011-08-19T16:55:27.759+0800: 92.888: [GC [PSYoungGen: 1022368K->22752K(1024320K)] 1073312K->79200K(4170048K), 0.2117400 secs] [Times: user=2.80 sys=0.00, real=0.21 secs] 2011-08-19T16:55:29.377+0800: 94.507: [GC [PSYoungGen: 1022688K->22848K(1024192K)] 1079136K->82896K(4169920K), 0.1858340 secs] [Times: user=2.37 sys=0.01, real=0.18 secs] 2011-08-19T16:55:30.999+0800: 96.128: [GC [PSYoungGen: 1022784K->22784K(1024448K)] 1082832K->84752K(4170176K), 0.1648320 secs] [Times: user=1.98 sys=0.00, real=0.16 secs] 2011-08-19T16:55:32.576+0800: 97.706: [GC [PSYoungGen: 1023040K->22816K(1024384K)] 1085008K->89416K(4170112K), 0.1979820 secs] [Times: user=2.56 sys=0.00, real=0.19 secs] 2011-08-19T16:55:34.194+0800: 99.323: [GC [PSYoungGen: 1023072K->22784K(1024640K)] 1089672K->92208K(4170368K), 0.1748660 secs] [Times: user=2.18 sys=0.01, real=0.18 secs] 2011-08-19T16:55:35.779+0800: 100.909: [GC [PSYoungGen: 1023360K->22784K(1024512K)] 1092784K->97792K(4170240K), 0.2129800 secs] [Times: user=2.81 sys=0.00, real=0.21 secs] 2011-08-19T16:55:37.437+0800: 102.567: [GC [PSYoungGen: 1023360K->22752K(1024768K)] 1098368K->101528K(4170496K), 0.1880140 secs] [Times: user=2.40 sys=0.00, real=0.18 secs] 
2011-08-19T16:55:39.040+0800: 104.169: [GC [PSYoungGen: 1023648K->22784K(1024704K)] 1102424K->103688K(4170432K), 0.1647610 secs] [Times: user=2.01 sys=0.01, real=0.16 secs] 2011-08-19T16:55:40.605+0800: 105.735: [GC [PSYoungGen: 1023680K->22816K(1024896K)] 1104584K->108600K(4170624K), 0.2030000 secs] [Times: user=2.65 sys=0.00, real=0.21 secs] I think for ps gc,the live objects is the key factor,but in this case,live objects are not so much,why gc time so slow? I think it should be below 20ms,can someone help me for this? -- ============================= | BlueDavy | | http://www.bluedavy.com | ============================= From igor.veresov at oracle.com Fri Aug 19 10:01:52 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 19 Aug 2011 03:01:52 -0700 Subject: why this young gc is so slow? In-Reply-To: References: Message-ID: <16DE4025B07540FFAADD1934D4921C9F@oracle.com> Actually it's not only the amount of live data that matters (that's true, but only for the copying part) but also the number of edges in the object graph that have to be traversed. In your example it could be the case that the old gen contains a huge data structure (say an object array) that contains references to objects in the young gen. It could have references to the same object, doesn't matter, what would matter is their number. So, during a young GC we would have to scan all those references, which could take time. You could run with -XX:+PrintGCTaskTimeStamps and it should basically show you where the time goes. igor On Friday, August 19, 2011 at 1:57 AM, BlueDavy Lin wrote: > hi!
> > CPU: 16 core, Physical Memory: 24G JDK: 6u23 || 6u21 > startup options: -Xmn1g -Xms4g -Xmx4g > > some gc log: > 2011-08-19T16:55:27.759+0800: 92.888: [GC [PSYoungGen: > 1022368K->22752K(1024320K)] 1073312K->79200K(4170048K), 0.2117400 > secs] [Times: user=2.80 sys=0.00, real=0.21 secs] > 2011-08-19T16:55:29.377+0800: 94.507: [GC [PSYoungGen: > 1022688K->22848K(1024192K)] 1079136K->82896K(4169920K), 0.1858340 > secs] [Times: user=2.37 sys=0.01, real=0.18 secs] > 2011-08-19T16:55:30.999+0800: 96.128: [GC [PSYoungGen: > 1022784K->22784K(1024448K)] 1082832K->84752K(4170176K), 0.1648320 > secs] [Times: user=1.98 sys=0.00, real=0.16 secs] > 2011-08-19T16:55:32.576+0800: 97.706: [GC [PSYoungGen: > 1023040K->22816K(1024384K)] 1085008K->89416K(4170112K), 0.1979820 > secs] [Times: user=2.56 sys=0.00, real=0.19 secs] > 2011-08-19T16:55:34.194+0800: 99.323: [GC [PSYoungGen: > 1023072K->22784K(1024640K)] 1089672K->92208K(4170368K), 0.1748660 > secs] [Times: user=2.18 sys=0.01, real=0.18 secs] > 2011-08-19T16:55:35.779+0800: 100.909: [GC [PSYoungGen: > 1023360K->22784K(1024512K)] 1092784K->97792K(4170240K), 0.2129800 > secs] [Times: user=2.81 sys=0.00, real=0.21 secs] > 2011-08-19T16:55:37.437+0800: 102.567: [GC [PSYoungGen: > 1023360K->22752K(1024768K)] 1098368K->101528K(4170496K), 0.1880140 > secs] [Times: user=2.40 sys=0.00, real=0.18 secs] > 2011-08-19T16:55:39.040+0800: 104.169: [GC [PSYoungGen: > 1023648K->22784K(1024704K)] 1102424K->103688K(4170432K), 0.1647610 > secs] [Times: user=2.01 sys=0.01, real=0.16 secs] > 2011-08-19T16:55:40.605+0800: 105.735: [GC [PSYoungGen: > 1023680K->22816K(1024896K)] 1104584K->108600K(4170624K), 0.2030000 > secs] [Times: user=2.65 sys=0.00, real=0.21 secs] > > I think for ps gc,the live objects is the key factor,but in > this case,live objects are not so much,why gc time so slow? I think > it should be below 20ms,can someone help me for this? 
> > -- > ============================= > | BlueDavy | > | http://www.bluedavy.com | > ============================= From kirk at kodewerk.com Fri Aug 19 11:12:33 2011 From: kirk at kodewerk.com (Charles K Pepperdine) Date: Fri, 19 Aug 2011 13:12:33 +0200 Subject: why this young gc is so slow? In-Reply-To: References: Message-ID: <71F65D30-3FF3-42EE-9D4C-3B6F74E96488@kodewerk.com> Can you pass along the entire log? Regards, Kirk Pepperdine On Aug 19, 2011, at 10:57 AM, BlueDavy Lin wrote: > hi! > > CPU: 16 core, Physical Memory: 24G JDK: 6u23 || 6u21 > startup options: -Xmn1g -Xms4g -Xmx4g > > some gc log: > 2011-08-19T16:55:27.759+0800: 92.888: [GC [PSYoungGen: > 1022368K->22752K(1024320K)] 1073312K->79200K(4170048K), 0.2117400 > secs] [Times: user=2.80 sys=0.00, real=0.21 secs] > 2011-08-19T16:55:29.377+0800: 94.507: [GC [PSYoungGen: > 1022688K->22848K(1024192K)] 1079136K->82896K(4169920K), 0.1858340 > secs] [Times: user=2.37 sys=0.01, real=0.18 secs] > 2011-08-19T16:55:30.999+0800: 96.128: [GC [PSYoungGen: > 1022784K->22784K(1024448K)] 1082832K->84752K(4170176K), 0.1648320 > secs] [Times: user=1.98 sys=0.00, real=0.16 secs] > 2011-08-19T16:55:32.576+0800: 97.706: [GC [PSYoungGen: > 1023040K->22816K(1024384K)] 1085008K->89416K(4170112K), 0.1979820 > secs] [Times: user=2.56 sys=0.00, real=0.19 secs] > 2011-08-19T16:55:34.194+0800: 99.323: [GC [PSYoungGen: > 1023072K->22784K(1024640K)] 1089672K->92208K(4170368K), 0.1748660 > secs] [Times: user=2.18 sys=0.01, real=0.18 secs] > 2011-08-19T16:55:35.779+0800: 100.909: [GC [PSYoungGen: > 1023360K->22784K(1024512K)] 1092784K->97792K(4170240K), 0.2129800 > secs] [Times: user=2.81 sys=0.00, real=0.21 secs] > 2011-08-19T16:55:37.437+0800: 102.567: [GC [PSYoungGen: > 1023360K->22752K(1024768K)] 1098368K->101528K(4170496K), 0.1880140 > secs] [Times: user=2.40 sys=0.00, real=0.18 secs] > 2011-08-19T16:55:39.040+0800: 104.169: [GC [PSYoungGen: > 1023648K->22784K(1024704K)] 1102424K->103688K(4170432K), 0.1647610 > secs] [Times: 
user=2.01 sys=0.01, real=0.16 secs] > 2011-08-19T16:55:40.605+0800: 105.735: [GC [PSYoungGen: > 1023680K->22816K(1024896K)] 1104584K->108600K(4170624K), 0.2030000 > secs] [Times: user=2.65 sys=0.00, real=0.21 secs] > > I think for ps gc,the live objects is the key factor,but in > this case,live objects are not so much,why gc time so slow? I think > it should be below 20ms,can someone help me for this? > > -- > ============================= > | BlueDavy | > | http://www.bluedavy.com | > ============================= From bengt.rutisson at oracle.com Fri Aug 19 13:01:13 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 19 Aug 2011 15:01:13 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4D5757.7040004@oracle.com> References: <4E4D5757.7040004@oracle.com> Message-ID: <4E4E5E99.3010303@oracle.com> Hi John, Looks good to me. A couple of minor comments: g1CollectedHeap.cpp: In the do_oop_work method (lines 4324-4362): * To show the limited scope of the should_mark variable I would like to move the declaration of should_mark to just before it is used at line 4349. * At the two places in the do_oop_work method where you use the do_mark_forwardee parameter you have comments saying that it is an initial mark closure. This is correct, so I think that I would actually prefer that the parameter was called something with initial mark rather than having to have the comments. * On the dead code subject: this closure seems to be unused (in g1CollectedHeap.cpp): "G1ParScanAndMarkHeapRSClosure scan_mark_heap_rs_cl(_g1h, &pss);" g1OopClosures.hpp: Just a nitpick: You removed a line break at row 114 and added a line break at row 122-123. Since you didn't change anything else on these lines it would make the diff easier to view if you left those changes out.
Bengt On 2011-08-18 20:17, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers review these refactoring changes to > the marking code used during evacuation pauses (both initial mark > pauses and regular evacuation pauses when marking is active) - the > change can be found at > http://cr.openjdk.java.net/~johnc/7080389/webrev.0/. > > The refactoring changes fix an issue that was seen with the code > changes for 6486945. > > During an initial mark pause, during root scanning, one thread had > successfully forwarded an object and had started to copy it. While the > object was being copied to its new location, another thread saw that > the object had been forwarded and, after checking that the new > location was unmarked, successfully marked the new location. The first > thread would finish the copying, see that the new location was marked > and skip the mark. The situation I ran into was that I was attempting > to obtain the size of the new object just after it was marked (by the > thread doing the marking) and the old object had not yet been fully > copied to its new location. > > With these refactoring changes, the thread that successfully forwards > an object in the collection set will mark the forwardee after copying > - allowing me to safely obtain it's size. > > Testing: several runs of the GC test suite with a marking threshold of > 10 and 20%, Kitchensink, and jprt. 
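The ordering described above (forward the object, finish the copy, and only then let the forwarding thread mark the forwardee) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the HotSpot code: Obj, evacuate and size_if_marked are hypothetical names, the marked flag stands in for a bit in the marking bitmap, and plain new/delete stands in for to-space allocation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a heap object; not HotSpot's oopDesc.
struct Obj {
    std::atomic<Obj*> forwardee{nullptr}; // installed once by the winning thread
    std::atomic<bool> marked{false};      // stand-in for a marking bitmap bit
    std::size_t size = 0;                 // only valid in the copy once copying is done
    char payload[32] = {};
};

// Evacuate `from` and return the single copy that all callers agree on.
// Only the caller that wins the forwarding race copies and (optionally) marks.
Obj* evacuate(Obj* from, bool should_mark) {
    assert(from != nullptr);
    Obj* expected = nullptr;
    Obj* copy = new Obj(); // speculative allocation (assumption: stands in for to-space)
    if (from->forwardee.compare_exchange_strong(expected, copy,
                                                std::memory_order_acq_rel)) {
        // We won. The forwarding pointer is already visible to other callers,
        // but the copy is still incomplete at this point.
        std::memcpy(copy->payload, from->payload, sizeof(copy->payload));
        copy->size = from->size;
        if (should_mark) {
            // Mark only after the copy; the release store publishes the copy.
            copy->marked.store(true, std::memory_order_release);
        }
        return copy;
    }
    // We lost. Discard the speculative copy and do NOT mark the forwardee;
    // marking is left to the winner, so a mark implies a finished copy.
    delete copy;
    return expected; // compare_exchange_strong stored the winner's pointer here
}

// Reads the size only if the mark is visible; returns 0 for an unmarked object.
std::size_t size_if_marked(const Obj* to) {
    if (to->marked.load(std::memory_order_acquire)) {
        return to->size; // safe: the mark was set after the copy completed
    }
    return 0;
}
```

With this ordering a reader that sees the mark can read the size safely, which is the property the refactoring is after. In the older scheme a thread that lost the forwarding race could mark the new location while the copy was still in flight.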
> > Thanks, > > JohnC > > From tony.printezis at oracle.com Fri Aug 19 13:48:38 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 09:48:38 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E5E99.3010303@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> Message-ID: <4E4E69B6.2090408@oracle.com> Bengt, On 08/19/2011 09:01 AM, Bengt Rutisson wrote: > > * At the two places in the do_oop_work method where you use the > do_mark_forwardee paramter you have comments saying that it is an > initial mark closure. This is correct, so I think that I would > actually prefer that the parameter was called something with inital > mark rather than having to have the comments. Can I defend the name of the parameter? It is true that do_mark_forwardee is only true during an initial mark pause. But, during an initial mark pause only some of the specializations of the closure will have do_mark_forwardee set to true (the ones that scan roots). So, whether the do_mark_forwardee is true does not only depend on what kind of pause it is but also what kind of closure. So, IMHO, naming it "during initial mark" or something like that would actually be misleading. Tony From stefan.karlsson at oracle.com Fri Aug 19 14:01:39 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 Aug 2011 16:01:39 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E69B6.2090408@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> Message-ID: <4E4E6CC3.5060004@oracle.com> Tony, On 08/19/2011 03:48 PM, Tony Printezis wrote: > Bengt, > > On 08/19/2011 09:01 AM, Bengt Rutisson wrote: >> >> * At the two places in the do_oop_work method where you use the >> do_mark_forwardee paramter you have comments saying that it is an >> initial mark closure. 
This is correct, so I think that I would >> actually prefer that the parameter was called something with initial >> mark rather than having to have the comments. > > Can I defend the name of the parameter? It is true that > do_mark_forwardee is only true during an initial mark pause. But, > during an initial mark pause only some of the specializations of the > closure will have do_mark_forwardee set to true (the ones that scan > roots). So, whether the do_mark_forwardee is true does not only depend > on what kind of pause it is but also what kind of closure. So, IMHO, > naming it "during initial mark" or something like that would actually > be misleading. But that's not what the comment says: 4339 // Need to mark the copied object if we're an initial 4340 // mark closure, or the object is already marked and 4341 // we need to preserve the mark. 4342 bool should_mark = do_mark_forwardee || 4343 (_g1->mark_in_progress() && !_g1->is_obj_ill(obj)); and 4357 // Object is not in collection set - if we're an initial mark 4358 // closure then mark the object. 4359 if (do_mark_forwardee) { StefanK > > Tony From tony.printezis at oracle.com Fri Aug 19 14:19:19 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 10:19:19 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E6CC3.5060004@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> Message-ID: <4E4E70E7.30309@oracle.com> Stefan, OK, good point. Maybe John can change the comment to something like "Need to mark the copied object if we're a root scanning closure during initial mark, ....". Would this address your concern?
Tony On 08/19/2011 10:01 AM, Stefan Karlsson wrote: > But that's not what the comment says: > > 4339 // Need to mark the copied object if we're an initial > 4340 // mark closure, or the object is already marked and > 4341 // we need to preserve the mark. > 4342 bool should_mark = do_mark_forwardee || > 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); > > and > > 4357 // Object is not in collection set - if we're an initial mark > 4358 // closure then mark the object. > 4359 if (do_mark_forwardee) { > > StefanK > >> >> Tony > From stefan.karlsson at oracle.com Fri Aug 19 14:35:32 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 Aug 2011 16:35:32 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E70E7.30309@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> Message-ID: <4E4E74B4.1020101@oracle.com> Tony, On 08/19/2011 04:19 PM, Tony Printezis wrote: > Stefan, > > OK, good point. Maybe John can change the comment to something like > "Need to mark the copied object if we're root scanning closure during > initial mark, ....". Would this address your concern? I just don't see the reason for giving the parameter such a "generic" name and then having comments about initial marking root scan in the code whenever the parameter is used. StefanK > > Tony > > On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >> But that's not what the comment says: >> >> 4339 // Need to mark the copied object if we're an initial >> 4340 // mark closure, or the object is already marked and >> 4341 // we need to preserve the mark. >> 4342 bool should_mark = do_mark_forwardee || >> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >> >> and >> >> 4357 // Object is not in collection set - if we're an initial mark >> 4358 // closure then mark the object. 
>> 4359 if (do_mark_forwardee) { >> >> StefanK >> >>> >>> Tony >> From tony.printezis at oracle.com Fri Aug 19 14:39:15 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 10:39:15 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E74B4.1020101@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> Message-ID: <4E4E7593.9090107@oracle.com> You think "do_mark_forwardee" is generic?!?!?! It's very descriptive on what it does. Tony On 08/19/2011 10:35 AM, Stefan Karlsson wrote: > Tony, > > On 08/19/2011 04:19 PM, Tony Printezis wrote: >> Stefan, >> >> OK, good point. Maybe John can change the comment to something like >> "Need to mark the copied object if we're root scanning closure during >> initial mark, ....". Would this address your concern? > > I just don't see the reason for giving the parameter such a "generic" > name and then having comments about initial marking root scan in the > code whenever the parameter is used. > > StefanK > >> >> Tony >> >> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>> But that's not what the comment says: >>> >>> 4339 // Need to mark the copied object if we're an initial >>> 4340 // mark closure, or the object is already marked and >>> 4341 // we need to preserve the mark. >>> 4342 bool should_mark = do_mark_forwardee || >>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>> >>> and >>> >>> 4357 // Object is not in collection set - if we're an initial mark >>> 4358 // closure then mark the object. 
>>> 4359 if (do_mark_forwardee) { >>> >>> StefanK >>> >>>> >>>> Tony >>> > From stefan.karlsson at oracle.com Fri Aug 19 14:43:15 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 Aug 2011 16:43:15 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E7593.9090107@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> Message-ID: <4E4E7683.8040703@oracle.com> On 08/19/2011 04:39 PM, Tony Printezis wrote: > You think "do_mark_forwardee" is generic?!?!?! It's very descriptive > on what it does. Well, it's more generic than something like do_mark_forwardee_during_initial_mark_root_scanning, which at least to me, the comments imply. Anyways, you don't have to listen to my arguments if you don't want to. StefanK > > Tony > > On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >> Tony, >> >> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>> Stefan, >>> >>> OK, good point. Maybe John can change the comment to something like >>> "Need to mark the copied object if we're root scanning closure >>> during initial mark, ....". Would this address your concern? >> >> I just don't see the reason for giving the parameter such a "generic" >> name and then having comments about initial marking root scan in the >> code whenever the parameter is used. >> >> StefanK >> >>> >>> Tony >>> >>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>> But that's not what the comment says: >>>> >>>> 4339 // Need to mark the copied object if we're an initial >>>> 4340 // mark closure, or the object is already marked and >>>> 4341 // we need to preserve the mark. 
>>>> 4342 bool should_mark = do_mark_forwardee || >>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>> >>>> and >>>> >>>> 4357 // Object is not in collection set - if we're an initial mark >>>> 4358 // closure then mark the object. >>>> 4359 if (do_mark_forwardee) { >>>> >>>> StefanK >>>> >>>>> >>>>> Tony >>>> >> From tony.printezis at oracle.com Fri Aug 19 15:24:30 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 11:24:30 -0400 Subject: CRR (XS): 7081064: G1: remove develop params G1FixedSurvivorSpaceSize, G1FixedTenuringThreshold, and G1FixedEdenSize Message-ID: <4E4E802E.3000602@oracle.com> Hi all, Could I have a couple of reviews for this simple change to remove three non-product parameters we have not been using? http://cr.openjdk.java.net/~tonyp/7081064/webrev.0/ Thanks, Tony From tony.printezis at oracle.com Fri Aug 19 15:36:47 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 11:36:47 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E7683.8040703@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> Message-ID: <4E4E830F.3090806@oracle.com> Of course I want to listen to what you have to say... I understand your argument but I hope you can also see where I'm coming from. It's good to name the parameter with the action that it does and set it when it's appropriate (in this case for some closures during initial mark). Would you like better if John changes the comment to something like "we've been asked to mark the objects" and make sure the place where do_mark_forwardee is set to true describes why it is? 
Tony On 08/19/2011 10:43 AM, Stefan Karlsson wrote: > On 08/19/2011 04:39 PM, Tony Printezis wrote: >> You think "do_mark_forwardee" is generic?!?!?! It's very descriptive >> on what it does. > > Well, it's more generic than something like > do_mark_forwardee_during_initial_mark_root_scanning, which at least to > me, the comments imply. Anyways, you don't have to listen to my > arguments if you don't want to. > > StefanK > >> >> Tony >> >> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>> Tony, >>> >>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>> Stefan, >>>> >>>> OK, good point. Maybe John can change the comment to something like >>>> "Need to mark the copied object if we're root scanning closure >>>> during initial mark, ....". Would this address your concern? >>> >>> I just don't see the reason for giving the parameter such a >>> "generic" name and then having comments about initial marking root >>> scan in the code whenever the parameter is used. >>> >>> StefanK >>> >>>> >>>> Tony >>>> >>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>> But that's not what the comment says: >>>>> >>>>> 4339 // Need to mark the copied object if we're an initial >>>>> 4340 // mark closure, or the object is already marked and >>>>> 4341 // we need to preserve the mark. >>>>> 4342 bool should_mark = do_mark_forwardee || >>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>> >>>>> and >>>>> >>>>> 4357 // Object is not in collection set - if we're an initial >>>>> mark >>>>> 4358 // closure then mark the object. 
>>>>> 4359 if (do_mark_forwardee) { >>>>> >>>>> StefanK >>>>> >>>>>> >>>>>> Tony >>>>> >>> > From stefan.karlsson at oracle.com Fri Aug 19 15:58:22 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 Aug 2011 17:58:22 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E830F.3090806@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> <4E4E830F.3090806@oracle.com> Message-ID: <4E4E881E.4060704@oracle.com> On 08/19/2011 05:36 PM, Tony Printezis wrote: > Of course I want to listen to what you have to say... > > I understand your argument but I hope you can also see where I'm > coming from. It's good to name the parameter with the action that it > does and set it when it's appropriate (in this case for some closures > during initial mark). Yes, I like that part. > Would you like better if John changes the comment to something like > "we've been asked to mark the objects" and make sure the place where > do_mark_forwardee is set to true describes why it is? Yes. I like the parameter name. The thing I'm having a problem with is that the parameter name is "generic" in that sense that it doesn't convey who's setting it to true. But then we have the comment that goes on and reveals that information anyway. StefanK > > Tony > > On 08/19/2011 10:43 AM, Stefan Karlsson wrote: >> On 08/19/2011 04:39 PM, Tony Printezis wrote: >>> You think "do_mark_forwardee" is generic?!?!?! It's very descriptive >>> on what it does. >> >> Well, it's more generic than something like >> do_mark_forwardee_during_initial_mark_root_scanning, which at least >> to me, the comments imply. Anyways, you don't have to listen to my >> arguments if you don't want to. 
>> >> StefanK >> >>> >>> Tony >>> >>> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>>> Tony, >>>> >>>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>>> Stefan, >>>>> >>>>> OK, good point. Maybe John can change the comment to something >>>>> like "Need to mark the copied object if we're root scanning >>>>> closure during initial mark, ....". Would this address your concern? >>>> >>>> I just don't see the reason for giving the parameter such a >>>> "generic" name and then having comments about initial marking root >>>> scan in the code whenever the parameter is used. >>>> >>>> StefanK >>>> >>>>> >>>>> Tony >>>>> >>>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>>> But that's not what the comment says: >>>>>> >>>>>> 4339 // Need to mark the copied object if we're an initial >>>>>> 4340 // mark closure, or the object is already marked and >>>>>> 4341 // we need to preserve the mark. >>>>>> 4342 bool should_mark = do_mark_forwardee || >>>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>>> >>>>>> and >>>>>> >>>>>> 4357 // Object is not in collection set - if we're an initial >>>>>> mark >>>>>> 4358 // closure then mark the object. 
>>>>>> 4359 if (do_mark_forwardee) { >>>>>> >>>>>> StefanK >>>>>> >>>>>>> >>>>>>> Tony >>>>>> >>>> >> From tony.printezis at oracle.com Fri Aug 19 16:23:28 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 12:23:28 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E881E.4060704@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> <4E4E830F.3090806@oracle.com> <4E4E881E.4060704@oracle.com> Message-ID: <4E4E8E00.7070208@oracle.com> On 08/19/2011 11:58 AM, Stefan Karlsson wrote: >> Would you like better if John changes the comment to something like >> "we've been asked to mark the objects" and make sure the place where >> do_mark_forwardee is set to true describes why it is? > > Yes. > > I like the parameter name. The thing I'm having a problem with is that > the parameter name is "generic" in that sense that it doesn't convey > who's setting it to true. Whoever wants to make sure every object visited by the closure is marked. Tony > But then we have the comment that goes on and reveals that information > anyway. > > StefanK > >> >> Tony >> >> On 08/19/2011 10:43 AM, Stefan Karlsson wrote: >>> On 08/19/2011 04:39 PM, Tony Printezis wrote: >>>> You think "do_mark_forwardee" is generic?!?!?! It's very >>>> descriptive on what it does. >>> >>> Well, it's more generic than something like >>> do_mark_forwardee_during_initial_mark_root_scanning, which at least >>> to me, the comments imply. Anyways, you don't have to listen to my >>> arguments if you don't want to. >>> >>> StefanK >>> >>>> >>>> Tony >>>> >>>> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>>>> Tony, >>>>> >>>>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>>>> Stefan, >>>>>> >>>>>> OK, good point. 
Maybe John can change the comment to something >>>>>> like "Need to mark the copied object if we're root scanning >>>>>> closure during initial mark, ....". Would this address your concern? >>>>> >>>>> I just don't see the reason for giving the parameter such a >>>>> "generic" name and then having comments about initial marking root >>>>> scan in the code whenever the parameter is used. >>>>> >>>>> StefanK >>>>> >>>>>> >>>>>> Tony >>>>>> >>>>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>>>> But that's not what the comment says: >>>>>>> >>>>>>> 4339 // Need to mark the copied object if we're an initial >>>>>>> 4340 // mark closure, or the object is already marked and >>>>>>> 4341 // we need to preserve the mark. >>>>>>> 4342 bool should_mark = do_mark_forwardee || >>>>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>>>> >>>>>>> and >>>>>>> >>>>>>> 4357 // Object is not in collection set - if we're an >>>>>>> initial mark >>>>>>> 4358 // closure then mark the object. 
>>>>>>> 4359 if (do_mark_forwardee) { >>>>>>> >>>>>>> StefanK >>>>>>> >>>>>>>> >>>>>>>> Tony >>>>>>> >>>>> >>> > From stefan.karlsson at oracle.com Fri Aug 19 16:44:53 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 19 Aug 2011 18:44:53 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E8E00.7070208@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> <4E4E830F.3090806@oracle.com> <4E4E881E.4060704@oracle.com> <4E4E8E00.7070208@oracle.com> Message-ID: <4E4E9305.7010906@oracle.com> On 2011-08-19 18:23, Tony Printezis wrote: > > > On 08/19/2011 11:58 AM, Stefan Karlsson wrote: >>> Would you like better if John changes the comment to something like >>> "we've been asked to mark the objects" and make sure the place where >>> do_mark_forwardee is set to true describes why it is? >> >> Yes. >> >> I like the parameter name. The thing I'm having a problem with is >> that the parameter name is "generic" in that sense that it doesn't >> convey who's setting it to true. > > Whoever wants to make sure every object visited by the closure is marked. I see that you're misunderstanding what I'm trying to say. I like the parameter name, but I don't like the comments. StefanK > > Tony > >> But then we have the comment that goes on and reveals that >> information anyway. >> >> StefanK >> >>> >>> Tony >>> >>> On 08/19/2011 10:43 AM, Stefan Karlsson wrote: >>>> On 08/19/2011 04:39 PM, Tony Printezis wrote: >>>>> You think "do_mark_forwardee" is generic?!?!?! It's very >>>>> descriptive on what it does. >>>> >>>> Well, it's more generic than something like >>>> do_mark_forwardee_during_initial_mark_root_scanning, which at least >>>> to me, the comments imply. 
Anyways, you don't have to listen to my >>>> arguments if you don't want to. >>>> >>>> StefanK >>>> >>>>> >>>>> Tony >>>>> >>>>> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>>>>> Tony, >>>>>> >>>>>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>>>>> Stefan, >>>>>>> >>>>>>> OK, good point. Maybe John can change the comment to something >>>>>>> like "Need to mark the copied object if we're root scanning >>>>>>> closure during initial mark, ....". Would this address your >>>>>>> concern? >>>>>> >>>>>> I just don't see the reason for giving the parameter such a >>>>>> "generic" name and then having comments about initial marking >>>>>> root scan in the code whenever the parameter is used. >>>>>> >>>>>> StefanK >>>>>> >>>>>>> >>>>>>> Tony >>>>>>> >>>>>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>>>>> But that's not what the comment says: >>>>>>>> >>>>>>>> 4339 // Need to mark the copied object if we're an initial >>>>>>>> 4340 // mark closure, or the object is already marked and >>>>>>>> 4341 // we need to preserve the mark. >>>>>>>> 4342 bool should_mark = do_mark_forwardee || >>>>>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>>>>> >>>>>>>> and >>>>>>>> >>>>>>>> 4357 // Object is not in collection set - if we're an >>>>>>>> initial mark >>>>>>>> 4358 // closure then mark the object. 
>>>>>>>> 4359 if (do_mark_forwardee) { >>>>>>>> >>>>>>>> StefanK >>>>>>>> >>>>>>>>> >>>>>>>>> Tony >>>>>>>> >>>>>> >>>> >> From tony.printezis at oracle.com Fri Aug 19 17:07:37 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 13:07:37 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E9305.7010906@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> <4E4E830F.3090806@oracle.com> <4E4E881E.4060704@oracle.com> <4E4E8E00.7070208@oracle.com> <4E4E9305.7010906@oracle.com> Message-ID: <4E4E9859.8010109@oracle.com> OK. So at this point I'll take the easy way out and yield to John on whether he wants to change the comments. :-) I had only originally commented on the parameter name being accurate. Tony On 08/19/2011 12:44 PM, Stefan Karlsson wrote: > On 2011-08-19 18:23, Tony Printezis wrote: >> >> >> On 08/19/2011 11:58 AM, Stefan Karlsson wrote: >>>> Would you like better if John changes the comment to something like >>>> "we've been asked to mark the objects" and make sure the place >>>> where do_mark_forwardee is set to true describes why it is? >>> >>> Yes. >>> >>> I like the parameter name. The thing I'm having a problem with is >>> that the parameter name is "generic" in that sense that it doesn't >>> convey who's setting it to true. >> >> Whoever wants to make sure every object visited by the closure is >> marked. > > I see that you're misunderstanding what I'm trying to say. I like the > parameter name, but I don't like the comments. > > StefanK > >> >> Tony >> >>> But then we have the comment that goes on and reveals that >>> information anyway. 
>>> >>> StefanK >>> >>>> >>>> Tony >>>> >>>> On 08/19/2011 10:43 AM, Stefan Karlsson wrote: >>>>> On 08/19/2011 04:39 PM, Tony Printezis wrote: >>>>>> You think "do_mark_forwardee" is generic?!?!?! It's very >>>>>> descriptive on what it does. >>>>> >>>>> Well, it's more generic than something like >>>>> do_mark_forwardee_during_initial_mark_root_scanning, which at >>>>> least to me, the comments imply. Anyways, you don't have to listen >>>>> to my arguments if you don't want to. >>>>> >>>>> StefanK >>>>> >>>>>> >>>>>> Tony >>>>>> >>>>>> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>>>>>> Tony, >>>>>>> >>>>>>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>>>>>> Stefan, >>>>>>>> >>>>>>>> OK, good point. Maybe John can change the comment to something >>>>>>>> like "Need to mark the copied object if we're root scanning >>>>>>>> closure during initial mark, ....". Would this address your >>>>>>>> concern? >>>>>>> >>>>>>> I just don't see the reason for giving the parameter such a >>>>>>> "generic" name and then having comments about initial marking >>>>>>> root scan in the code whenever the parameter is used. >>>>>>> >>>>>>> StefanK >>>>>>> >>>>>>>> >>>>>>>> Tony >>>>>>>> >>>>>>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>>>>>> But that's not what the comment says: >>>>>>>>> >>>>>>>>> 4339 // Need to mark the copied object if we're an initial >>>>>>>>> 4340 // mark closure, or the object is already marked and >>>>>>>>> 4341 // we need to preserve the mark. >>>>>>>>> 4342 bool should_mark = do_mark_forwardee || >>>>>>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>>>>>> >>>>>>>>> and >>>>>>>>> >>>>>>>>> 4357 // Object is not in collection set - if we're an >>>>>>>>> initial mark >>>>>>>>> 4358 // closure then mark the object. 
>>>>>>>>> 4359 if (do_mark_forwardee) {
>>>>>>>>>
>>>>>>>>> StefanK
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Tony
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>

From john.cuthbertson at oracle.com Fri Aug 19 17:19:41 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Fri, 19 Aug 2011 10:19:41 -0700
Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures
In-Reply-To: <4E4E9305.7010906@oracle.com>
References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> <4E4E830F.3090806@oracle.com> <4E4E881E.4060704@oracle.com> <4E4E8E00.7070208@oracle.com> <4E4E9305.7010906@oracle.com>
Message-ID: <4E4E9B2D.7010807@oracle.com>

Wow. I certainly didn't think the changes were this controversial.

I don't want to change the template parameter name unless it's to remove the "forwardee" (to perhaps do_mark_object[s], or should_mark_object[s]) because it is no longer strictly true that it's the forwarded object that is marked. Forwarded objects are now marked by the code in copy_to_survivor_space. But changing the comment to include that do_mark_forwardee is true during an initial mark pause for the root scanning closures is fine.

JohnC

On 08/19/11 09:44, Stefan Karlsson wrote:
> On 2011-08-19 18:23, Tony Printezis wrote:
>>
>>
>> On 08/19/2011 11:58 AM, Stefan Karlsson wrote:
>>>> Would you like better if John changes the comment to something like
>>>> "we've been asked to mark the objects" and make sure the place
>>>> where do_mark_forwardee is set to true describes why it is?
>>>
>>> Yes.
>>>
>>> I like the parameter name. The thing I'm having a problem with is
>>> that the parameter name is "generic" in that sense that it doesn't
>>> convey who's setting it to true.
>>
>> Whoever wants to make sure every object visited by the closure is
>> marked.
> > I see that you're misunderstanding what I'm trying to say. I like the > parameter name, but I don't like the comments. > > StefanK > >> >> Tony >> >>> But then we have the comment that goes on and reveals that >>> information anyway. >>> >>> StefanK >>> >>>> >>>> Tony >>>> >>>> On 08/19/2011 10:43 AM, Stefan Karlsson wrote: >>>>> On 08/19/2011 04:39 PM, Tony Printezis wrote: >>>>>> You think "do_mark_forwardee" is generic?!?!?! It's very >>>>>> descriptive on what it does. >>>>> >>>>> Well, it's more generic than something like >>>>> do_mark_forwardee_during_initial_mark_root_scanning, which at >>>>> least to me, the comments imply. Anyways, you don't have to listen >>>>> to my arguments if you don't want to. >>>>> >>>>> StefanK >>>>> >>>>>> >>>>>> Tony >>>>>> >>>>>> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>>>>>> Tony, >>>>>>> >>>>>>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>>>>>> Stefan, >>>>>>>> >>>>>>>> OK, good point. Maybe John can change the comment to something >>>>>>>> like "Need to mark the copied object if we're root scanning >>>>>>>> closure during initial mark, ....". Would this address your >>>>>>>> concern? >>>>>>> >>>>>>> I just don't see the reason for giving the parameter such a >>>>>>> "generic" name and then having comments about initial marking >>>>>>> root scan in the code whenever the parameter is used. >>>>>>> >>>>>>> StefanK >>>>>>> >>>>>>>> >>>>>>>> Tony >>>>>>>> >>>>>>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>>>>>> But that's not what the comment says: >>>>>>>>> >>>>>>>>> 4339 // Need to mark the copied object if we're an initial >>>>>>>>> 4340 // mark closure, or the object is already marked and >>>>>>>>> 4341 // we need to preserve the mark. 
>>>>>>>>> 4342 bool should_mark = do_mark_forwardee || >>>>>>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>>>>>> >>>>>>>>> and >>>>>>>>> >>>>>>>>> 4357 // Object is not in collection set - if we're an >>>>>>>>> initial mark >>>>>>>>> 4358 // closure then mark the object. >>>>>>>>> 4359 if (do_mark_forwardee) { >>>>>>>>> >>>>>>>>> StefanK >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Tony >>>>>>>>> >>>>>>> >>>>> >>> > From tony.printezis at oracle.com Fri Aug 19 17:20:47 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 19 Aug 2011 13:20:47 -0400 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E9B2D.7010807@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> <4E4E69B6.2090408@oracle.com> <4E4E6CC3.5060004@oracle.com> <4E4E70E7.30309@oracle.com> <4E4E74B4.1020101@oracle.com> <4E4E7593.9090107@oracle.com> <4E4E7683.8040703@oracle.com> <4E4E830F.3090806@oracle.com> <4E4E881E.4060704@oracle.com> <4E4E8E00.7070208@oracle.com> <4E4E9305.7010906@oracle.com> <4E4E9B2D.7010807@oracle.com> Message-ID: <4E4E9B6F.3020609@oracle.com> John, Good observation re: do_mark_forwardee not being quite accurate any more (the same for the mark_forwardee() method too). I'd be OK with them getting renamed to do_mark_object(). Tony On 08/19/2011 01:19 PM, John Cuthbertson wrote: > Wow. I certainly didn't think the changes were this controversial. > > I don't want to change the template parameter name unless it's to > remove the "forwardee" (to perhaps do_mark_obect[s], or > should_mark_object[s]) because it's is no longer strictly true that > its the forwarded object that is marked. Forwarded objects are now > marked by the code in copy_to_survivor_space. But changing the comment > to include that do_mark_forwardee is true during an initial mark pause > for the root scanning closures is fine. 
> > JohnC > > On 08/19/11 09:44, Stefan Karlsson wrote: >> On 2011-08-19 18:23, Tony Printezis wrote: >>> >>> >>> On 08/19/2011 11:58 AM, Stefan Karlsson wrote: >>>>> Would you like better if John changes the comment to something >>>>> like "we've been asked to mark the objects" and make sure the >>>>> place where do_mark_forwardee is set to true describes why it is? >>>> >>>> Yes. >>>> >>>> I like the parameter name. The thing I'm having a problem with is >>>> that the parameter name is "generic" in that sense that it doesn't >>>> convey who's setting it to true. >>> >>> Whoever wants to make sure every object visited by the closure is >>> marked. >> >> I see that you're misunderstanding what I'm trying to say. I like the >> parameter name, but I don't like the comments. >> >> StefanK >> >>> >>> Tony >>> >>>> But then we have the comment that goes on and reveals that >>>> information anyway. >>>> >>>> StefanK >>>> >>>>> >>>>> Tony >>>>> >>>>> On 08/19/2011 10:43 AM, Stefan Karlsson wrote: >>>>>> On 08/19/2011 04:39 PM, Tony Printezis wrote: >>>>>>> You think "do_mark_forwardee" is generic?!?!?! It's very >>>>>>> descriptive on what it does. >>>>>> >>>>>> Well, it's more generic than something like >>>>>> do_mark_forwardee_during_initial_mark_root_scanning, which at >>>>>> least to me, the comments imply. Anyways, you don't have to >>>>>> listen to my arguments if you don't want to. >>>>>> >>>>>> StefanK >>>>>> >>>>>>> >>>>>>> Tony >>>>>>> >>>>>>> On 08/19/2011 10:35 AM, Stefan Karlsson wrote: >>>>>>>> Tony, >>>>>>>> >>>>>>>> On 08/19/2011 04:19 PM, Tony Printezis wrote: >>>>>>>>> Stefan, >>>>>>>>> >>>>>>>>> OK, good point. Maybe John can change the comment to something >>>>>>>>> like "Need to mark the copied object if we're root scanning >>>>>>>>> closure during initial mark, ....". Would this address your >>>>>>>>> concern? 
>>>>>>>> >>>>>>>> I just don't see the reason for giving the parameter such a >>>>>>>> "generic" name and then having comments about initial marking >>>>>>>> root scan in the code whenever the parameter is used. >>>>>>>> >>>>>>>> StefanK >>>>>>>> >>>>>>>>> >>>>>>>>> Tony >>>>>>>>> >>>>>>>>> On 08/19/2011 10:01 AM, Stefan Karlsson wrote: >>>>>>>>>> But that's not what the comment says: >>>>>>>>>> >>>>>>>>>> 4339 // Need to mark the copied object if we're an initial >>>>>>>>>> 4340 // mark closure, or the object is already marked and >>>>>>>>>> 4341 // we need to preserve the mark. >>>>>>>>>> 4342 bool should_mark = do_mark_forwardee || >>>>>>>>>> 4343 (_g1->mark_in_progress()&& !_g1->is_obj_ill(obj)); >>>>>>>>>> >>>>>>>>>> and >>>>>>>>>> >>>>>>>>>> 4357 // Object is not in collection set - if we're an >>>>>>>>>> initial mark >>>>>>>>>> 4358 // closure then mark the object. >>>>>>>>>> 4359 if (do_mark_forwardee) { >>>>>>>>>> >>>>>>>>>> StefanK >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Tony >>>>>>>>>> >>>>>>>> >>>>>> >>>> >> > From john.cuthbertson at oracle.com Fri Aug 19 20:28:01 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 19 Aug 2011 13:28:01 -0700 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E4E5E99.3010303@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4E5E99.3010303@oracle.com> Message-ID: <4E4EC751.6030204@oracle.com> Hi Bengt, Thanks for the review - comments are inline.... On 08/19/11 06:01, Bengt Rutisson wrote: > > Hi John > > Looks good to me. > > A couple of minor comments: > > g1CollectedHeap.cpp: > > In the do_oop_work method (lines 4324-4362): > > * To show the limited scope of the should_mark variable I would like > to move the declaration of should_mark to just before it is being used > at line 4349. Done! Tony also suggested this. 
>
> * At the two places in the do_oop_work method where you use the
> do_mark_forwardee parameter you have comments saying that it is an
> initial mark closure. This is correct, so I think that I would
> actually prefer that the parameter was called something with initial
> mark rather than having to have the comments.

OK. I have changed the comments to read "... we're a root scanning closure during an initial-mark pause (i.e. do_mark_object will be true)....".

>
> * On the dead code subject: this closure seems to be unused (in
> g1CollectedHeap.cpp): "G1ParScanAndMarkHeapRSClosure
> scan_mark_heap_rs_cl(_g1h, &pss);"

I have this (and other cleanups in g1OopClosures.hpp) in the reference processing webrev - I'd rather not duplicate it here.

>
> g1OopClosures.hpp:
>
> Just a nitpick: You removed a line break at row 114 and added a line
> break at row 122-123. Since you didn't change anything else on these
> lines it would make the diff easier to view if you left those changes
> out.
>

To me, it's difficult to pick out the template parameters - especially when one is on a line on its own. But I'll back them out.

JohnC

> Bengt
>
> On 2011-08-18 20:17, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> Can I have a couple of volunteers review these refactoring changes to
>> the marking code used during evacuation pauses (both initial mark
>> pauses and regular evacuation pauses when marking is active) - the
>> change can be found at
>> http://cr.openjdk.java.net/~johnc/7080389/webrev.0/.
>>
>> The refactoring changes fix an issue that was seen with the code
>> changes for 6486945.
>>
>> During an initial mark pause, during root scanning, one thread had
>> successfully forwarded an object and had started to copy it. While
>> the object was being copied to its new location, another thread saw
>> that the object had been forwarded and, after checking that the new
>> location was unmarked, successfully marked the new location.
>> The first thread would finish the copying, see that the new location was
>> marked and skip the mark. The situation I ran into was that I was
>> attempting to obtain the size of the new object just after it was
>> marked (by the thread doing the marking) and the old object had not
>> yet been fully copied to its new location.
>>
>> With these refactoring changes, the thread that successfully forwards
>> an object in the collection set will mark the forwardee after copying
>> - allowing me to safely obtain its size.
>>
>> Testing: several runs of the GC test suite with a marking threshold
>> of 10 and 20%, Kitchensink, and jprt.
>>
>> Thanks,
>>
>> JohnC
>>
>>
>

From tom.deneau at amd.com Fri Aug 19 22:47:00 2011
From: tom.deneau at amd.com (Deneau, Tom)
Date: Fri, 19 Aug 2011 17:47:00 -0500
Subject: Review Request: UseNUMAInterleaving #3
In-Reply-To: <91928C974B07497184AF80B96F196606@oracle.com>
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com>
Message-ID: <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com>

Please review this patch which adds a new flag called UseNUMAInterleaving. This flag provides a subset of the functionality provided by UseNUMA. In Hotspot UseNUMA terminology, UseNUMAInterleaving makes all memory "numa_global", which is implemented as interleaved. This patch's main purpose is to provide that subset on OSes like Windows which do not support the full UseNUMA functionality. However, a simple implementation of UseNUMAInterleaving is also provided for other OSes.

The situations where this shows the biggest benefits would be:

* Windows platforms with multiple NUMA nodes (e.g., 4)
* The JVM process is run across all the nodes (not affinitized to one node).
* A workload that has enough threads so that it uses the majority of the cores in the machine, so that the heap is being accessed from many cores, including remote ones.
* Enough memory per node and a heap size such that the default heap placement policy on Windows would end up with the heap (or nursery) placed on one node.

jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our measurements, we have seen some cases where the performance with UseNUMAInterleaving was 2.7x the performance without it. There were gains of varying sizes across all systems.

The webrev is at http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.03/

Summary of changes in webrev.03 from webrev.02:

* As suggested by Igor Veresov, reverts to using UseNUMAInterleaving as the enabling flag. This will make it easier in the future when there are GCs that enable fuller UseNUMA on Windows.
* Adds a simple implementation of UseNUMAInterleaving on Linux and Solaris, which just calls numa_make_global after commit_memory and reserve_memory_special
* Adds a flag NUMAInterleaveGranularity which allows setting the granularity with which we move to a different node in a memory allocation. The default is 2MB. This flag only applies to Windows for now.
* Several code cleanups in os_windows.cpp suggested by Igor.

Summary of changes in os_windows.cpp:

* Some static routines were added to set things up at init time. These
    * check that the required APIs (VirtualAllocExNuma, GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in the OS
    * build the list of NUMA nodes on which this process has affinity
* Changes to os::reserve_memory
    * There was already a routine that reserved pages one page at a time (used for Individual Large Page Allocation on WS2003). This was abstracted to a separate routine, called allocate_pages_individually.
      This gets called both for the Individual Large Page Allocation thing mentioned above and for UseNUMAInterleaving (for both small and large pages)
    * When used for NUMA Interleaving this just goes through the NUMA node list in a round-robin fashion, using a different one for each chunk (with 4K pages, the minimum allocation granularity is 64K; with 2M pages it is one page)
    * Whether we do just a reserve or a combined reserve/commit is determined by the caller of allocate_pages_individually
        * When used with large pages, we do a Reserve and Commit at the same time, which is the way it always worked and the way it has to work on Windows.
        * For small pages, only the reserve is done; the commit comes later (which is the way it worked for non-interleaved).
* os::commit_memory changes
    * If UseNUMAInterleaving is true, os::commit_memory has to check whether it is being asked to commit memory that might have come from multiple Reserve allocations; if so, the commits must also be broken up. We don't keep any data structure to keep track of this; we just use VirtualQuery, which queries the properties of a VA range and can tell us how much came from one VirtualAlloc call.

I do not have a bug id for this.

-- Tom Deneau, AMD

From john.cuthbertson at oracle.com Fri Aug 19 23:12:27 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Fri, 19 Aug 2011 16:12:27 -0700
Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures
In-Reply-To: <4E4D5757.7040004@oracle.com>
References: <4E4D5757.7040004@oracle.com>
Message-ID: <4E4EEDDB.4060408@oracle.com>

Hi Everyone,

Hopefully this webrev (http://cr.openjdk.java.net/~johnc/7080389/webrev.1/) addresses everyone's comments.
Thanks, JohnC On 08/18/11 11:17, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers review these refactoring changes to > the marking code used during evacuation pauses (both initial mark > pauses and regular evacuation pauses when marking is active) - the > change can be found at > http://cr.openjdk.java.net/~johnc/7080389/webrev.0/. > > The refactoring changes fix an issue that was seen with the code > changes for 6486945. > > During an initial mark pause, during root scanning, one thread had > successfully forwarded an object and had started to copy it. While the > object was being copied to its new location, another thread saw that > the object had been forwarded and, after checking that the new > location was unmarked, successfully marked the new location. The first > thread would finish the copying, see that the new location was marked > and skip the mark. The situation I ran into was that I was attempting > to obtain the size of the new object just after it was marked (by the > thread doing the marking) and the old object had not yet been fully > copied to its new location. > > With these refactoring changes, the thread that successfully forwards > an object in the collection set will mark the forwardee after copying > - allowing me to safely obtain it's size. > > Testing: several runs of the GC test suite with a marking threshold of > 10 and 20%, Kitchensink, and jprt. 
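
The ordering described in the quoted message above (the thread that wins the forwarding race copies the object and only then marks the copy) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the webrev: Obj, try_forward and copy_and_maybe_mark are hypothetical names that do not exist in HotSpot, and a per-object flag stands in for G1's mark bitmap.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct Obj {
  std::atomic<Obj*> forwardee{nullptr};  // set once, by the winning thread
  std::atomic<bool> marked{false};       // simplification of a mark bitmap
  std::size_t size = 0;                  // payload that gets "copied"
};

// Attempts to install `to` as the forwardee of `from`.
// Returns true only for the single caller that wins the race.
inline bool try_forward(Obj& from, Obj& to) {
  Obj* expected = nullptr;
  return from.forwardee.compare_exchange_strong(expected, &to);
}

// The winner copies first and marks afterwards, so a marked copy is
// always fully initialized. A loser just returns the winner's copy
// and does no marking of its own.
inline Obj* copy_and_maybe_mark(Obj& from, Obj& to, bool should_mark) {
  if (try_forward(from, to)) {
    to.size = from.size;             // the copy completes here
    if (should_mark) {
      assert(to.size == from.size);  // copy is done before the mark
      to.marked.store(true);
    }
    return &to;
  }
  return from.forwardee.load();
}
```

With this shape, any thread that observes the mark on the new location can rely on the copy having finished, which is the property the message says was missing when a second thread could mark the forwardee while it was still being copied.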
> > Thanks, > > JohnC > > From john.coomes at oracle.com Sat Aug 20 03:08:10 2011 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 20 Aug 2011 03:08:10 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 29 new changesets Message-ID: <20110820030904.9958547E90@hg.openjdk.java.net> Changeset: 46cb9a7b8b01 Author: dsamersoff Date: 2011-08-10 15:04 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/46cb9a7b8b01 7073913: The fix for 7017193 causes segfaults Summary: Buffer overflow in os::get_line_chars Reviewed-by: coleenp, dholmes, dcubed Contributed-by: aph at redhat.com ! src/share/vm/runtime/os.cpp Changeset: b1cbb0907b36 Author: zgu Date: 2011-04-15 09:34 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b1cbb0907b36 7016797: Hotspot: securely/restrictive load dlls and new API for loading system dlls Summary: Created Windows Dll wrapped to handle jdk6 and jdk7 platform requirements, also provided more restictive Dll search orders for Windows system Dlls. Reviewed-by: acorn, dcubed, ohair, alanb ! make/windows/makefiles/compile.make ! src/os/windows/vm/decoder_windows.cpp ! src/os/windows/vm/jvm_windows.h ! src/os/windows/vm/os_windows.cpp ! src/os/windows/vm/os_windows.hpp Changeset: 279ef1916773 Author: zgu Date: 2011-07-12 21:13 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/279ef1916773 7065535: Mistyped function name that disabled UseLargePages on Windows Summary: Missing suffix "A" of Windows API LookupPrivilegeValue failed finding function pointer, caused VM to disable UseLargePages option Reviewed-by: coleenp, phh ! 
src/os/windows/vm/os_windows.cpp Changeset: a68e11dceb83 Author: zgu Date: 2011-08-16 09:18 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a68e11dceb83 Merge Changeset: 00ed4ccfe642 Author: collins Date: 2011-08-17 07:05 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/00ed4ccfe642 Merge Changeset: 43f9d800f276 Author: iveresov Date: 2011-07-20 18:04 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/43f9d800f276 7066339: Tiered: policy should make consistent decisions about osr levels Summary: Added feedback disabling flag to common(), fixed handling of TieredStopAtLevel. Reviewed-by: kvn, never ! src/share/vm/classfile/classLoader.cpp ! src/share/vm/interpreter/linkResolver.cpp ! src/share/vm/prims/methodHandles.cpp ! src/share/vm/runtime/advancedThresholdPolicy.cpp ! src/share/vm/runtime/advancedThresholdPolicy.hpp ! src/share/vm/runtime/compilationPolicy.hpp ! src/share/vm/runtime/javaCalls.cpp ! src/share/vm/runtime/simpleThresholdPolicy.cpp ! src/share/vm/runtime/simpleThresholdPolicy.hpp Changeset: 6a991dcb52bb Author: never Date: 2011-07-21 08:38 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6a991dcb52bb 7012081: JSR 292: SA-JDI can't read MH/MT/Indy ConstantPool entries Reviewed-by: kvn, twisti, jrose ! agent/src/share/classes/sun/jvm/hotspot/interpreter/Bytecode.java - agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeFastAAccess0.java - agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeFastIAccess0.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeLoadConstant.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeStream.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeWideable.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeWithCPIndex.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/Bytecodes.java ! agent/src/share/classes/sun/jvm/hotspot/oops/ConstMethod.java ! 
agent/src/share/classes/sun/jvm/hotspot/oops/ConstantPool.java ! agent/src/share/classes/sun/jvm/hotspot/oops/ConstantPoolCache.java ! agent/src/share/classes/sun/jvm/hotspot/oops/GenerateOopMap.java ! agent/src/share/classes/sun/jvm/hotspot/oops/Method.java ! agent/src/share/classes/sun/jvm/hotspot/oops/TypeArray.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/ConstantTag.java ! src/share/vm/oops/generateOopMap.cpp Changeset: 3d42f82cd811 Author: kvn Date: 2011-07-21 11:25 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/3d42f82cd811 7063628: Use cbcond on T4 Summary: Add new short branch instruction to Hotspot sparc assembler. Reviewed-by: never, twisti, jrose ! src/cpu/sparc/vm/assembler_sparc.cpp ! src/cpu/sparc/vm/assembler_sparc.hpp ! src/cpu/sparc/vm/assembler_sparc.inline.hpp ! src/cpu/sparc/vm/c1_CodeStubs_sparc.cpp ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_MacroAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_Runtime1_sparc.cpp ! src/cpu/sparc/vm/cppInterpreter_sparc.cpp ! src/cpu/sparc/vm/interp_masm_sparc.cpp ! src/cpu/sparc/vm/interpreter_sparc.cpp ! src/cpu/sparc/vm/methodHandles_sparc.cpp ! src/cpu/sparc/vm/sharedRuntime_sparc.cpp ! src/cpu/sparc/vm/sparc.ad ! src/cpu/sparc/vm/stubGenerator_sparc.cpp ! src/cpu/sparc/vm/templateInterpreter_sparc.cpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/sparc/vm/vm_version_sparc.cpp ! src/cpu/sparc/vm/vm_version_sparc.hpp ! src/cpu/sparc/vm/vtableStubs_sparc.cpp ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/os_cpu/solaris_sparc/vm/vm_version_solaris_sparc.cpp ! src/share/vm/adlc/formssel.cpp ! src/share/vm/adlc/output_c.cpp ! src/share/vm/adlc/output_h.cpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/machnode.cpp ! src/share/vm/opto/machnode.hpp ! src/share/vm/opto/output.cpp ! 
src/share/vm/runtime/globals.hpp Changeset: 4e761e7e6e12 Author: kvn Date: 2011-07-26 19:35 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4e761e7e6e12 7070134: Hotspot crashes with sigsegv from PorterStemmer Summary: Do not move data nodes which are attached to a predicate test to a dominating test. Reviewed-by: never ! src/share/vm/opto/ifnode.cpp ! src/share/vm/opto/loopPredicate.cpp ! src/share/vm/opto/loopnode.hpp ! src/share/vm/opto/loopopts.cpp + test/compiler/7070134/Stemmer.java + test/compiler/7070134/Test7070134.sh + test/compiler/7070134/words Changeset: 0f34fdee809e Author: never Date: 2011-07-27 15:06 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0f34fdee809e 7071427: AdapterFingerPrint can hold 8 entries per int Reviewed-by: kvn ! src/share/vm/runtime/java.cpp ! src/share/vm/runtime/sharedRuntime.cpp Changeset: c7b60b601eb4 Author: kvn Date: 2011-07-27 17:28 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c7b60b601eb4 7069452: Cleanup NodeFlags Summary: Remove flags which duplicate information in Node::NodeClasses. Reviewed-by: never ! src/cpu/sparc/vm/sparc.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/adlc/adlparse.cpp ! src/share/vm/adlc/archDesc.cpp ! src/share/vm/adlc/formssel.cpp ! src/share/vm/adlc/formssel.hpp ! src/share/vm/adlc/output_h.cpp ! src/share/vm/opto/block.cpp ! src/share/vm/opto/callnode.hpp ! src/share/vm/opto/cfgnode.hpp ! src/share/vm/opto/coalesce.cpp ! src/share/vm/opto/gcm.cpp ! src/share/vm/opto/idealGraphPrinter.cpp ! src/share/vm/opto/lcm.cpp ! src/share/vm/opto/machnode.hpp ! src/share/vm/opto/mulnode.cpp ! src/share/vm/opto/mulnode.hpp ! src/share/vm/opto/node.hpp ! src/share/vm/opto/output.cpp ! src/share/vm/opto/reg_split.cpp ! src/share/vm/opto/superword.cpp ! src/share/vm/opto/superword.hpp ! src/share/vm/opto/vectornode.cpp ! 
src/share/vm/opto/vectornode.hpp Changeset: d17bd0b18663 Author: twisti Date: 2011-07-28 02:14 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d17bd0b18663 7066143: JSR 292: Zero support after regressions from 7009923 and 7009309 Reviewed-by: jrose, twisti Contributed-by: Xerxes Ranby ! src/cpu/zero/vm/stack_zero.cpp ! src/share/vm/runtime/vmStructs.cpp Changeset: ce3e1d4dc416 Author: never Date: 2011-07-28 13:03 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ce3e1d4dc416 7060619: C1 should respect inline and dontinline directives from CompilerOracle Reviewed-by: kvn, iveresov ! src/share/vm/c1/c1_GraphBuilder.cpp Changeset: c96c3eb1efae Author: kvn Date: 2011-07-29 09:16 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c96c3eb1efae 7068051: SIGSEGV in PhaseIdealLoop::build_loop_late_post Summary: Removed predicate cloning from loop peeling optimization and from split fall-in paths. Reviewed-by: never ! src/share/vm/opto/cfgnode.cpp ! src/share/vm/opto/ifnode.cpp ! src/share/vm/opto/loopPredicate.cpp ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopUnswitch.cpp ! src/share/vm/opto/loopnode.cpp ! src/share/vm/opto/loopnode.hpp ! src/share/vm/opto/phaseX.hpp Changeset: 4aa5974a06dd Author: kvn Date: 2011-08-06 08:28 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4aa5974a06dd 7075559: JPRT windows_x64 build failure Summary: use SA_CLASSDIR variable instead of dirsctory saclasses. Reviewed-by: kamg, dcubed ! make/linux/makefiles/defs.make ! make/solaris/makefiles/defs.make ! make/solaris/makefiles/saproc.make ! make/windows/makefiles/defs.make ! make/windows/makefiles/sa.make Changeset: a3142bdb6707 Author: twisti Date: 2011-08-08 05:49 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a3142bdb6707 7071823: Zero: zero/shark doesn't build after b147-fcs Reviewed-by: gbenson, twisti Contributed-by: Chris Phillips ! 
src/cpu/zero/vm/frame_zero.cpp + src/cpu/zero/vm/methodHandles_zero.hpp ! src/cpu/zero/vm/sharedRuntime_zero.cpp ! src/share/vm/shark/sharkContext.hpp Changeset: a19c671188cb Author: never Date: 2011-08-08 13:19 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a19c671188cb 7075623: 6990212 broke raiseException in 64 bit Reviewed-by: kvn, twisti ! src/cpu/sparc/vm/methodHandles_sparc.cpp ! src/cpu/x86/vm/methodHandles_x86.cpp Changeset: f1c12354c3f7 Author: roland Date: 2011-08-02 18:36 +0200 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1c12354c3f7 7074017: Introduce MemBarAcquireLock/MemBarReleaseLock nodes for monitor enter/exit code paths Summary: replace MemBarAcquire/MemBarRelease nodes on the monitor enter/exit code paths with new MemBarAcquireLock/MemBarReleaseLock nodes Reviewed-by: kvn, twisti ! src/cpu/sparc/vm/sparc.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/adlc/formssel.cpp ! src/share/vm/opto/classes.hpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/matcher.cpp ! src/share/vm/opto/matcher.hpp ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/memnode.hpp Changeset: 6987871cfb9b Author: kvn Date: 2011-08-10 14:06 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6987871cfb9b 7077439: Possible reference through NULL in loopPredicate.cpp:726 Summary: Use cl->is_valid_counted_loop() check. Reviewed-by: never ! src/share/vm/opto/loopPredicate.cpp ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopnode.cpp ! src/share/vm/opto/superword.cpp Changeset: 95134e034042 Author: kvn Date: 2011-08-11 12:08 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/95134e034042 7063629: use cbcond in C2 generated code on T4 Summary: Use new short branch instruction in C2 generated code. Reviewed-by: never ! src/cpu/sparc/vm/assembler_sparc.hpp ! src/cpu/sparc/vm/sparc.ad ! src/cpu/sparc/vm/vm_version_sparc.cpp ! 
src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/os_cpu/linux_x86/vm/linux_x86_32.ad ! src/os_cpu/linux_x86/vm/linux_x86_64.ad ! src/os_cpu/solaris_x86/vm/solaris_x86_32.ad ! src/os_cpu/solaris_x86/vm/solaris_x86_64.ad ! src/share/vm/adlc/formssel.cpp ! src/share/vm/adlc/output_h.cpp ! src/share/vm/opto/block.cpp ! src/share/vm/opto/block.hpp ! src/share/vm/opto/compile.hpp ! src/share/vm/opto/machnode.hpp ! src/share/vm/opto/matcher.hpp ! src/share/vm/opto/node.hpp ! src/share/vm/opto/output.cpp Changeset: fdb992d83a87 Author: twisti Date: 2011-08-16 04:14 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/fdb992d83a87 7071653: JSR 292: call site change notification should be pushed not pulled Reviewed-by: kvn, never, bdelsart ! src/cpu/sparc/vm/interp_masm_sparc.cpp ! src/cpu/sparc/vm/interp_masm_sparc.hpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/x86/vm/interp_masm_x86_32.cpp ! src/cpu/x86/vm/interp_masm_x86_32.hpp ! src/cpu/x86/vm/interp_masm_x86_64.cpp ! src/cpu/x86/vm/interp_masm_x86_64.hpp ! src/cpu/x86/vm/templateTable_x86_32.cpp ! src/cpu/x86/vm/templateTable_x86_64.cpp ! src/share/vm/ci/ciCallSite.cpp ! src/share/vm/ci/ciCallSite.hpp ! src/share/vm/ci/ciField.hpp ! src/share/vm/classfile/systemDictionary.cpp ! src/share/vm/classfile/systemDictionary.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/code/dependencies.cpp ! src/share/vm/code/dependencies.hpp ! src/share/vm/code/nmethod.cpp ! src/share/vm/interpreter/interpreterRuntime.cpp ! src/share/vm/interpreter/templateTable.hpp ! src/share/vm/memory/universe.cpp ! src/share/vm/memory/universe.hpp ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/opto/callGenerator.cpp ! src/share/vm/opto/callGenerator.hpp ! src/share/vm/opto/doCall.cpp ! 
src/share/vm/opto/parse3.cpp Changeset: 11211f7cb5a0 Author: kvn Date: 2011-08-16 11:53 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/11211f7cb5a0 7079317: Incorrect branch's destination block in PrintoOptoAssembly output Summary: save/restore label and block in scratch_emit_size() Reviewed-by: never ! src/share/vm/adlc/archDesc.cpp ! src/share/vm/adlc/formssel.cpp ! src/share/vm/adlc/output_c.cpp ! src/share/vm/adlc/output_h.cpp ! src/share/vm/opto/block.cpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/idealGraphPrinter.cpp ! src/share/vm/opto/machnode.cpp ! src/share/vm/opto/machnode.hpp ! src/share/vm/opto/node.hpp ! src/share/vm/opto/output.cpp Changeset: 1af104d6cf99 Author: kvn Date: 2011-08-16 16:59 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1af104d6cf99 7079329: Adjust allocation prefetching for T4 Summary: on T4 2 BIS instructions should be issued to prefetch 64 bytes Reviewed-by: iveresov, phh, twisti ! src/cpu/sparc/vm/assembler_sparc.hpp ! src/cpu/sparc/vm/sparc.ad ! src/cpu/sparc/vm/vm_version_sparc.cpp ! src/cpu/sparc/vm/vm_version_sparc.hpp ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/adlc/formssel.cpp ! src/share/vm/memory/threadLocalAllocBuffer.hpp ! src/share/vm/opto/classes.hpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/matcher.cpp ! src/share/vm/opto/memnode.hpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/vm_version.cpp ! src/share/vm/runtime/vm_version.hpp Changeset: 381bf869f784 Author: twisti Date: 2011-08-17 05:14 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/381bf869f784 7079626: x64 emits unnecessary REX prefix Reviewed-by: kvn, iveresov, never ! 
src/cpu/x86/vm/assembler_x86.cpp Changeset: bd87c0dcaba5 Author: twisti Date: 2011-08-17 11:52 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/bd87c0dcaba5 7079769: JSR 292: incorrect size() for CallStaticJavaHandle on sparc Reviewed-by: never, kvn ! src/cpu/sparc/vm/sparc.ad Changeset: 739a9abbbd4b Author: kvn Date: 2011-08-18 11:49 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/739a9abbbd4b 7080431: VM asserts if specified size(x) in .ad is larger than emitted size Summary: Move code from finalize_offsets_and_shorten() to fill_buffer() to restore previous behavior. Reviewed-by: never ! src/share/vm/opto/compile.hpp ! src/share/vm/opto/output.cpp Changeset: de147f62e695 Author: kvn Date: 2011-08-19 08:55 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/de147f62e695 Merge - agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeFastAAccess0.java - agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeFastIAccess0.java Changeset: 9f12ede5571a Author: jcoomes Date: 2011-08-19 14:08 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9f12ede5571a Merge ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/share/vm/oops/generateOopMap.cpp ! src/share/vm/runtime/os.cpp Changeset: 7c29742c41b4 Author: jcoomes Date: 2011-08-19 14:22 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7c29742c41b4 7081251: bump the hs22 build number to 02 Reviewed-by: johnc ! make/hotspot_version From bluedavy at gmail.com Sat Aug 20 03:12:24 2011 From: bluedavy at gmail.com (BlueDavy Lin) Date: Sat, 20 Aug 2011 11:12:24 +0800 Subject: why this young gc is so slow? 
In-Reply-To: <16DE4025B07540FFAADD1934D4921C9F@oracle.com> References: <16DE4025B07540FFAADD1934D4921C9F@oracle.com> Message-ID: I changed the code so that the old gen no longer contains a huge data structure with refs into the young gen. The ygc time has now dropped to about 40ms with about 8m of live objects, but I think it should be faster than it currently is, so I'll look at the code in more detail. The current gc task timestamps are: VM-Thread 298267312 298296585 298312254 GC-Thread 0 entries: 11 [ old-to-young-roots-task 298267525 298267593 ] [ old-to-young-roots-task 298267596 298267642 ] [ thread-roots-task 298267643 298267664 ] [ thread-roots-task 298267666 298267679 ] [ thread-roots-task 298267680 298267688 ] [ thread-roots-task 298267690 298267699 ] [ thread-roots-task 298267703 298267713 ] [ thread-roots-task 298267752 298267760 ] [ thread-roots-task 298267762 298267768 ] [ thread-roots-task 298267771 298267861 ] [ steal-task 298267862 298296339 ] GC-Thread 1 entries: 6 [ thread-roots-task 298267623 298267739 ] [ thread-roots-task 298267743 298267764 ] [ thread-roots-task 298267766 298267775 ] [ thread-roots-task 298267776 298267818 ] [ thread-roots-task 298267820 298267835 ] [ steal-task 298267836 298296403 ] GC-Thread 2 entries: 8 [ old-to-young-roots-task 298267585 298267715 ] [ thread-roots-task 298267721 298267768 ] [ thread-roots-task 298267769 298267783 ] [ thread-roots-task 298267801 298267812 ] [ thread-roots-task 298267814 298267819 ] [ thread-roots-task 298267821 298267825 ] [ scavenge-roots-task 298267826 298267924 ] [ steal-task 298267925 298296466 ] GC-Thread 3 entries: 11 [ thread-roots-task 298267634 298267653 ] [ thread-roots-task 298267659 298267680 ] [ thread-roots-task 298267682 298267690 ] [ thread-roots-task 298267692 298267725 ] [ thread-roots-task 298267726 298267733 ] [ thread-roots-task 298267735 298267741 ] [ thread-roots-task 298267746 298267775 ] [ thread-roots-task 298267776 298267787 ] [ thread-roots-task 298267790 298267805 ] [ thread-roots-task 298267807 298267813 ] [ 
steal-task 298267848 298296340 ] GC-Thread 4 entries: 13 [ old-to-young-roots-task 298267561 298267611 ] [ scavenge-roots-task 298267613 298267639 ] [ thread-roots-task 298267640 298267661 ] [ thread-roots-task 298267664 298267673 ] [ thread-roots-task 298267675 298267685 ] [ thread-roots-task 298267687 298267696 ] [ thread-roots-task 298267699 298267707 ] [ thread-roots-task 298267708 298267717 ] [ thread-roots-task 298267725 298267765 ] [ thread-roots-task 298267767 298267781 ] [ thread-roots-task 298267807 298267828 ] [ scavenge-roots-task 298267829 298267831 ] [ steal-task 298267833 298296387 ] GC-Thread 5 entries: 13 [ old-to-young-roots-task 298267573 298267618 ] [ thread-roots-task 298267622 298267651 ] [ thread-roots-task 298267654 298267661 ] [ thread-roots-task 298267663 298267669 ] [ thread-roots-task 298267670 298267696 ] [ thread-roots-task 298267698 298267704 ] [ thread-roots-task 298267706 298267712 ] [ thread-roots-task 298267719 298267726 ] [ thread-roots-task 298267727 298267781 ] [ thread-roots-task 298267784 298267791 ] [ thread-roots-task 298267792 298267821 ] [ scavenge-roots-task 298267823 298269583 ] [ steal-task 298269584 298296338 ] GC-Thread 6 entries: 11 [ old-to-young-roots-task 298267551 298267601 ] [ serial-old-to-young-roots-task 298267603 298267625 ] [ thread-roots-task 298267626 298267653 ] [ thread-roots-task 298267666 298267676 ] [ thread-roots-task 298267678 298267684 ] [ thread-roots-task 298267686 298267691 ] [ thread-roots-task 298267692 298267723 ] [ thread-roots-task 298267725 298267730 ] [ thread-roots-task 298267789 298267803 ] [ thread-roots-task 298267804 298267818 ] [ steal-task 298267863 298296447 ] GC-Thread 7 entries: 8 [ old-to-young-roots-task 298267603 298267648 ] [ thread-roots-task 298267650 298267687 ] [ thread-roots-task 298267689 298267731 ] [ thread-roots-task 298267773 298267786 ] [ thread-roots-task 298267788 298267793 ] [ thread-roots-task 298267797 298267855 ] [ steal-task 298267857 298296336 ] [ 
waitfor-barrier-task 298296341 298296578 ] GC-Thread 8 entries: 9 [ old-to-young-roots-task 298267594 298267634 ] [ thread-roots-task 298267637 298267674 ] [ thread-roots-task 298267675 298267712 ] [ thread-roots-task 298267716 298267732 ] [ thread-roots-task 298267760 298267768 ] [ thread-roots-task 298267816 298267824 ] [ scavenge-roots-task 298267826 298267828 ] [ scavenge-roots-task 298267829 298267831 ] [ steal-task 298267832 298296428 ] GC-Thread 9 entries: 17 [ old-to-young-roots-task 298267540 298267584 ] [ old-to-young-roots-task 298267586 298267628 ] [ thread-roots-task 298267629 298267652 ] [ thread-roots-task 298267656 298267666 ] [ thread-roots-task 298267667 298267676 ] [ thread-roots-task 298267677 298267694 ] [ thread-roots-task 298267698 298267708 ] [ thread-roots-task 298267709 298267715 ] [ thread-roots-task 298267744 298267754 ] [ thread-roots-task 298267756 298267763 ] [ thread-roots-task 298267765 298267775 ] [ thread-roots-task 298267778 298267801 ] [ thread-roots-task 298267802 298267809 ] [ thread-roots-task 298267810 298267812 ] [ thread-roots-task 298267817 298267823 ] [ scavenge-roots-task 298267825 298267830 ] [ steal-task 298267832 298296460 ] GC-Thread 10 entries: 12 [ scavenge-roots-task 298267613 298267618 ] [ thread-roots-task 298267621 298267638 ] [ thread-roots-task 298267639 298267659 ] [ thread-roots-task 298267663 298267671 ] [ thread-roots-task 298267673 298267682 ] [ thread-roots-task 298267683 298267693 ] [ thread-roots-task 298267695 298267705 ] [ thread-roots-task 298267707 298267720 ] [ thread-roots-task 298267723 298267730 ] [ thread-roots-task 298267795 298267811 ] [ thread-roots-task 298267813 298267833 ] [ steal-task 298267834 298296472 ] GC-Thread 11 entries: 13 [ old-to-young-roots-task 298267514 298267589 ] [ old-to-young-roots-task 298267592 298267657 ] [ thread-roots-task 298267660 298267674 ] [ thread-roots-task 298267677 298267693 ] [ thread-roots-task 298267694 298267702 ] [ thread-roots-task 298267706 
298267716 ] [ thread-roots-task 298267721 298267731 ] [ thread-roots-task 298267735 298267744 ] [ thread-roots-task 298267746 298267753 ] [ thread-roots-task 298267757 298267786 ] [ thread-roots-task 298267789 298267799 ] [ thread-roots-task 298267802 298267814 ] [ steal-task 298267838 298296336 ] GC-Thread 12 entries: 15 [ old-to-young-roots-task 298267499 298267597 ] [ thread-roots-task 298267636 298267652 ] [ thread-roots-task 298267654 298267686 ] [ thread-roots-task 298267688 298267698 ] [ thread-roots-task 298267701 298267710 ] [ thread-roots-task 298267711 298267720 ] [ thread-roots-task 298267723 298267730 ] [ thread-roots-task 298267731 298267740 ] [ thread-roots-task 298267743 298267755 ] [ thread-roots-task 298267757 298267770 ] [ thread-roots-task 298267772 298267783 ] [ thread-roots-task 298267786 298267796 ] [ thread-roots-task 298267798 298267805 ] [ thread-roots-task 298267811 298267813 ] [ steal-task 298267843 298296347 ] 
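One way to read the GC task timestamp dump in BlueDavy Lin's message above is to total the elapsed ticks per task type. The sketch below is not part of HotSpot: sum_task_ticks is a hypothetical helper that assumes every entry has the form "[ task-name start end ]" as in the dump, and it treats the timestamps as opaque ticks because the thread does not state their unit.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: sums (end - start) per task name for every entry of
// the form "[ task-name start end ]". Tokens that do not fit that shape
// (thread headers, quote markers, truncated entries) are skipped.
std::map<std::string, std::uint64_t> sum_task_ticks(const std::string& dump) {
  std::map<std::string, std::uint64_t> totals;
  std::istringstream in(dump);
  std::string tok;
  while (in >> tok) {
    if (tok != "[") continue;  // wait for the start of an entry
    std::string name, close;
    std::uint64_t start = 0, end = 0;
    if (in >> name >> start >> end >> close && close == "]" && end >= start) {
      totals[name] += end - start;
    } else {
      in.clear();  // malformed entry: reset the error state and keep scanning
    }
  }
  return totals;
}
```

Applied to the six GC-Thread 1 entries from the dump, this yields 203 ticks for thread-roots-task and 28567 ticks for steal-task, so nearly all of that thread's time falls in steal-task.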
From igor.veresov at oracle.com Sat Aug 20 05:53:10 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 19 Aug 2011 22:53:10 -0700 Subject: Review Request: UseNUMAInterleaving #3 In-Reply-To: <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> Message-ID: <462098EF18364A629C463AC72D5495CC@oracle.com> Tom, this looks better, a few minor things: * os_solaris.cpp, os_linux.cpp: - There're some problems with indentation in os_linux.cpp and os_solaris.cpp, please follow the style (two spaces). - I think we can set UseNUMAInterleaving when UseNUMA is set, it wouldn't hurt the numa allocator in PS but will benefit other collectors that are not numa-aware. Maybe put this into arguments.cpp, since that behavior will be universal across platforms. * os_solaris.cpp - in os::commit_memory(char* addr, size_t bytes, size_t alignment_hint, bool exec): 2811 if (UseNUMAInterleaving) { 2812 numa_make_global(addr, bytes); 2813 } This is not necessary. This function calls the other commit_memory(), the one without the alignment hint (unlike on linux) and you already call numa_make_global() there. - in os::reserve_memory_special() 3439 if (UseNUMAInterleaving) { 3440 numa_make_global(addr, bytes); 3441 } You should use retAddr instead of addr. The addr param is ignored there entirely. * os_windows.cpp - two spaces indent please. - sometimes you don't put spaces around infix operators. 
- we typically don't prefix class names with "C": CNUMANodeListHolder - variables are usually lower case with underscore separating words: 2744 } NUMANodeListHolder 3162 size_t BytesRemaining = bytes; 3163 char * NextAllocAddr = addr; 3165 MEMORY_BASIC_INFORMATION allocInfo; 3167 size_t BytesToRq = MIN2(BytesRemaining, allocInfo.RegionSize); - class data members should start with underscore: 2719 int numa_used_node_list[64]; 2720 int numa_used_node_count; os_windows.cpp definitely doesn't set a good style example, sorry... - would it be easier not to have UseNUMAForced and just let it set lgrp_id during thread init (it's currently a no op on windows anyway)? Also, in the future, when windows fully supports numa allocation it will be a required behavior anyway. - here, in case size is bigger than the number of nodes in the node list it should be adjusted: 3254 size_t os::numa_get_leaf_groups(int *ids, size_t size) { > size = MIN2(size, (size_t)NUMANodeListHolder.get_count()); 3255 for (int i = 0; i < size; i++) { 3256 ids[i] = NUMANodeListHolder.get_node_list_entry(i); 3257 } 3258 return size; 3259 } - in CNUMANodeListHolder::build(), is it guaranteed that we don't overflow numa_used_node_list? If not, can we stop adding to it if the index >=64? If yes, can we put a guarantee or assert? Can 64 be a named constant? - CNUMANodeListHolder probably needs a constructor that would set numa_used_count to 0 (I know in your case it's a global, but it feels safer to have it initialized). - in numa_interleaving_init() WARN needs to be undef'ed. igor On Friday, August 19, 2011 at 3:47 PM, Deneau, Tom wrote: > Please review this patch which adds a new flag called > UseNUMAInterleaving. This flag provides a subset of the functionality > provided by UseNUMA. In Hotspot UseNUMA terminology, > UseNUMAInterleaving makes all memory "numa_global" which is implemented > as interleaved. 
This patch's main purpose is to provide that subset > on OSes like Windows which do not support the full UseNUMA > functionality. However, a simple implementation of UseNUMAInterleaving is > also provided for other OSes > > The situations where this shows the biggest benefits would be: > * Windows platforms with multiple numa nodes (eg, 4) > > * The JVM process is run across all the nodes (not affinitized to one > node). > > * A workload that has enough threads so that it uses the majority > of the cores in the machine, so that the heap is being accessed > from many cores, including remote ones. > > * Enough memory per node and a heap size such that the default heap > placement policy on windows would end up with the heap (or > nursery) placed on one node. > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > measurements, we have seen some cases where the performance with > UseNUMAInterleaving was 2.7x vs. the performance without. There were > gains of varying sizes across all systems. > > The webrev is at > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.03/ > > Summary of changes in webrev.03 from webrev.02: > > * As suggested by Igor Veresov, reverts to using > UseNUMAInterleaving as the enabling flag. This will make it > easier in the future when there are GCs that enable fuller > UseNUMA on Windows. > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > Solaris, which just calls numa_make_global after commit_memory > and reserve_memory_special > > * Adds a flag NUMAInterleaveGranularity which allows setting the > granularity with which we move to a different node in a memory > allocation. The default is 2MB. This flag only applies to > Windows for now. > > * Several code cleanups in os_windows.cpp suggested by Igor. > > Summary of changes in os_windows.cpp: > > * Some static routines were added to set things up init time. 
These > * check that the required APIs (VirtualAllocExNuma, > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > the OS > > * build the list of numa nodes on which this process has affinity > > * Changes to os::reserve_memory > * There was already a routine that reserved pages one page at a > time (used for Individual Large Page Allocation on WS2003). > This was abstracted to a separate routine, called > allocate_pages_individually. This gets called both for the > Individual Large Page Allocation thing mentioned above and for > UseNUMAInterleaving (for both small and large pages) > > * When used for NUMA Interleaving this just goes thru the numa > node list in a round-robin fashion, using a different one for > each chunk (with 4K pages, the minimum allocation granularity > is 64K, with 2M pages it is 1 Page) > > * Whether we do just a reserve or a combined reserve/commit is > determined by the caller of allocate_pages_individually > > * When used with large pages, we do a Reserve and Commit at > the same time which is the way it always worked and the way > it has to work on windows. > > * For small pages, only the reserve is done, the commit will > come later. (which is the way it worked for > non-interleaved) > > * os::commit_memory changes > * If UseNUMAInterleaving is true, os::commit_memory has to check > whether it was being asked to commit memory that might have > come from multiple Reserve allocations, if so, the commits > must also be broken up. We don't keep any data structure to > keep track of this, we just use VirtualQuery which queries the > properties of a VA range and can tell us how much came from > one VirtualAlloc call. > > I do not have a bug id for this. > > -- Tom Deneau, AMD From igor.veresov at oracle.com Sat Aug 20 06:35:46 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 19 Aug 2011 23:35:46 -0700 Subject: why this young gc is so slow? 
In-Reply-To:
References: <16DE4025B07540FFAADD1934D4921C9F@oracle.com>
Message-ID:

How big are the survivor spaces (you can get this info by running with -XX:+PrintHeapAtGC)?

It seems that now it spends most of the time in steal-task, which is basically traversal of the subgraph that is in eden+survivor spaces and copying.

igor

On Friday, August 19, 2011 at 8:12 PM, BlueDavy Lin wrote:

> I change the code,so avoid the old gen contains a huge data structure
> ref to young gen,now the ygc time reduce to about 40ms,live objects
> about 8m,but I think it should be faster than current,I'll see the
> code more detail.
>
> current the gc task timestamps is:
> VM-Thread 298267312 298296585 298312254
> GC-Thread 0 entries: 11
> [ old-to-young-roots-task 298267525 298267593 ]
> [ old-to-young-roots-task 298267596 298267642 ]
> [ thread-roots-task 298267643 298267664 ]
> [ thread-roots-task 298267666 298267679 ]
> [ thread-roots-task 298267680 298267688 ]
> [ thread-roots-task 298267690 298267699 ]
> [ thread-roots-task 298267703 298267713 ]
> [ thread-roots-task 298267752 298267760 ]
> [ thread-roots-task 298267762 298267768 ]
> [ thread-roots-task 298267771 298267861 ]
> [ steal-task 298267862 298296339 ]
> GC-Thread 1 entries: 6
> [ thread-roots-task 298267623 298267739 ]
> [ thread-roots-task 298267743 298267764 ]
> [ thread-roots-task 298267766 298267775 ]
> [ thread-roots-task 298267776 298267818 ]
> [ thread-roots-task 298267820 298267835 ]
> [ steal-task 298267836 298296403 ]
> GC-Thread 2 entries: 8
> [ old-to-young-roots-task 298267585 298267715 ]
> [ thread-roots-task 298267721 298267768 ]
> [ thread-roots-task 298267769 298267783 ]
> [ thread-roots-task 298267801 298267812 ]
> [ thread-roots-task 298267814 298267819 ]
> [ thread-roots-task 298267821 298267825 ]
> [ scavenge-roots-task 298267826 298267924 ]
> [ steal-task 298267925 298296466 ]
> GC-Thread 3 entries: 11
> [ thread-roots-task 298267634 298267653 ]
> [ thread-roots-task 298267659 298267680 ]
> [ thread-roots-task 298267682 298267690 ]
> [ thread-roots-task 298267692 298267725 ]
> [ thread-roots-task 298267726 298267733 ]
> [ thread-roots-task 298267735 298267741 ]
> [ thread-roots-task 298267746 298267775 ]
> [ thread-roots-task 298267776 298267787 ]
> [ thread-roots-task 298267790 298267805 ]
> [ thread-roots-task 298267807 298267813 ]
> [ steal-task 298267848 298296340 ]
> GC-Thread 4 entries: 13
> [ old-to-young-roots-task 298267561 298267611 ]
> [ scavenge-roots-task 298267613 298267639 ]
> [ thread-roots-task 298267640 298267661 ]
> [ thread-roots-task 298267664 298267673 ]
> [ thread-roots-task 298267675 298267685 ]
> [ thread-roots-task 298267687 298267696 ]
> [ thread-roots-task 298267699 298267707 ]
> [ thread-roots-task 298267708 298267717 ]
> [ thread-roots-task 298267725 298267765 ]
> [ thread-roots-task 298267767 298267781 ]
> [ thread-roots-task 298267807 298267828 ]
> [ scavenge-roots-task 298267829 298267831 ]
> [ steal-task 298267833 298296387 ]
> GC-Thread 5 entries: 13
> [ old-to-young-roots-task 298267573 298267618 ]
> [ thread-roots-task 298267622 298267651 ]
> [ thread-roots-task 298267654 298267661 ]
> [ thread-roots-task 298267663 298267669 ]
> [ thread-roots-task 298267670 298267696 ]
> [ thread-roots-task 298267698 298267704 ]
> [ thread-roots-task 298267706 298267712 ]
> [ thread-roots-task 298267719 298267726 ]
> [ thread-roots-task 298267727 298267781 ]
> [ thread-roots-task 298267784 298267791 ]
> [ thread-roots-task 298267792 298267821 ]
> [ scavenge-roots-task 298267823 298269583 ]
> [ steal-task 298269584 298296338 ]
> GC-Thread 6 entries: 11
> [ old-to-young-roots-task 298267551 298267601 ]
> [ serial-old-to-young-roots-task 298267603 298267625 ]
> [ thread-roots-task 298267626 298267653 ]
> [ thread-roots-task 298267666 298267676 ]
> [ thread-roots-task 298267678 298267684 ]
> [ thread-roots-task 298267686 298267691 ]
> [ thread-roots-task 298267692 298267723 ]
> [ thread-roots-task 298267725 298267730 ]
> [ thread-roots-task 298267789 298267803 ]
> [ thread-roots-task 298267804 298267818 ]
> [ steal-task 298267863 298296447 ]
> GC-Thread 7 entries: 8
> [ old-to-young-roots-task 298267603 298267648 ]
> [ thread-roots-task 298267650 298267687 ]
> [ thread-roots-task 298267689 298267731 ]
> [ thread-roots-task 298267773 298267786 ]
> [ thread-roots-task 298267788 298267793 ]
> [ thread-roots-task 298267797 298267855 ]
> [ steal-task 298267857 298296336 ]
> [ waitfor-barrier-task 298296341 298296578 ]
> GC-Thread 8 entries: 9
> [ old-to-young-roots-task 298267594 298267634 ]
> [ thread-roots-task 298267637 298267674 ]
> [ thread-roots-task 298267675 298267712 ]
> [ thread-roots-task 298267716 298267732 ]
> [ thread-roots-task 298267760 298267768 ]
> [ thread-roots-task 298267816 298267824 ]
> [ scavenge-roots-task 298267826 298267828 ]
> [ scavenge-roots-task 298267829 298267831 ]
> [ steal-task 298267832 298296428 ]
> GC-Thread 9 entries: 17
> [ old-to-young-roots-task 298267540 298267584 ]
> [ old-to-young-roots-task 298267586 298267628 ]
> [ thread-roots-task 298267629 298267652 ]
> [ thread-roots-task 298267656 298267666 ]
> [ thread-roots-task 298267667 298267676 ]
> [ thread-roots-task 298267677 298267694 ]
> [ thread-roots-task 298267698 298267708 ]
> [ thread-roots-task 298267709 298267715 ]
> [ thread-roots-task 298267744 298267754 ]
> [ thread-roots-task 298267756 298267763 ]
> [ thread-roots-task 298267765 298267775 ]
> [ thread-roots-task 298267778 298267801 ]
> [ thread-roots-task 298267802 298267809 ]
> [ thread-roots-task 298267810 298267812 ]
> [ thread-roots-task 298267817 298267823 ]
> [ scavenge-roots-task 298267825 298267830 ]
> [ steal-task 298267832 298296460 ]
> GC-Thread 10 entries: 12
> [ scavenge-roots-task 298267613 298267618 ]
> [ thread-roots-task 298267621 298267638 ]
> [ thread-roots-task 298267639 298267659 ]
> [ thread-roots-task 298267663 298267671 ]
> [ thread-roots-task 298267673 298267682 ]
> [ thread-roots-task 298267683 298267693 ]
> [ thread-roots-task 298267695 298267705 ]
> [ thread-roots-task 298267707 298267720 ]
> [ thread-roots-task 298267723 298267730 ]
> [ thread-roots-task 298267795 298267811 ]
> [ thread-roots-task 298267813 298267833 ]
> [ steal-task 298267834 298296472 ]
> GC-Thread 11 entries: 13
> [ old-to-young-roots-task 298267514 298267589 ]
> [ old-to-young-roots-task 298267592 298267657 ]
> [ thread-roots-task 298267660 298267674 ]
> [ thread-roots-task 298267677 298267693 ]
> [ thread-roots-task 298267694 298267702 ]
> [ thread-roots-task 298267706 298267716 ]
> [ thread-roots-task 298267721 298267731 ]
> [ thread-roots-task 298267735 298267744 ]
> [ thread-roots-task 298267746 298267753 ]
> [ thread-roots-task 298267757 298267786 ]
> [ thread-roots-task 298267789 298267799 ]
> [ thread-roots-task 298267802 298267814 ]
> [ steal-task 298267838 298296336 ]
> GC-Thread 12 entries: 15
> [ old-to-young-roots-task 298267499 298267597 ]
> [ thread-roots-task 298267636 298267652 ]
> [ thread-roots-task 298267654 298267686 ]
> [ thread-roots-task 298267688 298267698 ]
> [ thread-roots-task 298267701 298267710 ]
> [ thread-roots-task 298267711 298267720 ]
> [ thread-roots-task 298267723 298267730 ]
> [ thread-roots-task 298267731 298267740 ]
> [ thread-roots-task 298267743 298267755 ]
> [ thread-roots-task 298267757 298267770 ]
> [ thread-roots-task 298267772 298267783 ]
> [ thread-roots-task 298267786 298267796 ]
> [ thread-roots-task 298267798 298267805 ]
> [ thread-roots-task 298267811 298267813 ]
> [ steal-task 298267843 298296347 ]

From bengt.rutisson at oracle.com Sat Aug 20 22:14:08 2011
From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com)
Date: Sat, 20 Aug 2011 22:14:08 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 6814390: G1: remove the concept of non-generational G1
Message-ID: <20110820221409.E631447F39@hg.openjdk.java.net>

Changeset: ff53346271fe
Author: brutisso
Date: 2011-08-19 09:30 +0200
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ff53346271fe

6814390: G1: remove the concept of non-generational G1
Summary: Removed the possibility to turn off generational mode for G1.
Reviewed-by: johnc, ysr, tonyp

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
! src/share/vm/gc_implementation/g1/concurrentMark.hpp
! src/share/vm/gc_implementation/g1/concurrentMarkThread.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp
! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp
! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp
! src/share/vm/gc_implementation/g1/g1_globals.hpp

From bengt.rutisson at oracle.com Sun Aug 21 21:03:32 2011
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Sun, 21 Aug 2011 23:03:32 +0200
Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures
In-Reply-To: <4E4EEDDB.4060408@oracle.com>
References: <4E4D5757.7040004@oracle.com> <4E4EEDDB.4060408@oracle.com>
Message-ID: <4E5172A4.7020802@oracle.com>

Hi John,

This new webrev looks good to me.

The discussion that Stefan and Tony had was good. I like the new
template parameter name (do_mark_object) and the new comments much
better. In fact, I realize my initial comments regarding those were a
bit wrong since I had misinterpreted the code due to the old comments.
Thanks for finding a good solution to this.

Ship it!
Bengt

On 2011-08-20 01:12, John Cuthbertson wrote:
> Hi Everyone,
>
> Hopefully this webrev
> (http://cr.openjdk.java.net/~johnc/7080389/webrev.1/) addresses
> everyones' comments.
>
> Thanks,
>
> JohnC
>
> On 08/18/11 11:17, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> Can I have a couple of volunteers review these refactoring changes to
>> the marking code used during evacuation pauses (both initial mark
>> pauses and regular evacuation pauses when marking is active) - the
>> change can be found at
>> http://cr.openjdk.java.net/~johnc/7080389/webrev.0/.
>>
>> The refactoring changes fix an issue that was seen with the code
>> changes for 6486945.
>>
>> During an initial mark pause, during root scanning, one thread had
>> successfully forwarded an object and had started to copy it. While
>> the object was being copied to its new location, another thread saw
>> that the object had been forwarded and, after checking that the new
>> location was unmarked, successfully marked the new location. The
>> first thread would finish the copying, see that the new location was
>> marked and skip the mark. The situation I ran into was that I was
>> attempting to obtain the size of the new object just after it was
>> marked (by the thread doing the marking) and the old object had not
>> yet been fully copied to its new location.
>>
>> With these refactoring changes, the thread that successfully forwards
>> an object in the collection set will mark the forwardee after copying
>> - allowing me to safely obtain it's size.
>>
>> Testing: several runs of the GC test suite with a marking threshold
>> of 10 and 20%, Kitchensink, and jprt.
>>
>> Thanks,
>>
>> JohnC
>>
>>
>

From jesper.wilhelmsson at oracle.com Mon Aug 22 08:03:40 2011
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Mon, 22 Aug 2011 10:03:40 +0200
Subject: CRR (XS): 7081064: G1: remove develop params G1FixedSurvivorSpaceSize, G1FixedTenuringThreshold, and G1FixedEdenSize
In-Reply-To: <4E4E802E.3000602@oracle.com>
References: <4E4E802E.3000602@oracle.com>
Message-ID: <4E520D5C.3000605@oracle.com>

Looks good to me.
/Jesper

On 08/19/2011 05:24 PM, Tony Printezis wrote:
> Hi all,
>
> Could I have a couple of reviews for this simple change to remove three
> non-product parameters we have not been using?
>
> http://cr.openjdk.java.net/~tonyp/7081064/webrev.0/
>
> Thanks,
>
> Tony

From bluedavy at gmail.com Mon Aug 22 08:34:12 2011
From: bluedavy at gmail.com (BlueDavy Lin)
Date: Mon, 22 Aug 2011 16:34:12 +0800
Subject: why this young gc is so slow?
In-Reply-To:
References: <16DE4025B07540FFAADD1934D4921C9F@oracle.com>
Message-ID:

The app use default Parallel GC,so survivor space will be adjusted on the fly...

2011/8/20 Igor Veresov :
> How big are the survivor spaces (you can get this info by running with -XX:+PrintHeapAtGC)?
>
> It seems that now it spends most of the time in steal-task, which is basically traversal of the subgraph that is in eden+survivor spaces and copying.
>
> igor
>
> On Friday, August 19, 2011 at 8:12 PM, BlueDavy Lin wrote:
>
>> I change the code,so avoid the old gen contains a huge data structure
>> ref to young gen,now the ygc time reduce to about 40ms,live objects
>> about 8m,but I think it should be faster than current,I'll see the
>> code more detail.
>
>
>
--
=============================
| BlueDavy |
| http://www.bluedavy.com |
=============================

From igor.veresov at oracle.com Mon Aug 22 09:51:57 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Mon, 22 Aug 2011 02:51:57 -0700
Subject: why this young gc is so slow?
In-Reply-To:
References: <16DE4025B07540FFAADD1934D4921C9F@oracle.com>
Message-ID: <5757127A7BB14B5BA9D4196983245584@oracle.com>

Right. But how big do they get at the time you see the pauses you'd like to optimize? If they're rather large consider making them smaller. That will come at price (most probably) of having full GCs happening sooner.

Also, if you're interested in low pause times you might want to try G1 or CMS.

igor

On Monday, August 22, 2011 at 1:34 AM, BlueDavy Lin wrote:

> The app use default Parallel GC,so survivor space will be adjusted on the fly...
>
> 2011/8/20 Igor Veresov :
> > How big are the survivor spaces (you can get this info by running with -XX:+PrintHeapAtGC)?
> >
> > It seems that now it spends most of the time in steal-task, which is basically traversal of the subgraph that is in eden+survivor spaces and copying.
> >
> > igor
> >
> > On Friday, August 19, 2011 at 8:12 PM, BlueDavy Lin wrote:
> >
> > > I change the code,so avoid the old gen contains a huge data structure
> > > ref to young gen,now the ygc time reduce to about 40ms,live objects
> > > about 8m,but I think it should be faster than current,I'll see the
> > > code more detail.
> >
>
> --
> =============================
> | BlueDavy |
> | http://www.bluedavy.com |
> =============================

From sbordet at intalio.com Mon Aug 22 10:28:51 2011
From: sbordet at intalio.com (Simone Bordet)
Date: Mon, 22 Aug 2011 12:28:51 +0200
Subject: why this young gc is so slow?
In-Reply-To: <5757127A7BB14B5BA9D4196983245584@oracle.com>
References: <16DE4025B07540FFAADD1934D4921C9F@oracle.com> <5757127A7BB14B5BA9D4196983245584@oracle.com>
Message-ID:

Hi,

On Mon, Aug 22, 2011 at 11:51, Igor Veresov wrote:
> Right. But how big do they get at the time you see the pauses you'd like to optimize? If they're rather large consider making them smaller. That will come at price (most probably) of having full GCs happening sooner.
>
> Also, if you're interested in low pause times you might want to try G1 or CMS.

Are you implying that G1 or CMS do faster young collections than PS ?

I was under the impression that the basic algorithm for young
collection was more or less the same for all GC algorithms, and
dependent on live objects and edges (like you said in a previous
email), but I'd be interested in knowing if there are more
differences.

Also, if steal-task dominates, would it be right to deduce that a
bigger young generation will give more chances to young objects to
become garbage and therefore steal-task times will reduce ?

Thanks !

Simon
--
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless. Victoria Livschitz

From bengt.rutisson at oracle.com Mon Aug 22 10:37:06 2011
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 22 Aug 2011 12:37:06 +0200
Subject: CRR (XS): 7081064: G1: remove develop params G1FixedSurvivorSpaceSize, G1FixedTenuringThreshold, and G1FixedEdenSize
In-Reply-To: <4E520D5C.3000605@oracle.com>
References: <4E4E802E.3000602@oracle.com> <4E520D5C.3000605@oracle.com>
Message-ID: <4E523152.7060608@oracle.com>

Looks good to me too.

Bengt

On 2011-08-22 10:03, Jesper Wilhelmsson wrote:
> Looks good to me.
> /Jesper
>
> On 08/19/2011 05:24 PM, Tony Printezis wrote:
>> Hi all,
>>
>> Could I have a couple of reviews for this simple change to remove three
>> non-product parameters we have not been using?
>>
>> http://cr.openjdk.java.net/~tonyp/7081064/webrev.0/
>>
>> Thanks,
>>
>> Tony

From tony.printezis at oracle.com Mon Aug 22 14:19:16 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Mon, 22 Aug 2011 10:19:16 -0400
Subject: CRR (XS): 7081064: G1: remove develop params G1FixedSurvivorSpaceSize, G1FixedTenuringThreshold, and G1FixedEdenSize
In-Reply-To: <4E523152.7060608@oracle.com>
References: <4E4E802E.3000602@oracle.com> <4E520D5C.3000605@oracle.com> <4E523152.7060608@oracle.com>
Message-ID: <4E526564.6090805@oracle.com>

Bengt and Jesper,

Thanks! All set.

Tony

Bengt Rutisson wrote:
>
> Looks good to me too.
>
> Bengt
>
> On 2011-08-22 10:03, Jesper Wilhelmsson wrote:
>> Looks good to me.
>> /Jesper
>>
>> On 08/19/2011 05:24 PM, Tony Printezis wrote:
>>> Hi all,
>>>
>>> Could I have a couple of reviews for this simple change to remove three
>>> non-product parameters we have not been using?
>>>
>>> http://cr.openjdk.java.net/~tonyp/7081064/webrev.0/
>>>
>>> Thanks,
>>>
>>> Tony
>

From kirk at kodewerk.com Mon Aug 22 15:08:11 2011
From: kirk at kodewerk.com (Charles K Pepperdine)
Date: Mon, 22 Aug 2011 17:08:11 +0200
Subject: Young GC pause time definitions
In-Reply-To:
References:
Message-ID: <92354D18-E0DC-4C02-8D61-91C076FB548D@kodewerk.com>

>
>
> The customer site is running an old jdk 1.6.0_14, with
> -XX:+UseParNewGC and -XX:+UseConcMarkSweepGC. Uses a 12 GB heap, a
> relatively small 512 MB new size.

This seems like a highly suspicious configuration that I would guess is at the root of the problem. Please use -XX:+PrintTenuringDistribution and post the gc log if you can.
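Once a log with -XX:+PrintTenuringDistribution is available, the per-age survivor occupancy can be pulled out with a few lines of script. The following is only a minimal sketch: the sample text and all of its numbers are invented for illustration, the line shapes are assumed to match what the collector prints, and the helper name parse_tenuring is hypothetical.

```python
import re

# Invented sample in the assumed shape of -XX:+PrintTenuringDistribution output.
SAMPLE = """\
Desired survivor size 26836992 bytes, new threshold 4 (max 15)
- age   1:   12000000 bytes,   12000000 total
- age   2:    8000000 bytes,   20000000 total
- age   3:    5000000 bytes,   25000000 total
- age   4:    4000000 bytes,   29000000 total
"""

HEADER = re.compile(r"Desired survivor size (\d+) bytes, new threshold (\d+) \(max (\d+)\)")
AGE = re.compile(r"- age\s+(\d+):\s+(\d+) bytes,\s+(\d+) total")


def parse_tenuring(text):
    """Return (desired_size, threshold, max_threshold, {age: bytes_at_that_age})."""
    desired = threshold = max_threshold = None
    ages = {}
    for line in text.splitlines():
        m = HEADER.search(line)
        if m:
            desired, threshold, max_threshold = map(int, m.groups())
            continue
        m = AGE.search(line)
        if m:
            ages[int(m.group(1))] = int(m.group(2))
    return desired, threshold, max_threshold, ages


desired, threshold, max_threshold, ages = parse_tenuring(SAMPLE)
total = sum(ages.values())
print(f"survivor occupancy {total} bytes vs desired {desired} bytes, threshold {threshold}")
```

A large share of bytes surviving into the higher ages would suggest objects being copied between survivor spaces several times before promotion, which is the kind of behaviour the log is requested to confirm or rule out.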
Regards,
Kirk Pepperdine

From igor.veresov at oracle.com Mon Aug 22 17:52:06 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Mon, 22 Aug 2011 10:52:06 -0700
Subject: why this young gc is so slow?
In-Reply-To:
References: <16DE4025B07540FFAADD1934D4921C9F@oracle.com> <5757127A7BB14B5BA9D4196983245584@oracle.com>
Message-ID:

On Monday, August 22, 2011 at 3:28 AM, Simone Bordet wrote:

> Hi,
>
> On Mon, Aug 22, 2011 at 11:51, Igor Veresov wrote:
> > Right. But how big do they get at the time you see the pauses you'd like to optimize? If they're rather large consider making them smaller. That will come at price (most probably) of having full GCs happening sooner.
> >
> > Also, if you're interested in low pause times you might want to try G1 or CMS.
>
> Are you implying that G1 or CMS do faster young collections than PS ?
>
> I was under the impression that the basic algorithm for young
> collection was more or less the same for all GC algorithms, and
> dependent on live objects and edges (like you said in a previous
> email), but I'd be interested in knowing if there are more
> differences.
>
No, the young collections are not faster, but the old collections effectively are, in terms of the STW time seen by the application, which allows for larger promotion (young->old) rates. Plus, G1 has better adaptive sizing algorithms - you can just set the pause time goal.

> Also, if steal-task dominates, would it be right to deduce that a
> bigger young generation will give more chances to young objects to
> become garbage and therefore steal-task times will reduce ?

In theory, yes. But in reality it depends on how well the lifetimes of the objects fit the generational hypothesis and what the distribution is. If a substantial portion of objects has a relatively long lifetime it doesn't make much sense to spend time copying them between the survivor spaces.
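Whether steal-task really dominates can be checked directly from the GC task timestamps posted earlier in this thread. The following is a minimal sketch, not part of any JVM tooling: it sums end minus start per task name, using the GC-Thread 1 entries from that log as sample input. The helper name time_per_task is hypothetical, and the unit of the timestamps is not stated in the thread, so the totals are reported as plain ticks.

```python
import re
from collections import defaultdict

# GC-Thread 1 entries copied from the task timestamps posted in this thread.
SAMPLE = """\
GC-Thread 1 entries: 6
[ thread-roots-task 298267623 298267739 ]
[ thread-roots-task 298267743 298267764 ]
[ thread-roots-task 298267766 298267775 ]
[ thread-roots-task 298267776 298267818 ]
[ thread-roots-task 298267820 298267835 ]
[ steal-task 298267836 298296403 ]
"""

# One entry looks like: [ <task-name> <start> <end> ]
ENTRY = re.compile(r"\[\s*([\w-]+)\s+(\d+)\s+(\d+)\s*\]")


def time_per_task(text):
    """Sum (end - start) for every task name found in the text."""
    totals = defaultdict(int)
    for name, start, end in ENTRY.findall(text):
        totals[name] += int(end) - int(start)
    return dict(totals)


totals = time_per_task(SAMPLE)
grand_total = sum(totals.values())
for name, ticks in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {ticks:8d} ticks {100.0 * ticks / grand_total:5.1f}%")
```

For this one thread, steal-task accounts for 28567 of 28770 ticks. Because the pattern ignores any leading quote markers, the same function can be fed the whole quoted log to get the totals across all GC threads.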
Basically, the cost of keeping an object in the survivor spaces should be less than collecting it as a part of the old gen to keep the scheme beneficial. So with cheaper old collections you can allow for faster promotions.

igor

>
> Thanks !
>
> Simon
> --
> http://bordet.blogspot.com
> ---
> Finally, no matter how good the architecture and design are,
> to deliver bug-free software with optimal performance and reliability,
> the implementation technique must be flawless. Victoria Livschitz

From tony.printezis at oracle.com Mon Aug 22 21:01:26 2011
From: tony.printezis at oracle.com (tony.printezis at oracle.com)
Date: Mon, 22 Aug 2011 21:01:26 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 7081064: G1: remove develop params G1FixedSurvivorSpaceSize, G1FixedTenuringThreshold, and G1FixedEdenSize
Message-ID: <20110822210128.4592B47FEA@hg.openjdk.java.net>

Changeset: ae73da50be4b
Author: tonyp
Date: 2011-08-22 10:16 -0400
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ae73da50be4b

7081064: G1: remove develop params G1FixedSurvivorSpaceSize, G1FixedTenuringThreshold, and G1FixedEdenSize
Summary: Remove three develop parameters we don't use.
Reviewed-by: brutisso, jwilhelm

! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp
! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp
! src/share/vm/gc_implementation/g1/g1_globals.hpp

From john.cuthbertson at oracle.com Mon Aug 22 22:12:50 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Mon, 22 Aug 2011 15:12:50 -0700
Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures
In-Reply-To: <4E5172A4.7020802@oracle.com>
References: <4E4D5757.7040004@oracle.com> <4E4EEDDB.4060408@oracle.com> <4E5172A4.7020802@oracle.com>
Message-ID: <4E52D462.9040409@oracle.com>

Hi Bengt, Tony, Stefan,

Thanks again for the review comments. I have to hold off pushing this as I have seen a failure while doing a sanity check after merging the workspace.
So until I get to the bottom of the failure - consider this as "on hold". JohnC On 08/21/11 14:03, Bengt Rutisson wrote: > > Hi John, > > This new webrev looks good to me. > > The discussion that Stefan and Tony had was good. I like the new > template parameter name (do_mark_object) and the new comments much > better. In fact, I realize my initial comments regarding those were a > bit wrong since I had misinterpreted the code due to the old comments. > Thanks for finding a good solution to this. > > Ship it! > Bengt > > On 2011-08-20 01:12, John Cuthbertson wrote: >> Hi Everyone, >> >> Hopefully this webrev >> (http://cr.openjdk.java.net/~johnc/7080389/webrev.1/) addresses >> everyone's comments. >> >> Thanks, >> >> JohnC >> >> On 08/18/11 11:17, John Cuthbertson wrote: >>> Hi Everyone, >>> >>> Can I have a couple of volunteers review these refactoring changes >>> to the marking code used during evacuation pauses (both initial mark >>> pauses and regular evacuation pauses when marking is active) - the >>> change can be found at >>> http://cr.openjdk.java.net/~johnc/7080389/webrev.0/. >>> >>> The refactoring changes fix an issue that was seen with the code >>> changes for 6486945. >>> >>> During an initial mark pause, during root scanning, one thread had >>> successfully forwarded an object and had started to copy it. While >>> the object was being copied to its new location, another thread saw >>> that the object had been forwarded and, after checking that the new >>> location was unmarked, successfully marked the new location. The >>> first thread would finish the copying, see that the new location was >>> marked and skip the mark. The situation I ran into was that I was >>> attempting to obtain the size of the new object just after it was >>> marked (by the thread doing the marking) and the old object had not >>> yet been fully copied to its new location.
>>> >>> With these refactoring changes, the thread that successfully >>> forwards an object in the collection set will mark the forwardee >>> after copying - allowing me to safely obtain its size. >>> >>> Testing: several runs of the GC test suite with a marking threshold >>> of 10 and 20%, Kitchensink, and jprt. >>> >>> Thanks, >>> >>> JohnC >>> >>> >> > From y.s.ramakrishna at oracle.com Tue Aug 23 09:11:37 2011 From: y.s.ramakrishna at oracle.com (y.s.ramakrishna at oracle.com) Date: Tue, 23 Aug 2011 09:11:37 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20110823091143.0FB6747015@hg.openjdk.java.net> Changeset: 7f776886a215 Author: ysr Date: 2011-08-22 12:30 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7f776886a215 6810861: G1: support -XX:+{PrintClassHistogram,HeapDump}{Before,After}FullGC Summary: Call {pre,post}_full_gc_dump() before and after a STW full gc of G1CollectedHeap. Also adjusted the prefix message, including the addition of missing whitespace. Reviewed-by: brutisso, tonyp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_interface/collectedHeap.cpp Changeset: be05e987ba07 Author: ysr Date: 2011-08-22 23:57 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/be05e987ba07 Merge From bengt.rutisson at oracle.com Tue Aug 23 09:26:35 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 23 Aug 2011 11:26:35 +0200 Subject: Request for review (XS): 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new Message-ID: <4E53724B.90408@oracle.com> Hi all, Could I please have a couple of reviews for this small fix? After the secure dll loading fix (7016797) the psapi.lib library is needed for Windows builds. The original change made sure that this library is provided to the linker for the command line builds. However, the builds from inside Visual Studio also need to know about this library.
Webrev: http://cr.openjdk.java.net/~brutisso/7082220/webrev/ CR: http://monaco.us.oracle.com/detail.jsf?cr=7082220 Testing: I created a Visual Studio project with the create script and with the change above the project builds nicely. I am including both Runtime and GC in this mail. The change is to runtime code, but I would like to push this through hotspot-gc. The reason is that this is blocking my work. Whenever I am setting up a new repository I run into this issue. But it seems that I am the only one who uses the Visual Studio builds at the moment. The issue has been around for several months but it only got integrated into hsx/hotspot-gc 5 days ago. So, for me it would be easier to integrate directly into hotspot-gc and start using the fix rather than having to wait for the fix to propagate from hotspot-rt to hotspot-gc. Thanks, Bengt From jesper.wilhelmsson at oracle.com Tue Aug 23 09:46:19 2011 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Tue, 23 Aug 2011 11:46:19 +0200 Subject: Request for review (XS): 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new In-Reply-To: <4E53724B.90408@oracle.com> References: <4E53724B.90408@oracle.com> Message-ID: <4E5376EB.1080905@oracle.com> Ship it! /Jesper On 08/23/2011 11:26 AM, Bengt Rutisson wrote: > > Hi all, > > Could I please have a couple of reviews for this small fix? After the secure > dll loading fix (7016797) the psapi.lib library is needed for Windows builds. > The original change made sure that this library is provided to the linker for > the command line builds. However, the builds from inside Visual Studio also > need to know about this library. > > Webrev: > http://cr.openjdk.java.net/~brutisso/7082220/webrev/ > > CR: > http://monaco.us.oracle.com/detail.jsf?cr=7082220 > > Testing: > I created a Visual Studio project with the create script and with the change > above the project builds nicely. 
> > I am including both Runtime and GC in this mail. The change is to runtime > code, but I would like to push this through hotspot-gc. The reason is that > this is blocking my work. Whenever I am setting up a new repository I run into > this issue. But it seems that I am the only one who uses the Visual Studio > builds at the moment. The issue has been around for several months but it only > got integrated into hsx/hotspot-gc 5 days ago. So, for me it would be easier > to integrate directly into hotspot-gc and start using the fix rather than > having to wait for the fix to propagate from hotspot-rt to hotspot-gc. > > Thanks, > Bengt From poonam.bajaj at oracle.com Tue Aug 23 10:58:23 2011 From: poonam.bajaj at oracle.com (Poonam Bajaj) Date: Tue, 23 Aug 2011 16:28:23 +0530 Subject: Request for review (XS): 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new In-Reply-To: <4E53724B.90408@oracle.com> References: <4E53724B.90408@oracle.com> Message-ID: <4E5387CF.2050809@oracle.com> Looks good! Thanks, Poonam On 8/23/2011 2:56 PM, Bengt Rutisson wrote: > > Hi all, > > Could I please have a couple of reviews for this small fix? After the > secure dll loading fix (7016797) the psapi.lib library is needed for > Windows builds. The original change made sure that this library is > provided to the linker for the command line builds. However, the > builds from inside Visual Studio also need to know about this library. > > Webrev: > http://cr.openjdk.java.net/~brutisso/7082220/webrev/ > > CR: > http://monaco.us.oracle.com/detail.jsf?cr=7082220 > > Testing: > I created a Visual Studio project with the create script and with the > change above the project builds nicely. > > I am including both Runtime and GC in this mail. The change is to > runtime code, but I would like to push this through hotspot-gc. The > reason is that this is blocking my work. Whenever I am setting up a > new repository I run into this issue. 
But it seems that I am the only > one who uses the Visual Studio builds at the moment. The issue has > been around for several months but it only got integrated into > hsx/hotspot-gc 5 days ago. So, for me it would be easier to integrate > directly into hotspot-gc and start using the fix rather than having to > wait for the fix to propagate from hotspot-rt to hotspot-gc. > > Thanks, > Bengt From tom.deneau at amd.com Tue Aug 23 18:23:26 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 23 Aug 2011 13:23:26 -0500 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <462098EF18364A629C463AC72D5495CC@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> Please review this patch which adds a new flag called UseNUMAInterleaving. This flag provides a subset of the functionality provided by UseNUMA. In Hotspot UseNUMA terminology, UseNUMAInterleaving makes all memory "numa_global" which is implemented as interleaved. This patch's main purpose is to provide that subset on OSes like Windows which do not support the full UseNUMA functionality. However, a simple implementation of UseNUMAInterleaving is also provided for other OSes. The situations where this shows the biggest benefits would be: * Windows platforms with multiple numa nodes (e.g., 4) * The JVM process is run across all the nodes (not affinitized to one node).
* A workload that has enough threads so that it uses the majority of the cores in the machine, so that the heap is being accessed from many cores, including remote ones. * Enough memory per node and a heap size such that the default heap placement policy on Windows would end up with the heap (or nursery) placed on one node. jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our measurements, we have seen some cases where the performance with UseNUMAInterleaving was 2.7x vs. the performance without. There were gains of varying sizes across all systems. The webrev is at http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ Summary of changes in webrev.04 from webrev.03: * As suggested by Igor Veresov, UseNUMA can imply UseNUMAInterleaving on all platforms. This is in arguments.cpp * In NUMANodeListHolder in os_windows.cpp, allocates the node_list dynamically rather than assuming a length of 64. The method NUMANodeListHolder::get_node_list_entry returns -1 for indexes that are out of bounds. * Several code convention cleanups suggested by Igor. * Merge with the new style system dll function resolutions from "7016797: Hotspot: securely/restrictive load dlls and new API for loading system dlls" Note: my new NUMA functions are outside the ifdefs. Summary of changes in webrev.03 from webrev.02: * As suggested by Igor Veresov, reverts to using UseNUMAInterleaving as the enabling flag. This will make it easier in the future when there are GCs that enable fuller UseNUMA on Windows. * Adds a simple implementation of UseNUMAInterleaving on Linux and Solaris, which just calls numa_make_global after commit_memory and reserve_memory_special * Adds a flag NUMAInterleaveGranularity which allows setting the granularity with which we move to a different node in a memory allocation. The default is 2MB. This flag only applies to Windows for now. * Several code cleanups in os_windows.cpp suggested by Igor.
Summary of overall changes in os_windows.cpp: * Some static routines were added to set things up at init time. These * check that the required APIs (VirtualAllocExNuma, GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in the OS * build the list of numa nodes on which this process has affinity * Changes to os::reserve_memory * There was already a routine that reserved pages one page at a time (used for Individual Large Page Allocation on WS2003). This was abstracted to a separate routine, called allocate_pages_individually. This gets called both for the Individual Large Page Allocation thing mentioned above and for UseNUMAInterleaving (for both small and large pages) * When used for NUMA Interleaving this just goes through the numa node list in a round-robin fashion, allocating chunks at the NUMAInterleaveGranularity using a different allocation for each chunk * Whether we do just a reserve or a combined reserve/commit is determined by the caller of allocate_pages_individually * When used with large pages, we do a Reserve and Commit at the same time, which is the way it always worked and the way it has to work on Windows. * For small pages, only the reserve is done; the commit will come later (which is the way it worked for non-interleaved). * os::commit_memory changes * If UseNUMAInterleaving is true, os::commit_memory has to check whether it was being asked to commit memory that might have come from multiple Reserve allocations; if so, the commits must also be broken up. We don't keep any data structure to keep track of this, we just use VirtualQuery, which queries the properties of a VA range and can tell us how much came from one VirtualAlloc call. I do not have a bug id for this.
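As a rough illustration of the round-robin interleaving described in the os_windows.cpp summary above, here is a minimal C++ sketch. It is not code from the webrev: Chunk and plan_interleaved_chunks are hypothetical names, and the sketch only computes which node each granularity-sized chunk would be requested on, without calling any Windows API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for illustration only; not taken from the webrev.
struct Chunk {
  std::size_t offset;  // byte offset from the start of the reservation
  std::size_t size;    // number of bytes in this chunk
  int node;            // NUMA node this chunk would be requested on
};

// Split a reservation of `bytes` into chunks of at most `granularity` bytes
// and assign them to the entries of `nodes` in round-robin order.
// Returns an empty plan if there is nothing sensible to do.
std::vector<Chunk> plan_interleaved_chunks(std::size_t bytes,
                                           std::size_t granularity,
                                           const std::vector<int>& nodes) {
  std::vector<Chunk> plan;
  if (granularity == 0 || nodes.empty()) {
    return plan;
  }
  std::size_t next = 0;  // index into `nodes` of the node to use next
  for (std::size_t off = 0; off < bytes; off += granularity) {
    std::size_t remaining = bytes - off;
    // The last chunk may be shorter than the granularity.
    std::size_t size = remaining < granularity ? remaining : granularity;
    plan.push_back(Chunk{off, size, nodes[next]});
    next = (next + 1) % nodes.size();
  }
  return plan;
}
```

In a real implementation each planned chunk would correspond to one separate per-node allocation (the summary suggests VirtualAllocExNuma on Windows), which is also why a later commit over the same range may have to be split along the same chunk boundaries.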
-- Tom Deneau, AMD From bengt.rutisson at oracle.com Tue Aug 23 18:31:02 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 23 Aug 2011 20:31:02 +0200 Subject: Request for review (XS): 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new In-Reply-To: <4E5387CF.2050809@oracle.com> References: <4E53724B.90408@oracle.com> <4E5387CF.2050809@oracle.com> Message-ID: <4E53F1E6.7090208@oracle.com> Poonam and Jesper, Thanks for the prompt reviews! In theory I am all set now, but since I think this is runtime code it would be great if I could get a review from someone on the runtime team as well. It's a really small change, so a review should be fast... Thanks, Bengt On 2011-08-23 12:58, Poonam Bajaj wrote: > Looks good! > > Thanks, > Poonam > > On 8/23/2011 2:56 PM, Bengt Rutisson wrote: >> >> Hi all, >> >> Could I please have a couple of reviews for this small fix? After the >> secure dll loading fix (7016797) the psapi.lib library is needed for >> Windows builds. The original change made sure that this library is >> provided to the linker for the command line builds. However, the >> builds from inside Visual Studio also need to know about this library. >> >> Webrev: >> http://cr.openjdk.java.net/~brutisso/7082220/webrev/ >> >> CR: >> http://monaco.us.oracle.com/detail.jsf?cr=7082220 >> >> Testing: >> I created a Visual Studio project with the create script and with the >> change above the project builds nicely. >> >> I am including both Runtime and GC in this mail. The change is to >> runtime code, but I would like to push this through hotspot-gc. The >> reason is that this is blocking my work. Whenever I am setting up a >> new repository I run into this issue. But it seems that I am the only >> one who uses the Visual Studio builds at the moment. The issue has >> been around for several months but it only got integrated into >> hsx/hotspot-gc 5 days ago. 
So, for me it would be easier to integrate >> directly into hotspot-gc and start using the fix rather than having >> to wait for the fix to propagate from hotspot-rt to hotspot-gc. >> >> Thanks, >> Bengt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From igor.veresov at oracle.com Tue Aug 23 18:53:29 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 23 Aug 2011 11:53:29 -0700 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> Message-ID: <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> Tom, This looks good to me, except three minor things: os_windows.cpp: - you should check for null here: 2630 ~NUMANodeListHolder() { > if (_numa_used_node_list != NULL) { 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > } 2632 } - if NUMANodeListHolder::build() will be called multiple times, you'll leak memory. I guess you should check if _numa_used_node_list is NULL and if not free it first. - you didn't modify os::numa_get_leaf_groups() to handle the situation when the value of argument "size" is bigger than NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. See my comment in the previous mail. igor On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > Please review this patch which adds a new flag called > UseNUMAInterleaving. This flag provides a subset of the functionality > provided by UseNUMA. 
In Hotspot UseNUMA terminology, > UseNUMAInterleaved makes all memory "numa_global" which is implemented > as interleaved. This patch's main purpose is to provide that subset > on OSes like Windows which do not support the full UseNUMA > functionality. However, a simple implementation of UseNUMAInterleaving is > also provided for other OSes > > The situations where this shows the biggest benefits would be: > * Windows platforms with multiple numa nodes (eg, 4) > > * The JVM process is run across all the nodes (not affinitized to > one node). > > * A workload that has enough threads so that it uses the majority > of the cores in the machine, so that the heap is being accessed > from many cores, including remote ones. > > * Enough memory per node and a heap size such that the default heap > placement policy on windows would end up with the heap (or > nursery) placed on one node. > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > measurements, we have seen some cases where the performance with > UseNUMAInterleaving was 2.7x vs. the performance without. There were > gains of varying sizes across all systems. > > The webrev is at > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > > Summary of changes in webrev.04 from webrev.03: > > * As suggested by Igor Veresov, UseNUMA can imply > UseNUMAInterleaving on all platforms. This is in arguments.cpp > > * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > dynamically rather than assuming a length of 64. The method > NUMANodeListHolder::get_node_list_entry checks returns -1 for > indexes that are out of bounds. > > * Several code convention cleanups suggested by Igor. > > * Merge with the new style system dll function resolutions from > "7016797: Hotspot: securely/restrictive load dlls and new API for > loading system dlls" Note: my new NUMA functions are outside the ifdefs. 
> > > Summary of changes in webrev.03 from webrev.02: > > * As suggested by Igor Veresov, reverts to using > UseNUMAInterleaving as the enabling flag. This will make it > easier in the future when there are GCs that enable fuller > UseNUMA on Windows. > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > Solaris, which just calls numa_make_global after commit_memory > and reserve_memory_special > > * Adds a flag NUMAInterleaveGranularity which allows setting the > granularity with which we move to a different node in a memory > allocation. The default is 2MB. This flag only applies to > Windows for now. > > * Several code cleanups in os_windows.cpp suggested by Igor. > > > Summary of overall changes in os_windows.cpp: > > * Some static routines were added to set things up init time. These > * check that the required APIs (VirtualAllocExNuma, > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > the OS > > * build the list of numa nodes on which this process has affinity > > * Changes to os::reserve_memory > * There was already a routine that reserved pages one page at a > time (used for Individual Large Page Allocation on WS2003). > This was abstracted to a separate routine, called > allocate_pages_individually. This gets called both for the > Individual Large Page Allocation thing mentioned above and for > UseNUMAInterleaving (for both small and large pages) > > * When used for NUMA Interleaving this just goes thru the numa > node list in a round-robin fashion, allocating chunks at the > NUMAInterleaveGranularity using a different allocation for > each chunk > > * Whether we do just a reserve or a combined reserve/commit is > determined by the caller of allocate_pages_individually > > * When used with large pages, we do a Reserve and Commit at > the same time which is the way it always worked and the way > it has to work on windows. > > * For small pages, only the reserve is done, the commit will > come later. 
(which is the way it worked for > non-interleaved) > > * os::commit_memory changes > * If UseNUMAIntereaving is true, os::commit_memory has to check > whether it was being asked to commit memory that might have > come from multiple Reserve allocations, if so, the commits > must also be broken up. We don't keep any data structure to > keep track of this, we just use VirtualQuery which queries the > properties of a VA range and can tell us how much came from > one VirtualAlloc call. > > I do not have a bug id for this. > > -- Tom Deneau, AMD From keith.mcguigan at oracle.com Tue Aug 23 18:59:59 2011 From: keith.mcguigan at oracle.com (Keith McGuigan) Date: Tue, 23 Aug 2011 14:59:59 -0400 Subject: Request for review (XS): 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new In-Reply-To: <4E53F1E6.7090208@oracle.com> References: <4E53724B.90408@oracle.com> <4E5387CF.2050809@oracle.com> <4E53F1E6.7090208@oracle.com> Message-ID: <2B419C56-3576-4690-9FA5-2092083FE938@oracle.com> Thumbs up. On Aug 23, 2011, at 2:31 PM, Bengt Rutisson wrote: > > Poonam and Jesper, > > Thanks for the prompt reviews! In theory I am all set now, but since > I think this is runtime code it would be great if I could get a > review from someone on the runtime team as well. > > It's a really small change, so a review should be fast... > > Thanks, > Bengt > > On 2011-08-23 12:58, Poonam Bajaj wrote: >> >> Looks good! >> >> Thanks, >> Poonam >> >> On 8/23/2011 2:56 PM, Bengt Rutisson wrote: >>> >>> >>> Hi all, >>> >>> Could I please have a couple of reviews for this small fix? After >>> the secure dll loading fix (7016797) the psapi.lib library is >>> needed for Windows builds. The original change made sure that this >>> library is provided to the linker for the command line builds. >>> However, the builds from inside Visual Studio also need to know >>> about this library. 
>>> >>> Webrev: >>> http://cr.openjdk.java.net/~brutisso/7082220/webrev/ >>> >>> CR: >>> http://monaco.us.oracle.com/detail.jsf?cr=7082220 >>> >>> Testing: >>> I created a Visual Studio project with the create script and with >>> the change above the project builds nicely. >>> >>> I am including both Runtime and GC in this mail. The change is to >>> runtime code, but I would like to push this through hotspot-gc. >>> The reason is that this is blocking my work. Whenever I am setting >>> up a new repository I run into this issue. But it seems that I am >>> the only one who uses the Visual Studio builds at the moment. The >>> issue has been around for several months but it only got >>> integrated into hsx/hotspot-gc 5 days ago. So, for me it would be >>> easier to integrate directly into hotspot-gc and start using the >>> fix rather than having to wait for the fix to propagate from >>> hotspot-rt to hotspot-gc. >>> >>> Thanks, >>> Bengt >> >> >
From tom.deneau at amd.com Tue Aug 23 19:17:33 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 23 Aug 2011 14:17:33 -0500 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A9018D581E82@SAUSEXMBP01.amd.com> Igor -- For your comment on numa_get_leaf_groups, the bounds checking is done in NUMANodeListHolder::get_node_list_entry. Although I wasn't sure what you were supposed to return for indices that are out of bounds, so I returned -1. I'll take care of the other cleanups. -- Tom > -----Original Message----- > From: Igor Veresov [mailto:igor.veresov at oracle.com] > Sent: Tuesday, August 23, 2011 1:53 PM > To: Deneau, Tom > Cc: hotspot-gc-dev at openjdk.java.net > Subject: Re: Review Request: UseNUMAInterleaving #4 > > Tom, > > This looks good to me, except three minor things: > > os_windows.cpp: > > - you should check for null here: > 2630 ~NUMANodeListHolder() { > > if (_numa_used_node_list != NULL) { > 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > > } > 2632 } > > - if NUMANodeListHolder::build() will be called multiple times, you'll > leak memory. I guess you should check if _numa_used_node_list is NULL and > if not free it first.
> > - you didn't modify os::numa_get_leaf_groups() to handle the situation > when the value of argument "size" is bigger than > NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. > See my comment in the previous mail. > > > igor > > On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > > > Please review this patch which adds a new flag called > > UseNUMAInterleaving. This flag provides a subset of the functionality > > provided by UseNUMA. In Hotspot UseNUMA terminology, > > UseNUMAInterleaved makes all memory "numa_global" which is implemented > > as interleaved. This patch's main purpose is to provide that subset > > on OSes like Windows which do not support the full UseNUMA > > functionality. However, a simple implementation of UseNUMAInterleaving > is > > also provided for other OSes > > > > The situations where this shows the biggest benefits would be: > > * Windows platforms with multiple numa nodes (eg, 4) > > > > * The JVM process is run across all the nodes (not affinitized to > > one node). > > > > * A workload that has enough threads so that it uses the majority > > of the cores in the machine, so that the heap is being accessed > > from many cores, including remote ones. > > > > * Enough memory per node and a heap size such that the default heap > > placement policy on windows would end up with the heap (or > > nursery) placed on one node. > > > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > > measurements, we have seen some cases where the performance with > > UseNUMAInterleaving was 2.7x vs. the performance without. There were > > gains of varying sizes across all systems. > > > > The webrev is at > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > > > > Summary of changes in webrev.04 from webrev.03: > > > > * As suggested by Igor Veresov, UseNUMA can imply > > UseNUMAInterleaving on all platforms. 
This is in arguments.cpp > > > > * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > > dynamically rather than assuming a length of 64. The method > > NUMANodeListHolder::get_node_list_entry checks returns -1 for > > indexes that are out of bounds. > > > > * Several code convention cleanups suggested by Igor. > > > > * Merge with the new style system dll function resolutions from > > "7016797: Hotspot: securely/restrictive load dlls and new API for > > loading system dlls" Note: my new NUMA functions are outside the > ifdefs. > > > > > > Summary of changes in webrev.03 from webrev.02: > > > > * As suggested by Igor Veresov, reverts to using > > UseNUMAInterleaving as the enabling flag. This will make it > > easier in the future when there are GCs that enable fuller > > UseNUMA on Windows. > > > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > > Solaris, which just calls numa_make_global after commit_memory > > and reserve_memory_special > > > > * Adds a flag NUMAInterleaveGranularity which allows setting the > > granularity with which we move to a different node in a memory > > allocation. The default is 2MB. This flag only applies to > > Windows for now. > > > > * Several code cleanups in os_windows.cpp suggested by Igor. > > > > > > Summary of overall changes in os_windows.cpp: > > > > * Some static routines were added to set things up init time. These > > * check that the required APIs (VirtualAllocExNuma, > > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > > the OS > > > > * build the list of numa nodes on which this process has affinity > > > > * Changes to os::reserve_memory > > * There was already a routine that reserved pages one page at a > > time (used for Individual Large Page Allocation on WS2003). > > This was abstracted to a separate routine, called > > allocate_pages_individually. 
This gets called both for the > > Individual Large Page Allocation thing mentioned above and for > > UseNUMAInterleaving (for both small and large pages) > > > > * When used for NUMA Interleaving this just goes thru the numa > > node list in a round-robin fashion, allocating chunks at the > > NUMAInterleaveGranularity using a different allocation for > > each chunk > > > > * Whether we do just a reserve or a combined reserve/commit is > > determined by the caller of allocate_pages_individually > > > > * When used with large pages, we do a Reserve and Commit at > > the same time which is the way it always worked and the way > > it has to work on windows. > > > > * For small pages, only the reserve is done, the commit will > > come later. (which is the way it worked for > > non-interleaved) > > > > * os::commit_memory changes > > * If UseNUMAIntereaving is true, os::commit_memory has to check > > whether it was being asked to commit memory that might have > > come from multiple Reserve allocations, if so, the commits > > must also be broken up. We don't keep any data structure to > > keep track of this, we just use VirtualQuery which queries the > > properties of a VA range and can tell us how much came from > > one VirtualAlloc call. > > > > I do not have a bug id for this. 
> > > > -- Tom Deneau, AMD > > From igor.veresov at oracle.com Tue Aug 23 19:33:26 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 23 Aug 2011 12:33:26 -0700 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <5EA33A275136844D843B73A29FB9A6A9018D581E82@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E82@SAUSEXMBP01.amd.com> Message-ID: Yeah, but the semantics of numa_get_leaf_groups is the following: - the size that is supplied as an argument is the maximum number of elements that would fit in ids. It's the upper limit. - the returned value is the real number of elements in ids that make sense. Returning a size larger than NUMANodeListHolder::get_count() and sticking -1 into entries higher than that is wrong. Sorry about not making it clear how this function should work. igor On Tuesday, August 23, 2011 at 12:17 PM, Deneau, Tom wrote: > Igor -- > > For your comment on numa_get_leaf_groups, the bounds checking is done in NUMANodeListHolder::get_node_list_entry. Although I wasn't sure what you were supposed > to return for indices that are out of bounds, so I returned -1. > > I'll take care of the other cleanups.
> > -- Tom > > > > > -----Original Message----- > > From: Igor Veresov [mailto:igor.veresov at oracle.com] > > Sent: Tuesday, August 23, 2011 1:53 PM > > To: Deneau, Tom > > Cc: hotspot-gc-dev at openjdk.java.net (mailto:hotspot-gc-dev at openjdk.java.net) > > Subject: Re: Review Request: UseNUMAInterleaving #4 > > > > Tom, > > > > This looks good to me, except three minor things: > > > > os_windows.cpp: > > > > - you should check for null here: > > 2630 ~NUMANodeListHolder() { > > > if (_numa_used_node_list != NULL) { > > 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > > > } > > 2632 } > > > > - if NUMANodeListHolder::build() will be called multiple times, you'll > > leak memory. I guess you should check if _numa_used_node_list is NULL and > > if not free it first. > > > > - you didn't modify os::numa_get_leaf_groups() to handle the situation > > when the value of argument "size" is bigger than > > NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. > > See my comment in the previous mail. > > > > > > igor > > > > On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > > > > > Please review this patch which adds a new flag called > > > UseNUMAInterleaving. This flag provides a subset of the functionality > > > provided by UseNUMA. In Hotspot UseNUMA terminology, > > > UseNUMAInterleaved makes all memory "numa_global" which is implemented > > > as interleaved. This patch's main purpose is to provide that subset > > > on OSes like Windows which do not support the full UseNUMA > > > functionality. However, a simple implementation of UseNUMAInterleaving > > is > > > also provided for other OSes > > > > > > The situations where this shows the biggest benefits would be: > > > * Windows platforms with multiple numa nodes (eg, 4) > > > > > > * The JVM process is run across all the nodes (not affinitized to > > > one node). 
> > > > > > * A workload that has enough threads so that it uses the majority > > > of the cores in the machine, so that the heap is being accessed > > > from many cores, including remote ones. > > > > > > * Enough memory per node and a heap size such that the default heap > > > placement policy on windows would end up with the heap (or > > > nursery) placed on one node. > > > > > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > > > measurements, we have seen some cases where the performance with > > > UseNUMAInterleaving was 2.7x vs. the performance without. There were > > > gains of varying sizes across all systems. > > > > > > The webrev is at > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > > > > > > Summary of changes in webrev.04 from webrev.03: > > > > > > * As suggested by Igor Veresov, UseNUMA can imply > > > UseNUMAInterleaving on all platforms. This is in arguments.cpp > > > > > > * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > > > dynamically rather than assuming a length of 64. The method > > > NUMANodeListHolder::get_node_list_entry checks returns -1 for > > > indexes that are out of bounds. > > > > > > * Several code convention cleanups suggested by Igor. > > > > > > * Merge with the new style system dll function resolutions from > > > "7016797: Hotspot: securely/restrictive load dlls and new API for > > > loading system dlls" Note: my new NUMA functions are outside the > > ifdefs. > > > > > > > > > Summary of changes in webrev.03 from webrev.02: > > > > > > * As suggested by Igor Veresov, reverts to using > > > UseNUMAInterleaving as the enabling flag. This will make it > > > easier in the future when there are GCs that enable fuller > > > UseNUMA on Windows. 
> > > > > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > > > Solaris, which just calls numa_make_global after commit_memory > > > and reserve_memory_special > > > > > > * Adds a flag NUMAInterleaveGranularity which allows setting the > > > granularity with which we move to a different node in a memory > > > allocation. The default is 2MB. This flag only applies to > > > Windows for now. > > > > > > * Several code cleanups in os_windows.cpp suggested by Igor. > > > > > > > > > Summary of overall changes in os_windows.cpp: > > > > > > * Some static routines were added to set things up init time. These > > > * check that the required APIs (VirtualAllocExNuma, > > > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > > > the OS > > > > > > * build the list of numa nodes on which this process has affinity > > > > > > * Changes to os::reserve_memory > > > * There was already a routine that reserved pages one page at a > > > time (used for Individual Large Page Allocation on WS2003). > > > This was abstracted to a separate routine, called > > > allocate_pages_individually. This gets called both for the > > > Individual Large Page Allocation thing mentioned above and for > > > UseNUMAInterleaving (for both small and large pages) > > > > > > * When used for NUMA Interleaving this just goes thru the numa > > > node list in a round-robin fashion, allocating chunks at the > > > NUMAInterleaveGranularity using a different allocation for > > > each chunk > > > > > > * Whether we do just a reserve or a combined reserve/commit is > > > determined by the caller of allocate_pages_individually > > > > > > * When used with large pages, we do a Reserve and Commit at > > > the same time which is the way it always worked and the way > > > it has to work on windows. > > > > > > * For small pages, only the reserve is done, the commit will > > > come later. 
(which is the way it worked for > > > non-interleaved) > > > > > > * os::commit_memory changes > > > * If UseNUMAInterleaving is true, os::commit_memory has to check > > > whether it was being asked to commit memory that might have > > > come from multiple Reserve allocations; if so, the commits > > > must also be broken up. We don't keep any data structure to > > > keep track of this, we just use VirtualQuery which queries the > > > properties of a VA range and can tell us how much came from > > > one VirtualAlloc call. > > > > > > I do not have a bug id for this. > > > > > > -- Tom Deneau, AMD From tom.deneau at amd.com Tue Aug 23 19:35:22 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 23 Aug 2011 14:35:22 -0500 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E82@SAUSEXMBP01.amd.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A9018D581E9C@SAUSEXMBP01.amd.com> OK, that makes things clearer, thanks. > -----Original Message----- > From: Igor Veresov [mailto:igor.veresov at oracle.com] > Sent: Tuesday, August 23, 2011 2:33 PM > To: Deneau, Tom > Cc: hotspot-gc-dev at openjdk.java.net > Subject: Re: Review Request: UseNUMAInterleaving #4 > > Yeah, but the semantics of numa_get_leaf_groups is the following: > - the size that it is supplied as an argument is the maximum amount of > elements that would fit in ids. It's the upper limit. 
> - the returned value is the real number of elements in ids that make > sense. > > Returning the size large than NUMANodeListHolder::get_count() and > sticking -1 into entries higher than that is wrong. > Sorry about not making it clear how this function should work. > > igor > > On Tuesday, August 23, 2011 at 12:17 PM, Deneau, Tom wrote: > > > Igor -- > > > > For your comment on numa_get_leaf_groups, the bounds checking is done > in NUMANodeListHolder::get_node_list_entry. Although I wasn't sure what > you were supposed > > to return for indices that are out of bounds, so I returned -1. > > > > I'll take care of the other cleanups. > > > > -- Tom > > > > > > > > > -----Original Message----- > > > From: Igor Veresov [mailto:igor.veresov at oracle.com] > > > Sent: Tuesday, August 23, 2011 1:53 PM > > > To: Deneau, Tom > > > Cc: hotspot-gc-dev at openjdk.java.net (mailto:hotspot-gc- > dev at openjdk.java.net) > > > Subject: Re: Review Request: UseNUMAInterleaving #4 > > > > > > Tom, > > > > > > This looks good to me, except three minor things: > > > > > > os_windows.cpp: > > > > > > - you should check for null here: > > > 2630 ~NUMANodeListHolder() { > > > > if (_numa_used_node_list != NULL) { > > > 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > > > > } > > > 2632 } > > > > > > - if NUMANodeListHolder::build() will be called multiple times, > you'll > > > leak memory. I guess you should check if _numa_used_node_list is NULL > and > > > if not free it first. > > > > > > - you didn't modify os::numa_get_leaf_groups() to handle the > situation > > > when the value of argument "size" is bigger than > > > NUMANodeListHolder::get_count(). You can use MIN2 to adjust the > value. > > > See my comment in the previous mail. > > > > > > > > > igor > > > > > > On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > > > > > > > Please review this patch which adds a new flag called > > > > UseNUMAInterleaving. 
This flag provides a subset of the > functionality > > > > provided by UseNUMA. In Hotspot UseNUMA terminology, > > > > UseNUMAInterleaved makes all memory "numa_global" which is > implemented > > > > as interleaved. This patch's main purpose is to provide that subset > > > > on OSes like Windows which do not support the full UseNUMA > > > > functionality. However, a simple implementation of > UseNUMAInterleaving > > > is > > > > also provided for other OSes > > > > > > > > The situations where this shows the biggest benefits would be: > > > > * Windows platforms with multiple numa nodes (eg, 4) > > > > > > > > * The JVM process is run across all the nodes (not affinitized to > > > > one node). > > > > > > > > * A workload that has enough threads so that it uses the majority > > > > of the cores in the machine, so that the heap is being accessed > > > > from many cores, including remote ones. > > > > > > > > * Enough memory per node and a heap size such that the default > heap > > > > placement policy on windows would end up with the heap (or > > > > nursery) placed on one node. > > > > > > > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In > our > > > > measurements, we have seen some cases where the performance with > > > > UseNUMAInterleaving was 2.7x vs. the performance without. There > were > > > > gains of varying sizes across all systems. > > > > > > > > The webrev is at > > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > > > > > > > > Summary of changes in webrev.04 from webrev.03: > > > > > > > > * As suggested by Igor Veresov, UseNUMA can imply > > > > UseNUMAInterleaving on all platforms. This is in arguments.cpp > > > > > > > > * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > > > > dynamically rather than assuming a length of 64. The method > > > > NUMANodeListHolder::get_node_list_entry checks returns -1 for > > > > indexes that are out of bounds. 
> > > > > > > > * Several code convention cleanups suggested by Igor. > > > > > > > > * Merge with the new style system dll function resolutions from > > > > "7016797: Hotspot: securely/restrictive load dlls and new API for > > > > loading system dlls" Note: my new NUMA functions are outside the > > > ifdefs. > > > > > > > > > > > > Summary of changes in webrev.03 from webrev.02: > > > > > > > > * As suggested by Igor Veresov, reverts to using > > > > UseNUMAInterleaving as the enabling flag. This will make it > > > > easier in the future when there are GCs that enable fuller > > > > UseNUMA on Windows. > > > > > > > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > > > > Solaris, which just calls numa_make_global after commit_memory > > > > and reserve_memory_special > > > > > > > > * Adds a flag NUMAInterleaveGranularity which allows setting the > > > > granularity with which we move to a different node in a memory > > > > allocation. The default is 2MB. This flag only applies to > > > > Windows for now. > > > > > > > > * Several code cleanups in os_windows.cpp suggested by Igor. > > > > > > > > > > > > Summary of overall changes in os_windows.cpp: > > > > > > > > * Some static routines were added to set things up init time. > These > > > > * check that the required APIs (VirtualAllocExNuma, > > > > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > > > > the OS > > > > > > > > * build the list of numa nodes on which this process has affinity > > > > > > > > * Changes to os::reserve_memory > > > > * There was already a routine that reserved pages one page at a > > > > time (used for Individual Large Page Allocation on WS2003). > > > > This was abstracted to a separate routine, called > > > > allocate_pages_individually. 
This gets called both for the > > > > Individual Large Page Allocation thing mentioned above and for > > > > UseNUMAInterleaving (for both small and large pages) > > > > > > > > * When used for NUMA Interleaving this just goes through the numa > > > > node list in a round-robin fashion, allocating chunks at the > > > > NUMAInterleaveGranularity using a different allocation for > > > > each chunk > > > > > > > > * Whether we do just a reserve or a combined reserve/commit is > > > > determined by the caller of allocate_pages_individually > > > > > > > > * When used with large pages, we do a Reserve and Commit at > > > > the same time which is the way it always worked and the way > > > > it has to work on Windows. > > > > > > > > * For small pages, only the reserve is done; the commit will > > > > come later (which is the way it worked for > > > > non-interleaved) > > > > > > > > * os::commit_memory changes > > > > * If UseNUMAInterleaving is true, os::commit_memory has to check > > > > whether it was being asked to commit memory that might have > > > > come from multiple Reserve allocations; if so, the commits > > > > must also be broken up. We don't keep any data structure to > > > > keep track of this, we just use VirtualQuery which queries the > > > > properties of a VA range and can tell us how much came from > > > > one VirtualAlloc call. > > > > > > > > I do not have a bug id for this. 
> > > > > > > > -- Tom Deneau, AMD > > From john.pampuch at oracle.com Tue Aug 23 19:45:09 2011 From: john.pampuch at oracle.com (John Pampuch) Date: Tue, 23 Aug 2011 12:45:09 -0700 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> Message-ID: <4E540345.8050206@oracle.com> An HTML attachment was scrubbed... URL: From tom.deneau at amd.com Tue Aug 23 19:59:07 2011 From: tom.deneau at amd.com (Deneau, Tom) Date: Tue, 23 Aug 2011 14:59:07 -0500 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> Message-ID: <5EA33A275136844D843B73A29FB9A6A9018D581EBA@SAUSEXMBP01.amd.com> OK, http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.05/ should address the concerns listed below... 
-- Tom > -----Original Message----- > From: Igor Veresov [mailto:igor.veresov at oracle.com] > Sent: Tuesday, August 23, 2011 1:53 PM > To: Deneau, Tom > Cc: hotspot-gc-dev at openjdk.java.net > Subject: Re: Review Request: UseNUMAInterleaving #4 > > Tom, > > This looks good to me, except three minor things: > > os_windows.cpp: > > - you should check for null here: > 2630 ~NUMANodeListHolder() { > > if (_numa_used_node_list != NULL) { > 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > > } > 2632 } > > - if NUMANodeListHolder::build() will be called multiple times, you'll > leak memory. I guess you should check if _numa_used_node_list is NULL and > if not free it first. > > - you didn't modify os::numa_get_leaf_groups() to handle the situation > when the value of argument "size" is bigger than > NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. > See my comment in the previous mail. > > > igor > > On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > > > Please review this patch which adds a new flag called > > UseNUMAInterleaving. This flag provides a subset of the functionality > > provided by UseNUMA. In Hotspot UseNUMA terminology, > > UseNUMAInterleaved makes all memory "numa_global" which is implemented > > as interleaved. This patch's main purpose is to provide that subset > > on OSes like Windows which do not support the full UseNUMA > > functionality. However, a simple implementation of UseNUMAInterleaving > is > > also provided for other OSes > > > > The situations where this shows the biggest benefits would be: > > * Windows platforms with multiple numa nodes (eg, 4) > > > > * The JVM process is run across all the nodes (not affinitized to > > one node). > > > > * A workload that has enough threads so that it uses the majority > > of the cores in the machine, so that the heap is being accessed > > from many cores, including remote ones. 
> > > > * Enough memory per node and a heap size such that the default heap > > placement policy on windows would end up with the heap (or > > nursery) placed on one node. > > > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > > measurements, we have seen some cases where the performance with > > UseNUMAInterleaving was 2.7x vs. the performance without. There were > > gains of varying sizes across all systems. > > > > The webrev is at > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > > > > Summary of changes in webrev.04 from webrev.03: > > > > * As suggested by Igor Veresov, UseNUMA can imply > > UseNUMAInterleaving on all platforms. This is in arguments.cpp > > > > * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > > dynamically rather than assuming a length of 64. The method > > NUMANodeListHolder::get_node_list_entry checks returns -1 for > > indexes that are out of bounds. > > > > * Several code convention cleanups suggested by Igor. > > > > * Merge with the new style system dll function resolutions from > > "7016797: Hotspot: securely/restrictive load dlls and new API for > > loading system dlls" Note: my new NUMA functions are outside the > ifdefs. > > > > > > Summary of changes in webrev.03 from webrev.02: > > > > * As suggested by Igor Veresov, reverts to using > > UseNUMAInterleaving as the enabling flag. This will make it > > easier in the future when there are GCs that enable fuller > > UseNUMA on Windows. > > > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > > Solaris, which just calls numa_make_global after commit_memory > > and reserve_memory_special > > > > * Adds a flag NUMAInterleaveGranularity which allows setting the > > granularity with which we move to a different node in a memory > > allocation. The default is 2MB. This flag only applies to > > Windows for now. > > > > * Several code cleanups in os_windows.cpp suggested by Igor. 
> > > > > > Summary of overall changes in os_windows.cpp: > > > > * Some static routines were added to set things up init time. These > > * check that the required APIs (VirtualAllocExNuma, > > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > > the OS > > > > * build the list of numa nodes on which this process has affinity > > > > * Changes to os::reserve_memory > > * There was already a routine that reserved pages one page at a > > time (used for Individual Large Page Allocation on WS2003). > > This was abstracted to a separate routine, called > > allocate_pages_individually. This gets called both for the > > Individual Large Page Allocation thing mentioned above and for > > UseNUMAInterleaving (for both small and large pages) > > > > * When used for NUMA Interleaving this just goes thru the numa > > node list in a round-robin fashion, allocating chunks at the > > NUMAInterleaveGranularity using a different allocation for > > each chunk > > > > * Whether we do just a reserve or a combined reserve/commit is > > determined by the caller of allocate_pages_individually > > > > * When used with large pages, we do a Reserve and Commit at > > the same time which is the way it always worked and the way > > it has to work on windows. > > > > * For small pages, only the reserve is done, the commit will > > come later. (which is the way it worked for > > non-interleaved) > > > > * os::commit_memory changes > > * If UseNUMAIntereaving is true, os::commit_memory has to check > > whether it was being asked to commit memory that might have > > come from multiple Reserve allocations, if so, the commits > > must also be broken up. We don't keep any data structure to > > keep track of this, we just use VirtualQuery which queries the > > properties of a VA range and can tell us how much came from > > one VirtualAlloc call. > > > > I do not have a bug id for this. 
> > > > -- Tom Deneau, AMD > > From igor.veresov at oracle.com Tue Aug 23 20:14:38 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 23 Aug 2011 13:14:38 -0700 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <4E540345.8050206@oracle.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <4E540345.8050206@oracle.com> Message-ID: John, it works precisely as you say. You just have to specify UseNUMA, and it will imply interleaving for the collectors that don't do any better. The extra flag is for the case when you don't want a full-blown NUMA allocator but just want the interleaving. igor On Aug 23, 2011, at 12:45 PM, John Pampuch wrote: > Could this be done without adding a new flag? E.g., could we just detect that the > platform doesn't support the full UseNUMA functionality, and only leverage what > it does implement? > > -John > > On 8/23/11 11:23 AM, Deneau, Tom wrote: >> >> Please review this patch which adds a new flag called >> UseNUMAInterleaving. This flag provides a subset of the functionality >> provided by UseNUMA. In Hotspot UseNUMA terminology, >> UseNUMAInterleaving makes all memory "numa_global" which is implemented >> as interleaved. This patch's main purpose is to provide that subset >> on OSes like Windows which do not support the full UseNUMA >> functionality. 
However, a simple implementation of UseNUMAInterleaving is >> also provided for other OSes >> >> The situations where this shows the biggest benefits would be: >> * Windows platforms with multiple numa nodes (eg, 4) >> >> * The JVM process is run across all the nodes (not affinitized to >> one node). >> >> * A workload that has enough threads so that it uses the majority >> of the cores in the machine, so that the heap is being accessed >> from many cores, including remote ones. >> >> * Enough memory per node and a heap size such that the default heap >> placement policy on windows would end up with the heap (or >> nursery) placed on one node. >> >> jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our >> measurements, we have seen some cases where the performance with >> UseNUMAInterleaving was 2.7x vs. the performance without. There were >> gains of varying sizes across all systems. >> >> The webrev is at >> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ >> >> Summary of changes in webrev.04 from webrev.03: >> >> * As suggested by Igor Veresov, UseNUMA can imply >> UseNUMAInterleaving on all platforms. This is in arguments.cpp >> >> * In NUMANodeListHolder in os_windows.cpp, allocates the node_list >> dynamically rather than assuming a length of 64. The method >> NUMANodeListHolder::get_node_list_entry checks returns -1 for >> indexes that are out of bounds. >> >> * Several code convention cleanups suggested by Igor. >> >> * Merge with the new style system dll function resolutions from >> "7016797: Hotspot: securely/restrictive load dlls and new API for >> loading system dlls" Note: my new NUMA functions are outside the ifdefs. >> >> >> Summary of changes in webrev.03 from webrev.02: >> >> * As suggested by Igor Veresov, reverts to using >> UseNUMAInterleaving as the enabling flag. This will make it >> easier in the future when there are GCs that enable fuller >> UseNUMA on Windows. 
>> >> * Adds a simple implementation of UseNUMAInterleaving on Linux and >> Solaris, which just calls numa_make_global after commit_memory >> and reserve_memory_special >> >> * Adds a flag NUMAInterleaveGranularity which allows setting the >> granularity with which we move to a different node in a memory >> allocation. The default is 2MB. This flag only applies to >> Windows for now. >> >> * Several code cleanups in os_windows.cpp suggested by Igor. >> >> >> Summary of overall changes in os_windows.cpp: >> >> * Some static routines were added to set things up init time. These >> * check that the required APIs (VirtualAllocExNuma, >> GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in >> the OS >> >> * build the list of numa nodes on which this process has affinity >> >> * Changes to os::reserve_memory >> * There was already a routine that reserved pages one page at a >> time (used for Individual Large Page Allocation on WS2003). >> This was abstracted to a separate routine, called >> allocate_pages_individually. This gets called both for the >> Individual Large Page Allocation thing mentioned above and for >> UseNUMAInterleaving (for both small and large pages) >> >> * When used for NUMA Interleaving this just goes thru the numa >> node list in a round-robin fashion, allocating chunks at the >> NUMAInterleaveGranularity using a different allocation for >> each chunk >> >> * Whether we do just a reserve or a combined reserve/commit is >> determined by the caller of allocate_pages_individually >> >> * When used with large pages, we do a Reserve and Commit at >> the same time which is the way it always worked and the way >> it has to work on windows. >> >> * For small pages, only the reserve is done, the commit will >> come later. 
(which is the way it worked for >> non-interleaved) >> >> * os::commit_memory changes >> * If UseNUMAInterleaving is true, os::commit_memory has to check >> whether it was being asked to commit memory that might have >> come from multiple Reserve allocations; if so, the commits >> must also be broken up. We don't keep any data structure to >> keep track of this, we just use VirtualQuery which queries the >> properties of a VA range and can tell us how much came from >> one VirtualAlloc call. >> >> I do not have a bug id for this. >> >> -- Tom Deneau, AMD >> > From john.pampuch at oracle.com Tue Aug 23 20:24:24 2011 From: john.pampuch at oracle.com (John Pampuch) Date: Tue, 23 Aug 2011 13:24:24 -0700 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <4E540345.8050206@oracle.com> Message-ID: <4E540C78.7080607@oracle.com> An HTML attachment was scrubbed... 
URL: From bengt.rutisson at oracle.com Tue Aug 23 20:27:38 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 23 Aug 2011 22:27:38 +0200 Subject: Request for review (XS): 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new In-Reply-To: <2B419C56-3576-4690-9FA5-2092083FE938@oracle.com> References: <4E53724B.90408@oracle.com> <4E5387CF.2050809@oracle.com> <4E53F1E6.7090208@oracle.com> <2B419C56-3576-4690-9FA5-2092083FE938@oracle.com> Message-ID: <4E540D3A.5010808@oracle.com> Thanks, Keith! All set now. I'll go ahead and push this. Bengt On 2011-08-23 20:59, Keith McGuigan wrote: > > Thumbs up. > > On Aug 23, 2011, at 2:31 PM, Bengt Rutisson wrote: > >> >> Poonam and Jesper, >> >> Thanks for the prompt reviews! In theory I am all set now, but since >> I think this is runtime code it would be great if I could get a >> review from someone on the runtime team as well. >> >> It's a really small change, so a review should be fast... >> >> Thanks, >> Bengt >> >> On 2011-08-23 12:58, Poonam Bajaj wrote: >>> >>> Looks good! >>> >>> Thanks, >>> Poonam >>> >>> On 8/23/2011 2:56 PM, Bengt Rutisson wrote: >>>> >>>> >>>> Hi all, >>>> >>>> Could I please have a couple of reviews for this small fix? After >>>> the secure dll loading fix (7016797) the psapi.lib library is >>>> needed for Windows builds. The original change made sure that this >>>> library is provided to the linker for the command line builds. >>>> However, the builds from inside Visual Studio also need to know >>>> about this library. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~brutisso/7082220/webrev/ >>>> >>>> CR: >>>> http://monaco.us.oracle.com/detail.jsf?cr=7082220 >>>> >>>> Testing: >>>> I created a Visual Studio project with the create script and with >>>> the change above the project builds nicely. >>>> >>>> I am including both Runtime and GC in this mail. 
The change is to >>>> runtime code, but I would like to push this through hotspot-gc. The >>>> reason is that this is blocking my work. Whenever I am setting up a >>>> new repository I run into this issue. But it seems that I am the >>>> only one who uses the Visual Studio builds at the moment. The issue >>>> has been around for several months but it only got integrated into >>>> hsx/hotspot-gc 5 days ago. So, for me it would be easier to >>>> integrate directly into hotspot-gc and start using the fix rather >>>> than having to wait for the fix to propagate from hotspot-rt to >>>> hotspot-gc. >>>> >>>> Thanks, >>>> Bengt >>> >>> >> > From igor.veresov at oracle.com Tue Aug 23 21:16:20 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 23 Aug 2011 14:16:20 -0700 Subject: Review Request: UseNUMAInterleaving #4 In-Reply-To: <5EA33A275136844D843B73A29FB9A6A9018D581EBA@SAUSEXMBP01.amd.com> References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581EBA@SAUSEXMBP01.amd.com> Message-ID: Looks good! igor On Tuesday, August 23, 2011 at 12:59 PM, Deneau, Tom wrote: > OK, http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.05/ > should address the concerns listed below... 
> > -- Tom > > > > -----Original Message----- > > From: Igor Veresov [mailto:igor.veresov at oracle.com] > > Sent: Tuesday, August 23, 2011 1:53 PM > > To: Deneau, Tom > > Cc: hotspot-gc-dev at openjdk.java.net (mailto:hotspot-gc-dev at openjdk.java.net) > > Subject: Re: Review Request: UseNUMAInterleaving #4 > > > > Tom, > > > > This looks good to me, except three minor things: > > > > os_windows.cpp: > > > > - you should check for null here: > > 2630 ~NUMANodeListHolder() { > > > if (_numa_used_node_list != NULL) { > > 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > > > } > > 2632 } > > > > - if NUMANodeListHolder::build() will be called multiple times, you'll > > leak memory. I guess you should check if _numa_used_node_list is NULL and > > if not free it first. > > > > - you didn't modify os::numa_get_leaf_groups() to handle the situation > > when the value of argument "size" is bigger than > > NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. > > See my comment in the previous mail. > > > > > > igor > > > > On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > > > > > Please review this patch which adds a new flag called > > > UseNUMAInterleaving. This flag provides a subset of the functionality > > > provided by UseNUMA. In Hotspot UseNUMA terminology, > > > UseNUMAInterleaved makes all memory "numa_global" which is implemented > > > as interleaved. This patch's main purpose is to provide that subset > > > on OSes like Windows which do not support the full UseNUMA > > > functionality. However, a simple implementation of UseNUMAInterleaving > > is > > > also provided for other OSes > > > > > > The situations where this shows the biggest benefits would be: > > > * Windows platforms with multiple numa nodes (eg, 4) > > > > > > * The JVM process is run across all the nodes (not affinitized to > > > one node). 
> > > > > > * A workload that has enough threads so that it uses the majority > > > of the cores in the machine, so that the heap is being accessed > > > from many cores, including remote ones. > > > > > > * Enough memory per node and a heap size such that the default heap > > > placement policy on windows would end up with the heap (or > > > nursery) placed on one node. > > > > > > jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > > > measurements, we have seen some cases where the performance with > > > UseNUMAInterleaving was 2.7x vs. the performance without. There were > > > gains of varying sizes across all systems. > > > > > > The webrev is at > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > > > > > > Summary of changes in webrev.04 from webrev.03: > > > > > > * As suggested by Igor Veresov, UseNUMA can imply > > > UseNUMAInterleaving on all platforms. This is in arguments.cpp > > > > > > * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > > > dynamically rather than assuming a length of 64. The method > > > NUMANodeListHolder::get_node_list_entry checks returns -1 for > > > indexes that are out of bounds. > > > > > > * Several code convention cleanups suggested by Igor. > > > > > > * Merge with the new style system dll function resolutions from > > > "7016797: Hotspot: securely/restrictive load dlls and new API for > > > loading system dlls" Note: my new NUMA functions are outside the > > ifdefs. > > > > > > > > > Summary of changes in webrev.03 from webrev.02: > > > > > > * As suggested by Igor Veresov, reverts to using > > > UseNUMAInterleaving as the enabling flag. This will make it > > > easier in the future when there are GCs that enable fuller > > > UseNUMA on Windows. 
> > > > > > * Adds a simple implementation of UseNUMAInterleaving on Linux and > > > Solaris, which just calls numa_make_global after commit_memory > > > and reserve_memory_special > > > > > > * Adds a flag NUMAInterleaveGranularity which allows setting the > > > granularity with which we move to a different node in a memory > > > allocation. The default is 2MB. This flag only applies to > > > Windows for now. > > > > > > * Several code cleanups in os_windows.cpp suggested by Igor. > > > > > > > > > Summary of overall changes in os_windows.cpp: > > > > > > * Some static routines were added to set things up init time. These > > > * check that the required APIs (VirtualAllocExNuma, > > > GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > > > the OS > > > > > > * build the list of numa nodes on which this process has affinity > > > > > > * Changes to os::reserve_memory > > > * There was already a routine that reserved pages one page at a > > > time (used for Individual Large Page Allocation on WS2003). > > > This was abstracted to a separate routine, called > > > allocate_pages_individually. This gets called both for the > > > Individual Large Page Allocation thing mentioned above and for > > > UseNUMAInterleaving (for both small and large pages) > > > > > > * When used for NUMA Interleaving this just goes thru the numa > > > node list in a round-robin fashion, allocating chunks at the > > > NUMAInterleaveGranularity using a different allocation for > > > each chunk > > > > > > * Whether we do just a reserve or a combined reserve/commit is > > > determined by the caller of allocate_pages_individually > > > > > > * When used with large pages, we do a Reserve and Commit at > > > the same time which is the way it always worked and the way > > > it has to work on windows. > > > > > > * For small pages, only the reserve is done, the commit will > > > come later. 
> > >       (which is the way it worked for
> > >       non-interleaved)
> > >
> > > * os::commit_memory changes
> > >   * If UseNUMAInterleaving is true, os::commit_memory has to check
> > >     whether it was being asked to commit memory that might have
> > >     come from multiple Reserve allocations, if so, the commits
> > >     must also be broken up. We don't keep any data structure to
> > >     keep track of this, we just use VirtualQuery which queries the
> > >     properties of a VA range and can tell us how much came from
> > >     one VirtualAlloc call.
> > >
> > > I do not have a bug id for this.
> > >
> > > -- Tom Deneau, AMD

From y.s.ramakrishna at oracle.com  Tue Aug 23 23:50:50 2011
From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna)
Date: Tue, 23 Aug 2011 16:50:50 -0700
Subject: Review Request: UseNUMAInterleaving #4
In-Reply-To: <5EA33A275136844D843B73A29FB9A6A9018D581EBA@SAUSEXMBP01.amd.com>
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581EBA@SAUSEXMBP01.amd.com>
Message-ID: <4E543CDA.3050904@oracle.com>

Hi Tom -- the perf improvement on windows is impressive.

The changes look good. Just a few very minor nits below:

globals.hpp: In the doc string field for NUMAInterleaveGranularity, you
might state that this is a Windows only option. (although i recognize
that this hasn't been done for some of the other windows options that
i became aware of now as being used exclusively in windows before
your changes, for instance: UseLargePagesIndividualAllocation
and LargePagesIndividualAllocationInjectError).

arguments.cpp: you could get rid of the empty lines 1432-1433, and move
the content of 1428-1430 into the if-scope of 1422-1426.

os_windows.cpp: you can probably get rid of the extra newline
introduced at line 1967.

line 3018, typo: "NUMAInterleavaing"
also at line 3033: "thNUMANodeListHolderat"
The comment at lines 3030-3033 would also benefit
from a few missing punctuation marks.

at lines 3040 and 3043, it might read better to place the returns
on lines of their own.

If you run with +UseNUMAInterleaving and a commit fails,
it would seem that the error message at line 2987 would be
confusing and incorrect. Perhaps you want to suitably modify
it or just suppress the additional text in that case.

os_solaris.cpp: 2780-2784, it might make sense to do the madvise
global/many call only if the mmap_chunk() succeeds, rather than all
the time as you are doing. Maybe something like:

2780   char *res = Solaris::mmap_chunk(addr, size, MAP_PRIVATE|MAP_FIXED, prot);
2781   if (res != NULL) {
         if (UseNUMAInterleaving) {
2782       numa_make_global(res, size);
         }
         return true;
2783   }
2784   return false;

At line 3444, would it make sense to use "size" instead of "bytes"
(although size is just a copy of bytes -- i don't understand the reason
for making the copy, so feel free to ignore if this is some recherche
style issue; otherwise it might make sense to get rid of the copy and
just use the formal parameter as is the case for the Linux code;
although this is really not code that you introduced, but just because
you happen to be touching code in the vicinity... your choice.)

In the same vein, i'd make the Linux code similar in shape to
the solaris code for the two hunks changed in os_linux.cpp.

rest looks good.
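
A minimal, self-contained sketch of the os_solaris.cpp shape suggested above: the "make global" advice is only issued once the mapping has succeeded. Everything here is an assumption for illustration. `mmap_chunk_stub`, `numa_make_global_stub`, `commit_memory_sketch`, and the `make_global_calls` counter are hypothetical stand-ins, not the real HotSpot functions, and the stubbed `UseNUMAInterleaving` is a plain global rather than the VM flag.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: these only mimic the shape of the code discussed
// in the review and are not the real VM flag or HotSpot functions.
bool UseNUMAInterleaving = true;  // stand-in for the VM flag
int  make_global_calls   = 0;     // counts advise calls, for demonstration only

// Stand-in for Solaris::mmap_chunk(): returns NULL to simulate a failed mapping.
char* mmap_chunk_stub(char* addr, std::size_t size, bool succeed) {
  assert(size > 0);
  return succeed ? addr : NULL;
}

// Stand-in for numa_make_global(): only records that it was called.
void numa_make_global_stub(char* addr, std::size_t size) {
  (void)addr;
  (void)size;
  ++make_global_calls;
}

// The suggested shape: advise the OS only once the mapping has succeeded,
// and report failure without touching the (unmapped) range otherwise.
bool commit_memory_sketch(char* addr, std::size_t size, bool mmap_succeeds) {
  char* res = mmap_chunk_stub(addr, size, mmap_succeeds);
  if (res != NULL) {
    if (UseNUMAInterleaving) {
      numa_make_global_stub(res, size);
    }
    return true;
  }
  return false;
}
```

With a simulated mapping failure, `commit_memory_sketch` returns false and the counter stays untouched; with a successful mapping it returns true and the advice is issued only when the flag is set.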
-- ramki On 08/23/11 12:59, Deneau, Tom wrote: > OK, http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.05/ > should address the concerns listed below... > > -- Tom > > >> -----Original Message----- >> From: Igor Veresov [mailto:igor.veresov at oracle.com] >> Sent: Tuesday, August 23, 2011 1:53 PM >> To: Deneau, Tom >> Cc: hotspot-gc-dev at openjdk.java.net >> Subject: Re: Review Request: UseNUMAInterleaving #4 >> >> Tom, >> >> This looks good to me, except three minor things: >> >> os_windows.cpp: >> >> - you should check for null here: >> 2630 ~NUMANodeListHolder() { >>> if (_numa_used_node_list != NULL) { >> 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); >>> } >> 2632 } >> >> - if NUMANodeListHolder::build() will be called multiple times, you'll >> leak memory. I guess you should check if _numa_used_node_list is NULL and >> if not free it first. >> >> - you didn't modify os::numa_get_leaf_groups() to handle the situation >> when the value of argument "size" is bigger than >> NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. >> See my comment in the previous mail. >> >> >> igor >> >> On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: >> >>> Please review this patch which adds a new flag called >>> UseNUMAInterleaving. This flag provides a subset of the functionality >>> provided by UseNUMA. In Hotspot UseNUMA terminology, >>> UseNUMAInterleaved makes all memory "numa_global" which is implemented >>> as interleaved. This patch's main purpose is to provide that subset >>> on OSes like Windows which do not support the full UseNUMA >>> functionality. However, a simple implementation of UseNUMAInterleaving >> is >>> also provided for other OSes >>> >>> The situations where this shows the biggest benefits would be: >>> * Windows platforms with multiple numa nodes (eg, 4) >>> >>> * The JVM process is run across all the nodes (not affinitized to >>> one node). 
>>> >>> * A workload that has enough threads so that it uses the majority >>> of the cores in the machine, so that the heap is being accessed >>> from many cores, including remote ones. >>> >>> * Enough memory per node and a heap size such that the default heap >>> placement policy on windows would end up with the heap (or >>> nursery) placed on one node. >>> >>> jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our >>> measurements, we have seen some cases where the performance with >>> UseNUMAInterleaving was 2.7x vs. the performance without. There were >>> gains of varying sizes across all systems. >>> >>> The webrev is at >>> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ >>> >>> Summary of changes in webrev.04 from webrev.03: >>> >>> * As suggested by Igor Veresov, UseNUMA can imply >>> UseNUMAInterleaving on all platforms. This is in arguments.cpp >>> >>> * In NUMANodeListHolder in os_windows.cpp, allocates the node_list >>> dynamically rather than assuming a length of 64. The method >>> NUMANodeListHolder::get_node_list_entry checks returns -1 for >>> indexes that are out of bounds. >>> >>> * Several code convention cleanups suggested by Igor. >>> >>> * Merge with the new style system dll function resolutions from >>> "7016797: Hotspot: securely/restrictive load dlls and new API for >>> loading system dlls" Note: my new NUMA functions are outside the >> ifdefs. >>> >>> Summary of changes in webrev.03 from webrev.02: >>> >>> * As suggested by Igor Veresov, reverts to using >>> UseNUMAInterleaving as the enabling flag. This will make it >>> easier in the future when there are GCs that enable fuller >>> UseNUMA on Windows. 
>>> >>> * Adds a simple implementation of UseNUMAInterleaving on Linux and >>> Solaris, which just calls numa_make_global after commit_memory >>> and reserve_memory_special >>> >>> * Adds a flag NUMAInterleaveGranularity which allows setting the >>> granularity with which we move to a different node in a memory >>> allocation. The default is 2MB. This flag only applies to >>> Windows for now. >>> >>> * Several code cleanups in os_windows.cpp suggested by Igor. >>> >>> >>> Summary of overall changes in os_windows.cpp: >>> >>> * Some static routines were added to set things up init time. These >>> * check that the required APIs (VirtualAllocExNuma, >>> GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in >>> the OS >>> >>> * build the list of numa nodes on which this process has affinity >>> >>> * Changes to os::reserve_memory >>> * There was already a routine that reserved pages one page at a >>> time (used for Individual Large Page Allocation on WS2003). >>> This was abstracted to a separate routine, called >>> allocate_pages_individually. This gets called both for the >>> Individual Large Page Allocation thing mentioned above and for >>> UseNUMAInterleaving (for both small and large pages) >>> >>> * When used for NUMA Interleaving this just goes thru the numa >>> node list in a round-robin fashion, allocating chunks at the >>> NUMAInterleaveGranularity using a different allocation for >>> each chunk >>> >>> * Whether we do just a reserve or a combined reserve/commit is >>> determined by the caller of allocate_pages_individually >>> >>> * When used with large pages, we do a Reserve and Commit at >>> the same time which is the way it always worked and the way >>> it has to work on windows. >>> >>> * For small pages, only the reserve is done, the commit will >>> come later. 
>>>       (which is the way it worked for
>>>       non-interleaved)
>>>
>>> * os::commit_memory changes
>>>   * If UseNUMAInterleaving is true, os::commit_memory has to check
>>>     whether it was being asked to commit memory that might have
>>>     come from multiple Reserve allocations, if so, the commits
>>>     must also be broken up. We don't keep any data structure to
>>>     keep track of this, we just use VirtualQuery which queries the
>>>     properties of a VA range and can tell us how much came from
>>>     one VirtualAlloc call.
>>>
>>> I do not have a bug id for this.
>>>
>>> -- Tom Deneau, AMD
>>
>

From igor.veresov at oracle.com  Wed Aug 24 00:16:23 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 23 Aug 2011 17:16:23 -0700
Subject: review(XS): 7082645: Hotspot doesn't compile on old linuxes after 7060836
Message-ID:

Compilation fails because of the lack of syscall id definitions and also because of the conflict of definitions of timespec-related structures.

Webrev: http://cr.openjdk.java.net/~iveresov/7082645/webrev.00/

Thanks,
igor

From tom.deneau at amd.com  Wed Aug 24 16:26:54 2011
From: tom.deneau at amd.com (Deneau, Tom)
Date: Wed, 24 Aug 2011 11:26:54 -0500
Subject: Review Request: UseNUMAInterleaving #6
In-Reply-To: <4E543CDA.3050904@oracle.com>
References: <5EA33A275136844D843B73A29FB9A6A901362B54B2@SAUSEXMBP01.amd.com> <4E402E1C.1010807@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF904E@SAUSEXMBP01.amd.com> <247BA26129A14681B03D0856A6FAC69D@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186EF98B7@SAUSEXMBP01.amd.com> <91928C974B07497184AF80B96F196606@oracle.com> <5EA33A275136844D843B73A29FB9A6A90186FA618A@SAUSEXMBP01.amd.com> <462098EF18364A629C463AC72D5495CC@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581E22@SAUSEXMBP01.amd.com> <9F66C366BA1C4D8A83183711ADE738A0@oracle.com> <5EA33A275136844D843B73A29FB9A6A9018D581EBA@SAUSEXMBP01.amd.com> <4E543CDA.3050904@oracle.com>
Message-ID: <5EA33A275136844D843B73A29FB9A6A9018D582275@SAUSEXMBP01.amd.com>

I believe I
have addressed ramki's comments with http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.06/ -- Tom > -----Original Message----- > From: Y. S. Ramakrishna [mailto:y.s.ramakrishna at oracle.com] > Sent: Tuesday, August 23, 2011 6:51 PM > To: Deneau, Tom > Cc: hotspot-gc-dev at openjdk.java.net > Subject: Re: Review Request: UseNUMAInterleaving #4 > > Hi Tom -- the perf improvement on windows is impressive. > > The changes look good. Just a few very minor nits below: > > globals.hpp: In the doc string field for NUMAInterleaveGranularity, you > might state that this is a Windows only option. (although i recognize > that this hasn't been done for some of the other windows options that > i became aware of now as being used exclusively in windows before > your changes, for instance: UseLargePagesIndividualAllocation > and LargePagesIndividualAllocationInjectError). > > arguments.cpp: you could get rid of the empty lines 1432-1433, and move > the > content of 1428-1430 into the if-scope of 1422-1426. > > os_windows.cpp: you can probably get rid of the extra newline > introduced at line 1967. > > line 3018, typo: "NUMAInterleavaing" > also at line 3033: "thNUMANodeListHolderat" > The comment at lines 3030-3033 would also benefit > from a few missing punctuation marks. > > at lines 3040 and 3043, it might read better to place the returns > on lines of their own. > > If you run with +UseNUMAInterleaving and a commit failed, > it would seem that the error message at line 2987 would be > confusing and incorrect. Perhaps you want to suitably modify > it or just suppress the additional text in that case. > > os_solaris.cpp: 2780-2784, it might make sense to do the madvise > global/many > call only if the mmap_chunk() succeeds, rather than all the time as you > are doing. 
May be something like:- > > 2780 char *res = Solaris::mmap_chunk(addr, size, MAP_PRIVATE|MAP_FIXED, > prot); > 2781 if (res != NULL) { > if (UseNUMAInterleaving) { > 2782 numa_make_global(res, size); > } > return true; > 2783 } > 2784 return false; > > At line 3444, would it make sense to use "size" instead of "bytes" > (although > size is just a copy of bytes -- i don't understand the reason for making > the copy, so feel free to ignore if this is some recherche style issue; > otherwise > it might make sense to get rid of the copy and just use the formal > parameter as is > the case for the Linux code; although this is really not code that you > introduced, > but just because you happen to be touching code in the vicinity... your > choice.) > > In the same vein, i'd make the Linux code similar in shape to > the solaris code for the two hunks changed in os_linux.cpp. > > rest looks good. > -- ramki > > On 08/23/11 12:59, Deneau, Tom wrote: > > OK, http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.05/ > > should address the concerns listed below... > > > > -- Tom > > > > > >> -----Original Message----- > >> From: Igor Veresov [mailto:igor.veresov at oracle.com] > >> Sent: Tuesday, August 23, 2011 1:53 PM > >> To: Deneau, Tom > >> Cc: hotspot-gc-dev at openjdk.java.net > >> Subject: Re: Review Request: UseNUMAInterleaving #4 > >> > >> Tom, > >> > >> This looks good to me, except three minor things: > >> > >> os_windows.cpp: > >> > >> - you should check for null here: > >> 2630 ~NUMANodeListHolder() { > >>> if (_numa_used_node_list != NULL) { > >> 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list); > >>> } > >> 2632 } > >> > >> - if NUMANodeListHolder::build() will be called multiple times, you'll > >> leak memory. I guess you should check if _numa_used_node_list is NULL > and > >> if not free it first. 
> >> > >> - you didn't modify os::numa_get_leaf_groups() to handle the situation > >> when the value of argument "size" is bigger than > >> NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value. > >> See my comment in the previous mail. > >> > >> > >> igor > >> > >> On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote: > >> > >>> Please review this patch which adds a new flag called > >>> UseNUMAInterleaving. This flag provides a subset of the functionality > >>> provided by UseNUMA. In Hotspot UseNUMA terminology, > >>> UseNUMAInterleaved makes all memory "numa_global" which is > implemented > >>> as interleaved. This patch's main purpose is to provide that subset > >>> on OSes like Windows which do not support the full UseNUMA > >>> functionality. However, a simple implementation of > UseNUMAInterleaving > >> is > >>> also provided for other OSes > >>> > >>> The situations where this shows the biggest benefits would be: > >>> * Windows platforms with multiple numa nodes (eg, 4) > >>> > >>> * The JVM process is run across all the nodes (not affinitized to > >>> one node). > >>> > >>> * A workload that has enough threads so that it uses the majority > >>> of the cores in the machine, so that the heap is being accessed > >>> from many cores, including remote ones. > >>> > >>> * Enough memory per node and a heap size such that the default heap > >>> placement policy on windows would end up with the heap (or > >>> nursery) placed on one node. > >>> > >>> jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our > >>> measurements, we have seen some cases where the performance with > >>> UseNUMAInterleaving was 2.7x vs. the performance without. There were > >>> gains of varying sizes across all systems. 
> >>> > >>> The webrev is at > >>> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/ > >>> > >>> Summary of changes in webrev.04 from webrev.03: > >>> > >>> * As suggested by Igor Veresov, UseNUMA can imply > >>> UseNUMAInterleaving on all platforms. This is in arguments.cpp > >>> > >>> * In NUMANodeListHolder in os_windows.cpp, allocates the node_list > >>> dynamically rather than assuming a length of 64. The method > >>> NUMANodeListHolder::get_node_list_entry checks returns -1 for > >>> indexes that are out of bounds. > >>> > >>> * Several code convention cleanups suggested by Igor. > >>> > >>> * Merge with the new style system dll function resolutions from > >>> "7016797: Hotspot: securely/restrictive load dlls and new API for > >>> loading system dlls" Note: my new NUMA functions are outside the > >> ifdefs. > >>> > >>> Summary of changes in webrev.03 from webrev.02: > >>> > >>> * As suggested by Igor Veresov, reverts to using > >>> UseNUMAInterleaving as the enabling flag. This will make it > >>> easier in the future when there are GCs that enable fuller > >>> UseNUMA on Windows. > >>> > >>> * Adds a simple implementation of UseNUMAInterleaving on Linux and > >>> Solaris, which just calls numa_make_global after commit_memory > >>> and reserve_memory_special > >>> > >>> * Adds a flag NUMAInterleaveGranularity which allows setting the > >>> granularity with which we move to a different node in a memory > >>> allocation. The default is 2MB. This flag only applies to > >>> Windows for now. > >>> > >>> * Several code cleanups in os_windows.cpp suggested by Igor. > >>> > >>> > >>> Summary of overall changes in os_windows.cpp: > >>> > >>> * Some static routines were added to set things up init time. 
These > >>> * check that the required APIs (VirtualAllocExNuma, > >>> GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in > >>> the OS > >>> > >>> * build the list of numa nodes on which this process has affinity > >>> > >>> * Changes to os::reserve_memory > >>> * There was already a routine that reserved pages one page at a > >>> time (used for Individual Large Page Allocation on WS2003). > >>> This was abstracted to a separate routine, called > >>> allocate_pages_individually. This gets called both for the > >>> Individual Large Page Allocation thing mentioned above and for > >>> UseNUMAInterleaving (for both small and large pages) > >>> > >>> * When used for NUMA Interleaving this just goes thru the numa > >>> node list in a round-robin fashion, allocating chunks at the > >>> NUMAInterleaveGranularity using a different allocation for > >>> each chunk > >>> > >>> * Whether we do just a reserve or a combined reserve/commit is > >>> determined by the caller of allocate_pages_individually > >>> > >>> * When used with large pages, we do a Reserve and Commit at > >>> the same time which is the way it always worked and the way > >>> it has to work on windows. > >>> > >>> * For small pages, only the reserve is done, the commit will > >>> come later. (which is the way it worked for > >>> non-interleaved) > >>> > >>> * os::commit_memory changes > >>> * If UseNUMAIntereaving is true, os::commit_memory has to check > >>> whether it was being asked to commit memory that might have > >>> come from multiple Reserve allocations, if so, the commits > >>> must also be broken up. We don't keep any data structure to > >>> keep track of this, we just use VirtualQuery which queries the > >>> properties of a VA range and can tell us how much came from > >>> one VirtualAlloc call. > >>> > >>> I do not have a bug id for this. 
> >>>
> >>> -- Tom Deneau, AMD
> >>
> >

From bengt.rutisson at oracle.com  Thu Aug 25 07:27:18 2011
From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com)
Date: Thu, 25 Aug 2011 07:27:18 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new
Message-ID: <20110825072720.2E597470D8@hg.openjdk.java.net>

Changeset: 2f27ed2a98fa
Author:    brutisso
Date:      2011-08-23 11:06 +0200
URL:       http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2f27ed2a98fa

7082220: Visual Studio projects broken after change 7016797: Hotspot: securely/restrictive load dlls and new
Summary: Add the psapi.lib library to Visual Studio projects
Reviewed-by: jwilhelm, poonam, kamg

! src/share/tools/ProjectCreator/WinGammaPlatformVC10.java

From john.cuthbertson at oracle.com  Thu Aug 25 16:57:26 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Thu, 25 Aug 2011 09:57:26 -0700
Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures
In-Reply-To: <4E52D462.9040409@oracle.com>
References: <4E4D5757.7040004@oracle.com> <4E4EEDDB.4060408@oracle.com> <4E5172A4.7020802@oracle.com> <4E52D462.9040409@oracle.com>
Message-ID: <4E567EF6.7010202@oracle.com>

Hi Everyone,

I'm opening up a new version of these changes for code review that
includes a fix for the issue that was causing the failure I saw below
(many thanks to Tony for helping to diagnose the issue). The new webrev
can be found at: http://cr.openjdk.java.net/~johnc/7080389/webrev.2/

Testing: around 30 iterations of the GC test suite with both an
instrumented and non-instrumented product VM, with a marking threshold
of 20%, and with marking verification and VerifyBeforeGC enabled (this
combo showed the problem the most frequently - about 1/50 runs of
xalan).

Thanks,

JohnC

On 08/22/11 15:12, John Cuthbertson wrote:
> Hi Bengt, Tony, Stefan,
>
> Thanks again for the review comments. I have to hold off pushing this
> as I have seen a failure while doing a sanity check after merging the
> workspace. So until I get to the bottom of the failure - consider this
> as "on hold".
>
>
>
> JohnC
>
> On 08/21/11 14:03, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> This new webrev looks good to me.
>>
>> The discussion that Stefan and Tony had was good. I like the new
>> template parameter name (do_mark_object) and the new comments much
>> better. In fact, I realize my initial comments regarding those were a
>> bit wrong since I had misinterpreted the code due to the old
>> comments. Thanks for finding a good solution to this.
>>
>> Ship it!
>> Bengt
>>
>> On 2011-08-20 01:12, John Cuthbertson wrote:
>>> Hi Everyone,
>>>
>>> Hopefully this webrev
>>> (http://cr.openjdk.java.net/~johnc/7080389/webrev.1/) addresses
>>> everyone's comments.
>>>
>>> Thanks,
>>>
>>> JohnC
>>>
>>> On 08/18/11 11:17, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers review these refactoring changes
>>>> to the marking code used during evacuation pauses (both initial
>>>> mark pauses and regular evacuation pauses when marking is active) -
>>>> the change can be found at
>>>> http://cr.openjdk.java.net/~johnc/7080389/webrev.0/.
>>>>
>>>> The refactoring changes fix an issue that was seen with the code
>>>> changes for 6486945.
>>>>
>>>> During an initial mark pause, during root scanning, one thread had
>>>> successfully forwarded an object and had started to copy it. While
>>>> the object was being copied to its new location, another thread saw
>>>> that the object had been forwarded and, after checking that the new
>>>> location was unmarked, successfully marked the new location. The
>>>> first thread would finish the copying, see that the new location
>>>> was marked and skip the mark.
>>>> The situation I ran into was that I
>>>> was attempting to obtain the size of the new object just after it
>>>> was marked (by the thread doing the marking) and the old object had
>>>> not yet been fully copied to its new location.
>>>>
>>>> With these refactoring changes, the thread that successfully
>>>> forwards an object in the collection set will mark the forwardee
>>>> after copying - allowing me to safely obtain its size.
>>>>
>>>> Testing: several runs of the GC test suite with a marking threshold
>>>> of 10 and 20%, Kitchensink, and jprt.
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>>
>>>>
>>>
>>
>

From suraj.puvvada at gmail.com  Thu Aug 25 19:58:42 2011
From: suraj.puvvada at gmail.com (suraj puvvada)
Date: Thu, 25 Aug 2011 12:58:42 -0700
Subject: Enabling non product flags like "Verbose" in GC Code
Message-ID:

Hi,

How can I enable DEVELOP mode flags like "Verbose"? I'm interested in
seeing what the GC code logs - for example:

    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print_cr("ConcurrentMarkSweepGeneration::shrink_by:"
                    " desired_bytes " SIZE_FORMAT
                    " shrinkable_size_in_bytes " SIZE_FORMAT
                    " aligned_shrinkable_size_in_bytes " SIZE_FORMAT
                    " bytes " SIZE_FORMAT,
                    desired_bytes, shrinkable_size_in_bytes,
                    aligned_shrinkable_size_in_bytes, bytes);
      gclog_or_tty->print_cr(" old_end " SIZE_FORMAT
                    " unallocated_start " SIZE_FORMAT,
                    old_end, unallocated_start);
    }

-Suraj

From y.s.ramakrishna at oracle.com  Thu Aug 25 20:37:11 2011
From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna)
Date: Thu, 25 Aug 2011 13:37:11 -0700
Subject: Enabling non product flags like "Verbose" in GC Code
In-Reply-To:
References:
Message-ID: <4E56B277.7080803@oracle.com>

Hi Suraj --

Either:

(1) use a non-product build where the flag is available, OR

(2) rebuild with Verbose declared a product flag (but you will have to
    deal with develop->product contagion which will require more such
    changes), OR

(3) (probably the easiest in a specific product build) rebuild with
    Verbose changed to a new product flag of your choice for the
    specific sites where you want to print the info but want to retain
    the option of turning it off. Depending on where you do this, this
    may also cause a develop->product contagion, but it will be a more
    controlled burn, if i may be allowed to mix my metaphors.

(..) anything else?

The above are all one-offs for use in a specific build.

There may be good reason to protect some of these more useful messages
with a product flag rather than with a develop flag. I recall Krystal
Mok also mentioning something similar. Perhaps the community can work
on what are the kinds of messages one might want to see in production
(under control of a suitable manageable/product flag), and submit an
OpenJDK patch with those changes (hopefully the performance impact of
the check or enablement will be minor enough when these changes are for
example communicating ergonomic decisions etc. -- this should of course
be performance checked before a patch is submitted).

I'm also hoping that in the future some of these may be captured by the
logging framework under construction. Those working on or planning to
work on the logging framework may have more to add. So I am cc'ing the
serviceability alias as well.

-- ramki

On 8/25/2011 12:58 PM, suraj puvvada wrote:
> Hi,
>
> How can I enable DEVELOP mode flags like "Verbose" ?
I'm interested in > seeing what the GC code logs - for example : > > if (PrintGCDetails && Verbose) { > gclog_or_tty->print_cr("ConcurrentMarkSweepGeneration::shrink_by:" > " desired_bytes " SIZE_FORMAT > " shrinkable_size_in_bytes " SIZE_FORMAT > " aligned_shrinkable_size_in_bytes " SIZE_FORMAT > " bytes " SIZE_FORMAT, > desired_bytes, shrinkable_size_in_bytes, > aligned_shrinkable_size_in_bytes, bytes); > gclog_or_tty->print_cr(" old_end " SIZE_FORMAT > " unallocated_start " SIZE_FORMAT, > old_end, unallocated_start); > } > > > -Suraj -------------- next part -------------- An HTML attachment was scrubbed... URL: From suraj.puvvada at gmail.com Thu Aug 25 21:34:31 2011 From: suraj.puvvada at gmail.com (suraj puvvada) Date: Thu, 25 Aug 2011 14:34:31 -0700 Subject: Enabling non product flags like "Verbose" in GC Code In-Reply-To: <4E56B277.7080803@oracle.com> References: <4E56B277.7080803@oracle.com> Message-ID: Thanks. Are the non-product builds available online to download ? -Suraj On Thu, Aug 25, 2011 at 1:37 PM, Ramki Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > ** > Hi Suraj -- > > Either: > > (1) use a non-product build where the flag is available, OR > > (2) rebuild with Verbose declared a product flag (but you will have to deal > with > develop->product contagion which will require more such changes), OR > > (3) (probably the easiest in a specific product build) rebuild with Verbose > changed to > a new product flag of your choice for the specific sites where you > want to print the info > but want to retain the option of turning it off. Depending on where > you do this, this > may also cause a develop->product contagion, but it will be a more > controlled burn, if > i may be allowed to mix my metaphors. > > (..) anything else? > > The above are all one-off's for use in a specific build. > > There may be good reason to protect some of these more useful messages with > a product > flag rather than with a develop flag. 
I recall Krystal Mok also mentioning > something similar. > Perhaps the community can work on what are the kinds of messages one might > want to > see in production (under control of a suitable manageable/product flag), > and submit an OpenJDK > patch with those changes (hopefully the performance impact of the check or > enablement > will be minor enough when these changes are for example communicating > ergonomic > decisions etc. -- this should of course be performance checked before a > patch is submitted). > > I'm also hoping that in the future some of these may be captured by the > logging framework > under construction. Those working on or planning to work on the logging > framework may have > more to add. So I am cc'ing the serviceability alias as well. > > -- ramki > > > On 8/25/2011 12:58 PM, suraj puvvada wrote: > > Hi, > > How can I enable DEVELOP mode flags like "Verbose" ? I'm interested in > seeing what the GC code logs - for example : > > if (PrintGCDetails && Verbose) { > gclog_or_tty->print_cr("ConcurrentMarkSweepGeneration::shrink_by:" > " desired_bytes " SIZE_FORMAT > " shrinkable_size_in_bytes " SIZE_FORMAT > " aligned_shrinkable_size_in_bytes " SIZE_FORMAT > " bytes " SIZE_FORMAT, > desired_bytes, shrinkable_size_in_bytes, > aligned_shrinkable_size_in_bytes, bytes); > gclog_or_tty->print_cr(" old_end " SIZE_FORMAT > " unallocated_start " SIZE_FORMAT, > old_end, unallocated_start); > } > > > -Suraj > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From y.s.ramakrishna at oracle.com Thu Aug 25 22:21:10 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Thu, 25 Aug 2011 15:21:10 -0700 Subject: Enabling non product flags like "Verbose" in GC Code In-Reply-To: References: <4E56B277.7080803@oracle.com> Message-ID: <4E56CAD6.4030007@oracle.com> i don't know about that (maybe someone on the list knows where one can download pre-built non-product binaries from), but if you are set up to build hotspot, you would use target fastdebug. -- ramki On 8/25/2011 2:34 PM, suraj puvvada wrote: > Thanks. > > Are the non-product builds available online to download ? > > -Suraj > > On Thu, Aug 25, 2011 at 1:37 PM, Ramki Ramakrishna > > wrote: > > Hi Suraj -- > > Either: > > (1) use a non-product build where the flag is available, OR > > (2) rebuild with Verbose declared a product flag (but you will > have to deal with > develop->product contagion which will require more such > changes), OR > > (3) (probably the easiest in a specific product build) rebuild > with Verbose changed to > a new product flag of your choice for the specific sites > where you want to print the info > but want to retain the option of turning it off. Depending > on where you do this, this > may also cause a develop->product contagion, but it will be > a more controlled burn, if > i may be allowed to mix my metaphors. > > (..) anything else? > > The above are all one-off's for use in a specific build. > > There may be good reason to protect some of these more useful > messages with a product > flag rather than with a develop flag. I recall Krystal Mok also > mentioning something similar. 
> Perhaps the community can work on what are the kinds of messages > one might want to > see in production (under control of a suitable manageable/product > flag), and submit an OpenJDK > patch with those changes (hopefully the performance impact of the > check or enablement > will be minor enough when these changes are for example > communicating ergonomic > decisions etc. -- this should of course be performance checked > before a patch is submitted). > > I'm also hoping that in the future some of these may be captured > by the logging framework > under construction. Those working on or planning to work on the > logging framework may have > more to add. So I am cc'ing the serviceability alias as well. > > -- ramki > > > On 8/25/2011 12:58 PM, suraj puvvada wrote: >> Hi, >> >> How can I enable DEVELOP mode flags like "Verbose" ? I'm >> interested in seeing what the GC code logs - for example : >> >> if (PrintGCDetails && Verbose) { >> >> gclog_or_tty->print_cr("ConcurrentMarkSweepGeneration::shrink_by:" >> " desired_bytes " SIZE_FORMAT >> " shrinkable_size_in_bytes " SIZE_FORMAT >> " aligned_shrinkable_size_in_bytes " SIZE_FORMAT >> " bytes " SIZE_FORMAT, >> desired_bytes, shrinkable_size_in_bytes, >> aligned_shrinkable_size_in_bytes, bytes); >> gclog_or_tty->print_cr(" old_end " SIZE_FORMAT >> " unallocated_start " SIZE_FORMAT, >> old_end, unallocated_start); >> } >> >> >> -Suraj > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From igor.veresov at oracle.com Fri Aug 26 02:31:53 2011 From: igor.veresov at oracle.com (igor.veresov at oracle.com) Date: Fri, 26 Aug 2011 02:31:53 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7082969: NUMA interleaving Message-ID: <20110826023158.4F4D047119@hg.openjdk.java.net> Changeset: 3cd0157e1d4d Author: iveresov Date: 2011-08-25 02:57 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/3cd0157e1d4d 7082969: NUMA interleaving Summary: Support interleaving on NUMA systems for collectors that don't have NUMA-awareness. Reviewed-by: iveresov, ysr Contributed-by: Tom Deneau ! src/os/linux/vm/os_linux.cpp ! src/os/solaris/vm/os_solaris.cpp ! src/os/windows/vm/os_windows.cpp ! src/os/windows/vm/os_windows.hpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp From rednaxelafx at gmail.com Fri Aug 26 05:16:25 2011 From: rednaxelafx at gmail.com (Krystal Mok) Date: Fri, 26 Aug 2011 13:16:25 +0800 Subject: Enabling non product flags like "Verbose" in GC Code In-Reply-To: <4E56CAD6.4030007@oracle.com> References: <4E56B277.7080803@oracle.com> <4E56CAD6.4030007@oracle.com> Message-ID: Comments inline below. On Fri, Aug 26, 2011 at 4:37 AM, Ramki Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > Hi Suraj -- > Either: > (1) use a non-product build where the flag is available, OR > (2) rebuild with Verbose declared a product flag (but you will have to deal > with > develop->product contagion which will require more such changes), OR > (3) (probably the easiest in a specific product build) rebuild with Verbose > changed to > a new product flag of your choice for the specific sites where you > want to print the info > but want to retain the option of turning it off. Depending on where > you do this, this > may also cause a develop->product contagion, but it will be a more > controlled burn, if > i may be allowed to mix my metaphors. > (..) anything else? > The above are all one-off's for use in a specific build. 
Yep, these are all valid. I've tried all of them. (1) is good for those who are not interested in building a VM themselves. See below for links for downloading a fastdebug build. (2) is probably not the way one would want to go if the interest is only in a specific part of the Verbose log. Turning Verbose from develop to product almost always gives you too much information, and it's hard to filter out what you really want. Not to mention there's significant overhead to turning all Verbose logging on. Besides, part of the Verbose log printing is wrapped in #ifndef PRODUCT; those parts are often really expensive and won't show up in a product build even if Verbose is changed to a product flag. (3) is what I've been using for investigating some of our GC issues in production. It works great. But of course it's only for a specific purpose. On Fri, Aug 26, 2011 at 4:37 AM, Ramki Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > There may be good reason to protect some of these more useful messages with > a product > flag rather than with a develop flag. I recall Krystal Mok also mentioning > something similar. > Perhaps the community can work on what are the kinds of messages one might > want to > see in production (under control of a suitable manageable/product flag), > and submit an OpenJDK > patch with those changes (hopefully the performance impact of the check or > enablement > will be minor enough when these changes are for example communicating > ergonomic > decisions etc. -- this should of course be performance checked before a > patch is submitted). That's right, I've been working in this direction. I've temporarily added a "PrintGCReason" production flag in my own build to print short messages about the direct cause of a collection. Not done yet. It's not as simple as just changing some of the (PrintGCDetails && Verbose) logging to use the new flag, because there are cases where a collection is triggered but no verbose log is printed.
If I can get it to a point where it's mature enough, I'll submit a patch to OpenJDK for open discussion. But that'll be after our OCA issue is resolved. (The "PrintGCReason" stuff is very different from HotSpot's existing notion of "GCCause". The latter is too coarse and doesn't really provide enough diagnostics of what's going on.) I'm also experimenting with a new GC log parser that tries to parse some of the output from the fancier flags. The original parser framework in GCHisto doesn't seem to be powerful enough to do so, because it's stuck with regular expressions but I think it needs a pushdown automaton (PDA) to really get the job done. Wonder how other guys are approaching this problem. On Fri, Aug 26, 2011 at 4:37 AM, Ramki Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > I'm also hoping that in the future some of these may be captured by the > logging framework > under construction. Those working on or planning to work on the logging > framework may have > more to add. So I am cc'ing the serviceability alias as well. I'm really looking forward to this one. Please keep us updated on any plans for a more uniform GC log format and the accompanying logging framework. On Fri, Aug 26, 2011 at 5:34 AM, suraj puvvada wrote: > Thanks. > Are the non-product builds available online to download ? You can find some of the older builds on java.net, such as [1]. The fastdebug builds for JDK 6 used to be published at [2], but they're no longer available. They used to be removed (or hidden? I don't know) once an FCS is released, though. Making your own build of the HotSpot VM isn't hard. Charles Nutter was kind enough to share his build script recently; following something similar may be a good starting point [3]. Regards, Kris Mok [1]: http://download.java.net/jdk6/6u25/promoted/b03/binaries/ [2]: http://download.java.net/jdk6/binaries/ [3]: http://mail.openjdk.java.net/pipermail/mlvm-dev/2011-August/003775.html -------------- next part -------------- An HTML attachment was scrubbed...
URL: From bengt.rutisson at oracle.com Fri Aug 26 08:11:40 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 26 Aug 2011 10:11:40 +0200 Subject: RFR (S): 7080389: G1: refactor marking code in evacuation pause copy closures In-Reply-To: <4E567EF6.7010202@oracle.com> References: <4E4D5757.7040004@oracle.com> <4E4EEDDB.4060408@oracle.com> <4E5172A4.7020802@oracle.com> <4E52D462.9040409@oracle.com> <4E567EF6.7010202@oracle.com> Message-ID: <4E57553C.1060208@oracle.com> Looks good to me. Bengt On 2011-08-25 18:57, John Cuthbertson wrote: > Hi Everyone, > > I'm opening up a new version of these changes for code review that > includes a fix for the issue that was causing the failure I saw below > (many thanks to Tony for helping to diagnose the issue). The new > webrev can be found at: > http://cr.openjdk.java.net/~johnc/7080389/webrev.2/ > > Testing: around 30 iterations of the GC test suite with both an > instrumented and non-instrumented product VM, with a marking threshold > of 20%, and with marking verification and VerifyBeforeGC enabled (this > combo should show the problem the most frequently - about 1/50 runs of xalan). > > Thanks, > > JohnC > > On 08/22/11 15:12, John Cuthbertson wrote: >> Hi Bengt, Tony, Stefan, >> >> Thanks again for the review comments. I have to hold off pushing this >> as I have seen a failure while doing a sanity check after merging the >> workspace. So until I get to the bottom of the failure - consider >> this as "on hold". >> >> >> >> JohnC >> >> On 08/21/11 14:03, Bengt Rutisson wrote: >>> >>> Hi John, >>> >>> This new webrev looks good to me. >>> >>> The discussion that Stefan and Tony had was good. I like the new >>> template parameter name (do_mark_object) and the new comments much >>> better. In fact, I realize my initial comments regarding those were >>> a bit wrong since I had misinterpreted the code due to the old >>> comments. Thanks for finding a good solution to this. >>> >>> Ship it!
>>> Bengt >>> >>> On 2011-08-20 01:12, John Cuthbertson wrote: >>>> Hi Everyone, >>>> >>>> Hopefully this webrev >>>> (http://cr.openjdk.java.net/~johnc/7080389/webrev.1/) addresses >>>> everyone's comments. >>>> >>>> Thanks, >>>> >>>> JohnC >>>> >>>> On 08/18/11 11:17, John Cuthbertson wrote: >>>>> Hi Everyone, >>>>> >>>>> Can I have a couple of volunteers review these refactoring changes >>>>> to the marking code used during evacuation pauses (both initial >>>>> mark pauses and regular evacuation pauses when marking is active) >>>>> - the change can be found at >>>>> http://cr.openjdk.java.net/~johnc/7080389/webrev.0/. >>>>> >>>>> The refactoring changes fix an issue that was seen with the code >>>>> changes for 6486945. >>>>> >>>>> During an initial mark pause, during root scanning, one thread had >>>>> successfully forwarded an object and had started to copy it. While >>>>> the object was being copied to its new location, another thread >>>>> saw that the object had been forwarded and, after checking that >>>>> the new location was unmarked, successfully marked the new >>>>> location. The first thread would finish the copying, see that the >>>>> new location was marked and skip the mark. The situation I ran >>>>> into was that I was attempting to obtain the size of the new >>>>> object just after it was marked (by the thread doing the marking) >>>>> and the old object had not yet been fully copied to its new location. >>>>> >>>>> With these refactoring changes, the thread that successfully >>>>> forwards an object in the collection set will mark the forwardee >>>>> after copying - allowing me to safely obtain its size. >>>>> >>>>> Testing: several runs of the GC test suite with a marking >>>>> threshold of 10 and 20%, Kitchensink, and jprt.
>>>>> >>>>> Thanks, >>>>> >>>>> JohnC >>>>> >>>>> >>>> >>> >> > From tony.printezis at oracle.com Mon Aug 29 15:33:13 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Mon, 29 Aug 2011 11:33:13 -0400 Subject: CRR (M): 7050392: G1: Introduce flag to generate a log of the G1 ergonomic decisions Message-ID: <4E5BB139.4040909@oracle.com> Hi all, I would like a couple of code reviews for this change which consolidates and extends a lot of the ad-hoc ergonomic decision output we have in G1: http://cr.openjdk.java.net/~tonyp/7050392/webrev.0/ This is the first batch of changes related to the ergonomic decision output and covers decisions that affect the following heuristics: * Heap Resizing : when and why we resize the heap * CSet Construction : how we construct the collection set, which old regions we add to it, when and why we stop adding old regions, etc. * Concurrent Cycles : when we initiate a concurrent cycle and why, what's the state of the heap at the end of a concurrent cycle, etc. * Partially-Young GCs : when we start and end partially-young GCs and why This will be extended with output for two additional heuristics * Young Gen Sizing : how we grow the young gen, why we stop allocating young gen regions, etc. * Pause Prediction : details of how the pause prediction mechanism works on a separate CR (7084525) as it requires some non-trivial code refactoring (7084509). I decided to split the work into three CRs to make the code reviews a bit more manageable. Probably, the most controversial part of this CR is the set of macros I introduced in order to generate the ergo decision records. I definitely wanted to have a standard way to construct the records in order to ensure consistency in their formatting (to make any potential parsing of the output easier). But the problem is that different ergo decision records have quite different information on them which made generating them in a standard way quite tricky. So, this is the approach that I took.
Let's consider the following log record, which is in the format generated by the new code: 8.675: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 31346656 bytes, threshold: 30198960 bytes (45.00 %)] The log record consists of the following: 1) 8.675 : a time stamp in the same format enabled with the -XX:+PrintGCTimeStamps which is always enabled 2) [G1Ergonomics ... ] : standard prefix / suffix to be able to easily identify such records 3) (Concurrent Cycles) : the heuristic the record corresponds to 4) request concurrent cycle initiation : a string describing the action the ergonomic decision took 5) reason: occupancy higher than threshold : an optional string describing the reason for the above action 6) occupancy / threshold : optional values that contributed to the decision 1), 2), 3), and 4) always appear on each record, 5) and 6) are optional. I wanted each record to be printed with a single print command so that it's not split by concurrent output. I also wanted the format string of each record to be statically allocated to avoid having to stack-allocate char[] buffers, etc. These two requirements, and the fact that some of the compilers we use do not support var-arg macros, made constructing the macros I needed a bit ugly. Here's an example of such a macro; in fact it's the one that generates the above record (BTW: this is a good time to take your anti-nausea medication): ergo_verbose3(ErgoConcCycles, "request concurrent cycle initiation", ergo_reason_format("occupancy higher than threshold") ergo_byte_format("occupancy") ergo_byte_perc_format("threshold"), cur_used_bytes, min_used_targ, (double) InitiatingHeapOccupancyPercent); The parameters are: arg 1 : The heuristic ID. arg 2 : The action string. arg 3 : Concatenation of the format string, generated using the ergo_*_format() macros, that includes the optional parts of the record (reason and values). arg 4-6 : The optional values. 
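A minimal sketch of how a fixed-arity macro of this kind could work, assuming nothing about the actual webrev beyond what is described above. The SKETCH_* names are hypothetical stand-ins for ergo_verbose3() and the ergo_*_format() helpers, the heuristic is passed as a plain string rather than an ID, and the record is written into a caller-supplied buffer rather than the GC log. The point is only that adjacent string literals are concatenated at compile time, so the format string stays static and the record is produced by a single print call:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-ins for the ergo_*_format() helpers named above; the
// definitions in the webrev may well differ. Each one expands to string
// literals only, so the compiler concatenates adjacent pieces.
#define SKETCH_REASON_FORMAT(reason_)    ", reason: " reason_
#define SKETCH_BYTE_FORMAT(name_)        ", " name_ ": %zu bytes"
#define SKETCH_BYTE_PERC_FORMAT(name_)   ", " name_ ": %zu bytes (%1.2f %%)"

// Hypothetical fixed-arity stand-in for ergo_verbose3(): no var-arg macro is
// needed, the format string is one static literal, and the whole record is
// written by one call. Printing into a caller-supplied buffer (instead of the
// GC log) is an assumption made to keep the sketch self-contained.
#define SKETCH_VERBOSE3(buf_, len_, time_, tag_, action_, extra_,           \
                        arg0_, arg1_, arg2_)                                \
  std::snprintf((buf_), (len_),                                             \
                "%1.3f: [G1Ergonomics (%s) " action_ extra_ "]",            \
                (time_), (tag_), (arg0_), (arg1_), (arg2_))

// Rebuilds the example record from the message using illustrative inputs.
std::string sketch_conc_cycle_record() {
  char buf[256];
  const std::size_t cur_used_bytes = 31346656;
  const std::size_t min_used_targ  = 30198960;
  const double occupancy_perc      = 45.0;
  SKETCH_VERBOSE3(buf, sizeof(buf), 8.675, "Concurrent Cycles",
                  "request concurrent cycle initiation",
                  SKETCH_REASON_FORMAT("occupancy higher than threshold")
                  SKETCH_BYTE_FORMAT("occupancy")
                  SKETCH_BYTE_PERC_FORMAT("threshold"),
                  cur_used_bytes, min_used_targ, occupancy_perc);
  return std::string(buf);
}
```

Calling sketch_conc_cycle_record() rebuilds the example record quoted earlier in this message. The byte/percentage helper consumes two values (a byte count and a percentage), which is why three values are passed for only two named fields.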
Given that we don't have var-arg macros, this takes exactly 3 extra args (which is why it's named ergo_verbose3()). I have separate macros for between 0 and 6 args, they are wrappers around a common macro, and they pass to it dummy values for any args that are not needed. I'm not 100% happy with this but I can't think of a better way to do this. Let's go with it if it's not totally awful (or if there are no better alternative suggestions). Even though there's a fair amount of changes to the code, once you get past the ergo_verbose*() macros and friends :-), the rest of the changes are reasonably straightforward and mostly self-contained. I did very light refactoring in a few places in order to split some code paths which were taken by different ergonomic decisions so that I can generate the appropriate output. Tony From john.cuthbertson at oracle.com Mon Aug 29 23:55:29 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Mon, 29 Aug 2011 23:55:29 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7080389: G1: refactor marking code in evacuation pause copy closures Message-ID: <20110829235532.3BE0E471F2@hg.openjdk.java.net> Changeset: eeae91c9baba Author: johnc Date: 2011-08-29 10:13 -0700 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/eeae91c9baba 7080389: G1: refactor marking code in evacuation pause copy closures Summary: Refactor code marking code in the evacuation pause copy closures so that an evacuated object is only marked by the thread that successfully copies it. Reviewed-by: stefank, brutisso, tonyp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1OopClosures.hpp ! 
src/share/vm/gc_implementation/g1/g1_specialized_oop_closures.hpp From y.s.ramakrishna at oracle.com Tue Aug 30 06:40:04 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Mon, 29 Aug 2011 23:40:04 -0700 Subject: Default max heap size In-Reply-To: References: Message-ID: <4E5C85C4.3090600@oracle.com> Hi Sergejs -- You are right. This seems to have been changed in hs16/6u18 via:- 6887571 Increase default heap config sizes Changeset: http://hg.openjdk.java.net/hsx/hsx16/baseline/rev/0799687b7385 The documentation you pointed to probably dates back to 6.0 FCS and is likely obsolete in places. Unfortunately, I do not have a more up-to-date counterpart of the document to point you to as the one place for consolidated and more up-to-date information. The release notes for 6u18 however did list this change here:- http://www.oracle.com/technetwork/java/javase/6u18-142093.html Search for "Server JVM heap configuration ergonomics". -- ramki On 8/29/2011 10:34 AM, Sergejs Melderis wrote: > Hello. > I am trying to figure out how the hotspot chooses the default maximum heap size. > I posted this question to stackoverflow, but got no answers. > I don't want to repeat it here, so here is the question > http://stackoverflow.com/questions/7194526/hotspot-default-max-heap-size > > I searched the jdk source code, for the place where it is calculated.
> I found function set_heap_size defined here > http://hg.openjdk.java.net/jdk6/jdk6/hotspot/file/dc40301aed45/src/share/vm/runtime/arguments.cpp > > If I am not wrong, the calculation happens in the following lines > > if (FLAG_IS_DEFAULT(MaxHeapSize)) { > julong reasonable_max = phys_mem / MaxRAMFraction; > > if (phys_mem <= MaxHeapSize * MinRAMFraction) { > // Small physical memory, so use a minimum fraction of it for the heap > reasonable_max = phys_mem / MinRAMFraction; > } else { > // Not-small physical memory, so require a heap at least > // as large as MaxHeapSize > reasonable_max = MAX2(reasonable_max, (julong)MaxHeapSize); > } > > > MaxRAMFraction is 4, so reasonable_max is phys_mem / 4. So, unless > physical memory is very small, > the reasonable_max will be MAX2(reasonable_max, (julong)MaxHeapSize); > > MAX2 is defined as > #define MAX2(a, b) (((a) < (b)) ? (b) : (a)) > > At the end reasonable_max is set as MaxHeapSize > FLAG_SET_ERGO(uintx, MaxHeapSize, (uintx)reasonable_max); > > If I plug in the memory size on my test machine, the reasonable_max > will be very close to what I get from jmap -heap. > With RAM of 8, 16 GB, or more, the MaxHeapSize will be greater than 1 > GB, which contradicts the documentation > http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#par_gc.ergonomics.default_size > > Thanks, > > Sergey. > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed...
URL: -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Tue Aug 30 06:53:05 2011 From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna) Date: Mon, 29 Aug 2011 23:53:05 -0700 Subject: Long "stop-the-world" pauses in CMS GC mode In-Reply-To: References: Message-ID: <4E5C88D1.9040401@oracle.com> Hi David -- you have a 48 core machine but you have handicapped your JVM by forcing -XX:-CMSParallelRemarkEnabled, which forces single-threaded remarks on such a huge heap. Please remove that flag so all 48 cores can be used for the remark pause (and it should disappear from your pause-time radar entirely i think). If "ref proc" turns out to be still a problem, you can then enable -XX:+ParallelRefProcEnabled to parallelize that sub-phase as well. As to the 20 s pause in the middle of nowhere, I am clueless, but switch on -XX:+PrintSafepointStatistics to see what that long pause corresponds to. Perhaps some kind of bulk bias revocation; I am not sure... -- ramki On 8/29/2011 2:25 AM, David Tavoularis wrote: > Hi, > > I am trying to understand the cause of "stop-the-world" pauses in my > application using CMS GC and a large heap (48GB). > The production server SF6900 (24 x dual-core UltraSparc-IV 1.35GHz, 48 > working threads, 140GB RAM) is running on Solaris 9 and Java6u25.
> > I know that there are several possible causes : > 1) OldGen fragmentation : to avoid it, I implemented an automatic > FullGC in crontab at 2:30am > 30 2 * * * /usr/jdk/instances/jdk1.6.0/bin/jmap -d64 -histo:live > `/usr/bin/pgrep -f "XXXXXXX"` 2>&1 >/dev/null > > 2) Weak refs processing : a workaround (not tried yet) is to use > -XX:+ParallelRefProcEnabled, as described in the following articles : > http://blogs.oracle.com/jonthecollector/entry/top_10_gc_reasons > http://stackoverflow.com/questions/4101540/how-can-i-lower-the-weak-ref-processing-time-during-gc > I have found out that it could be triggered by the daily unreferencing > of a big object containing millions of small objects (using weak > references). > > > The application has been running for almost a week and I can see some > "stop-the-world" pauses longer than 10 seconds : > *$ egrep "Total time for which application threads were stopped: > [0-9][0-9]\." gc_201108232207.log* > Total time for which application threads were stopped: *10*.8630158 > seconds *<- due to weak refs* > Total time for which application threads were stopped: *18*.5259611 > seconds > Total time for which application threads were stopped: *10*.0777809 > seconds *<- due to weak refs* > Total time for which application threads were stopped: *61*.5576519 > seconds > Total time for which application threads were stopped: *19*.0205127 > seconds > Total time for which application threads were stopped: *20*.6893643 > seconds > Total time for which application threads were stopped: *16*.0048075 > seconds > Total time for which application threads were stopped: *12*.3665083 > seconds *<- due to weak refs* > Total time for which application threads were stopped: *11*.5213443 > seconds *<- due to weak refs* > Total time for which application threads were stopped: *37*.1018520 > seconds *<- due to weak refs* > Total time for which application threads were stopped: *16*.3988783 > seconds *<- due to weak refs* > Total time for which 
application threads were stopped: *12*.4057546 > seconds > > 6 of them have unknown explanation for me. > > For your information, here are the 6 "weak refs" log messages : > $ egrep "weak refs processing, [1-9][0-9]?" gc_201108232207.log | more > 2011-08-24T10:13:49.641+0100: 43564.409: [GC[YG occupancy: 342791 K > (943744 K)]43564.410: [Rescan (non-parallel) 43564.410: [grey object > rescan, 0.7358794 secs]43565.146: [root rescan, 1.9033345 secs], > 2.6398211 secs]43567.049: [weak refs processing, 8.2148555 secs] [1 > CMS-remark: 26914465K(49283072K)] 27257257K(50226816K), 10.8566498 > secs] *[Times: user=10.85 sys=0.00, real=10.86 secs]* > 2011-08-25T12:33:22.658+0100: 138336.194: [GC[YG occupancy: 179985 K > (943744 K)]138336.195: [Rescan (non-parallel) 138336.195: [grey object > rescan, 0.5969886 secs]138336.792: [root rescan, 0.5114118 secs], > 1.1089811 secs]138337.304: [weak refs processing, 8.8414246 secs] [1 > CMS-remark: 20122279K(49283072K)] 20302264K(5226816K), 9.9514563 secs] > *[Times: user=9.94 sys=0.01, real=9.95 secs]* > 2011-08-26T07:22:55.233+0100: 206107.887: [GC[YG occupancy: 177014 K > (943744 K)]206107.888: [Rescan (non-parallel) 206107.888: [grey object > rescan, 0.4472730 secs]206108.335: [root rescan, 1.5575365 secs], > 2.0053337 secs]206109.893: [weak refs processing, 10.3436973 secs] [1 > CMS-remark: 19861286K(49283072K)] 20038301K(50226816K), 12.3572481 > secs] *[Times: user=12.22 sys=0.00, real=12.36 secs]* > 2011-08-26T07:51:55.531+0100: 207848.163: [GC[YG occupancy: 423184 K > (943744 K)]207848.163: [Rescan (non-parallel) 207848.163: [grey object > rescan, 0.4466552 secs]207848.610: [root rescan, 3.4207362 secs], > 3.8680060 secs]207852.031: [weak refs processing, 7.6403893 secs] [1 > CMS-remark: 19714349K(49283072K)] 20137533K(50226816K), 11.5130922 > secs] *[Times: user=11.51 sys=0.00, real=11.51 secs]* > 2011-08-27T15:18:48.928+0100: 321060.091: [GC[YG occupancy: 711567 K > (943744 K)]321060.092: [Rescan (non-parallel) 321060.092: 
[grey object > rescan, 0.4628955 secs]321060.555: [root rescan, 3.2087381 secs], > 3.6721710 secs]321063.764: [weak refs processing, 33.3995481 secs] [1 > CMS-remark: 19918243K(49283072K)] 20629810K(50226816K), 37.0910804 > secs] *[Times: user=37.04 sys=0.00, real=37.09 secs]* > 2011-08-28T11:17:12.144+0100: 392962.378: [GC[YG occupancy: 811576 K > (943744 K)]392962.378: [Rescan (non-parallel) 392962.378: [grey object > rescan, 0.4140054 secs]392962.793: [root rescan, 4.4323136 secs], > 4.8469694 secs]392967.225: [weak refs processing, 11.5384812 secs] [1 > CMS-remark: 19819290K(49283072K)] 20630867K(50226816K), 16.3885374 > secs] *[Times: user=16.35 sys=0.01, real=16.39 secs]* > > > > > > > *1. Here is the first pattern : a _61-second pause_, but I don't see > any suspicious message in GC logs:* > 2011-08-24T10:24:25.748+0100: 44200.509: [GC 44200.511: [ParNew > Desired survivor size 53673984 bytes, new threshold 1 (max 4) > - age 1: 101879520 bytes, 101879520 total > : 933589K->104832K(943744K), 0.3947382 secs] > 21369469K->20703994K(50226816K), 0.3966779 secs] [Times: user=6.43 > sys=0.04, real=0.40 secs] > Heap after GC invocations=1187 (full 12): > par new generation total 943744K, used 104832K [0xfffffff353c00000, > 0xfffffff393c00000, 0xfffffff393c00000) > eden space 838912K, 0% used [0xfffffff353c00000, 0xfffffff353c00000, > 0xfffffff386f40000) > from space 104832K, 100% used [0xfffffff386f40000, 0xfffffff38d5a0000, > 0xfffffff38d5a0000) > to space 104832K, 0% used [0xfffffff38d5a0000, 0xfffffff38d5a0000, > 0xfffffff393c00000) > concurrent mark-sweep generation total 49283072K, used 20599162K > [0xfffffff393c00000, 0xffffffff53c00000, 0xffffffff53c00000) > concurrent-mark-sweep perm gen total 524288K, used 42905K > [0xffffffff53c00000, 0xffffffff73c00000, 0xffffffff73c00000) > } > Total time for which application threads were stopped: 0.4110458 seconds > Application time: 39.5906692 seconds > {Heap before GC invocations=1187 (full 12): > par new generation 
total 943744K, used 943744K [0xfffffff353c00000, > 0xfffffff393c00000, 0xfffffff393c00000) > eden space 838912K, 100% used [0xfffffff353c00000, 0xfffffff386f40000, > 0xfffffff386f40000) > from space 104832K, 100% used [0xfffffff386f40000, 0xfffffff38d5a0000, > 0xfffffff38d5a0000) > to space 104832K, 0% used [0xfffffff38d5a0000, 0xfffffff38d5a0000, > 0xfffffff393c00000) > concurrent mark-sweep generation total 49283072K, used 20599162K > [0xfffffff393c00000, 0xffffffff53c00000, 0xffffffff53c00000) > concurrent-mark-sweep perm gen total 524288K, used 42905K > [0xffffffff53c00000, 0xffffffff73c00000, 0xffffffff73c00000) > 2011-08-24T10:25:07.776+0100: 44242.537: [GC 44301.853: [ParNew > Desired survivor size 53673984 bytes, new threshold 1 (max 4) > - age 1: 99505080 bytes, 99505080 total > : 943744K->104832K(943744K), 0.2010508 secs] > 21542906K->20852742K(50226816K), 0.2022636 secs] *[Times: user=5.67 > sys=0.02, real=59.52 secs]* > Heap after GC invocations=1188 (full 12): > par new generation total 943744K, used 104832K [0xfffffff353c00000, > 0xfffffff393c00000, 0xfffffff393c00000) > eden space 838912K, 0% used [0xfffffff353c00000, 0xfffffff353c00000, > 0xfffffff386f40000) > from space 104832K, 100% used [0xfffffff38d5a0000, 0xfffffff393c00000, > 0xfffffff393c00000) > to space 104832K, 0% used [0xfffffff386f40000, 0xfffffff386f40000, > 0xfffffff38d5a0000) > concurrent mark-sweep generation total 49283072K, used 20747910K > [0xfffffff393c00000, 0xffffffff53c00000, 0xffffffff53c00000) > concurrent-mark-sweep perm gen total 524288K, used 42905K > [0xffffffff53c00000, 0xffffffff73c00000, 0xffffffff73c00000) > } > *Total time for which application threads were stopped: 61.5576519 > seconds* > Application time: 0.0245838 seconds > Total time for which application threads were stopped: 9.8331189 seconds > Application time: 0.0012626 seconds > Total time for which application threads were stopped: 0.0090404 seconds > Application time: 0.0008943 seconds > Total time for 
which application threads were stopped: 0.0020415 seconds > Application time: 0.0008181 seconds > Total time for which application threads were stopped: 0.2338605 seconds > Application time: 0.0018822 seconds > > The only suspicious thing is "[Times: user=5.67 sys=0.02, real=59.52 > secs]", which means that the "real" duration is a lot higher than > "user" CPU time. > Because "sys" duration is low, it also means that the server is not > swapping. > What could explain this 61 seconds pause ? > > > > *2. Here is the second pattern : a 20-second pause, in the middle of > nowhere in GC logs :* > {Heap before GC invocations=11132 (full 166): > par new generation total 943744K, used 882686K [0xfffffff353c00000, > 0xfffffff393c00000, 0xfffffff393c00000) > eden space 838912K, 100% used [0xfffffff353c00000, 0xfffffff386f40000, > 0xfffffff386f40000) > from space 104832K, 41% used [0xfffffff386f40000, 0xfffffff3899ffa48, > 0xfffffff38d5a0000) > to space 104832K, 0% used [0xfffffff38d5a0000, 0xfffffff38d5a0000, > 0xfffffff393c00000) > concurrent mark-sweep generation total 49283072K, used 19148140K > [0xfffffff393c00000, 0xffffffff53c00000, 0xffffffff53c00000) > concurrent-mark-sweep perm gen total 524288K, used 44308K > [0xffffffff53c00000, 0xffffffff73c00000, 0xffffffff73c00000) > 2011-08-25T20:07:07.235+0100: 165560.417: [GC 165560.417: [ParNew > Desired survivor size 53673984 bytes, new threshold 4 (max 4) > - age 1: 26189384 bytes, 26189384 total > - age 2: 1713728 bytes, 27903112 total > : 882686K->34449K(943744K), 0.1280202 secs] > 20030826K->19182589K(50226816K), 0.1285927 secs] [Times: user=3.94 > sys=0.01, real=0.13 secs] > Heap after GC invocations=11133 (full 166): > par new generation total 943744K, used 34449K [0xfffffff353c00000, > 0xfffffff393c00000, 0xfffffff393c00000) > eden space 838912K, 0% used [0xfffffff353c00000, 0xfffffff353c00000, > 0xfffffff386f40000) > from space 104832K, 32% used [0xfffffff38d5a0000, 0xfffffff38f744468, > 0xfffffff393c00000) > to 
space 104832K, 0% used [0xfffffff386f40000, 0xfffffff386f40000, > 0xfffffff38d5a0000) > concurrent mark-sweep generation total 49283072K, used 19148140K > [0xfffffff393c00000, 0xffffffff53c00000, 0xffffffff53c00000) > concurrent-mark-sweep perm gen total 524288K, used 44308K > [0xffffffff53c00000, 0xffffffff73c00000, 0xffffffff73c00000) > } > Total time for which application threads were stopped: 0.1370098 seconds > Application time: 53.6273550 seconds > Total time for which application threads were stopped: 0.0429426 seconds > Application time: 0.0002318 seconds > Total time for which application threads were stopped: 0.0044294 seconds > Application time: 0.0002250 seconds > Total time for which application threads were stopped: 0.0016478 seconds > Application time: 59.0926108 seconds > Total time for which application threads were stopped: 0.0431387 seconds > Application time: 0.0002193 seconds > Total time for which application threads were stopped: 0.0020966 seconds > Application time: 0.0000956 seconds > Total time for which application threads were stopped: 0.0016358 seconds > Application time: 60.1048190 seconds > Total time for which application threads were stopped: 0.0481582 seconds > Application time: 0.0002207 seconds > Total time for which application threads were stopped: 0.0067752 seconds > Application time: 0.0001073 seconds > Total time for which application threads were stopped: 0.0016387 seconds > Application time: 60.7453974 seconds > Total time for which application threads were stopped: 0.0425995 seconds > Application time: 0.0002457 seconds > Total time for which application threads were stopped: 0.0019724 seconds > Application time: 0.0001005 seconds > Total time for which application threads were stopped: 0.0016210 seconds > Application time: 59.0845530 seconds > Total time for which application threads were stopped: 0.0424095 seconds > Application time: 0.0002314 seconds > Total time for which application threads were stopped: 0.0020107 
seconds > Application time: 0.0000959 seconds > Total time for which application threads were stopped: 0.0015940 seconds > Application time: 60.7994458 seconds > Total time for which application threads were stopped: 0.0428210 seconds > Application time: 0.0002210 seconds > Total time for which application threads were stopped: 0.0020541 seconds > Application time: 0.0000974 seconds > Total time for which application threads were stopped: 0.0016126 seconds > Application time: 59.0963098 seconds > Total time for which application threads were stopped: 0.0592795 seconds > Application time: 0.0002622 seconds > Total time for which application threads were stopped: 0.0023229 seconds > Application time: 0.0000926 seconds > Total time for which application threads were stopped: 0.0016296 seconds > Application time: 60.1021141 seconds > Total time for which application threads were stopped: 0.0443986 seconds > Application time: 0.0002462 seconds > Total time for which application threads were stopped: 0.0021135 seconds > Application time: 0.0001076 seconds > Total time for which application threads were stopped: 0.0016165 seconds > Application time: 60.0324234 seconds > Total time for which application threads were stopped: 0.0437486 seconds > Application time: 0.0002286 seconds > Total time for which application threads were stopped: 0.0021017 seconds > Application time: 0.0001073 seconds > Total time for which application threads were stopped: 0.0016570 seconds > Application time: 60.4613330 seconds > Total time for which application threads were stopped: 0.0490276 seconds > Application time: 0.0002947 seconds > Total time for which application threads were stopped: 0.0024618 seconds > Application time: 0.0001238 seconds > Total time for which application threads were stopped: 0.0019863 seconds > Application time: 59.8201422 seconds > Total time for which application threads were stopped: 0.0455540 seconds > Application time: 0.0003668 seconds > Total time for which 
application threads were stopped: 0.0020906 seconds > Application time: 0.0001126 seconds > Total time for which application threads were stopped: 0.0016693 seconds > Application time: 60.0721521 seconds > Total time for which application threads were stopped: 0.0438111 seconds > Application time: 0.0002660 seconds > Total time for which application threads were stopped: 0.0019814 seconds > Application time: 0.0001018 seconds > Total time for which application threads were stopped: 0.0017817 seconds > Application time: 60.0825886 seconds > Total time for which application threads were stopped: 0.0440386 seconds > Application time: 0.0002197 seconds > Total time for which application threads were stopped: 0.0020655 seconds > Application time: 0.0001093 seconds > Total time for which application threads were stopped: 0.0016122 seconds > Application time: 59.6628580 seconds > Total time for which application threads were stopped: 0.0425082 seconds > Application time: 0.0002121 seconds > Total time for which application threads were stopped: 0.0020967 seconds > Application time: 0.0000935 seconds > Total time for which application threads were stopped: 0.0015909 seconds > Application time: 60.1951548 seconds > Total time for which application threads were stopped: 0.0432125 seconds > Application time: 0.0002274 seconds > Total time for which application threads were stopped: 0.0020316 seconds > Application time: 0.0001062 seconds > Total time for which application threads were stopped: 0.0016534 seconds > Application time: 59.5329171 seconds > *Total time for which application threads were stopped: 20.6893643 > seconds* > Application time: 0.0002839 seconds > Total time for which application threads were stopped: 0.0076240 seconds > Application time: 0.0002137 seconds > Total time for which application threads were stopped: 0.0019918 seconds > Application time: 39.4376656 seconds > Total time for which application threads were stopped: 0.0612671 seconds > Application 
time: 0.0002478 seconds > > Any idea ? > > > Thanks in advance for your help > -- > David Tavoularis > > > > > > > [Annex] > Complete GC log file gc_201108232207.log.gz available here: > http://dl.free.fr/gxrxlLsVS > > JVM command line extract : > /usr/jdk/instances/jdk1.6.0/jre/bin/sparcv9/java > -Dsun.rmi.dgc.checkInterval=2000 -server -Xms49152m -Xmx49152m > -XX:PermSize=512m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > -XX:+DisableExplicitGC -XX:-CMSParallelRemarkEnabled > -XX:CMSInitiatingOccupancyFraction=40 -XX:NewSize=1024m > -XX:MaxNewSize=1024m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution > -XX:+PrintGCApplicationStoppedTime > -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCDateStamps > -Xloggc:/logs/gc_201108232207.log -XX:+UseCompressedOops > -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/heapdump > > *$ /usr/jdk/instances/jdk1.6.0/jre/bin/sparcv9/java -version* > java version "1.6.0_25" > Java(TM) SE Runtime Environment (build 1.6.0_25-b06) > Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) > > *$ /usr/sbin/prtdiag | head -3* > System Configuration: Sun Microsystems sun4u Sun Fire E6900 > System clock frequency: 150 MHz > Memory size: 143360 Megabytes > > *$ mpstat | wc -l* > 49 > > *$ uname -a* > SunOS XXX 5.9 Generic_122300-05 sun4u sparc SUNW,Sun-Fire > > For your information, Full GC automatically triggered at 2:30am : > *$ grep Full gc_201108232207.log* > 2011-08-24T02:30:02.475+0100: 15737.603: [Full GC 15737.604: [CMS: > 11972490K->5028118K(49283072K), 137.9859661 secs] > 12141664K->5028118K(50226816K), [CMS Perm : 39558K->39491K(524288K)], > 137.9867010 secs] [Times: user=133.02 sys=4.89, real=137.99 secs] > 2011-08-25T02:30:05.142+0100: 102139.150: [Full GC 102139.150: [CMS: > 18724122K->11970549K(49283072K), 433.4189517 secs] > 18976948K->11970549K(50226816K), [CMS Perm : 44256K->42995K(524288K)], > 433.4350620 secs] [Times: user=429.00 sys=3.89, real=433.44 secs] > 
2011-08-26T02:30:05.125+0100: 188538.009: [Full GC 188538.009: [CMS: > 15865994K->12528867K(49283072K), 477.0168566 secs] > 16343213K->12528867K(50226816K), [CMS Perm : 44324K->43408K(524288K)], > 477.0175358 secs] [Times: user=476.76 sys=0.05, real=477.02 secs] > 2011-08-27T02:30:03.084+0100: 274934.847: [Full GC 274934.849: [CMS: > 14857264K->8811922K(49283072K), 312.4786042 secs] > 15546860K->8811922K(50226816K), [CMS Perm : 44557K->43762K(524288K)], > 312.4796506 secs] [Times: user=312.38 sys=0.11, real=312.48 secs] > 2011-08-28T02:30:04.129+0100: 361334.770: [Full GC 361334.777: [CMS: > 16479144K->5767617K(49283072K), 161.5857103 secs] > 17318705K->5767617K(50226816K), [CMS Perm : 44127K->43481K(524288K)], > 161.5863909 secs] [Times: user=161.21 sys=0.02, real=161.59 secs] > 2011-08-29T02:30:03.316+0100: 447732.838: [Full GC 447732.838: [CMS: > 13471208K->6989798K(49283072K), 173.7255263 secs] > 13700543K->6989798K(50226816K), [CMS Perm : 43709K->43433K(524288K)], > 173.7260186 secs] [Times: user=173.48 sys=0.01, real=173.73 secs] > > > ------------------------------------------------------------------------ > > This electronic message contains information from Mycom which may be > privileged or confidential. The information is intended to be for the > use of the individual(s) or entity named above. If you are not the > intended recipient, be aware that any disclosure, copying, > distribution or any other use of the contents of this information is > prohibited. If you have received this electronic message in error, > please notify us by post or telephone (to the numbers or > correspondence address above) or by email (at the email address above) > immediately. > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
From y.s.ramakrishna at oracle.com Tue Aug 30 07:19:52 2011
From: y.s.ramakrishna at oracle.com (Ramki Ramakrishna)
Date: Tue, 30 Aug 2011 00:19:52 -0700
Subject: Long "stop-the-world" pauses in CMS GC mode
In-Reply-To: <4E5C88D1.9040401@oracle.com>
References: <4E5C88D1.9040401@oracle.com>
Message-ID: <4E5C8F18.2070801@oracle.com>

David,

I missed one of the longer pauses that you'd specifically drawn attention to:

> On 8/29/2011 2:25 AM, David Tavoularis wrote:
>> *1. Here is the first pattern : a _61-second pause_, but I don't see
>> any suspicious message in GC logs:*
>> ...
>> 2011-08-24T10:25:07.776+0100: 44242.537: [GC 44301.853: [ParNew
>> Desired survivor size 53673984 bytes, new threshold 1 (max 4)
>> - age 1: 99505080 bytes, 99505080 total
>> : 943744K->104832K(943744K), 0.2010508 secs]
>> 21542906K->20852742K(50226816K), 0.2022636 secs] *[Times: user=5.67
>> sys=0.02, real=59.52 secs]*

If you look at the timestamps above, the GC event starts off at 44242.537 seconds, but then the GC itself does not commence until 44301.853 seconds, i.e. a full 59.32 seconds later. So the pause is associated not with GC work itself (which is correctly reported as 202 ms), but rather with a preamble to the GC, perhaps with bringing threads to a safepoint, I am guessing.

Once again, -XX:+PrintSafepointStatistics (which I mentioned in my previous email regarding the 20 s pause in the middle of nowhere) would likely provide some clues.

I have heard apocryphal stories of -XX:+UseMembar having worked to get rid of overly long safepointing pauses, and I have heard of -XX:-UseBiasedLocking being used for pauses associated with bulk bias revocations. But without +PrintSafepointStatistics data to draw inferences from, those incantations would just constitute superstitious mumbo-jumbo.
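The timestamp arithmetic above can be checked mechanically. What follows is a minimal sketch, assuming log records shaped like the ParNew line quoted above; the function name pre_gc_delay and the regular expression are illustrative only and are not part of any JDK tool.

```python
import re

# Matches the two uptime stamps in a record such as
# "44242.537: [GC 44301.853: [ParNew". The first stamp is when the GC
# event was logged, the second is when the collection itself began.
# (Assumed record shape, taken from the log excerpt quoted above.)
STAMPS = re.compile(r"(\d+\.\d+): \[GC (\d+\.\d+): \[ParNew")


def pre_gc_delay(record):
    """Return the seconds between the event stamp and the start of GC work.

    Returns None if the record does not contain both stamps.
    """
    match = STAMPS.search(record)
    if match is None:
        return None
    event, gc_start = (float(group) for group in match.groups())
    return gc_start - event


record = "2011-08-24T10:25:07.776+0100: 44242.537: [GC 44301.853: [ParNew"
delay = pre_gc_delay(record)
print(f"delay before GC work began: {delay:.3f} s")
```

A large delay here, next to a small reported collection time, points at whatever happens before the collection starts rather than at the collection itself.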
-- ramki >> Heap after GC invocations=1188 (full 12): >> par new generation total 943744K, used 104832K [0xfffffff353c00000, >> 0xfffffff393c00000, 0xfffffff393c00000) >> eden space 838912K, 0% used [0xfffffff353c00000, 0xfffffff353c00000, >> 0xfffffff386f40000) >> from space 104832K, 100% used [0xfffffff38d5a0000, >> 0xfffffff393c00000, 0xfffffff393c00000) >> to space 104832K, 0% used [0xfffffff386f40000, 0xfffffff386f40000, >> 0xfffffff38d5a0000) >> concurrent mark-sweep generation total 49283072K, used 20747910K >> [0xfffffff393c00000, 0xffffffff53c00000, 0xffffffff53c00000) >> concurrent-mark-sweep perm gen total 524288K, used 42905K >> [0xffffffff53c00000, 0xffffffff73c00000, 0xffffffff73c00000) >> } >> *Total time for which application threads were stopped: 61.5576519 >> seconds* >> Application time: 0.0245838 seconds >> Total time for which application threads were stopped: 9.8331189 seconds >> Application time: 0.0012626 seconds >> Total time for which application threads were stopped: 0.0090404 seconds >> Application time: 0.0008943 seconds >> Total time for which application threads were stopped: 0.0020415 seconds >> Application time: 0.0008181 seconds >> Total time for which application threads were stopped: 0.2338605 seconds >> Application time: 0.0018822 seconds >> >> The only suspicious thing is "[Times: user=5.67 sys=0.02, real=59.52 >> secs]", which means that the "real" duration is a lot higher than >> "user" CPU time. >> Because "sys" duration is low, it also means that the server is not >> swapping. >> What could explain this 61 seconds pause ? -------------- next part -------------- An HTML attachment was scrubbed... 
From tony.printezis at oracle.com Tue Aug 30 09:22:41 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Tue, 30 Aug 2011 05:22:41 -0400
Subject: CRR (M): 7084509: G1: fix inconsistencies and mistakes in the young list target length calculations
Message-ID: <4E5CABE1.9050209@oracle.com>

Hi all,

I'd like one more review for this change (Bengt already reviewed it; I accidentally "stepped on his toes" with this refactoring, so we reviewed each other's changes to see how to move forward):

http://cr.openjdk.java.net/~tonyp/7084509/webrev.0/

This change fixes several issues in the young target length calculations:

- There are two entry points to the calculation: calculate_young_list_target_length() and calculate_young_list_target_length(size_t rs_lengths). The former calls the latter, but in some cases the latter can be called by itself too. But there are some extra calculations (max survivor size, max GC locker expansion) that were not done when the latter was called by itself. Additionally, calling calculate_young_list_target_length() required another method, calculate_min_young_list_length(), to be called beforehand.
  Fix: replace the above with a single method, update_young_list_target_length(), which takes an optional rs_lengths parameter. Everything is done inside it, so that everything that needs to be calculated is calculated and no other methods need to be called beforehand. This also ensures that if we want to apply any min / max bounds to the young target length, we can do so in a single place.

- The max survivor size is calculated with an integer division. This means that, if the resulting value is between 0.0 and 1.0, the max survivor size will be 0, which effectively tenures everything during the next GC. It'd be better if it was 1.
  Fix: use double division and ceiling in order for the max survivor size to be 1 in the above case. Additionally, I now calculate the survivor parameters at the beginning of a pause instead of when the young target length is calculated / recalculated. Since those parameters only affect the next GC it's pointless to calculate / recalculate them earlier.

- The code that calculates the optimal young target length (i.e., the max young length predicted to be within the required pause time) is embarrassingly incorrect. It uses binary search to yield the optimal length, but unfortunately exits early and in many situations returns a young target length that is shorter than it could be.
  Fix: updated the binary search algorithm to do the right thing. I compared the before / after calculations and the after calculation consistently yielded longer young target lengths which still fit within the required pause time.

Additional fixes:

- I now calculate the heap reserve every time the heap is resized (as it stays the same for a given heap size). There's no point in recalculating it every time we do the young target length calculations.

- Refactoring and simplification to make the code easier to follow. This should help make the changes for the following two CRs easier:

  6929868: G1: introduce min / max young gen size bounds
  7084525: G1: Generate ergonomic decision log records for young gen sizing and for pause prediction

The bulk of the changes are in G1CollectorPolicy. It might be easier if you looked at the new versions of the following methods:

G1CollectorPolicy::predict_will_fit()
G1CollectorPolicy::calculate_young_list_desired_min_length()
G1CollectorPolicy::calculate_young_list_desired_max_length()
G1CollectorPolicy::update_young_list_target_length()
G1CollectorPolicy::calculate_young_list_target_length()

and compared them to the previous versions instead of looking at their diffs.
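Two of the fixes described above, rounding the max survivor size up and a binary search that does not exit early, can be sketched as follows. This is only an illustration in Python and not the G1CollectorPolicy code: the function names, the survivor_ratio divisor, and the fits predicate are all assumptions made for the example. The search also presumes that once a length fails to fit within the pause target, every longer length fails too.

```python
import math


def max_survivor_regions(young_target_length, survivor_ratio):
    """Round the quotient up so that a fractional result becomes 1, not 0.

    Both parameters are hypothetical stand-ins for whatever the real
    calculation divides. Integer division
    (young_target_length // survivor_ratio) would give 0 whenever the
    true quotient is below 1.
    """
    return math.ceil(young_target_length / survivor_ratio)


def longest_fitting_length(min_length, max_length, fits):
    """Return the largest length in [min_length, max_length] that fits.

    `fits` is a hypothetical predicate (predicted pause within target).
    It is assumed to be monotone: true up to some length, false beyond.
    If even min_length does not fit, min_length is returned as the floor.
    """
    if not fits(min_length):
        return min_length
    lo, hi = min_length, max_length
    # Invariant: fits(lo) holds and the answer lies in [lo, hi].
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the range always shrinks
        if fits(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


print(max_survivor_regions(5, 8))
print(longest_fitting_length(1, 100, lambda length: length <= 37))
```

The search only stops when the candidate range has collapsed to a single length, which is what keeps it from returning a shorter length than necessary.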
Tony

From igor.veresov at oracle.com Tue Aug 30 13:51:03 2011
From: igor.veresov at oracle.com (igor.veresov at oracle.com)
Date: Tue, 30 Aug 2011 13:51:03 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 7082645: Hotspot doesn't compile on old linuxes after 7060836
Message-ID: <20110830135109.E6C2247216@hg.openjdk.java.net>

Changeset: 9447b2fb6fcf
Author: iveresov
Date: 2011-08-29 17:42 -0700
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9447b2fb6fcf

7082645: Hotspot doesn't compile on old linuxes after 7060836
Summary: Move syscall ids definitions into os_linux.cpp
Reviewed-by: johnc

! src/os/linux/vm/os_linux.cpp

From john.cuthbertson at oracle.com Tue Aug 30 16:54:09 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Tue, 30 Aug 2011 09:54:09 -0700
Subject: RFR(S): 7066841: remove MacroAssembler::br_on_reg_cond() on sparc
Message-ID: <4E5D15B1.9010006@oracle.com>

Hi Everyone,

Can I have a couple of volunteers look over these changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7066841/webrev.0/.

These changes basically remove the macro assembler routine br_on_reg_cond and replace the remaining calls to that routine, in the G1 barriers, with an equivalent.

Testing: GC test suite and Kitchensink on 32/64 bit sparc with -Xint, -client -Xcomp, -XX:+TieredCompilation -XX:TieredStopAtLevel=1, and default. VerifyDuringGC and VerifyBeforeGC were also enabled to detect missing barriers.

Thanks,

JohnC

From y.s.ramakrishna at oracle.com Tue Aug 30 16:51:08 2011
From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna)
Date: Tue, 30 Aug 2011 09:51:08 -0700
Subject: Long "stop-the-world" pauses in CMS GC mode
In-Reply-To:
References: <4E5C88D1.9040401@oracle.com> <4E5C8F18.2070801@oracle.com>
Message-ID: <4E5D14FC.5050704@oracle.com>

On 08/30/11 02:52, David Tavoularis wrote:
> Hi Ramki, Zdeněk,
>
> Thank you for your valuable answers.
>
> > So the pause is associated not with GC work itself (which is
> > correctly reported as 202 ms), but rather with a
> > preamble to the GC, perhaps with bringing threads to a safepoint, I
> > am guessing.
> I will ask to add -XX:+PrintSafepointStatistics. What are the expected
> outputs ? Will it be in GC logs or in stdout ?

To stdout, I believe. But with a recent JVM these data (which are batched into a record of several entries written out together) should have a timestamp column associated with each safepoint operation, which will allow the data to be aligned with the GC events in the GC logs, even though the two are split off into different I/O streams.

>
> > you have a 48 core machine but you have handicapped your JVM by forcing
> > -XX:-CMSParallelRemarkEnabled,
> > which forces single-threaded remarks on such a huge heap.
> I will ask to remove it and let you know.

Thanks. I am guessing it must be "legacy" from a (much) earlier time when there were bugs in the CMS parallel remark.

>
> > If "ref proc" turns out to be still a problem, you can then enable
> > +ParallelRefProcEnabled to parallelize that sub-phase as well.
> I will not activate -XX:+ParallelRefProcEnabled, because according to
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7028845, it is broken
> in Java6u25 and fixed in Java6u27.

Ah yes, good catch; sorry not to have remembered, as I had fixed that problem a while back during JDK 7 development.

>
> > I have heard apocryphal stories of -XX:+UseMembar having worked to
> > get rid of overly long safepointing pauses,
> > and I have heard -XX:-UseBiasedLocking for pauses associated with
> > bulk bias revocations.
> Good to know, but I won't use them until I get more info from
> -XX:+PrintSafepointStatistics and a new analysis after removing
> -XX:-CMSParallelRemarkEnabled

Sounds good.
>
> >> The only suspicious thing is "[Times: user=5.67 sys=0.02, real=59.52
> >> secs]", which means that the "real" duration is a lot higher than "user"
> >> CPU time.
> >> Because "sys" duration is low, it also means that the server is not
> >> swapping.
> > this can happen when the machine is overloaded. And as for swapping,
> > I think it is not involved in the sys time because these times are times
> > of the application thread.
> In my experience, when the server is swapping, the "sys" time duration
> increases a lot.
> I can confirm that there is no high CPU load on the server (max CPU
> usage is 30% in the last 7 days) and no disk swapping (according to
> vmstat "sr"="scan rate" metrics).
> According to Ramki, I need to understand the reason for the slow safepoint
> action.

Right; sounds like a plan.

-- ramki

From vladimir.kozlov at oracle.com Tue Aug 30 17:47:28 2011
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 30 Aug 2011 10:47:28 -0700
Subject: RFR(S): 7066841: remove MacroAssembler::br_on_reg_cond() on sparc
In-Reply-To: <4E5D15B1.9010006@oracle.com>
References: <4E5D15B1.9010006@oracle.com>
Message-ID: <4E5D2230.70802@oracle.com>

Nice cleanup. Thank you, John.

Vladimir

John Cuthbertson wrote:
> Hi Everyone,
>
> Can I have a couple of volunteers look over these changes? The webrev can
> be found at: http://cr.openjdk.java.net/~johnc/7066841/webrev.0/.
>
> These changes basically remove the macro assembler routine
> br_on_reg_cond and replace the remaining calls to that routine, in the
> G1 barriers, with an equivalent.
>
> Testing: GC test suite and Kitchensink on 32/64 bit sparc with -Xint,
> -client -Xcomp, -XX:+TieredCompilation -XX:TieredStopAtLevel=1, and
> default. VerifyDuringGC and VerifyBeforeGC were also enabled to detect
> missing barriers.
>
> Thanks,
>
> JohnC

From igor.veresov at oracle.com Wed Aug 31 01:47:36 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 30 Aug 2011 18:47:36 -0700
Subject: RFR(S): 7066841: remove MacroAssembler::br_on_reg_cond() on sparc
In-Reply-To: <4E5D15B1.9010006@oracle.com>
References: <4E5D15B1.9010006@oracle.com>
Message-ID: <4EA6DFEB650F440C8DCBD000C1F07B34@oracle.com>

Looks good.

igor

On Tuesday, August 30, 2011 at 9:54 AM, John Cuthbertson wrote:
> Hi Everyone,
>
> Can I have a couple of volunteers look over these changes? The webrev can
> be found at: http://cr.openjdk.java.net/~johnc/7066841/webrev.0/.
>
> These changes basically remove the macro assembler routine
> br_on_reg_cond and replace the remaining calls to that routine, in the
> G1 barriers, with an equivalent.
>
> Testing: GC test suite and Kitchensink on 32/64 bit sparc with -Xint,
> -client -Xcomp, -XX:+TieredCompilation -XX:TieredStopAtLevel=1, and
> default. VerifyDuringGC and VerifyBeforeGC were also enabled to detect
> missing barriers.
>
> Thanks,
>
> JohnC

From john.cuthbertson at oracle.com Wed Aug 31 21:18:30 2011
From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com)
Date: Wed, 31 Aug 2011 21:18:30 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 7066841: remove MacroAssembler::br_on_reg_cond() on sparc
Message-ID: <20110831211833.DDD8747271@hg.openjdk.java.net>

Changeset: 4fe626cbf0bf
Author: johnc
Date: 2011-08-31 10:16 -0700
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4fe626cbf0bf

7066841: remove MacroAssembler::br_on_reg_cond() on sparc
Summary: Remove the macro assembler routine br_on_reg_cond() and replace the remaining calls to that routine with an equivalent.
Reviewed-by: kvn, iveresov

! src/cpu/sparc/vm/assembler_sparc.cpp
! src/cpu/sparc/vm/assembler_sparc.hpp
! src/cpu/sparc/vm/c1_CodeStubs_sparc.cpp
! src/cpu/sparc/vm/c1_Runtime1_sparc.cpp
! src/share/vm/gc_implementation/g1/g1_globals.hpp