From john.cuthbertson at oracle.com Thu Jan 3 03:46:59 2013 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Thu, 03 Jan 2013 03:46:59 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8004132: SerialGC: ValidateMarkSweep broken when running GCOld Message-ID: <20130103034702.DF902474D8@hg.openjdk.java.net> Changeset: b735136e0d82 Author: johnc Date: 2013-01-02 11:32 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b735136e0d82 8004132: SerialGC: ValidateMarkSweep broken when running GCOld Summary: Remove bit-rotten ValidateMarkSweep functionality and flag. Reviewed-by: johnc, jmasa Contributed-by: tamao ! src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp ! src/share/vm/gc_implementation/g1/g1MarkSweep.cpp ! src/share/vm/gc_implementation/parallelScavenge/psMarkSweepDecorator.cpp ! src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp ! src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp ! src/share/vm/gc_implementation/shared/markSweep.cpp ! src/share/vm/gc_implementation/shared/markSweep.hpp ! src/share/vm/gc_implementation/shared/markSweep.inline.hpp ! src/share/vm/memory/genMarkSweep.cpp ! src/share/vm/memory/space.cpp ! src/share/vm/memory/space.hpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/utilities/debug.cpp From bengt.rutisson at oracle.com Thu Jan 3 18:55:07 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 03 Jan 2013 19:55:07 +0100 Subject: Request for review (S): 8005396: Use ParNew with only one thread instead of DefNew as default for CMS on single CPU machines In-Reply-To: <50E1ACCF.3000900@oracle.com> References: <50D46750.70106@oracle.com> <20692.64227.950170.267793@oracle.com> <50DFF74D.8070106@oracle.com> <50E1ACCF.3000900@oracle.com> Message-ID: <50E5D40B.1030500@oracle.com> Hi Jon, On 12/31/12 4:18 PM, Jon Masamitsu wrote: > Bengt, > > Thanks for the changes. Thanks for looking at it again! > Could you also fix this code in > > share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp > > > If CMS is never going to see code with ParallelGCThreads == 0, then > the then-block for 624 can be deleted. I think it is a little early to remove this code. My change only makes sure that if you run CMS with ParNew you get at least one worker thread to make sure that you can use ParNew. If you explicitly turn ParNew off you will still get DefNew and can use ParallelGCThreads == 0. My other review request will print a deprecation message for this combination but for JDK8 it will still be allowed. For JDK9 we plan to completely disallow this combination and at that point I think we can remove the code in compactibleFreeListSpace.cpp. 
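To spell out what I expect the common flag combinations to do after this change (just my own summary for reference, not taken from the webrev):

  -XX:+UseConcMarkSweepGC
      CMS with ParNew as the young collector; ParallelGCThreads is kept
      at 1 (not 0) even on a single CPU machine.

  -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
      CMS with DefNew; ParallelGCThreads == 0 still works here, but the
      combination will get a deprecation message in JDK8 (8003820) and
      is planned to be disallowed in JDK9.

  -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=0
      ParNew is the default young collector, so an explicitly requested
      thread count of 0 now results in an error message and exit instead
      of silently falling back to DefNew.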
Thanks, Bengt > > > 620 if (!_adaptive_freelists&& _smallLinearAllocBlock._ptr > == NULL) { > > 621 // Mark the boundary of the new block in BOT > > 622 _bt.mark_block(prevEnd, value); > > 623 // put it all in the linAB > > 624 if (ParallelGCThreads == 0) { > > 625 _smallLinearAllocBlock._ptr = prevEnd; > > 626 _smallLinearAllocBlock._word_size = newFcSize; > > 627 repairLinearAllocBlock(&_smallLinearAllocBlock); > > 628 } else { // ParallelGCThreads> 0 > > 629 MutexLockerEx x(parDictionaryAllocLock(), > > 630 Mutex::_no_safepoint_check_flag); > > 631 _smallLinearAllocBlock._ptr = prevEnd; > > 632 _smallLinearAllocBlock._word_size = newFcSize; > > 633 repairLinearAllocBlock(&_smallLinearAllocBlock); > > 634 } > > > Jon > > On 12/30/12 00:11, Bengt Rutisson wrote: >> >> Hi John and Jon, >> >> Thanks for the reviews! >> >> I discovered a bug in my fix. ParNew actually does not support >> ParallelGCThreads=0. I fixed this by making sure that we don't set >> ParallelGCThreads to 0 on single CPU machines. Instead we keep it at 1. >> >> And if someone explicitly set -XX:ParallelGCThreads=0 on the command >> line while trying to use ParNew I print an error message and exit. >> >> I assume that this is the reason that we previously picked DefNew if >> ParallelGCThreads was set to 0, but I think now that we want to >> deprecate DefNew for CMS it makes more sense to require users to >> explicitly turn ParNew off with -XX:-UseParNewGC if this is what they >> really want. >> >> Updated webrev: >> http://cr.openjdk.java.net/~brutisso/8005396/webrev.02/ >> >> The only change to the previous version is in arguments.cpp. Here is >> the small diff compared to the previous webrev: >> http://cr.openjdk.java.net/~brutisso/8005396/webrev.01-02.diff/ >> >> I have tested the fix on a single CPU virtual box instance. >> >> Thanks, >> Bengt >> >> On 12/22/12 1:12 AM, John Coomes wrote: >>> Bengt Rutisson (bengt.rutisson at oracle.com) wrote: >>>> Hi All, >>>> >>>> Can I have a couple of reviews for this change? >>>> >>>> http://cr.openjdk.java.net/~brutisso/8005396/webrev.00/ >>>> >>>> Currently we use ParNew as default for the young generation when >>>> CMS is >>>> selected. But if the machine only has a single CPU we set the >>>> ParallelGCThreads to 0 and and select DefNew instead of ParNew. >>> Looks good to me. >>> >>> -John >>> >>>> As part of another change, 8003820, we will deprecate the DefNew + CMS >>>> combination. Thus, it does not make sense anymore to have this >>>> selected >>>> by default. This fix is to make CMS always pick ParNew by default. >>>> >>>> The change also has the side effect that the, in my opinion, rather >>>> strange behavior that setting ParallelGCThreads=0 on the command line >>>> overrides the GC choice. I would expect this command line to give me >>>> ParNew, but it actually gives me DefNew: >>>> >>>> -XX:+UseParNewGC -XX:ParallelGCThreads=0 >>>> >>>> After my proposed change you get ParNew with the above command line. >>>> >>>> I have done some performance testing to verify that ParNew with one >>>> thread is not slower than DefNew. The details are in the bug report: >>>> >>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8005396 >>>> >>>> but as a summary it can be said that there is no noticeable >>>> difference. >>>> >>>> I am also running some more SPECjbb2005 runs and will analyze the >>>> gc times. 
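Coming back to Jon's question about the then-block at line 624 in compactibleFreeListSpace.cpp: once the DefNew + CMS combination is disallowed in JDK9, I would expect that code to collapse to something like the sketch below (only the else-branch survives). This is only to illustrate the clean-up I have in mind for later, not something I intend to push as part of this change:

  if (!_adaptive_freelists && _smallLinearAllocBlock._ptr == NULL) {
    // Mark the boundary of the new block in BOT
    _bt.mark_block(prevEnd, value);
    // Put it all in the linAB. CMS always runs ParNew with at least one
    // worker thread after this change, so always take the allocation lock.
    MutexLockerEx x(parDictionaryAllocLock(),
                    Mutex::_no_safepoint_check_flag);
    _smallLinearAllocBlock._ptr = prevEnd;
    _smallLinearAllocBlock._word_size = newFcSize;
    repairLinearAllocBlock(&_smallLinearAllocBlock);
  }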
>>>> >>>> Thanks, >>>> Bengt >> From john.cuthbertson at oracle.com Thu Jan 3 20:20:07 2013 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Thu, 03 Jan 2013 20:20:07 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8001424: G1: Rename certain G1-specific flags Message-ID: <20130103202011.9C2A6474FF@hg.openjdk.java.net> Changeset: 37f7535e5f18 Author: johnc Date: 2012-12-21 11:45 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/37f7535e5f18 8001424: G1: Rename certain G1-specific flags Summary: Rename G1DefaultMinNewGenPercent, G1DefaultMaxNewGenPercent, and G1OldCSetRegionLiveThresholdPercent to G1NewSizePercent, G1MaxNewSizePercent, and G1MixedGCLiveThresholdPercent respectively. The previous names are no longer accepted. Reviewed-by: brutisso, ysr ! src/share/vm/gc_implementation/g1/collectionSetChooser.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp ! src/share/vm/gc_implementation/g1/g1_globals.hpp From john.cuthbertson at oracle.com Fri Jan 4 02:39:09 2013 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Fri, 04 Jan 2013 02:39:09 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8004816: G1: Kitchensink failures after marking stack changes Message-ID: <20130104023914.70E0E4751F@hg.openjdk.java.net> Changeset: d275c3dc73e6 Author: johnc Date: 2013-01-03 16:28 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d275c3dc73e6 8004816: G1: Kitchensink failures after marking stack changes Summary: Reset the marking state, including the mark stack overflow flag, in the event of a marking stack overflow during serial reference processing. Reviewed-by: jmasa ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp ! src/share/vm/gc_implementation/g1/concurrentMarkThread.cpp From john.coomes at oracle.com Fri Jan 4 05:36:33 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 04 Jan 2013 05:36:33 +0000 Subject: hg: hsx/hotspot-gc/corba: Added tag jdk8-b71 for changeset 8171d23e914d Message-ID: <20130104053636.431904753A@hg.openjdk.java.net> Changeset: cb40427f4714 Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/corba/rev/cb40427f4714 Added tag jdk8-b71 for changeset 8171d23e914d ! .hgtags From john.coomes at oracle.com Fri Jan 4 05:36:29 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 04 Jan 2013 05:36:29 +0000 Subject: hg: hsx/hotspot-gc: 5 new changesets Message-ID: <20130104053629.CDA7747539@hg.openjdk.java.net> Changeset: 2ed5be3dd506 Author: lana Date: 2012-12-16 22:02 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/2ed5be3dd506 Merge Changeset: a0779b1e9a4d Author: jjg Date: 2012-12-17 08:34 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/a0779b1e9a4d 8005090: Include com.sun.source.doctree in Tree API docs Reviewed-by: erikj ! common/makefiles/javadoc/NON_CORE_PKGS.gmk Changeset: 68a81db3ceb1 Author: lana Date: 2012-12-18 17:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/68a81db3ceb1 Merge Changeset: 51ad2a343420 Author: lana Date: 2012-12-28 18:31 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/51ad2a343420 Merge Changeset: c1be681d80a1 Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/c1be681d80a1 Added tag jdk8-b71 for changeset 51ad2a343420 ! 
.hgtags From john.coomes at oracle.com Fri Jan 4 05:37:05 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 04 Jan 2013 05:37:05 +0000 Subject: hg: hsx/hotspot-gc/jaxws: Added tag jdk8-b71 for changeset f577a39c9fb3 Message-ID: <20130104053711.2F8C84753C@hg.openjdk.java.net> Changeset: d9707230294d Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/d9707230294d Added tag jdk8-b71 for changeset f577a39c9fb3 ! .hgtags From john.coomes at oracle.com Fri Jan 4 05:36:41 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 04 Jan 2013 05:36:41 +0000 Subject: hg: hsx/hotspot-gc/jaxp: 6 new changesets Message-ID: <20130104053701.6213B4753B@hg.openjdk.java.net> Changeset: b1fdb101c82e Author: joehw Date: 2012-12-14 13:24 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/b1fdb101c82e 8003260: [findbug] some fields should be package protected Summary: change public or protected mutable static fields to private or package private. Reviewed-by: lancea ! src/com/sun/org/apache/xerces/internal/impl/XMLDocumentFragmentScannerImpl.java ! src/com/sun/org/apache/xerces/internal/impl/XMLEntityScanner.java Changeset: 8a20e948b806 Author: lana Date: 2012-12-16 22:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/8a20e948b806 Merge Changeset: 15b32367b23c Author: joehw Date: 2012-12-18 21:11 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/15b32367b23c 8003261: static field is public but not final Summary: add final to fVersion field, and make it a non-compile time constant. Reviewed-by: hawtin, lancea, dholmes, chegar ! src/com/sun/org/apache/xerces/internal/impl/Version.java Changeset: d4aea0225e80 Author: joehw Date: 2012-12-27 18:17 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/d4aea0225e80 8005473: Warnings compiling jaxp Summary: clean up compiling warnings. Reviewed-by: weijun, chegar, forax ! src/com/sun/org/apache/xalan/internal/xslt/EnvironmentCheck.java ! src/javax/xml/transform/FactoryFinder.java ! src/javax/xml/validation/SchemaFactoryFinder.java ! src/javax/xml/xpath/XPathFactoryFinder.java Changeset: 499be952a291 Author: lana Date: 2012-12-28 18:31 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/499be952a291 Merge Changeset: bdf2af722a6b Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/bdf2af722a6b Added tag jdk8-b71 for changeset 499be952a291 ! .hgtags From john.coomes at oracle.com Fri Jan 4 05:38:54 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 04 Jan 2013 05:38:54 +0000 Subject: hg: hsx/hotspot-gc/jdk: 53 new changesets Message-ID: <20130104054927.9736B4753E@hg.openjdk.java.net> Changeset: a988c23b8553 Author: jgodinez Date: 2012-12-20 14:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/a988c23b8553 7180359: Assertion in awt_Win32GraphicsDevice.cpp when running specjbb in jprt Reviewed-by: bae, prr ! src/windows/native/sun/windows/awt_Debug.cpp Changeset: 2cf07dbdee64 Author: bae Date: 2012-12-24 14:03 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/2cf07dbdee64 7124245: [lcms] ColorConvertOp to color space CS_GRAY apparently converts orange to 244,244,0 Reviewed-by: prr ! src/share/classes/sun/java2d/cmm/lcms/LCMS.java ! src/share/classes/sun/java2d/cmm/lcms/LCMSImageLayout.java ! src/share/classes/sun/java2d/cmm/lcms/LCMSTransform.java ! 
src/share/native/sun/java2d/cmm/lcms/LCMS.c + test/sun/java2d/cmm/ColorConvertOp/GrayTest.java Changeset: 3c1c0b7abe51 Author: bae Date: 2012-12-24 14:22 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/3c1c0b7abe51 8005402: Need to provide benchmarks for color management Reviewed-by: jgodinez, prr ! src/share/demo/java2d/J2DBench/build.xml ! src/share/demo/java2d/J2DBench/src/j2dbench/J2DBench.java + src/share/demo/java2d/J2DBench/src/j2dbench/tests/cmm/CMMTests.java + src/share/demo/java2d/J2DBench/src/j2dbench/tests/cmm/ColorConversionTests.java + src/share/demo/java2d/J2DBench/src/j2dbench/tests/cmm/ColorConvertOpTests.java + src/share/demo/java2d/J2DBench/src/j2dbench/tests/cmm/DataConversionTests.java + src/share/demo/java2d/J2DBench/src/j2dbench/tests/cmm/ProfileTests.java Changeset: 1316d6d0900e Author: lana Date: 2012-12-28 18:28 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/1316d6d0900e Merge Changeset: c25ea633b4de Author: malenkov Date: 2012-12-17 16:58 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/c25ea633b4de 8005065: [findbugs] reference to mutable array in JavaBeans Reviewed-by: alexsch ! src/share/classes/java/beans/DefaultPersistenceDelegate.java ! src/share/classes/java/beans/EventSetDescriptor.java ! src/share/classes/java/beans/MethodDescriptor.java ! src/share/classes/java/beans/Statement.java + test/java/beans/Introspector/Test8005065.java Changeset: a78cb3c5d434 Author: neugens Date: 2012-12-17 17:43 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/a78cb3c5d434 8005018: X11: focus problems with openjdk 1.7.0 under gnome3 when selected keyboard is not the first in keyboard list Summary: Don't consider extraenous bits when checking button mask, so that grabWindowRef on the window is not confused and released correctly Reviewed-by: art, anthony ! src/solaris/classes/sun/awt/X11/XBaseWindow.java ! src/solaris/classes/sun/awt/X11/XConstants.java ! src/solaris/classes/sun/awt/X11/XToolkit.java ! src/solaris/classes/sun/awt/X11/XWindow.java ! src/solaris/classes/sun/awt/X11/XWindowPeer.java ! src/solaris/classes/sun/awt/X11/XlibUtil.java Changeset: 985b523712c8 Author: kshefov Date: 2012-12-18 15:17 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/985b523712c8 7104594: [macosx] Test closed/javax/swing/JFrame/4962534/bug4962534 expects Metal L&F by default Reviewed-by: yan, alexsch + test/javax/swing/JFrame/4962534/bug4962534.html + test/javax/swing/JFrame/4962534/bug4962534.java Changeset: 90ad9e922042 Author: lana Date: 2012-12-18 16:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/90ad9e922042 Merge - src/share/lib/security/java.security - test/java/rmi/server/Unmarshal/checkUnmarshalOnStopThread/CheckUnmarshall.java Changeset: 7082a96c02d2 Author: alexp Date: 2012-12-21 19:11 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/7082a96c02d2 8003982: new test javax/swing/AncestorNotifier/7193219/bug7193219.java failed on macosx Reviewed-by: anthony, alexsch ! test/javax/swing/AncestorNotifier/7193219/bug7193219.java Changeset: 14269f504837 Author: dcherepanov Date: 2012-12-27 16:08 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/14269f504837 8001161: mac: EmbeddedFrame doesn't become active window Reviewed-by: ant ! 
src/macosx/classes/sun/lwawt/macosx/CEmbeddedFrame.java Changeset: cf2bcb293f0b Author: lana Date: 2012-12-28 18:30 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/cf2bcb293f0b Merge Changeset: 69fd3f3d20c1 Author: alanb Date: 2012-12-15 15:07 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/69fd3f3d20c1 8004963: URLConnection, downgrade normative reference to ${java.home}/lib/content-types.properties Reviewed-by: chegar ! src/share/classes/java/net/URLConnection.java Changeset: eaaec81aa974 Author: weijun Date: 2012-12-17 12:18 +0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/eaaec81aa974 7197159: accept different kvno if there no match Reviewed-by: xuelei ! src/share/classes/sun/security/krb5/EncryptionKey.java ! test/sun/security/krb5/auto/DynamicKeytab.java + test/sun/security/krb5/auto/KvnoNA.java ! test/sun/security/krb5/auto/MoreKvno.java Changeset: f959e0cc8766 Author: lana Date: 2012-12-16 22:09 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/f959e0cc8766 Merge ! makefiles/CompileNativeLibraries.gmk - src/share/classes/sun/awt/TextureSizeConstraining.java Changeset: a02212de8db6 Author: uta Date: 2012-12-17 14:34 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/a02212de8db6 8004928: TEST_BUG: Reduce dependence of CoreLib tests from the AWT subsystem Summary: the tests were refactored to drop AWT dependence where it was possible. Reviewed-by: alanb, mchung ! test/java/io/Serializable/resolveProxyClass/NonPublicInterface.java ! test/java/lang/Throwable/LegacyChainedExceptionSerialization.java ! test/java/lang/management/CompilationMXBean/Basic.java ! test/java/lang/reflect/Generics/Probe.java ! test/java/lang/reflect/Proxy/ClassRestrictions.java ! test/java/util/Collections/EmptyIterator.java ! test/java/util/logging/LoggingDeadlock4.java ! test/sun/tools/jrunscript/common.sh Changeset: e4d88a7352c6 Author: mullan Date: 2012-12-17 08:28 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e4d88a7352c6 8004234: Downgrade normative references to ${java.home}/lib/security/krb5.conf Reviewed-by: alanb, weijun ! src/share/classes/javax/security/auth/kerberos/package.html Changeset: 4a21f818ebb1 Author: mullan Date: 2012-12-17 08:30 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/4a21f818ebb1 Merge - src/share/classes/sun/awt/TextureSizeConstraining.java - test/java/rmi/server/Unmarshal/checkUnmarshalOnStopThread/CheckUnmarshall.java Changeset: bcf79e6f52a0 Author: chegar Date: 2012-12-17 16:27 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/bcf79e6f52a0 8005081: java/util/prefs/PrefsSpi.sh fails on macos-x Reviewed-by: alanb ! test/java/util/prefs/PrefsSpi.sh Changeset: 9f1b516cd9cb Author: jjg Date: 2012-12-17 08:34 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/9f1b516cd9cb 8005090: Include com.sun.source.doctree in Tree API docs Reviewed-by: erikj ! make/docs/NON_CORE_PKGS.gmk Changeset: bac477d67867 Author: jjg Date: 2012-12-17 10:31 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/bac477d67867 8004832: Add new doclint package Reviewed-by: erikj, ohair ! make/common/Release.gmk ! make/common/internal/Defs-langtools.gmk ! makefiles/CreateJars.gmk Changeset: 0fabdf676395 Author: martin Date: 2012-12-17 18:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/0fabdf676395 8004863: Infinite Loop in KeepAliveStream Reviewed-by: chegar ! 
src/share/classes/sun/net/www/http/KeepAliveStream.java + test/sun/net/www/http/KeepAliveStream/InfiniteLoop.java Changeset: 0a1398021c7c Author: darcy Date: 2012-12-18 14:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/0a1398021c7c 8005042: Add Method.isDefault to core reflection Reviewed-by: alanb, forax, mduigou, jgish, mchung ! src/share/classes/java/lang/reflect/Method.java + test/java/lang/reflect/Method/IsDefaultTest.java Changeset: 6d977f61af5e Author: darcy Date: 2012-12-18 14:49 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/6d977f61af5e 8004699: Add type annotation storage to Constructor, Field and Method Reviewed-by: darcy, dholmes Contributed-by: joel.franck at oracle.com ! src/share/classes/java/lang/reflect/Constructor.java ! src/share/classes/java/lang/reflect/Field.java ! src/share/classes/java/lang/reflect/Method.java Changeset: e515956879cd Author: lana Date: 2012-12-18 18:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e515956879cd Merge Changeset: c79b26b8efe0 Author: sjiang Date: 2012-12-19 11:06 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/c79b26b8efe0 7158614: JMXStartStopTest.sh failing intermittently Summary: fixed 3 problems here: 1) checked the lock file too eary 2) never got the process id of a java test 3) some shell commands were not supported in some Solaris machines. Reviewed-by: dsamersoff, alanb ! test/ProblemList.txt ! test/sun/management/jmxremote/startstop/JMXStartStopDoSomething.java ! test/sun/management/jmxremote/startstop/JMXStartStopTest.sh Changeset: 3fd3bcc8bd42 Author: joehw Date: 2012-12-19 12:09 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/3fd3bcc8bd42 8004371: (props) Properties.loadFromXML needs small footprint XML parser as fallback when JAXP is not present Reviewed-by: alanb, mchung, psandoz + src/share/classes/jdk/internal/org/xml/sax/Attributes.java + src/share/classes/jdk/internal/org/xml/sax/ContentHandler.java + src/share/classes/jdk/internal/org/xml/sax/DTDHandler.java + src/share/classes/jdk/internal/org/xml/sax/EntityResolver.java + src/share/classes/jdk/internal/org/xml/sax/ErrorHandler.java + src/share/classes/jdk/internal/org/xml/sax/InputSource.java + src/share/classes/jdk/internal/org/xml/sax/Locator.java + src/share/classes/jdk/internal/org/xml/sax/SAXException.java + src/share/classes/jdk/internal/org/xml/sax/SAXNotRecognizedException.java + src/share/classes/jdk/internal/org/xml/sax/SAXNotSupportedException.java + src/share/classes/jdk/internal/org/xml/sax/SAXParseException.java + src/share/classes/jdk/internal/org/xml/sax/XMLReader.java + src/share/classes/jdk/internal/org/xml/sax/helpers/DefaultHandler.java + src/share/classes/jdk/internal/util/xml/PropertiesDefaultHandler.java + src/share/classes/jdk/internal/util/xml/SAXParser.java + src/share/classes/jdk/internal/util/xml/XMLStreamException.java + src/share/classes/jdk/internal/util/xml/XMLStreamWriter.java + src/share/classes/jdk/internal/util/xml/impl/Attrs.java + src/share/classes/jdk/internal/util/xml/impl/Input.java + src/share/classes/jdk/internal/util/xml/impl/Pair.java + src/share/classes/jdk/internal/util/xml/impl/Parser.java + src/share/classes/jdk/internal/util/xml/impl/ParserSAX.java + src/share/classes/jdk/internal/util/xml/impl/ReaderUTF16.java + src/share/classes/jdk/internal/util/xml/impl/ReaderUTF8.java + src/share/classes/jdk/internal/util/xml/impl/SAXParserImpl.java + src/share/classes/jdk/internal/util/xml/impl/XMLStreamWriterImpl.java + 
src/share/classes/jdk/internal/util/xml/impl/XMLWriter.java Changeset: cf15abdcdf88 Author: alanb Date: 2012-12-19 14:53 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/cf15abdcdf88 8005248: (props) Integrate small footprint parser into Properties Reviewed-by: joehw, mchung, psandoz, erikj ! make/jdk/Makefile - make/jdk/asm/Makefile ! src/share/classes/java/util/Properties.java + src/share/classes/jdk/internal/util/xml/BasicXmlPropertiesProvider.java ! test/java/util/Properties/LoadAndStoreXML.java + test/java/util/Properties/invalidxml/BadCase.xml + test/java/util/Properties/invalidxml/BadDocType.xml.excluded + test/java/util/Properties/invalidxml/NoClosingTag.xml + test/java/util/Properties/invalidxml/NoDocType.xml.excluded + test/java/util/Properties/invalidxml/NoRoot.xml + test/java/util/Properties/invalidxml/NotQuoted.xml + test/java/util/Properties/invalidxml/README.txt Changeset: 1f9c19741285 Author: darcy Date: 2012-12-19 11:53 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/1f9c19741285 8005097: Tie isSynthetic javadoc to the JLS Reviewed-by: mduigou ! src/share/classes/java/lang/Class.java ! src/share/classes/java/lang/reflect/Constructor.java ! src/share/classes/java/lang/reflect/Executable.java ! src/share/classes/java/lang/reflect/Member.java ! src/share/classes/java/lang/reflect/Method.java Changeset: b600d490dc57 Author: dsamersoff Date: 2012-12-20 16:02 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/b600d490dc57 6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject Summary: call readObject in all cases Reviewed-by: emcmanus Contributed-by: jaroslav.bachorik at oracle.com ! src/share/classes/javax/management/MBeanFeatureInfo.java ! src/share/classes/javax/management/MBeanInfo.java Changeset: e43f90d5af11 Author: dsamersoff Date: 2012-12-20 16:56 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e43f90d5af11 6937053: RMI unmarshalling errors in ClientNotifForwarder cause silent failure Summary: the catch block in the fetchNotifs() method is extended to expect UnmarshalException Reviewed-by: emcmanus Contributed-by: jaroslav.bachorik at oracle.com ! src/share/classes/com/sun/jmx/remote/internal/ClientNotifForwarder.java Changeset: 3f014bc09297 Author: dsamersoff Date: 2012-12-20 17:24 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/3f014bc09297 7009998: JMX synchronization during connection restart is faulty Summary: add a return statement after the re-connecting has finished and the state is CONNECTED Reviewed-by: sjiang Contributed-by: jaroslav.bachorik at oracle.com ! make/netbeans/jmx/build.properties ! src/share/classes/com/sun/jmx/remote/internal/ClientCommunicatorAdmin.java Changeset: d01a810798e0 Author: dl Date: 2012-12-20 13:44 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/d01a810798e0 8002356: Add ForkJoin common pool and CountedCompleter Reviewed-by: chegar, mduigou ! make/java/java/FILES_java.gmk + src/share/classes/java/util/concurrent/CountedCompleter.java ! src/share/classes/java/util/concurrent/ForkJoinPool.java ! src/share/classes/java/util/concurrent/ForkJoinTask.java ! src/share/classes/java/util/concurrent/ForkJoinWorkerThread.java Changeset: 31d2f9995d6c Author: chegar Date: 2012-12-20 15:04 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/31d2f9995d6c 8005306: Redundant cast warning in KeepAliveStream.java Reviewed-by: alanb ! 
src/share/classes/sun/net/www/http/KeepAliveStream.java Changeset: c1a55ee9618e Author: dsamersoff Date: 2012-12-20 20:12 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/c1a55ee9618e 8005309: Missed tests for 6783290,6937053,7009998 Summary: Missed tests for 6783290,6937053,7009998 Reviewed-by: sjiang, emcmanus Contributed-by: jaroslav.bachorik at oracle.com + test/com/sun/jmx/remote/CCAdminReconnectTest.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Client/Client.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Client/ConfigKey.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Client/TestNotification.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Server/ConfigKey.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Server/Server.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Server/Ste.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Server/SteMBean.java + test/com/sun/jmx/remote/NotificationMarshalVersions/Server/TestNotification.java + test/com/sun/jmx/remote/NotificationMarshalVersions/TestSerializationMismatch.sh + test/javax/management/MBeanInfo/SerializationTest1.java Changeset: edb71a37fcb7 Author: alanb Date: 2012-12-20 20:29 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/edb71a37fcb7 8001048: JSR-160: Allow IIOP transport to be optional Reviewed-by: dsamersoff, dfuchs, mchung ! src/share/classes/com/sun/jmx/remote/internal/IIOPHelper.java ! src/share/classes/javax/management/remote/JMXConnectorFactory.java ! src/share/classes/javax/management/remote/JMXConnectorServerFactory.java ! src/share/classes/javax/management/remote/rmi/RMIConnector.java ! src/share/classes/javax/management/remote/rmi/RMIConnectorServer.java ! src/share/classes/javax/management/remote/rmi/package.html ! test/javax/management/remote/mandatory/connection/AddressableTest.java ! test/javax/management/remote/mandatory/connection/CloseableTest.java ! test/javax/management/remote/mandatory/connection/ConnectionListenerNullTest.java ! test/javax/management/remote/mandatory/connection/IIOPURLTest.java ! test/javax/management/remote/mandatory/connection/IdleTimeoutTest.java ! test/javax/management/remote/mandatory/connection/MultiThreadDeadLockTest.java ! test/javax/management/remote/mandatory/connectorServer/SetMBeanServerForwarder.java ! test/javax/management/remote/mandatory/loading/MissingClassTest.java ! test/javax/management/remote/mandatory/provider/ProviderTest.java ! test/javax/management/remote/mandatory/serverError/JMXServerErrorTest.java Changeset: eeda18683ddc Author: alanb Date: 2012-12-20 20:40 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/eeda18683ddc 8005281: (props) loadFromXML/storeToXML with small parser is not thread safe Reviewed-by: mchung ! src/share/classes/jdk/internal/util/xml/BasicXmlPropertiesProvider.java + test/java/util/Properties/ConcurrentLoadAndStoreXML.java Changeset: 60adb69bf043 Author: smarks Date: 2012-12-20 20:11 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/60adb69bf043 8005290: remove -showversion from RMI test library subprocess mechanism Reviewed-by: jgish, chegar, dmocek ! test/java/rmi/testlibrary/JavaVM.java ! test/java/rmi/testlibrary/StreamPipe.java ! 
test/sun/rmi/runtime/Log/6409194/NoConsoleOutput.java Changeset: 42ee6b6ad373 Author: jbachorik Date: 2012-12-21 09:27 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/42ee6b6ad373 7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently Summary: ClientCommunicatorAdmin should call gotIOException((IOException)e) instead of restart((IOException)e) when detecting a communication error, because the method gotIOException will send a failure notification if necessary. Reviewed-by: emcmanus, sjiang Contributed-by: jaroslav.bachorik at oracle.com ! src/share/classes/com/sun/jmx/remote/internal/ClientCommunicatorAdmin.java Changeset: 86c10d1484e9 Author: sjiang Date: 2012-12-21 10:58 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/86c10d1484e9 8005325: The script should use TESTVMOPTS Summary: Put back TESTVMOPTS which was removed by mistake. Reviewed-by: smarks ! test/sun/management/jmxremote/startstop/JMXStartStopTest.sh Changeset: c1227b872a12 Author: joehw Date: 2012-12-21 17:29 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/c1227b872a12 8005280: (props) Improve test coverage for small XML parser Summary: added a few more invalid XML files, international characters to LoadAndStore test, and a behavior compatibility test. Reviewed-by: alanb, lancea + test/java/util/Properties/Compatibility.xml + test/java/util/Properties/CompatibilityTest.java ! test/java/util/Properties/LoadAndStoreXML.java + test/java/util/Properties/invalidxml/BadDocType.xml - test/java/util/Properties/invalidxml/BadDocType.xml.excluded + test/java/util/Properties/invalidxml/DTDRootNotMatch.xml + test/java/util/Properties/invalidxml/IllegalComment.xml + test/java/util/Properties/invalidxml/IllegalEntry.xml + test/java/util/Properties/invalidxml/IllegalEntry1.xml + test/java/util/Properties/invalidxml/IllegalKeyAttribute.xml + test/java/util/Properties/invalidxml/NoDocType.xml - test/java/util/Properties/invalidxml/NoDocType.xml.excluded + test/java/util/Properties/invalidxml/NoNamespaceSupport.xml Changeset: 4d28776d7007 Author: mullan Date: 2012-12-26 10:07 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/4d28776d7007 8005117: Eliminate dependency from ConfigSpiFile to com.sun.security.auth.login.ConfigFile Reviewed-by: alanb, mchung, weijun ! src/share/classes/com/sun/security/auth/login/ConfigFile.java ! src/share/classes/sun/security/provider/ConfigSpiFile.java Changeset: d9cab18f326a Author: mullan Date: 2012-12-26 10:08 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/d9cab18f326a Merge - make/jdk/asm/Makefile Changeset: 9d984ccd17fc Author: chegar Date: 2012-12-27 21:55 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/9d984ccd17fc 8003981: Support Parallel Array Sorting - JEP 103 Reviewed-by: chegar, forax, dholmes, dl Contributed-by: david.holmes at oracle.com, dl at cs.oswego.edu, chris.hegarty at oracle.com ! make/java/java/FILES_java.gmk ! src/share/classes/java/util/Arrays.java + src/share/classes/java/util/ArraysParallelSortHelpers.java + test/java/util/Arrays/ParallelSorting.java Changeset: 4ad38db38fff Author: okutsu Date: 2012-12-28 14:13 +0900 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/4ad38db38fff 8005471: DateFormat: Time zone info is not localized when adapter is CLDR Reviewed-by: peytoia ! 
src/share/classes/sun/util/resources/TimeZoneNamesBundle.java + test/java/util/TimeZone/CLDRDisplayNamesTest.java Changeset: 1da019e7999a Author: peytoia Date: 2012-12-28 15:07 +0900 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/1da019e7999a 8005277: Regression in JDK 7 in Bidi implementation Reviewed-by: okutsu ! src/share/classes/sun/text/bidi/BidiBase.java ! test/java/text/Bidi/BidiConformance.java + test/java/text/Bidi/Bug8005277.java Changeset: f3ac419e2bf0 Author: okutsu Date: 2012-12-28 16:39 +0900 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/f3ac419e2bf0 8005561: typo in Calendar Reviewed-by: peytoia ! src/share/classes/java/util/Calendar.java Changeset: 645d774b683a Author: xuelei Date: 2012-12-28 00:48 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/645d774b683a 7109274: Restrict the use of certificates with RSA keys less than 1024 bits Summary: This restriction is applied via the Java Security property, "jdk.certpath.disabledAlgorithms". This will impact providers that adhere to this security property. Reviewed-by: mullan ! src/share/lib/security/java.security-linux ! src/share/lib/security/java.security-macosx ! src/share/lib/security/java.security-solaris ! src/share/lib/security/java.security-windows ! test/java/security/cert/CertPathBuilder/targetConstraints/BuildEEBasicConstraints.java ! test/java/security/cert/pkix/policyChanges/TestPolicy.java ! test/sun/security/provider/certpath/DisabledAlgorithms/CPBuilder.java ! test/sun/security/provider/certpath/DisabledAlgorithms/CPValidatorEndEntity.java ! test/sun/security/provider/certpath/DisabledAlgorithms/CPValidatorIntermediate.java ! test/sun/security/provider/certpath/DisabledAlgorithms/CPValidatorTrustAnchor.java ! test/sun/security/ssl/com/sun/net/ssl/internal/ssl/ClientHandshaker/RSAExport.java + test/sun/security/ssl/javax/net/ssl/TLSv12/DisabledShortRSAKeys.java ! test/sun/security/ssl/javax/net/ssl/TLSv12/ShortRSAKey512.java ! test/sun/security/tools/jarsigner/concise_jarsigner.sh Changeset: 4472a641b4dc Author: xuelei Date: 2012-12-28 03:50 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/4472a641b4dc 8003265: Need to clone array of input/output parameters Reviewed-by: mullan ! src/share/classes/com/sun/jndi/dns/DnsContext.java ! src/share/classes/com/sun/jndi/ldap/BasicControl.java Changeset: 46675076f753 Author: sjiang Date: 2012-12-28 16:44 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/46675076f753 7120365: DiffHBTest.java fails due to ConcurrentModificationException Summary: The problem is from the server notification forwarder, it should use a copy of listener set to do iterate. Reviewed-by: alanb ! src/share/classes/com/sun/jmx/remote/internal/ServerNotifForwarder.java ! test/ProblemList.txt + test/javax/management/remote/mandatory/notif/ConcurrentModificationTest.java Changeset: 0cfcba56cfa7 Author: jgish Date: 2012-12-28 18:32 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/0cfcba56cfa7 8005594: Fix to 8003265 breaks build Summary: backout changeset 4472a641b4dc Reviewed-by: smarks, wetmore ! src/share/classes/com/sun/jndi/dns/DnsContext.java ! src/share/classes/com/sun/jndi/ldap/BasicControl.java Changeset: ac5e29b62288 Author: smarks Date: 2012-12-28 17:36 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/ac5e29b62288 Merge Changeset: 2a5af0f766d0 Author: lana Date: 2012-12-28 18:36 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/2a5af0f766d0 Merge - make/jdk/asm/Makefile ! makefiles/CreateJars.gmk ! 
test/sun/management/jmxremote/startstop/JMXStartStopTest.sh Changeset: 32a57e645e01 Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/32a57e645e01 Added tag jdk8-b71 for changeset 2a5af0f766d0 ! .hgtags From john.coomes at oracle.com Fri Jan 4 05:51:39 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 04 Jan 2013 05:51:39 +0000 Subject: hg: hsx/hotspot-gc/langtools: 20 new changesets Message-ID: <20130104055232.1601E4753F@hg.openjdk.java.net> Changeset: 37a5d7eccb87 Author: vromero Date: 2012-12-14 11:16 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/37a5d7eccb87 8004976: test/tools/javac/7153958/CPoolRefClassContainingInlinedCts.java can fail Reviewed-by: jjg, mcimadamore ! test/tools/javac/7153958/CPoolRefClassContainingInlinedCts.java Changeset: de1ec6fc93fe Author: vromero Date: 2012-12-15 13:54 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/de1ec6fc93fe 8000518: Javac generates duplicate name_and_type constant pool entry for class BinaryOpValueExp.java Reviewed-by: jjg, mcimadamore ! src/share/classes/com/sun/tools/javac/code/Type.java ! src/share/classes/com/sun/tools/javac/code/Types.java ! src/share/classes/com/sun/tools/javac/comp/LambdaToMethod.java ! src/share/classes/com/sun/tools/javac/jvm/ClassFile.java ! src/share/classes/com/sun/tools/javac/jvm/ClassReader.java ! src/share/classes/com/sun/tools/javac/jvm/ClassWriter.java ! src/share/classes/com/sun/tools/javac/jvm/Code.java ! src/share/classes/com/sun/tools/javac/jvm/Gen.java ! src/share/classes/com/sun/tools/javac/jvm/Pool.java ! src/share/classes/com/sun/tools/javac/sym/CreateSymbols.java + test/tools/javac/8000518/DuplicateConstantPoolEntry.java ! test/tools/javac/lambda/TestInvokeDynamic.java Changeset: f72dc656a306 Author: lana Date: 2012-12-16 22:10 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/f72dc656a306 Merge Changeset: 02a18f209ab3 Author: vromero Date: 2012-12-17 14:54 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/02a18f209ab3 8004814: javadoc should be able to detect default methods Reviewed-by: jjg Contributed-by: maurizio.cimadamore at oracle.com ! src/share/classes/com/sun/javadoc/ClassDoc.java ! src/share/classes/com/sun/javadoc/MethodDoc.java ! src/share/classes/com/sun/tools/javadoc/ClassDocImpl.java ! src/share/classes/com/sun/tools/javadoc/MethodDocImpl.java Changeset: 75ab654b5cd5 Author: jjg Date: 2012-12-17 07:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/75ab654b5cd5 8004832: Add new doclint package Reviewed-by: mcimadamore ! make/build.properties ! src/share/classes/com/sun/source/util/DocTrees.java ! src/share/classes/com/sun/source/util/JavacTask.java ! src/share/classes/com/sun/source/util/TreePath.java + src/share/classes/com/sun/tools/doclint/Checker.java + src/share/classes/com/sun/tools/doclint/DocLint.java + src/share/classes/com/sun/tools/doclint/Entity.java + src/share/classes/com/sun/tools/doclint/Env.java + src/share/classes/com/sun/tools/doclint/HtmlTag.java + src/share/classes/com/sun/tools/doclint/Messages.java + src/share/classes/com/sun/tools/doclint/resources/doclint.properties ! src/share/classes/com/sun/tools/javac/api/BasicJavacTask.java ! src/share/classes/com/sun/tools/javac/api/JavacTaskImpl.java ! src/share/classes/com/sun/tools/javac/api/JavacTrees.java ! src/share/classes/com/sun/tools/javac/main/JavaCompiler.java ! src/share/classes/com/sun/tools/javac/model/JavacTypes.java ! 
src/share/classes/com/sun/tools/javac/parser/DocCommentParser.java ! src/share/classes/com/sun/tools/javac/resources/compiler.properties ! src/share/classes/com/sun/tools/javac/tree/DCTree.java ! src/share/classes/com/sun/tools/javac/tree/DocPretty.java ! src/share/classes/com/sun/tools/javac/tree/TreeInfo.java + test/tools/doclint/AccessTest.java + test/tools/doclint/AccessTest.package.out + test/tools/doclint/AccessTest.private.out + test/tools/doclint/AccessTest.protected.out + test/tools/doclint/AccessTest.public.out + test/tools/doclint/AccessibilityTest.java + test/tools/doclint/AccessibilityTest.out + test/tools/doclint/DocLintTester.java + test/tools/doclint/EmptyAuthorTest.java + test/tools/doclint/EmptyAuthorTest.out + test/tools/doclint/EmptyExceptionTest.java + test/tools/doclint/EmptyExceptionTest.out + test/tools/doclint/EmptyParamTest.java + test/tools/doclint/EmptyParamTest.out + test/tools/doclint/EmptyReturnTest.java + test/tools/doclint/EmptyReturnTest.out + test/tools/doclint/EmptySerialDataTest.java + test/tools/doclint/EmptySerialDataTest.out + test/tools/doclint/EmptySerialFieldTest.java + test/tools/doclint/EmptySerialFieldTest.out + test/tools/doclint/EmptySinceTest.java + test/tools/doclint/EmptySinceTest.out + test/tools/doclint/EmptyVersionTest.java + test/tools/doclint/EmptyVersionTest.out + test/tools/doclint/HtmlAttrsTest.java + test/tools/doclint/HtmlAttrsTest.out + test/tools/doclint/HtmlTagsTest.java + test/tools/doclint/HtmlTagsTest.out + test/tools/doclint/MissingCommentTest.java + test/tools/doclint/MissingCommentTest.out + test/tools/doclint/MissingParamsTest.java + test/tools/doclint/MissingParamsTest.out + test/tools/doclint/MissingReturnTest.java + test/tools/doclint/MissingReturnTest.out + test/tools/doclint/MissingThrowsTest.java + test/tools/doclint/MissingThrowsTest.out + test/tools/doclint/OptionTest.java + test/tools/doclint/OverridesTest.java + test/tools/doclint/ReferenceTest.java + test/tools/doclint/ReferenceTest.out + test/tools/doclint/RunTest.java + test/tools/doclint/SyntaxTest.java + test/tools/doclint/SyntaxTest.out + test/tools/doclint/SyntheticTest.java + test/tools/doclint/ValidTest.java + test/tools/doclint/tidy/AnchorAlreadyDefined.java + test/tools/doclint/tidy/AnchorAlreadyDefined.out + test/tools/doclint/tidy/BadEnd.java + test/tools/doclint/tidy/BadEnd.out + test/tools/doclint/tidy/InsertImplicit.java + test/tools/doclint/tidy/InsertImplicit.out + test/tools/doclint/tidy/InvalidEntity.java + test/tools/doclint/tidy/InvalidEntity.out + test/tools/doclint/tidy/InvalidName.java + test/tools/doclint/tidy/InvalidName.out + test/tools/doclint/tidy/InvalidTag.java + test/tools/doclint/tidy/InvalidTag.out + test/tools/doclint/tidy/InvalidURI.java + test/tools/doclint/tidy/InvalidURI.out + test/tools/doclint/tidy/MissingGT.java + test/tools/doclint/tidy/MissingGT.out + test/tools/doclint/tidy/MissingTag.java + test/tools/doclint/tidy/MissingTag.out + test/tools/doclint/tidy/NestedTag.java + test/tools/doclint/tidy/NestedTag.out + test/tools/doclint/tidy/ParaInPre.java + test/tools/doclint/tidy/ParaInPre.out + test/tools/doclint/tidy/README.txt + test/tools/doclint/tidy/RepeatedAttr.java + test/tools/doclint/tidy/RepeatedAttr.out + test/tools/doclint/tidy/TextNotAllowed.java + test/tools/doclint/tidy/TextNotAllowed.out + test/tools/doclint/tidy/TrimmingEmptyTag.java + test/tools/doclint/tidy/TrimmingEmptyTag.out + test/tools/doclint/tidy/UnescapedOrUnknownEntity.java + test/tools/doclint/tidy/UnescapedOrUnknownEntity.out + 
test/tools/doclint/tidy/util/Main.java + test/tools/doclint/tidy/util/tidy.sh + test/tools/javac/diags/examples/NoContent.java Changeset: f20568328a57 Author: mcimadamore Date: 2012-12-17 16:13 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/f20568328a57 8004099: Bad compiler diagnostic generated when poly expression is passed to non-existent method Summary: Some code paths in resolve do not use methodArguments to correctly format actuals Reviewed-by: jjg ! src/share/classes/com/sun/tools/javac/comp/Attr.java ! src/share/classes/com/sun/tools/javac/comp/Resolve.java + test/tools/javac/lambda/BadMethodCall2.java + test/tools/javac/lambda/BadMethodCall2.out Changeset: 064e372f273d Author: jjg Date: 2012-12-17 10:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/064e372f273d 8004961: rename Plugin.call to Plugin.init Reviewed-by: mcimadamore ! src/share/classes/com/sun/source/util/Plugin.java ! src/share/classes/com/sun/tools/javac/main/Main.java ! test/tools/javac/plugin/showtype/ShowTypePlugin.java ! test/tools/javac/plugin/showtype/Test.java Changeset: ef537bcc825a Author: mchung Date: 2012-12-17 15:19 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/ef537bcc825a 8005137: Rename DocLint.call to DocLint.init which overrides Plugin.init Reviewed-by: darcy, jjh ! src/share/classes/com/sun/tools/doclint/DocLint.java Changeset: bc74006c2d8d Author: darcy Date: 2012-12-18 00:24 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/bc74006c2d8d 8005046: Provide checking for a default method in javax.lang.model Reviewed-by: jjg ! src/share/classes/com/sun/tools/javac/code/Symbol.java ! src/share/classes/javax/lang/model/element/ExecutableElement.java + test/tools/javac/processing/model/element/TestExecutableElement.java Changeset: 92fcf299cd09 Author: ohrstrom Date: 2012-12-18 10:23 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/92fcf299cd09 8004657: Add hooks to javac to enable reporting dependency information. Reviewed-by: jjg, mcimadamore ! src/share/classes/com/sun/tools/javac/api/JavacTool.java ! src/share/classes/com/sun/tools/javac/comp/Resolve.java ! src/share/classes/com/sun/tools/javac/main/JavaCompiler.java Changeset: 250f0acf880c Author: mcimadamore Date: 2012-12-18 22:16 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/250f0acf880c 8005193: New regression test test/tools/javac/lambda/BadMethodCall2.java fails Summary: Bad golden file in negative test Reviewed-by: jjh ! test/tools/javac/lambda/BadMethodCall2.out Changeset: 573b38691a74 Author: lana Date: 2012-12-18 18:15 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/573b38691a74 Merge Changeset: 67b01d295cd2 Author: jjg Date: 2012-12-19 11:29 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/67b01d295cd2 8004833: Integrate doclint support into javac Reviewed-by: mcimadamore ! src/share/classes/com/sun/tools/javac/main/Main.java ! src/share/classes/com/sun/tools/javac/main/Option.java ! src/share/classes/com/sun/tools/javac/resources/javac.properties + test/tools/javac/doclint/DocLintTest.java Changeset: f72c9c5aeaef Author: jfranck Date: 2012-12-16 11:09 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/f72c9c5aeaef 8005098: Provide isSynthesized() information on Attribute.Compound Reviewed-by: jjg ! make/build.properties ! src/share/classes/com/sun/tools/javac/code/Attribute.java ! src/share/classes/com/sun/tools/javac/code/Symbol.java ! 
src/share/classes/com/sun/tools/javac/comp/Annotate.java ! src/share/classes/com/sun/tools/javac/jvm/ClassWriter.java ! src/share/classes/com/sun/tools/javac/tree/TreeMaker.java ! src/share/classes/com/sun/tools/javadoc/PackageDocImpl.java ! src/share/classes/com/sun/tools/javadoc/ParameterImpl.java ! src/share/classes/com/sun/tools/javadoc/ProgramElementDocImpl.java Changeset: a22f23fb7abf Author: jjg Date: 2012-12-20 17:59 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/a22f23fb7abf 8005307: fix missing @bug tags Reviewed-by: jjh ! test/tools/doclint/AccessTest.java ! test/tools/doclint/AccessTest.package.out ! test/tools/doclint/AccessTest.private.out ! test/tools/doclint/AccessTest.protected.out ! test/tools/doclint/AccessTest.public.out ! test/tools/doclint/AccessibilityTest.java ! test/tools/doclint/AccessibilityTest.out ! test/tools/doclint/EmptyAuthorTest.java ! test/tools/doclint/EmptyAuthorTest.out ! test/tools/doclint/EmptyExceptionTest.java ! test/tools/doclint/EmptyExceptionTest.out ! test/tools/doclint/EmptyParamTest.java ! test/tools/doclint/EmptyParamTest.out ! test/tools/doclint/EmptyReturnTest.java ! test/tools/doclint/EmptyReturnTest.out ! test/tools/doclint/EmptySerialDataTest.java ! test/tools/doclint/EmptySerialDataTest.out ! test/tools/doclint/EmptySerialFieldTest.java ! test/tools/doclint/EmptySerialFieldTest.out ! test/tools/doclint/EmptySinceTest.java ! test/tools/doclint/EmptySinceTest.out ! test/tools/doclint/EmptyVersionTest.java ! test/tools/doclint/EmptyVersionTest.out ! test/tools/doclint/HtmlAttrsTest.java ! test/tools/doclint/HtmlAttrsTest.out ! test/tools/doclint/HtmlTagsTest.java ! test/tools/doclint/HtmlTagsTest.out ! test/tools/doclint/MissingParamsTest.java ! test/tools/doclint/MissingParamsTest.out ! test/tools/doclint/MissingReturnTest.java ! test/tools/doclint/MissingReturnTest.out ! test/tools/doclint/MissingThrowsTest.java ! test/tools/doclint/MissingThrowsTest.out ! test/tools/doclint/OptionTest.java ! test/tools/doclint/OverridesTest.java ! test/tools/doclint/ReferenceTest.java ! test/tools/doclint/ReferenceTest.out ! test/tools/doclint/RunTest.java ! test/tools/doclint/SyntaxTest.java ! test/tools/doclint/SyntaxTest.out ! test/tools/doclint/SyntheticTest.java ! test/tools/doclint/ValidTest.java ! test/tools/doclint/tidy/AnchorAlreadyDefined.java ! test/tools/doclint/tidy/AnchorAlreadyDefined.out ! test/tools/doclint/tidy/BadEnd.java ! test/tools/doclint/tidy/BadEnd.out ! test/tools/doclint/tidy/InsertImplicit.java ! test/tools/doclint/tidy/InsertImplicit.out ! test/tools/doclint/tidy/InvalidEntity.java ! test/tools/doclint/tidy/InvalidEntity.out ! test/tools/doclint/tidy/InvalidName.java ! test/tools/doclint/tidy/InvalidName.out ! test/tools/doclint/tidy/InvalidTag.java ! test/tools/doclint/tidy/InvalidTag.out ! test/tools/doclint/tidy/InvalidURI.java ! test/tools/doclint/tidy/InvalidURI.out ! test/tools/doclint/tidy/MissingGT.java ! test/tools/doclint/tidy/MissingGT.out ! test/tools/doclint/tidy/MissingTag.java ! test/tools/doclint/tidy/MissingTag.out ! test/tools/doclint/tidy/NestedTag.java ! test/tools/doclint/tidy/NestedTag.out ! test/tools/doclint/tidy/ParaInPre.java ! test/tools/doclint/tidy/ParaInPre.out ! test/tools/doclint/tidy/RepeatedAttr.java ! test/tools/doclint/tidy/RepeatedAttr.out ! test/tools/doclint/tidy/TextNotAllowed.java ! test/tools/doclint/tidy/TextNotAllowed.out ! test/tools/doclint/tidy/TrimmingEmptyTag.java ! test/tools/doclint/tidy/TrimmingEmptyTag.out ! 
test/tools/doclint/tidy/UnescapedOrUnknownEntity.java ! test/tools/doclint/tidy/UnescapedOrUnknownEntity.out Changeset: b52a38d4536c Author: darcy Date: 2012-12-21 08:45 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/b52a38d4536c 8005282: Use @library tag with non-relative path for javac tests Reviewed-by: jjg ! test/tools/javac/7129225/TestImportStar.java ! test/tools/javac/cast/intersection/model/Model01.java ! test/tools/javac/classreader/T7031108.java ! test/tools/javac/enum/6350057/T6350057.java ! test/tools/javac/enum/6424358/T6424358.java ! test/tools/javac/file/T7018098.java ! test/tools/javac/multicatch/model/ModelChecker.java ! test/tools/javac/options/T7022337.java ! test/tools/javac/processing/6348499/T6348499.java ! test/tools/javac/processing/6359313/T6359313.java ! test/tools/javac/processing/6365040/T6365040.java ! test/tools/javac/processing/6413690/T6413690.java ! test/tools/javac/processing/6414633/T6414633.java ! test/tools/javac/processing/6430209/T6430209.java ! test/tools/javac/processing/6499119/ClassProcessor.java ! test/tools/javac/processing/6511613/clss41701.java ! test/tools/javac/processing/6512707/T6512707.java ! test/tools/javac/processing/6634138/T6634138.java ! test/tools/javac/processing/6994946/SemanticErrorTest.java ! test/tools/javac/processing/6994946/SyntaxErrorTest.java ! test/tools/javac/processing/T6920317.java ! test/tools/javac/processing/T7196462.java ! test/tools/javac/processing/TestWarnErrorCount.java ! test/tools/javac/processing/environment/TestSourceVersion.java ! test/tools/javac/processing/environment/round/TestContext.java ! test/tools/javac/processing/environment/round/TestElementsAnnotatedWith.java ! test/tools/javac/processing/errors/TestErrorCount.java ! test/tools/javac/processing/errors/TestFatalityOfParseErrors.java ! test/tools/javac/processing/errors/TestOptionSyntaxErrors.java ! test/tools/javac/processing/errors/TestParseErrors/TestParseErrors.java ! test/tools/javac/processing/errors/TestReturnCode.java ! test/tools/javac/processing/filer/TestFilerConstraints.java ! test/tools/javac/processing/filer/TestGetResource.java ! test/tools/javac/processing/filer/TestGetResource2.java ! test/tools/javac/processing/filer/TestInvalidRelativeNames.java ! test/tools/javac/processing/filer/TestLastRound.java ! test/tools/javac/processing/filer/TestPackageInfo.java ! test/tools/javac/processing/filer/TestValidRelativeNames.java ! test/tools/javac/processing/messager/6362067/T6362067.java ! test/tools/javac/processing/messager/MessagerBasics.java ! test/tools/javac/processing/model/6194785/T6194785.java ! test/tools/javac/processing/model/6341534/T6341534.java ! test/tools/javac/processing/model/element/TestAnonClassNames.java ! test/tools/javac/processing/model/element/TestElement.java ! test/tools/javac/processing/model/element/TestMissingElement/TestMissingElement.java ! test/tools/javac/processing/model/element/TestMissingElement2/TestMissingClass.java ! test/tools/javac/processing/model/element/TestMissingElement2/TestMissingGenericClass1.java ! test/tools/javac/processing/model/element/TestMissingElement2/TestMissingGenericClass2.java ! test/tools/javac/processing/model/element/TestMissingElement2/TestMissingGenericInterface1.java ! test/tools/javac/processing/model/element/TestMissingElement2/TestMissingGenericInterface2.java ! test/tools/javac/processing/model/element/TestMissingElement2/TestMissingInterface.java ! test/tools/javac/processing/model/element/TestNames.java ! 
test/tools/javac/processing/model/element/TestPackageElement.java ! test/tools/javac/processing/model/element/TestResourceElement.java ! test/tools/javac/processing/model/element/TestResourceVariable.java ! test/tools/javac/processing/model/element/TestTypeParameter.java ! test/tools/javac/processing/model/element/TypeParamBounds.java ! test/tools/javac/processing/model/type/MirroredTypeEx/OverEager.java ! test/tools/javac/processing/model/type/MirroredTypeEx/Plurality.java ! test/tools/javac/processing/model/type/NoTypes.java ! test/tools/javac/processing/model/type/TestUnionType.java ! test/tools/javac/processing/model/util/BinaryName.java ! test/tools/javac/processing/model/util/GetTypeElemBadArg.java ! test/tools/javac/processing/model/util/NoSupers.java ! test/tools/javac/processing/model/util/OverridesSpecEx.java ! test/tools/javac/processing/model/util/TypesBadArg.java ! test/tools/javac/processing/model/util/deprecation/TestDeprecation.java ! test/tools/javac/processing/model/util/directSupersOfErr/DirectSupersOfErr.java ! test/tools/javac/processing/model/util/elements/TestGetConstantExpression.java ! test/tools/javac/processing/model/util/elements/TestGetPackageOf.java ! test/tools/javac/processing/model/util/filter/TestIterables.java ! test/tools/javac/processing/options/testCommandLineClasses/Test.java ! test/tools/javac/processing/options/testPrintProcessorInfo/Test.java ! test/tools/javac/processing/options/testPrintProcessorInfo/TestWithXstdout.java ! test/tools/javac/processing/warnings/UseImplicit/TestProcUseImplicitWarning.java ! test/tools/javac/processing/werror/WError1.java ! test/tools/javac/processing/werror/WErrorGen.java ! test/tools/javac/processing/werror/WErrorLast.java ! test/tools/javac/resolve/ResolveHarness.java ! test/tools/javac/util/T6597678.java ! test/tools/javac/util/context/T7021650.java Changeset: 189b26e3818f Author: vromero Date: 2012-12-21 15:27 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/189b26e3818f 8003512: javac doesn't work with jar files with >64k entries Reviewed-by: jjg, ksrini Contributed-by: martinrb at google.com ! src/share/classes/com/sun/tools/javac/file/ZipFileIndex.java + test/tools/javac/file/zip/8003512/LoadClassFromJava6CreatedJarTest.java ! test/tools/javac/file/zip/Utils.java Changeset: 690c41cdab55 Author: bpatel Date: 2012-12-25 17:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/690c41cdab55 8004893: the javadoc/doclet needs to be updated to accommodate lambda changes Reviewed-by: jjg ! src/share/classes/com/sun/tools/doclets/formats/html/AbstractMemberWriter.java ! src/share/classes/com/sun/tools/doclets/formats/html/ClassWriterImpl.java ! src/share/classes/com/sun/tools/doclets/formats/html/resources/standard.properties ! src/share/classes/com/sun/tools/doclets/internal/toolkit/ClassWriter.java ! src/share/classes/com/sun/tools/doclets/internal/toolkit/builders/ClassBuilder.java ! src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml ! src/share/classes/com/sun/tools/doclets/internal/toolkit/util/MethodTypes.java ! test/com/sun/javadoc/testHtmlTableTags/TestHtmlTableTags.java + test/com/sun/javadoc/testLambdaFeature/TestLambdaFeature.java + test/com/sun/javadoc/testLambdaFeature/pkg/A.java + test/com/sun/javadoc/testLambdaFeature/pkg/B.java ! test/com/sun/javadoc/testMethodTypes/TestMethodTypes.java Changeset: 467e4d9281bc Author: lana Date: 2012-12-28 18:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/467e4d9281bc Merge ! 
test/tools/javac/processing/model/util/deprecation/TestDeprecation.java Changeset: 6f0986ed9b7e Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/6f0986ed9b7e Added tag jdk8-b71 for changeset 467e4d9281bc ! .hgtags From bengt.rutisson at oracle.com Fri Jan 4 07:23:45 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Fri, 04 Jan 2013 07:23:45 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8005396: Use ParNew with only one thread instead of DefNew as default for CMS on single CPU machines Message-ID: <20130104072349.8453347546@hg.openjdk.java.net> Changeset: ca0a78017dc7 Author: brutisso Date: 2012-12-30 08:47 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ca0a78017dc7 8005396: Use ParNew with only one thread instead of DefNew as default for CMS on single CPU machines Reviewed-by: jmasa, jcoomes ! src/share/vm/gc_implementation/concurrentMarkSweep/cmsCollectorPolicy.cpp ! src/share/vm/gc_implementation/parNew/parNewGeneration.cpp ! src/share/vm/gc_implementation/parNew/parNewGeneration.hpp ! src/share/vm/memory/collectorPolicy.cpp ! src/share/vm/memory/tenuredGeneration.cpp ! src/share/vm/runtime/arguments.cpp From bengt.rutisson at oracle.com Fri Jan 4 10:23:46 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 04 Jan 2013 11:23:46 +0100 Subject: Request for review (S): 8005396: Use ParNew with only one thread instead of DefNew as default for CMS on single CPU machines In-Reply-To: <50E5D40B.1030500@oracle.com> References: <50D46750.70106@oracle.com> <20692.64227.950170.267793@oracle.com> <50DFF74D.8070106@oracle.com> <50E1ACCF.3000900@oracle.com> <50E5D40B.1030500@oracle.com> Message-ID: <50E6ADB2.9010108@oracle.com> Thanks Jon and John for the reviews! Just pushed this change. Bengt On 1/3/13 7:55 PM, Bengt Rutisson wrote: > > Hi Jon, > > On 12/31/12 4:18 PM, Jon Masamitsu wrote: >> Bengt, >> >> Thanks for the changes. > > Thanks for looking at it again! > >> Could you also fix this code in >> >> share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp >> >> >> If CMS is never going to see code with ParallelGCThreads == 0, then >> the then-block for 624 can be deleted. > > I think it is a little early to remove this code. My change only makes > sure that if you run CMS with ParNew you get at least one worker > thread to make sure that you can use ParNew. > > If you explicitly turn ParNew off you will still get DefNew and can > use ParallelGCThreads == 0. My other review request will print a > deprecation message for this combination but for JDK8 it will still be > allowed. > > For JDK9 we plan to completely disallow this combination and at that > point I think we can remove the code in compactibleFreeListSpace.cpp. 
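For readers skimming the thread, the deprecation being discussed can be sketched as an argument check of the following shape. This is illustrative only: the flag names (UseConcMarkSweepGC, UseParNewGC) are real HotSpot flags and warning() is the normal HotSpot warning routine, but the helper name and the exact message text are placeholders, not the contents of the webrev.

  // Sketch only: the kind of deprecation check discussed in this thread.
  // check_deprecated_gc_combinations() is a hypothetical helper name.
  static void check_deprecated_gc_combinations() {
    if (UseConcMarkSweepGC && !UseParNewGC) {
      // DefNew + CMS: still accepted in JDK 8, but flagged as deprecated.
      warning("Using the DefNew young collector with the CMS collector is "
              "deprecated and will likely be removed in a future release");
    }
  }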
> > Thanks, > Bengt > >> >> >> 620 if (!_adaptive_freelists&& _smallLinearAllocBlock._ptr >> == NULL) { >> >> 621 // Mark the boundary of the new block in BOT >> >> 622 _bt.mark_block(prevEnd, value); >> >> 623 // put it all in the linAB >> >> 624 if (ParallelGCThreads == 0) { >> >> 625 _smallLinearAllocBlock._ptr = prevEnd; >> >> 626 _smallLinearAllocBlock._word_size = newFcSize; >> >> 627 repairLinearAllocBlock(&_smallLinearAllocBlock); >> >> 628 } else { // ParallelGCThreads> 0 >> >> 629 MutexLockerEx x(parDictionaryAllocLock(), >> >> 630 Mutex::_no_safepoint_check_flag); >> >> 631 _smallLinearAllocBlock._ptr = prevEnd; >> >> 632 _smallLinearAllocBlock._word_size = newFcSize; >> >> 633 repairLinearAllocBlock(&_smallLinearAllocBlock); >> >> 634 } >> >> >> Jon >> >> On 12/30/12 00:11, Bengt Rutisson wrote: >>> >>> Hi John and Jon, >>> >>> Thanks for the reviews! >>> >>> I discovered a bug in my fix. ParNew actually does not support >>> ParallelGCThreads=0. I fixed this by making sure that we don't set >>> ParallelGCThreads to 0 on single CPU machines. Instead we keep it at 1. >>> >>> And if someone explicitly set -XX:ParallelGCThreads=0 on the command >>> line while trying to use ParNew I print an error message and exit. >>> >>> I assume that this is the reason that we previously picked DefNew if >>> ParallelGCThreads was set to 0, but I think now that we want to >>> deprecate DefNew for CMS it makes more sense to require users to >>> explicitly turn ParNew off with -XX:-UseParNewGC if this is what >>> they really want. >>> >>> Updated webrev: >>> http://cr.openjdk.java.net/~brutisso/8005396/webrev.02/ >>> >>> The only change to the previous version is in arguments.cpp. Here is >>> the small diff compared to the previous webrev: >>> http://cr.openjdk.java.net/~brutisso/8005396/webrev.01-02.diff/ >>> >>> I have tested the fix on a single CPU virtual box instance. >>> >>> Thanks, >>> Bengt >>> >>> On 12/22/12 1:12 AM, John Coomes wrote: >>>> Bengt Rutisson (bengt.rutisson at oracle.com) wrote: >>>>> Hi All, >>>>> >>>>> Can I have a couple of reviews for this change? >>>>> >>>>> http://cr.openjdk.java.net/~brutisso/8005396/webrev.00/ >>>>> >>>>> Currently we use ParNew as default for the young generation when >>>>> CMS is >>>>> selected. But if the machine only has a single CPU we set the >>>>> ParallelGCThreads to 0 and and select DefNew instead of ParNew. >>>> Looks good to me. >>>> >>>> -John >>>> >>>>> As part of another change, 8003820, we will deprecate the DefNew + >>>>> CMS >>>>> combination. Thus, it does not make sense anymore to have this >>>>> selected >>>>> by default. This fix is to make CMS always pick ParNew by default. >>>>> >>>>> The change also has the side effect that the, in my opinion, rather >>>>> strange behavior that setting ParallelGCThreads=0 on the command line >>>>> overrides the GC choice. I would expect this command line to give me >>>>> ParNew, but it actually gives me DefNew: >>>>> >>>>> -XX:+UseParNewGC -XX:ParallelGCThreads=0 >>>>> >>>>> After my proposed change you get ParNew with the above command line. >>>>> >>>>> I have done some performance testing to verify that ParNew with one >>>>> thread is not slower than DefNew. The details are in the bug report: >>>>> >>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8005396 >>>>> >>>>> but as a summary it can be said that there is no noticeable >>>>> difference. >>>>> >>>>> I am also running some more SPECjbb2005 runs and will analyze the >>>>> gc times. 
>>>>> >>>>> Thanks, >>>>> Bengt >>> > From bengt.rutisson at oracle.com Fri Jan 4 10:23:54 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 04 Jan 2013 11:23:54 +0100 Subject: Request for review (S): 8003820: Deprecate untested and rarely used GC combinations In-Reply-To: <50D54051.2030600@oracle.com> References: <50D07790.6070304@oracle.com> <50D32527.4050007@oracle.com> <50D467FE.9000700@oracle.com> <20693.284.193568.701702@oracle.com> <50D54051.2030600@oracle.com> Message-ID: <50E6ADBA.6000201@oracle.com> Thanks Ramki, John and Jesper for the reviews! Pushing this now. Bengt On 12/22/12 6:08 AM, Bengt Rutisson wrote: > > Hi John, > > Thanks for looking at this! > > On 12/22/12 1:38 AM, John Coomes wrote: >> Bengt Rutisson (bengt.rutisson at oracle.com) wrote: >>> Hi Ramki, >>> >>> I made the change to pick ParNew by default also on single CPU systems. >>> However, I think this change deserves a separate bug ID and changeset. >>> So, I just sent out a new review request with just this change. >>> >>> Once that has been handled. I think the review request discussed in >>> this >>> email thread will look exactly as it is now. >> Given the other change to keep ParNew enabled on 1-cpu systems, this >> looks good to me. >> >> You might want to append to the warning something along the lines of >> "and will likely be removed in a future release". We intend to remove >> this, so "deprecated" here is notably different from its use in the >> Java APIs, where nothing deprecated has ever been removed. > > Good point. Updated webrev: > > http://cr.openjdk.java.net/~brutisso/8003820/webrev.01/ > > Thanks, > Bengt >> >> -John >> >>> Thanks again for looking at this! >>> Bengt >>> >>> On 12/20/12 9:21 PM, Srinivas Ramakrishna wrote: >>>> >>>> >>>>> What happens when you run CMS on a single-processor. I hope you >>>>> don't see a deprecation warning. >>>> Ooops. Good point. It took me a long while to find a machine with >>>> just one cpu that could actually run JDK8. But you are >>>> correct. We >>>> will print a warning in that case. >>>> >>>> >>>> Remember that virtualized platforms or LDOMS or Zones may partition a >>>> large box into small 1-cpu slices (although may be not 1-core). >>>> >>>> On Solaris, you can easily test your code by means of psradm to turn >>>> off all but one virtual cpu. >>>> >>>> >>>> I think the fix is to not pick DefNew by default for single >>>> processor machines. I'll see if I can get any performance data >>>> for >>>> that. >>>> >>>> >>>> I'd test that on a regular MP with ParNew=1 vs DefNew, as well as >>>> separately with psrset and pbind (although my guess is that >>>> the latter two would be indistinguishable from each other). As I >>>> recall, scaling was near linear at those small numbers for ParNew, >>>> and the breakeven point was at 2, so my guess based on very old data >>>> from the fogs of time is that we'd see a fairly sizable pause >>>> time and overhead hit on a single cpu. >>>> >>>> Stepping back for a moment, is supporting embedded environments >>>> perhaps from the same parent code base an issue, so DefNew & >>>> Serial is going to be part of the code base for a while, anyway? 
>>>> >>>> I understand though that saving on testing resources by pruning down >>>> supported combinations is one important motivation, in which case >>>> DefNew+CMS gets deprecated (and switches to Parnew/1+CMS on 1-cpu >>>> configs), but DefNew continues to be part of the code base, >>>> and so DefNew code gets used (and tested) at least in part to the >>>> extent that ParNew uses at least some functionality defined in DefNew. >>>> >>>> -- ramki >>>> >>>> > From bengt.rutisson at oracle.com Fri Jan 4 12:01:26 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Fri, 04 Jan 2013 12:01:26 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8003820: Deprecate untested and rarely used GC combinations Message-ID: <20130104120135.4B8DA47553@hg.openjdk.java.net> Changeset: e0ab18eafbde Author: brutisso Date: 2013-01-04 11:10 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e0ab18eafbde 8003820: Deprecate untested and rarely used GC combinations Summary: Log warning messages for DefNew+CMS and ParNew+SerialOld Reviewed-by: ysr, jwilhelm, jcoomes ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/arguments.hpp From ysr1729 at gmail.com Fri Jan 4 18:24:35 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 4 Jan 2013 10:24:35 -0800 Subject: Request for review (S): 8003820: Deprecate untested and rarely used GC combinations In-Reply-To: <50E6ADBA.6000201@oracle.com> References: <50D07790.6070304@oracle.com> <50D32527.4050007@oracle.com> <50D467FE.9000700@oracle.com> <20693.284.193568.701702@oracle.com> <50D54051.2030600@oracle.com> <50E6ADBA.6000201@oracle.com> Message-ID: Sorry for the delay in responding. The changes (this and the other bug id for ParNew/1) all look good; thanks for verifying the performance numbers as well! thanks! -- ramki On Fri, Jan 4, 2013 at 2:23 AM, Bengt Rutisson wrote: > > Thanks Ramki, John and Jesper for the reviews! > > Pushing this now. > > Bengt > > > On 12/22/12 6:08 AM, Bengt Rutisson wrote: > >> >> Hi John, >> >> Thanks for looking at this! >> >> On 12/22/12 1:38 AM, John Coomes wrote: >> >>> Bengt Rutisson (bengt.rutisson at oracle.com) wrote: >>> >>>> Hi Ramki, >>>> >>>> I made the change to pick ParNew by default also on single CPU systems. >>>> However, I think this change deserves a separate bug ID and changeset. >>>> So, I just sent out a new review request with just this change. >>>> >>>> Once that has been handled. I think the review request discussed in this >>>> email thread will look exactly as it is now. >>>> >>> Given the other change to keep ParNew enabled on 1-cpu systems, this >>> looks good to me. >>> >>> You might want to append to the warning something along the lines of >>> "and will likely be removed in a future release". We intend to remove >>> this, so "deprecated" here is notably different from its use in the >>> Java APIs, where nothing deprecated has ever been removed. >>> >> >> Good point. Updated webrev: >> >> http://cr.openjdk.java.net/~**brutisso/8003820/webrev.01/ >> >> Thanks, >> Bengt >> >>> >>> -John >>> >>> Thanks again for looking at this! >>>> Bengt >>>> >>>> On 12/20/12 9:21 PM, Srinivas Ramakrishna wrote: >>>> >>>>> >>>>> >>>>> What happens when you run CMS on a single-processor. I hope you >>>>>> don't see a deprecation warning. >>>>>> >>>>> Ooops. Good point. It took me a long while to find a machine with >>>>> just one cpu that could actually run JDK8. But you are correct. We >>>>> will print a warning in that case. 
>>>>> >>>>> >>>>> Remember that virtualized platforms or LDOMS or Zones may partition a >>>>> large box into small 1-cpu slices (although may be not 1-core). >>>>> >>>>> On Solaris, you can easily test your code by means of psradm to turn >>>>> off all but one virtual cpu. >>>>> >>>>> >>>>> I think the fix is to not pick DefNew by default for single >>>>> processor machines. I'll see if I can get any performance data for >>>>> that. >>>>> >>>>> >>>>> I'd test that on a regular MP with ParNew=1 vs DefNew, as well as >>>>> separately with psrset and pbind (although my guess is that >>>>> the latter two would be indistinguishable from each other). As I >>>>> recall, scaling was near linear at those small numbers for ParNew, >>>>> and the breakeven point was at 2, so my guess based on very old data >>>>> from the fogs of time is that we'd see a fairly sizable pause >>>>> time and overhead hit on a single cpu. >>>>> >>>>> Stepping back for a moment, is supporting embedded environments >>>>> perhaps from the same parent code base an issue, so DefNew & >>>>> Serial is going to be part of the code base for a while, anyway? >>>>> >>>>> I understand though that saving on testing resources by pruning down >>>>> supported combinations is one important motivation, in which case >>>>> DefNew+CMS gets deprecated (and switches to Parnew/1+CMS on 1-cpu >>>>> configs), but DefNew continues to be part of the code base, >>>>> and so DefNew code gets used (and tested) at least in part to the >>>>> extent that ParNew uses at least some functionality defined in DefNew. >>>>> >>>>> -- ramki >>>>> >>>>> >>>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Fri Jan 4 21:21:12 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 04 Jan 2013 22:21:12 +0100 Subject: Request for review (XS): 8003822: Deprecate the incremental mode of CMS In-Reply-To: <50D46911.8030203@oracle.com> References: <50D1B5B2.4010709@oracle.com> <50D25402.3010001@oracle.com> <50D46911.8030203@oracle.com> Message-ID: <50E747C8.7060601@oracle.com> Thanks John and Jesper for the reviews! Pushing this now. Bengt On 12/21/12 2:50 PM, Bengt Rutisson wrote: > > Hi John, > > Thanks for looking at this! > > On 12/20/12 12:55 AM, John Cuthbertson wrote: >> Hi Bengt, >> >> Changes look good to me. You may want to add an >> JDK_Version::is_gte_jdk18x_version() check just in case someone >> mistakenly backports this change or, for some unknown reason, hs25 is >> placed in a jdk7. > > I'm not sure we have to be this defensive. It would not be good if > this change got backported to JDK7. > >> I also want to call out that by not including a FLAG_IS_DEFAULT >> check, you are assuming that none of these flags are enabled by >> default. This is true for us. No change required - just calling it out. > > Good point. I'll leave it as it for now, though. > > Thanks, > Bengt > > >> >> JohnC >> >> On 12/19/2012 4:40 AM, Bengt Rutisson wrote: >>> >>> Hi all, >>> >>> Can I have a couple of reviews for this change to deprecate iCMS? 
>>> >>> http://cr.openjdk.java.net/~brutisso/8003822/webrev.00/ >>> >>> This is part of the work for JEP 173: >>> >>> JEP 173: Retire Some Rarely-Used GC Combinations >>> http://openjdk.java.net/jeps/173 >>> >>> The webrev is based on the the earlier webrev that I sent out to >>> deprecate the DefNew + CMS and ParNew + SerialOld GC combinations: >>> >>> http://cr.openjdk.java.net/~brutisso/8003820/webrev.00/ >>> >>> Thanks, >>> Bengt >> > From jon.masamitsu at oracle.com Fri Jan 4 22:03:19 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 04 Jan 2013 14:03:19 -0800 Subject: request for review (s) - 8005672: Clean up some changes to GC logging with GCCause's Message-ID: <50E751A7.2040202@oracle.com> This is a clean up of some unintended white space changes in the GC logging output. http://cr.openjdk.java.net/~jmasa/8005672/webrev.00/ Thanks. Jon From bengt.rutisson at oracle.com Fri Jan 4 22:30:54 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Fri, 04 Jan 2013 22:30:54 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8003822: Deprecate the incremental mode of CMS Message-ID: <20130104223058.854A447569@hg.openjdk.java.net> Changeset: c98b676a98b4 Author: brutisso Date: 2013-01-04 21:33 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c98b676a98b4 8003822: Deprecate the incremental mode of CMS Reviewed-by: johnc, jwilhelm ! src/share/vm/runtime/arguments.cpp From ysr1729 at gmail.com Fri Jan 4 23:02:42 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 4 Jan 2013 15:02:42 -0800 Subject: request for review (s) - 8005672: Clean up some changes to GC logging with GCCause's In-Reply-To: <50E751A7.2040202@oracle.com> References: <50E751A7.2040202@oracle.com> Message-ID: Hi Jon, could you post examples of pre-unintended-change, post-unintended-change and post-this-fix? (couldn't find it in the visible part of the bug report) Happy New Year! -- ramki On Fri, Jan 4, 2013 at 2:03 PM, Jon Masamitsu wrote: > This is a clean up of some unintended white space changes in > the GC logging output. > > http://cr.openjdk.java.net/~**jmasa/8005672/webrev.00/ > > Thanks. > > Jon > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysr1729 at gmail.com Fri Jan 4 23:08:30 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 4 Jan 2013 15:08:30 -0800 Subject: Request for review (XS): 8003822: Deprecate the incremental mode of CMS In-Reply-To: References: Message-ID: On Wed, Dec 19, 2012 at 12:46 PM, Joel Buckley wrote: > Hey Bengt, > > Code review looks simple enough. > > One nit on naming consistency: > Should "CMSIncrementalMode" be "UseCMSIncrementalGC" > to be consistent with other flags? > Historical name, probably not worth renaming especially since that mode is being deprecated now. > A couple more general questions: > > What GC combinations are non-deprecated and what are their > intended use cases? > > For example, what GC combinations are intended for the following?: > * 4-100GB Heap (e.g. 64bit); High Priority: 200ms max pause due to GC; > Env: >50% Heap long lived (>12hour) > * 4-100GB Heap (e.g. 64bit); High Priority: 500ms max pause due to GC; > Env: ~40% Heap short lived (<1minute) & ~40% Heap long lived (>12hour) > * 1-3.5GB Heap (e.g. 32bit); High Priority: 200ms max pause due to GC; > Env: ~90% Heap short lived (<1minute) > * 100-1024MB Heap (e.g. 
32bit/embedded); High Priority: 500ms max pause > due to GC; Env: varying short/long lived data profiles over time > guesstimate: Probably CMS and G1 for all these cases except for embedded, where serial or parallel would probably work. #2 and #3 could probably be tuned for parallel old. -- ramki > > Thanks, > Joel. > > Date: Wed, 19 Dec 2012 13:40:18 +0100 > From: Bengt Rutisson > Subject: Request for review (XS): 8003822: Deprecate the incremental > mode of CMS > To: hotspot-gc-dev at openjdk.java.net > Message-ID: <50D1B5B2.4010709 at oracle.com> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > > > Hi all, > > Can I have a couple of reviews for this change to deprecate iCMS? > > http://cr.openjdk.java.net/~brutisso/8003822/webrev.00/ > > This is part of the work for JEP 173: > > JEP 173: Retire Some Rarely-Used GC Combinations > http://openjdk.java.net/jeps/173 > > The webrev is based on the the earlier webrev that I sent out to > deprecate the DefNew + CMS and ParNew + SerialOld GC combinations: > > http://cr.openjdk.java.net/~brutisso/8003820/webrev.00/ > > Thanks, > Bengt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Fri Jan 4 23:23:59 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 04 Jan 2013 15:23:59 -0800 Subject: Request for review (xs) - 8000325 In-Reply-To: References: <50B79F8C.7030704@oracle.com> Message-ID: <50E7648F.7030106@oracle.com> On 11/29/2012 3:17 PM, Srinivas Ramakrishna wrote: > Looks good. Would be great to prominently release-note this change when it > appears in a public/GA update/release. > > Out of curiosity, any perf data on CMS pause time diffs from this change > with current NPG? I ran refworkload server_reference and saw a regression only on specjbb2000. The regression was on the remark pauses (as would be expected) and the largest I saw was about 12%. Larger than I would have hoped. The unloading is done serially so maybe things around it got faster. Jon > -- ramki > > On Thu, Nov 29, 2012 at 9:46 AM, Jon Masamitsuwrote: > >> This is a change in the default class unloading policy for CMS. >> >> 8000325: Change default for CMSClassUnloadingEnabled to true >> >> http://cr.openjdk.java.net/~**jmasa/8000325/webrev.00/ >> >> With perm gen removal it becomes important for CMS to unload >> classes to avoid excessive consumption of native memory for >> metadata. >> From jon.masamitsu at oracle.com Fri Jan 4 23:51:20 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 04 Jan 2013 15:51:20 -0800 Subject: request for review (s) - 8005672: Clean up some changes to GC logging with GCCause's In-Reply-To: References: <50E751A7.2040202@oracle.com> Message-ID: <50E76AF8.5060400@oracle.com> On 1/4/2013 3:02 PM, Srinivas Ramakrishna wrote: > Hi Jon, could you post examples of pre-unintended-change, > post-unintended-change and post-this-fix? (couldn't find it in the visible > part of the bug report) Before unintended change (jdk7) [GC [ParNew: 69952K->8704K(78656K), 1.1970131 secs] 69952K->69906K(253440K), 1.1971601 secs] [Times: user=5.08 sys=0.78, real=1.20 secs] After unintended change - missing blank after "GC" (recent jdk8 promoted) [GC[ParNew: 69952K->8704K(78656K), 1.1672417 secs] 69952K->68461K(253440K), 1.1673949 secs] [Times: user=5.01 sys=0.75, real=1.17 secs] Fixed [GC [ParNew: 69952K->8704K(78656K), 1.1765617 secs] 69952K->68482K(253440K), 1.1767129 secs] [Times: user=5.06 sys=0.77, real=1.18 secs] Jon > Happy New Year! 
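To spell out the difference between the three samples above: the only change is the separator after the leading "[GC" token. In HotSpot's logging style the distinction is a single character. Here gclog_or_tty is the real log stream, but the calls below are illustrative, not the actual diff:

  // Illustrative only: the whole issue is one missing space.
  gclog_or_tty->print("[GC ");   // jdk7 and the fixed output: "[GC [ParNew: ..."
  gclog_or_tty->print("[GC");    // unintended jdk8 output:    "[GC[ParNew: ..."

That single space is what simple log parsers (such as the awk scripts mentioned later in the thread) key on.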
> -- ramki > > On Fri, Jan 4, 2013 at 2:03 PM, Jon Masamitsuwrote: > >> This is a clean up of some unintended white space changes in >> the GC logging output. >> >> http://cr.openjdk.java.net/~**jmasa/8005672/webrev.00/ >> >> Thanks. >> >> Jon >> >> From ysr1729 at gmail.com Sat Jan 5 00:07:07 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 4 Jan 2013 16:07:07 -0800 Subject: request for review (s) - 8005672: Clean up some changes to GC logging with GCCause's In-Reply-To: <50E76AF8.5060400@oracle.com> References: <50E751A7.2040202@oracle.com> <50E76AF8.5060400@oracle.com> Message-ID: Great, thanks! Changes look good to me. reviewed. -- ramki On Fri, Jan 4, 2013 at 3:51 PM, Jon Masamitsu wrote: > > > On 1/4/2013 3:02 PM, Srinivas Ramakrishna wrote: > >> Hi Jon, could you post examples of pre-unintended-change, >> post-unintended-change and post-this-fix? (couldn't find it in the visible >> part of the bug report) >> > Before unintended change (jdk7) > > [GC [ParNew: 69952K->8704K(78656K), 1.1970131 secs] > 69952K->69906K(253440K), 1.1971601 secs] [Times: user=5.08 sys=0.78, > real=1.20 secs] > > After unintended change - missing blank after "GC" (recent jdk8 promoted) > > [GC[ParNew: 69952K->8704K(78656K), 1.1672417 secs] > 69952K->68461K(253440K), 1.1673949 secs] [Times: user=5.01 sys=0.75, > real=1.17 secs] > > Fixed > > [GC [ParNew: 69952K->8704K(78656K), 1.1765617 secs] > 69952K->68482K(253440K), 1.1767129 secs] [Times: user=5.06 sys=0.77, > real=1.18 secs] > > Jon > > Happy New Year! >> -- ramki >> >> On Fri, Jan 4, 2013 at 2:03 PM, Jon Masamitsu >> >wrote: >> >> This is a clean up of some unintended white space changes in >>> the GC logging output. >>> >>> http://cr.openjdk.java.net/~****jmasa/8005672/webrev.00/ >>> >>> > >>> >>> Thanks. >>> >>> Jon >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysr1729 at gmail.com Sat Jan 5 00:10:00 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 4 Jan 2013 16:10:00 -0800 Subject: Request for review (xs) - 8000325 In-Reply-To: <50E7648F.7030106@oracle.com> References: <50B79F8C.7030704@oracle.com> <50E7648F.7030106@oracle.com> Message-ID: Thanks for the perf data, Jon! I'd say that's not too bad. Out of curiosity, was the diff statistically significant per appropriate T-test? (i've seen remark pauses show some variance depending on application and tuning.) -- ramki On Fri, Jan 4, 2013 at 3:23 PM, Jon Masamitsu wrote: > > > On 11/29/2012 3:17 PM, Srinivas Ramakrishna wrote: > >> Looks good. Would be great to prominently release-note this change when it >> appears in a public/GA update/release. >> >> Out of curiosity, any perf data on CMS pause time diffs from this change >> with current NPG? >> > > I ran refworkload server_reference and saw a regression only on > specjbb2000. The regression was on the > remark pauses (as would be expected) and the largest I saw was about 12%. > Larger than I would have > hoped. The unloading is done serially so maybe things around it got > faster. > > Jon > > > -- ramki >> >> On Thu, Nov 29, 2012 at 9:46 AM, Jon Masamitsu >> >wrote: >> >> This is a change in the default class unloading policy for CMS. >>> >>> 8000325: Change default for CMSClassUnloadingEnabled to true >>> >>> http://cr.openjdk.java.net/~****jmasa/8000325/webrev.00/ >>> >>> > >>> >>> >>> With perm gen removal it becomes important for CMS to unload >>> classes to avoid excessive consumption of native memory for >>> metadata. 
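For reference, the flag in question is declared in src/share/vm/runtime/globals.hpp, and the change under review flips its default to true. A sketch in the usual product-flag style is shown below; the description string is paraphrased and may not match the source verbatim.

  // Sketch only: product flag with the new default value.
  product(bool, CMSClassUnloadingEnabled, true,                              \
          "Whether class unloading is enabled when using CMS GC")            \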
>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Sat Jan 5 00:27:30 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 04 Jan 2013 16:27:30 -0800 Subject: request for review (s) - 8005672: Clean up some changes to GC logging with GCCause's In-Reply-To: <50E751A7.2040202@oracle.com> References: <50E751A7.2040202@oracle.com> Message-ID: <50E77372.2000008@oracle.com> Hi Jon, Changes look good to me. Thank you doing this - the difference was confusing the awk scripts I use. JohnC On 1/4/2013 2:03 PM, Jon Masamitsu wrote: > This is a clean up of some unintended white space changes in > the GC logging output. > > http://cr.openjdk.java.net/~jmasa/8005672/webrev.00/ > > Thanks. > > Jon > From jon.masamitsu at oracle.com Sat Jan 5 00:57:41 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 04 Jan 2013 16:57:41 -0800 Subject: Request for review (xs) - 8000325 In-Reply-To: References: <50B79F8C.7030704@oracle.com> <50E7648F.7030106@oracle.com> Message-ID: <50E77A85.7060701@oracle.com> Yes, the differences were statistically significant. Jon On 1/4/2013 4:10 PM, Srinivas Ramakrishna wrote: > Thanks for the perf data, Jon! I'd say that's not too bad. > > Out of curiosity, was the diff statistically significant per appropriate > T-test? (i've seen remark pauses show some variance depending on > application and tuning.) > > -- ramki > > On Fri, Jan 4, 2013 at 3:23 PM, Jon Masamitsuwrote: > >> >> On 11/29/2012 3:17 PM, Srinivas Ramakrishna wrote: >> >>> Looks good. Would be great to prominently release-note this change when it >>> appears in a public/GA update/release. >>> >>> Out of curiosity, any perf data on CMS pause time diffs from this change >>> with current NPG? >>> >> I ran refworkload server_reference and saw a regression only on >> specjbb2000. The regression was on the >> remark pauses (as would be expected) and the largest I saw was about 12%. >> Larger than I would have >> hoped. The unloading is done serially so maybe things around it got >> faster. >> >> Jon >> >> >> -- ramki >>> On Thu, Nov 29, 2012 at 9:46 AM, Jon Masamitsu >>>> wrote: >>> This is a change in the default class unloading policy for CMS. >>>> 8000325: Change default for CMSClassUnloadingEnabled to true >>>> >>>> http://cr.openjdk.java.net/~****jmasa/8000325/webrev.00/ >>>> >>>> >>>> With perm gen removal it becomes important for CMS to unload >>>> classes to avoid excessive consumption of native memory for >>>> metadata. >>>> >>>> From jon.masamitsu at oracle.com Sat Jan 5 05:31:03 2013 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Sat, 05 Jan 2013 05:31:03 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20130105053109.2C59C47573@hg.openjdk.java.net> Changeset: 6e9174173e00 Author: jmasa Date: 2013-01-04 17:04 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6e9174173e00 8000325: Change default for CMSClassUnloadingEnabled to true Reviewed-by: stefank, ysr ! src/share/vm/runtime/globals.hpp Changeset: 0b54ffe4c2d3 Author: jmasa Date: 2013-01-04 17:04 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0b54ffe4c2d3 8005672: Clean up some changes to GC logging with GCCause's Reviewed-by: johnc, ysr ! src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/parallelScavenge/psYoungGen.cpp ! 
src/share/vm/gc_interface/gcCause.hpp From bengt.rutisson at oracle.com Tue Jan 8 10:16:03 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 08 Jan 2013 11:16:03 +0100 Subject: Request for review (S): 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 Message-ID: <50EBF1E3.4090809@oracle.com> Hi all, Could I have a couple of reviews for this change. The parallel collector needs at least one GC worker thread to function properly . Setting ParallelGCThreads=0 on the command line causes the parallel collector to hang in product builds and assert in debug builds. This has been the case back to JDK6u14. The fix will instead print an error message and exit the VM if ParallelGCThreads=0 is used with UseParallelGC. This is similar to what we recently did for the ParNew collector. http://cr.openjdk.java.net/~brutisso/8005489/webrev.00/ Thanks, Bengt From vitalyd at gmail.com Tue Jan 8 11:20:12 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 8 Jan 2013 06:20:12 -0500 Subject: Request for review (S): 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 In-Reply-To: <50EBF1E3.4090809@oracle.com> References: <50EBF1E3.4090809@oracle.com> Message-ID: Nice and simple - looks good Bengt. Sent from my phone On Jan 8, 2013 5:16 AM, "Bengt Rutisson" wrote: > > Hi all, > > Could I have a couple of reviews for this change. The parallel collector > needs at least one GC worker thread to function properly . Setting > ParallelGCThreads=0 on the command line causes the parallel collector to > hang in product builds and assert in debug builds. This has been the case > back to JDK6u14. > > The fix will instead print an error message and exit the VM if > ParallelGCThreads=0 is used with UseParallelGC. This is similar to what we > recently did for the ParNew collector. > > http://cr.openjdk.java.net/~**brutisso/8005489/webrev.00/ > > Thanks, > Bengt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.helin at oracle.com Tue Jan 8 15:58:37 2013 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 08 Jan 2013 16:58:37 +0100 Subject: Request for review (S): 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 In-Reply-To: <50EBF1E3.4090809@oracle.com> References: <50EBF1E3.4090809@oracle.com> Message-ID: <50EC422D.4010004@oracle.com> Hi Bengt, looks good! Erik On 01/08/2013 11:16 AM, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this change. The parallel collector > needs at least one GC worker thread to function properly . Setting > ParallelGCThreads=0 on the command line causes the parallel collector to > hang in product builds and assert in debug builds. This has been the > case back to JDK6u14. > > The fix will instead print an error message and exit the VM if > ParallelGCThreads=0 is used with UseParallelGC. This is similar to what > we recently did for the ParNew collector. 
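A rough sketch of the kind of check being described, in the style of arguments.cpp. This is illustrative only; the exact condition and error text live in the webrev, while jio_fprintf(), defaultStream::error_stream() and vm_exit() are existing HotSpot facilities:

  // Sketch only: reject ParallelGCThreads=0 when the parallel collector is used.
  if ((UseParallelGC || UseParallelOldGC) && ParallelGCThreads == 0) {
    jio_fprintf(defaultStream::error_stream(),
                "The Parallel GC can not be combined with -XX:ParallelGCThreads=0\n");
    vm_exit(1);
  }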
> > http://cr.openjdk.java.net/~brutisso/8005489/webrev.00/ > > Thanks, > Bengt From john.cuthbertson at oracle.com Tue Jan 8 18:59:28 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 08 Jan 2013 10:59:28 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking Message-ID: <50EC6C90.4060502@oracle.com> Hi Everyone, Can I have a couple of volunteers review the changes for this CR - the webrev can be found at: http://cr.openjdk.java.net/~johnc/8005032/webrev.0/ Summary: The previous serial keep-alive and complete-gc reference processing oop closures used during concurrent marking operated directly on the global marking stack while the parallel closures used the local task queues from the concurrent marking task objects (using the global marking stack as a backing store). Additionally the parallel keep-alive closure also drained the local task queue after processing a given number of references. These changes make the serial reference processing code use the same oop closures as the parallel code. This will reduce the likelihood of hitting a marking stack overflow while processing the discovered references during the remark phase. Testing: GC test suite with a low IHOP and marking verification. Test case for 8004812 (Kitchensink) with a very low marking stack size (4K entries) and heap verification, and both with and without ParallelRefProcEnabled and ParallelGCThreads=0. jprt. Thanks, JohnC From john.cuthbertson at oracle.com Tue Jan 8 22:13:01 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 08 Jan 2013 14:13:01 -0800 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 Message-ID: <50EC99ED.4090903@oracle.com> Hi Everyone, Can I please have a couple of volunteers look over the fix for this CR - the webrev can be found at: http://cr.openjdk.java.net/~johnc/8005875/webrev.0/ Summary: One of the modules in the Kitchensink test generates a VM_PrintThreads vm operation. The JVM crashes when it tries to print out G1's concurrent marking worker threads when ParallelGCThreads=0 because the work gang has not been created. The fix is to add the same check that's used elsewhere in G1's concurrent marking. Testing: Kitchensink with ParallelGCThreads=0 Thanks, JohnC From vitalyd at gmail.com Wed Jan 9 04:37:59 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 8 Jan 2013 23:37:59 -0500 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 In-Reply-To: <50EC99ED.4090903@oracle.com> References: <50EC99ED.4090903@oracle.com> Message-ID: Hi John, What's the advantage of checking parallel marking thread count > 0 rather than checking if parallel workers is not NULL? Is it clearer that way? I'm thinking checking for NULL here (perhaps with a comment on when NULL can happen) may be a bit more robust in case it can be null for some other reason, even if parallel marking thread count is > 0. Looks good though. Thanks Sent from my phone On Jan 8, 2013 5:14 PM, "John Cuthbertson" wrote: > Hi Everyone, > > Can I please have a couple of volunteers look over the fix for this CR - > the webrev can be found at: http://cr.openjdk.java.net/~** > johnc/8005875/webrev.0/ > > Summary: > One of the modules in the Kitchensink test generates a VM_PrintThreads vm > operation. The JVM crashes when it tries to print out G1's concurrent > marking worker threads when ParallelGCThreads=0 because the work gang has > not been created. 
The fix is to add the same check that's used elsewhere in > G1's concurrent marking. > > Testing: > Kitchensink with ParallelGCThreads=0 > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Wed Jan 9 08:45:38 2013 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 09 Jan 2013 09:45:38 +0100 Subject: Request for review (S): 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 In-Reply-To: <50EBF1E3.4090809@oracle.com> References: <50EBF1E3.4090809@oracle.com> Message-ID: <50ED2E32.9040801@oracle.com> Looks good. StefanK On 01/08/2013 11:16 AM, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this change. The parallel > collector needs at least one GC worker thread to function properly . > Setting ParallelGCThreads=0 on the command line causes the parallel > collector to hang in product builds and assert in debug builds. This > has been the case back to JDK6u14. > > The fix will instead print an error message and exit the VM if > ParallelGCThreads=0 is used with UseParallelGC. This is similar to > what we recently did for the ParNew collector. > > http://cr.openjdk.java.net/~brutisso/8005489/webrev.00/ > > Thanks, > Bengt From bengt.rutisson at oracle.com Wed Jan 9 08:53:37 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 09 Jan 2013 09:53:37 +0100 Subject: Request for review (S): 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 In-Reply-To: <50ED2E32.9040801@oracle.com> References: <50EBF1E3.4090809@oracle.com> <50ED2E32.9040801@oracle.com> Message-ID: <50ED3011.8060500@oracle.com> Thanks for the reviews, Vitaly, Erik and Stefan! Pushing this now. Bengt On 1/9/13 9:45 AM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 01/08/2013 11:16 AM, Bengt Rutisson wrote: >> >> Hi all, >> >> Could I have a couple of reviews for this change. The parallel >> collector needs at least one GC worker thread to function properly . >> Setting ParallelGCThreads=0 on the command line causes the parallel >> collector to hang in product builds and assert in debug builds. This >> has been the case back to JDK6u14. >> >> The fix will instead print an error message and exit the VM if >> ParallelGCThreads=0 is used with UseParallelGC. This is similar to >> what we recently did for the ParNew collector. >> >> http://cr.openjdk.java.net/~brutisso/8005489/webrev.00/ >> >> Thanks, >> Bengt > From bengt.rutisson at oracle.com Wed Jan 9 10:37:52 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Wed, 09 Jan 2013 10:37:52 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 Message-ID: <20130109103756.E0BDF47137@hg.openjdk.java.net> Changeset: 4c8bf5e55392 Author: brutisso Date: 2013-01-09 09:48 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4c8bf5e55392 8005489: VM hangs during GC with ParallelGC and ParallelGCThreads=0 Summary: Print an error message and exit the VM if UseParallalGC is combined with ParllelGCThreads==0. Also reviewed by vitalyd at gmail.com. Reviewed-by: stefank, ehelin ! 
src/share/vm/runtime/arguments.cpp From bengt.rutisson at oracle.com Wed Jan 9 15:35:56 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 09 Jan 2013 16:35:56 +0100 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <50EC6C90.4060502@oracle.com> References: <50EC6C90.4060502@oracle.com> Message-ID: <50ED8E5C.4010109@oracle.com> Hi John, Looks good! Thanks for fixing this! A couple of comments: In ConcurrentMark::weakRefsWork() there is this code: 2422 if (rp->processing_is_mt()) { 2423 // Set the degree of MT here. If the discovery is done MT, there 2424 // may have been a different number of threads doing the discovery 2425 // and a different number of discovered lists may have Ref objects. 2426 // That is OK as long as the Reference lists are balanced (see 2427 // balance_all_queues() and balance_queues()). 2428 rp->set_active_mt_degree(active_workers); 2429 } Could we now always call rp->set_active_mt_degree() ? Maybe I am missing the point here, but I thought that we are now using queues and all rp->set_active_mt_degree() does is set the number of queues to active_workers. Which will be 1 for the single threaded mode. If we do that we can also remove the first part of the assert a bit further down: 2449 assert(!rp->processing_is_mt() || rp->num_q() == active_workers, "why not"); But I didn't have time to follow this code properly, so maybe I'm way off here? Also, I think I would like to move the code you added to G1CMParDrainMarkingStackClosure::do_void() into ConcurrentMark::weakRefsWork() somewhere. Maybe something like: if (!rp->processing_is_mt()) { set_phase(1, false /* concurrent */); } It is a bit strange to me that G1CMParDrainMarkingStackClosure should set this up. If we really want to keep it in G1CMParDrainMarkingStackClosure I think the constructor would be a better place to do it than do_void(). Some minor comments: In G1CMParKeepAliveAndDrainClosure and G1CMParDrainMarkingStackClosure constructors there is this assert: assert(_task->worker_id() == 0 || _is_par, "sanity"); I think it is good, but I had to think a bit about what it meant and to me I think it would be quicker to understand if it was the other way around: assert(_is_par || _task->worker_id() == 0, "Only worker 0 should be used if single threaded"); But maybe it is just me... Remove newline on line 2254 ? ConcurrentMark::weakRefsWork() How about introducing a variable that either hold the value of &par_task_executor or NULL depending on rp->processing_is_mt()? That way we don't have to duplicate and inline this test twice: (rp->processing_is_mt() ? &par_task_executor : NULL) As a separate change it might be worth renaming the closures to not have "Par" in the name, now that they are not always parallel... G1CMParKeepAliveAndDrainClosure -> G1CMKeepAliveAndDrainClosure G1CMParDrainMarkingStackClosure -> G1CMDrainMarkingStackClosure But it would be very confusing to do this in the same change. 
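A small sketch of the suggestion above: evaluate the MT test once in ConcurrentMark::weakRefsWork() and reuse the result at both call sites. The identifier names are taken from the mail and the surrounding code is elided.

  // Sketch only: compute the executor pointer once.
  AbstractRefProcTaskExecutor* exec =
      rp->processing_is_mt() ? &par_task_executor : NULL;
  // 'exec' is then passed to both process_discovered_references() and
  // enqueue_discovered_references() instead of repeating the ternary inline.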
Thanks, Bengt On 1/8/13 7:59 PM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers review the changes for this CR - the > webrev can be found at: > http://cr.openjdk.java.net/~johnc/8005032/webrev.0/ > > Summary: > The previous serial keep-alive and complete-gc reference processing > oop closures used during concurrent marking operated directly on the > global marking stack while the parallel closures used the local task > queues from the concurrent marking task objects (using the global > marking stack as a backing store). Additionally the parallel > keep-alive closure also drained the local task queue after processing > a given number of references. > > These changes make the serial reference processing code use the same > oop closures as the parallel code. This will reduce the likelihood of > hitting a marking stack overflow while processing the discovered > references during the remark phase. > > Testing: > GC test suite with a low IHOP and marking verification. > Test case for 8004812 (Kitchensink) with a very low marking stack > size (4K entries) and heap verification, and both with and without > ParallelRefProcEnabled and ParallelGCThreads=0. > jprt. > > Thanks, > > JohnC From john.cuthbertson at oracle.com Thu Jan 10 00:28:17 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 09 Jan 2013 16:28:17 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags Message-ID: <50EE0B21.4000909@oracle.com> Hi Everyone, Can I have a couple of volunteers look of the code changes for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/8001425/webrev.0/ Background: These changes change the default values of: G1MixedGCLiveThresholdPercent from 90 down to 65 This means that we we don't consider regions with a live occupancy > 65% as good candidates for collection during the next mixed GC phase. Evacuating regions that have a high live occupancy can get expensive and a 90% cut off was deemed to be too high. G1HeapWastePercent up from 5% to 10% This means that we are prepared to sacrifice 10% of the heap to avoid really expensive mixed GCs. G1MixedGCCountTarget up from 4 to 8 We can do up to 8 mixed GCs after a marking cycle instead of 4. This should mean that an individual mixed GC is less expensive and we should collect more regions before we reach the regions that are really expensive to collect. And for heaps no more than 4GB: G1NewSizePercent is reduced from 20% to 5%. This value was placing a lower bound on how far we could shrink the young generation. 20% was deemed to be too high. G1MaxNewSizePercent is reduced from 80% to 60%. A value of 80% was allowing the young generation to grow too large increasing the possibility of evacuation failures. These new values have been suggested by the performance team based upon their tuning experiments (including feedback from people the performance team has helped out). Going forward I think some of these values should be set dynamically. For example G1MixedGCLiveThresholdPercent could be tied to the overall heap occupancy. Thanks, JohnC From john.cuthbertson at oracle.com Thu Jan 10 00:41:23 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 09 Jan 2013 16:41:23 -0800 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 In-Reply-To: References: <50EC99ED.4090903@oracle.com> Message-ID: <50EE0E33.8010408@oracle.com> Hi Vitaly, Thanks for looking over the changes. 
AFAICT checking if _parallel_workers is not null is equivalent to checking that the number of parallel marking threads is > 0. I went with the latter check as other references to the parallel workers work gang are guarded by it. I'm not sure why the code was originally written that way but my guess is that, when originally written, the marking threads (like the concurrent refinement threads currently) were not in a work gang. Thanks, JohnC On 1/8/2013 8:37 PM, Vitaly Davidovich wrote: > > Hi John, > > What's the advantage of checking parallel marking thread count > 0 > rather than checking if parallel workers is not NULL? Is it clearer > that way? I'm thinking checking for NULL here (perhaps with a comment > on when NULL can happen) may be a bit more robust in case it can be > null for some other reason, even if parallel marking thread count is > 0. > > Looks good though. > > Thanks > > Sent from my phone > > On Jan 8, 2013 5:14 PM, "John Cuthbertson" > > wrote: > > Hi Everyone, > > Can I please have a couple of volunteers look over the fix for > this CR - the webrev can be found at: > http://cr.openjdk.java.net/~johnc/8005875/webrev.0/ > > > Summary: > One of the modules in the Kitchensink test generates a > VM_PrintThreads vm operation. The JVM crashes when it tries to > print out G1's concurrent marking worker threads when > ParallelGCThreads=0 because the work gang has not been created. > The fix is to add the same check that's used elsewhere in G1's > concurrent marking. > > Testing: > Kitchensink with ParallelGCThreads=0 > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu Jan 10 00:47:05 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 9 Jan 2013 19:47:05 -0500 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 In-Reply-To: <50EE0E33.8010408@oracle.com> References: <50EC99ED.4090903@oracle.com> <50EE0E33.8010408@oracle.com> Message-ID: Hi John, Thanks for the response. Yeah, I figured it's the same thing since it's not null iff # of workers > 0. However, if this relationship is ever broken or perhaps the gang can be set to null at some point even if workers > 0, then this code will segv again. Hence I thought a null guard is a bit better, but it was just a side comment - code looks fine as is. Thanks Sent from my phone On Jan 9, 2013 7:41 PM, "John Cuthbertson" wrote: > Hi Vitaly, > > Thanks for looking over the changes. AFAICT checking if _parallel_workers > is not null is equivalent to checking that the number of parallel marking > threads is > 0. I went with the latter check as other references to the > parallel workers work gang are guarded by it. I'm not sure why the code was > originally written that way but my guess is that, when originally written, > the marking threads (like the concurrent refinement threads currently) were > not in a work gang. > > Thanks, > > JohnC > > On 1/8/2013 8:37 PM, Vitaly Davidovich wrote: > > Hi John, > > What's the advantage of checking parallel marking thread count > 0 rather > than checking if parallel workers is not NULL? Is it clearer that way? I'm > thinking checking for NULL here (perhaps with a comment on when NULL can > happen) may be a bit more robust in case it can be null for some other > reason, even if parallel marking thread count is > 0. > > Looks good though. 
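For concreteness, the two guards being compared in this exchange look roughly like the sketch below. The field and accessor names (_parallel_workers, parallel_marking_threads()) are the ones mentioned in the mails; the printing call and the 'st' parameter are illustrative:

  // Guard used by the fix: rely on the configured marking thread count.
  if (parallel_marking_threads() > 0) {
    _parallel_workers->print_worker_threads_on(st);
  }

  // Alternative raised in review: guard directly on the work gang pointer.
  if (_parallel_workers != NULL) {
    _parallel_workers->print_worker_threads_on(st);
  }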
> > Thanks > > Sent from my phone > On Jan 8, 2013 5:14 PM, "John Cuthbertson" > wrote: > >> Hi Everyone, >> >> Can I please have a couple of volunteers look over the fix for this CR - >> the webrev can be found at: >> http://cr.openjdk.java.net/~johnc/8005875/webrev.0/ >> >> Summary: >> One of the modules in the Kitchensink test generates a VM_PrintThreads vm >> operation. The JVM crashes when it tries to print out G1's concurrent >> marking worker threads when ParallelGCThreads=0 because the work gang has >> not been created. The fix is to add the same check that's used elsewhere in >> G1's concurrent marking. >> >> Testing: >> Kitchensink with ParallelGCThreads=0 >> >> Thanks, >> >> JohnC >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tao.mao at oracle.com Thu Jan 10 01:01:04 2013 From: tao.mao at oracle.com (Tao Mao) Date: Wed, 09 Jan 2013 17:01:04 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50EE0B21.4000909@oracle.com> References: <50EE0B21.4000909@oracle.com> Message-ID: <50EE12D0.5070906@oracle.com> Hi, I don't know these specific performance tunings. I have another concern (maybe not that related to the change). It seems the common practice to store globals is to define them directly in share/vm/runtime/globals.hpp although I know there are a lot of other global definition files for compilers and platforms. However, within GC code, I don't see other GC code does so. Please consider if it is motivated enough to move these definitions to globals.hpp in order to keep the code lean. Correct me if I've got wrong. Thanks. Tao On 1/9/2013 4:28 PM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers look of the code changes for this > CR? The webrev can be found at: > http://cr.openjdk.java.net/~johnc/8001425/webrev.0/ > > Background: > These changes change the default values of: > > G1MixedGCLiveThresholdPercent from 90 down to 65 > This means that we we don't consider regions with a live occupancy > > 65% as good candidates for collection during the next mixed GC phase. > Evacuating regions that have a high live occupancy can get expensive > and a 90% cut off was deemed to be too high. > > G1HeapWastePercent up from 5% to 10% > This means that we are prepared to sacrifice 10% of the heap to avoid > really expensive mixed GCs. > > G1MixedGCCountTarget up from 4 to 8 > We can do up to 8 mixed GCs after a marking cycle instead of 4. This > should mean that an individual mixed GC is less expensive and we > should collect more regions before we reach the regions that are > really expensive to collect. > > And for heaps no more than 4GB: > > G1NewSizePercent is reduced from 20% to 5%. This value was placing a > lower bound on how far we could shrink the young generation. 20% was > deemed to be too high. > > G1MaxNewSizePercent is reduced from 80% to 60%. A value of 80% was > allowing the young generation to grow too large increasing the > possibility of evacuation failures. > > These new values have been suggested by the performance team based > upon their tuning experiments (including feedback from people the > performance team has helped out). > > Going forward I think some of these values should be set dynamically. > For example G1MixedGCLiveThresholdPercent could be tied to the overall > heap occupancy. 
> > Thanks, > > JohnC From jon.masamitsu at oracle.com Thu Jan 10 04:48:09 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 09 Jan 2013 20:48:09 -0800 Subject: request for review (s): 8004895: NPG: JMapPermCore test failure caused by warnings about missing field Message-ID: <50EE4809.3030300@oracle.com> 8004895: NPG: JMapPermCore test failure caused by warnings about missing field http://cr.openjdk.java.net/~jmasa/8004895/webrev.00 Fixed some declarations in vmStructs_cms.hpp and added a missing declaration for CompactibleFreeListSpace::_dictionary. Moved the typedef for AFLBinaryTreeDictionary to binaryTreeDictionary.hpp to allow use in some additional declarations. From bengt.rutisson at oracle.com Thu Jan 10 09:30:00 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 10 Jan 2013 10:30:00 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred Message-ID: <50EE8A18.5070004@oracle.com> Hi everyone, Could I have a couple of reviews for this small change to make DefNew and ParNew be more consistent in the way they treat the tenuring threshold: http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ Thanks, Bengt From bengt.rutisson at oracle.com Thu Jan 10 10:07:20 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 10 Jan 2013 11:07:20 +0100 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50EE0B21.4000909@oracle.com> References: <50EE0B21.4000909@oracle.com> Message-ID: <50EE92D8.8050903@oracle.com> Hi John, Changes look good. One question about G1NewSizePercent and G1MaxNewSizePercent. Why are these only changed for heap sizes below 4GB? I would think that at least the reduction of G1NewSizePercent would be even more important for larger heap sizes. If we want to get lower pause times on larger heaps we need to be able to have a small young gen size. Also, your change in arguments.cpp is guarded by #ifndef SERIALGC. This is correct of course, but Joe Provino has a change out that will replace this kind of check with #if INCLUDE_ALL_GCS: http://cr.openjdk.java.net/~jprovino/8005915/webrev.00 Neither you nor Joe will get any merge conflicts if your changes are both pushed. It will even still compile. But the code inside #ifndef SERIALGC will never be executed. So, it might be good to keep any eye out for how Joe's change propagate through the repositories to make sure that you can manually resolve this. My guess is that Joe's change will have to wait a while since it includes make file changes that potentially interfere with changes for the new build system. So, hopefully you get to push this first :) Bengt On 1/10/13 1:28 AM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers look of the code changes for this > CR? The webrev can be found at: > http://cr.openjdk.java.net/~johnc/8001425/webrev.0/ > > Background: > These changes change the default values of: > > G1MixedGCLiveThresholdPercent from 90 down to 65 > This means that we we don't consider regions with a live occupancy > > 65% as good candidates for collection during the next mixed GC phase. > Evacuating regions that have a high live occupancy can get expensive > and a 90% cut off was deemed to be too high. > > G1HeapWastePercent up from 5% to 10% > This means that we are prepared to sacrifice 10% of the heap to avoid > really expensive mixed GCs. 
> > G1MixedGCCountTarget up from 4 to 8 > We can do up to 8 mixed GCs after a marking cycle instead of 4. This > should mean that an individual mixed GC is less expensive and we > should collect more regions before we reach the regions that are > really expensive to collect. > > And for heaps no more than 4GB: > > G1NewSizePercent is reduced from 20% to 5%. This value was placing a > lower bound on how far we could shrink the young generation. 20% was > deemed to be too high. > > G1MaxNewSizePercent is reduced from 80% to 60%. A value of 80% was > allowing the young generation to grow too large increasing the > possibility of evacuation failures. > > These new values have been suggested by the performance team based > upon their tuning experiments (including feedback from people the > performance team has helped out). > > Going forward I think some of these values should be set dynamically. > For example G1MixedGCLiveThresholdPercent could be tied to the overall > heap occupancy. > > Thanks, > > JohnC From jesper.wilhelmsson at oracle.com Thu Jan 10 10:16:14 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Thu, 10 Jan 2013 11:16:14 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50EE8A18.5070004@oracle.com> References: <50EE8A18.5070004@oracle.com> Message-ID: <50EE94EE.2070700@oracle.com> Looks good. Ship it! /Jesper On 10/1/13 10:30 AM, Bengt Rutisson wrote: > > Hi everyone, > > Could I have a couple of reviews for this small change to make DefNew > and ParNew be more consistent in the way they treat the tenuring > threshold: > > http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ > > Thanks, > Bengt From john.cuthbertson at oracle.com Thu Jan 10 18:10:01 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 10 Jan 2013 10:10:01 -0800 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50EE8A18.5070004@oracle.com> References: <50EE8A18.5070004@oracle.com> Message-ID: <50EF03F9.2080200@oracle.com> Hi Bengt, This looks good to me. We may want a CR to make G1 follow a similar model. Currently G1 updates its tenuring threshold in G1CollectorPolicy:record_collection_pause_start() (via update_survivors_policy()). JohnC On 1/10/2013 1:30 AM, Bengt Rutisson wrote: > > Hi everyone, > > Could I have a couple of reviews for this small change to make DefNew > and ParNew be more consistent in the way they treat the tenuring > threshold: > > http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ > > Thanks, > Bengt From john.cuthbertson at oracle.com Thu Jan 10 18:47:38 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 10 Jan 2013 10:47:38 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50EE92D8.8050903@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> Message-ID: <50EF0CCA.8070005@oracle.com> Hi Bengt, Thanks for reviewing the code. Replies inline... On 1/10/2013 2:07 AM, Bengt Rutisson wrote: > > Hi John, > > Changes look good. > > One question about G1NewSizePercent and G1MaxNewSizePercent. Why are > these only changed for heap sizes below 4GB? I would think that at > least the reduction of G1NewSizePercent would be even more important > for larger heap sizes. 
If we want to get lower pause times on larger > heaps we need to be able to have a small young gen size. The simple answer is: it was suggested by Monica and Charlie. Personally I'm OK with making the new values of G1NewSizePercent and G1MaxNewSizePercent the defaults for all heap sizes and we might (or most likely will) go there in the future - but for the moment we're being conservative. As I mentioned we would like to make G1 a bit more adaptive - and both Monica and Charlie have some ideas in that area. > > Also, your change in arguments.cpp is guarded by #ifndef SERIALGC. > This is correct of course, but Joe Provino has a change out that will > replace this kind of check with #if INCLUDE_ALL_GCS: > > http://cr.openjdk.java.net/~jprovino/8005915/webrev.00 > > Neither you nor Joe will get any merge conflicts if your changes are > both pushed. It will even still compile. But the code inside #ifndef > SERIALGC will never be executed. So, it might be good to keep any eye > out for how Joe's change propagate through the repositories to make > sure that you can manually resolve this. > > My guess is that Joe's change will have to wait a while since it > includes make file changes that potentially interfere with changes for > the new build system. So, hopefully you get to push this first :) Thanks. I've been watching the progress that Joe's change has been making. I guess you can think of this change as the one for hs24 and another with the SERIALGC changed appropriately being for hs25. :) Thanks, JohnC From john.cuthbertson at oracle.com Thu Jan 10 19:46:13 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 10 Jan 2013 11:46:13 -0800 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: References: <508EB0D7.8020204@oracle.com> <50C108C2.9@oracle.com> Message-ID: <50EF1A85.4010203@oracle.com> Hi Michal, Many apologies for the delay in generating a new webrev for this change but here is the new one: http://cr.openjdk.java.net/~johnc/7189971/webrev.1/ Can you verify the webrev to make sure that changes have been applied correctly? Looking at the new webrev it seems that the setting of the CMS has been moved back above the return out of the loop. Was this intentional? I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 and CMSWaitDuration=1500 with CMS. Regards, JohnC On 12/12/2012 4:35 AM, Michal Frajt wrote: > All, > > Find the attached patch. It implements proposed recommendations and requested changes. Please mind that the CMSWaitDuration set to -1 (never wait) requires new parameter CMSCheckInterval (develop only, 1000 milliseconds default - constant). The parameter defines the next CMS cycle start check interval in the case there are no desynchronization (notifications) events on the CGC_lock. > > Tested with the Solaris/amd64 build > > CMS > + CMSWaitDuration>0 OK > + CMSWaitDuration=0 OK > + CMSWaitDuration<0 OK > > iCMS > + CMSWaitDuration>0 OK > + CMSWaitDuration=0 OK > + CMSWaitDuration<0 OK > > Regards, > Michal > > > Od: hotspot-gc-dev-bounces at openjdk.java.net > Komu: hotspot-gc-dev at openjdk.java.net > Kopie: > Datum: Fri, 7 Dec 2012 18:48:48 +0100 > P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS > >> Hi John/Jon/Ramki, >> >> All proposed recommendations and requested changes have been implemented. We are going to test it on Monday. You will get the new tested patch soon. 
>> >> The attached code here just got compiled, no test executed yet, it might contain a bug, but you can quickly review it and send your comments. >> >> Best regards >> Michal >> >> >> // Wait until the next synchronous GC, a concurrent full gc request, >> // or a timeout, whichever is earlier. >> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long t_millis) { >> // Wait time in millis or 0 value representing infinite wait for a scavenge >> assert(t_millis >= 0, "Wait time for scavenge should be 0 or positive"); >> >> GenCollectedHeap* gch = GenCollectedHeap::heap(); >> double start_time_secs = os::elapsedTime(); >> double end_time_secs = start_time_secs + (t_millis / ((double) MILLIUNITS)); >> >> // Total collections count before waiting loop >> unsigned int before_count; >> { >> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >> before_count = gch->total_collections(); >> } >> >> unsigned int loop_count = 0; >> >> while(!_should_terminate) { >> double now_time = os::elapsedTime(); >> long wait_time_millis; >> >> if(t_millis != 0) { >> // New wait limit >> wait_time_millis = (long) ((end_time_secs - now_time) * MILLIUNITS); >> if(wait_time_millis <= 0) { >> // Wait time is over >> break; >> } >> } else { >> // No wait limit, wait if necessary forever >> wait_time_millis = 0; >> } >> >> // Wait until the next event or the remaining timeout >> { >> MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag); >> >> set_CMS_flag(CMS_cms_wants_token); // to provoke notifies >> if (_should_terminate || _collector->_full_gc_requested) { >> return; >> } >> assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); >> CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis); >> clear_CMS_flag(CMS_cms_wants_token); >> assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token), >> "Should not be set"); >> } >> >> // Extra wait time check before entering the heap lock to get the collection count >> if(t_millis != 0 && os::elapsedTime() >= end_time_secs) { >> // Wait time is over >> break; >> } >> >> // Total collections count after the event >> unsigned int after_count; >> { >> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >> after_count = gch->total_collections(); >> } >> >> if(before_count != after_count) { >> // There was a collection - success >> break; >> } >> >> // Too many loops warning >> if(++loop_count == 0) { >> warning("wait_on_cms_lock_for_scavenge() has looped %d times", loop_count - 1); >> } >> } >> } >> >> void ConcurrentMarkSweepThread::sleepBeforeNextCycle() { >> while (!_should_terminate) { >> if (CMSIncrementalMode) { >> icms_wait(); >> if(CMSWaitDuration >= 0) { >> // Wait until the next synchronous GC, a concurrent full gc >> // request or a timeout, whichever is earlier. >> wait_on_cms_lock_for_scavenge(CMSWaitDuration); >> } >> return; >> } else { >> if(CMSWaitDuration >= 0) { >> // Wait until the next synchronous GC, a concurrent full gc >> // request or a timeout, whichever is earlier. >> wait_on_cms_lock_for_scavenge(CMSWaitDuration); >> } else { >> // Wait until any cms_lock event not to call shouldConcurrentCollect permanently >> wait_on_cms_lock(0); >> } >> } >> // Check if we should start a CMS collection cycle >> if (_collector->shouldConcurrentCollect()) { >> return; >> } >> // .. 
collection criterion not yet met, let's go back >> // and wait some more >> } >> } >> >> >> >> Od: hotspot-gc-dev-bounces at openjdk.java.net >> Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" john.cuthbertson at oracle.com >> Kopie: hotspot-gc-dev at openjdk.java.net >> Datum: Thu, 6 Dec 2012 23:43:29 -0800 >> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS >> >>> Hi John -- >>> >>> wrt the changes posted, i see the intent of the code and agree with >>> it. I have a few minor suggestions on the >>> details of how it's implemented. My comments are inline below, >>> interleaved with the code: >>> >>> 317 // Wait until the next synchronous GC, a concurrent full gc request, >>> 318 // or a timeout, whichever is earlier. >>> 319 void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long >>> t_millis) { >>> 320 // Wait for any cms_lock event when timeout not specified (0 millis) >>> 321 if (t_millis == 0) { >>> 322 wait_on_cms_lock(t_millis); >>> 323 return; >>> 324 } >>> >>> I'd completely avoid the special case above because it would miss the >>> part about waiting for a >>> scavenge, instead dealing with that case in the code in the loop below >>> directly. The idea >>> of the "0" value is not to ask that we return immediately, but that we >>> wait, if necessary >>> forever, for a scavenge. The "0" really represents the value infinity >>> in that sense. This would >>> be in keeping with our use of wait() with a "0" value for timeout at >>> other places in the JVM as >>> well, so it's consistent. >>> >>> 325 >>> 326 GenCollectedHeap* gch = GenCollectedHeap::heap(); >>> 327 double start_time = os::elapsedTime(); >>> 328 double end_time = start_time + (t_millis / 1000.0); >>> >>> Note how, the end_time == start_time for the special case of t_millis >>> == 0, so we need to treat that >>> case specially below. >>> >>> 329 >>> 330 // Total collections count before waiting loop >>> 331 unsigned int before_count; >>> 332 { >>> 333 MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >>> 334 before_count = gch->total_collections(); >>> 335 } >>> >>> Good. 
>>> >>> 336 >>> 337 while (true) { >>> 338 double now_time = os::elapsedTime(); >>> 339 long wait_time_millis = (long)((end_time - now_time) * 1000.0); >>> 340 >>> 341 if (wait_time_millis <= 0) { >>> 342 // Wait time is over >>> 343 break; >>> 344 } >>> >>> Modify to: >>> if (t_millis != 0) { >>> if (wait_time_millis <= 0) { >>> // Wait time is over >>> break; >>> } >>> } else { >>> wait_time_millis = 0; // for use in wait() below >>> } >>> >>> 345 >>> 346 // Wait until the next event or the remaining timeout >>> 347 { >>> 348 MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag); >>> 349 if (_should_terminate || _collector->_full_gc_requested) { >>> 350 return; >>> 351 } >>> 352 set_CMS_flag(CMS_cms_wants_token); // to provoke notifies >>> >>> insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); >>> >>> 353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, wait_time_millis); >>> 354 clear_CMS_flag(CMS_cms_wants_token); >>> 355 assert(!CMS_flag_is_set(CMS_cms_has_token | CMS_cms_wants_token), >>> 356 "Should not be set"); >>> 357 } >>> 358 >>> 359 // Extra wait time check before entering the heap lock to get >>> the collection count >>> 360 if (os::elapsedTime() >= end_time) { >>> 361 // Wait time is over >>> 362 break; >>> 363 } >>> >>> Modify above wait time check to make an exception for t_miliis == 0: >>> // Extra wait time check before checking collection count >>> if (t_millis != 0 && os::elapsedTime() >= end_time) { >>> // wait time exceeded >>> break; >>> } >>> >>> 364 >>> 365 // Total collections count after the event >>> 366 unsigned int after_count; >>> 367 { >>> 368 MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >>> 369 after_count = gch->total_collections(); >>> 370 } >>> 371 >>> 372 if (before_count != after_count) { >>> 373 // There was a collection - success >>> 374 break; >>> 375 } >>> 376 } >>> 377 } >>> >>> While it is true that we do not have a case where the method is called >>> with a time of "0", I think we >>> want that value to be treated correctly as "infinity". For the case >>> where we do not want a wait at all, >>> we should use a small positive value, like "1 ms" to signal that >>> intent, i.e. -XX:CMSWaitDuration=1, >>> reserving CMSWaitDuration=0 to signal infinity. (We could also do that >>> by reserving negative values to >>> signal infinity, but that would make the code in the loop a bit more fiddly.) >>> >>> As mentioned in my previous email, I'd like to see this tested with >>> CMSWaitDuration set to 0, positive and >>> negative values (if necessary, we can reject negative value settings), >>> and with ExplicitGCInvokesConcurrent. >>> >>> Rest looks OK to me, although I am not sure how this behaves with >>> iCMS, as I have forgotten that part of the >>> code. >>> >>> Finally, in current code (before these changes) there are two callers >>> of the former wait_for_cms_lock() method, >>> one here in sleepBeforeNextCycle() and one from the precleaning loop. >>> I think the right thing has been done >>> in terms of leaving the latter alone. >>> >>> It would be good if this were checked with CMSInitiatingOccupancy set >>> to 0 (or a small value), CMSWaitDuration set to 0, >>> -+PromotionFailureALot and checking that (1) it does not deadlock (2) >>> CMS cycles start very soon after the end of >>> a scavenge (and not at random times as Michal has observed earlier, >>> although i am guessing that is difficult to test). >>> It would be good to repeat the above test with iCMS as well. >>> >>> thanks! 
>>> -- ramki >>> >>> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote: >>>> Thanks Jon for the pointer: >>>> >>>> >>>> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote: >>>>> >>>>> >>>>> On 12/05/12 14:47, Srinivas Ramakrishna wrote: >>>>>> The high level idea looks correct. I'll look at the details in a bit (seriously this time; sorry it dropped off my plate last time I promised). >>>>>> Does anyone have a pointer to the related discussion thread on this aias from earlier in the year, by chance, so one could refresh one's >>>>>> memory of that discussion? >>>>> >>>>> subj: CMSWaitDuration unstable behavior >>>>> >>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html >>>>> >>>>> >>>> also: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html >>>> >>>> On to it later this afternoon, and TTYL w/review. >>>> - ramki From john.cuthbertson at oracle.com Thu Jan 10 20:20:28 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 10 Jan 2013 12:20:28 -0800 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: <50EF1A85.4010203@oracle.com> References: <508EB0D7.8020204@oracle.com> <50C108C2.9@oracle.com> <50EF1A85.4010203@oracle.com> Message-ID: <50EF228C.3030009@oracle.com> Hi Michal, On 1/10/2013 11:46 AM, John Cuthbertson wrote: > Hi Michal, > > Many apologies for the delay in generating a new webrev for this > change but here is the new one: > http://cr.openjdk.java.net/~johnc/7189971/webrev.1/ > > Can you verify the webrev to make sure that changes have been applied > correctly? Looking at the new webrev it seems that the setting of the > CMS has been moved back above the return out of the loop. Was this > intentional? The above should be "... setting of the CMS token has been ...". JohnC > > I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 > and CMSWaitDuration=1500 with CMS. > > Regards, > > JohnC > > On 12/12/2012 4:35 AM, Michal Frajt wrote: >> All, >> Find the attached patch. It implements proposed recommendations and >> requested changes. Please mind that the CMSWaitDuration set to -1 >> (never wait) requires new parameter CMSCheckInterval (develop only, >> 1000 milliseconds default - constant). The parameter defines the >> next CMS cycle start check interval in the case there are no >> desynchronization (notifications) events on the CGC_lock. >> >> Tested with the Solaris/amd64 build >> CMS >> + CMSWaitDuration>0 OK >> + CMSWaitDuration=0 OK >> + CMSWaitDuration<0 OK >> iCMS >> + CMSWaitDuration>0 OK >> + CMSWaitDuration=0 OK >> + CMSWaitDuration<0 OK >> Regards, >> Michal >> Od: hotspot-gc-dev-bounces at openjdk.java.net >> Komu: hotspot-gc-dev at openjdk.java.net >> Kopie: >> Datum: Fri, 7 Dec 2012 18:48:48 +0100 >> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for >> non-incremental mode of CMS >> >>> Hi John/Jon/Ramki, >>> >>> All proposed recommendations and requested changes have been >>> implemented. We are going to test it on Monday. You will get the new >>> tested patch soon. >>> >>> The attached code here just got compiled, no test executed yet, it >>> might contain a bug, but you can quickly review it and send your >>> comments. >>> >>> Best regards >>> Michal >>> >>> >>> // Wait until the next synchronous GC, a concurrent full gc request, >>> // or a timeout, whichever is earlier. 
>>> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long >>> t_millis) { >>> // Wait time in millis or 0 value representing infinite wait for >>> a scavenge >>> assert(t_millis >= 0, "Wait time for scavenge should be 0 or >>> positive"); >>> >>> GenCollectedHeap* gch = GenCollectedHeap::heap(); >>> double start_time_secs = os::elapsedTime(); >>> double end_time_secs = start_time_secs + (t_millis / ((double) >>> MILLIUNITS)); >>> >>> // Total collections count before waiting loop >>> unsigned int before_count; >>> { >>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >>> before_count = gch->total_collections(); >>> } >>> >>> unsigned int loop_count = 0; >>> >>> while(!_should_terminate) { >>> double now_time = os::elapsedTime(); >>> long wait_time_millis; >>> >>> if(t_millis != 0) { >>> // New wait limit >>> wait_time_millis = (long) ((end_time_secs - now_time) * >>> MILLIUNITS); >>> if(wait_time_millis <= 0) { >>> // Wait time is over >>> break; >>> } >>> } else { >>> // No wait limit, wait if necessary forever >>> wait_time_millis = 0; >>> } >>> >>> // Wait until the next event or the remaining timeout >>> { >>> MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag); >>> >>> set_CMS_flag(CMS_cms_wants_token); // to provoke notifies >>> if (_should_terminate || _collector->_full_gc_requested) { >>> return; >>> } >>> assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); >>> CGC_lock->wait(Mutex::_no_safepoint_check_flag, >>> wait_time_millis); >>> clear_CMS_flag(CMS_cms_wants_token); >>> assert(!CMS_flag_is_set(CMS_cms_has_token | >>> CMS_cms_wants_token), >>> "Should not be set"); >>> } >>> >>> // Extra wait time check before entering the heap lock to get >>> the collection count >>> if(t_millis != 0 && os::elapsedTime() >= end_time_secs) { >>> // Wait time is over >>> break; >>> } >>> >>> // Total collections count after the event >>> unsigned int after_count; >>> { >>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >>> after_count = gch->total_collections(); >>> } >>> >>> if(before_count != after_count) { >>> // There was a collection - success >>> break; >>> } >>> >>> // Too many loops warning >>> if(++loop_count == 0) { >>> warning("wait_on_cms_lock_for_scavenge() has looped %d >>> times", loop_count - 1); >>> } >>> } >>> } >>> >>> void ConcurrentMarkSweepThread::sleepBeforeNextCycle() { >>> while (!_should_terminate) { >>> if (CMSIncrementalMode) { >>> icms_wait(); >>> if(CMSWaitDuration >= 0) { >>> // Wait until the next synchronous GC, a concurrent full gc >>> // request or a timeout, whichever is earlier. >>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); >>> } >>> return; >>> } else { >>> if(CMSWaitDuration >= 0) { >>> // Wait until the next synchronous GC, a concurrent full gc >>> // request or a timeout, whichever is earlier. >>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); >>> } else { >>> // Wait until any cms_lock event not to call >>> shouldConcurrentCollect permanently >>> wait_on_cms_lock(0); >>> } >>> } >>> // Check if we should start a CMS collection cycle >>> if (_collector->shouldConcurrentCollect()) { >>> return; >>> } >>> // .. 
collection criterion not yet met, let's go back >>> // and wait some more >>> } >>> } >>> >>> Od: hotspot-gc-dev-bounces at openjdk.java.net >>> Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" >>> john.cuthbertson at oracle.com >>> Kopie: hotspot-gc-dev at openjdk.java.net >>> Datum: Thu, 6 Dec 2012 23:43:29 -0800 >>> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for >>> non-incremental mode of CMS >>> >>>> Hi John -- >>>> >>>> wrt the changes posted, i see the intent of the code and agree with >>>> it. I have a few minor suggestions on the >>>> details of how it's implemented. My comments are inline below, >>>> interleaved with the code: >>>> >>>> 317 // Wait until the next synchronous GC, a concurrent full gc >>>> request, >>>> 318 // or a timeout, whichever is earlier. >>>> 319 void >>>> ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long >>>> t_millis) { >>>> 320 // Wait for any cms_lock event when timeout not specified >>>> (0 millis) >>>> 321 if (t_millis == 0) { >>>> 322 wait_on_cms_lock(t_millis); >>>> 323 return; >>>> 324 } >>>> >>>> I'd completely avoid the special case above because it would miss the >>>> part about waiting for a >>>> scavenge, instead dealing with that case in the code in the loop below >>>> directly. The idea >>>> of the "0" value is not to ask that we return immediately, but that we >>>> wait, if necessary >>>> forever, for a scavenge. The "0" really represents the value infinity >>>> in that sense. This would >>>> be in keeping with our use of wait() with a "0" value for timeout at >>>> other places in the JVM as >>>> well, so it's consistent. >>>> >>>> 325 >>>> 326 GenCollectedHeap* gch = GenCollectedHeap::heap(); >>>> 327 double start_time = os::elapsedTime(); >>>> 328 double end_time = start_time + (t_millis / 1000.0); >>>> >>>> Note how, the end_time == start_time for the special case of t_millis >>>> == 0, so we need to treat that >>>> case specially below. >>>> >>>> 329 >>>> 330 // Total collections count before waiting loop >>>> 331 unsigned int before_count; >>>> 332 { >>>> 333 MutexLockerEx hl(Heap_lock, >>>> Mutex::_no_safepoint_check_flag); >>>> 334 before_count = gch->total_collections(); >>>> 335 } >>>> >>>> Good. 
>>>> >>>> 336 >>>> 337 while (true) { >>>> 338 double now_time = os::elapsedTime(); >>>> 339 long wait_time_millis = (long)((end_time - now_time) * >>>> 1000.0); >>>> 340 >>>> 341 if (wait_time_millis <= 0) { >>>> 342 // Wait time is over >>>> 343 break; >>>> 344 } >>>> >>>> Modify to: >>>> if (t_millis != 0) { >>>> if (wait_time_millis <= 0) { >>>> // Wait time is over >>>> break; >>>> } >>>> } else { >>>> wait_time_millis = 0; // for use in wait() below >>>> } >>>> >>>> 345 >>>> 346 // Wait until the next event or the remaining timeout >>>> 347 { >>>> 348 MutexLockerEx x(CGC_lock, >>>> Mutex::_no_safepoint_check_flag); >>>> 349 if (_should_terminate || _collector->_full_gc_requested) { >>>> 350 return; >>>> 351 } >>>> 352 set_CMS_flag(CMS_cms_wants_token); // to provoke >>>> notifies >>>> >>>> insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); >>>> >>>> 353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, >>>> wait_time_millis); >>>> 354 clear_CMS_flag(CMS_cms_wants_token); >>>> 355 assert(!CMS_flag_is_set(CMS_cms_has_token | >>>> CMS_cms_wants_token), >>>> 356 "Should not be set"); >>>> 357 } >>>> 358 >>>> 359 // Extra wait time check before entering the heap lock to >>>> get >>>> the collection count >>>> 360 if (os::elapsedTime() >= end_time) { >>>> 361 // Wait time is over >>>> 362 break; >>>> 363 } >>>> >>>> Modify above wait time check to make an exception for t_miliis == 0: >>>> // Extra wait time check before checking collection count >>>> if (t_millis != 0 && os::elapsedTime() >= end_time) { >>>> // wait time exceeded >>>> break; >>>> } >>>> >>>> 364 >>>> 365 // Total collections count after the event >>>> 366 unsigned int after_count; >>>> 367 { >>>> 368 MutexLockerEx hl(Heap_lock, >>>> Mutex::_no_safepoint_check_flag); >>>> 369 after_count = gch->total_collections(); >>>> 370 } >>>> 371 >>>> 372 if (before_count != after_count) { >>>> 373 // There was a collection - success >>>> 374 break; >>>> 375 } >>>> 376 } >>>> 377 } >>>> >>>> While it is true that we do not have a case where the method is called >>>> with a time of "0", I think we >>>> want that value to be treated correctly as "infinity". For the case >>>> where we do not want a wait at all, >>>> we should use a small positive value, like "1 ms" to signal that >>>> intent, i.e. -XX:CMSWaitDuration=1, >>>> reserving CMSWaitDuration=0 to signal infinity. (We could also do that >>>> by reserving negative values to >>>> signal infinity, but that would make the code in the loop a bit >>>> more fiddly.) >>>> >>>> As mentioned in my previous email, I'd like to see this tested with >>>> CMSWaitDuration set to 0, positive and >>>> negative values (if necessary, we can reject negative value settings), >>>> and with ExplicitGCInvokesConcurrent. >>>> >>>> Rest looks OK to me, although I am not sure how this behaves with >>>> iCMS, as I have forgotten that part of the >>>> code. >>>> >>>> Finally, in current code (before these changes) there are two callers >>>> of the former wait_for_cms_lock() method, >>>> one here in sleepBeforeNextCycle() and one from the precleaning loop. >>>> I think the right thing has been done >>>> in terms of leaving the latter alone. 
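(The timed wait-for-scavenge protocol under review reduces to a small pattern. The following is a self-contained C++11 sketch of that pattern with invented names, treating a zero timeout as "wait indefinitely"; it is an illustration only, not the HotSpot code.)

#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex heap_mutex;               // stands in for Heap_lock / CGC_lock
std::condition_variable heap_cv;     // the "GC side" would notify this after
unsigned total_collections = 0;      // bumping the count under heap_mutex

// Wait until a collection has happened or t_millis have elapsed.
// t_millis == 0 means no timeout: wait, if necessary forever, for a GC.
// (The real loop must also bail out on termination and full-GC requests.)
void wait_for_collection(long t_millis) {
  std::unique_lock<std::mutex> lock(heap_mutex);
  const unsigned before = total_collections;
  auto gc_happened = [&] { return total_collections != before; };
  if (t_millis == 0) {
    heap_cv.wait(lock, gc_happened);
  } else {
    heap_cv.wait_for(lock, std::chrono::milliseconds(t_millis), gc_happened);
  }
}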
>>>> >>>> It would be good if this were checked with CMSInitiatingOccupancy set >>>> to 0 (or a small value), CMSWaitDuration set to 0, >>>> -+PromotionFailureALot and checking that (1) it does not deadlock (2) >>>> CMS cycles start very soon after the end of >>>> a scavenge (and not at random times as Michal has observed earlier, >>>> although i am guessing that is difficult to test). >>>> It would be good to repeat the above test with iCMS as well. >>>> >>>> thanks! >>>> -- ramki >>>> >>>> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote: >>>>> Thanks Jon for the pointer: >>>>> >>>>> >>>>> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote: >>>>>> >>>>>> >>>>>> On 12/05/12 14:47, Srinivas Ramakrishna wrote: >>>>>>> The high level idea looks correct. I'll look at the details in a >>>>>>> bit (seriously this time; sorry it dropped off my plate last >>>>>>> time I promised). >>>>>>> Does anyone have a pointer to the related discussion thread on >>>>>>> this aias from earlier in the year, by chance, so one could >>>>>>> refresh one's >>>>>>> memory of that discussion? >>>>>> >>>>>> subj: CMSWaitDuration unstable behavior >>>>>> >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html >>>>>> >>>>>> >>>>>> >>>>> also: >>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html >>>>> >>>>> On to it later this afternoon, and TTYL w/review. >>>>> - ramki > From ysr1729 at gmail.com Thu Jan 10 20:28:43 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 10 Jan 2013 12:28:43 -0800 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50EE8A18.5070004@oracle.com> References: <50EE8A18.5070004@oracle.com> Message-ID: Hi Bengt -- The change looks reasonable, but I have a comment and a follow-up question. Not your change, but I'd elide the "half the real survivor size" since it's really a configurable parameter based on TargetSurvivorRatio with default half. I'd leave the comment as "set the new tenuring threshold and desired survivor size". I'm curious though, as to what performance data prompted this change, and whether it might make sense, upon a promotion failure to do something about the tenuring threshold for the next scavenge (i.e. for example make the tenuring threshold half of its current value as a reaction to the fact that promotion failed). Is it currently left at its previous value or is it asjusted back to the default max value (which latter may be the wrong thing to do) or something else? -- ramki On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson wrote: > > Hi everyone, > > Could I have a couple of reviews for this small change to make DefNew and > ParNew be more consistent in the way they treat the tenuring threshold: > > http://cr.openjdk.java.net/~**brutisso/8005972/webrev.00/ > > Thanks, > Bengt > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bernd-2013 at eckenfels.net Thu Jan 10 21:02:05 2013 From: bernd-2013 at eckenfels.net (Bernd Eckenfels) Date: Thu, 10 Jan 2013 22:02:05 +0100 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: <50EF228C.3030009@oracle.com> References: <508EB0D7.8020204@oracle.com> <50C108C2.9@oracle.com> <50EF1A85.4010203@oracle.com> <50EF228C.3030009@oracle.com> Message-ID: Hello, two amateur :) questions: Am 10.01.2013, 21:20 Uhr, schrieb John Cuthbertson : >> I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 >> and CMSWaitDuration=1500 with CMS. Is there a risk involved in waiting long/endless? For example larger than some RMI GC intervall or too long for catching up with the filling of OG? How to test for that? >>>> // No wait limit, wait if necessary forever >>>> wait_time_millis = 0; I typically not use a endless wait limit when there is a loop with a fixed endtime but use a large but limited number. This helps to catch situations where wakeup is missing (for example on shutdown) or lost. Would it be an option to use something like 10s? Gruss Bernd -- http://bernd.eckenfels.net From john.cuthbertson at oracle.com Thu Jan 10 23:23:52 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 10 Jan 2013 15:23:52 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <50ED8E5C.4010109@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> Message-ID: <50EF4D88.2050906@oracle.com> Hi Bengt, Thanks for looking over the code changes. Be prepared for some gory details. :) Replies inline... On 1/9/2013 7:35 AM, Bengt Rutisson wrote: > > In ConcurrentMark::weakRefsWork() there is this code: > > 2422 if (rp->processing_is_mt()) { > 2423 // Set the degree of MT here. If the discovery is done MT, > there > 2424 // may have been a different number of threads doing the > discovery > 2425 // and a different number of discovered lists may have Ref > objects. > 2426 // That is OK as long as the Reference lists are balanced (see > 2427 // balance_all_queues() and balance_queues()). > 2428 rp->set_active_mt_degree(active_workers); > 2429 } > > Could we now always call rp->set_active_mt_degree() ? Maybe I am > missing the point here, but I thought that we are now using queues and > all rp->set_active_mt_degree() does is set the number of queues to > active_workers. Which will be 1 for the single threaded mode. Yes - most likely we can but I would prefer to set active_workers using: // We need at least one active thread. If reference processing is // not multi-threaded we use the current (ConcurrentMarkThread) thread, // otherwise we use the work gang from the G1CollectedHeap and we // utilize all the worker threads we can. uint active_workers = (rp->processing_is_mt() && g1h->workers() != NULL ? g1h->workers()->active_workers() : 1U); since single threaded versus multi-threaded reference processing is determined using the ParallelRefProcEnabled flag. The number of active workers here is the number of workers in G1's STW work gang - which is controlled via ParallelGCThreads. > > If we do that we can also remove the first part of the assert a bit > further down: > > 2449 assert(!rp->processing_is_mt() || rp->num_q() == > active_workers, "why not"); OK. Done. > > Also, I think I would like to move the code you added to > G1CMParDrainMarkingStackClosure::do_void() into > ConcurrentMark::weakRefsWork() somewhere. 
Maybe something like: > > if (!rp->processing_is_mt()) { > set_phase(1, false /* concurrent */); > } > > It is a bit strange to me that G1CMParDrainMarkingStackClosure should > set this up. If we really want to keep it in > G1CMParDrainMarkingStackClosure I think the constructor would be a > better place to do it than do_void(). Setting it once in weakRefsWork() will not be sufficient. We will run into an assertion failure in ParallelTaskTerminator::offer_termination(). During the reference processing, the do_void() method of the complete_gc oop closure (in our case the complete gc oop closure is an instance of G1CMParDrainMarkingStackClosure) is called multiple times (in process_phase1, sometimes process_phase2, process_phase3, and process_phaseJNI) Setting the phase sets the number of active tasks (or threads) that the termination protocol in do_marking_step() will wait for. When an invocation of do_marking_step() offers termination, the number of tasks/threads in the terminator instance is decremented. So Setting the phase once will let the first execution of do_marking_step (with termination) from process_phase1() succeed, but subsequent calls to do_marking_step() will result in the assertion failure. We also can't unconditionally set it in the do_void() method or even the constructor of G1CMParDrainMarkingStackClosure. Separate instances of this closure are created by each of the worker threads in the MT-case. Note when processing is multi-threaded the complete_gc instance used is the one passed into the ProcessTask's work method (passed into process_discovered_references() using the task executor instance) which may not necessarily be the same complete gc instance as the one passed directly into process_discovered_references(). It might be possible to record whether processing is MT in the G1CMRefProcTaskExecutor class and always pass the executor instance into process_discovered_references. We could then set processing to MT so that the execute() methods in the executor instance are invoked but call the Proxy class' work method directly. Then we could override the set_single_threaded() routine (called just before process_phaseJNI) to set the phase. I don't think that would be any clearer. Perhaps a better name for the _is_par flag would be _processing_is_mt (and set it using ReferenceProcessor::processing_is_mt())? > Some minor comments: > > In G1CMParKeepAliveAndDrainClosure and G1CMParDrainMarkingStackClosure > constructors there is this assert: > > assert(_task->worker_id() == 0 || _is_par, "sanity"); > > I think it is good, but I had to think a bit about what it meant and > to me I think it would be quicker to understand if it was the other > way around: > > assert(_is_par || _task->worker_id() == 0, "Only worker 0 should be > used if single threaded"); Sure no problem. Done. > Remove newline on line 2254 ? Sure. What about the blank lines on 2187 and 2302 for consistency? > ConcurrentMark::weakRefsWork() > > How about introducing a variable that either hold the value of > &par_task_executor or NULL depending on rp->processing_is_mt()? That > way we don't have to duplicate and inline this test twice: > > (rp->processing_is_mt() ? &par_task_executor : NULL) OK. > As a separate change it might be worth renaming the closures to not > have "Par" in the name, now that they are not always parallel... 
> > G1CMParKeepAliveAndDrainClosure -> G1CMKeepAliveAndDrainClosure > G1CMParDrainMarkingStackClosure -> G1CMDrainMarkingStackClosure > > But it would be very confusing to do this in the same change. I think we can change the names. The change shouldn't be that confusing. Thanks. A new webrev will appear shortly. JohnC From john.coomes at oracle.com Fri Jan 11 04:50:35 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 11 Jan 2013 04:50:35 +0000 Subject: hg: hsx/hotspot-gc: Added tag jdk8-b72 for changeset c1be681d80a1 Message-ID: <20130111045035.6EF0E471CE@hg.openjdk.java.net> Changeset: f03f90a4308d Author: katleman Date: 2013-01-10 09:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/f03f90a4308d Added tag jdk8-b72 for changeset c1be681d80a1 ! .hgtags From john.coomes at oracle.com Fri Jan 11 04:50:38 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 11 Jan 2013 04:50:38 +0000 Subject: hg: hsx/hotspot-gc/corba: Added tag jdk8-b72 for changeset cb40427f4714 Message-ID: <20130111045040.45D9F471CF@hg.openjdk.java.net> Changeset: 191afde59e7b Author: katleman Date: 2013-01-10 09:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/corba/rev/191afde59e7b Added tag jdk8-b72 for changeset cb40427f4714 ! .hgtags From john.coomes at oracle.com Fri Jan 11 04:50:43 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 11 Jan 2013 04:50:43 +0000 Subject: hg: hsx/hotspot-gc/jaxp: Added tag jdk8-b72 for changeset bdf2af722a6b Message-ID: <20130111045049.418F0471D0@hg.openjdk.java.net> Changeset: 84946404d1e1 Author: katleman Date: 2013-01-10 09:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/84946404d1e1 Added tag jdk8-b72 for changeset bdf2af722a6b ! .hgtags From john.coomes at oracle.com Fri Jan 11 04:50:53 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 11 Jan 2013 04:50:53 +0000 Subject: hg: hsx/hotspot-gc/jaxws: Added tag jdk8-b72 for changeset d9707230294d Message-ID: <20130111045057.CE038471D1@hg.openjdk.java.net> Changeset: c606f644a5d9 Author: katleman Date: 2013-01-10 09:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/c606f644a5d9 Added tag jdk8-b72 for changeset d9707230294d ! .hgtags From john.coomes at oracle.com Fri Jan 11 04:51:04 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 11 Jan 2013 04:51:04 +0000 Subject: hg: hsx/hotspot-gc/jdk: Added tag jdk8-b72 for changeset 32a57e645e01 Message-ID: <20130111045145.98095471D2@hg.openjdk.java.net> Changeset: c9a914b11436 Author: katleman Date: 2013-01-10 09:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/c9a914b11436 Added tag jdk8-b72 for changeset 32a57e645e01 ! .hgtags From john.coomes at oracle.com Fri Jan 11 04:52:45 2013 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 11 Jan 2013 04:52:45 +0000 Subject: hg: hsx/hotspot-gc/langtools: Added tag jdk8-b72 for changeset 6f0986ed9b7e Message-ID: <20130111045251.07405471D3@hg.openjdk.java.net> Changeset: 45fed5cfd1c3 Author: katleman Date: 2013-01-10 09:56 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/45fed5cfd1c3 Added tag jdk8-b72 for changeset 6f0986ed9b7e ! 
.hgtags From bengt.rutisson at oracle.com Fri Jan 11 11:02:06 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 11 Jan 2013 12:02:06 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: References: <50EE8A18.5070004@oracle.com> Message-ID: <50EFF12E.20005@oracle.com> Hi Ramki, Thanks for looking at this! On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: > Hi Bengt -- > > The change looks reasonable, but I have a comment and a follow-up > question. > > Not your change, but I'd elide the "half the real survivor size" since > it's really a configurable parameter based on TargetSurvivorRatio with > default half. > I'd leave the comment as "set the new tenuring threshold and desired > survivor size". I'm fine with removing this from the comment, but I thought the "half the real survivor size" aimed at the fact that we pass only the "to" capacity and not the "from" capacity in to compute_tenuring_threshold(). With that interpretation I think the comment is correct. Would you like me to remove it anyway? Either way is fine with me. > I'm curious though, as to what performance data prompted this change, Good point. This change was preceded by an internal discussion in the GC team, so I should probably have explained the background more in my review request to the open. I was comparing the ParNew and DefNew implementation since I am seeing some strange differences in some SPECjbb2005 results. I am running ParNew with a single thread and get much better score than with DefNew. But I also get higher average GC times. So, I was trying to figure out what DefNew and ParNew does differently. When I was looking at DefNewGeneration::collect() and ParNewGeneration::collect() I saw that they contain a whole lot of code duplication. It would be tempting to try to extract the common code out into DefNewGeneration since it is the super class. But there are some minor differences. One of them was this issue with how they handle the tenuring threshold. We tried to figure out if there is a reason for ParNew and DefNew to behave different in this regard. We could not come up with any good reason for that. So, we needed to figure out if we should change ParNew or DefNew to make them consistent. The decision to change ParNew was based on two things. First, it seems wrong to use the data from a collection that got promotion failure. This collection will not have allowed the tenuring threshold to fulfill its purpose. Second, ParallelScavenge works the same way as DefNew. BTW, the difference between DefNew and ParNew seems to have been there from the start. So, there is no bug or changeset in mercurial or TeamWare to explain why the difference was introduced. (Just to be clear, this difference was not the cause of my performance issue. I still don't have a good explanation for how ParNew can have longer GC times but better SPECjbb score.) > and whether it might make sense, upon a promotion failure to do > something about the tenuring threshold for the next scavenge (i.e. for > example make the tenuring threshold half of its current value as a > reaction to the fact that promotion failed). Is it currently left at > its previous value or is it asjusted back to the default max value > (which latter may be the wrong thing to do) or something else? As far as I can tell the tenuring threshold is left untouched if we get a promotion failure. It is probably a good idea to update it in some way. 
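A minimal standalone sketch of one possible update policy, with invented helper names rather than the actual HotSpot code: take the age-table result on a normal scavenge, and after a promotion failure fall back to a reduced version of the previous threshold (halving, as Ramki suggested above, is just one option).

#include <algorithm>

// Invented helper, not the proposed patch: pick the threshold for the next
// scavenge. On a normal scavenge trust the age table; after a promotion
// failure keep a reduced version of the previous value instead, since the
// failed collection did not let the threshold do its job.
unsigned next_tenuring_threshold(bool promotion_failed,
                                 unsigned previous,
                                 unsigned age_table_result) {
  if (promotion_failed) {
    return std::max(1u, previous / 2);   // halving is just one possible policy
  }
  return age_table_result;
}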
But I would prefer to handle that as a separate bug fix. This change is mostly a small cleanup to make DefNewGeneration::collect() and ParNewGeneration::collect() be more consistent. We've done the thinking so, it's good to make the change in preparation for the next person that comes a long and has a few cycles over and would like to merge the two collect() methods in some way. Thanks again for looking at this! Bengt > > -- ramki > > On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson > > wrote: > > > Hi everyone, > > Could I have a couple of reviews for this small change to make > DefNew and ParNew be more consistent in the way they treat the > tenuring threshold: > > http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ > > > Thanks, > Bengt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Fri Jan 11 12:45:04 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 11 Jan 2013 07:45:04 -0500 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50EFF12E.20005@oracle.com> References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> Message-ID: Hi Bengt, Regarding the benchmark score, are you saying ParNew has longer cumulative GC time or just the average is higher? If it's just average, maybe the total # of them (and cumulative time) is less. I don't know the characteristics of this particular specjbb benchmark, but perhaps having fewer total GCs is better because of the overhead of getting all threads to a safe point, going go the OS to suspend them, and then restarting them. After they're restarted, the CPU cache may be cold for it because the GC thread polluted it. Or I'm entirely wrong in my speculation ... :). Thanks Sent from my phone On Jan 11, 2013 6:02 AM, "Bengt Rutisson" wrote: > > Hi Ramki, > > Thanks for looking at this! > > On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: > > Hi Bengt -- > > The change looks reasonable, but I have a comment and a follow-up question. > > Not your change, but I'd elide the "half the real survivor size" since > it's really a configurable parameter based on TargetSurvivorRatio with > default half. > I'd leave the comment as "set the new tenuring threshold and desired > survivor size". > > > I'm fine with removing this from the comment, but I thought the "half the > real survivor size" aimed at the fact that we pass only the "to" capacity > and not the "from" capacity in to compute_tenuring_threshold(). With that > interpretation I think the comment is correct. > > Would you like me to remove it anyway? Either way is fine with me. > > I'm curious though, as to what performance data prompted this change, > > Good point. This change was preceded by an internal discussion in the GC > team, so I should probably have explained the background more in my review > request to the open. > > I was comparing the ParNew and DefNew implementation since I am seeing > some strange differences in some SPECjbb2005 results. I am running ParNew > with a single thread and get much better score than with DefNew. But I also > get higher average GC times. So, I was trying to figure out what DefNew and > ParNew does differently. > > When I was looking at DefNewGeneration::collect() and > ParNewGeneration::collect() I saw that they contain a whole lot of code > duplication. It would be tempting to try to extract the common code out > into DefNewGeneration since it is the super class. 
But there are some minor > differences. One of them was this issue with how they handle the tenuring > threshold. > > We tried to figure out if there is a reason for ParNew and DefNew to > behave different in this regard. We could not come up with any good reason > for that. So, we needed to figure out if we should change ParNew or DefNew > to make them consistent. The decision to change ParNew was based on two > things. First, it seems wrong to use the data from a collection that got > promotion failure. This collection will not have allowed the tenuring > threshold to fulfill its purpose. Second, ParallelScavenge works the same > way as DefNew. > > BTW, the difference between DefNew and ParNew seems to have been there > from the start. So, there is no bug or changeset in mercurial or TeamWare > to explain why the difference was introduced. > > (Just to be clear, this difference was not the cause of my performance > issue. I still don't have a good explanation for how ParNew can have longer > GC times but better SPECjbb score.) > > and whether it might make sense, upon a promotion failure to do something > about the tenuring threshold for the next scavenge (i.e. for example make > the tenuring threshold half of its current value as a reaction to the fact > that promotion failed). Is it currently left at its previous value or is it > asjusted back to the default max value (which latter may be the wrong thing > to do) or something else? > > > As far as I can tell the tenuring threshold is left untouched if we get a > promotion failure. It is probably a good idea to update it in some way. But > I would prefer to handle that as a separate bug fix. > > This change is mostly a small cleanup to make DefNewGeneration::collect() > and ParNewGeneration::collect() be more consistent. We've done the thinking > so, it's good to make the change in preparation for the next person that > comes a long and has a few cycles over and would like to merge the two > collect() methods in some way. > > Thanks again for looking at this! > Bengt > > > -- ramki > > On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson > wrote: > >> >> Hi everyone, >> >> Could I have a couple of reviews for this small change to make DefNew and >> ParNew be more consistent in the way they treat the tenuring threshold: >> >> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >> >> Thanks, >> Bengt >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Fri Jan 11 12:47:56 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 11 Jan 2013 13:47:56 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50EF03F9.2080200@oracle.com> References: <50EE8A18.5070004@oracle.com> <50EF03F9.2080200@oracle.com> Message-ID: <50F009FC.3080907@oracle.com> Hi John, Thanks for looking at this! On 1/10/13 7:10 PM, John Cuthbertson wrote: > Hi Bengt, > > This looks good to me. Great! > We may want a CR to make G1 follow a similar model. Currently G1 > updates its tenuring threshold in > G1CollectorPolicy:record_collection_pause_start() (via > update_survivors_policy()). Right. The problem here is I guess that G1 does it at the start of the GC since it needs to know that it has the right number of survivor regions set up. So, we would have to remember that the previous GC got an evacuation failure and in that case not update the tenuring threshold based on the data in the age table. 
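A sketch of that idea with invented names (not the G1 implementation): carry a flag from the previous pause and skip the age-table-based update at the start of the next one.

// Invented sketch, not the G1 implementation.
struct SurvivorPolicy {
  bool prev_pause_had_evac_failure;
  unsigned tenuring_threshold;

  // Called where G1 currently recomputes the threshold at the start of a
  // pause (via update_survivors_policy()): skip the age-table-based update
  // if the previous pause failed to evacuate everything.
  void update_at_pause_start(unsigned age_table_result) {
    if (!prev_pause_had_evac_failure) {
      tenuring_threshold = age_table_result;
    }
    prev_pause_had_evac_failure = false;   // consume the flag
  }
};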
Or maybe, as suggested by Ramki, update it in a different way if the previous GC got an evacuation failure. Thanks again for the review! Bengt > > JohnC > > On 1/10/2013 1:30 AM, Bengt Rutisson wrote: >> >> Hi everyone, >> >> Could I have a couple of reviews for this small change to make DefNew >> and ParNew be more consistent in the way they treat the tenuring >> threshold: >> >> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >> >> Thanks, >> Bengt > From bengt.rutisson at oracle.com Fri Jan 11 12:57:06 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 11 Jan 2013 13:57:06 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> Message-ID: <50F00C22.2060201@oracle.com> Hi Vitaly, On 1/11/13 1:45 PM, Vitaly Davidovich wrote: > > Hi Bengt, > > Regarding the benchmark score, are you saying ParNew has longer > cumulative GC time or just the average is higher? If it's just > average, maybe the total # of them (and cumulative time) is less. I > don't know the characteristics of this particular specjbb benchmark, > but perhaps having fewer total GCs is better because of the overhead > of getting all threads to a safe point, going go the OS to suspend > them, and then restarting them. After they're restarted, the CPU > cache may be cold for it because the GC thread polluted it. Or I'm > entirely wrong in my speculation ... :). > You have a good point about the number of GCs. The problem in my runs is that ParNew does more GCs than DefNew. So there are both more of them and their average time is higher, but the score is still better. That ParNew does more GCs is not that strange. It has a higher score, which means that it had higher throughput and had time to create more objects. So, that is kind of expected. But I don't understand how it can have higher throughput when the GCs take longer. My current guess is that it does something differently with how objects are copied in a way that is beneficial for the execution time between GCs. It also seems like ParNew keeps more objects alive for each GC. That is either the reason why it does more and more frequent GCs than DefNew, or it is an effect of the fact that more objects are created due to the higher throughput. This is the reason I started looking at the tenuring threshold. Bengt > Thanks > > Sent from my phone > > On Jan 11, 2013 6:02 AM, "Bengt Rutisson" > wrote: > > > Hi Ramki, > > Thanks for looking at this! > > On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >> Hi Bengt -- >> >> The change looks reasonable, but I have a comment and a follow-up >> question. >> >> Not your change, but I'd elide the "half the real survivor size" >> since it's really a configurable parameter based on >> TargetSurvivorRatio with default half. >> I'd leave the comment as "set the new tenuring threshold and >> desired survivor size". > > I'm fine with removing this from the comment, but I thought the > "half the real survivor size" aimed at the fact that we pass only > the "to" capacity and not the "from" capacity in to > compute_tenuring_threshold(). With that interpretation I think the > comment is correct. > > Would you like me to remove it anyway? Either way is fine with me. > >> I'm curious though, as to what performance data prompted this change, > Good point. 
This change was preceded by an internal discussion in > the GC team, so I should probably have explained the background > more in my review request to the open. > > I was comparing the ParNew and DefNew implementation since I am > seeing some strange differences in some SPECjbb2005 results. I am > running ParNew with a single thread and get much better score than > with DefNew. But I also get higher average GC times. So, I was > trying to figure out what DefNew and ParNew does differently. > > When I was looking at DefNewGeneration::collect() and > ParNewGeneration::collect() I saw that they contain a whole lot of > code duplication. It would be tempting to try to extract the > common code out into DefNewGeneration since it is the super class. > But there are some minor differences. One of them was this issue > with how they handle the tenuring threshold. > > We tried to figure out if there is a reason for ParNew and DefNew > to behave different in this regard. We could not come up with any > good reason for that. So, we needed to figure out if we should > change ParNew or DefNew to make them consistent. The decision to > change ParNew was based on two things. First, it seems wrong to > use the data from a collection that got promotion failure. This > collection will not have allowed the tenuring threshold to fulfill > its purpose. Second, ParallelScavenge works the same way as DefNew. > > BTW, the difference between DefNew and ParNew seems to have been > there from the start. So, there is no bug or changeset in > mercurial or TeamWare to explain why the difference was introduced. > > (Just to be clear, this difference was not the cause of my > performance issue. I still don't have a good explanation for how > ParNew can have longer GC times but better SPECjbb score.) > >> and whether it might make sense, upon a promotion failure to do >> something about the tenuring threshold for the next scavenge >> (i.e. for example make the tenuring threshold half of its current >> value as a reaction to the fact that promotion failed). Is it >> currently left at its previous value or is it asjusted back to >> the default max value (which latter may be the wrong thing to do) >> or something else? > > As far as I can tell the tenuring threshold is left untouched if > we get a promotion failure. It is probably a good idea to update > it in some way. But I would prefer to handle that as a separate > bug fix. > > This change is mostly a small cleanup to make > DefNewGeneration::collect() and ParNewGeneration::collect() be > more consistent. We've done the thinking so, it's good to make the > change in preparation for the next person that comes a long and > has a few cycles over and would like to merge the two collect() > methods in some way. > > Thanks again for looking at this! > Bengt > >> >> -- ramki >> >> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson >> > wrote: >> >> >> Hi everyone, >> >> Could I have a couple of reviews for this small change to >> make DefNew and ParNew be more consistent in the way they >> treat the tenuring threshold: >> >> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >> >> >> Thanks, >> Bengt >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vitalyd at gmail.com Fri Jan 11 13:05:33 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 11 Jan 2013 08:05:33 -0500 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50F00C22.2060201@oracle.com> References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> <50F00C22.2060201@oracle.com> Message-ID: That's very strange and unexpected indeed. Perhaps you're right that ParNew moves objects in a better manner (e.g. preserves locality). Let us know if you manage to figure it out - I'd be very interested. Thanks Sent from my phone On Jan 11, 2013 7:57 AM, "Bengt Rutisson" wrote: > > Hi Vitaly, > > On 1/11/13 1:45 PM, Vitaly Davidovich wrote: > > Hi Bengt, > > Regarding the benchmark score, are you saying ParNew has longer cumulative > GC time or just the average is higher? If it's just average, maybe the > total # of them (and cumulative time) is less. I don't know the > characteristics of this particular specjbb benchmark, but perhaps having > fewer total GCs is better because of the overhead of getting all threads to > a safe point, going go the OS to suspend them, and then restarting them. > After they're restarted, the CPU cache may be cold for it because the GC > thread polluted it. Or I'm entirely wrong in my speculation ... :). > > > You have a good point about the number of GCs. The problem in my runs is > that ParNew does more GCs than DefNew. So there are both more of them and > their average time is higher, but the score is still better. That ParNew > does more GCs is not that strange. It has a higher score, which means that > it had higher throughput and had time to create more objects. So, that is > kind of expected. But I don't understand how it can have higher throughput > when the GCs take longer. My current guess is that it does something > differently with how objects are copied in a way that is beneficial for the > execution time between GCs. > > It also seems like ParNew keeps more objects alive for each GC. That is > either the reason why it does more and more frequent GCs than DefNew, or it > is an effect of the fact that more objects are created due to the higher > throughput. This is the reason I started looking at the tenuring threshold. > > Bengt > > Thanks > > Sent from my phone > On Jan 11, 2013 6:02 AM, "Bengt Rutisson" > wrote: > >> >> Hi Ramki, >> >> Thanks for looking at this! >> >> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >> >> Hi Bengt -- >> >> The change looks reasonable, but I have a comment and a follow-up >> question. >> >> Not your change, but I'd elide the "half the real survivor size" since >> it's really a configurable parameter based on TargetSurvivorRatio with >> default half. >> I'd leave the comment as "set the new tenuring threshold and desired >> survivor size". >> >> >> I'm fine with removing this from the comment, but I thought the "half the >> real survivor size" aimed at the fact that we pass only the "to" capacity >> and not the "from" capacity in to compute_tenuring_threshold(). With that >> interpretation I think the comment is correct. >> >> Would you like me to remove it anyway? Either way is fine with me. >> >> I'm curious though, as to what performance data prompted this change, >> >> Good point. This change was preceded by an internal discussion in the GC >> team, so I should probably have explained the background more in my review >> request to the open. 
>> >> I was comparing the ParNew and DefNew implementation since I am seeing >> some strange differences in some SPECjbb2005 results. I am running ParNew >> with a single thread and get much better score than with DefNew. But I also >> get higher average GC times. So, I was trying to figure out what DefNew and >> ParNew does differently. >> >> When I was looking at DefNewGeneration::collect() and >> ParNewGeneration::collect() I saw that they contain a whole lot of code >> duplication. It would be tempting to try to extract the common code out >> into DefNewGeneration since it is the super class. But there are some minor >> differences. One of them was this issue with how they handle the tenuring >> threshold. >> >> We tried to figure out if there is a reason for ParNew and DefNew to >> behave different in this regard. We could not come up with any good reason >> for that. So, we needed to figure out if we should change ParNew or DefNew >> to make them consistent. The decision to change ParNew was based on two >> things. First, it seems wrong to use the data from a collection that got >> promotion failure. This collection will not have allowed the tenuring >> threshold to fulfill its purpose. Second, ParallelScavenge works the same >> way as DefNew. >> >> BTW, the difference between DefNew and ParNew seems to have been there >> from the start. So, there is no bug or changeset in mercurial or TeamWare >> to explain why the difference was introduced. >> >> (Just to be clear, this difference was not the cause of my performance >> issue. I still don't have a good explanation for how ParNew can have longer >> GC times but better SPECjbb score.) >> >> and whether it might make sense, upon a promotion failure to do something >> about the tenuring threshold for the next scavenge (i.e. for example make >> the tenuring threshold half of its current value as a reaction to the fact >> that promotion failed). Is it currently left at its previous value or is it >> asjusted back to the default max value (which latter may be the wrong thing >> to do) or something else? >> >> >> As far as I can tell the tenuring threshold is left untouched if we get a >> promotion failure. It is probably a good idea to update it in some way. But >> I would prefer to handle that as a separate bug fix. >> >> This change is mostly a small cleanup to make DefNewGeneration::collect() >> and ParNewGeneration::collect() be more consistent. We've done the thinking >> so, it's good to make the change in preparation for the next person that >> comes a long and has a few cycles over and would like to merge the two >> collect() methods in some way. >> >> Thanks again for looking at this! >> Bengt >> >> >> -- ramki >> >> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson < >> bengt.rutisson at oracle.com> wrote: >> >>> >>> Hi everyone, >>> >>> Could I have a couple of reviews for this small change to make DefNew >>> and ParNew be more consistent in the way they treat the tenuring threshold: >>> >>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >>> >>> Thanks, >>> Bengt >>> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alejandro.murillo at oracle.com Fri Jan 11 14:34:06 2013 From: alejandro.murillo at oracle.com (alejandro.murillo at oracle.com) Date: Fri, 11 Jan 2013 14:34:06 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 31 new changesets Message-ID: <20130111143512.5F33A471EB@hg.openjdk.java.net> Changeset: 79f492f184d0 Author: katleman Date: 2012-12-20 16:24 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/79f492f184d0 8004982: JDK8 source with GPL header errors Reviewed-by: ohair ! agent/src/share/classes/sun/jvm/hotspot/ci/ciArrayKlass.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciField.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciInstance.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciKlass.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciMetadata.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciObjArrayKlass.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciObject.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciObjectFactory.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciReceiverTypeData.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciSymbol.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciType.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciTypeArrayKlass.java ! agent/src/share/classes/sun/jvm/hotspot/ci/ciVirtualCallData.java ! agent/src/share/classes/sun/jvm/hotspot/classfile/ClassLoaderData.java ! agent/src/share/classes/sun/jvm/hotspot/memory/LoaderConstraintTable.java ! agent/src/share/classes/sun/jvm/hotspot/oops/BitData.java ! agent/src/share/classes/sun/jvm/hotspot/oops/ProfileData.java ! agent/src/share/classes/sun/jvm/hotspot/oops/RetData.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Block.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Block_Array.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Block_List.java ! agent/src/share/classes/sun/jvm/hotspot/opto/CallDynamicJavaNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/CallJavaNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/CallNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/CallRuntimeNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/CallStaticJavaNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Compile.java ! agent/src/share/classes/sun/jvm/hotspot/opto/HaltNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/InlineTree.java ! agent/src/share/classes/sun/jvm/hotspot/opto/JVMState.java ! agent/src/share/classes/sun/jvm/hotspot/opto/LoopNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachCallJavaNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachCallNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachCallRuntimeNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachCallStaticJavaNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachIfNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachReturnNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MachSafePointNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/MultiNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Node.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Node_Array.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Node_List.java ! agent/src/share/classes/sun/jvm/hotspot/opto/Phase.java ! agent/src/share/classes/sun/jvm/hotspot/opto/PhaseCFG.java ! agent/src/share/classes/sun/jvm/hotspot/opto/PhaseRegAlloc.java ! agent/src/share/classes/sun/jvm/hotspot/opto/PhiNode.java ! 
agent/src/share/classes/sun/jvm/hotspot/opto/ProjNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/RegionNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/RootNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/SafePointNode.java ! agent/src/share/classes/sun/jvm/hotspot/opto/TypeNode.java ! agent/src/share/classes/sun/jvm/hotspot/prims/JvmtiExport.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/GenericGrowableArray.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/GrowableArray.java ! agent/src/share/native/sadis.c ! src/share/vm/classfile/classLoaderData.hpp ! src/share/vm/memory/metaspaceCounters.cpp ! src/share/vm/memory/metaspaceCounters.hpp ! src/share/vm/runtime/os_ext.hpp ! src/share/vm/services/diagnosticArgument.cpp ! src/share/vm/services/diagnosticCommand_ext.hpp ! src/share/vm/services/memReporter.cpp ! src/share/vm/services/memReporter.hpp ! test/runtime/7158804/Test7158804.sh Changeset: e94068d4ff52 Author: katleman Date: 2012-12-26 14:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e94068d4ff52 Merge ! src/share/vm/classfile/classLoaderData.hpp Changeset: 0847210f8548 Author: katleman Date: 2012-12-27 12:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0847210f8548 Added tag jdk8-b70 for changeset e94068d4ff52 ! .hgtags Changeset: d5cb5830f570 Author: katleman Date: 2013-01-03 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d5cb5830f570 Added tag jdk8-b71 for changeset 0847210f8548 ! .hgtags Changeset: 11619f33cd68 Author: katleman Date: 2013-01-10 09:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/11619f33cd68 Added tag jdk8-b72 for changeset d5cb5830f570 ! .hgtags Changeset: 7d42f3b08300 Author: dcubed Date: 2012-12-19 10:35 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7d42f3b08300 8005044: remove crufty '_g' support from HS runtime code Summary: Phase 2 is removing '_g' support from the Runtime code. Reviewed-by: dcubed, coleenp, hseigel Contributed-by: ron.durbin at oracle.com ! src/os/bsd/vm/os_bsd.cpp ! src/os/linux/vm/os_linux.cpp ! src/os/solaris/vm/os_solaris.cpp ! src/os/windows/vm/os_windows.cpp ! src/share/tools/ProjectCreator/ProjectCreator.java ! src/share/vm/runtime/arguments.cpp Changeset: 35431a769282 Author: stefank Date: 2012-12-20 10:22 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/35431a769282 8004823: Add VM support for type annotation reflection Reviewed-by: dholmes, coleenp Contributed-by: joel.franck at oracle.com ! make/bsd/makefiles/mapfile-vers-debug ! make/bsd/makefiles/mapfile-vers-product ! make/linux/makefiles/mapfile-vers-debug ! make/linux/makefiles/mapfile-vers-product ! make/solaris/makefiles/mapfile-vers ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/classFileParser.hpp ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/javaClasses.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/oops/annotations.cpp ! src/share/vm/oops/annotations.hpp ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/oops/method.cpp ! src/share/vm/oops/method.hpp ! src/share/vm/prims/jvm.cpp ! src/share/vm/prims/jvm.h ! src/share/vm/prims/jvmtiRedefineClasses.cpp ! src/share/vm/runtime/fieldDescriptor.cpp ! src/share/vm/runtime/fieldDescriptor.hpp ! src/share/vm/runtime/reflection.cpp Changeset: 4daebd4cc1dd Author: minqi Date: 2012-12-24 11:46 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4daebd4cc1dd Merge ! 
src/os/windows/vm/os_windows.cpp ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/javaClasses.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/oops/method.hpp ! src/share/vm/runtime/arguments.cpp Changeset: cc6a617fffd2 Author: coleenp Date: 2013-01-02 20:28 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/cc6a617fffd2 8005494: SIGSEGV in Rewriter::relocate_and_link() when testing Weblogic with CompressedOops and KlassPtrs Summary: Relocate functions with jsr's when rewriting so not repeated after reading shared archive Reviewed-by: twisti, jrose ! src/share/vm/interpreter/rewriter.cpp ! src/share/vm/interpreter/rewriter.hpp ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/prims/jvmtiRedefineClasses.cpp ! src/share/vm/runtime/handles.inline.hpp Changeset: 6c3f47d964f3 Author: hseigel Date: 2013-01-07 15:32 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6c3f47d964f3 8003705: CDS failed on Windows: can not map in the CDS. Summary: Map memory only once to prevent 'already mapped' failures. Reviewed-by: acorn, zgu ! src/share/vm/memory/filemap.cpp ! src/share/vm/memory/metaspaceShared.cpp Changeset: 561148896559 Author: hseigel Date: 2013-01-08 13:38 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/561148896559 8005076: Creating a CDS archive with one alignment and running another causes a crash. Summary: Save the alignment when writing the CDS and compare it when reading the CDS. Reviewed-by: kvn, coleenp ! src/share/vm/memory/filemap.cpp ! src/share/vm/memory/filemap.hpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp Changeset: ade95d680b42 Author: coleenp Date: 2013-01-08 14:01 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ade95d680b42 8004728: Add hotspot support for parameter reflection Summary: Add hotspot support for parameter reflection Reviewed-by: acorn, jrose, coleenp Contributed-by: eric.mccorkle at oracle.com ! make/bsd/makefiles/mapfile-vers-debug ! make/bsd/makefiles/mapfile-vers-product ! make/linux/makefiles/mapfile-vers-debug ! make/linux/makefiles/mapfile-vers-product ! make/solaris/makefiles/mapfile-vers ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/classFileStream.cpp ! src/share/vm/classfile/classFileStream.hpp ! src/share/vm/classfile/defaultMethods.cpp ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/javaClasses.hpp ! src/share/vm/classfile/systemDictionary.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/oops/constMethod.cpp ! src/share/vm/oops/constMethod.hpp ! src/share/vm/oops/method.cpp ! src/share/vm/oops/method.hpp ! src/share/vm/prims/jvm.cpp ! src/share/vm/prims/jvm.h ! src/share/vm/runtime/reflection.cpp ! src/share/vm/runtime/reflection.hpp Changeset: 185a2c979a0e Author: coleenp Date: 2013-01-08 13:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/185a2c979a0e Merge Changeset: ecd24264898b Author: zgu Date: 2013-01-08 14:04 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ecd24264898b 8005048: NMT: #loaded classes needs to just show the # defined classes Summary: Count number of instance classes so that it matches class metadata size Reviewed-by: coleenp, acorn ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/services/memBaseline.cpp ! src/share/vm/services/memRecorder.cpp ! src/share/vm/services/memRecorder.hpp ! src/share/vm/services/memSnapshot.cpp ! 
src/share/vm/services/memSnapshot.hpp ! src/share/vm/services/memTrackWorker.cpp ! src/share/vm/services/memTrackWorker.hpp ! src/share/vm/services/memTracker.cpp ! src/share/vm/services/memTracker.hpp Changeset: 37a3e8b7a1e9 Author: zgu Date: 2013-01-08 11:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/37a3e8b7a1e9 Merge ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/oops/instanceKlass.hpp Changeset: 0c93d4818214 Author: zgu Date: 2013-01-08 15:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0c93d4818214 Merge Changeset: 1f6d10b4cc0c Author: acorn Date: 2013-01-09 18:06 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1f6d10b4cc0c Merge ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp Changeset: 608b2e8a0063 Author: bpittore Date: 2013-01-03 15:08 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/608b2e8a0063 8004051: assert(_oprs_len[mode] < maxNumberOfOperands) failed: array overflow Summary: assert is triggered when number of register based arguments passed to a java method exceeds 16. Reviewed-by: roland, vladidan ! src/share/vm/c1/c1_LIR.hpp Changeset: 0c8717a92b2d Author: jiangli Date: 2013-01-08 13:01 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0c8717a92b2d 8001341: SIGSEGV in methodOopDesc::fast_exception_handler_bci_for(KlassHandle,int,Thread*)+0x3e9. Summary: Use methodHandle. Reviewed-by: coleenp, acorn, twisti, sspitsyn ! src/share/vm/interpreter/interpreterRuntime.cpp ! src/share/vm/oops/method.cpp ! src/share/vm/oops/method.hpp ! src/share/vm/prims/jvmtiExport.cpp ! src/share/vm/runtime/sharedRuntime.cpp Changeset: 18c3c3fa291b Author: dlong Date: 2013-01-09 21:18 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/18c3c3fa291b Merge ! src/share/vm/oops/method.cpp ! src/share/vm/oops/method.hpp Changeset: b2fef6b220e9 Author: jmasa Date: 2013-01-10 07:32 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b2fef6b220e9 Merge ! src/share/vm/runtime/arguments.cpp Changeset: d092d1b31229 Author: roland Date: 2012-12-23 17:08 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d092d1b31229 8005071: Incremental inlining for JSR 292 Summary: post parse inlining driven by number of live nodes. Reviewed-by: twisti, kvn, jrose ! src/share/vm/opto/bytecodeInfo.cpp ! src/share/vm/opto/c2_globals.hpp ! src/share/vm/opto/callGenerator.cpp ! src/share/vm/opto/callGenerator.hpp ! src/share/vm/opto/callnode.cpp ! src/share/vm/opto/callnode.hpp ! src/share/vm/opto/cfgnode.cpp ! src/share/vm/opto/cfgnode.hpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/compile.hpp ! src/share/vm/opto/doCall.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/parse.hpp ! src/share/vm/opto/phaseX.cpp ! src/share/vm/opto/phaseX.hpp ! src/share/vm/opto/stringopts.cpp ! src/share/vm/runtime/arguments.cpp Changeset: 00af3a3a8df4 Author: kvn Date: 2013-01-03 15:09 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/00af3a3a8df4 8005522: use fast-string instructions on x86 for zeroing Summary: use 'rep stosb' instead of 'rep stosq' when fast-string operations are available. Reviewed-by: twisti, roland ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/globals_x86.hpp ! src/cpu/x86/vm/macroAssembler_x86.cpp ! src/cpu/x86/vm/macroAssembler_x86.hpp ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! 
src/share/vm/opto/memnode.cpp Changeset: e2e6bf86682c Author: kvn Date: 2013-01-03 16:30 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e2e6bf86682c 8005544: Use 256bit YMM registers in arraycopy stubs on x86 Summary: Use YMM registers in arraycopy and array_fill stubs. Reviewed-by: roland, twisti ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/macroAssembler_x86.cpp ! src/cpu/x86/vm/stubGenerator_x86_32.cpp ! src/cpu/x86/vm/stubGenerator_x86_64.cpp Changeset: ffa87474d7a4 Author: twisti Date: 2013-01-07 14:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ffa87474d7a4 8004537: replace AbstractAssembler emit_long with emit_int32 Reviewed-by: jrose, kvn, twisti Contributed-by: Morris Meyer ! src/cpu/sparc/vm/assembler_sparc.hpp ! src/cpu/sparc/vm/assembler_sparc.inline.hpp ! src/cpu/sparc/vm/cppInterpreter_sparc.cpp ! src/cpu/sparc/vm/macroAssembler_sparc.cpp ! src/cpu/sparc/vm/templateInterpreter_sparc.cpp ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/macroAssembler_x86.cpp ! src/share/vm/asm/assembler.hpp Changeset: 038dd2875b94 Author: kvn Date: 2013-01-08 11:30 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/038dd2875b94 8005419: Improve intrinsics code performance on x86 by using AVX2 Summary: use 256bit vpxor,vptest instructions in String.compareTo() and equals() intrinsics. Reviewed-by: twisti ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/macroAssembler_x86.cpp ! src/cpu/x86/vm/macroAssembler_x86.hpp + test/compiler/8005419/Test8005419.java Changeset: 5698813d45eb Author: twisti Date: 2013-01-09 15:37 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5698813d45eb 8005418: JSR 292: virtual dispatch bug in 292 impl Reviewed-by: jrose, kvn ! src/share/vm/opto/callGenerator.cpp ! src/share/vm/opto/compile.hpp ! src/share/vm/opto/doCall.cpp ! src/share/vm/opto/parse.hpp ! src/share/vm/opto/parse1.cpp Changeset: f1c06dcee0b5 Author: kvn Date: 2013-01-10 10:00 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1c06dcee0b5 Merge ! src/share/vm/runtime/arguments.cpp Changeset: 1e129851479e Author: amurillo Date: 2013-01-11 01:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1e129851479e Merge Changeset: b5e6bec76f4a Author: amurillo Date: 2013-01-11 01:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b5e6bec76f4a Added tag hs25-b15 for changeset 1e129851479e ! .hgtags Changeset: d58b7b43031b Author: amurillo Date: 2013-01-11 02:02 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d58b7b43031b 8006034: new hotspot build - hs25-b16 Reviewed-by: jcoomes ! make/hotspot_version From ysr1729 at gmail.com Fri Jan 11 17:50:23 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 11 Jan 2013 09:50:23 -0800 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50F00C22.2060201@oracle.com> References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> <50F00C22.2060201@oracle.com> Message-ID: Hi Bengt -- Try computing the GC overhead by normalizing wrt the work done (for which the net allocation volume might be a good proxy). As you state, the performance numbers will then likely make sense. Of course, they still won't explain why ParNew does better. 
As Vitaly conjectures, the difference is likely in better object co-location with ParNew's slightly more DFS-like evacuation compared with DefNew's considerably more BFS-like evacuation because of the latter's use of a pure Cheney scan compared with the use of (a) marking stack(s) in the former, as far as i can remember the code. One way to tell if that accounts for the difference is to measure the cache-miss rates in the two cases (and may be use a good tool like Solaris perf analyzer to show you where the misses are coming from as well). Also curious if you can share the two sets of GC logs, by chance? (specJBB is a for-fee benchmark so is not freely available to the individual developer.) thanks. -- ramki On Fri, Jan 11, 2013 at 4:57 AM, Bengt Rutisson wrote: > > Hi Vitaly, > > > On 1/11/13 1:45 PM, Vitaly Davidovich wrote: > > Hi Bengt, > > Regarding the benchmark score, are you saying ParNew has longer cumulative > GC time or just the average is higher? If it's just average, maybe the > total # of them (and cumulative time) is less. I don't know the > characteristics of this particular specjbb benchmark, but perhaps having > fewer total GCs is better because of the overhead of getting all threads to > a safe point, going go the OS to suspend them, and then restarting them. > After they're restarted, the CPU cache may be cold for it because the GC > thread polluted it. Or I'm entirely wrong in my speculation ... :). > > > You have a good point about the number of GCs. The problem in my runs is > that ParNew does more GCs than DefNew. So there are both more of them and > their average time is higher, but the score is still better. That ParNew > does more GCs is not that strange. It has a higher score, which means that > it had higher throughput and had time to create more objects. So, that is > kind of expected. But I don't understand how it can have higher throughput > when the GCs take longer. My current guess is that it does something > differently with how objects are copied in a way that is beneficial for the > execution time between GCs. > > It also seems like ParNew keeps more objects alive for each GC. That is > either the reason why it does more and more frequent GCs than DefNew, or it > is an effect of the fact that more objects are created due to the higher > throughput. This is the reason I started looking at the tenuring threshold. > > Bengt > > > Thanks > > Sent from my phone > On Jan 11, 2013 6:02 AM, "Bengt Rutisson" > wrote: > >> >> Hi Ramki, >> >> Thanks for looking at this! >> >> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >> >> Hi Bengt -- >> >> The change looks reasonable, but I have a comment and a follow-up >> question. >> >> Not your change, but I'd elide the "half the real survivor size" since >> it's really a configurable parameter based on TargetSurvivorRatio with >> default half. >> I'd leave the comment as "set the new tenuring threshold and desired >> survivor size". >> >> >> I'm fine with removing this from the comment, but I thought the "half the >> real survivor size" aimed at the fact that we pass only the "to" capacity >> and not the "from" capacity in to compute_tenuring_threshold(). With that >> interpretation I think the comment is correct. >> >> Would you like me to remove it anyway? Either way is fine with me. >> >> I'm curious though, as to what performance data prompted this change, >> >> Good point. 
This change was preceded by an internal discussion in the GC >> team, so I should probably have explained the background more in my review >> request to the open. >> >> I was comparing the ParNew and DefNew implementation since I am seeing >> some strange differences in some SPECjbb2005 results. I am running ParNew >> with a single thread and get much better score than with DefNew. But I also >> get higher average GC times. So, I was trying to figure out what DefNew and >> ParNew does differently. >> >> When I was looking at DefNewGeneration::collect() and >> ParNewGeneration::collect() I saw that they contain a whole lot of code >> duplication. It would be tempting to try to extract the common code out >> into DefNewGeneration since it is the super class. But there are some minor >> differences. One of them was this issue with how they handle the tenuring >> threshold. >> >> We tried to figure out if there is a reason for ParNew and DefNew to >> behave different in this regard. We could not come up with any good reason >> for that. So, we needed to figure out if we should change ParNew or DefNew >> to make them consistent. The decision to change ParNew was based on two >> things. First, it seems wrong to use the data from a collection that got >> promotion failure. This collection will not have allowed the tenuring >> threshold to fulfill its purpose. Second, ParallelScavenge works the same >> way as DefNew. >> >> BTW, the difference between DefNew and ParNew seems to have been there >> from the start. So, there is no bug or changeset in mercurial or TeamWare >> to explain why the difference was introduced. >> >> (Just to be clear, this difference was not the cause of my performance >> issue. I still don't have a good explanation for how ParNew can have longer >> GC times but better SPECjbb score.) >> >> and whether it might make sense, upon a promotion failure to do something >> about the tenuring threshold for the next scavenge (i.e. for example make >> the tenuring threshold half of its current value as a reaction to the fact >> that promotion failed). Is it currently left at its previous value or is it >> asjusted back to the default max value (which latter may be the wrong thing >> to do) or something else? >> >> >> As far as I can tell the tenuring threshold is left untouched if we get a >> promotion failure. It is probably a good idea to update it in some way. But >> I would prefer to handle that as a separate bug fix. >> >> This change is mostly a small cleanup to make DefNewGeneration::collect() >> and ParNewGeneration::collect() be more consistent. We've done the thinking >> so, it's good to make the change in preparation for the next person that >> comes a long and has a few cycles over and would like to merge the two >> collect() methods in some way. >> >> Thanks again for looking at this! >> Bengt >> >> >> -- ramki >> >> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson < >> bengt.rutisson at oracle.com> wrote: >> >>> >>> Hi everyone, >>> >>> Could I have a couple of reviews for this small change to make DefNew >>> and ParNew be more consistent in the way they treat the tenuring threshold: >>> >>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >>> >>> Thanks, >>> Bengt >>> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chunt at salesforce.com Fri Jan 11 19:32:33 2013 From: chunt at salesforce.com (Charlie Hunt) Date: Fri, 11 Jan 2013 11:32:33 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50EF0CCA.8070005@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> Message-ID: <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> Hi John, Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit trickier than NewSizePercent. With NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? I'd be interested in hearing your thoughts along with Monica's, Jon Masa, John Coomes and Ramki, if they have time of course. hths, charlie ... On Jan 10, 2013, at 12:47 PM, John Cuthbertson wrote: > Hi Bengt, > > Thanks for reviewing the code. Replies inline... > > On 1/10/2013 2:07 AM, Bengt Rutisson wrote: >> >> Hi John, >> >> Changes look good. >> >> One question about G1NewSizePercent and G1MaxNewSizePercent. Why are >> these only changed for heap sizes below 4GB? I would think that at >> least the reduction of G1NewSizePercent would be even more important >> for larger heap sizes. If we want to get lower pause times on larger >> heaps we need to be able to have a small young gen size. > > The simple answer is: it was suggested by Monica and Charlie. Personally > I'm OK with making the new values of G1NewSizePercent and > G1MaxNewSizePercent the defaults for all heap sizes and we might (or > most likely will) go there in the future - but for the moment we're > being conservative. > > As I mentioned we would like to make G1 a bit more adaptive - and both > Monica and Charlie have some ideas in that area. > >> >> Also, your change in arguments.cpp is guarded by #ifndef SERIALGC. >> This is correct of course, but Joe Provino has a change out that will >> replace this kind of check with #if INCLUDE_ALL_GCS: >> >> http://cr.openjdk.java.net/~jprovino/8005915/webrev.00 >> >> Neither you nor Joe will get any merge conflicts if your changes are >> both pushed. It will even still compile. But the code inside #ifndef >> SERIALGC will never be executed. So, it might be good to keep any eye >> out for how Joe's change propagate through the repositories to make >> sure that you can manually resolve this. >> >> My guess is that Joe's change will have to wait a while since it >> includes make file changes that potentially interfere with changes for >> the new build system. So, hopefully you get to push this first :) > > Thanks. I've been watching the progress that Joe's change has been > making. I guess you can think of this change as the one for hs24 and > another with the SERIALGC changed appropriately being for hs25.
:) > > Thanks, > > JohnC > From bengt.rutisson at oracle.com Fri Jan 11 21:19:23 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 11 Jan 2013 22:19:23 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> <50F00C22.2060201@oracle.com> Message-ID: <50F081DB.4040904@oracle.com> Hi Ramki, On 1/11/13 6:50 PM, Srinivas Ramakrishna wrote: > Hi Bengt -- > > Try computing the GC overhead by normalizing wrt the work done (for > which the net allocation volume might be a good proxy). As you state, > the performance numbers will then likely make sense. Of course, they > still won't explain why ParNew does better. As Vitaly conjectures, > the difference is likely in better object co-location with ParNew's > slightly more DFS-like evacuation compared with DefNew's considerably > more BFS-like evacuation because of the latter's use of a pure Cheney > scan compared with the use of (a) marking stack(s) in the former, as > far as i can remember the code. One way to tell if that accounts for > the difference is to measure the cache-miss rates in the two cases > (and may be use a good tool like Solaris perf analyzer to show you > where the misses are coming from as well). Thanks for bringing the DFS/BFS difference up. This is exactly the kind of difference I was looking for. My guess is that this is what causes the difference in JBB score. I'll see if I can investigate this further. > Also curious if you can share the two sets of GC logs, by chance? > (specJBB is a for-fee benchmark so is not freely available to the > individual developer.) I have a fairly large set of logs, but the runs are very stable so I'm just attaching logs for one run for each collector. For comparison I have also been running ParallelScavenge with one thread. I'm using separate gc logs and jbb logs. The log files called ".result" are the jbb output. The other logs are the gc logs. I'm running with a heap size of 1GB to avoid full GCs. All runs have the two System.gc() induced full GCs but no other. ParallelScavenge is performing even better than ParNew, but I am mostly interested in the difference between ParNew and DefNew. A quick summary of the data in the logs: Score #GCs AverageGCTime DefNew: 57903 2083 0.044053195391262644 ParNew: 61363 2213 0.05931835969272489 PS: 69697 2213 0.06117092860370538 ParNew has a better score even though it does more GCs and they take longer. If you have any insights from looking at the logs I would be very happy to hear about it. Thanks, Bengt > > thanks. > -- ramki > > On Fri, Jan 11, 2013 at 4:57 AM, Bengt Rutisson > > wrote: > > > Hi Vitaly, > > > On 1/11/13 1:45 PM, Vitaly Davidovich wrote: >> >> Hi Bengt, >> >> Regarding the benchmark score, are you saying ParNew has longer >> cumulative GC time or just the average is higher? If it's just >> average, maybe the total # of them (and cumulative time) is >> less. I don't know the characteristics of this particular >> specjbb benchmark, but perhaps having fewer total GCs is better >> because of the overhead of getting all threads to a safe point, >> going go the OS to suspend them, and then restarting them. After >> they're restarted, the CPU cache may be cold for it because the >> GC thread polluted it. Or I'm entirely wrong in my speculation >> ... :). >> > > You have a good point about the number of GCs. The problem in my > runs is that ParNew does more GCs than DefNew. 
So there are both > more of them and their average time is higher, but the score is > still better. That ParNew does more GCs is not that strange. It > has a higher score, which means that it had higher throughput and > had time to create more objects. So, that is kind of expected. But > I don't understand how it can have higher throughput when the GCs > take longer. My current guess is that it does something > differently with how objects are copied in a way that is > beneficial for the execution time between GCs. > > It also seems like ParNew keeps more objects alive for each GC. > That is either the reason why it does more and more frequent GCs > than DefNew, or it is an effect of the fact that more objects are > created due to the higher throughput. This is the reason I started > looking at the tenuring threshold. > > Bengt > > >> Thanks >> >> Sent from my phone >> >> On Jan 11, 2013 6:02 AM, "Bengt Rutisson" >> > wrote: >> >> >> Hi Ramki, >> >> Thanks for looking at this! >> >> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >>> Hi Bengt -- >>> >>> The change looks reasonable, but I have a comment and a >>> follow-up question. >>> >>> Not your change, but I'd elide the "half the real survivor >>> size" since it's really a configurable parameter based on >>> TargetSurvivorRatio with default half. >>> I'd leave the comment as "set the new tenuring threshold and >>> desired survivor size". >> >> I'm fine with removing this from the comment, but I thought >> the "half the real survivor size" aimed at the fact that we >> pass only the "to" capacity and not the "from" capacity in to >> compute_tenuring_threshold(). With that interpretation I >> think the comment is correct. >> >> Would you like me to remove it anyway? Either way is fine >> with me. >> >>> I'm curious though, as to what performance data prompted >>> this change, >> Good point. This change was preceded by an internal >> discussion in the GC team, so I should probably have >> explained the background more in my review request to the open. >> >> I was comparing the ParNew and DefNew implementation since I >> am seeing some strange differences in some SPECjbb2005 >> results. I am running ParNew with a single thread and get >> much better score than with DefNew. But I also get higher >> average GC times. So, I was trying to figure out what DefNew >> and ParNew does differently. >> >> When I was looking at DefNewGeneration::collect() and >> ParNewGeneration::collect() I saw that they contain a whole >> lot of code duplication. It would be tempting to try to >> extract the common code out into DefNewGeneration since it is >> the super class. But there are some minor differences. One of >> them was this issue with how they handle the tenuring threshold. >> >> We tried to figure out if there is a reason for ParNew and >> DefNew to behave different in this regard. We could not come >> up with any good reason for that. So, we needed to figure out >> if we should change ParNew or DefNew to make them consistent. >> The decision to change ParNew was based on two things. First, >> it seems wrong to use the data from a collection that got >> promotion failure. This collection will not have allowed the >> tenuring threshold to fulfill its purpose. Second, >> ParallelScavenge works the same way as DefNew. >> >> BTW, the difference between DefNew and ParNew seems to have >> been there from the start. So, there is no bug or changeset >> in mercurial or TeamWare to explain why the difference was >> introduced. 
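As a rough back-of-the-envelope on Ramki's normalization suggestion, using the summary numbers quoted above: DefNew spends about 2083 * 0.044 = 92 seconds in young GC and ParNew about 2213 * 0.059 = 131 seconds. Eden is the same size in both runs, so allocation volume scales roughly with the GC count; ParNew therefore allocated about 2213/2083 = 1.06 times as much while spending about 1.43 times as much time in GC. Normalized per unit of allocation its collections are still around 35% more expensive, so the normalization accounts for the extra GC count but not for the longer per-GC times, and the better score has to come from what happens between collections.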
>> >> (Just to be clear, this difference was not the cause of my >> performance issue. I still don't have a good explanation for >> how ParNew can have longer GC times but better SPECjbb score.) >> >>> and whether it might make sense, upon a promotion failure to >>> do something about the tenuring threshold for the next >>> scavenge (i.e. for example make the tenuring threshold half >>> of its current value as a reaction to the fact that >>> promotion failed). Is it currently left at its previous >>> value or is it asjusted back to the default max value (which >>> latter may be the wrong thing to do) or something else? >> >> As far as I can tell the tenuring threshold is left untouched >> if we get a promotion failure. It is probably a good idea to >> update it in some way. But I would prefer to handle that as a >> separate bug fix. >> >> This change is mostly a small cleanup to make >> DefNewGeneration::collect() and ParNewGeneration::collect() >> be more consistent. We've done the thinking so, it's good to >> make the change in preparation for the next person that comes >> a long and has a few cycles over and would like to merge the >> two collect() methods in some way. >> >> Thanks again for looking at this! >> Bengt >> >>> >>> -- ramki >>> >>> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson >>> >> > wrote: >>> >>> >>> Hi everyone, >>> >>> Could I have a couple of reviews for this small change >>> to make DefNew and ParNew be more consistent in the way >>> they treat the tenuring threshold: >>> >>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >>> >>> >>> Thanks, >>> Bengt >>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: defnew-parnew-ps-logs.zip Type: application/zip Size: 193065 bytes Desc: not available URL: From Peter.B.Kessler at Oracle.COM Fri Jan 11 22:18:47 2013 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Fri, 11 Jan 2013 14:18:47 -0800 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50F081DB.4040904@oracle.com> References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> <50F00C22.2060201@oracle.com> <50F081DB.4040904@oracle.com> Message-ID: <50F08FC7.3070701@Oracle.COM> I don't see -XX:+AlwaysPreTouch in your command line. (Mostly because I'm not sure I see the command line: for example, I don't see your 1GB heap setting.) When you are watching GC performance before the first full collection, you have to remember those faults for the OS to populate the old generation. If you do that during promotions, you do it one page at a time. -XX:+AlwaysPreTouch touches all of the committed old generation during start-up, which turns out to be much faster. (And doesn't bill the time to GC. :-) Maybe those forced System.gc() calls at the beginning touch all of the old generation? Try a runs with and without -XX:+AlwaysPreTouch and report the results, because I'm curious. I also see that the young generation spaces are different between the runs. 
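What Peter describes as pre-touching a few sentences above can be illustrated in a handful of lines. This is purely illustrative and not the HotSpot implementation; the helper name, the 4 KB page size and the 64 MB range are assumptions for the example. The idea is to write to one byte in every OS page of the committed range up front, so the page faults are paid at start-up instead of one at a time while objects are being promoted.

    // Illustrative sketch of pre-touching a committed range (not HotSpot code).
    #include <cstddef>
    #include <cstdlib>

    static void pretouch_range(char* start, size_t bytes, size_t page_size) {
      for (char* p = start; p < start + bytes; p += page_size) {
        *p = 0;   // take the page fault now, not during a promotion
      }
    }

    int main() {
      const size_t bytes = 64 * 1024 * 1024;        // stand-in for the old gen
      char* old_gen = static_cast<char*>(std::malloc(bytes));
      if (old_gen != nullptr) {
        pretouch_range(old_gen, bytes, 4096);       // assume 4 KB pages
        std::free(old_gen);
      }
      return 0;
    }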
Looking at the heap shape printed at the end of the GC logs def new generation total 314560K, used 232441K [0x00000000c0000000, 0x00000000d5550000, 0x00000000d5550000) eden space 279616K, 76% used [0x00000000c0000000, 0x00000000cd0b4d38, 0x00000000d1110000) from space 34944K, 53% used [0x00000000d1110000, 0x00000000d2359a18, 0x00000000d3330000) to space 34944K, 0% used [0x00000000d3330000, 0x00000000d3330000, 0x00000000d5550000) par new generation total 314560K, used 74831K [0x00000000c0000000, 0x00000000d5550000, 0x00000000d5550000) eden space 279616K, 18% used [0x00000000c0000000, 0x00000000c33154f8, 0x00000000d1110000) from space 34944K, 64% used [0x00000000d1110000, 0x00000000d270e760, 0x00000000d3330000) to space 34944K, 0% used [0x00000000d3330000, 0x00000000d3330000, 0x00000000d5550000) PSYoungGen total 329728K, used 93207K [0x00000000eaab0000, 0x0000000100000000, 0x0000000100000000) eden space 309952K, 23% used [0x00000000eaab0000,0x00000000ef30dd08,0x00000000fd960000) from space 19776K, 96% used [0x00000000fd960000,0x00000000fec08000,0x00000000fecb0000) to space 19712K, 0% used [0x00000000fecc0000,0x00000000fecc0000,0x0000000100000000) The PSYoung eden is 10% larger than the others (because the survivors are smaller?). The sizes and occupancy of the survivors is different between DefNew and ParNew if you look in the logs. ... peter Bengt Rutisson wrote: > > > Hi Ramki, > > On 1/11/13 6:50 PM, Srinivas Ramakrishna wrote: >> Hi Bengt -- >> >> Try computing the GC overhead by normalizing wrt the work done (for >> which the net allocation volume might be a good proxy). As you state, >> the performance numbers will then likely make sense. Of course, they >> still won't explain why ParNew does better. As Vitaly conjectures, >> the difference is likely in better object co-location with ParNew's >> slightly more DFS-like evacuation compared with DefNew's considerably >> more BFS-like evacuation because of the latter's use of a pure Cheney >> scan compared with the use of (a) marking stack(s) in the former, as >> far as i can remember the code. One way to tell if that accounts for >> the difference is to measure the cache-miss rates in the two cases >> (and may be use a good tool like Solaris perf analyzer to show you >> where the misses are coming from as well). > > Thanks for bringing the DFS/BFS difference up. This is exactly the kind > of difference I was looking for. My guess is that this is what causes > the difference in JBB score. I'll see if I can investigate this further. >> Also curious if you can share the two sets of GC logs, by chance? >> (specJBB is a for-fee benchmark so is not freely available to the >> individual developer.) > > I have a fairly large set of logs, but the runs are very stable so I'm > just attaching logs for one run for each collector. For comparison I > have also been running ParallelScavenge with one thread. I'm using > separate gc logs and jbb logs. The log files called ".result" are the > jbb output. The other logs are the gc logs. > > I'm running with a heap size of 1GB to avoid full GCs. All runs have the > two System.gc() induced full GCs but no other. ParallelScavenge is > performing even better than ParNew, but I am mostly interested in the > difference between ParNew and DefNew. 
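The object co-location conjecture quoted above is easiest to see with a toy traversal. The sketch below is unrelated to the HotSpot sources: it only contrasts the placement order produced by a queue-driven copy (BFS-like, as in a Cheney scan) with a stack-driven copy (DFS-like, as with a marking stack), which is the locality difference Ramki and Vitaly are speculating about. In the BFS order children land far from their parents; in the DFS order each subtree stays together.

    // Toy illustration only: compare the copy order of a queue-driven
    // (Cheney/BFS-like) traversal with a stack-driven (DFS-like) traversal.
    #include <cstdio>
    #include <deque>
    #include <vector>

    struct Obj { int id; std::vector<Obj*> refs; };

    static void copy_order(Obj* root, bool breadth_first) {
      std::deque<Obj*> work;                 // queue for BFS, stack for DFS
      work.push_back(root);
      std::printf(breadth_first ? "BFS-like (Cheney scan): "
                                : "DFS-like (marking stack): ");
      while (!work.empty()) {
        Obj* o = breadth_first ? work.front() : work.back();
        if (breadth_first) work.pop_front(); else work.pop_back();
        std::printf("%d ", o->id);           // order in which objects are placed
        for (Obj* c : o->refs) work.push_back(c);
      }
      std::printf("\n");
    }

    int main() {
      // root -> {a, b}; a -> {a1, a2}; b -> {b1}
      Obj a1{2, {}}, a2{3, {}}, b1{5, {}};
      Obj a{1, {&a1, &a2}}, b{4, {&b1}};
      Obj root{0, {&a, &b}};
      copy_order(&root, true);    // 0 1 4 2 3 5: children far from parents
      copy_order(&root, false);   // 0 4 5 1 3 2: each subtree stays together
      return 0;
    }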
> > A quick summary of the data in the logs: > > Score #GCs AverageGCTime > DefNew: 57903 2083 0.044053195391262644 > ParNew: 61363 2213 0.05931835969272489 > PS: 69697 2213 0.06117092860370538 > > ParNew has a better score even though it does more GCs and they take longer. > > If you have any insights from looking at the logs I would be very happy > to hear about it. > > Thanks, > Bengt >> >> thanks. >> -- ramki >> >> On Fri, Jan 11, 2013 at 4:57 AM, Bengt Rutisson >> > wrote: >> >> >> Hi Vitaly, >> >> >> On 1/11/13 1:45 PM, Vitaly Davidovich wrote: >>> >>> Hi Bengt, >>> >>> Regarding the benchmark score, are you saying ParNew has longer >>> cumulative GC time or just the average is higher? If it's just >>> average, maybe the total # of them (and cumulative time) is >>> less. I don't know the characteristics of this particular >>> specjbb benchmark, but perhaps having fewer total GCs is better >>> because of the overhead of getting all threads to a safe point, >>> going go the OS to suspend them, and then restarting them. After >>> they're restarted, the CPU cache may be cold for it because the >>> GC thread polluted it. Or I'm entirely wrong in my speculation >>> ... :). >>> >> >> You have a good point about the number of GCs. The problem in my >> runs is that ParNew does more GCs than DefNew. So there are both >> more of them and their average time is higher, but the score is >> still better. That ParNew does more GCs is not that strange. It >> has a higher score, which means that it had higher throughput and >> had time to create more objects. So, that is kind of expected. But >> I don't understand how it can have higher throughput when the GCs >> take longer. My current guess is that it does something >> differently with how objects are copied in a way that is >> beneficial for the execution time between GCs. >> >> It also seems like ParNew keeps more objects alive for each GC. >> That is either the reason why it does more and more frequent GCs >> than DefNew, or it is an effect of the fact that more objects are >> created due to the higher throughput. This is the reason I started >> looking at the tenuring threshold. >> >> Bengt >> >> >>> Thanks >>> >>> Sent from my phone >>> >>> On Jan 11, 2013 6:02 AM, "Bengt Rutisson" >>> > wrote: >>> >>> >>> Hi Ramki, >>> >>> Thanks for looking at this! >>> >>> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >>>> Hi Bengt -- >>>> >>>> The change looks reasonable, but I have a comment and a >>>> follow-up question. >>>> >>>> Not your change, but I'd elide the "half the real survivor >>>> size" since it's really a configurable parameter based on >>>> TargetSurvivorRatio with default half. >>>> I'd leave the comment as "set the new tenuring threshold and >>>> desired survivor size". >>> >>> I'm fine with removing this from the comment, but I thought >>> the "half the real survivor size" aimed at the fact that we >>> pass only the "to" capacity and not the "from" capacity in to >>> compute_tenuring_threshold(). With that interpretation I >>> think the comment is correct. >>> >>> Would you like me to remove it anyway? Either way is fine >>> with me. >>> >>>> I'm curious though, as to what performance data prompted >>>> this change, >>> Good point. This change was preceded by an internal >>> discussion in the GC team, so I should probably have >>> explained the background more in my review request to the open. >>> >>> I was comparing the ParNew and DefNew implementation since I >>> am seeing some strange differences in some SPECjbb2005 >>> results. 
I am running ParNew with a single thread and get >>> much better score than with DefNew. But I also get higher >>> average GC times. So, I was trying to figure out what DefNew >>> and ParNew does differently. >>> >>> When I was looking at DefNewGeneration::collect() and >>> ParNewGeneration::collect() I saw that they contain a whole >>> lot of code duplication. It would be tempting to try to >>> extract the common code out into DefNewGeneration since it is >>> the super class. But there are some minor differences. One of >>> them was this issue with how they handle the tenuring threshold. >>> >>> We tried to figure out if there is a reason for ParNew and >>> DefNew to behave different in this regard. We could not come >>> up with any good reason for that. So, we needed to figure out >>> if we should change ParNew or DefNew to make them consistent. >>> The decision to change ParNew was based on two things. First, >>> it seems wrong to use the data from a collection that got >>> promotion failure. This collection will not have allowed the >>> tenuring threshold to fulfill its purpose. Second, >>> ParallelScavenge works the same way as DefNew. >>> >>> BTW, the difference between DefNew and ParNew seems to have >>> been there from the start. So, there is no bug or changeset >>> in mercurial or TeamWare to explain why the difference was >>> introduced. >>> >>> (Just to be clear, this difference was not the cause of my >>> performance issue. I still don't have a good explanation for >>> how ParNew can have longer GC times but better SPECjbb score.) >>> >>>> and whether it might make sense, upon a promotion failure to >>>> do something about the tenuring threshold for the next >>>> scavenge (i.e. for example make the tenuring threshold half >>>> of its current value as a reaction to the fact that >>>> promotion failed). Is it currently left at its previous >>>> value or is it asjusted back to the default max value (which >>>> latter may be the wrong thing to do) or something else? >>> >>> As far as I can tell the tenuring threshold is left untouched >>> if we get a promotion failure. It is probably a good idea to >>> update it in some way. But I would prefer to handle that as a >>> separate bug fix. >>> >>> This change is mostly a small cleanup to make >>> DefNewGeneration::collect() and ParNewGeneration::collect() >>> be more consistent. We've done the thinking so, it's good to >>> make the change in preparation for the next person that comes >>> a long and has a few cycles over and would like to merge the >>> two collect() methods in some way. >>> >>> Thanks again for looking at this! 
>>> Bengt >>> >>>> >>>> -- ramki >>>> >>>> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson >>>> >>> > wrote: >>>> >>>> >>>> Hi everyone, >>>> >>>> Could I have a couple of reviews for this small change >>>> to make DefNew and ParNew be more consistent in the way >>>> they treat the tenuring threshold: >>>> >>>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >>>> >>>> >>>> Thanks, >>>> Bengt >>>> >>>> >>> >> >> > From kirk at kodewerk.com Sat Jan 12 12:39:14 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Sat, 12 Jan 2013 07:39:14 -0500 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> Message-ID: <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> Hi Charlie, In this case I would have to say that having more frequent GCs that succeed is much better than evacuation failures. Also having different values for different heap sizes is really confusing. Is it really necessary to have different percentages for different heap sizes and is so is there a known gradient for correlating the size vs percent? As an aside, I had fun this week playing with G1 for a desktop app that did some things with video that makes it exceptionally pause time sensitive. Video processing of this nature gives some very very steady object allocation rates which does simplify the tuning process. I decided to try G1 because I couldn't get a generational Eden to be as small as I needed to ensure frequent enough young gen collections while maintaining large enough survivor spaces to capture short lived objects. There was another issue in that the deployment environment didn't leave me feeling wonderful about deploying a right sized (small) heap. What I was looking to do was deploy with a large heap that behaved right sized in order to avoid possible OOMEs. The box has 24 cores and this is a perfect case for iCMS as the larger heap caused pause times to creep up until a CMS cycle was triggered.. this doesn't happen with iCMS but since it's been (or about to be) deprecated.... Anyways, in starting with default values G1 young size fell to a value that was about 4x the desired young gen size in CMS which resulted in too infrequent collections and missed pause time goals. This is were things ended for the moment. I'm going to force young gen to be smaller but I'm also observing that tenured collections are not happening frequently enough and that again appears to destabilize pause times for young gen collections. Do you have any advice on how one might best increase the frequency of tenured space marking? Regards, Kirk On 2013-01-11, at 2:32 PM, Charlie Hunt wrote: > Hi John, > > Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. > > I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. > > So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? 
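To put rough, illustrative numbers on the trade-off Charlie describes: both flags are percentages of the Java heap, so on an 8 GB heap a G1MaxNewSizePercent of 60 still allows a young generation of about 4.8 GB, meaning the cap mostly protects against an evacuation too large to complete, while a small floor (say a G1NewSizePercent of 5) only pins the minimum young generation at about 400 MB, where the worst case is some pauses coming in well under the target.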
> > I'd be interested in hearing your thoughts along with Monica's, Jon Masa, John Coomes and Ramki, if they have time of course. > > hths, > > charlie ... > > On Jan 10, 2013, at 12:47 PM, John Cuthbertson wrote: > >> Hi Bengt, >> >> Thanks for reviewing the code. Replies inline... >> >> On 1/10/2013 2:07 AM, Bengt Rutisson wrote: >>> >>> Hi John, >>> >>> Changes look good. >>> >>> One question about G1NewSizePercent and G1MaxNewSizePercent. Why are >>> these only changed for heap sizes below 4GB? I would think that at >>> least the reduction of G1NewSizePercent would be even more important >>> for larger heap sizes. If we want to get lower pause times on larger >>> heaps we need to be able to have a small young gen size. >> >> The simple answer is: it was suggested by Monica and Charlie. Personally >> I'm OK with making the new values of G1NewSizePercent and >> G1MaxNewSizePercent the defaults for all heap sizes and we might (or >> most likely will) go there in the future - but for the moment we're >> being conservative. >> >> As I mentioned we would like to make G1 a bit more adaptive - and both >> Monica and Charlie have some ideas in that area. >> >>> >>> Also, your change in arguments.cpp is guarded by #ifndef SERIALGC. >>> This is correct of course, but Joe Provino has a change out that will >>> replace this kind of check with #if INCLUDE_ALL_GCS: >>> >>> http://cr.openjdk.java.net/~jprovino/8005915/webrev.00 >>> >>> Neither you nor Joe will get any merge conflicts if your changes are >>> both pushed. It will even still compile. But the code inside #ifndef >>> SERIALGC will never be executed. So, it might be good to keep any eye >>> out for how Joe's change propagate through the repositories to make >>> sure that you can manually resolve this. >>> >>> My guess is that Joe's change will have to wait a while since it >>> includes make file changes that potentially interfere with changes for >>> the new build system. So, hopefully you get to push this first :) >> >> Thanks. I've been watching the progress that Joe's change has been >> making. I guess you can think of this change as the one for hs24 and >> another with the SERIALGC changed appropriately being for hs25. :) >> >> Thanks, >> >> JohnC >> > From bernd-2012 at eckenfels.net Sat Jan 12 14:26:07 2013 From: bernd-2012 at eckenfels.net (Bernd Eckenfels) Date: Sat, 12 Jan 2013 15:26:07 +0100 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> Message-ID: Hello, I know this is G1 ralted, but I just wanted to comment on a different approach. Am 12.01.2013, 13:39 Uhr, schrieb Kirk Pepperdine : > I decided to try G1 because I couldn't get a generational Eden to be as > small as I needed to ensure frequent enough young gen collections while > maintaining large enough survivor spaces to capture short lived objects. In this case I would shoot for a small YG with TenuringThreshold of 0 or 1 resulting in a collection frequency of a few seconds. I cant imagine in a tight processing loop there are objects which are longer lived than one YG cycle. But even if there are, they spill over in the CMS where they are processed in the background. 
Having smaller survivors is essential in that configuration as they will affect the STW pause times of the CMS to the largest degree. > There was another issue in that the deployment environment didn't leave > me feeling wonderful about deploying a right sized (small) heap. What I > was looking to do was deploy with a large heap that behaved right sized > in order to avoid possible OOMEs. The box has 24 cores and this is a > perfect case for iCMS as the larger heap caused pause times to creep up > until a CMS cycle was triggered.. this doesn't happen with iCMS You dont need iCMS for that, just use a small initiating occupancy and occupancy only. For example if the applications steady size is 10%, use this as the occupancy. You could even limit the degree of parallelity for the CMS in that case, so it will not affect your time critical calculation threads so much. Gruss Bernd From bengt.rutisson at oracle.com Mon Jan 14 11:03:50 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Mon, 14 Jan 2013 11:03:50 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8004018: Remove old initialization flags Message-ID: <20130114110358.9CD5547242@hg.openjdk.java.net> Changeset: 689e1218d7fe Author: brutisso Date: 2013-01-14 09:58 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/689e1218d7fe 8004018: Remove old initialization flags Reviewed-by: dholmes, stefank Contributed-by: erik.helin at oracle.com ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/thread.cpp From bengt.rutisson at oracle.com Mon Jan 14 13:13:09 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 14 Jan 2013 14:13:09 +0100 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50F08FC7.3070701@Oracle.COM> References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> <50F00C22.2060201@oracle.com> <50F081DB.4040904@oracle.com> <50F08FC7.3070701@Oracle.COM> Message-ID: <50F40465.50400@oracle.com> Hi Peter, Thanks for looking at this! On 1/11/13 11:18 PM, Peter B. Kessler wrote: > I don't see -XX:+AlwaysPreTouch in your command line. (Mostly because > I'm not sure I see the command line: for example, I don't see your 1GB > heap setting.) > > When you are watching GC performance before the first full collection, > you have to remember those faults for the OS to populate the old > generation. If you do that during promotions, you do it one page at a > time. -XX:+AlwaysPreTouch touches all of the committed old generation > during start-up, which turns out to be much faster. (And doesn't bill > the time to GC. :-) > > Maybe those forced System.gc() calls at the beginning touch all of the > old generation? Try a runs with and without -XX:+AlwaysPreTouch and > report the results, because I'm curious. AlwaysPreTouch should not make a difference since SPECjbb does two System.gc()s in the begin. But I did do two runs with -XX:+AlwaysPreTouch just in case. The resulting log files are attached. As you can see the results are the same. defnew: 56964 parnew: 59224 Here is the command line for the runs: java -server -XX:+AlwaysPreTouch -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:-PrintGCCause -cp ./jbb.jar:./check.jar -Xms1g -Xmx1g spec.jbb.JBBmain -propfile SPECjbb.props I add "-XX:+UseSerialGC" for the defnew runs and "-XX:+UseParNewGC -XX:ParallelGCThreads=1" for the parnew runs. > > > I also see that the young generation spaces are different between the > runs. 
Looking at the heap shape printed at th end of the GC logs > > def new generation total 314560K, used 232441K > [0x00000000c0000000, 0x00000000d5550000, 0x00000000d5550000) > eden space 279616K, 76% used [0x00000000c0000000, > 0x00000000cd0b4d38, 0x00000000d1110000) > from space 34944K, 53% used [0x00000000d1110000, > 0x00000000d2359a18, 0x00000000d3330000) > to space 34944K, 0% used [0x00000000d3330000, > 0x00000000d3330000, 0x00000000d5550000) > > par new generation total 314560K, used 74831K > [0x00000000c0000000, 0x00000000d5550000, 0x00000000d5550000) > eden space 279616K, 18% used [0x00000000c0000000, > 0x00000000c33154f8, 0x00000000d1110000) > from space 34944K, 64% used [0x00000000d1110000, > 0x00000000d270e760, 0x00000000d3330000) > to space 34944K, 0% used [0x00000000d3330000, > 0x00000000d3330000, 0x00000000d5550000) > > PSYoungGen total 329728K, used 93207K [0x00000000eaab0000, > 0x0000000100000000, 0x0000000100000000) > eden space 309952K, 23% used > [0x00000000eaab0000,0x00000000ef30dd08,0x00000000fd960000) > from space 19776K, 96% used > [0x00000000fd960000,0x00000000fec08000,0x00000000fecb0000) > to space 19712K, 0% used > [0x00000000fecc0000,0x00000000fecc0000,0x0000000100000000) > > The PSYoung eden is 10% larger than the others (because the survivors > are smaller?). Right. This is because the PS runs have UseAdaptiveSizePolicy enabled. I mostly included the PS runs for comparison. My main interest in in the difference between DefNew and ParNew. They have the same size eden and survivors. > The sizes and occupancy of the survivors is different between DefNew > and ParNew if you look in the logs. Yes, this is what triggered me to go and look at the code for how we update the tenuring threshold. It looks like we use the same ergonomics except for the difference that this review request is addressing. But this difference is not the cause of the performance difference since I don't get any promotion failures. Thanks again for looking at this! Bengt > > ... peter > > Bengt Rutisson wrote: >> >> >> Hi Ramki, >> >> On 1/11/13 6:50 PM, Srinivas Ramakrishna wrote: >>> Hi Bengt -- >>> >>> Try computing the GC overhead by normalizing wrt the work done (for >>> which the net allocation volume might be a good proxy). As you >>> state, the performance numbers will then likely make sense. Of >>> course, they still won't explain why ParNew does better. As Vitaly >>> conjectures, the difference is likely in better object co-location >>> with ParNew's slightly more DFS-like evacuation compared with >>> DefNew's considerably more BFS-like evacuation because of the >>> latter's use of a pure Cheney scan compared with the use of (a) >>> marking stack(s) in the former, as far as i can remember the code. >>> One way to tell if that accounts for the difference is to measure >>> the cache-miss rates in the two cases (and may be use a good tool >>> like Solaris perf analyzer to show you where the misses are coming >>> from as well). >> >> Thanks for bringing the DFS/BFS difference up. This is exactly the >> kind of difference I was looking for. My guess is that this is what >> causes the difference in JBB score. I'll see if I can investigate >> this further. >>> Also curious if you can share the two sets of GC logs, by chance? >>> (specJBB is a for-fee benchmark so is not freely available to the >>> individual developer.) >> >> I have a fairly large set of logs, but the runs are very stable so >> I'm just attaching logs for one run for each collector. 
For >> comparison I have also been running ParallelScavenge with one thread. >> I'm using separate gc logs and jbb logs. The log files called >> ".result" are the jbb output. The other logs are the gc logs. >> >> I'm running with a heap size of 1GB to avoid full GCs. All runs have >> the two System.gc() induced full GCs but no other. ParallelScavenge >> is performing even better than ParNew, but I am mostly interested in >> the difference between ParNew and DefNew. >> >> A quick summary of the data in the logs: >> >> Score #GCs AverageGCTime >> DefNew: 57903 2083 0.044053195391262644 >> ParNew: 61363 2213 0.05931835969272489 >> PS: 69697 2213 0.06117092860370538 >> >> ParNew has a better score even though it does more GCs and they take >> longer. >> >> If you have any insights from looking at the logs I would be very >> happy to hear about it. >> >> Thanks, >> Bengt >>> >>> thanks. >>> -- ramki >>> >>> On Fri, Jan 11, 2013 at 4:57 AM, Bengt Rutisson >>> > wrote: >>> >>> >>> Hi Vitaly, >>> >>> >>> On 1/11/13 1:45 PM, Vitaly Davidovich wrote: >>>> >>>> Hi Bengt, >>>> >>>> Regarding the benchmark score, are you saying ParNew has longer >>>> cumulative GC time or just the average is higher? If it's just >>>> average, maybe the total # of them (and cumulative time) is >>>> less. I don't know the characteristics of this particular >>>> specjbb benchmark, but perhaps having fewer total GCs is better >>>> because of the overhead of getting all threads to a safe point, >>>> going go the OS to suspend them, and then restarting them. After >>>> they're restarted, the CPU cache may be cold for it because the >>>> GC thread polluted it. Or I'm entirely wrong in my speculation >>>> ... :). >>>> >>> >>> You have a good point about the number of GCs. The problem in my >>> runs is that ParNew does more GCs than DefNew. So there are both >>> more of them and their average time is higher, but the score is >>> still better. That ParNew does more GCs is not that strange. It >>> has a higher score, which means that it had higher throughput and >>> had time to create more objects. So, that is kind of expected. But >>> I don't understand how it can have higher throughput when the GCs >>> take longer. My current guess is that it does something >>> differently with how objects are copied in a way that is >>> beneficial for the execution time between GCs. >>> >>> It also seems like ParNew keeps more objects alive for each GC. >>> That is either the reason why it does more and more frequent GCs >>> than DefNew, or it is an effect of the fact that more objects are >>> created due to the higher throughput. This is the reason I started >>> looking at the tenuring threshold. >>> >>> Bengt >>> >>> >>>> Thanks >>>> >>>> Sent from my phone >>>> >>>> On Jan 11, 2013 6:02 AM, "Bengt Rutisson" >>>> > >>>> wrote: >>>> >>>> >>>> Hi Ramki, >>>> >>>> Thanks for looking at this! >>>> >>>> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >>>>> Hi Bengt -- >>>>> >>>>> The change looks reasonable, but I have a comment and a >>>>> follow-up question. >>>>> >>>>> Not your change, but I'd elide the "half the real survivor >>>>> size" since it's really a configurable parameter based on >>>>> TargetSurvivorRatio with default half. >>>>> I'd leave the comment as "set the new tenuring threshold and >>>>> desired survivor size". 
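As a side note for anyone following the thread: the "half" comes from the default value of TargetSurvivorRatio. A simplified sketch of the calculation in the ageTable code - the shape of it rather than the exact HotSpot source, with schematic parameter names:

    // Sketch: derive the tenuring threshold from the per-age survivor
    // occupancy. 'sizes' is the table of surviving words per age.
    // TargetSurvivorRatio defaults to 50, hence "half the real survivor size".
    static unsigned int compute_tenuring_threshold_sketch(const size_t* sizes,
                                                          unsigned int table_size,
                                                          size_t survivor_capacity,
                                                          unsigned int target_survivor_ratio,
                                                          unsigned int max_tenuring_threshold) {
      size_t desired_survivor_size =
          (size_t)(((double)survivor_capacity * target_survivor_ratio) / 100);
      size_t total = 0;
      unsigned int age = 1;
      while (age < table_size) {
        total += sizes[age];
        if (total > desired_survivor_size) {
          break;  // objects of this age and older get tenured at the next scavenge
        }
        age++;
      }
      return age < max_tenuring_threshold ? age : max_tenuring_threshold;
    }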
>>>> >>>> I'm fine with removing this from the comment, but I thought >>>> the "half the real survivor size" aimed at the fact that we >>>> pass only the "to" capacity and not the "from" capacity in to >>>> compute_tenuring_threshold(). With that interpretation I >>>> think the comment is correct. >>>> >>>> Would you like me to remove it anyway? Either way is fine >>>> with me. >>>> >>>>> I'm curious though, as to what performance data prompted >>>>> this change, >>>> Good point. This change was preceded by an internal >>>> discussion in the GC team, so I should probably have >>>> explained the background more in my review request to the >>>> open. >>>> >>>> I was comparing the ParNew and DefNew implementation since I >>>> am seeing some strange differences in some SPECjbb2005 >>>> results. I am running ParNew with a single thread and get >>>> much better score than with DefNew. But I also get higher >>>> average GC times. So, I was trying to figure out what DefNew >>>> and ParNew does differently. >>>> >>>> When I was looking at DefNewGeneration::collect() and >>>> ParNewGeneration::collect() I saw that they contain a whole >>>> lot of code duplication. It would be tempting to try to >>>> extract the common code out into DefNewGeneration since it is >>>> the super class. But there are some minor differences. One of >>>> them was this issue with how they handle the tenuring >>>> threshold. >>>> >>>> We tried to figure out if there is a reason for ParNew and >>>> DefNew to behave different in this regard. We could not come >>>> up with any good reason for that. So, we needed to figure out >>>> if we should change ParNew or DefNew to make them consistent. >>>> The decision to change ParNew was based on two things. First, >>>> it seems wrong to use the data from a collection that got >>>> promotion failure. This collection will not have allowed the >>>> tenuring threshold to fulfill its purpose. Second, >>>> ParallelScavenge works the same way as DefNew. >>>> >>>> BTW, the difference between DefNew and ParNew seems to have >>>> been there from the start. So, there is no bug or changeset >>>> in mercurial or TeamWare to explain why the difference was >>>> introduced. >>>> >>>> (Just to be clear, this difference was not the cause of my >>>> performance issue. I still don't have a good explanation for >>>> how ParNew can have longer GC times but better SPECjbb score.) >>>> >>>>> and whether it might make sense, upon a promotion failure to >>>>> do something about the tenuring threshold for the next >>>>> scavenge (i.e. for example make the tenuring threshold half >>>>> of its current value as a reaction to the fact that >>>>> promotion failed). Is it currently left at its previous >>>>> value or is it asjusted back to the default max value (which >>>>> latter may be the wrong thing to do) or something else? >>>> >>>> As far as I can tell the tenuring threshold is left untouched >>>> if we get a promotion failure. It is probably a good idea to >>>> update it in some way. But I would prefer to handle that as a >>>> separate bug fix. >>>> >>>> This change is mostly a small cleanup to make >>>> DefNewGeneration::collect() and ParNewGeneration::collect() >>>> be more consistent. We've done the thinking so, it's good to >>>> make the change in preparation for the next person that comes >>>> a long and has a few cycles over and would like to merge the >>>> two collect() methods in some way. >>>> >>>> Thanks again for looking at this! 
>>>> Bengt >>>> >>>>> >>>>> -- ramki >>>>> >>>>> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson >>>>> >>>> > wrote: >>>>> >>>>> >>>>> Hi everyone, >>>>> >>>>> Could I have a couple of reviews for this small change >>>>> to make DefNew and ParNew be more consistent in the way >>>>> they treat the tenuring threshold: >>>>> >>>>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >>>>> >>>>> >>>>> Thanks, >>>>> Bengt >>>>> >>>>> >>>> >>> >>> >> -------------- next part -------------- A non-text attachment was scrubbed... Name: alwayspretouch.zip Type: application/zip Size: 130581 bytes Desc: not available URL: From jon.masamitsu at oracle.com Mon Jan 14 15:30:40 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 14 Jan 2013 07:30:40 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy Message-ID: <50F424A0.6080907@oracle.com> 8005452: Create new flags for Metaspace resizing policy Previously the calculation of the metadata capacity at which to do a GC (high water mark, HWM) to recover unloaded classes used the MinHeapFreeRatio and MaxHeapFreeRatio to decide on the next HWM. That generally left an excessive amount of unused capacity for metadata. This change adds specific flags for metadata capacity with defaults more conservative in terms of unused capacity. Added an additional check for doing a GC before expanding the metadata capacity. Required adding a new parameter to get_new_chunk(). Added some additional diagnostic prints. http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ Thanks. From jesper.wilhelmsson at oracle.com Mon Jan 14 17:10:38 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 14 Jan 2013 18:10:38 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs Message-ID: <50F43C0E.1080308@oracle.com> Hi, I would like a couple of reviews of a small fix for JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs Webrev: http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ Summary: When starting HotSpot with an OldSize larger than the default heap size one will run into a couple of problems. Basically what happens is that the OldSize is ignored because it is incompatible with the heap size. A debug build will assert since a calculation on the way results in a negative number, but since it is a size_t an if(x<0) won't trigger and the assert catches it later on as incompatible flags. Changes: I have made two changes to fix this. The first is to change the calculation in TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it won't result in a negative number in the if statement. This way we will catch the case where the OldSize is larger than the heap size and adjust the OldSize instead of the young size. There are also some cosmetic changes here. For instance the argument min_gen0_size is actually used for the old generation size which was a bit confusing initially. I renamed it to min_gen1_size (which it already was called in the header file). The second change is in Arguments::set_heap_size. My reasoning here is that if the user sets the OldSize we should probably adjust the heap size to accommodate that OldSize instead of complaining that the heap is too small. We determine the heap size first and the generation sizes later on while initializing the VM. To be able to fit the generations if the user specifies sizes on the command line we need to look at the generation size flags a little already when setting up the heap size. 
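In pseudo-code the intent is roughly the following - a sketch of the idea only, not the actual patch (the real code is in the webrev):

    // In Arguments::set_heap_size(): if the user put OldSize on the command
    // line, make sure the default maximum heap is large enough to hold both
    // generations instead of silently ignoring OldSize.
    if (FLAG_IS_CMDLINE(OldSize)) {
      julong young = FLAG_IS_CMDLINE(NewSize) ? (julong)NewSize
                                              : (julong)(OldSize / NewRatio);
      reasonable_max = MAX2(reasonable_max, (julong)OldSize + young);
    }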
Thanks, /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available URL: From jon.masamitsu at oracle.com Mon Jan 14 18:00:43 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 14 Jan 2013 10:00:43 -0800 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F43C0E.1080308@oracle.com> References: <50F43C0E.1080308@oracle.com> Message-ID: <50F447CB.1000604@oracle.com> Jesper, I'm a bit concerned that set_heap_size() now knows about how the CollectorPolicy uses OldSize and NewSize. In the distant past set_heap_size() did not know what kind of collector was going to be used and probably avoided looking at those parameters for that reason. Today we know that a generational collector is to follow but maybe you could hide that knowledge in CollectorPolicy somewhere and have set_heap_size() call into CollectorPolicy to use that information? Jon On 01/14/13 09:10, Jesper Wilhelmsson wrote: > Hi, > > I would like a couple of reviews of a small fix for JDK-6348447 - > Specifying -XX:OldSize crashes 64-bit VMs > > Webrev: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ > > Summary: > When starting HotSpot with an OldSize larger than the default heap > size one will run into a couple of problems. Basically what happens is > that the OldSize is ignored because it is incompatible with the heap > size. A debug build will assert since a calculation on the way results > in a negative number, but since it is a size_t an if(x<0) won't > trigger and the assert catches it later on as incompatible flags. > > Changes: > I have made two changes to fix this. > > The first is to change the calculation in > TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it won't > result in a negative number in the if statement. This way we will > catch the case where the OldSize is larger than the heap size and > adjust the OldSize instead of the young size. There are also some > cosmetic changes here. For instance the argument min_gen0_size is > actually used for the old generation size which was a bit confusing > initially. I renamed it to min_gen1_size (which it already was called > in the header file). > > The second change is in Arguments::set_heap_size. My reasoning here is > that if the user sets the OldSize we should probably adjust the heap > size to accommodate that OldSize instead of complaining that the heap > is too small. We determine the heap size first and the generation > sizes later on while initializing the VM. To be able to fit the > generations if the user specifies sizes on the command line we need to > look at the generation size flags a little already when setting up the > heap size. > > Thanks, > /Jesper > From bengt.rutisson at oracle.com Mon Jan 14 21:06:41 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 14 Jan 2013 22:06:41 +0100 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations Message-ID: <50F47361.7020306@oracle.com> Hi all, Could I have a couple of reviews for this small change? http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ Thanks to John Cuthbertson for finding this bug and providing excellent data to track down the issue. From the bug report: In non-product builds the WorkerDataArrays in G1 are initialized to -1 in WorkerDataArray::reset() when a GC starts. 
At the end of a GC WorkerDataArray::verify() verifies that all entries in a WorkerDataArray has been set. Currently it does this by asserting that the entries are >= 0. This is fine in theory since the entries should contain counts or times that are all positive. The problem is that some WorkerDataArrays are of type double. And some of those are set up through calculations using doubles. If those calculations result in a value close to 0 we could end up with a value slightly less than 0 since double calculations don't have full precision. All we really want to verify is that all the entries were set. So, it should be enough to verify that entries do not contain the value set by the reset() method. Bengt -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Tue Jan 15 01:19:32 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Tue, 15 Jan 2013 01:19:32 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred Message-ID: <20130115011936.59EC547273@hg.openjdk.java.net> Changeset: a30e7b564541 Author: brutisso Date: 2013-01-14 21:30 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a30e7b564541 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred Reviewed-by: ysr, johnc, jwilhelm ! src/share/vm/gc_implementation/parNew/parNewGeneration.cpp ! src/share/vm/gc_implementation/parNew/parNewGeneration.hpp ! src/share/vm/memory/defNewGeneration.cpp ! src/share/vm/memory/defNewGeneration.hpp From bengt.rutisson at oracle.com Tue Jan 15 09:18:56 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 15 Jan 2013 10:18:56 +0100 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <50EF4D88.2050906@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> Message-ID: <50F51F00.6040008@oracle.com> Hi John, Sorry for being late in replying. Thanks for taking the time to explain this in detail! :) A couple of comments inline. On 1/11/13 12:23 AM, John Cuthbertson wrote: > Hi Bengt, > > Thanks for looking over the code changes. Be prepared for some gory > details. :) Replies inline... > > On 1/9/2013 7:35 AM, Bengt Rutisson wrote: >> >> In ConcurrentMark::weakRefsWork() there is this code: >> >> 2422 if (rp->processing_is_mt()) { >> 2423 // Set the degree of MT here. If the discovery is done >> MT, there >> 2424 // may have been a different number of threads doing the >> discovery >> 2425 // and a different number of discovered lists may have Ref >> objects. >> 2426 // That is OK as long as the Reference lists are balanced >> (see >> 2427 // balance_all_queues() and balance_queues()). >> 2428 rp->set_active_mt_degree(active_workers); >> 2429 } >> >> Could we now always call rp->set_active_mt_degree() ? Maybe I am >> missing the point here, but I thought that we are now using queues >> and all rp->set_active_mt_degree() does is set the number of queues >> to active_workers. Which will be 1 for the single threaded mode. > > Yes - most likely we can but I would prefer to set active_workers using: > > // We need at least one active thread. If reference processing is > // not multi-threaded we use the current (ConcurrentMarkThread) > thread, > // otherwise we use the work gang from the G1CollectedHeap and we > // utilize all the worker threads we can. 
> uint active_workers = (rp->processing_is_mt() && g1h->workers() != > NULL > ? g1h->workers()->active_workers() > : 1U); > > since single threaded versus multi-threaded reference processing is > determined using the ParallelRefProcEnabled flag. The number of active > workers here is the number of workers in G1's STW work gang - which is > controlled via ParallelGCThreads. I see. I didn't think about the difference betweeen ParallelGCThreads and ParallelRefProcEnabled. BTW, not part of this change, but why do we have ParallelRefProcEnabled? And why is it false by default? Wouldn't it make more sense to have it just be dependent on ParallelGCThreads? >> >> If we do that we can also remove the first part of the assert a bit >> further down: >> >> 2449 assert(!rp->processing_is_mt() || rp->num_q() == >> active_workers, "why not"); > > OK. Done. > >> >> Also, I think I would like to move the code you added to >> G1CMParDrainMarkingStackClosure::do_void() into >> ConcurrentMark::weakRefsWork() somewhere. Maybe something like: >> >> if (!rp->processing_is_mt()) { >> set_phase(1, false /* concurrent */); >> } >> >> It is a bit strange to me that G1CMParDrainMarkingStackClosure should >> set this up. If we really want to keep it in >> G1CMParDrainMarkingStackClosure I think the constructor would be a >> better place to do it than do_void(). > > Setting it once in weakRefsWork() will not be sufficient. We will run > into an assertion failure in ParallelTaskTerminator::offer_termination(). > > During the reference processing, the do_void() method of the > complete_gc oop closure (in our case the complete gc oop closure is an > instance of G1CMParDrainMarkingStackClosure) is called multiple times > (in process_phase1, sometimes process_phase2, process_phase3, and > process_phaseJNI) > > Setting the phase sets the number of active tasks (or threads) that > the termination protocol in do_marking_step() will wait for. When an > invocation of do_marking_step() offers termination, the number of > tasks/threads in the terminator instance is decremented. So Setting > the phase once will let the first execution of do_marking_step (with > termination) from process_phase1() succeed, but subsequent calls to > do_marking_step() will result in the assertion failure. > > We also can't unconditionally set it in the do_void() method or even > the constructor of G1CMParDrainMarkingStackClosure. Separate instances > of this closure are created by each of the worker threads in the MT-case. > > Note when processing is multi-threaded the complete_gc instance used > is the one passed into the ProcessTask's work method (passed into > process_discovered_references() using the task executor instance) > which may not necessarily be the same complete gc instance as the one > passed directly into process_discovered_references(). Thanks for this detailed explanation. It really helped! I understand the issue now, but I still think it is very confusing that _cm->set_phase() is called from G1CMRefProcTaskExecutor::execute() in the multithreaded case and from G1CMParDrainMarkingStackClosure::do_void() in the single threaded case. > It might be possible to record whether processing is MT in the > G1CMRefProcTaskExecutor class and always pass the executor instance > into process_discovered_references. We could then set processing to MT > so that the execute() methods in the executor instance are invoked but > call the Proxy class' work method directly. 
Then we could override the > set_single_threaded() routine (called just before process_phaseJNI) to > set the phase. I think this would be a better solution, but if I understand it correctly it would mean that we would have to change all the collectors to always pass a TaskExecutor. All of them currently pass NULL in the non-MT case. I think it would be simpler if they always passed a TaskExecutor but it is a pretty big change. Another possibility is to introduce some kind of prepare method to the VoidClosure (or maybe in a specialized subclass for ref processing). Then we could do something like: complete_gc->prologue(); if (mt_processing) { RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() /*marks_oops_alive*/); task_executor->execute(phase2); } else { for (uint i = 0; i < _max_num_q; i++) { process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc); } } G1CMParDrainMarkingStackClosure::prologue() could do the call to _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not have to do it. BTW, not really part of your change, but above code is duplicated three times in ReferenceProcessor::process_discovered_reflist(). Would be nice to factor this out to a method. Thanks, Bengt > > > I don't think that would be any clearer. > > Perhaps a better name for the _is_par flag would be _processing_is_mt > (and set it using ReferenceProcessor::processing_is_mt())? >> Some minor comments: >> >> In G1CMParKeepAliveAndDrainClosure and >> G1CMParDrainMarkingStackClosure constructors there is this assert: >> >> assert(_task->worker_id() == 0 || _is_par, "sanity"); >> >> I think it is good, but I had to think a bit about what it meant and >> to me I think it would be quicker to understand if it was the other >> way around: >> >> assert(_is_par || _task->worker_id() == 0, "Only worker 0 should be >> used if single threaded"); > > Sure no problem. Done. > >> Remove newline on line 2254 ? > > Sure. What about the blank lines on 2187 and 2302 for consistency? > >> ConcurrentMark::weakRefsWork() >> >> How about introducing a variable that either hold the value of >> &par_task_executor or NULL depending on rp->processing_is_mt()? That >> way we don't have to duplicate and inline this test twice: >> >> (rp->processing_is_mt() ? &par_task_executor : NULL) > > OK. > >> As a separate change it might be worth renaming the closures to not >> have "Par" in the name, now that they are not always parallel... >> >> G1CMParKeepAliveAndDrainClosure -> G1CMKeepAliveAndDrainClosure >> G1CMParDrainMarkingStackClosure -> G1CMDrainMarkingStackClosure >> >> But it would be very confusing to do this in the same change. > > I think we can change the names. The change shouldn't be that confusing. > > Thanks. A new webrev will appear shortly. > > JohnC From vitalyd at gmail.com Tue Jan 15 13:03:20 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 15 Jan 2013 08:03:20 -0500 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: <50F47361.7020306@oracle.com> References: <50F47361.7020306@oracle.com> Message-ID: Hi Bengt, Looks good. Do you need the constants for int/double/size_t? Would it be easier to have a static getter that returns "(T)-1" and then use that instead? Thanks Sent from my phone On Jan 14, 2013 4:08 PM, "Bengt Rutisson" wrote: > > Hi all, > > Could I have a couple of reviews for this small change? 
> http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ > > Thanks to John Cuthbertson for finding this bug and providing excellent > data to track down the issue. > > From the bug report: > > In non-product builds the WorkerDataArrays in G1 are initialized to -1 in > WorkerDataArray::reset() when a GC starts. At the end of a GC > WorkerDataArray::verify() verifies that all entries in a WorkerDataArray > has been set. Currently it does this by asserting that the entries are >= > 0. This is fine in theory since the entries should contain counts or times > that are all positive. > > The problem is that some WorkerDataArrays are of type double. And some of > those are set up through calculations using doubles. If those calculations > result in a value close to 0 we could end up with a value slightly less > than 0 since double calculations don't have full precision. > > All we really want to verify is that all the entries were set. So, it > should be enough to verify that entries do not contain the value set by the > reset() method. > > Bengt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Tue Jan 15 13:07:17 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Tue, 15 Jan 2013 14:07:17 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F447CB.1000604@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> Message-ID: <50F55485.5020705@oracle.com> Jon, Thank you for looking at this! I share your concerns and I have moved the knowledge about policies to CollectorPolicy. set_heap_size() now simply asks the collector policy if it has any recommendations regarding the heap size. Ideally, since the code knows about young and old generations, I guess the new function "recommended_heap_size()" should be placed in GenCollectorPolicy, but then the code would have to be duplicated for G1 as well. However, CollectorPolicy already know about OldSize and NewSize so I think it is OK to put it there. Eventually I think that we should reduce the abstraction level in the generation policies and merge CollectorPolicy, GenCollectorPolicy and maybe even TwoGenerationCollectorPolicy and if possible G1CollectorPolicy, so I don't worry too much about having knowledge about the two generations in CollectorPolicy. A new webrev is available here: http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ Thanks, /Jesper On 2013-01-14 19:00, Jon Masamitsu wrote: > Jesper, > > I'm a bit concerned that set_heap_size() now knows about how > the CollectorPolicy uses OldSize and NewSize. In the distant > past set_heap_size() did not know what kind of collector was > going to be used and probably avoided looking at those > parameters for that reason. Today we know that a generational > collector is to follow but maybe you could hide that knowledge > in CollectorPolicy somewhere and have set_heap_size() call into > CollectorPolicy to use that information? > > Jon > > > On 01/14/13 09:10, Jesper Wilhelmsson wrote: >> Hi, >> >> I would like a couple of reviews of a small fix for JDK-6348447 - >> Specifying -XX:OldSize crashes 64-bit VMs >> >> Webrev: >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >> >> Summary: >> When starting HotSpot with an OldSize larger than the default heap size one >> will run into a couple of problems. Basically what happens is that the >> OldSize is ignored because it is incompatible with the heap size. 
A debug >> build will assert since a calculation on the way results in a negative >> number, but since it is a size_t an if(x<0) won't trigger and the assert >> catches it later on as incompatible flags. >> >> Changes: >> I have made two changes to fix this. >> >> The first is to change the calculation in >> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it won't result in >> a negative number in the if statement. This way we will catch the case >> where the OldSize is larger than the heap size and adjust the OldSize >> instead of the young size. There are also some cosmetic changes here. For >> instance the argument min_gen0_size is actually used for the old generation >> size which was a bit confusing initially. I renamed it to min_gen1_size >> (which it already was called in the header file). >> >> The second change is in Arguments::set_heap_size. My reasoning here is that >> if the user sets the OldSize we should probably adjust the heap size to >> accommodate that OldSize instead of complaining that the heap is too small. >> We determine the heap size first and the generation sizes later on while >> initializing the VM. To be able to fit the generations if the user >> specifies sizes on the command line we need to look at the generation size >> flags a little already when setting up the heap size. >> >> Thanks, >> /Jesper >> -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 247 bytes Desc: not available URL: From vitalyd at gmail.com Tue Jan 15 13:32:30 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 15 Jan 2013 08:32:30 -0500 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F55485.5020705@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> Message-ID: Hi Jesper, Is NewRatio guaranteed to be non-zero when used inside recommended_heap_size? Thanks Sent from my phone On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" wrote: > Jon, > > Thank you for looking at this! I share your concerns and I have moved the > knowledge about policies to CollectorPolicy. set_heap_size() now simply > asks the collector policy if it has any recommendations regarding the heap > size. > > Ideally, since the code knows about young and old generations, I guess the > new function "recommended_heap_size()" should be placed in > GenCollectorPolicy, but then the code would have to be duplicated for G1 as > well. However, CollectorPolicy already know about OldSize and NewSize so I > think it is OK to put it there. > > Eventually I think that we should reduce the abstraction level in the > generation policies and merge CollectorPolicy, GenCollectorPolicy and maybe > even TwoGenerationCollectorPolicy and if possible G1CollectorPolicy, so I > don't worry too much about having knowledge about the two generations in > CollectorPolicy. > > > A new webrev is available here: > http://cr.openjdk.java.net/~**jwilhelm/6348447/webrev.2/ > > Thanks, > /Jesper > > > > On 2013-01-14 19:00, Jon Masamitsu wrote: > >> Jesper, >> >> I'm a bit concerned that set_heap_size() now knows about how >> the CollectorPolicy uses OldSize and NewSize. In the distant >> past set_heap_size() did not know what kind of collector was >> going to be used and probably avoided looking at those >> parameters for that reason. 
Today we know that a generational >> collector is to follow but maybe you could hide that knowledge >> in CollectorPolicy somewhere and have set_heap_size() call into >> CollectorPolicy to use that information? >> >> Jon >> >> >> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >> >>> Hi, >>> >>> I would like a couple of reviews of a small fix for JDK-6348447 - >>> Specifying -XX:OldSize crashes 64-bit VMs >>> >>> Webrev: >>> http://cr.openjdk.java.net/~**jwilhelm/6348447/webrev/ >>> >>> Summary: >>> When starting HotSpot with an OldSize larger than the default heap size >>> one will run into a couple of problems. Basically what happens is that the >>> OldSize is ignored because it is incompatible with the heap size. A debug >>> build will assert since a calculation on the way results in a negative >>> number, but since it is a size_t an if(x<0) won't trigger and the assert >>> catches it later on as incompatible flags. >>> >>> Changes: >>> I have made two changes to fix this. >>> >>> The first is to change the calculation in TwoGenerationCollectorPolicy:: >>> **adjust_gen0_sizes so that it won't result in a negative number in the >>> if statement. This way we will catch the case where the OldSize is larger >>> than the heap size and adjust the OldSize instead of the young size. There >>> are also some cosmetic changes here. For instance the argument >>> min_gen0_size is actually used for the old generation size which was a bit >>> confusing initially. I renamed it to min_gen1_size (which it already was >>> called in the header file). >>> >>> The second change is in Arguments::set_heap_size. My reasoning here is >>> that if the user sets the OldSize we should probably adjust the heap size >>> to accommodate that OldSize instead of complaining that the heap is too >>> small. We determine the heap size first and the generation sizes later on >>> while initializing the VM. To be able to fit the generations if the user >>> specifies sizes on the command line we need to look at the generation size >>> flags a little already when setting up the heap size. >>> >>> Thanks, >>> /Jesper >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Tue Jan 15 13:41:14 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Tue, 15 Jan 2013 14:41:14 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> Message-ID: <50F55C7A.8070508@oracle.com> On 2013-01-15 14:32, Vitaly Davidovich wrote: > > Hi Jesper, > > Is NewRatio guaranteed to be non-zero when used inside recommended_heap_size? > As far as I can see, yes. It defaults to two and is never set to zero. /Jesper > Thanks > > Sent from my phone > > On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" > wrote: > > Jon, > > Thank you for looking at this! I share your concerns and I have moved > the knowledge about policies to CollectorPolicy. set_heap_size() now > simply asks the collector policy if it has any recommendations regarding > the heap size. > > Ideally, since the code knows about young and old generations, I guess > the new function "recommended_heap_size()" should be placed in > GenCollectorPolicy, but then the code would have to be duplicated for G1 > as well. However, CollectorPolicy already know about OldSize and NewSize > so I think it is OK to put it there. 
> > Eventually I think that we should reduce the abstraction level in the > generation policies and merge CollectorPolicy, GenCollectorPolicy and > maybe even TwoGenerationCollectorPolicy and if possible > G1CollectorPolicy, so I don't worry too much about having knowledge > about the two generations in CollectorPolicy. > > > A new webrev is available here: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ > > > Thanks, > /Jesper > > > > On 2013-01-14 19:00, Jon Masamitsu wrote: > > Jesper, > > I'm a bit concerned that set_heap_size() now knows about how > the CollectorPolicy uses OldSize and NewSize. In the distant > past set_heap_size() did not know what kind of collector was > going to be used and probably avoided looking at those > parameters for that reason. Today we know that a generational > collector is to follow but maybe you could hide that knowledge > in CollectorPolicy somewhere and have set_heap_size() call into > CollectorPolicy to use that information? > > Jon > > > On 01/14/13 09:10, Jesper Wilhelmsson wrote: > > Hi, > > I would like a couple of reviews of a small fix for JDK-6348447 > - Specifying -XX:OldSize crashes 64-bit VMs > > Webrev: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ > > > Summary: > When starting HotSpot with an OldSize larger than the default > heap size one will run into a couple of problems. Basically what > happens is that the OldSize is ignored because it is > incompatible with the heap size. A debug build will assert since > a calculation on the way results in a negative number, but since > it is a size_t an if(x<0) won't trigger and the assert catches > it later on as incompatible flags. > > Changes: > I have made two changes to fix this. > > The first is to change the calculation in > TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it won't > result in a negative number in the if statement. This way we > will catch the case where the OldSize is larger than the heap > size and adjust the OldSize instead of the young size. There are > also some cosmetic changes here. For instance the argument > min_gen0_size is actually used for the old generation size which > was a bit confusing initially. I renamed it to min_gen1_size > (which it already was called in the header file). > > The second change is in Arguments::set_heap_size. My reasoning > here is that if the user sets the OldSize we should probably > adjust the heap size to accommodate that OldSize instead of > complaining that the heap is too small. We determine the heap > size first and the generation sizes later on while initializing > the VM. To be able to fit the generations if the user specifies > sizes on the command line we need to look at the generation size > flags a little already when setting up the heap size. > > Thanks, > /Jesper > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available URL: From vitalyd at gmail.com Tue Jan 15 13:55:56 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 15 Jan 2013 08:55:56 -0500 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <50F424A0.6080907@oracle.com> References: <50F424A0.6080907@oracle.com> Message-ID: Hi Jon, Does it make sense to validate that the new flags are consistent (I.e. max >= min)? 
That is, if user changes one or both such that max < min, should VM report an error and not start? Thanks Sent from my phone On Jan 14, 2013 10:31 AM, "Jon Masamitsu" wrote: > 8005452: Create new flags for Metaspace resizing policy > > Previously the calculation of the metadata capacity at which > to do a GC (high water mark, HWM) to recover > unloaded classes used the MinHeapFreeRatio > and MaxHeapFreeRatio to decide on the next HWM. That > generally left an excessive amount of unused capacity for > metadata. This change adds specific flags for metadata > capacity with defaults more conservative in terms of > unused capacity. > > Added an additional check for doing a GC before expanding > the metadata capacity. Required adding a new parameter to > get_new_chunk(). > > Added some additional diagnostic prints. > > http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Tue Jan 15 17:34:24 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 15 Jan 2013 09:34:24 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: References: <50F424A0.6080907@oracle.com> Message-ID: <50F59320.6060700@oracle.com> On 01/15/13 05:55, Vitaly Davidovich wrote: > > Hi Jon, > > Does it make sense to validate that the new flags are consistent (I.e. > max >= min)? That is, if user changes one or both such that max < min, > should VM report an error and not start? > Yes it does make sense. I'll add the checks and publish a new webrev. Thanks. Jon > > Thanks > > Sent from my phone > > On Jan 14, 2013 10:31 AM, "Jon Masamitsu" > wrote: > > 8005452: Create new flags for Metaspace resizing policy > > Previously the calculation of the metadata capacity at which > to do a GC (high water mark, HWM) to recover > unloaded classes used the MinHeapFreeRatio > and MaxHeapFreeRatio to decide on the next HWM. That > generally left an excessive amount of unused capacity for > metadata. This change adds specific flags for metadata > capacity with defaults more conservative in terms of > unused capacity. > > Added an additional check for doing a GC before expanding > the metadata capacity. Required adding a new parameter to > get_new_chunk(). > > Added some additional diagnostic prints. > > http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ > > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Tue Jan 15 17:40:18 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 15 Jan 2013 09:40:18 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> Message-ID: <50F59482.6060900@oracle.com> Hi Charlie Thanks for looking over the changes. Replies inline.... On 1/11/2013 11:32 AM, Charlie Hunt wrote: > Hi John, > > Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. I don't have a problem with this. By applying it heaps > 4GB , I was just being conservative. > I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. 
WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. > > So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? Again I don't have a problem with applying the new value to all heap sizes but I am a little concerned about the implications. The benefit is definitely less risk of evacuation failures but the it could also * increase the number of young GCs: ** increasing the GC overhead and increasing the heap slightly more aggressively ** lowering throughput * slightly increase the amount that gets promoted ** triggering marking cycles earlier and more often (increased SATB barrier overhead) ** more cards to be refined (we only refine cards in old regions) increasing the write barrier costs and the RS updating phase of the pauses, ** increases the importance of "taming the mixed GCs". From Kirk's email it sounds like this is a trade off people are prepared to live with. Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. JohnC From john.cuthbertson at oracle.com Tue Jan 15 17:42:39 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 15 Jan 2013 09:42:39 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> Message-ID: <50F5950F.8040206@oracle.com> Hi Kirk, I know you haven't responded to me directly but I did your email with interest and cited it in my reply to Charlie Hunt. On 1/12/2013 4:39 AM, Kirk Pepperdine wrote: > Hi Charlie, > > In this case I would have to say that having more frequent GCs that succeed is much better than evacuation failures. Also having different values for different heap sizes is really confusing. Is it really necessary to have different percentages for different heap sizes and is so is there a known gradient for correlating the size vs percent? > Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. JohnC From Peter.B.Kessler at Oracle.COM Tue Jan 15 17:43:27 2013 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Tue, 15 Jan 2013 09:43:27 -0800 Subject: Request for review (S): 8005972: ParNew should not update the tenuring threshold when promotion failed has occurred In-Reply-To: <50F40465.50400@oracle.com> References: <50EE8A18.5070004@oracle.com> <50EFF12E.20005@oracle.com> <50F00C22.2060201@oracle.com> <50F081DB.4040904@oracle.com> <50F08FC7.3070701@Oracle.COM> <50F40465.50400@oracle.com> Message-ID: <50F5953F.6010202@Oracle.COM> Good that the full GC's do the -XX:+AlwaysPreTouch thing. Still, something to remember for GC performance measurements in general. ... peter Bengt Rutisson wrote: > > Hi Peter, > > Thanks for looking at this! > > On 1/11/13 11:18 PM, Peter B. Kessler wrote: >> I don't see -XX:+AlwaysPreTouch in your command line. (Mostly because >> I'm not sure I see the command line: for example, I don't see your 1GB >> heap setting.) 
>> >> When you are watching GC performance before the first full collection, >> you have to remember those faults for the OS to populate the old >> generation. If you do that during promotions, you do it one page at a >> time. -XX:+AlwaysPreTouch touches all of the committed old generation >> during start-up, which turns out to be much faster. (And doesn't bill >> the time to GC. :-) >> >> Maybe those forced System.gc() calls at the beginning touch all of the >> old generation? Try a runs with and without -XX:+AlwaysPreTouch and >> report the results, because I'm curious. > > AlwaysPreTouch should not make a difference since SPECjbb does two > System.gc()s in the begin. But I did do two runs with > -XX:+AlwaysPreTouch just in case. The resulting log files are attached. > As you can see the results are the same. > > defnew: 56964 > parnew: 59224 > > Here is the command line for the runs: > > java -server -XX:+AlwaysPreTouch -XX:+PrintGCDetails > -XX:+PrintGCTimeStamps -XX:-PrintGCCause -cp ./jbb.jar:./check.jar > -Xms1g -Xmx1g spec.jbb.JBBmain -propfile SPECjbb.props > > I add "-XX:+UseSerialGC" for the defnew runs and "-XX:+UseParNewGC > -XX:ParallelGCThreads=1" for the parnew runs. > >> >> >> I also see that the young generation spaces are different between the >> runs. Looking at the heap shape printed at th end of the GC logs >> >> def new generation total 314560K, used 232441K >> [0x00000000c0000000, 0x00000000d5550000, 0x00000000d5550000) >> eden space 279616K, 76% used [0x00000000c0000000, >> 0x00000000cd0b4d38, 0x00000000d1110000) >> from space 34944K, 53% used [0x00000000d1110000, >> 0x00000000d2359a18, 0x00000000d3330000) >> to space 34944K, 0% used [0x00000000d3330000, >> 0x00000000d3330000, 0x00000000d5550000) >> >> par new generation total 314560K, used 74831K >> [0x00000000c0000000, 0x00000000d5550000, 0x00000000d5550000) >> eden space 279616K, 18% used [0x00000000c0000000, >> 0x00000000c33154f8, 0x00000000d1110000) >> from space 34944K, 64% used [0x00000000d1110000, >> 0x00000000d270e760, 0x00000000d3330000) >> to space 34944K, 0% used [0x00000000d3330000, >> 0x00000000d3330000, 0x00000000d5550000) >> >> PSYoungGen total 329728K, used 93207K [0x00000000eaab0000, >> 0x0000000100000000, 0x0000000100000000) >> eden space 309952K, 23% used >> [0x00000000eaab0000,0x00000000ef30dd08,0x00000000fd960000) >> from space 19776K, 96% used >> [0x00000000fd960000,0x00000000fec08000,0x00000000fecb0000) >> to space 19712K, 0% used >> [0x00000000fecc0000,0x00000000fecc0000,0x0000000100000000) >> >> The PSYoung eden is 10% larger than the others (because the survivors >> are smaller?). > > Right. This is because the PS runs have UseAdaptiveSizePolicy enabled. I > mostly included the PS runs for comparison. My main interest in in the > difference between DefNew and ParNew. They have the same size eden and > survivors. >> The sizes and occupancy of the survivors is different between DefNew >> and ParNew if you look in the logs. > Yes, this is what triggered me to go and look at the code for how we > update the tenuring threshold. It looks like we use the same ergonomics > except for the difference that this review request is addressing. But > this difference is not the cause of the performance difference since I > don't get any promotion failures. > > Thanks again for looking at this! > Bengt > >> >> ... 
peter >> >> Bengt Rutisson wrote: >>> >>> >>> Hi Ramki, >>> >>> On 1/11/13 6:50 PM, Srinivas Ramakrishna wrote: >>>> Hi Bengt -- >>>> >>>> Try computing the GC overhead by normalizing wrt the work done (for >>>> which the net allocation volume might be a good proxy). As you >>>> state, the performance numbers will then likely make sense. Of >>>> course, they still won't explain why ParNew does better. As Vitaly >>>> conjectures, the difference is likely in better object co-location >>>> with ParNew's slightly more DFS-like evacuation compared with >>>> DefNew's considerably more BFS-like evacuation because of the >>>> latter's use of a pure Cheney scan compared with the use of (a) >>>> marking stack(s) in the former, as far as i can remember the code. >>>> One way to tell if that accounts for the difference is to measure >>>> the cache-miss rates in the two cases (and may be use a good tool >>>> like Solaris perf analyzer to show you where the misses are coming >>>> from as well). >>> >>> Thanks for bringing the DFS/BFS difference up. This is exactly the >>> kind of difference I was looking for. My guess is that this is what >>> causes the difference in JBB score. I'll see if I can investigate >>> this further. >>>> Also curious if you can share the two sets of GC logs, by chance? >>>> (specJBB is a for-fee benchmark so is not freely available to the >>>> individual developer.) >>> >>> I have a fairly large set of logs, but the runs are very stable so >>> I'm just attaching logs for one run for each collector. For >>> comparison I have also been running ParallelScavenge with one thread. >>> I'm using separate gc logs and jbb logs. The log files called >>> ".result" are the jbb output. The other logs are the gc logs. >>> >>> I'm running with a heap size of 1GB to avoid full GCs. All runs have >>> the two System.gc() induced full GCs but no other. ParallelScavenge >>> is performing even better than ParNew, but I am mostly interested in >>> the difference between ParNew and DefNew. >>> >>> A quick summary of the data in the logs: >>> >>> Score #GCs AverageGCTime >>> DefNew: 57903 2083 0.044053195391262644 >>> ParNew: 61363 2213 0.05931835969272489 >>> PS: 69697 2213 0.06117092860370538 >>> >>> ParNew has a better score even though it does more GCs and they take >>> longer. >>> >>> If you have any insights from looking at the logs I would be very >>> happy to hear about it. >>> >>> Thanks, >>> Bengt >>>> >>>> thanks. >>>> -- ramki >>>> >>>> On Fri, Jan 11, 2013 at 4:57 AM, Bengt Rutisson >>>> > wrote: >>>> >>>> >>>> Hi Vitaly, >>>> >>>> >>>> On 1/11/13 1:45 PM, Vitaly Davidovich wrote: >>>>> >>>>> Hi Bengt, >>>>> >>>>> Regarding the benchmark score, are you saying ParNew has longer >>>>> cumulative GC time or just the average is higher? If it's just >>>>> average, maybe the total # of them (and cumulative time) is >>>>> less. I don't know the characteristics of this particular >>>>> specjbb benchmark, but perhaps having fewer total GCs is better >>>>> because of the overhead of getting all threads to a safe point, >>>>> going go the OS to suspend them, and then restarting them. After >>>>> they're restarted, the CPU cache may be cold for it because the >>>>> GC thread polluted it. Or I'm entirely wrong in my speculation >>>>> ... :). >>>>> >>>> >>>> You have a good point about the number of GCs. The problem in my >>>> runs is that ParNew does more GCs than DefNew. So there are both >>>> more of them and their average time is higher, but the score is >>>> still better. 
That ParNew does more GCs is not that strange. It >>>> has a higher score, which means that it had higher throughput and >>>> had time to create more objects. So, that is kind of expected. But >>>> I don't understand how it can have higher throughput when the GCs >>>> take longer. My current guess is that it does something >>>> differently with how objects are copied in a way that is >>>> beneficial for the execution time between GCs. >>>> >>>> It also seems like ParNew keeps more objects alive for each GC. >>>> That is either the reason why it does more and more frequent GCs >>>> than DefNew, or it is an effect of the fact that more objects are >>>> created due to the higher throughput. This is the reason I started >>>> looking at the tenuring threshold. >>>> >>>> Bengt >>>> >>>> >>>>> Thanks >>>>> >>>>> Sent from my phone >>>>> >>>>> On Jan 11, 2013 6:02 AM, "Bengt Rutisson" >>>>> > >>>>> wrote: >>>>> >>>>> >>>>> Hi Ramki, >>>>> >>>>> Thanks for looking at this! >>>>> >>>>> On 1/10/13 9:28 PM, Srinivas Ramakrishna wrote: >>>>>> Hi Bengt -- >>>>>> >>>>>> The change looks reasonable, but I have a comment and a >>>>>> follow-up question. >>>>>> >>>>>> Not your change, but I'd elide the "half the real survivor >>>>>> size" since it's really a configurable parameter based on >>>>>> TargetSurvivorRatio with default half. >>>>>> I'd leave the comment as "set the new tenuring threshold and >>>>>> desired survivor size". >>>>> >>>>> I'm fine with removing this from the comment, but I thought >>>>> the "half the real survivor size" aimed at the fact that we >>>>> pass only the "to" capacity and not the "from" capacity in to >>>>> compute_tenuring_threshold(). With that interpretation I >>>>> think the comment is correct. >>>>> >>>>> Would you like me to remove it anyway? Either way is fine >>>>> with me. >>>>> >>>>>> I'm curious though, as to what performance data prompted >>>>>> this change, >>>>> Good point. This change was preceded by an internal >>>>> discussion in the GC team, so I should probably have >>>>> explained the background more in my review request to the >>>>> open. >>>>> >>>>> I was comparing the ParNew and DefNew implementation since I >>>>> am seeing some strange differences in some SPECjbb2005 >>>>> results. I am running ParNew with a single thread and get >>>>> much better score than with DefNew. But I also get higher >>>>> average GC times. So, I was trying to figure out what DefNew >>>>> and ParNew does differently. >>>>> >>>>> When I was looking at DefNewGeneration::collect() and >>>>> ParNewGeneration::collect() I saw that they contain a whole >>>>> lot of code duplication. It would be tempting to try to >>>>> extract the common code out into DefNewGeneration since it is >>>>> the super class. But there are some minor differences. One of >>>>> them was this issue with how they handle the tenuring >>>>> threshold. >>>>> >>>>> We tried to figure out if there is a reason for ParNew and >>>>> DefNew to behave different in this regard. We could not come >>>>> up with any good reason for that. So, we needed to figure out >>>>> if we should change ParNew or DefNew to make them consistent. >>>>> The decision to change ParNew was based on two things. First, >>>>> it seems wrong to use the data from a collection that got >>>>> promotion failure. This collection will not have allowed the >>>>> tenuring threshold to fulfill its purpose. Second, >>>>> ParallelScavenge works the same way as DefNew. 
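For reference, the shared ergonomics both collectors feed their age table into boils down to something like this paraphrase of ageTable::compute_tenuring_threshold (parameter names and the table size are illustrative, not copied from the source; note that only the "to" capacity is passed in, which is where the "half the real survivor size" wording earlier in the thread comes from):

    #include <cstddef>

    static const unsigned table_size = 16;   // one slot per object age

    // sizes[a] = bytes of surviving objects that have age a after the scavenge.
    unsigned compute_tenuring_threshold(const size_t sizes[table_size],
                                        size_t survivor_capacity,      // "to" space
                                        size_t target_survivor_ratio,  // percent
                                        unsigned max_tenuring_threshold) {
      size_t desired = survivor_capacity * target_survivor_ratio / 100;
      size_t total   = 0;
      unsigned age   = 1;
      while (age < table_size) {
        total += sizes[age];
        if (total > desired) break;  // survivors up to this age already fill the target
        age++;
      }
      return age < max_tenuring_threshold ? age : max_tenuring_threshold;
    }

The change under review only affects which collections get to feed data into this calculation, not the calculation itself.
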
>>>>> >>>>> BTW, the difference between DefNew and ParNew seems to have >>>>> been there from the start. So, there is no bug or changeset >>>>> in mercurial or TeamWare to explain why the difference was >>>>> introduced. >>>>> >>>>> (Just to be clear, this difference was not the cause of my >>>>> performance issue. I still don't have a good explanation for >>>>> how ParNew can have longer GC times but better SPECjbb score.) >>>>> >>>>>> and whether it might make sense, upon a promotion failure to >>>>>> do something about the tenuring threshold for the next >>>>>> scavenge (i.e. for example make the tenuring threshold half >>>>>> of its current value as a reaction to the fact that >>>>>> promotion failed). Is it currently left at its previous >>>>>> value or is it asjusted back to the default max value (which >>>>>> latter may be the wrong thing to do) or something else? >>>>> >>>>> As far as I can tell the tenuring threshold is left untouched >>>>> if we get a promotion failure. It is probably a good idea to >>>>> update it in some way. But I would prefer to handle that as a >>>>> separate bug fix. >>>>> >>>>> This change is mostly a small cleanup to make >>>>> DefNewGeneration::collect() and ParNewGeneration::collect() >>>>> be more consistent. We've done the thinking so, it's good to >>>>> make the change in preparation for the next person that comes >>>>> a long and has a few cycles over and would like to merge the >>>>> two collect() methods in some way. >>>>> >>>>> Thanks again for looking at this! >>>>> Bengt >>>>> >>>>>> >>>>>> -- ramki >>>>>> >>>>>> On Thu, Jan 10, 2013 at 1:30 AM, Bengt Rutisson >>>>>> >>>>> > wrote: >>>>>> >>>>>> >>>>>> Hi everyone, >>>>>> >>>>>> Could I have a couple of reviews for this small change >>>>>> to make DefNew and ParNew be more consistent in the way >>>>>> they treat the tenuring threshold: >>>>>> >>>>>> http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/ >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Bengt >>>>>> >>>>>> >>>>> >>>> >>>> >>> > From john.cuthbertson at oracle.com Tue Jan 15 17:59:44 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 15 Jan 2013 09:59:44 -0800 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: <50F47361.7020306@oracle.com> References: <50F47361.7020306@oracle.com> Message-ID: <50F59910.2010000@oracle.com> Hi Bengt, Changes look good to me. Minor nits: Copyrights need updating. Use UINT32_FORMAT instead of %d in the error message. Check the indentation of the for loop in G1GCPhaseTimes::note_gc_end(). JohnC On 1/14/2013 1:06 PM, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this small change? > http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ > > Thanks to John Cuthbertson for finding this bug and providing > excellent data to track down the issue. > > From the bug report: > > In non-product builds the WorkerDataArrays in G1 are initialized to -1 > in WorkerDataArray::reset() when a GC starts. At the end of a GC > WorkerDataArray::verify() verifies that all entries in a > WorkerDataArray has been set. Currently it does this by asserting that > the entries are >= 0. This is fine in theory since the entries should > contain counts or times that are all positive. > > The problem is that some WorkerDataArrays are of type double. And some > of those are set up through calculations using doubles. 
If those > calculations result in a value close to 0 we could end up with a value > slightly less than 0 since double calculations don't have full precision. > > All we really want to verify is that all the entries were set. So, it > should be enough to verify that entries do not contain the value set > by the reset() method. > > Bengt -------------- next part -------------- An HTML attachment was scrubbed... URL: From chunt at salesforce.com Tue Jan 15 18:01:31 2013 From: chunt at salesforce.com (Charlie Hunt) Date: Tue, 15 Jan 2013 10:01:31 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50F59482.6060900@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <50F59482.6060900@oracle.com> Message-ID: <7D0DFCF4-4F58-4902-BDC0-E1868BB5D786@salesforce.com> Hi John, Completely agree with the excellent points you mention below (thanks for being thorough and listing them!). Given G1 is (somewhat) positioned as a collector to use when improved latency is an important criteria, I think the tradeoffs are something people are willing to live with too. Fwiw, you have my "ok" to go ahead with your suggestion to apply the new young gen bounds to all heap sizes. hths, charlie ... On Jan 15, 2013, at 11:40 AM, John Cuthbertson wrote: > Hi Charlie > > Thanks for looking over the changes. Replies inline.... > > > On 1/11/2013 11:32 AM, Charlie Hunt wrote: >> Hi John, >> >> Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. > > I don't have a problem with this. By applying it heaps > 4GB , I was > just being conservative. > >> I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. >> >> So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? > > Again I don't have a problem with applying the new value to all heap > sizes but I am a little concerned about the implications. The benefit is > definitely less risk of evacuation failures but the it could also > > * increase the number of young GCs: > ** increasing the GC overhead and increasing the heap slightly more > aggressively > ** lowering throughput > * slightly increase the amount that gets promoted > ** triggering marking cycles earlier and more often (increased SATB > barrier overhead) > ** more cards to be refined (we only refine cards in old regions) > increasing the write barrier costs and the RS updating phase of the pauses, > ** increases the importance of "taming the mixed GCs". > > From Kirk's email it sounds like this is a trade off people are > prepared to live with. > > Unless I hear any objections, I'll apply the new young gen bounds to all > heap sizes. 
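As a rough illustration of what the percentage bounds translate to (assumed arithmetic only; the real G1 young list sizing also folds in the pause time heuristics):

    #include <cstddef>

    // Lower and upper clamps on the young gen derived from the flags discussed
    // above (G1NewSizePercent / G1MaxNewSizePercent).
    size_t young_min_bytes(size_t heap_bytes, unsigned new_size_percent) {
      return heap_bytes / 100 * new_size_percent;
    }
    size_t young_max_bytes(size_t heap_bytes, unsigned max_new_size_percent) {
      return heap_bytes / 100 * max_new_size_percent;
    }
    // e.g. on a 200 GB heap, G1MaxNewSizePercent=60 still allows a 120 GB
    // young gen; the lower cap mainly limits how far it can grow before
    // evacuation failure becomes a risk.
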
> > JohnC From monica.beckwith at oracle.com Tue Jan 15 18:45:06 2013 From: monica.beckwith at oracle.com (Monica Beckwith) Date: Tue, 15 Jan 2013 12:45:06 -0600 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <7D0DFCF4-4F58-4902-BDC0-E1868BB5D786@salesforce.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <50F59482.6060900@oracle.com> <7D0DFCF4-4F58-4902-BDC0-E1868BB5D786@salesforce.com> Message-ID: <50F5A3B2.9010505@oracle.com> Thanks, Charlie - If I may add two more things to John's points below and also expand a bit on the "latency" comment - Even though we talk about latency, in reality, I have seen many people with bigger heap (around 200Gs) requirements really concerned about ART (Average Response Time)/ Throughput. Also, we should remember that if the marking cycle is triggered earlier and more often, then we may end up under-utilizing the bigger heaps and will definitely have to spend time "taming the mixedGCs" :) just my 2 cents. -Monica On 1/15/2013 12:01 PM, Charlie Hunt wrote: > Hi John, > > Completely agree with the excellent points you mention below (thanks for being thorough and listing them!). > > Given G1 is (somewhat) positioned as a collector to use when improved latency is an important criteria, I think the tradeoffs are something people are willing to live with too. > > Fwiw, you have my "ok" to go ahead with your suggestion to apply the new young gen bounds to all heap sizes. > > hths, > > charlie ... > > On Jan 15, 2013, at 11:40 AM, John Cuthbertson wrote: > >> Hi Charlie >> >> Thanks for looking over the changes. Replies inline.... >> >> >> On 1/11/2013 11:32 AM, Charlie Hunt wrote: >>> Hi John, >>> >>> Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. >> I don't have a problem with this. By applying it heaps> 4GB , I was >> just being conservative. >> >>> I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. >>> >>> So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? >> Again I don't have a problem with applying the new value to all heap >> sizes but I am a little concerned about the implications. The benefit is >> definitely less risk of evacuation failures but the it could also >> >> * increase the number of young GCs: >> ** increasing the GC overhead and increasing the heap slightly more >> aggressively >> ** lowering throughput >> * slightly increase the amount that gets promoted >> ** triggering marking cycles earlier and more often (increased SATB >> barrier overhead) >> ** more cards to be refined (we only refine cards in old regions) >> increasing the write barrier costs and the RS updating phase of the pauses, >> ** increases the importance of "taming the mixed GCs". >> >> From Kirk's email it sounds like this is a trade off people are >> prepared to live with. >> >> Unless I hear any objections, I'll apply the new young gen bounds to all >> heap sizes. 
>> >> JohnC -- Oracle Monica Beckwith | Java Performance Engineer VOIP: +1 512 401 1274 Texas Green Oracle Oracle is committed to developing practices and products that help protect the environment -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: oracle_sig_logo.gif Type: image/gif Size: 658 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: green-for-email-sig_0.gif Type: image/gif Size: 356 bytes Desc: not available URL: From chunt at salesforce.com Tue Jan 15 19:20:11 2013 From: chunt at salesforce.com (Charlie Hunt) Date: Tue, 15 Jan 2013 11:20:11 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50F5A3B2.9010505@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <50F59482.6060900@oracle.com> <7D0DFCF4-4F58-4902-BDC0-E1868BB5D786@salesforce.com> <50F5A3B2.9010505@oracle.com> Message-ID: Avg Response Time ... (sigh) --- one of our favorite subjects. ;-) You're right, if the marking cycles start earlier than ideally desired, you end up under-utilizing heap space and potentially having to tame mixed GCs. But, G1 has a tunable we can set to start the marking cycle later. The challenge there is setting the initiating heap occupancy percent too high and losing the race. But, by setting it higher (and avoiding losing the race) with larger heaps hopefully translates to more "good candidate" old gen regions to collect and also hopefully makes the exercise of taming mixed GCs a little easier too. Thanks for sharing your thoughts. charlie ... On Jan 15, 2013, at 12:45 PM, Monica Beckwith wrote: Thanks, Charlie - If I may add two more things to John's points below and also expand a bit on the "latency" comment - Even though we talk about latency, in reality, I have seen many people with bigger heap (around 200Gs) requirements really concerned about ART (Average Response Time)/ Throughput. Also, we should remember that if the marking cycle is triggered earlier and more often, then we may end up under-utilizing the bigger heaps and will definitely have to spend time "taming the mixedGCs" :) just my 2 cents. -Monica On 1/15/2013 12:01 PM, Charlie Hunt wrote: Hi John, Completely agree with the excellent points you mention below (thanks for being thorough and listing them!). Given G1 is (somewhat) positioned as a collector to use when improved latency is an important criteria, I think the tradeoffs are something people are willing to live with too. Fwiw, you have my "ok" to go ahead with your suggestion to apply the new young gen bounds to all heap sizes. hths, charlie ... On Jan 15, 2013, at 11:40 AM, John Cuthbertson wrote: Hi Charlie Thanks for looking over the changes. Replies inline.... On 1/11/2013 11:32 AM, Charlie Hunt wrote: Hi John, Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. I don't have a problem with this. By applying it heaps > 4GB , I was just being conservative. I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. 
But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? Again I don't have a problem with applying the new value to all heap sizes but I am a little concerned about the implications. The benefit is definitely less risk of evacuation failures but the it could also * increase the number of young GCs: ** increasing the GC overhead and increasing the heap slightly more aggressively ** lowering throughput * slightly increase the amount that gets promoted ** triggering marking cycles earlier and more often (increased SATB barrier overhead) ** more cards to be refined (we only refine cards in old regions) increasing the write barrier costs and the RS updating phase of the pauses, ** increases the importance of "taming the mixed GCs". >From Kirk's email it sounds like this is a trade off people are prepared to live with. Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. JohnC -- Monica Beckwith | Java Performance Engineer VOIP: +1 512 401 1274 Texas Oracle is committed to developing practices and products that help protect the environment -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Tue Jan 15 20:10:39 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 15 Jan 2013 12:10:39 -0800 Subject: RFR(S/M): 7132678: G1: verify that the marking bitmaps have no marks for objects over TAMS Message-ID: <50F5B7BF.5060800@oracle.com> Hi Everyone, Can I have a couple of volunteers review the changes for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7132678/webrev.0/ Most of the changes come from a patch that Tony gave me before he left and I had to tweak them slightly to remove a spurious failure. The changes verify that the heap regions don't have any marks between [TAMS, top) at strategic places: start and end of each GC, start and end of remark and cleanup, and when allocating a region. Tony deserves the bulk of the credit so, if possible and there are no objections, I intend to list him as author of the change and include myself as a reviewer. Testing: GC test suite with the both the new flags (separately and together) and a low IHOP value. jprt with the new flags (+IgnoreUnrecognizedVMOptions so that product test runs did not fail). Thanks, JohnC From john.cuthbertson at oracle.com Tue Jan 15 23:31:10 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 15 Jan 2013 15:31:10 -0800 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap Message-ID: <50F5E6BE.9040901@oracle.com> Hi Everyone, Can I have a couple of people look over the changes for this CR - the webrev can be found at: http://cr.openjdk.java.net/~johnc/7176479/webrev.0/ Background: The issue here was that we were encoding the card index into the card counts table entries along with the GC number so that we could determine if the count associated with was valid. We had a check to ensure that the maximum card index could be encoded in an int. With such large heap size - the number of cards could not be encoded and so the check failed. The previous mechanism was an attempt to solve the problem of one thread arriving late to the actual GC work. 
The thread in question was being held up zeroing the card counts table at the start of the GC. The card counts table is used to determine which cards are being refined frequently. Once a card has been refined frequently enough, further refinements of that card are delayed by placing the card into a fixed size evicting table - the hot card cache. The card would then be refined when it was evicted from the hot card cache or when the cache was drained during the next GC. To solve the problem of zeroing we added an epoch (GC number) to the entries in the counts table and, eliminate the increase in footprint, we made the counts table into a cache which would expand if needed. This approach had some negatives: we might have to refine two cards during a single refinement operation, hashing the card, and performing CAS operations increasing the overhead of concurrent refinement. Also expanding the counts table during a GC incurred a penalty. This approach also limited the heap size to just under 1TB - which the systems team ran into. The new approach effectively undoes the previous mechanism and re-simplifies the card counts table. Summary of Changes: The hot card cache and card counts table have been moved from the concurrent refinement code into their own files. The hot card cache can now exist independently of whether the counts table exists. In this case refining a card once adds it to the hot card cache, i.e. all cards are treated as 'hot'. The interface to the hot card cache has been simplified - a simple query and a simple drain routine. This simplifies the calling code in g1RemSet.cpp and results in up to only a single card being refined for every call to "refine_card" instead of possibly two. This should reduce the overhead of concurrent refinement. The number of cards that the hot card cache can hold before cards start getting evicted is controlled by the flag G1ConcRSLogCacheSize, which is now product flag. The default value is 10 giving a hot card cache that can hold 1K cards. The card counts table has been greatly simplified. It is a simple array of counts how many times a card has been refined. The space for the table is now allocated from virtual memory instead of C heap. The space for the table is committed when the heap is initially committed and the spans the committed size of the heap. When the committed size of the heap is expanded, the counts table is also expanded to cover the newly expanded heap. If we fail to commit the memory for the counts table, cards that map to the uncommitted space will be treated as cold, i.e. they will be refined immediately. Having a simpler counts table also should reduce the overhead of concurrent refinement (there is no need to hash the card index and there are no CAS operations) Having a simpler interface will allow us to change the underlying data structure to an alternative that's perhaps more sparse in the future. During an incremental GC we no longer zero the entire counts table. We now zero the cards spanned by a region when the region is freed (i.e. when we free the collection set at the end of a GC and when we free regions at the end of a cleanup). If a card was "hot" before a GC then we will consider it hot after the GC and the first refinement after the GC will insert the card into the hot card cache. Furthermore, since we don't refine cards in young regions, we only need to clear the counts associated with cards spanned by non-young regions. 
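A simplified sketch of the counts table just described (field and method names here are assumed, not taken from the webrev; the real G1CardCounts keeps its storage in reserved space committed alongside the heap):

    #include <cstddef>
    #include <cstring>

    class CardCountsSketch {
      unsigned char* _counts;       // one small saturating counter per card
      size_t         _ncards;
      unsigned       _hot_threshold;
    public:
      CardCountsSketch(size_t ncards, unsigned hot_threshold)
        : _counts(new unsigned char[ncards]), _ncards(ncards),
          _hot_threshold(hot_threshold) {
        std::memset(_counts, 0, _ncards);
      }
      ~CardCountsSketch() { delete[] _counts; }

      // Bump the count for a refined card and report whether it is now "hot",
      // i.e. should be deferred via the hot card cache instead of refined again.
      bool count_and_is_hot(size_t card_index) {
        if (card_index >= _ncards) return false;  // unmapped cards stay "cold"
        if (_counts[card_index] < 255) _counts[card_index]++;
        return _counts[card_index] > _hot_threshold;
      }

      // Called when a region is freed: only the cards that region spans are
      // cleared, instead of zeroing the whole table at the start of every GC.
      void clear_range(size_t from_card, size_t to_card) {
        std::memset(_counts + from_card, 0, to_card - from_card);
      }
    };

There is no hashing and no CAS here, which is where the reduced refinement overhead mentioned above comes from; the trade-off is that the table always spans the committed heap.
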
During a full GC we still discard the entries in the hot card cache and zero the counts for all the cards in the heap. Testing: GC Test suite with MaxTenuringThreshold=0 (to increase the amount of refinement) and a low IHOP value (to force cleanups). SPECjbb2005 with a 1.5TB heap size and 256GB young size, MaxTenuringThreshold=0 and a low IHOP value (1%). The systems team are continuing to test with very large heaps. From vitalyd at gmail.com Wed Jan 16 00:57:07 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 15 Jan 2013 19:57:07 -0500 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: <50F5E6BE.9040901@oracle.com> References: <50F5E6BE.9040901@oracle.com> Message-ID: Hi John, Wow, that's a giant heap! :) I think G1ConcRSLogCacheSize needs to be validated to make sure it's <= 31; otherwise, I think you get undefined behavior on left shifting with it. I don't think you need _def_use_cache -- can be replaced with G1ConcRSLogCacheSize > 0? I'm sure this is due to my lack of G1 knowledge, but the concurrency control inside g1HotCardCache is a bit unclear. There's a CAS to claim the region of cards, there's a HotCache lock for inserting a card. However, reset_hot_cache() does a naked write of a few fields. Are there any visibility and ordering constraints that need to be enforced? Do some of the stores need an OrderAccess barrier of some sort, depending on what's required? Sorry if I'm just missing it ... I didn't finish looking at the rest yet so that's all I have for the moment. Thanks Sent from my phone On Jan 15, 2013 6:32 PM, "John Cuthbertson" wrote: > Hi Everyone, > > Can I have a couple of people look over the changes for this CR - the > webrev can be found at: http://cr.openjdk.java.net/~** > johnc/7176479/webrev.0/ > > Background: > The issue here was that we were encoding the card index into the card > counts table entries along with the GC number so that we could determine if > the count associated with was valid. We had a check to ensure that the > maximum card index could be encoded in an int. With such large heap size - > the number of cards could not be encoded and so the check failed. > > The previous mechanism was an attempt to solve the problem of one thread > arriving late to the actual GC work. The thread in question was being held > up zeroing the card counts table at the start of the GC. The card counts > table is used to determine which cards are being refined frequently. Once a > card has been refined frequently enough, further refinements of that card > are delayed by placing the card into a fixed size evicting table - the hot > card cache. The card would then be refined when it was evicted from the hot > card cache or when the cache was drained during the next GC. > > To solve the problem of zeroing we added an epoch (GC number) to the > entries in the counts table and, eliminate the increase in footprint, we > made the counts table into a cache which would expand if needed. This > approach had some negatives: we might have to refine two cards during a > single refinement operation, hashing the card, and performing CAS > operations increasing the overhead of concurrent refinement. Also expanding > the counts table during a GC incurred a penalty. > > This approach also limited the heap size to just under 1TB - which the > systems team ran into. > > The new approach effectively undoes the previous mechanism and > re-simplifies the card counts table. 
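The fixed size evicting table described in the quoted paragraph can be pictured like this (a single-threaded toy with assumed names, not the actual g1HotCardCache; the shift is also where the G1ConcRSLogCacheSize upper bound check asked about above would apply):

    #include <cstddef>
    #include <vector>

    typedef unsigned char card_entry;   // stand-in for a card table element

    class HotCardCacheSketch {
      std::vector<card_entry*> _cache;  // fixed size: 2^log_size entries
      size_t _next;                     // round-robin eviction cursor
    public:
      explicit HotCardCacheSketch(unsigned log_size)   // cf. G1ConcRSLogCacheSize
        : _cache((size_t)1 << log_size, (card_entry*)0), _next(0) {}

      // Absorb a hot card; if a slot has to be reused, hand the previous
      // occupant back to the caller, which refines it immediately.
      card_entry* insert(card_entry* card) {
        card_entry* evicted = _cache[_next];
        _cache[_next] = card;
        _next = (_next + 1) % _cache.size();
        return evicted;    // NULL means nothing was pushed out
      }
    };
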
> > Summary of Changes: > The hot card cache and card counts table have been moved from the > concurrent refinement code into their own files. > > The hot card cache can now exist independently of whether the counts table > exists. In this case refining a card once adds it to the hot card cache, > i.e. all cards are treated as 'hot'. > > The interface to the hot card cache has been simplified - a simple query > and a simple drain routine. This simplifies the calling code in > g1RemSet.cpp and results in up to only a single card being refined for > every call to "refine_card" instead of possibly two. This should reduce the > overhead of concurrent refinement. > > The number of cards that the hot card cache can hold before cards start > getting evicted is controlled by the flag G1ConcRSLogCacheSize, which is > now product flag. The default value is 10 giving a hot card cache that can > hold 1K cards. > > The card counts table has been greatly simplified. It is a simple array of > counts how many times a card has been refined. The space for the table is > now allocated from virtual memory instead of C heap. The space for the > table is committed when the heap is initially committed and the spans the > committed size of the heap. When the committed size of the heap is > expanded, the counts table is also expanded to cover the newly expanded > heap. If we fail to commit the memory for the counts table, cards that map > to the uncommitted space will be treated as cold, i.e. they will be refined > immediately. Having a simpler counts table also should reduce the overhead > of concurrent refinement (there is no need to hash the card index and there > are no CAS operations) Having a simpler interface will allow us to change > the underlying data structure to an alternative that's perhaps more sparse > in the future. > > During an incremental GC we no longer zero the entire counts table. We now > zero the cards spanned by a region when the region is freed (i.e. when we > free the collection set at the end of a GC and when we free regions at the > end of a cleanup). If a card was "hot" before a GC then we will consider > it hot after the GC and the first refinement after the GC will insert the > card into the hot card cache. Furthermore, since we don't refine cards in > young regions, we only need to clear the counts associated with cards > spanned by non-young regions. > > During a full GC we still discard the entries in the hot card cache and > zero the counts for all the cards in the heap. > > Testing: > GC Test suite with MaxTenuringThreshold=0 (to increase the amount of > refinement) and a low IHOP value (to force cleanups). > SPECjbb2005 with a 1.5TB heap size and 256GB young size, > MaxTenuringThreshold=0 and a low IHOP value (1%). The systems team are > continuing to test with very large heaps. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Wed Jan 16 04:49:40 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 15 Jan 2013 23:49:40 -0500 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: References: <50F5E6BE.9040901@oracle.com> Message-ID: A few more comments/suggestions: In g1CardCounts::ptr_2_card_num(), I'd assert that card_ptr >= _ct_bot. This is mostly to avoid a null card_ptr (or some other bogus value) causing the subtraction to go negative but then wrap around to a large size_t value that just happens to fit into the card range. 
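Concretely, the guard being suggested would be something like this (a sketch with assumed names, mirroring the description rather than the actual webrev):

    #include <cassert>
    #include <cstddef>

    typedef unsigned char card_entry;

    // Convert a card table pointer into an index into the counts table.
    size_t ptr_2_card_num(const card_entry* card_ptr, const card_entry* ct_bot) {
      assert(card_ptr >= ct_bot && "card_ptr is below the start of the card table");
      // Without the assert a bogus pointer below ct_bot makes the subtraction
      // wrap around to a huge size_t that may still look like a valid index.
      return (size_t)(card_ptr - ct_bot);
    }
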
Unlikely and maybe this is too paranoid, so up to you. Also in this class, it's a bit strange that G1CardCounts::is_hot() also increments the count. I know the comments in the header say that count is updated but the name of the method implies it's just a read. Maybe call add_card_count() and then add an is_hot(int) method and call that with the return value of add_card_count? G1CardCounts::clear_region -- there are some casts of const jbyte to jbyte at bottom of method. Perhaps if ptr_2_card_num() were changed to be taking const jbyte you wouldn't need the casts. I think marking as much things const as possible is good in general anyway ... In the various places where values are asserted to be in some range, it may be useful to add the valid range to the error message so that if it triggers you get a bit more context/diagnostic info. Thanks Sent from my phone -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Wed Jan 16 07:35:45 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 08:35:45 +0100 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <50F59482.6060900@oracle.com> <7D0DFCF4-4F58-4902-BDC0-E1868BB5D786@salesforce.com> <50F5A3B2.9010505@oracle.com> Message-ID: <50F65851.6050207@oracle.com> Hi all, I haven't commented in this email thread but I've been following the discussion with interest. Since I was the one who brought up the question around the 4GB limit, I'd just like to state that I agree with the decision to skip this limit. Thanks, Bengt On 1/15/13 8:20 PM, Charlie Hunt wrote: > Avg Response Time ... (sigh) --- one of our favorite subjects. ;-) > > You're right, if the marking cycles start earlier than ideally > desired, you end up under-utilizing heap space and potentially having > to tame mixed GCs. But, G1 has a tunable we can set to start the > marking cycle later. The challenge there is setting the initiating > heap occupancy percent too high and losing the race. But, by setting > it higher (and avoiding losing the race) with larger heaps hopefully > translates to more "good candidate" old gen regions to collect and > also hopefully makes the exercise of taming mixed GCs a little easier too. > > Thanks for sharing your thoughts. > > charlie ... > > On Jan 15, 2013, at 12:45 PM, Monica Beckwith wrote: > >> Thanks, Charlie - >> >> If I may add two more things to John's points below and also expand a >> bit on the "latency" comment - >> Even though we talk about latency, in reality, I have seen many >> people with bigger heap (around 200Gs) requirements really concerned >> about ART (Average Response Time)/ Throughput. >> Also, we should remember that if the marking cycle is triggered >> earlier and more often, then we may end up under-utilizing the bigger >> heaps and will definitely have to spend time "taming the mixedGCs" :) >> >> just my 2 cents. >> >> -Monica >> >> On 1/15/2013 12:01 PM, Charlie Hunt wrote: >>> Hi John, >>> >>> Completely agree with the excellent points you mention below (thanks for being thorough and listing them!). >>> >>> Given G1 is (somewhat) positioned as a collector to use when improved latency is an important criteria, I think the tradeoffs are something people are willing to live with too. 
>>> >>> Fwiw, you have my "ok" to go ahead with your suggestion to apply the new young gen bounds to all heap sizes. >>> >>> hths, >>> >>> charlie ... >>> >>> On Jan 15, 2013, at 11:40 AM, John Cuthbertson wrote: >>> >>>> Hi Charlie >>>> >>>> Thanks for looking over the changes. Replies inline.... >>>> >>>> >>>> On 1/11/2013 11:32 AM, Charlie Hunt wrote: >>>>> Hi John, >>>>> >>>>> Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. >>>> I don't have a problem with this. By applying it heaps > 4GB , I was >>>> just being conservative. >>>> >>>>> I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. >>>>> >>>>> So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? >>>> Again I don't have a problem with applying the new value to all heap >>>> sizes but I am a little concerned about the implications. The benefit is >>>> definitely less risk of evacuation failures but the it could also >>>> >>>> * increase the number of young GCs: >>>> ** increasing the GC overhead and increasing the heap slightly more >>>> aggressively >>>> ** lowering throughput >>>> * slightly increase the amount that gets promoted >>>> ** triggering marking cycles earlier and more often (increased SATB >>>> barrier overhead) >>>> ** more cards to be refined (we only refine cards in old regions) >>>> increasing the write barrier costs and the RS updating phase of the pauses, >>>> ** increases the importance of "taming the mixed GCs". >>>> >>>> >From Kirk's email it sounds like this is a trade off people are >>>> prepared to live with. >>>> >>>> Unless I hear any objections, I'll apply the new young gen bounds to all >>>> heap sizes. >>>> >>>> JohnC >> >> -- >> >> Monica Beckwith | Java Performance Engineer >> VOIP: +1 512 401 1274 >> Texas >> Oracle >> is committed to developing practices and products that help protect >> the environment > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kirk at kodewerk.com Wed Jan 16 08:09:16 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Wed, 16 Jan 2013 00:09:16 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50F5950F.8040206@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> <50F5950F.8040206@oracle.com> Message-ID: <9CA215C3-6216-409E-BCAF-185835FDC3EB@kodewerk.com> Hi John, You know, there might be a good reason to have different values for different heap sizes.. some thing that makes sense when you look at the implementation. If so, that might justify the need to do this. I just don't understand *why*? But maybe that's just me. I'm not responsible for the implementation, I just help people deal with what's on the table and so unless something seem really not right, like dropping incremental modes, I'll pass comment and then shutup to let you get on with it... 
;-) BTW, not to stir up any trouble but it would be nice to have a incremental mode for G1 for machines with large number of cores. Regards, Kirk On 2013-01-15, at 9:42 AM, John Cuthbertson wrote: > Hi Kirk, > > I know you haven't responded to me directly but I did your email with interest and cited it in my reply to Charlie Hunt. > > On 1/12/2013 4:39 AM, Kirk Pepperdine wrote: >> Hi Charlie, >> >> In this case I would have to say that having more frequent GCs that succeed is much better than evacuation failures. Also having different values for different heap sizes is really confusing. Is it really necessary to have different percentages for different heap sizes and is so is there a known gradient for correlating the size vs percent? >> > > Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. > > JohnC From bengt.rutisson at oracle.com Wed Jan 16 08:23:45 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 09:23:45 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F55C7A.8070508@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> Message-ID: <50F66391.6000504@oracle.com> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: > On 2013-01-15 14:32, Vitaly Davidovich wrote: >> >> Hi Jesper, >> >> Is NewRatio guaranteed to be non-zero when used inside >> recommended_heap_size? >> > As far as I can see, yes. It defaults to two and is never set to zero. No, there is no such guarantee this early in the argument parsing. The check to verify that NewRatio > 0 is done in GenCollectorPolicy::initialize_flags(), which is called later in the start up sequence than your call to CollectorPolicy::recommended_heap_size() and it is never called for G1. Running with your patch crashes: java -XX:OldSize=128m -XX:NewRatio=0 -version Floating point exception: 8 Bengt > /Jesper > >> Thanks >> >> Sent from my phone >> >> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >> > > wrote: >> >> Jon, >> >> Thank you for looking at this! I share your concerns and I have >> moved the knowledge about policies to CollectorPolicy. >> set_heap_size() now simply asks the collector policy if it has >> any recommendations regarding the heap size. >> >> Ideally, since the code knows about young and old generations, I >> guess the new function "recommended_heap_size()" should be placed >> in GenCollectorPolicy, but then the code would have to be >> duplicated for G1 as well. However, CollectorPolicy already know >> about OldSize and NewSize so I think it is OK to put it there. >> >> Eventually I think that we should reduce the abstraction level in >> the generation policies and merge CollectorPolicy, >> GenCollectorPolicy and maybe even TwoGenerationCollectorPolicy >> and if possible G1CollectorPolicy, so I don't worry too much >> about having knowledge about the two generations in CollectorPolicy. >> >> >> A new webrev is available here: >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >> >> >> Thanks, >> /Jesper >> >> >> >> On 2013-01-14 19:00, Jon Masamitsu wrote: >> >> Jesper, >> >> I'm a bit concerned that set_heap_size() now knows about how >> the CollectorPolicy uses OldSize and NewSize. In the distant >> past set_heap_size() did not know what kind of collector was >> going to be used and probably avoided looking at those >> parameters for that reason. 
Today we know that a generational >> collector is to follow but maybe you could hide that knowledge >> in CollectorPolicy somewhere and have set_heap_size() call into >> CollectorPolicy to use that information? >> >> Jon >> >> >> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >> >> Hi, >> >> I would like a couple of reviews of a small fix for >> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs >> >> Webrev: >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >> >> >> Summary: >> When starting HotSpot with an OldSize larger than the >> default heap size one will run into a couple of problems. >> Basically what happens is that the OldSize is ignored >> because it is incompatible with the heap size. A debug >> build will assert since a calculation on the way results >> in a negative number, but since it is a size_t an if(x<0) >> won't trigger and the assert catches it later on as >> incompatible flags. >> >> Changes: >> I have made two changes to fix this. >> >> The first is to change the calculation in >> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that >> it won't result in a negative number in the if statement. >> This way we will catch the case where the OldSize is >> larger than the heap size and adjust the OldSize instead >> of the young size. There are also some cosmetic changes >> here. For instance the argument min_gen0_size is actually >> used for the old generation size which was a bit >> confusing initially. I renamed it to min_gen1_size (which >> it already was called in the header file). >> >> The second change is in Arguments::set_heap_size. My >> reasoning here is that if the user sets the OldSize we >> should probably adjust the heap size to accommodate that >> OldSize instead of complaining that the heap is too >> small. We determine the heap size first and the >> generation sizes later on while initializing the VM. To >> be able to fit the generations if the user specifies >> sizes on the command line we need to look at the >> generation size flags a little already when setting up >> the heap size. >> >> Thanks, >> /Jesper >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Wed Jan 16 11:19:34 2013 From: stefan.karlsson at oracle.com (stefan.karlsson at oracle.com) Date: Wed, 16 Jan 2013 11:19:34 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8005590: java_lang_Class injected field resolved_constructor appears unused Message-ID: <20130116111937.B210A472DB@hg.openjdk.java.net> Changeset: ed6154d7d259 Author: stefank Date: 2013-01-15 13:32 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ed6154d7d259 8005590: java_lang_Class injected field resolved_constructor appears unused Reviewed-by: coleenp, dholmes ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/javaClasses.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/runtime/vmStructs.cpp From mikael.gerdin at oracle.com Wed Jan 16 11:46:28 2013 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Wed, 16 Jan 2013 12:46:28 +0100 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: <50F47361.7020306@oracle.com> References: <50F47361.7020306@oracle.com> Message-ID: <50F69314.40801@oracle.com> Bengt, On 2013-01-14 22:06, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this small change? > http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ Looks good to me. 
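For anyone skimming the thread, the change amounts to comparing each entry against the sentinel written by reset() instead of asserting that it is >= 0 -- roughly like this (assumed shape; the real template lives with G1's phase times code):

    #include <cassert>
    #include <cstddef>

    template <class T>
    class WorkerDataArraySketch {
      T*     _data;
      size_t _length;
      static T uninitialized() { return (T)(-1); }
    public:
      WorkerDataArraySketch(size_t length)
        : _data(new T[length]), _length(length) { reset(); }
      ~WorkerDataArraySketch() { delete[] _data; }

      void reset() {
        for (size_t i = 0; i < _length; i++) { _data[i] = uninitialized(); }
      }
      void set(size_t i, T value) { _data[i] = value; }

      void verify() const {
        for (size_t i = 0; i < _length; i++) {
          // Old check: assert(_data[i] >= 0) -- too strict when a double entry
          // is computed as a difference and lands just below zero.
          assert(_data[i] != uninitialized() && "entry was never set by a worker");
        }
      }
    };
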
/Mikael > > Thanks to John Cuthbertson for finding this bug and providing excellent > data to track down the issue. > > From the bug report: > > In non-product builds the WorkerDataArrays in G1 are initialized to -1 > in WorkerDataArray::reset() when a GC starts. At the end of a GC > WorkerDataArray::verify() verifies that all entries in a > WorkerDataArray has been set. Currently it does this by asserting that > the entries are >= 0. This is fine in theory since the entries should > contain counts or times that are all positive. > > The problem is that some WorkerDataArrays are of type double. And some > of those are set up through calculations using doubles. If those > calculations result in a value close to 0 we could end up with a value > slightly less than 0 since double calculations don't have full precision. > > All we really want to verify is that all the entries were set. So, it > should be enough to verify that entries do not contain the value set by > the reset() method. > > Bengt -- Mikael Gerdin Java SE VM SQE Stockholm From bengt.rutisson at oracle.com Wed Jan 16 12:01:24 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 13:01:24 +0100 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: <50F59910.2010000@oracle.com> References: <50F47361.7020306@oracle.com> <50F59910.2010000@oracle.com> Message-ID: <50F69694.30309@oracle.com> Hi John, Thanks for the review! On 1/15/13 6:59 PM, John Cuthbertson wrote: > Hi Bengt, > > Changes look good to me. Minor nits: > > Copyrights need updating. I'll leave the copyright year as is for now. There is an ongoing discussion about whether or not we need to do this. I'd prefer to wait and see what the decision is. > Use UINT32_FORMAT instead of %d in the error message. Done. > Check the indentation of the for loop in G1GCPhaseTimes::note_gc_end(). Done. Bengt > > JohnC > > On 1/14/2013 1:06 PM, Bengt Rutisson wrote: >> >> Hi all, >> >> Could I have a couple of reviews for this small change? >> http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ >> >> Thanks to John Cuthbertson for finding this bug and providing >> excellent data to track down the issue. >> >> From the bug report: >> >> In non-product builds the WorkerDataArrays in G1 are initialized to >> -1 in WorkerDataArray::reset() when a GC starts. At the end of a >> GC WorkerDataArray::verify() verifies that all entries in a >> WorkerDataArray has been set. Currently it does this by asserting >> that the entries are >= 0. This is fine in theory since the entries >> should contain counts or times that are all positive. >> >> The problem is that some WorkerDataArrays are of type double. And >> some of those are set up through calculations using doubles. If those >> calculations result in a value close to 0 we could end up with a >> value slightly less than 0 since double calculations don't have full >> precision. >> >> All we really want to verify is that all the entries were set. So, it >> should be enough to verify that entries do not contain the value set >> by the reset() method. >> >> Bengt > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bengt.rutisson at oracle.com Wed Jan 16 12:18:05 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 13:18:05 +0100 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: References: <50F47361.7020306@oracle.com> Message-ID: <50F69A7D.4010800@oracle.com> Hi Vitaly, Thanks for looking at this! On 1/15/13 2:03 PM, Vitaly Davidovich wrote: > > Hi Bengt, > > Looks good. Do you need the constants for int/double/size_t? Would it > be easier to have a static getter that returns "(T)-1" and then use > that instead? > It would work, but I'm not sure what is best. I prefer the constants, but I would be fine with a getter too. If you don't have strong objections I'll leave it as it is. Thanks, Bengt > Thanks > > Sent from my phone > > On Jan 14, 2013 4:08 PM, "Bengt Rutisson" > wrote: > > > Hi all, > > Could I have a couple of reviews for this small change? > http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ > > > Thanks to John Cuthbertson for finding this bug and providing > excellent data to track down the issue. > > From the bug report: > > In non-product builds the WorkerDataArrays in G1 are initialized > to -1 in WorkerDataArray::reset() when a GC starts. At the end > of a GC WorkerDataArray::verify() verifies that all entries in > a WorkerDataArray has been set. Currently it does this by > asserting that the entries are >= 0. This is fine in theory since > the entries should contain counts or times that are all positive. > > The problem is that some WorkerDataArrays are of type double. And > some of those are set up through calculations using doubles. If > those calculations result in a value close to 0 we could end up > with a value slightly less than 0 since double calculations don't > have full precision. > > All we really want to verify is that all the entries were set. So, > it should be enough to verify that entries do not contain the > value set by the reset() method. > > Bengt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Wed Jan 16 12:18:37 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 13:18:37 +0100 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: <50F69314.40801@oracle.com> References: <50F47361.7020306@oracle.com> <50F69314.40801@oracle.com> Message-ID: <50F69A9D.2080503@oracle.com> Thanks for the review! All set to push this now. Bengt On 1/16/13 12:46 PM, Mikael Gerdin wrote: > Bengt, > > On 2013-01-14 22:06, Bengt Rutisson wrote: >> >> Hi all, >> >> Could I have a couple of reviews for this small change? >> http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ > > Looks good to me. > > /Mikael > >> >> Thanks to John Cuthbertson for finding this bug and providing excellent >> data to track down the issue. >> >> From the bug report: >> >> In non-product builds the WorkerDataArrays in G1 are initialized to -1 >> in WorkerDataArray::reset() when a GC starts. At the end of a GC >> WorkerDataArray::verify() verifies that all entries in a >> WorkerDataArray has been set. Currently it does this by asserting that >> the entries are >= 0. This is fine in theory since the entries should >> contain counts or times that are all positive. >> >> The problem is that some WorkerDataArrays are of type double. And some >> of those are set up through calculations using doubles. 
If those >> calculations result in a value close to 0 we could end up with a value >> slightly less than 0 since double calculations don't have full >> precision. >> >> All we really want to verify is that all the entries were set. So, it >> should be enough to verify that entries do not contain the value set by >> the reset() method. >> >> Bengt > From jesper.wilhelmsson at oracle.com Wed Jan 16 12:45:14 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Jan 2013 13:45:14 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F66391.6000504@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> Message-ID: <50F6A0DA.4040108@oracle.com> On 2013-01-16 09:23, Bengt Rutisson wrote: > On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>> >>> Hi Jesper, >>> >>> Is NewRatio guaranteed to be non-zero when used inside recommended_heap_size? >>> >> As far as I can see, yes. It defaults to two and is never set to zero. > > No, there is no such guarantee this early in the argument parsing. The check > to verify that NewRatio > 0 is done in > GenCollectorPolicy::initialize_flags(), which is called later in the start > up sequence than your call to CollectorPolicy::recommended_heap_size() and > it is never called for G1. > > Running with your patch crashes: > > java -XX:OldSize=128m -XX:NewRatio=0 -version > Floating point exception: 8 Oh, yes, you're right. Sorry! Good catch Vitaly! New webrev: http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 I'm just skipping the calculation if NewRatio is zero. The VM will abort anyway as soon as it realizes that this is the case. /Jesper > Bengt >> /Jesper >> >>> Thanks >>> >>> Sent from my phone >>> >>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>> > wrote: >>> >>> Jon, >>> >>> Thank you for looking at this! I share your concerns and I have moved >>> the knowledge about policies to CollectorPolicy. set_heap_size() now >>> simply asks the collector policy if it has any recommendations >>> regarding the heap size. >>> >>> Ideally, since the code knows about young and old generations, I guess >>> the new function "recommended_heap_size()" should be placed in >>> GenCollectorPolicy, but then the code would have to be duplicated for >>> G1 as well. However, CollectorPolicy already know about OldSize and >>> NewSize so I think it is OK to put it there. >>> >>> Eventually I think that we should reduce the abstraction level in the >>> generation policies and merge CollectorPolicy, GenCollectorPolicy and >>> maybe even TwoGenerationCollectorPolicy and if possible >>> G1CollectorPolicy, so I don't worry too much about having knowledge >>> about the two generations in CollectorPolicy. >>> >>> >>> A new webrev is available here: >>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>> >>> >>> Thanks, >>> /Jesper >>> >>> >>> >>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>> >>> Jesper, >>> >>> I'm a bit concerned that set_heap_size() now knows about how >>> the CollectorPolicy uses OldSize and NewSize. In the distant >>> past set_heap_size() did not know what kind of collector was >>> going to be used and probably avoided looking at those >>> parameters for that reason. 
Today we know that a generational >>> collector is to follow but maybe you could hide that knowledge >>> in CollectorPolicy somewhere and have set_heap_size() call into >>> CollectorPolicy to use that information? >>> >>> Jon >>> >>> >>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>> >>> Hi, >>> >>> I would like a couple of reviews of a small fix for >>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs >>> >>> Webrev: >>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>> >>> >>> Summary: >>> When starting HotSpot with an OldSize larger than the default >>> heap size one will run into a couple of problems. Basically >>> what happens is that the OldSize is ignored because it is >>> incompatible with the heap size. A debug build will assert >>> since a calculation on the way results in a negative number, >>> but since it is a size_t an if(x<0) won't trigger and the >>> assert catches it later on as incompatible flags. >>> >>> Changes: >>> I have made two changes to fix this. >>> >>> The first is to change the calculation in >>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it >>> won't result in a negative number in the if statement. This >>> way we will catch the case where the OldSize is larger than >>> the heap size and adjust the OldSize instead of the young >>> size. There are also some cosmetic changes here. For instance >>> the argument min_gen0_size is actually used for the old >>> generation size which was a bit confusing initially. I renamed >>> it to min_gen1_size (which it already was called in the header >>> file). >>> >>> The second change is in Arguments::set_heap_size. My reasoning >>> here is that if the user sets the OldSize we should probably >>> adjust the heap size to accommodate that OldSize instead of >>> complaining that the heap is too small. We determine the heap >>> size first and the generation sizes later on while >>> initializing the VM. To be able to fit the generations if the >>> user specifies sizes on the command line we need to look at >>> the generation size flags a little already when setting up the >>> heap size. >>> >>> Thanks, >>> /Jesper >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 247 bytes Desc: not available URL: From bengt.rutisson at oracle.com Wed Jan 16 12:58:36 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 13:58:36 +0100 Subject: Request for review (S): 8006398: Add regression tests for deprectated GCs Message-ID: <50F6A3FC.8030000@oracle.com> Hi all, Could I have a couple of reviews for this change? http://cr.openjdk.java.net/~brutisso/8006398/webrev.00/ Recently we deprecated some GC combinations. Those should now print a warning at startup. Other GC combinations should not print any warnings. With the new process handling support that Christian T?rnqvist is adding to the JTREG tests for hotspot it is very easy to write test that start a VM and checks the output. This changes makes use of Christian's testlibrary to verify that warnings are printed as expected. I'm also adding the "gc" keyword to JTREG to make it possible to filter out GC tests. We should probably use this for all test in the the /gc folder, but I think that should be done as a separate change. 
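For context, the kind of startup warning these tests look for comes out of argument processing roughly like this (a sketch with stand-ins for the real flag globals and for the warning helper; the exact combinations and wording come from the deprecation change, which isn't shown here):

    #include <cstdio>

    // Stand-ins for the real -XX flag globals.
    static bool UseConcMarkSweepGC = true;
    static bool UseParNewGC        = false;

    // One deprecated combination, used here only as an example, is running
    // CMS with ParNew explicitly turned off (i.e. DefNew + CMS).
    static void warn_about_deprecated_gc_combinations() {
      if (UseConcMarkSweepGC && !UseParNewGC) {
        std::fprintf(stderr,
            "warning: Using the DefNew young collector with the CMS collector "
            "is deprecated and will likely be removed in a future release\n");
      }
    }

The tests then just launch a VM with such a combination through the testlibrary and check whether the output contains the warning.
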
The webrev above is based on Christian's webrev to add the testlibrary: http://cr.openjdk.java.net/~brutisso/8006413/webrev.00/ Thanks, Bengt -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Wed Jan 16 13:23:17 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 16 Jan 2013 08:23:17 -0500 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F6A0DA.4040108@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> Message-ID: Looks good Jesper. Maybe just a comment there that NewRatio hasn't been checked yet but if it's 0, VM will exit later on anyway - basically, what you said in the email :). Cheers Sent from my phone On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" wrote: > > On 2013-01-16 09:23, Bengt Rutisson wrote: > > On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: > > On 2013-01-15 14:32, Vitaly Davidovich wrote: > > Hi Jesper, > > Is NewRatio guaranteed to be non-zero when used inside > recommended_heap_size? > > As far as I can see, yes. It defaults to two and is never set to zero. > > > No, there is no such guarantee this early in the argument parsing. The > check to verify that NewRatio > 0 is done in > GenCollectorPolicy::initialize_flags(), which is called later in the start > up sequence than your call to CollectorPolicy::recommended_heap_size() and > it is never called for G1. > > Running with your patch crashes: > > java -XX:OldSize=128m -XX:NewRatio=0 -version > Floating point exception: 8 > > > Oh, yes, you're right. Sorry! > > Good catch Vitaly! > > New webrev: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 > > I'm just skipping the calculation if NewRatio is zero. The VM will abort > anyway as soon as it realizes that this is the case. > /Jesper > > > Bengt > > /Jesper > > Thanks > > Sent from my phone > On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" < > jesper.wilhelmsson at oracle.com> wrote: > >> Jon, >> >> Thank you for looking at this! I share your concerns and I have moved the >> knowledge about policies to CollectorPolicy. set_heap_size() now simply >> asks the collector policy if it has any recommendations regarding the heap >> size. >> >> Ideally, since the code knows about young and old generations, I guess >> the new function "recommended_heap_size()" should be placed in >> GenCollectorPolicy, but then the code would have to be duplicated for G1 as >> well. However, CollectorPolicy already know about OldSize and NewSize so I >> think it is OK to put it there. >> >> Eventually I think that we should reduce the abstraction level in the >> generation policies and merge CollectorPolicy, GenCollectorPolicy and maybe >> even TwoGenerationCollectorPolicy and if possible G1CollectorPolicy, so I >> don't worry too much about having knowledge about the two generations in >> CollectorPolicy. >> >> >> A new webrev is available here: >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >> >> Thanks, >> /Jesper >> >> >> >> On 2013-01-14 19:00, Jon Masamitsu wrote: >> >>> Jesper, >>> >>> I'm a bit concerned that set_heap_size() now knows about how >>> the CollectorPolicy uses OldSize and NewSize. In the distant >>> past set_heap_size() did not know what kind of collector was >>> going to be used and probably avoided looking at those >>> parameters for that reason. 
Today we know that a generational >>> collector is to follow but maybe you could hide that knowledge >>> in CollectorPolicy somewhere and have set_heap_size() call into >>> CollectorPolicy to use that information? >>> >>> Jon >>> >>> >>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>> >>>> Hi, >>>> >>>> I would like a couple of reviews of a small fix for JDK-6348447 - >>>> Specifying -XX:OldSize crashes 64-bit VMs >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>> >>>> Summary: >>>> When starting HotSpot with an OldSize larger than the default heap size >>>> one will run into a couple of problems. Basically what happens is that the >>>> OldSize is ignored because it is incompatible with the heap size. A debug >>>> build will assert since a calculation on the way results in a negative >>>> number, but since it is a size_t an if(x<0) won't trigger and the assert >>>> catches it later on as incompatible flags. >>>> >>>> Changes: >>>> I have made two changes to fix this. >>>> >>>> The first is to change the calculation in >>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it won't result in >>>> a negative number in the if statement. This way we will catch the case >>>> where the OldSize is larger than the heap size and adjust the OldSize >>>> instead of the young size. There are also some cosmetic changes here. For >>>> instance the argument min_gen0_size is actually used for the old generation >>>> size which was a bit confusing initially. I renamed it to min_gen1_size >>>> (which it already was called in the header file). >>>> >>>> The second change is in Arguments::set_heap_size. My reasoning here is >>>> that if the user sets the OldSize we should probably adjust the heap size >>>> to accommodate that OldSize instead of complaining that the heap is too >>>> small. We determine the heap size first and the generation sizes later on >>>> while initializing the VM. To be able to fit the generations if the user >>>> specifies sizes on the command line we need to look at the generation size >>>> flags a little already when setting up the heap size. >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Wed Jan 16 13:24:08 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Jan 2013 14:24:08 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> Message-ID: <50F6A9F8.10004@oracle.com> On 2013-01-16 14:23, Vitaly Davidovich wrote: > > Looks good Jesper. Maybe just a comment there that NewRatio hasn't been > checked yet but if it's 0, VM will exit later on anyway - basically, what > you said in the email :). > Yes, I'll add a comment about that. Thanks, /Jesper > Cheers > > Sent from my phone > > On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" > wrote: > > > On 2013-01-16 09:23, Bengt Rutisson wrote: >> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>> >>>> Hi Jesper, >>>> >>>> Is NewRatio guaranteed to be non-zero when used inside >>>> recommended_heap_size? >>>> >>> As far as I can see, yes. It defaults to two and is never set to zero. >> >> No, there is no such guarantee this early in the argument parsing. 
The >> check to verify that NewRatio > 0 is done in >> GenCollectorPolicy::initialize_flags(), which is called later in the >> start up sequence than your call to >> CollectorPolicy::recommended_heap_size() and it is never called for G1. >> >> Running with your patch crashes: >> >> java -XX:OldSize=128m -XX:NewRatio=0 -version >> Floating point exception: 8 > > Oh, yes, you're right. Sorry! > > Good catch Vitaly! > > New webrev: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 > > > I'm just skipping the calculation if NewRatio is zero. The VM will abort > anyway as soon as it realizes that this is the case. > /Jesper > > >> Bengt >>> /Jesper >>> >>>> Thanks >>>> >>>> Sent from my phone >>>> >>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>> >>> > wrote: >>>> >>>> Jon, >>>> >>>> Thank you for looking at this! I share your concerns and I have >>>> moved the knowledge about policies to CollectorPolicy. >>>> set_heap_size() now simply asks the collector policy if it has >>>> any recommendations regarding the heap size. >>>> >>>> Ideally, since the code knows about young and old generations, I >>>> guess the new function "recommended_heap_size()" should be placed >>>> in GenCollectorPolicy, but then the code would have to be >>>> duplicated for G1 as well. However, CollectorPolicy already know >>>> about OldSize and NewSize so I think it is OK to put it there. >>>> >>>> Eventually I think that we should reduce the abstraction level in >>>> the generation policies and merge CollectorPolicy, >>>> GenCollectorPolicy and maybe even TwoGenerationCollectorPolicy >>>> and if possible G1CollectorPolicy, so I don't worry too much >>>> about having knowledge about the two generations in CollectorPolicy. >>>> >>>> >>>> A new webrev is available here: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>> >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >>>> >>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>> >>>> Jesper, >>>> >>>> I'm a bit concerned that set_heap_size() now knows about how >>>> the CollectorPolicy uses OldSize and NewSize. In the distant >>>> past set_heap_size() did not know what kind of collector was >>>> going to be used and probably avoided looking at those >>>> parameters for that reason. Today we know that a generational >>>> collector is to follow but maybe you could hide that knowledge >>>> in CollectorPolicy somewhere and have set_heap_size() call into >>>> CollectorPolicy to use that information? >>>> >>>> Jon >>>> >>>> >>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>> >>>> Hi, >>>> >>>> I would like a couple of reviews of a small fix for >>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>> >>>> >>>> Summary: >>>> When starting HotSpot with an OldSize larger than the >>>> default heap size one will run into a couple of problems. >>>> Basically what happens is that the OldSize is ignored >>>> because it is incompatible with the heap size. A debug >>>> build will assert since a calculation on the way results >>>> in a negative number, but since it is a size_t an if(x<0) >>>> won't trigger and the assert catches it later on as >>>> incompatible flags. >>>> >>>> Changes: >>>> I have made two changes to fix this. >>>> >>>> The first is to change the calculation in >>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that >>>> it won't result in a negative number in the if statement. 
>>>> This way we will catch the case where the OldSize is >>>> larger than the heap size and adjust the OldSize instead >>>> of the young size. There are also some cosmetic changes >>>> here. For instance the argument min_gen0_size is actually >>>> used for the old generation size which was a bit >>>> confusing initially. I renamed it to min_gen1_size (which >>>> it already was called in the header file). >>>> >>>> The second change is in Arguments::set_heap_size. My >>>> reasoning here is that if the user sets the OldSize we >>>> should probably adjust the heap size to accommodate that >>>> OldSize instead of complaining that the heap is too >>>> small. We determine the heap size first and the >>>> generation sizes later on while initializing the VM. To >>>> be able to fit the generations if the user specifies >>>> sizes on the command line we need to look at the >>>> generation size flags a little already when setting up >>>> the heap size. >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available URL: From erik.helin at oracle.com Wed Jan 16 13:56:11 2013 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 16 Jan 2013 14:56:11 +0100 Subject: Request for review (S): 8006398: Add regression tests for deprectated GCs In-Reply-To: <50F6A3FC.8030000@oracle.com> References: <50F6A3FC.8030000@oracle.com> Message-ID: <50F6B17B.6040505@oracle.com> Hi Bengt, looks good! The test looks really nice with Christian's testlibrary, much more readable than using shell. Erik On 01/16/2013 01:58 PM, Bengt Rutisson wrote: > > Hi all, > > Could I have a couple of reviews for this change? > http://cr.openjdk.java.net/~brutisso/8006398/webrev.00/ > > Recently we deprecated some GC combinations. Those should now print a > warning at startup. Other GC combinations should not print any warnings. > > With the new process handling support that Christian T?rnqvist is adding > to the JTREG tests for hotspot it is very easy to write test that start > a VM and checks the output. > > This changes makes use of Christian's testlibrary to verify that > warnings are printed as expected. > > I'm also adding the "gc" keyword to JTREG to make it possible to filter > out GC tests. We should probably use this for all test in the the /gc > folder, but I think that should be done as a separate change. > > The webrev above is based on Christian's webrev to add the testlibrary: > http://cr.openjdk.java.net/~brutisso/8006413/webrev.00/ > > Thanks, > Bengt > > From bengt.rutisson at oracle.com Wed Jan 16 14:05:16 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 16 Jan 2013 15:05:16 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F6A0DA.4040108@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> Message-ID: <50F6B39C.9080802@oracle.com> On 1/16/13 1:45 PM, Jesper Wilhelmsson wrote: > > On 2013-01-16 09:23, Bengt Rutisson wrote: >> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>> >>>> Hi Jesper, >>>> >>>> Is NewRatio guaranteed to be non-zero when used inside >>>> recommended_heap_size? >>>> >>> As far as I can see, yes. 
It defaults to two and is never set to zero. >> >> No, there is no such guarantee this early in the argument parsing. >> The check to verify that NewRatio > 0 is done in >> GenCollectorPolicy::initialize_flags(), which is called later in the >> start up sequence than your call to >> CollectorPolicy::recommended_heap_size() and it is never called for G1. >> >> Running with your patch crashes: >> >> java -XX:OldSize=128m -XX:NewRatio=0 -version >> Floating point exception: 8 > > Oh, yes, you're right. Sorry! > > Good catch Vitaly! > > New webrev: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 > > I'm just skipping the calculation if NewRatio is zero. The VM will > abort anyway as soon as it realizes that this is the case. It is not enough to check NewRatio != 0 since it is a signed value. You should check NewRatio > 0. Or change the declaration of NewRatio from intx to uintx. As it is now I think you will get a really huge result from recommended_heap_size() if someone sets -XX:NewRatio=-10 on the command line since you will be converting negative integers to a size_t value. As I mentioned before the NewRatio > 0 check is not done for G1. But this should probably be considered a separate bug. On the other hand if we could make G1 inherit from GenCollectorPolicy it would do the NewRatio > 0 check and we would also have a better place to put the OldSize checks that you want to add. Two bugs for the price of one ;) > java -XX:+UseG1GC -XX:NewRatio=-1 -version # # A fatal error has been detected by the Java Runtime Environment: # # SIGFPE (0x8) at pc=0x000000010f5468a8, pid=31206, tid=4611 # # JRE version: (8.0-b68) (build ) # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b16-internal-jvmg mixed mode bsd-amd64 compressed oops) # Problematic frame: # V [libjvm.dylib+0x6618a8] G1YoungGenSizer::heap_size_changed(unsigned int)+0x14a # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again # # An error report file with more information is saved as: # /Users/brutisso/repos/hs-gc/hs_err_pid31206.log # # If you would like to submit a bug report, please visit: # http://bugreport.sun.com/bugreport/crash.jsp # Current thread is 4611 Dumping core ... Abort trap: 6 Bengt > /Jesper > > >> Bengt >>> /Jesper >>> >>>> Thanks >>>> >>>> Sent from my phone >>>> >>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>> >>> > wrote: >>>> >>>> Jon, >>>> >>>> Thank you for looking at this! I share your concerns and I have >>>> moved the knowledge about policies to CollectorPolicy. >>>> set_heap_size() now simply asks the collector policy if it has >>>> any recommendations regarding the heap size. >>>> >>>> Ideally, since the code knows about young and old generations, >>>> I guess the new function "recommended_heap_size()" should be >>>> placed in GenCollectorPolicy, but then the code would have to >>>> be duplicated for G1 as well. However, CollectorPolicy already >>>> know about OldSize and NewSize so I think it is OK to put it there. >>>> >>>> Eventually I think that we should reduce the abstraction level >>>> in the generation policies and merge CollectorPolicy, >>>> GenCollectorPolicy and maybe even TwoGenerationCollectorPolicy >>>> and if possible G1CollectorPolicy, so I don't worry too much >>>> about having knowledge about the two generations in >>>> CollectorPolicy. 
>>>> >>>> >>>> A new webrev is available here: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>> >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >>>> >>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>> >>>> Jesper, >>>> >>>> I'm a bit concerned that set_heap_size() now knows about how >>>> the CollectorPolicy uses OldSize and NewSize. In the distant >>>> past set_heap_size() did not know what kind of collector was >>>> going to be used and probably avoided looking at those >>>> parameters for that reason. Today we know that a generational >>>> collector is to follow but maybe you could hide that knowledge >>>> in CollectorPolicy somewhere and have set_heap_size() call into >>>> CollectorPolicy to use that information? >>>> >>>> Jon >>>> >>>> >>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>> >>>> Hi, >>>> >>>> I would like a couple of reviews of a small fix for >>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>> >>>> >>>> Summary: >>>> When starting HotSpot with an OldSize larger than the >>>> default heap size one will run into a couple of >>>> problems. Basically what happens is that the OldSize is >>>> ignored because it is incompatible with the heap size. >>>> A debug build will assert since a calculation on the >>>> way results in a negative number, but since it is a >>>> size_t an if(x<0) won't trigger and the assert catches >>>> it later on as incompatible flags. >>>> >>>> Changes: >>>> I have made two changes to fix this. >>>> >>>> The first is to change the calculation in >>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that >>>> it won't result in a negative number in the if >>>> statement. This way we will catch the case where the >>>> OldSize is larger than the heap size and adjust the >>>> OldSize instead of the young size. There are also some >>>> cosmetic changes here. For instance the argument >>>> min_gen0_size is actually used for the old generation >>>> size which was a bit confusing initially. I renamed it >>>> to min_gen1_size (which it already was called in the >>>> header file). >>>> >>>> The second change is in Arguments::set_heap_size. My >>>> reasoning here is that if the user sets the OldSize we >>>> should probably adjust the heap size to accommodate >>>> that OldSize instead of complaining that the heap is >>>> too small. We determine the heap size first and the >>>> generation sizes later on while initializing the VM. To >>>> be able to fit the generations if the user specifies >>>> sizes on the command line we need to look at the >>>> generation size flags a little already when setting up >>>> the heap size. >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jesper.wilhelmsson at oracle.com Wed Jan 16 15:36:38 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 16 Jan 2013 16:36:38 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50F6B39C.9080802@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> <50F6B39C.9080802@oracle.com> Message-ID: <50F6C906.4080003@oracle.com> On 2013-01-16 15:05, Bengt Rutisson wrote: > On 1/16/13 1:45 PM, Jesper Wilhelmsson wrote: >> >> On 2013-01-16 09:23, Bengt Rutisson wrote: >>> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>>> >>>>> Hi Jesper, >>>>> >>>>> Is NewRatio guaranteed to be non-zero when used inside recommended_heap_size? >>>>> >>>> As far as I can see, yes. It defaults to two and is never set to zero. >>> >>> No, there is no such guarantee this early in the argument parsing. The >>> check to verify that NewRatio > 0 is done in >>> GenCollectorPolicy::initialize_flags(), which is called later in the start >>> up sequence than your call to CollectorPolicy::recommended_heap_size() and >>> it is never called for G1. >>> >>> Running with your patch crashes: >>> >>> java -XX:OldSize=128m -XX:NewRatio=0 -version >>> Floating point exception: 8 >> >> Oh, yes, you're right. Sorry! >> >> Good catch Vitaly! >> >> New webrev: >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 >> >> I'm just skipping the calculation if NewRatio is zero. The VM will abort >> anyway as soon as it realizes that this is the case. > > It is not enough to check NewRatio != 0 since it is a signed value. You should > check NewRatio > 0. Or change the declaration of NewRatio from intx to uintx. Checking for != 0 is enough to protect the division but I can change it to > 0 if you have strong feelings about it. The flag verification will catch that -1 is an invalid number for NewRatio. The fact that it's not caught in G1 is a bug in G1 that already exist today and is unrelated to this change. I agree that NewRatio should be unsigned, but that's a separate bug. I created JDK-8006432 for that. https://jbs.oracle.com/bugs/browse/JDK-8006432 > As it is now I think you will get a really huge result from > recommended_heap_size() if someone sets -XX:NewRatio=-10 on the command line > since you will be converting negative integers to a size_t value. Actually no. The two negative NewRatio will be multiplied and give a positive answer: (OldSize / NewRatio) * (NewRatio + 1) The corner case -1 will return 0. But it really doesn't matter what the function returns if NewRatio is zero or negative since the VM will abort shortly after anyway. (I bluntly ignore the G1 bug in this statement.) > As I mentioned before the NewRatio > 0 check is not done for G1. But this > should probably be considered a separate bug. On the other hand if we could > make G1 inherit from GenCollectorPolicy it would do the NewRatio > 0 check and > we would also have a better place to put the OldSize checks that you want to > add. Two bugs for the price of one ;) The entire collector policy mess needs a round with a sledge hammer as I mentioned in my reply to Jon earlier in this mail thread. That is however a larger change that I think is unrelated to this change. OK, not completely unrelated, but it deserves its own CR. 
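To make the guard being discussed concrete, a minimal sketch follows; the exact shape of recommended_heap_size() is assumed here rather than copied from the webrev, and it uses the stricter NewRatio > 0 test:

    size_t CollectorPolicy::recommended_heap_size() {
      // NewRatio has not been validated at this point of argument parsing.
      // A zero or negative value is rejected later and makes the VM exit,
      // so just skip the recommendation instead of risking a division by zero.
      if (NewRatio <= 0) {
        return 0;
      }
      // Scale the user-specified OldSize up to a full heap according to NewRatio.
      return (OldSize / NewRatio) * (NewRatio + 1);
    }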
The crash below is unrelated to this change and will happen with your suggested NewRatio > 0 as well. /Jesper > > > java -XX:+UseG1GC -XX:NewRatio=-1 -version > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGFPE (0x8) at pc=0x000000010f5468a8, pid=31206, tid=4611 > # > # JRE version: (8.0-b68) (build ) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b16-internal-jvmg mixed > mode bsd-amd64 compressed oops) > # Problematic frame: > # V [libjvm.dylib+0x6618a8] G1YoungGenSizer::heap_size_changed(unsigned > int)+0x14a > # > # Failed to write core dump. Core dumps have been disabled. To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /Users/brutisso/repos/hs-gc/hs_err_pid31206.log > # > # If you would like to submit a bug report, please visit: > # http://bugreport.sun.com/bugreport/crash.jsp > # > Current thread is 4611 > Dumping core ... > Abort trap: 6 > > > Bengt > >> /Jesper >> >> >>> Bengt >>>> /Jesper >>>> >>>>> Thanks >>>>> >>>>> Sent from my phone >>>>> >>>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>>> > wrote: >>>>> >>>>> Jon, >>>>> >>>>> Thank you for looking at this! I share your concerns and I have moved >>>>> the knowledge about policies to CollectorPolicy. set_heap_size() now >>>>> simply asks the collector policy if it has any recommendations >>>>> regarding the heap size. >>>>> >>>>> Ideally, since the code knows about young and old generations, I >>>>> guess the new function "recommended_heap_size()" should be placed in >>>>> GenCollectorPolicy, but then the code would have to be duplicated for >>>>> G1 as well. However, CollectorPolicy already know about OldSize and >>>>> NewSize so I think it is OK to put it there. >>>>> >>>>> Eventually I think that we should reduce the abstraction level in the >>>>> generation policies and merge CollectorPolicy, GenCollectorPolicy and >>>>> maybe even TwoGenerationCollectorPolicy and if possible >>>>> G1CollectorPolicy, so I don't worry too much about having knowledge >>>>> about the two generations in CollectorPolicy. >>>>> >>>>> >>>>> A new webrev is available here: >>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>>> >>>>> >>>>> Thanks, >>>>> /Jesper >>>>> >>>>> >>>>> >>>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>>> >>>>> Jesper, >>>>> >>>>> I'm a bit concerned that set_heap_size() now knows about how >>>>> the CollectorPolicy uses OldSize and NewSize. In the distant >>>>> past set_heap_size() did not know what kind of collector was >>>>> going to be used and probably avoided looking at those >>>>> parameters for that reason. Today we know that a generational >>>>> collector is to follow but maybe you could hide that knowledge >>>>> in CollectorPolicy somewhere and have set_heap_size() call into >>>>> CollectorPolicy to use that information? >>>>> >>>>> Jon >>>>> >>>>> >>>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>>> >>>>> Hi, >>>>> >>>>> I would like a couple of reviews of a small fix for >>>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>>> >>>>> >>>>> Summary: >>>>> When starting HotSpot with an OldSize larger than the default >>>>> heap size one will run into a couple of problems. Basically >>>>> what happens is that the OldSize is ignored because it is >>>>> incompatible with the heap size. 
A debug build will assert >>>>> since a calculation on the way results in a negative number, >>>>> but since it is a size_t an if(x<0) won't trigger and the >>>>> assert catches it later on as incompatible flags. >>>>> >>>>> Changes: >>>>> I have made two changes to fix this. >>>>> >>>>> The first is to change the calculation in >>>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so that it >>>>> won't result in a negative number in the if statement. This >>>>> way we will catch the case where the OldSize is larger than >>>>> the heap size and adjust the OldSize instead of the young >>>>> size. There are also some cosmetic changes here. For instance >>>>> the argument min_gen0_size is actually used for the old >>>>> generation size which was a bit confusing initially. I >>>>> renamed it to min_gen1_size (which it already was called in >>>>> the header file). >>>>> >>>>> The second change is in Arguments::set_heap_size. My >>>>> reasoning here is that if the user sets the OldSize we should >>>>> probably adjust the heap size to accommodate that OldSize >>>>> instead of complaining that the heap is too small. We >>>>> determine the heap size first and the generation sizes later >>>>> on while initializing the VM. To be able to fit the >>>>> generations if the user specifies sizes on the command line >>>>> we need to look at the generation size flags a little already >>>>> when setting up the heap size. >>>>> >>>>> Thanks, >>>>> /Jesper >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 247 bytes Desc: not available URL: From jon.masamitsu at oracle.com Wed Jan 16 16:50:52 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 16 Jan 2013 08:50:52 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: References: <50F424A0.6080907@oracle.com> Message-ID: <50F6DA6C.906@oracle.com> I've added checks to arguments.cpp that are analogous to the checks for MinHeapFreeRatio / MaxHeapFreeRatio Changes since webrev.00 are in arguments.cpp http://cr.openjdk.java.net/~jmasa/8005452/webrev.01/ Thanks, Vitaly. Jon On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: > Hi Jon, > > Does it make sense to validate that the new flags are consistent (I.e. max >> = min)? That is, if user changes one or both such that max< min, should > VM report an error and not start? > > Thanks > > Sent from my phone > On Jan 14, 2013 10:31 AM, "Jon Masamitsu" wrote: > >> 8005452: Create new flags for Metaspace resizing policy >> >> Previously the calculation of the metadata capacity at which >> to do a GC (high water mark, HWM) to recover >> unloaded classes used the MinHeapFreeRatio >> and MaxHeapFreeRatio to decide on the next HWM. That >> generally left an excessive amount of unused capacity for >> metadata. This change adds specific flags for metadata >> capacity with defaults more conservative in terms of >> unused capacity. >> >> Added an additional check for doing a GC before expanding >> the metadata capacity. Required adding a new parameter to >> get_new_chunk(). >> >> Added some additional diagnostic prints. >> >> http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ >> >> Thanks. 
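For reference, such a check in arguments.cpp typically mirrors the existing MinHeapFreeRatio / MaxHeapFreeRatio validation; a sketch along those lines, assuming it sits in Arguments::check_vm_args_consistency() where the local 'status' accumulates the result (the exact message text in the actual webrev may differ):

    if (MaxMetaspaceFreeRatio < MinMetaspaceFreeRatio) {
      // Reject inconsistent settings at startup instead of letting the
      // metaspace resizing policy run with a reversed interval.
      jio_fprintf(defaultStream::error_stream(),
                  "MaxMetaspaceFreeRatio (" UINTX_FORMAT ") must be greater than "
                  "or equal to MinMetaspaceFreeRatio (" UINTX_FORMAT ")\n",
                  MaxMetaspaceFreeRatio, MinMetaspaceFreeRatio);
      status = false;
    }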
>> From stefan.karlsson at oracle.com Wed Jan 16 17:58:08 2013 From: stefan.karlsson at oracle.com (stefan.karlsson at oracle.com) Date: Wed, 16 Jan 2013 17:58:08 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8005994: Method annotations are allocated unnecessarily during class file parsing Message-ID: <20130116175813.1E14C472FB@hg.openjdk.java.net> Changeset: ff0a7943fd29 Author: stefank Date: 2013-01-15 10:09 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ff0a7943fd29 8005994: Method annotations are allocated unnecessarily during class file parsing Summary: Also reviewed by: vitalyd at gmail.com Reviewed-by: coleenp, acorn ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/prims/jvm.cpp From vitalyd at gmail.com Wed Jan 16 18:08:43 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 16 Jan 2013 13:08:43 -0500 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <50F6DA6C.906@oracle.com> References: <50F424A0.6080907@oracle.com> <50F6DA6C.906@oracle.com> Message-ID: That looks good Jon. Thanks Sent from my phone On Jan 16, 2013 11:51 AM, "Jon Masamitsu" wrote: > I've added checks to arguments.cpp that are analogous to the > checks for MinHeapFreeRatio / MaxHeapFreeRatio > > Changes since webrev.00 are in arguments.cpp > > http://cr.openjdk.java.net/~**jmasa/8005452/webrev.01/ > > Thanks, Vitaly. > > Jon > > On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: > >> Hi Jon, >> >> Does it make sense to validate that the new flags are consistent (I.e. max >> >>> = min)? That is, if user changes one or both such that max< min, should >>> >> VM report an error and not start? >> >> Thanks >> >> Sent from my phone >> On Jan 14, 2013 10:31 AM, "Jon Masamitsu"> >> wrote: >> >> 8005452: Create new flags for Metaspace resizing policy >>> >>> Previously the calculation of the metadata capacity at which >>> to do a GC (high water mark, HWM) to recover >>> unloaded classes used the MinHeapFreeRatio >>> and MaxHeapFreeRatio to decide on the next HWM. That >>> generally left an excessive amount of unused capacity for >>> metadata. This change adds specific flags for metadata >>> capacity with defaults more conservative in terms of >>> unused capacity. >>> >>> Added an additional check for doing a GC before expanding >>> the metadata capacity. Required adding a new parameter to >>> get_new_chunk(). >>> >>> Added some additional diagnostic prints. >>> >>> http://cr.openjdk.java.net/~****jmasa/8005452/webrev.00/ >>> >>> > >>> >>> Thanks. >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Wed Jan 16 18:10:30 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 16 Jan 2013 10:10:30 -0800 Subject: Request for review (S): 8006242: G1: WorkerDataArray::verify() too strict for double calculations In-Reply-To: <50F69694.30309@oracle.com> References: <50F47361.7020306@oracle.com> <50F59910.2010000@oracle.com> <50F69694.30309@oracle.com> Message-ID: <50F6ED16.7080309@oracle.com> Hi Bengt, Excellent. Ship it. JohnC On 1/16/2013 4:01 AM, Bengt Rutisson wrote: > > Hi John, > > Thanks for the review! > > On 1/15/13 6:59 PM, John Cuthbertson wrote: >> Hi Bengt, >> >> Changes look good to me. Minor nits: >> >> Copyrights need updating. > > I'll leave the copyright year as is for now. There is an ongoing > discussion about whether or not we need to do this. I'd prefer to wait > and see what the decision is. 
> >> Use UINT32_FORMAT instead of %d in the error message. > > Done. > >> Check the indentation of the for loop in G1GCPhaseTimes::note_gc_end(). > > Done. > > Bengt > >> >> JohnC >> >> On 1/14/2013 1:06 PM, Bengt Rutisson wrote: >>> >>> Hi all, >>> >>> Could I have a couple of reviews for this small change? >>> http://cr.openjdk.java.net/~brutisso/8006242/webrev.00/ >>> >>> Thanks to John Cuthbertson for finding this bug and providing >>> excellent data to track down the issue. >>> >>> From the bug report: >>> >>> In non-product builds the WorkerDataArrays in G1 are initialized to >>> -1 in WorkerDataArray::reset() when a GC starts. At the end of a >>> GC WorkerDataArray::verify() verifies that all entries in a >>> WorkerDataArray has been set. Currently it does this by asserting >>> that the entries are >= 0. This is fine in theory since the entries >>> should contain counts or times that are all positive. >>> >>> The problem is that some WorkerDataArrays are of type double. And >>> some of those are set up through calculations using doubles. If >>> those calculations result in a value close to 0 we could end up with >>> a value slightly less than 0 since double calculations don't have >>> full precision. >>> >>> All we really want to verify is that all the entries were set. So, >>> it should be enough to verify that entries do not contain the value >>> set by the reset() method. >>> >>> Bengt >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Wed Jan 16 18:12:33 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 16 Jan 2013 10:12:33 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50F65851.6050207@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <50F59482.6060900@oracle.com> <7D0DFCF4-4F58-4902-BDC0-E1868BB5D786@salesforce.com> <50F5A3B2.9010505@oracle.com> <50F65851.6050207@oracle.com> Message-ID: <50F6ED91.7040806@oracle.com> Hi Everyone, Thanks. The change is now just changing some of the defaults in g1_globals.hpp. Ready to push this now. JohnC On 1/15/2013 11:35 PM, Bengt Rutisson wrote: > > Hi all, > > I haven't commented in this email thread but I've been following the > discussion with interest. Since I was the one who brought up the > question around the 4GB limit, I'd just like to state that I agree > with the decision to skip this limit. > > Thanks, > Bengt > > > On 1/15/13 8:20 PM, Charlie Hunt wrote: >> Avg Response Time ... (sigh) --- one of our favorite subjects. ;-) >> >> You're right, if the marking cycles start earlier than ideally >> desired, you end up under-utilizing heap space and potentially having >> to tame mixed GCs. But, G1 has a tunable we can set to start the >> marking cycle later. The challenge there is setting the initiating >> heap occupancy percent too high and losing the race. But, by setting >> it higher (and avoiding losing the race) with larger heaps hopefully >> translates to more "good candidate" old gen regions to collect and >> also hopefully makes the exercise of taming mixed GCs a little easier >> too. >> >> Thanks for sharing your thoughts. >> >> charlie ... 
>> >> On Jan 15, 2013, at 12:45 PM, Monica Beckwith wrote: >> >>> Thanks, Charlie - >>> >>> If I may add two more things to John's points below and also expand >>> a bit on the "latency" comment - >>> Even though we talk about latency, in reality, I have seen many >>> people with bigger heap (around 200Gs) requirements really concerned >>> about ART (Average Response Time)/ Throughput. >>> Also, we should remember that if the marking cycle is triggered >>> earlier and more often, then we may end up under-utilizing the >>> bigger heaps and will definitely have to spend time "taming the >>> mixedGCs" :) >>> >>> just my 2 cents. >>> >>> -Monica >>> >>> On 1/15/2013 12:01 PM, Charlie Hunt wrote: >>>> Hi John, >>>> >>>> Completely agree with the excellent points you mention below (thanks for being thorough and listing them!). >>>> >>>> Given G1 is (somewhat) positioned as a collector to use when improved latency is an important criteria, I think the tradeoffs are something people are willing to live with too. >>>> >>>> Fwiw, you have my "ok" to go ahead with your suggestion to apply the new young gen bounds to all heap sizes. >>>> >>>> hths, >>>> >>>> charlie ... >>>> >>>> On Jan 15, 2013, at 11:40 AM, John Cuthbertson wrote: >>>> >>>>> Hi Charlie >>>>> >>>>> Thanks for looking over the changes. Replies inline.... >>>>> >>>>> >>>>> On 1/11/2013 11:32 AM, Charlie Hunt wrote: >>>>>> Hi John, >>>>>> >>>>>> Fwiw, I'm fine with Bengt's suggestion of having G1NewSizePercent the same for all Java heap sizes. >>>>> I don't have a problem with this. By applying it heaps > 4GB , I was >>>>> just being conservative. >>>>> >>>>>> I'm on the fence with whether to do the same with G1MaxNewSizePercent. For me I find the MaxNewSizePercent a bit tricky than NewSizePercent. WIth NewSizePercent, if young gen is sized "too small", I think the worst case is we have some GCs that are well below the pause time target. But, with MaxNewSizePercent, if it's allowed to get "too big", then the worst case is evacuation failures. >>>>>> >>>>>> So, if you did move MaxNewSizePercent down to 60, we'd have a situation where we'd be less likely to have evacuation failures. Perhaps it's ok to apply this change to all Java heap sizes too? >>>>> Again I don't have a problem with applying the new value to all heap >>>>> sizes but I am a little concerned about the implications. The benefit is >>>>> definitely less risk of evacuation failures but the it could also >>>>> >>>>> * increase the number of young GCs: >>>>> ** increasing the GC overhead and increasing the heap slightly more >>>>> aggressively >>>>> ** lowering throughput >>>>> * slightly increase the amount that gets promoted >>>>> ** triggering marking cycles earlier and more often (increased SATB >>>>> barrier overhead) >>>>> ** more cards to be refined (we only refine cards in old regions) >>>>> increasing the write barrier costs and the RS updating phase of the pauses, >>>>> ** increases the importance of "taming the mixed GCs". >>>>> >>>>> >From Kirk's email it sounds like this is a trade off people are >>>>> prepared to live with. >>>>> >>>>> Unless I hear any objections, I'll apply the new young gen bounds to all >>>>> heap sizes. >>>>> >>>>> JohnC >>> >>> -- >>> >>> Monica Beckwith | Java Performance Engineer >>> VOIP: +1 512 401 1274 >>> Texas >>> >>> Oracle is committed to developing practices and products that help >>> protect the environment >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From john.cuthbertson at oracle.com Wed Jan 16 18:17:58 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 16 Jan 2013 10:17:58 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <9CA215C3-6216-409E-BCAF-185835FDC3EB@kodewerk.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> <50F5950F.8040206@oracle.com> <9CA215C3-6216-409E-BCAF-185835FDC3EB@kodewerk.com> Message-ID: <50F6EED6.5060501@oracle.com> Hi Kirk, If there were a truly compelling reason then I would defend change and *try* to explain the reason. In this case it was just conservatism - changing the defaults of flags that can really alter behavior always makes me slightly nervous. :) What do you mean by an incremental mode for G1? Anything you can cite? JohnC On 1/16/2013 12:09 AM, Kirk Pepperdine wrote: > Hi John, > > You know, there might be a good reason to have different values for different heap sizes.. some thing that makes sense when you look at the implementation. If so, that might justify the need to do this. I just don't understand *why*? But maybe that's just me. I'm not responsible for the implementation, I just help people deal with what's on the table and so unless something seem really not right, like dropping incremental modes, I'll pass comment and then shutup to let you get on with it... ;-) > > BTW, not to stir up any trouble but it would be nice to have a incremental mode for G1 for machines with large number of cores. > > Regards, > Kirk > > On 2013-01-15, at 9:42 AM, John Cuthbertson wrote: > >> Hi Kirk, >> >> I know you haven't responded to me directly but I did your email with interest and cited it in my reply to Charlie Hunt. >> >> On 1/12/2013 4:39 AM, Kirk Pepperdine wrote: >>> Hi Charlie, >>> >>> In this case I would have to say that having more frequent GCs that succeed is much better than evacuation failures. Also having different values for different heap sizes is really confusing. Is it really necessary to have different percentages for different heap sizes and is so is there a known gradient for correlating the size vs percent? >>> >> Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. >> >> JohnC From kirk at kodewerk.com Wed Jan 16 18:34:56 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Wed, 16 Jan 2013 10:34:56 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <50F6EED6.5060501@oracle.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> <50F5950F.8040206@oracle.com> <9CA215C3-6216-409E-BCAF-185835FDC3EB@kodewerk.com> <50F6EED6.5060501@oracle.com> Message-ID: <6B6E066B-A07B-4428-AB91-1715E89DB63B@kodewerk.com> Well, I have an app running on a box with 24 cores with low latency concerns. The app isn't using all 24 cores which means I'd be happy to give the collector 12 of them if it ti were able to use them all the time without pausing the app. Regards, Kirk On 2013-01-16, at 10:17 AM, John Cuthbertson wrote: > Hi Kirk, > > If there were a truly compelling reason then I would defend change and *try* to explain the reason. 
In this case it was just conservatism - changing the defaults of flags that can really alter behavior always makes me slightly nervous. :) > > What do you mean by an incremental mode for G1? Anything you can cite? > > JohnC > > On 1/16/2013 12:09 AM, Kirk Pepperdine wrote: >> Hi John, >> >> You know, there might be a good reason to have different values for different heap sizes.. some thing that makes sense when you look at the implementation. If so, that might justify the need to do this. I just don't understand *why*? But maybe that's just me. I'm not responsible for the implementation, I just help people deal with what's on the table and so unless something seem really not right, like dropping incremental modes, I'll pass comment and then shutup to let you get on with it... ;-) >> >> BTW, not to stir up any trouble but it would be nice to have a incremental mode for G1 for machines with large number of cores. >> >> Regards, >> Kirk >> >> On 2013-01-15, at 9:42 AM, John Cuthbertson wrote: >> >>> Hi Kirk, >>> >>> I know you haven't responded to me directly but I did your email with interest and cited it in my reply to Charlie Hunt. >>> >>> On 1/12/2013 4:39 AM, Kirk Pepperdine wrote: >>>> Hi Charlie, >>>> >>>> In this case I would have to say that having more frequent GCs that succeed is much better than evacuation failures. Also having different values for different heap sizes is really confusing. Is it really necessary to have different percentages for different heap sizes and is so is there a known gradient for correlating the size vs percent? >>>> >>> Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. >>> >>> JohnC > From john.cuthbertson at oracle.com Wed Jan 16 19:17:47 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 16 Jan 2013 11:17:47 -0800 Subject: RFR(XS): 8001425: G1: Change the default values for certain G1 specific flags In-Reply-To: <6B6E066B-A07B-4428-AB91-1715E89DB63B@kodewerk.com> References: <50EE0B21.4000909@oracle.com> <50EE92D8.8050903@oracle.com> <50EF0CCA.8070005@oracle.com> <55413B02-B4BF-49E1-B895-BE5814CEC603@salesforce.com> <7F0ED8CA-F067-40F4-85E0-1143013B601D@kodewerk.com> <50F5950F.8040206@oracle.com> <9CA215C3-6216-409E-BCAF-185835FDC3EB@kodewerk.com> <50F6EED6.5060501@oracle.com> <6B6E066B-A07B-4428-AB91-1715E89DB63B@kodewerk.com> Message-ID: <50F6FCDB.70302@oracle.com> Hi Kirk, You should be able to give all the cores to the STW GCs (ParallelGCThreads) unless your application is in JNI when a STW GC starts. You can also explicitly set the number of concurrent marking threads (ConcGCThreads) and concurrent refinement threads (G1ConcRefinementThreads) to the number of cores you are prepared to give up when the application is running. When a marking cycle is started all of the marking threads are activated and participate equally. The activation of the concurrent refinement threads is stepped, i.e. when the number of pending remembered set updates goes above a threshold the next thread is activated and so on. Once the final refinement thread is activated, if the number of pending updates is still above the next step, the application threads are employed to update the remembered sets. Once the number of pending updates drops below the thresholds the application threads stop doing the work. The refinement threads are progressively deactivated as the number of pending updates further reduces. Choosing the right mix of concurrent threads depends upon your application. 
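Purely as an illustration of that split on a 24-core box (the thread counts are made-up starting points rather than recommendations, and the application name is hypothetical):

    java -XX:+UseG1GC -XX:ParallelGCThreads=24 -XX:ConcGCThreads=4 \
         -XX:G1ConcRefinementThreads=8 MyLowLatencyApp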
Since you most likely do not want your application threads to do any processing of pending remembered set updates, I would bias towards more refinement threads and less marking threads. If your marking cycles are taking a long time and the amount of old data mutation is low then I would suggest biasing toward more marking threads. I think you would find a sweet spot for either/both somewhere between 4 and 8 cores. But definitely give as many cores as you can to the STW phases. But I'll let the more experienced performance guys chime in now. :) JohnC On 1/16/2013 10:34 AM, Kirk Pepperdine wrote: > Well, I have an app running on a box with 24 cores with low latency concerns. The app isn't using all 24 cores which means I'd be happy to give the collector 12 of them if it ti were able to use them all the time without pausing the app. > > Regards, > Kirk > On 2013-01-16, at 10:17 AM, John Cuthbertson wrote: > >> Hi Kirk, >> >> If there were a truly compelling reason then I would defend change and *try* to explain the reason. In this case it was just conservatism - changing the defaults of flags that can really alter behavior always makes me slightly nervous. :) >> >> What do you mean by an incremental mode for G1? Anything you can cite? >> >> JohnC >> >> On 1/16/2013 12:09 AM, Kirk Pepperdine wrote: >>> Hi John, >>> >>> You know, there might be a good reason to have different values for different heap sizes.. some thing that makes sense when you look at the implementation. If so, that might justify the need to do this. I just don't understand *why*? But maybe that's just me. I'm not responsible for the implementation, I just help people deal with what's on the table and so unless something seem really not right, like dropping incremental modes, I'll pass comment and then shutup to let you get on with it... ;-) >>> >>> BTW, not to stir up any trouble but it would be nice to have a incremental mode for G1 for machines with large number of cores. >>> >>> Regards, >>> Kirk >>> >>> On 2013-01-15, at 9:42 AM, John Cuthbertson wrote: >>> >>>> Hi Kirk, >>>> >>>> I know you haven't responded to me directly but I did your email with interest and cited it in my reply to Charlie Hunt. >>>> >>>> On 1/12/2013 4:39 AM, Kirk Pepperdine wrote: >>>>> Hi Charlie, >>>>> >>>>> In this case I would have to say that having more frequent GCs that succeed is much better than evacuation failures. Also having different values for different heap sizes is really confusing. Is it really necessary to have different percentages for different heap sizes and is so is there a known gradient for correlating the size vs percent? >>>>> >>>> Unless I hear any objections, I'll apply the new young gen bounds to all heap sizes. >>>> >>>> JohnC From john.cuthbertson at oracle.com Wed Jan 16 21:57:07 2013 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Wed, 16 Jan 2013 21:57:07 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8001425: G1: Change the default values for certain G1 specific flags Message-ID: <20130116215711.98E1B47316@hg.openjdk.java.net> Changeset: 4967eb4f67a9 Author: johnc Date: 2013-01-15 12:32 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4967eb4f67a9 8001425: G1: Change the default values for certain G1 specific flags Summary: Changes to default and ergonomic flag values recommended by performance team. Changes were also reviewed by Monica Beckwith . Reviewed-by: brutisso, huntch ! 
src/share/vm/gc_implementation/g1/g1_globals.hpp

From stefan.karlsson at oracle.com Thu Jan 17 10:50:32 2013 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 17 Jan 2013 11:50:32 +0100 Subject: Review Request: 8006513: Null pointer in DefaultMethods::generate_default_methods when merging annotations Message-ID: <50F7D778.5010300@oracle.com>

http://cr.openjdk.java.net/~stefank/8006513/webrev.00/

This fixes a bug introduced in the fix for JDK-8005994, which is surfacing now that HotSpot is combined with JDK8-b73. The failing code path was not exercised when JPRT and nightly testing ran with JDK8-b72. This fix is urgent and is going to be pushed soon, since it blocks all other pushes to hotspot-gc.

thanks, StefanK

From michal at frajt.eu Thu Jan 17 10:58:20 2013 From: michal at frajt.eu (Michal Frajt) Date: Thu, 17 Jan 2013 11:58:20 +0100 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: <50EF228C.3030009@oracle.com> References: <508EB0D7.8020204@oracle.com> <CABzyjykeMB3gNgoCkd-n4CgCDhzHUt4pdvNutGo7PqoJ7xtB6g@mail.gmail.com> <50C108C2.9@oracle.com> <CABzyjy=VeiKEpARPdPAZLpdpibLKZxsCMcuMhfhswAi0CTF5Dg@mail.gmail.com> <CABzyjymQJArReNr2xQ9pYA6kUMqcmxW9fpiykic=Nrw8AaDG5g@mail.gmail.com> <MEO9HC$71BBF3DAB4563B2D26BA972076AA26C1@frajt.eu> <MEX4B1$E1E07BA2F872E0495C06E5D1E52F22E9@frajt.eu> <50EF1A85.4010203@oracle.com> <50EF228C.3030009@oracle.com> Message-ID:

Hi John, Please apply the attached patch to the webrev. You are right, the setting of the CMS token has been somehow moved back above the method return. Additionally I have fixed the printf of the unsigned loop counter (correct is %u). Regards, Michal

Od: hotspot-gc-dev-bounces at openjdk.java.net Komu: hotspot-gc-dev at openjdk.java.net Kopie: Datum: Thu, 10 Jan 2013 12:20:28 -0800 Předmět: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS

> Hi Michal, > > On 1/10/2013 11:46 AM, John Cuthbertson wrote: > > Hi Michal, > > > > Many apologies for the delay in generating a new webrev for this > > change but here is the new one: > > http://cr.openjdk.java.net/~johnc/7189971/webrev.1/ > > > > Can you verify the webrev to make sure that changes have been applied > > correctly? Looking at the new webrev it seems that the setting of the > > CMS has been moved back above the return out of the loop. Was this > > intentional? > > The above should be "... setting of the CMS token has been ...". > > JohnC > > > > > I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 > > and CMSWaitDuration=1500 with CMS. > > > > Regards, > > > > JohnC > > > > On 12/12/2012 4:35 AM, Michal Frajt wrote: > >> All, > >> Find the attached patch. It implements proposed recommendations and > >> requested changes. Please mind that the CMSWaitDuration set to -1 > >> (never wait) requires new parameter CMSCheckInterval (develop only, > >> 1000 milliseconds default - constant). The parameter defines the > >> next CMS cycle start check interval in the case there are no > >> desynchronization (notifications) events on the CGC_lock.
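Expressed as a flag declaration, a develop-only parameter like that would look roughly as follows; the name and default come from the description above, while the type and doc string are assumptions:

    develop(uintx, CMSCheckInterval, 1000,
            "Interval (in milliseconds) at which the CMS thread re-checks "
            "whether a collection should start when CMSWaitDuration is -1 "
            "and no notification has arrived on the CGC_lock")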
> >> > >> Tested with the Solaris/amd64 build > >> CMS > >> + CMSWaitDuration>0 OK > >> + CMSWaitDuration=0 OK > >> + CMSWaitDuration<0 OK > >> iCMS > >> + CMSWaitDuration>0 OK > >> + CMSWaitDuration=0 OK > >> + CMSWaitDuration<0 OK > >> Regards, > >> Michal > >> Od: hotspot-gc-dev-bounces at openjdk.java.net > >> Komu: hotspot-gc-dev at openjdk.java.net > >> Kopie: > >> Datum: Fri, 7 Dec 2012 18:48:48 +0100 > >> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for > >> non-incremental mode of CMS > >> > >>> Hi John/Jon/Ramki, > >>> > >>> All proposed recommendations and requested changes have been > >>> implemented. We are going to test it on Monday. You will get the new > >>> tested patch soon. > >>> > >>> The attached code here just got compiled, no test executed yet, it > >>> might contain a bug, but you can quickly review it and send your > >>> comments. > >>> > >>> Best regards > >>> Michal > >>> > >>> > >>> // Wait until the next synchronous GC, a concurrent full gc request, > >>> // or a timeout, whichever is earlier. > >>> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long > >>> t_millis) { > >>> // Wait time in millis or 0 value representing infinite wait for > >>> a scavenge > >>> assert(t_millis >= 0, "Wait time for scavenge should be 0 or > >>> positive"); > >>> > >>> GenCollectedHeap* gch = GenCollectedHeap::heap(); > >>> double start_time_secs = os::elapsedTime(); > >>> double end_time_secs = start_time_secs + (t_millis / ((double) > >>> MILLIUNITS)); > >>> > >>> // Total collections count before waiting loop > >>> unsigned int before_count; > >>> { > >>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); > >>> before_count = gch->total_collections(); > >>> } > >>> > >>> unsigned int loop_count = 0; > >>> > >>> while(!_should_terminate) { > >>> double now_time = os::elapsedTime(); > >>> long wait_time_millis; > >>> > >>> if(t_millis != 0) { > >>> // New wait limit > >>> wait_time_millis = (long) ((end_time_secs - now_time) * > >>> MILLIUNITS); > >>> if(wait_time_millis <= 0) { > >>> // Wait time is over > >>> break; > >>> } > >>> } else { > >>> // No wait limit, wait if necessary forever > >>> wait_time_millis = 0; > >>> } > >>> > >>> // Wait until the next event or the remaining timeout > >>> { > >>> MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag); > >>> > >>> set_CMS_flag(CMS_cms_wants_token); // to provoke notifies > >>> if (_should_terminate || _collector->_full_gc_requested) { > >>> return; > >>> } > >>> assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); > >>> CGC_lock->wait(Mutex::_no_safepoint_check_flag, > >>> wait_time_millis); > >>> clear_CMS_flag(CMS_cms_wants_token); > >>> assert(!CMS_flag_is_set(CMS_cms_has_token | > >>> CMS_cms_wants_token), > >>> "Should not be set"); > >>> } > >>> > >>> // Extra wait time check before entering the heap lock to get > >>> the collection count > >>> if(t_millis != 0 && os::elapsedTime() >= end_time_secs) { > >>> // Wait time is over > >>> break; > >>> } > >>> > >>> // Total collections count after the event > >>> unsigned int after_count; > >>> { > >>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); > >>> after_count = gch->total_collections(); > >>> } > >>> > >>> if(before_count != after_count) { > >>> // There was a collection - success > >>> break; > >>> } > >>> > >>> // Too many loops warning > >>> if(++loop_count == 0) { > >>> warning("wait_on_cms_lock_for_scavenge() has looped %d > >>> times", loop_count - 1); > >>> } > >>> } > >>> } > >>> > >>> void 
ConcurrentMarkSweepThread::sleepBeforeNextCycle() { > >>> while (!_should_terminate) { > >>> if (CMSIncrementalMode) { > >>> icms_wait(); > >>> if(CMSWaitDuration >= 0) { > >>> // Wait until the next synchronous GC, a concurrent full gc > >>> // request or a timeout, whichever is earlier. > >>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); > >>> } > >>> return; > >>> } else { > >>> if(CMSWaitDuration >= 0) { > >>> // Wait until the next synchronous GC, a concurrent full gc > >>> // request or a timeout, whichever is earlier. > >>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); > >>> } else { > >>> // Wait until any cms_lock event not to call > >>> shouldConcurrentCollect permanently > >>> wait_on_cms_lock(0); > >>> } > >>> } > >>> // Check if we should start a CMS collection cycle > >>> if (_collector->shouldConcurrentCollect()) { > >>> return; > >>> } > >>> // .. collection criterion not yet met, let's go back > >>> // and wait some more > >>> } > >>> } > >>> > >>> Od: hotspot-gc-dev-bounces at openjdk.java.net > >>> Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" > >>> john.cuthbertson at oracle.com > >>> Kopie: hotspot-gc-dev at openjdk.java.net > >>> Datum: Thu, 6 Dec 2012 23:43:29 -0800 > >>> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for > >>> non-incremental mode of CMS > >>> > >>>> Hi John -- > >>>> > >>>> wrt the changes posted, i see the intent of the code and agree with > >>>> it. I have a few minor suggestions on the > >>>> details of how it's implemented. My comments are inline below, > >>>> interleaved with the code: > >>>> > >>>> 317 // Wait until the next synchronous GC, a concurrent full gc > >>>> request, > >>>> 318 // or a timeout, whichever is earlier. > >>>> 319 void > >>>> ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long > >>>> t_millis) { > >>>> 320 // Wait for any cms_lock event when timeout not specified > >>>> (0 millis) > >>>> 321 if (t_millis == 0) { > >>>> 322 wait_on_cms_lock(t_millis); > >>>> 323 return; > >>>> 324 } > >>>> > >>>> I'd completely avoid the special case above because it would miss the > >>>> part about waiting for a > >>>> scavenge, instead dealing with that case in the code in the loop below > >>>> directly. The idea > >>>> of the "0" value is not to ask that we return immediately, but that we > >>>> wait, if necessary > >>>> forever, for a scavenge. The "0" really represents the value infinity > >>>> in that sense. This would > >>>> be in keeping with our use of wait() with a "0" value for timeout at > >>>> other places in the JVM as > >>>> well, so it's consistent. > >>>> > >>>> 325 > >>>> 326 GenCollectedHeap* gch = GenCollectedHeap::heap(); > >>>> 327 double start_time = os::elapsedTime(); > >>>> 328 double end_time = start_time + (t_millis / 1000.0); > >>>> > >>>> Note how, the end_time == start_time for the special case of t_millis > >>>> == 0, so we need to treat that > >>>> case specially below. > >>>> > >>>> 329 > >>>> 330 // Total collections count before waiting loop > >>>> 331 unsigned int before_count; > >>>> 332 { > >>>> 333 MutexLockerEx hl(Heap_lock, > >>>> Mutex::_no_safepoint_check_flag); > >>>> 334 before_count = gch->total_collections(); > >>>> 335 } > >>>> > >>>> Good. 
> >>>> > >>>> 336 > >>>> 337 while (true) { > >>>> 338 double now_time = os::elapsedTime(); > >>>> 339 long wait_time_millis = (long)((end_time - now_time) * > >>>> 1000.0); > >>>> 340 > >>>> 341 if (wait_time_millis <= 0) { > >>>> 342 // Wait time is over > >>>> 343 break; > >>>> 344 } > >>>> > >>>> Modify to: > >>>> if (t_millis != 0) { > >>>> if (wait_time_millis <= 0) { > >>>> // Wait time is over > >>>> break; > >>>> } > >>>> } else { > >>>> wait_time_millis = 0; // for use in wait() below > >>>> } > >>>> > >>>> 345 > >>>> 346 // Wait until the next event or the remaining timeout > >>>> 347 { > >>>> 348 MutexLockerEx x(CGC_lock, > >>>> Mutex::_no_safepoint_check_flag); > >>>> 349 if (_should_terminate || _collector->_full_gc_requested) { > >>>> 350 return; > >>>> 351 } > >>>> 352 set_CMS_flag(CMS_cms_wants_token); // to provoke > >>>> notifies > >>>> > >>>> insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); > >>>> > >>>> 353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, > >>>> wait_time_millis); > >>>> 354 clear_CMS_flag(CMS_cms_wants_token); > >>>> 355 assert(!CMS_flag_is_set(CMS_cms_has_token | > >>>> CMS_cms_wants_token), > >>>> 356 "Should not be set"); > >>>> 357 } > >>>> 358 > >>>> 359 // Extra wait time check before entering the heap lock to > >>>> get > >>>> the collection count > >>>> 360 if (os::elapsedTime() >= end_time) { > >>>> 361 // Wait time is over > >>>> 362 break; > >>>> 363 } > >>>> > >>>> Modify above wait time check to make an exception for t_miliis == 0: > >>>> // Extra wait time check before checking collection count > >>>> if (t_millis != 0 && os::elapsedTime() >= end_time) { > >>>> // wait time exceeded > >>>> break; > >>>> } > >>>> > >>>> 364 > >>>> 365 // Total collections count after the event > >>>> 366 unsigned int after_count; > >>>> 367 { > >>>> 368 MutexLockerEx hl(Heap_lock, > >>>> Mutex::_no_safepoint_check_flag); > >>>> 369 after_count = gch->total_collections(); > >>>> 370 } > >>>> 371 > >>>> 372 if (before_count != after_count) { > >>>> 373 // There was a collection - success > >>>> 374 break; > >>>> 375 } > >>>> 376 } > >>>> 377 } > >>>> > >>>> While it is true that we do not have a case where the method is called > >>>> with a time of "0", I think we > >>>> want that value to be treated correctly as "infinity". For the case > >>>> where we do not want a wait at all, > >>>> we should use a small positive value, like "1 ms" to signal that > >>>> intent, i.e. -XX:CMSWaitDuration=1, > >>>> reserving CMSWaitDuration=0 to signal infinity. (We could also do that > >>>> by reserving negative values to > >>>> signal infinity, but that would make the code in the loop a bit > >>>> more fiddly.) > >>>> > >>>> As mentioned in my previous email, I'd like to see this tested with > >>>> CMSWaitDuration set to 0, positive and > >>>> negative values (if necessary, we can reject negative value settings), > >>>> and with ExplicitGCInvokesConcurrent. > >>>> > >>>> Rest looks OK to me, although I am not sure how this behaves with > >>>> iCMS, as I have forgotten that part of the > >>>> code. > >>>> > >>>> Finally, in current code (before these changes) there are two callers > >>>> of the former wait_for_cms_lock() method, > >>>> one here in sleepBeforeNextCycle() and one from the precleaning loop. > >>>> I think the right thing has been done > >>>> in terms of leaving the latter alone. 
> >>>> > >>>> It would be good if this were checked with CMSInitiatingOccupancy set > >>>> to 0 (or a small value), CMSWaitDuration set to 0, > >>>> -+PromotionFailureALot and checking that (1) it does not deadlock (2) > >>>> CMS cycles start very soon after the end of > >>>> a scavenge (and not at random times as Michal has observed earlier, > >>>> although i am guessing that is difficult to test). > >>>> It would be good to repeat the above test with iCMS as well. > >>>> > >>>> thanks! > >>>> -- ramki > >>>> > >>>> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote: > >>>>> Thanks Jon for the pointer: > >>>>> > >>>>> > >>>>> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote: > >>>>>> > >>>>>> > >>>>>> On 12/05/12 14:47, Srinivas Ramakrishna wrote: > >>>>>>> The high level idea looks correct. I'll look at the details in a > >>>>>>> bit (seriously this time; sorry it dropped off my plate last > >>>>>>> time I promised). > >>>>>>> Does anyone have a pointer to the related discussion thread on > >>>>>>> this aias from earlier in the year, by chance, so one could > >>>>>>> refresh one's > >>>>>>> memory of that discussion? > >>>>>> > >>>>>> subj: CMSWaitDuration unstable behavior > >>>>>> > >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html > >>>>>> > >>>>>> > >>>>>> > >>>>> also: > >>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html > >>>>> > >>>>> On to it later this afternoon, and TTYL w/review. > >>>>> - ramki > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: openjdk7u-hotspot-7189971_v2.patch Type: application/octet-stream Size: 1410 bytes Desc: not available URL: From michal at frajt.eu Thu Jan 17 11:00:14 2013 From: michal at frajt.eu (Michal Frajt) Date: Thu, 17 Jan 2013 12:00:14 +0100 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: References: =?iso-8859-1?q?=3C508EB0D7=2E8020204=40oracle=2Ecom=3E_=3CCABzyjykeMB?= =?iso-8859-1?q?3gNgoCkd=2Dn4CgCDhzHUt4pdvNutGo7PqoJ7xtB6g=40mail=2Egm?= =?iso-8859-1?q?ail=2Ecom=3E_=3C50C108C2=2E9=40oracle=2Ecom=3E_=3CCABz?= =?iso-8859-1?q?yjy=3DVeiKEpARPdPAZLpdpibLKZxsCMcuMhfhswAi0CTF5Dg=40ma?= =?iso-8859-1?q?il=2Egmail=2Ecom=3E_=3CCABzyjymQJArReNr2xQ9pYA6kUMqcmx?= =?iso-8859-1?q?W9fpiykic=3DNrw8AaDG5g=40mail=2Egmail=2Ecom=3E_=3CMEO9?= =?iso-8859-1?q?HC=2471BBF3DAB4563B2D26BA972076AA26C1=40frajt=2Eeu=3E_?= =?iso-8859-1?q?=3CMEX4B1=24E1E07BA2F872E0495C06E5D1E52F22E9=40frajt?= =?iso-8859-1?q?=2Eeu=3E_=3C50EF1A85=2E4010203=40oracle=2Ecom=3E_=3C50?= =?iso-8859-1?q?EF228C=2E3030009=40oracle=2Ecom=3E_=3Cop=2Ewqprdrvhtc8?= =?iso-8859-1?q?ri4=40eckenfels02=2Eseeburger=2Ede=3E?= Message-ID: Hi Bernd, The catching up with the filling old gen is handled via the _full_gc_requested state (set and notify on the CGC_lock). The RMI GC interval is probably not handled. The current CMS wait duration implementation is having endless wait limit implemented very same way. There is no code protection to help with situations where the wakeup (notify) is missing. The new imlementation does not make anything worse or better in the endless wait for the scavenge. The endless (forever) wait support was implemented on ramki's request. Please find here http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-December/005394.html his detailed implementation proposal. 
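To make the wake-up path concrete, here is a small sketch of what the requesting side does. The names are taken from the code earlier in this thread; this is an illustration only, not a verbatim quote of the collector sources, and the exact notify call used there may differ:

  {
    MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag);
    _full_gc_requested = true;   // seen by the waiting CMS thread
    CGC_lock->notify_all();      // wakes CGC_lock->wait(...) inside
                                 // wait_on_cms_lock_for_scavenge()
  }

The woken thread then returns immediately on its _should_terminate / _collector->_full_gc_requested check instead of sitting out the remaining timeout, so a full gc request is never delayed by the wait.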
Regards, Michal Od: hotspot-gc-dev-bounces at openjdk.java.net Komu: hotspot-gc-dev at openjdk.java.net Kopie: Datum: Thu, 10 Jan 2013 22:02:05 +0100 P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS > Hello, > > two amateur :) questions: > > Am 10.01.2013, 21:20 Uhr, schrieb John Cuthbertson > : > >> I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 > >> and CMSWaitDuration=1500 with CMS. > > Is there a risk involved in waiting long/endless? For example larger than > some RMI GC intervall or too long for catching up with the filling of OG? > How to test for that? > > >>>> // No wait limit, wait if necessary forever > >>>> wait_time_millis = 0; > > I typically not use a endless wait limit when there is a loop with a fixed > endtime but use a large but limited number. This helps to catch situations > where wakeup is missing (for example on shutdown) or lost. Would it be an > option to use something like 10s? > > Gruss > Bernd > -- > http://bernd.eckenfels.net From john.cuthbertson at oracle.com Thu Jan 17 20:02:34 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 17 Jan 2013 12:02:34 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <50F51F00.6040008@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> Message-ID: <50F858DA.8050508@oracle.com> Hi Bengt, There's a new webrev at: http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ It looks larger than the previous webrev but the most of the change was tweaking comments. The actual code changes are smaller. Testing was the same as before. On 1/15/2013 1:18 AM, Bengt Rutisson wrote: > > I see. I didn't think about the difference betweeen ParallelGCThreads > and ParallelRefProcEnabled. BTW, not part of this change, but why do > we have ParallelRefProcEnabled? And why is it false by default? > Wouldn't it make more sense to have it just be dependent on > ParallelGCThreads? I don't know and the answer is probably lost in the dark depths of time - I can only speculate. For G1 we have a CR to turn ParallelRefProcEnabled on if the number of GC threads > 1. I'm not sure about the other collectors. > >> Setting it once in weakRefsWork() will not be sufficient. We will run >> into an assertion failure in >> ParallelTaskTerminator::offer_termination(). >> >> During the reference processing, the do_void() method of the >> complete_gc oop closure (in our case the complete gc oop closure is >> an instance of G1CMParDrainMarkingStackClosure) is called multiple >> times (in process_phase1, sometimes process_phase2, process_phase3, >> and process_phaseJNI) >> >> Setting the phase sets the number of active tasks (or threads) that >> the termination protocol in do_marking_step() will wait for. When an >> invocation of do_marking_step() offers termination, the number of >> tasks/threads in the terminator instance is decremented. So Setting >> the phase once will let the first execution of do_marking_step (with >> termination) from process_phase1() succeed, but subsequent calls to >> do_marking_step() will result in the assertion failure. >> >> We also can't unconditionally set it in the do_void() method or even >> the constructor of G1CMParDrainMarkingStackClosure. Separate >> instances of this closure are created by each of the worker threads >> in the MT-case. 
>> >> Note when processing is multi-threaded the complete_gc instance used >> is the one passed into the ProcessTask's work method (passed into >> process_discovered_references() using the task executor instance) >> which may not necessarily be the same complete gc instance as the one >> passed directly into process_discovered_references(). > > Thanks for this detailed explanation. It really helped! > > I understand the issue now, but I still think it is very confusing > that _cm->set_phase() is called from > G1CMRefProcTaskExecutor::execute() in the multithreaded case and from > G1CMParDrainMarkingStackClosure::do_void() in the single threaded case. > >> It might be possible to record whether processing is MT in the >> G1CMRefProcTaskExecutor class and always pass the executor instance >> into process_discovered_references. We could then set processing to >> MT so that the execute() methods in the executor instance are invoked >> but call the Proxy class' work method directly. Then we could >> override the set_single_threaded() routine (called just before >> process_phaseJNI) to set the phase. > > I think this would be a better solution, but if I understand it > correctly it would mean that we would have to change all the > collectors to always pass a TaskExecutor. All of them currently pass > NULL in the non-MT case. I think it would be simpler if they always > passed a TaskExecutor but it is a pretty big change. I wasn't meaning to do that for the other collectors just G1's concurrent mark reference processor i.e. fool the ref processor into think it's MT so that the parallel task executor is used but only use the work gang if reference processing was _really_ MT. I decided not to do this as there is an easier way. For the non-MT case we do not need to enter the termination protocol in CMTask::do_marking_step(). When there's only one thread we don't need to use the ParallelTaskTerminator to wait for other threads. And we certainly don't need stealing. Hence the solution is to only do the termination and stealing if the closure is instantiated for MT reference processing. That removes the set_phase call(). > Another possibility is to introduce some kind of prepare method to the > VoidClosure (or maybe in a specialized subclass for ref processing). > Then we could do something like: > > complete_gc->prologue(); > if (mt_processing) { > RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() > /*marks_oops_alive*/); > task_executor->execute(phase2); > } else { > for (uint i = 0; i < _max_num_q; i++) { > process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc); > } > } > > G1CMParDrainMarkingStackClosure::prologue() could do the call to > _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not > have to do it. The above is a reasonable extension to the reference processing code. I no longer need this feature for this change but we should submit a CR for it. I'll do that. > BTW, not really part of your change, but above code is duplicated > three times in ReferenceProcessor::process_discovered_reflist(). Would > be nice to factor this out to a method. Completely agree. Again I'll submit a CR for it. 
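For reference, the extension could look roughly like this. This is purely a sketch: the class name is made up and none of it exists in the sources yet.

  // Hypothetical intermediate closure type for reference processing,
  // giving the complete_gc closure a "prepare" hook:
  class RefProcCompleteGCClosure : public VoidClosure {
   public:
    virtual void prologue() { }   // default: nothing to prepare
    // do_void() stays pure virtual and unchanged in the subclasses
  };

  // Each phase in ReferenceProcessor::process_discovered_reflist() would
  // then call complete_gc->prologue() before the MT/serial branch, and
  // G1CMParDrainMarkingStackClosure::prologue() would make the
  // _cm->set_phase() call, so it is no longer split between the task
  // executor and the do_void() method.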
Thanks, JohnC From chunt at salesforce.com Thu Jan 17 20:52:57 2013 From: chunt at salesforce.com (Charlie Hunt) Date: Thu, 17 Jan 2013 12:52:57 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <50F858DA.8050508@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> Message-ID: <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> John / Bengt: I think I can offer a bit of info on Bengt's earlier question about ParallelProcRefEnabled being disabled by default. IIRC, there was one workload that showed a slight perf regression with +ParallelProcRefEnabled. That workload that showed a regression may not be as relevant as it was back when the evaluation / decision was made to disable it by default? You both have probably thought about this already? My reaction is ... I think reasonable defaults would be to enable +ParallelProcRefEnabled for Parallel[Old], CMS and G1 when ParallelGCThreads is greater than 1, and disable -ParallelProcRefEnabled with -XX:+UseSerialGC. hths, charlie ... On Jan 17, 2013, at 3:02 PM, John Cuthbertson wrote: > Hi Bengt, > > There's a new webrev at: http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ > > It looks larger than the previous webrev but the most of the change was > tweaking comments. The actual code changes are smaller. > > Testing was the same as before. > > On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >> >> I see. I didn't think about the difference betweeen ParallelGCThreads >> and ParallelRefProcEnabled. BTW, not part of this change, but why do >> we have ParallelRefProcEnabled? And why is it false by default? >> Wouldn't it make more sense to have it just be dependent on >> ParallelGCThreads? > > I don't know and the answer is probably lost in the dark depths of time > - I can only speculate. For G1 we have a CR to turn > ParallelRefProcEnabled on if the number of GC threads > 1. I'm not sure > about the other collectors. > >> >>> Setting it once in weakRefsWork() will not be sufficient. We will run >>> into an assertion failure in >>> ParallelTaskTerminator::offer_termination(). >>> >>> During the reference processing, the do_void() method of the >>> complete_gc oop closure (in our case the complete gc oop closure is >>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>> times (in process_phase1, sometimes process_phase2, process_phase3, >>> and process_phaseJNI) >>> >>> Setting the phase sets the number of active tasks (or threads) that >>> the termination protocol in do_marking_step() will wait for. When an >>> invocation of do_marking_step() offers termination, the number of >>> tasks/threads in the terminator instance is decremented. So Setting >>> the phase once will let the first execution of do_marking_step (with >>> termination) from process_phase1() succeed, but subsequent calls to >>> do_marking_step() will result in the assertion failure. >>> >>> We also can't unconditionally set it in the do_void() method or even >>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>> instances of this closure are created by each of the worker threads >>> in the MT-case. 
>>> >>> Note when processing is multi-threaded the complete_gc instance used >>> is the one passed into the ProcessTask's work method (passed into >>> process_discovered_references() using the task executor instance) >>> which may not necessarily be the same complete gc instance as the one >>> passed directly into process_discovered_references(). >> >> Thanks for this detailed explanation. It really helped! >> >> I understand the issue now, but I still think it is very confusing >> that _cm->set_phase() is called from >> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >> G1CMParDrainMarkingStackClosure::do_void() in the single threaded case. >> >>> It might be possible to record whether processing is MT in the >>> G1CMRefProcTaskExecutor class and always pass the executor instance >>> into process_discovered_references. We could then set processing to >>> MT so that the execute() methods in the executor instance are invoked >>> but call the Proxy class' work method directly. Then we could >>> override the set_single_threaded() routine (called just before >>> process_phaseJNI) to set the phase. >> >> I think this would be a better solution, but if I understand it >> correctly it would mean that we would have to change all the >> collectors to always pass a TaskExecutor. All of them currently pass >> NULL in the non-MT case. I think it would be simpler if they always >> passed a TaskExecutor but it is a pretty big change. > > I wasn't meaning to do that for the other collectors just G1's > concurrent mark reference processor i.e. fool the ref processor into > think it's MT so that the parallel task executor is used but only use > the work gang if reference processing was _really_ MT. > > I decided not to do this as there is an easier way. For the non-MT case > we do not need to enter the termination protocol in > CMTask::do_marking_step(). When there's only one thread we don't need to > use the ParallelTaskTerminator to wait for other threads. And we > certainly don't need stealing. Hence the solution is to only do the > termination and stealing if the closure is instantiated for MT reference > processing. That removes the set_phase call(). > >> Another possibility is to introduce some kind of prepare method to the >> VoidClosure (or maybe in a specialized subclass for ref processing). >> Then we could do something like: >> >> complete_gc->prologue(); >> if (mt_processing) { >> RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() >> /*marks_oops_alive*/); >> task_executor->execute(phase2); >> } else { >> for (uint i = 0; i < _max_num_q; i++) { >> process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc); >> } >> } >> >> G1CMParDrainMarkingStackClosure::prologue() could do the call to >> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >> have to do it. > > The above is a reasonable extension to the reference processing code. I > no longer need this feature for this change but we should submit a CR > for it. I'll do that. > >> BTW, not really part of your change, but above code is duplicated >> three times in ReferenceProcessor::process_discovered_reflist(). Would >> be nice to factor this out to a method. > > Completely agree. Again I'll submit a CR for it. 
> > Thanks, > > JohnC From jon.masamitsu at oracle.com Thu Jan 17 23:15:36 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 17 Jan 2013 15:15:36 -0800 Subject: Request for review 8006537: Missing initialization of Metaspace variables with -Xshare:dump Message-ID: <50F88618.1070602@oracle.com> 8006537: Missing initialization of Metaspace variables with -Xshare:dump Always initialize _first_chunk_word_size and _first_class_chunk_word_size. Prior to b73 these variables were not being used extensively (if at all) when DumpSharedSpace was on. With b73 they need to be used. When DumpSharedSpace was on previous to b73 there was not a second call to the constructor for VirtualSpaceNode so the initialization done for DumpSharedSpace was not called a second time and did not cause a problem. With b73 and DumpSharedSpace it is called a second time so the initialization for DumpSharedSpace had to be short circuited. This is a workaround. A better fix would be to move the DumpSharedSpace initialization code to an appropriate place. http://cr.openjdk.java.net/~jmasa/8006537/webrev.00/ Thanks. From jon.masamitsu at oracle.com Fri Jan 18 00:00:02 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 17 Jan 2013 16:00:02 -0800 Subject: Request for review 8006537: Missing initialization of Metaspace variables with -Xshare:dump In-Reply-To: <50F88618.1070602@oracle.com> References: <50F88618.1070602@oracle.com> Message-ID: <50F89082.5030808@oracle.com> JohnC, Thanks for you prompt review. All, These bugs have broken the hotspot build so I'm eager to get them back so will try pushing them soon. Other comments are always welcome. Jon On 01/17/13 15:15, Jon Masamitsu wrote: > 8006537: Missing initialization of Metaspace variables with -Xshare:dump > > Always initialize _first_chunk_word_size and > _first_class_chunk_word_size. > Prior to b73 these variables were not being used extensively (if at all) > when DumpSharedSpace was on. With b73 they need to be used. > > When DumpSharedSpace was on previous to b73 there was not a second > call to the constructor for VirtualSpaceNode so the initialization > done for > DumpSharedSpace was not called a second time and did not cause a problem. > With b73 and DumpSharedSpace it is called a second time so the > initialization > for DumpSharedSpace had to be short circuited. This is a workaround. A > better fix would be to move the DumpSharedSpace initialization code to an > appropriate place. > > http://cr.openjdk.java.net/~jmasa/8006537/webrev.00/ > > Thanks. From vladimir.kozlov at oracle.com Fri Jan 18 00:23:39 2013 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Jan 2013 16:23:39 -0800 Subject: Request for review 8006537: Missing initialization of Metaspace variables with -Xshare:dump In-Reply-To: <50F88618.1070602@oracle.com> References: <50F88618.1070602@oracle.com> Message-ID: <50F8960B.2010306@oracle.com> Good. Thank you for fixing it. Vladimir On 1/17/13 3:15 PM, Jon Masamitsu wrote: > 8006537: Missing initialization of Metaspace variables with -Xshare:dump > > Always initialize _first_chunk_word_size and _first_class_chunk_word_size. > Prior to b73 these variables were not being used extensively (if at all) > when DumpSharedSpace was on. With b73 they need to be used. > > When DumpSharedSpace was on previous to b73 there was not a second > call to the constructor for VirtualSpaceNode so the initialization done for > DumpSharedSpace was not called a second time and did not cause a problem. 
> With b73 and DumpSharedSpace it is called a second time so the > initialization > for DumpSharedSpace had to be short circuited. This is a workaround. A > better fix would be to move the DumpSharedSpace initialization code to an > appropriate place. > > http://cr.openjdk.java.net/~jmasa/8006537/webrev.00/ > > Thanks. From jon.masamitsu at oracle.com Fri Jan 18 02:44:04 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 17 Jan 2013 18:44:04 -0800 Subject: Request for review 8006537: Assert when dumping archive with default methods In-Reply-To: <50F89082.5030808@oracle.com> References: <50F88618.1070602@oracle.com> <50F89082.5030808@oracle.com> Message-ID: <50F8B6F4.9020307@oracle.com> New summary but same bug - 8006537 New fix from Coleen that is a fix and not a workaround. http://cr.openjdk.java.net/~jmasa/8006537/webrev.00/ Sorry about wasting people's time with the previous attempt. Thanks. Jon On 01/17/13 16:00, Jon Masamitsu wrote: > JohnC, > > Thanks for you prompt review. > > All, > > These bugs have broken the hotspot build so I'm > eager to get them back so will try pushing them > soon. Other comments are always welcome. > > Jon > > On 01/17/13 15:15, Jon Masamitsu wrote: >> 8006537: Missing initialization of Metaspace variables with >> -Xshare:dump >> >> Always initialize _first_chunk_word_size and >> _first_class_chunk_word_size. >> Prior to b73 these variables were not being used extensively (if at all) >> when DumpSharedSpace was on. With b73 they need to be used. >> >> When DumpSharedSpace was on previous to b73 there was not a second >> call to the constructor for VirtualSpaceNode so the initialization >> done for >> DumpSharedSpace was not called a second time and did not cause a >> problem. >> With b73 and DumpSharedSpace it is called a second time so the >> initialization >> for DumpSharedSpace had to be short circuited. This is a workaround. A >> better fix would be to move the DumpSharedSpace initialization code >> to an >> appropriate place. >> >> http://cr.openjdk.java.net/~jmasa/8006537/webrev.00/ >> >> Thanks. From jon.masamitsu at oracle.com Fri Jan 18 07:27:53 2013 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Fri, 18 Jan 2013 07:27:53 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20130118072759.BA1CE473A6@hg.openjdk.java.net> Changeset: 2dce7c34c564 Author: stefank Date: 2013-01-17 11:39 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2dce7c34c564 8006513: Null pointer in DefaultMethods::generate_default_methods when merging annotations Reviewed-by: brutisso, jfranck ! src/share/vm/classfile/defaultMethods.cpp Changeset: 59a58e20dc60 Author: jmasa Date: 2013-01-17 19:04 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/59a58e20dc60 8006537: Assert when dumping archive with default methods Reviewed-by: coleenp ! src/share/vm/classfile/classLoaderData.cpp ! src/share/vm/memory/metadataFactory.hpp From alejandro.murillo at oracle.com Fri Jan 18 19:11:53 2013 From: alejandro.murillo at oracle.com (alejandro.murillo at oracle.com) Date: Fri, 18 Jan 2013 19:11:53 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 35 new changesets Message-ID: <20130118191312.40533473D7@hg.openjdk.java.net> Changeset: 41ccb2e737fb Author: katleman Date: 2013-01-16 11:59 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/41ccb2e737fb Added tag jdk8-b73 for changeset 11619f33cd68 ! 
.hgtags Changeset: 1a3e54283c54 Author: katleman Date: 2013-01-16 20:53 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1a3e54283c54 Merge ! .hgtags Changeset: adc176e95bf2 Author: acorn Date: 2013-01-09 11:39 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/adc176e95bf2 8005689: InterfaceAccessFlagsTest failures in Lambda-JDK tests Summary: Fix verifier for new interface access flags Reviewed-by: acorn, kvn Contributed-by: bharadwaj.yadavalli at oracle.com ! src/share/vm/classfile/classFileParser.cpp Changeset: dd7248d3e151 Author: zgu Date: 2013-01-09 14:46 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/dd7248d3e151 7152671: RFE: Windows decoder should add some std dirs to the symbol search path Summary: Added JRE/JDK bin directories to decoder's symbol search path Reviewed-by: dcubed, sla ! src/os/windows/vm/decoder_windows.cpp ! src/os/windows/vm/decoder_windows.hpp Changeset: 97ee8abd6ab2 Author: zgu Date: 2013-01-09 12:10 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/97ee8abd6ab2 Merge Changeset: aefb345d3f5e Author: acorn Date: 2013-01-10 17:38 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/aefb345d3f5e 7199207: NPG: Crash in PlaceholderTable::verify after StackOverflow Summary: Reduce scope of placeholder table entries to improve cleanup Reviewed-by: dholmes, coleenp ! src/share/vm/classfile/placeholders.cpp ! src/share/vm/classfile/placeholders.hpp ! src/share/vm/classfile/systemDictionary.cpp ! src/share/vm/utilities/exceptions.hpp Changeset: 91bf7da5c609 Author: mikael Date: 2013-01-10 17:06 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/91bf7da5c609 8004747: Remove last_entry from VM_STRUCT macros Summary: Instead of passing in last_entry to all the VM_ macros just expand it in the main vmStructs.cpp file. Reviewed-by: dholmes, sspitsyn, minqi ! src/cpu/sparc/vm/vmStructs_sparc.hpp ! src/cpu/x86/vm/vmStructs_x86.hpp ! src/cpu/zero/vm/vmStructs_zero.hpp ! src/os_cpu/bsd_x86/vm/vmStructs_bsd_x86.hpp ! src/os_cpu/bsd_zero/vm/vmStructs_bsd_zero.hpp ! src/os_cpu/linux_sparc/vm/vmStructs_linux_sparc.hpp ! src/os_cpu/linux_x86/vm/vmStructs_linux_x86.hpp ! src/os_cpu/linux_zero/vm/vmStructs_linux_zero.hpp ! src/os_cpu/solaris_sparc/vm/vmStructs_solaris_sparc.hpp ! src/os_cpu/solaris_x86/vm/vmStructs_solaris_x86.hpp ! src/os_cpu/windows_x86/vm/vmStructs_windows_x86.hpp ! src/share/vm/runtime/vmStructs.cpp Changeset: c1c8479222cd Author: dholmes Date: 2013-01-10 21:00 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c1c8479222cd 8005921: Memory leaks in vmStructs.cpp Reviewed-by: dholmes, mikael, rasbold Contributed-by: Jeremy Manson ! src/share/vm/runtime/vmStructs.cpp Changeset: e0cf9af8978e Author: zgu Date: 2013-01-11 12:30 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e0cf9af8978e 8005936: PrintNMTStatistics doesn't work for normal JVM exit Summary: Moved NMT shutdown code to JVM exit handler to ensure NMT statistics is printed when PrintNMTStatistics is enabled Reviewed-by: acorn, dholmes, coleenp ! src/share/vm/runtime/java.cpp ! 
src/share/vm/runtime/thread.cpp Changeset: 90a92d5bca17 Author: zgu Date: 2013-01-11 09:53 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/90a92d5bca17 Merge Changeset: 4a916f2ce331 Author: jwilhelm Date: 2013-01-14 15:17 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4a916f2ce331 8003985: Support @Contended Annotation - JEP 142 Summary: HotSpot changes to support @Contended annotation. Reviewed-by: coleenp, kvn, jrose Contributed-by: Aleksey Shipilev ! agent/src/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java ! src/cpu/sparc/vm/vm_version_sparc.cpp ! src/cpu/x86/vm/vm_version_x86.cpp ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/classFileParser.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/oops/fieldInfo.hpp ! src/share/vm/oops/fieldStreams.hpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/vmStructs.cpp Changeset: f9eb431c3efe Author: coleenp Date: 2013-01-14 11:01 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f9eb431c3efe 8006005: Fix constant pool index validation and alignment trap for method parameter reflection Summary: This patch addresses an alignment trap due to the storage format of method parameters data in constMethod. It also adds code to validate constant pool indexes for method parameters data. Reviewed-by: jrose, dholmes Contributed-by: eric.mccorkle at oracle.com ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/oops/constMethod.hpp ! src/share/vm/prims/jvm.cpp ! src/share/vm/runtime/reflection.cpp Changeset: 5b6a231e5a86 Author: coleenp Date: 2013-01-14 08:37 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5b6a231e5a86 Merge ! src/share/vm/classfile/classFileParser.cpp Changeset: fe1472c87a27 Author: mikael Date: 2013-01-14 11:00 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/fe1472c87a27 8005592: ClassLoaderDataGraph::_unloading incorrectly defined as nonstatic in vmStructs Summary: Added assertion to catch problem earlier and removed the unused field Reviewed-by: dholmes, acorn ! src/share/vm/runtime/vmStructs.cpp Changeset: c793367610c1 Author: coleenp Date: 2013-01-15 17:05 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c793367610c1 8005467: CDS size information is incorrect and unfriendly Summary: Changed words to bytes, and added usage percentage information Reviewed-by: coleenp, twisti Contributed-by: ioi.lam at oracle.com ! src/share/vm/memory/metaspaceShared.cpp Changeset: 92d4b5d8dde4 Author: acorn Date: 2013-01-16 18:23 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/92d4b5d8dde4 Merge ! src/cpu/x86/vm/vm_version_x86.cpp ! src/share/vm/runtime/globals.hpp Changeset: 337e1dd9d902 Author: jiangli Date: 2013-01-11 16:55 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/337e1dd9d902 8005895: Inefficient InstanceKlass field packing wasts memory. Summary: Pack _misc_has_default_methods into the _misc_flags, move _idnum_allocated_count. Reviewed-by: coleenp, shade ! src/share/vm/oops/instanceKlass.hpp Changeset: 94fa3c4e7643 Author: vladidan Date: 2013-01-14 13:44 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/94fa3c4e7643 8005639: Move InlineSynchronizedMethods flag from develop to product Summary: Move InlineSynchronizedMethods flag from develop to product Reviewed-by: kvn, vladidan Contributed-by: Alexander Harlap ! 
src/share/vm/c1/c1_globals.hpp Changeset: 9deda4d8e126 Author: vladidan Date: 2013-01-14 13:52 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9deda4d8e126 8005204: Code Cache Reduction: command line options implementation Summary: Adding more detailed output on CodeCache usage Reviewed-by: kvn, vladidan Contributed-by: Alexander Harlap ! src/share/vm/code/codeCache.cpp ! src/share/vm/code/codeCache.hpp ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/java.cpp ! src/share/vm/utilities/vmError.cpp Changeset: 212c5b9c38e7 Author: dlong Date: 2013-01-17 01:27 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/212c5b9c38e7 Merge ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/java.cpp Changeset: a3f92e6c0274 Author: twisti Date: 2013-01-11 14:07 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a3f92e6c0274 8006031: LibraryCallKit::inline_array_copyOf disabled unintentionally with 7172640 Reviewed-by: kvn ! src/share/vm/opto/library_call.cpp Changeset: f9bda35f4226 Author: twisti Date: 2013-01-11 16:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f9bda35f4226 8005816: Shark: fix volatile float field access Reviewed-by: twisti Contributed-by: Roman Kennke ! src/share/vm/shark/sharkBlock.cpp Changeset: c566b81b3323 Author: twisti Date: 2013-01-11 16:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c566b81b3323 8005817: Shark: implement deoptimization support Reviewed-by: twisti Contributed-by: Roman Kennke ! src/cpu/zero/vm/frame_zero.cpp ! src/cpu/zero/vm/frame_zero.inline.hpp ! src/cpu/zero/vm/sharkFrame_zero.hpp ! src/share/vm/shark/sharkInvariants.hpp ! src/share/vm/shark/sharkTopLevelBlock.cpp Changeset: c095a7f289aa Author: twisti Date: 2013-01-11 16:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c095a7f289aa 8005818: Shark: fix OSR for non-empty incoming stack Reviewed-by: twisti Contributed-by: Roman Kennke ! src/share/vm/shark/sharkCompiler.cpp ! src/share/vm/shark/sharkFunction.cpp ! src/share/vm/shark/sharkInvariants.hpp Changeset: 606eada1bf86 Author: twisti Date: 2013-01-11 16:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/606eada1bf86 8005820: Shark: enable JSR292 support Reviewed-by: twisti Contributed-by: Roman Kennke ! src/share/vm/compiler/abstractCompiler.hpp ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/shark/sharkBlock.cpp ! src/share/vm/shark/sharkCompiler.hpp ! src/share/vm/shark/sharkConstant.cpp ! src/share/vm/shark/sharkInliner.cpp ! src/share/vm/shark/sharkTopLevelBlock.cpp Changeset: 6d1f5516534e Author: twisti Date: 2013-01-11 20:01 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6d1f5516534e 8006127: remove printing code added with 8006031 Reviewed-by: kvn ! src/share/vm/opto/library_call.cpp Changeset: d92fa52a5d03 Author: vlivanov Date: 2013-01-14 08:22 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d92fa52a5d03 8006095: C1: SIGSEGV w/ -XX:+LogCompilation Summary: avoid printing inlining decision when compilation fails Reviewed-by: kvn, roland ! src/share/vm/c1/c1_GraphBuilder.cpp Changeset: f1de9dbc914e Author: twisti Date: 2013-01-15 12:06 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1de9dbc914e 8006109: test/java/util/AbstractSequentialList/AddAll.java fails: assert(rtype == ctype) failed: mismatched return types Reviewed-by: kvn ! src/share/vm/ci/ciType.cpp ! 
src/share/vm/ci/ciType.hpp ! src/share/vm/opto/doCall.cpp Changeset: 5b8548391bf3 Author: kvn Date: 2013-01-15 14:45 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5b8548391bf3 8005821: C2: -XX:+PrintIntrinsics is broken Summary: Check all print inlining flags when processing inlining list. Reviewed-by: kvn, twisti Contributed-by: david.r.chase at oracle.com ! src/share/vm/opto/compile.cpp Changeset: bf623b2d5508 Author: kvn Date: 2013-01-16 14:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/bf623b2d5508 8006204: please JTREGify test/compiler/7190310/Test7190310.java Summary: Add proper jtreg annotations in the preceding comment, including an explicit timeout. Reviewed-by: kvn, twisti Contributed-by: david.r.chase at oracle.com ! test/compiler/7190310/Test7190310.java Changeset: eab4f9ed602c Author: kvn Date: 2013-01-17 18:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/eab4f9ed602c Merge ! src/share/vm/compiler/compileBroker.cpp Changeset: f422634e5828 Author: brutisso Date: 2013-01-18 11:03 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f422634e5828 Merge ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/prims/jvm.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/thread.cpp ! src/share/vm/runtime/vmStructs.cpp Changeset: 70c89bd6b895 Author: amurillo Date: 2013-01-18 05:19 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/70c89bd6b895 Merge Changeset: 2b878edabfc0 Author: amurillo Date: 2013-01-18 05:19 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2b878edabfc0 Added tag hs25-b16 for changeset 70c89bd6b895 ! .hgtags Changeset: 46e60405583b Author: amurillo Date: 2013-01-18 05:33 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/46e60405583b 8006511: new hotspot build - hs25-b17 Reviewed-by: jcoomes ! make/hotspot_version From yamauchi at google.com Fri Jan 18 23:29:53 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Fri, 18 Jan 2013 15:29:53 -0800 Subject: Deallocating memory pages Message-ID: http://cr.openjdk.java.net/~hiroshi/webrevs/dhp/webrev.00/ Hi folks, I'd like to see if it makes sense to contribute this patch. If it's enabled, it helps reduce the JVM memory/RAM footprint by deallocating (releasing) the underlying memory pages that correspond to the unused or free portions of the heap (more specifically, it calls madvise(MADV_DONTNEED) for the bodies of free chunks in the old generation without unmapping the heap address space). Though the worst-case memory footprint (that is, when the heap is full) does not change, this helps the JVM bring its RAM usage closer to what it actually is using at the moment (that is, occupied by objects) and Java applications behave more nicely in shared environments in which multiple servers or applications run. In fact, this has been very useful in certain servers and desktop tools that we have at Google and helped save a lot of RAM use. It tries to address the issue where a Java server or app runs for a while and almost never releases its RAM even when it is mostly idle. Of course, a higher degree of heap fragmentation deteriorates the utility of this because a free chunk smaller than a page cannot be deallocated, but it has the advantage of being able to work without shrinking the heap or the generation. 
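To make the mechanism concrete, the core of the idea boils down to something like the following standalone sketch. It is only an illustration of madvise(MADV_DONTNEED) on Linux, not the patch itself, and the helper name is made up:

  #include <stddef.h>
  #include <stdint.h>
  #include <sys/mman.h>
  #include <unistd.h>

  // Release the physical pages backing [start, start + byte_size) while
  // keeping the virtual mapping intact. Only whole pages fully inside the
  // range can be released, which is why free chunks smaller than a page
  // (fragmentation) cannot be given back.
  static int release_pages(void* start, size_t byte_size) {
    const uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);
    uintptr_t lo = ((uintptr_t) start + page - 1) & ~(page - 1);   // round up
    uintptr_t hi = ((uintptr_t) start + byte_size) & ~(page - 1);  // round down
    if (hi <= lo) {
      return 0;  // nothing page-aligned to release
    }
    // The address space stays mapped; the next write to a released page
    // simply faults in a fresh zeroed page on demand.
    return madvise((void*) lo, (size_t) (hi - lo), MADV_DONTNEED);
  }

The collector-side hook just walks the free chunks in the old generation after a sweep and calls something like this on each chunk body large enough to contain at least one whole page.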
Despite the fact that this can slow down things due to the on-demand page reallocation that happens when a deallocated page is first touched, the performance hit seems not bad. In my measurements, I see a ~1-3% overall overhead in an internal server test and a ~0-4% overall overhead in the DaCapo benchmarks. It supports the CMS collector and Linux only in the current form though it's probably possible to extend this to other collectors and platforms in the future. I thought this could be useful in wider audience. Chuck Rasbold has kindly reviewed this change. Thanks, Hiroshi -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Sat Jan 19 02:20:15 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 18 Jan 2013 21:20:15 -0500 Subject: Deallocating memory pages In-Reply-To: References: Message-ID: Hi Hiroshi, I'm not an official reviewer, but I wonder whether deallocate_pages_raw in os_linux.cpp should handle an EAGAIN return value from madvise. Specifically, should the code loop on EAGAIN and retry the syscall? Maybe have some safety value there to stop looping if too many tries (or too much time) are failing. Thanks Sent from my phone On Jan 18, 2013 6:30 PM, "Hiroshi Yamauchi" wrote: > http://cr.openjdk.java.net/~hiroshi/webrevs/dhp/webrev.00/ > > Hi folks, > > I'd like to see if it makes sense to contribute this patch. > > If it's enabled, it helps reduce the JVM memory/RAM footprint by > deallocating (releasing) the underlying memory pages that correspond to the > unused or free portions of the heap (more specifically, it calls > madvise(MADV_DONTNEED) for the bodies of free chunks in the old generation > without unmapping the heap address space). > > Though the worst-case memory footprint (that is, when the heap is full) > does not change, this helps the JVM bring its RAM usage closer to what it > actually is using at the moment (that is, occupied by objects) and Java > applications behave more nicely in shared environments in which multiple > servers or applications run. > > In fact, this has been very useful in certain servers and desktop tools > that we have at Google and helped save a lot of RAM use. It tries to > address the issue where a Java server or app runs for a while and almost > never releases its RAM even when it is mostly idle. > > Of course, a higher degree of heap fragmentation deteriorates the utility > of this because a free chunk smaller than a page cannot be deallocated, but > it has the advantage of being able to work without shrinking the heap or > the generation. > > Despite the fact that this can slow down things due to the on-demand page > reallocation that happens when a deallocated page is first touched, the > performance hit seems not bad. In my measurements, I see a ~1-3% overall > overhead in an internal server test and a ~0-4% overall overhead in the > DaCapo benchmarks. > > It supports the CMS collector and Linux only in the current form though > it's probably possible to extend this to other collectors and platforms in > the future. > > I thought this could be useful in wider audience. > > Chuck Rasbold has kindly reviewed this change. > > Thanks, > Hiroshi > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jesper.wilhelmsson at oracle.com Sat Jan 19 17:38:18 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Sat, 19 Jan 2013 18:38:18 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> Message-ID: <50FADA0A.7040209@oracle.com> Hi, Some further code inspection showed that it's possible to fix this bug in TwoGenerationCollectorPolicy::initialize_flags() and keep the fix local. Thanks to Erik Helin who suggested this. A new webrev is available here: http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3/ /Jesper On 16/1/13 2:23 PM, Vitaly Davidovich wrote: > > Looks good Jesper. Maybe just a comment there that NewRatio hasn't > been checked yet but if it's 0, VM will exit later on anyway - > basically, what you said in the email :). > > Cheers > > Sent from my phone > > On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" > > > wrote: > > > On 2013-01-16 09:23, Bengt Rutisson wrote: >> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>> >>>> Hi Jesper, >>>> >>>> Is NewRatio guaranteed to be non-zero when used inside >>>> recommended_heap_size? >>>> >>> As far as I can see, yes. It defaults to two and is never set to >>> zero. >> >> No, there is no such guarantee this early in the argument >> parsing. The check to verify that NewRatio > 0 is done in >> GenCollectorPolicy::initialize_flags(), which is called later in >> the start up sequence than your call to >> CollectorPolicy::recommended_heap_size() and it is never called >> for G1. >> >> Running with your patch crashes: >> >> java -XX:OldSize=128m -XX:NewRatio=0 -version >> Floating point exception: 8 > > Oh, yes, you're right. Sorry! > > Good catch Vitaly! > > New webrev: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 > > > I'm just skipping the calculation if NewRatio is zero. The VM will > abort anyway as soon as it realizes that this is the case. > /Jesper > > >> Bengt >>> /Jesper >>> >>>> Thanks >>>> >>>> Sent from my phone >>>> >>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>> >>> > wrote: >>>> >>>> Jon, >>>> >>>> Thank you for looking at this! I share your concerns and I >>>> have moved the knowledge about policies to CollectorPolicy. >>>> set_heap_size() now simply asks the collector policy if it >>>> has any recommendations regarding the heap size. >>>> >>>> Ideally, since the code knows about young and old >>>> generations, I guess the new function >>>> "recommended_heap_size()" should be placed in >>>> GenCollectorPolicy, but then the code would have to be >>>> duplicated for G1 as well. However, CollectorPolicy already >>>> know about OldSize and NewSize so I think it is OK to put >>>> it there. >>>> >>>> Eventually I think that we should reduce the abstraction >>>> level in the generation policies and merge CollectorPolicy, >>>> GenCollectorPolicy and maybe even >>>> TwoGenerationCollectorPolicy and if possible >>>> G1CollectorPolicy, so I don't worry too much about having >>>> knowledge about the two generations in CollectorPolicy. 
>>>> >>>> >>>> A new webrev is available here: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>> >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >>>> >>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>> >>>> Jesper, >>>> >>>> I'm a bit concerned that set_heap_size() now knows >>>> about how >>>> the CollectorPolicy uses OldSize and NewSize. In the >>>> distant >>>> past set_heap_size() did not know what kind of >>>> collector was >>>> going to be used and probably avoided looking at those >>>> parameters for that reason. Today we know that a >>>> generational >>>> collector is to follow but maybe you could hide that >>>> knowledge >>>> in CollectorPolicy somewhere and have set_heap_size() >>>> call into >>>> CollectorPolicy to use that information? >>>> >>>> Jon >>>> >>>> >>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>> >>>> Hi, >>>> >>>> I would like a couple of reviews of a small fix for >>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>> >>>> Summary: >>>> When starting HotSpot with an OldSize larger than >>>> the default heap size one will run into a couple of >>>> problems. Basically what happens is that the >>>> OldSize is ignored because it is incompatible with >>>> the heap size. A debug build will assert since a >>>> calculation on the way results in a negative >>>> number, but since it is a size_t an if(x<0) won't >>>> trigger and the assert catches it later on as >>>> incompatible flags. >>>> >>>> Changes: >>>> I have made two changes to fix this. >>>> >>>> The first is to change the calculation in >>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so >>>> that it won't result in a negative number in the if >>>> statement. This way we will catch the case where >>>> the OldSize is larger than the heap size and adjust >>>> the OldSize instead of the young size. There are >>>> also some cosmetic changes here. For instance the >>>> argument min_gen0_size is actually used for the old >>>> generation size which was a bit confusing >>>> initially. I renamed it to min_gen1_size (which it >>>> already was called in the header file). >>>> >>>> The second change is in Arguments::set_heap_size. >>>> My reasoning here is that if the user sets the >>>> OldSize we should probably adjust the heap size to >>>> accommodate that OldSize instead of complaining >>>> that the heap is too small. We determine the heap >>>> size first and the generation sizes later on while >>>> initializing the VM. To be able to fit the >>>> generations if the user specifies sizes on the >>>> command line we need to look at the generation size >>>> flags a little already when setting up the heap size. >>>> >>>> Thanks, >>>> /Jesper >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Mon Jan 21 07:07:25 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Mon, 21 Jan 2013 07:07:25 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8006242: G1: WorkerDataArray::verify() too strict for double calculations Message-ID: <20130121070729.A41B547415@hg.openjdk.java.net> Changeset: 7df93f7c14a5 Author: brutisso Date: 2013-01-16 12:46 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7df93f7c14a5 8006242: G1: WorkerDataArray::verify() too strict for double calculations Summary: Also reviewed by vitalyd at gmail.com. Reviewed-by: johnc, mgerdin ! src/share/vm/gc_implementation/g1/g1GCPhaseTimes.cpp ! 
src/share/vm/gc_implementation/g1/g1GCPhaseTimes.hpp From erik.helin at oracle.com Mon Jan 21 10:12:29 2013 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 21 Jan 2013 11:12:29 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50FADA0A.7040209@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> <50FADA0A.7040209@oracle.com> Message-ID: <50FD148D.6080000@oracle.com> Jesper, On 01/19/2013 06:38 PM, Jesper Wilhelmsson wrote: > Some further code inspection showed that it's possible to fix this bug in > TwoGenerationCollectorPolicy::initialize_flags() and keep the fix local. I think this is much better, nice work! On 01/19/2013 06:38 PM, Jesper Wilhelmsson wrote: > A new webrev is available here: > > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3/ A couple of comments: - Instead of having the comment: > "NewRatio is checked earlier and can not be zero here" Could you add an assert that says: > assert(NewRatio != 0, "Should have been checked earlier"); This way you get a verifiable comment :) - I think its better if the calculation > (OldSize / NewRatio) * (NewRatio + 1) is saved in a variable, perhaps: > uintx heap_size_by_scaling_old_size = > (OldSize / NewRatio) * (NewRatio + 1); > > MaxHeapSize = heap_size_by_scaling_old_size; > InitialHeapSize = heap_size_by_scaling_old_size; - This is not related to your change, but I would prefer if some of the logic in TwoGenerationCollectorPolicy::adjust_gen0_size got some more descriptive names, perhaps: > bool is_heap_too_small = > (*gen1_size_ptr + *gen0_size_ptr) > heap_size; > > if (is_heap_too_small) { > bool has_heap_space_left_for_gen0 = > heap_size >= (min_gen1_size + min_alignment()); > bool is_gen0_too_large = > heap_size < (*gen0_size_ptr + min_gen1_size); > > if (is_gen0_too_large && has_heap_space_left_for_gen0) { > .... > } - As a final possible cleanup of TwoGenerationCollectorPolicy::adjust_gen0_size, I think the variable: > bool result = false; can be removed and instead an early exit can be used in the if statement that sets "result" to true and the method can return false in the end. What do you think of these suggestions? Thanks, Erik > /Jesper > > > On 16/1/13 2:23 PM, Vitaly Davidovich wrote: >> >> Looks good Jesper. Maybe just a comment there that NewRatio hasn't >> been checked yet but if it's 0, VM will exit later on anyway - >> basically, what you said in the email :). >> >> Cheers >> >> Sent from my phone >> >> On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" >> > >> wrote: >> >> >> On 2013-01-16 09:23, Bengt Rutisson wrote: >>> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>>> >>>>> Hi Jesper, >>>>> >>>>> Is NewRatio guaranteed to be non-zero when used inside >>>>> recommended_heap_size? >>>>> >>>> As far as I can see, yes. It defaults to two and is never set to >>>> zero. >>> >>> No, there is no such guarantee this early in the argument >>> parsing. The check to verify that NewRatio > 0 is done in >>> GenCollectorPolicy::initialize_flags(), which is called later in >>> the start up sequence than your call to >>> CollectorPolicy::recommended_heap_size() and it is never called >>> for G1. >>> >>> Running with your patch crashes: >>> >>> java -XX:OldSize=128m -XX:NewRatio=0 -version >>> Floating point exception: 8 >> >> Oh, yes, you're right. Sorry! 
>> >> Good catch Vitaly! >> >> New webrev: >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 >> >> >> I'm just skipping the calculation if NewRatio is zero. The VM will >> abort anyway as soon as it realizes that this is the case. >> /Jesper >> >> >>> Bengt >>>> /Jesper >>>> >>>>> Thanks >>>>> >>>>> Sent from my phone >>>>> >>>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>>> >>>> > wrote: >>>>> >>>>> Jon, >>>>> >>>>> Thank you for looking at this! I share your concerns and I >>>>> have moved the knowledge about policies to CollectorPolicy. >>>>> set_heap_size() now simply asks the collector policy if it >>>>> has any recommendations regarding the heap size. >>>>> >>>>> Ideally, since the code knows about young and old >>>>> generations, I guess the new function >>>>> "recommended_heap_size()" should be placed in >>>>> GenCollectorPolicy, but then the code would have to be >>>>> duplicated for G1 as well. However, CollectorPolicy already >>>>> know about OldSize and NewSize so I think it is OK to put >>>>> it there. >>>>> >>>>> Eventually I think that we should reduce the abstraction >>>>> level in the generation policies and merge CollectorPolicy, >>>>> GenCollectorPolicy and maybe even >>>>> TwoGenerationCollectorPolicy and if possible >>>>> G1CollectorPolicy, so I don't worry too much about having >>>>> knowledge about the two generations in CollectorPolicy. >>>>> >>>>> >>>>> A new webrev is available here: >>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>>> >>>>> >>>>> Thanks, >>>>> /Jesper >>>>> >>>>> >>>>> >>>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>>> >>>>> Jesper, >>>>> >>>>> I'm a bit concerned that set_heap_size() now knows >>>>> about how >>>>> the CollectorPolicy uses OldSize and NewSize. In the >>>>> distant >>>>> past set_heap_size() did not know what kind of >>>>> collector was >>>>> going to be used and probably avoided looking at those >>>>> parameters for that reason. Today we know that a >>>>> generational >>>>> collector is to follow but maybe you could hide that >>>>> knowledge >>>>> in CollectorPolicy somewhere and have set_heap_size() >>>>> call into >>>>> CollectorPolicy to use that information? >>>>> >>>>> Jon >>>>> >>>>> >>>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>>> >>>>> Hi, >>>>> >>>>> I would like a couple of reviews of a small fix for >>>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit >>>>> VMs >>>>> >>>>> Webrev: >>>>> >>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>>> >>>>> >>>>> Summary: >>>>> When starting HotSpot with an OldSize larger than >>>>> the default heap size one will run into a couple of >>>>> problems. Basically what happens is that the >>>>> OldSize is ignored because it is incompatible with >>>>> the heap size. A debug build will assert since a >>>>> calculation on the way results in a negative >>>>> number, but since it is a size_t an if(x<0) won't >>>>> trigger and the assert catches it later on as >>>>> incompatible flags. >>>>> >>>>> Changes: >>>>> I have made two changes to fix this. >>>>> >>>>> The first is to change the calculation in >>>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so >>>>> that it won't result in a negative number in the if >>>>> statement. This way we will catch the case where >>>>> the OldSize is larger than the heap size and adjust >>>>> the OldSize instead of the young size. There are >>>>> also some cosmetic changes here. 
For instance the >>>>> argument min_gen0_size is actually used for the old >>>>> generation size which was a bit confusing >>>>> initially. I renamed it to min_gen1_size (which it >>>>> already was called in the header file). >>>>> >>>>> The second change is in Arguments::set_heap_size. >>>>> My reasoning here is that if the user sets the >>>>> OldSize we should probably adjust the heap size to >>>>> accommodate that OldSize instead of complaining >>>>> that the heap is too small. We determine the heap >>>>> size first and the generation sizes later on while >>>>> initializing the VM. To be able to fit the >>>>> generations if the user specifies sizes on the >>>>> command line we need to look at the generation size >>>>> flags a little already when setting up the heap size. >>>>> >>>>> Thanks, >>>>> /Jesper >>>>> >>>>> >>>> >>> >> > > From bengt.rutisson at oracle.com Tue Jan 22 16:21:40 2013 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Tue, 22 Jan 2013 16:21:40 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8004147: test/Makefile jtreg_tests target does not work with cygwin Message-ID: <20130122162144.13E3747460@hg.openjdk.java.net> Changeset: bf8c2b2c8cfa Author: mgerdin Date: 2013-01-22 13:42 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/bf8c2b2c8cfa 8004147: test/Makefile jtreg_tests target does not work with cygwin Reviewed-by: ctornqvi, brutisso ! test/Makefile From filipp.zhinkin at oracle.com Mon Jan 21 11:14:43 2013 From: filipp.zhinkin at oracle.com (Filipp Zhinkin) Date: Mon, 21 Jan 2013 15:14:43 +0400 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 Message-ID: <50FD2323.9050702@oracle.com> Hi all, Would someone review the following regression test please? Test verifies that VM will not crash with G1 GC and ParallelGCThreads == 0. To ensure that it is true test allocates array until OOME. Max heap size is limited by 32M for this test to ensure that GC will occur. Since crash could occur only during PLAB resizing after GC, ResizePLAB option is explicitly turned on. http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ Thanks, Filipp. From jesper.wilhelmsson at oracle.com Wed Jan 23 13:43:45 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 23 Jan 2013 14:43:45 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50FD148D.6080000@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> <50FADA0A.7040209@oracle.com> <50FD148D.6080000@oracle.com> Message-ID: <50FFE911.40901@oracle.com> Erik, Thanks for looking at it again. An updated webrev can be found here: http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.4/ See comments inline. On 21/1/13 11:12 AM, Erik Helin wrote: > Jesper, > > On 01/19/2013 06:38 PM, Jesper Wilhelmsson wrote: >> Some further code inspection showed that it's possible to fix this >> bug in >> TwoGenerationCollectorPolicy::initialize_flags() and keep the fix local. > > I think this is much better, nice work! Thanks for the suggestions! 
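(A self-contained sketch of the two pitfalls this thread keeps returning to: the unsigned comparison in adjust_gen0_sizes, and the NewRatio division used when scaling the heap from OldSize. The values and the main() wrapper below are made up for illustration and are not the webrev code; the real changes live in arguments.cpp and collectorPolicy.cpp in the webrevs above.)

#include <cassert>
#include <cstdint>
#include <cstdio>

int main() {
  typedef uint64_t uintx;                // stand-in for HotSpot's uintx
  uintx OldSize   = 256 * 1024 * 1024;   // hypothetical -XX:OldSize
  uintx heap_size = 128 * 1024 * 1024;   // hypothetical (smaller) default heap
  uintx NewRatio  = 2;                   // old gen : young gen ratio

  // Pitfall 1: with unsigned types, "heap_size - OldSize < 0" can never be
  // true; the subtraction wraps around instead, so compare before subtracting.
  if (heap_size < OldSize) {
    std::printf("OldSize does not fit; adjust OldSize (or grow the heap), not the young gen\n");
  }

  // Pitfall 2: deriving a heap size from OldSize divides by NewRatio, so the
  // flag must already be known to be non-zero at this point.
  assert(NewRatio != 0 && "Should have been checked earlier");
  uintx heap_size_by_scaling_old_size = (OldSize / NewRatio) * (NewRatio + 1);
  std::printf("heap size scaled from OldSize: %llu bytes\n",
              (unsigned long long)heap_size_by_scaling_old_size);
  return 0;
}

(With the default NewRatio of 2 the sketch scales a 256M OldSize up to a 384M heap, which matches the intent described earlier in the thread: grow the heap to accommodate the user-specified OldSize rather than rejecting the flags as incompatible.)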
> > On 01/19/2013 06:38 PM, Jesper Wilhelmsson wrote: >> A new webrev is available here: >> >> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3/ > > A couple of comments: > > - Instead of having the comment: > > "NewRatio is checked earlier and can not be zero here" > Could you add an assert that says: > > assert(NewRatio != 0, "Should have been checked earlier"); > This way you get a verifiable comment :) Agreed and fixed. > - I think its better if the calculation > > (OldSize / NewRatio) * (NewRatio + 1) > is saved in a variable, perhaps: > > uintx heap_size_by_scaling_old_size = > > (OldSize / NewRatio) * (NewRatio + 1); > > > > MaxHeapSize = heap_size_by_scaling_old_size; > > InitialHeapSize = heap_size_by_scaling_old_size; Agreed and fixed. > - This is not related to your change, but I would prefer if some of > the logic in TwoGenerationCollectorPolicy::adjust_gen0_size got some > more descriptive names, perhaps: > > bool is_heap_too_small = > > (*gen1_size_ptr + *gen0_size_ptr) > heap_size; > > > > if (is_heap_too_small) { > > bool has_heap_space_left_for_gen0 = > > heap_size >= (min_gen1_size + min_alignment()); > > bool is_gen0_too_large = > > heap_size < (*gen0_size_ptr + min_gen1_size); > > > > if (is_gen0_too_large && has_heap_space_left_for_gen0) { > > .... > > } I guess this is a matter of taste. I would actually prefer to keep it as is. I do appreciate naming parts of complex expressions into local variables to increase readability, but I don't find this expression complex enough to motivate it. > - As a final possible cleanup of > TwoGenerationCollectorPolicy::adjust_gen0_size, I think the variable: > > bool result = false; > can be removed and instead an early exit can be used in the if > statement that sets "result" to true and the method can return false > in the end. Again a matter of taste. At some point I was told that it was discouraged to use early exit. I don't remember where I heard/read it, but until we have some guidelines to dictate early exit I prefer using a result variable. Random return statements are easily overlooked when browsing code. /Jesper > > What do you think of these suggestions? > > Thanks, > Erik > >> /Jesper >> >> >> On 16/1/13 2:23 PM, Vitaly Davidovich wrote: >>> >>> Looks good Jesper. Maybe just a comment there that NewRatio hasn't >>> been checked yet but if it's 0, VM will exit later on anyway - >>> basically, what you said in the email :). >>> >>> Cheers >>> >>> Sent from my phone >>> >>> On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" >>> > >>> wrote: >>> >>> >>> On 2013-01-16 09:23, Bengt Rutisson wrote: >>>> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>>>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>>>> >>>>>> Hi Jesper, >>>>>> >>>>>> Is NewRatio guaranteed to be non-zero when used inside >>>>>> recommended_heap_size? >>>>>> >>>>> As far as I can see, yes. It defaults to two and is never set to >>>>> zero. >>>> >>>> No, there is no such guarantee this early in the argument >>>> parsing. The check to verify that NewRatio > 0 is done in >>>> GenCollectorPolicy::initialize_flags(), which is called later in >>>> the start up sequence than your call to >>>> CollectorPolicy::recommended_heap_size() and it is never called >>>> for G1. >>>> >>>> Running with your patch crashes: >>>> >>>> java -XX:OldSize=128m -XX:NewRatio=0 -version >>>> Floating point exception: 8 >>> >>> Oh, yes, you're right. Sorry! >>> >>> Good catch Vitaly! 
>>> >>> New webrev: >>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 >>> >>> >>> I'm just skipping the calculation if NewRatio is zero. The VM will >>> abort anyway as soon as it realizes that this is the case. >>> /Jesper >>> >>> >>>> Bengt >>>>> /Jesper >>>>> >>>>>> Thanks >>>>>> >>>>>> Sent from my phone >>>>>> >>>>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>>>> >>>>> > wrote: >>>>>> >>>>>> Jon, >>>>>> >>>>>> Thank you for looking at this! I share your concerns and I >>>>>> have moved the knowledge about policies to CollectorPolicy. >>>>>> set_heap_size() now simply asks the collector policy if it >>>>>> has any recommendations regarding the heap size. >>>>>> >>>>>> Ideally, since the code knows about young and old >>>>>> generations, I guess the new function >>>>>> "recommended_heap_size()" should be placed in >>>>>> GenCollectorPolicy, but then the code would have to be >>>>>> duplicated for G1 as well. However, CollectorPolicy already >>>>>> know about OldSize and NewSize so I think it is OK to put >>>>>> it there. >>>>>> >>>>>> Eventually I think that we should reduce the abstraction >>>>>> level in the generation policies and merge CollectorPolicy, >>>>>> GenCollectorPolicy and maybe even >>>>>> TwoGenerationCollectorPolicy and if possible >>>>>> G1CollectorPolicy, so I don't worry too much about having >>>>>> knowledge about the two generations in CollectorPolicy. >>>>>> >>>>>> >>>>>> A new webrev is available here: >>>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>>>> >>>>>> >>>>>> Thanks, >>>>>> /Jesper >>>>>> >>>>>> >>>>>> >>>>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>>>> >>>>>> Jesper, >>>>>> >>>>>> I'm a bit concerned that set_heap_size() now knows >>>>>> about how >>>>>> the CollectorPolicy uses OldSize and NewSize. In the >>>>>> distant >>>>>> past set_heap_size() did not know what kind of >>>>>> collector was >>>>>> going to be used and probably avoided looking at those >>>>>> parameters for that reason. Today we know that a >>>>>> generational >>>>>> collector is to follow but maybe you could hide that >>>>>> knowledge >>>>>> in CollectorPolicy somewhere and have set_heap_size() >>>>>> call into >>>>>> CollectorPolicy to use that information? >>>>>> >>>>>> Jon >>>>>> >>>>>> >>>>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I would like a couple of reviews of a small fix for >>>>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit >>>>>> VMs >>>>>> >>>>>> Webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>>>> >>>>>> >>>>>> Summary: >>>>>> When starting HotSpot with an OldSize larger than >>>>>> the default heap size one will run into a couple of >>>>>> problems. Basically what happens is that the >>>>>> OldSize is ignored because it is incompatible with >>>>>> the heap size. A debug build will assert since a >>>>>> calculation on the way results in a negative >>>>>> number, but since it is a size_t an if(x<0) won't >>>>>> trigger and the assert catches it later on as >>>>>> incompatible flags. >>>>>> >>>>>> Changes: >>>>>> I have made two changes to fix this. >>>>>> >>>>>> The first is to change the calculation in >>>>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so >>>>>> that it won't result in a negative number in the if >>>>>> statement. This way we will catch the case where >>>>>> the OldSize is larger than the heap size and adjust >>>>>> the OldSize instead of the young size. There are >>>>>> also some cosmetic changes here. 
For instance the >>>>>> argument min_gen0_size is actually used for the old >>>>>> generation size which was a bit confusing >>>>>> initially. I renamed it to min_gen1_size (which it >>>>>> already was called in the header file). >>>>>> >>>>>> The second change is in Arguments::set_heap_size. >>>>>> My reasoning here is that if the user sets the >>>>>> OldSize we should probably adjust the heap size to >>>>>> accommodate that OldSize instead of complaining >>>>>> that the heap is too small. We determine the heap >>>>>> size first and the generation sizes later on while >>>>>> initializing the VM. To be able to fit the >>>>>> generations if the user specifies sizes on the >>>>>> command line we need to look at the generation size >>>>>> flags a little already when setting up the heap >>>>>> size. >>>>>> >>>>>> Thanks, >>>>>> /Jesper >>>>>> >>>>>> >>>>> >>>> >>> >> >> > From erik.helin at oracle.com Wed Jan 23 16:19:56 2013 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 23 Jan 2013 17:19:56 +0100 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50FFE911.40901@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> <50FADA0A.7040209@oracle.com> <50FD148D.6080000@oracle.com> <50FFE911.40901@oracle.com> Message-ID: <51000DAC.6000809@oracle.com> Jesper, thanks for updating the code, see my comments inline. On 01/23/2013 02:43 PM, Jesper Wilhelmsson wrote: > Erik, > > Thanks for looking at it again. An updated webrev can be found here: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.4/ > > See comments inline. > > On 21/1/13 11:12 AM, Erik Helin wrote: >> A couple of comments: >> >> - Instead of having the comment: >> > "NewRatio is checked earlier and can not be zero here" >> Could you add an assert that says: >> > assert(NewRatio != 0, "Should have been checked earlier"); >> This way you get a verifiable comment :) > Agreed and fixed. Thanks, looks good! On 01/23/2013 02:43 PM, Jesper Wilhelmsson wrote: > On 21/1/13 11:12 AM, Erik Helin wrote: >> - I think its better if the calculation >> > (OldSize / NewRatio) * (NewRatio + 1) >> is saved in a variable, perhaps: >> > uintx heap_size_by_scaling_old_size = >> > (OldSize / NewRatio) * (NewRatio + 1); >> > >> > MaxHeapSize = heap_size_by_scaling_old_size; >> > InitialHeapSize = heap_size_by_scaling_old_size; > Agreed and fixed. Looks good as well! On 01/23/2013 02:43 PM, Jesper Wilhelmsson wrote: > On 21/1/13 11:12 AM, Erik Helin wrote: >> - This is not related to your change, but I would prefer if some of >> the logic in TwoGenerationCollectorPolicy::adjust_gen0_size got some >> more descriptive names, perhaps: >> > bool is_heap_too_small = >> > (*gen1_size_ptr + *gen0_size_ptr) > heap_size; >> > >> > if (is_heap_too_small) { >> > bool has_heap_space_left_for_gen0 = >> > heap_size >= (min_gen1_size + min_alignment()); >> > bool is_gen0_too_large = >> > heap_size < (*gen0_size_ptr + min_gen1_size); >> > >> > if (is_gen0_too_large && has_heap_space_left_for_gen0) { >> > .... >> > } > I guess this is a matter of taste. I would actually prefer to keep it as > is. I do appreciate naming parts of complex expressions into local > variables to increase readability, but I don't find this expression > complex enough to motivate it. Sure, this is a matter of taste, each to his own. 
I prefer the variables, but if a second reviewer also prefers to not introduce the variables, then I'm fine with that as well. On 01/23/2013 02:43 PM, Jesper Wilhelmsson wrote: > On 21/1/13 11:12 AM, Erik Helin wrote: >> - As a final possible cleanup of >> TwoGenerationCollectorPolicy::adjust_gen0_size, I think the variable: >> > bool result = false; >> can be removed and instead an early exit can be used in the if >> statement that sets "result" to true and the method can return false >> in the end. > Again a matter of taste. At some point I was told that it was > discouraged to use early exit. I don't remember where I heard/read it, > but until we have some guidelines to dictate early exit I prefer using a > result variable. Random return statements are easily overlooked when > browsing code. I agree in the common case that early exits can cause problems, but when a function is as small as this one, then I think it increases readability. Again, let a second reviewer have an opinion about this. If the decision is to keep the variable, then I'm fine with that :) Thanks, Erik > /Jesper > >> >> What do you think of these suggestions? >> >> Thanks, >> Erik >> >>> /Jesper >>> >>> >>> On 16/1/13 2:23 PM, Vitaly Davidovich wrote: >>>> >>>> Looks good Jesper. Maybe just a comment there that NewRatio hasn't >>>> been checked yet but if it's 0, VM will exit later on anyway - >>>> basically, what you said in the email :). >>>> >>>> Cheers >>>> >>>> Sent from my phone >>>> >>>> On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" >>>> > >>>> wrote: >>>> >>>> >>>> On 2013-01-16 09:23, Bengt Rutisson wrote: >>>>> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>>>>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>>>>> >>>>>>> Hi Jesper, >>>>>>> >>>>>>> Is NewRatio guaranteed to be non-zero when used inside >>>>>>> recommended_heap_size? >>>>>>> >>>>>> As far as I can see, yes. It defaults to two and is never set to >>>>>> zero. >>>>> >>>>> No, there is no such guarantee this early in the argument >>>>> parsing. The check to verify that NewRatio > 0 is done in >>>>> GenCollectorPolicy::initialize_flags(), which is called later in >>>>> the start up sequence than your call to >>>>> CollectorPolicy::recommended_heap_size() and it is never called >>>>> for G1. >>>>> >>>>> Running with your patch crashes: >>>>> >>>>> java -XX:OldSize=128m -XX:NewRatio=0 -version >>>>> Floating point exception: 8 >>>> >>>> Oh, yes, you're right. Sorry! >>>> >>>> Good catch Vitaly! >>>> >>>> New webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 >>>> >>>> >>>> I'm just skipping the calculation if NewRatio is zero. The VM will >>>> abort anyway as soon as it realizes that this is the case. >>>> /Jesper >>>> >>>> >>>>> Bengt >>>>>> /Jesper >>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Sent from my phone >>>>>>> >>>>>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>>>>> >>>>>> > wrote: >>>>>>> >>>>>>> Jon, >>>>>>> >>>>>>> Thank you for looking at this! I share your concerns and I >>>>>>> have moved the knowledge about policies to CollectorPolicy. >>>>>>> set_heap_size() now simply asks the collector policy if it >>>>>>> has any recommendations regarding the heap size. >>>>>>> >>>>>>> Ideally, since the code knows about young and old >>>>>>> generations, I guess the new function >>>>>>> "recommended_heap_size()" should be placed in >>>>>>> GenCollectorPolicy, but then the code would have to be >>>>>>> duplicated for G1 as well. 
However, CollectorPolicy already >>>>>>> know about OldSize and NewSize so I think it is OK to put >>>>>>> it there. >>>>>>> >>>>>>> Eventually I think that we should reduce the abstraction >>>>>>> level in the generation policies and merge CollectorPolicy, >>>>>>> GenCollectorPolicy and maybe even >>>>>>> TwoGenerationCollectorPolicy and if possible >>>>>>> G1CollectorPolicy, so I don't worry too much about having >>>>>>> knowledge about the two generations in CollectorPolicy. >>>>>>> >>>>>>> >>>>>>> A new webrev is available here: >>>>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> /Jesper >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>>>>> >>>>>>> Jesper, >>>>>>> >>>>>>> I'm a bit concerned that set_heap_size() now knows >>>>>>> about how >>>>>>> the CollectorPolicy uses OldSize and NewSize. In the >>>>>>> distant >>>>>>> past set_heap_size() did not know what kind of >>>>>>> collector was >>>>>>> going to be used and probably avoided looking at those >>>>>>> parameters for that reason. Today we know that a >>>>>>> generational >>>>>>> collector is to follow but maybe you could hide that >>>>>>> knowledge >>>>>>> in CollectorPolicy somewhere and have set_heap_size() >>>>>>> call into >>>>>>> CollectorPolicy to use that information? >>>>>>> >>>>>>> Jon >>>>>>> >>>>>>> >>>>>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I would like a couple of reviews of a small fix for >>>>>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit >>>>>>> VMs >>>>>>> >>>>>>> Webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> When starting HotSpot with an OldSize larger than >>>>>>> the default heap size one will run into a couple of >>>>>>> problems. Basically what happens is that the >>>>>>> OldSize is ignored because it is incompatible with >>>>>>> the heap size. A debug build will assert since a >>>>>>> calculation on the way results in a negative >>>>>>> number, but since it is a size_t an if(x<0) won't >>>>>>> trigger and the assert catches it later on as >>>>>>> incompatible flags. >>>>>>> >>>>>>> Changes: >>>>>>> I have made two changes to fix this. >>>>>>> >>>>>>> The first is to change the calculation in >>>>>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so >>>>>>> that it won't result in a negative number in the if >>>>>>> statement. This way we will catch the case where >>>>>>> the OldSize is larger than the heap size and adjust >>>>>>> the OldSize instead of the young size. There are >>>>>>> also some cosmetic changes here. For instance the >>>>>>> argument min_gen0_size is actually used for the old >>>>>>> generation size which was a bit confusing >>>>>>> initially. I renamed it to min_gen1_size (which it >>>>>>> already was called in the header file). >>>>>>> >>>>>>> The second change is in Arguments::set_heap_size. >>>>>>> My reasoning here is that if the user sets the >>>>>>> OldSize we should probably adjust the heap size to >>>>>>> accommodate that OldSize instead of complaining >>>>>>> that the heap is too small. We determine the heap >>>>>>> size first and the generation sizes later on while >>>>>>> initializing the VM. To be able to fit the >>>>>>> generations if the user specifies sizes on the >>>>>>> command line we need to look at the generation size >>>>>>> flags a little already when setting up the heap >>>>>>> size. 
>>>>>>> >>>>>>> Thanks, >>>>>>> /Jesper >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >> > From coleen.phillimore at oracle.com Wed Jan 23 17:01:03 2013 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Wed, 23 Jan 2013 12:01:03 -0500 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <50F424A0.6080907@oracle.com> References: <50F424A0.6080907@oracle.com> Message-ID: <5100174F.7060301@oracle.com> It looks okay except I think passing SpaceManager* to get_new_chunk() is really gross just to do a print out. I'd rather see the printing at line 1014 moved to 2170 whether you get a new virtual space or not. There's already a ton of output from TraceMetadataChunkAllocation && Verbose, but you can leave the printing at 1013 so you know which one created a new virtual space. Coleen On 01/14/2013 10:30 AM, Jon Masamitsu wrote: > 8005452: Create new flags for Metaspace resizing policy > > Previously the calculation of the metadata capacity at which > to do a GC (high water mark, HWM) to recover > unloaded classes used the MinHeapFreeRatio > and MaxHeapFreeRatio to decide on the next HWM. That > generally left an excessive amount of unused capacity for > metadata. This change adds specific flags for metadata > capacity with defaults more conservative in terms of > unused capacity. > > Added an additional check for doing a GC before expanding > the metadata capacity. Required adding a new parameter to > get_new_chunk(). > > Added some additional diagnostic prints. > > http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ > > Thanks. From jon.masamitsu at oracle.com Wed Jan 23 18:15:33 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 23 Jan 2013 10:15:33 -0800 Subject: RFR (S): JDK-6348447 - Specifying -XX:OldSize crashes 64-bit VMs In-Reply-To: <50FFE911.40901@oracle.com> References: <50F43C0E.1080308@oracle.com> <50F447CB.1000604@oracle.com> <50F55485.5020705@oracle.com> <50F55C7A.8070508@oracle.com> <50F66391.6000504@oracle.com> <50F6A0DA.4040108@oracle.com> <50FADA0A.7040209@oracle.com> <50FD148D.6080000@oracle.com> <50FFE911.40901@oracle.com> Message-ID: <510028C5.40906@oracle.com> Jesper, Looks good. Jon On 1/23/2013 5:43 AM, Jesper Wilhelmsson wrote: > Erik, > > Thanks for looking at it again. An updated webrev can be found here: > http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.4/ > > See comments inline. > > On 21/1/13 11:12 AM, Erik Helin wrote: >> Jesper, >> >> On 01/19/2013 06:38 PM, Jesper Wilhelmsson wrote: >>> Some further code inspection showed that it's possible to fix this >>> bug in >>> TwoGenerationCollectorPolicy::initialize_flags() and keep the fix >>> local. >> >> I think this is much better, nice work! > Thanks for the suggestions! > >> >> On 01/19/2013 06:38 PM, Jesper Wilhelmsson wrote: >>> A new webrev is available here: >>> >>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3/ >> >> A couple of comments: >> >> - Instead of having the comment: >> > "NewRatio is checked earlier and can not be zero here" >> Could you add an assert that says: >> > assert(NewRatio != 0, "Should have been checked earlier"); >> This way you get a verifiable comment :) > Agreed and fixed. 
> >> - I think its better if the calculation >> > (OldSize / NewRatio) * (NewRatio + 1) >> is saved in a variable, perhaps: >> > uintx heap_size_by_scaling_old_size = >> > (OldSize / NewRatio) * (NewRatio + 1); >> > >> > MaxHeapSize = heap_size_by_scaling_old_size; >> > InitialHeapSize = heap_size_by_scaling_old_size; > Agreed and fixed. > >> - This is not related to your change, but I would prefer if some of >> the logic in TwoGenerationCollectorPolicy::adjust_gen0_size got some >> more descriptive names, perhaps: >> > bool is_heap_too_small = >> > (*gen1_size_ptr + *gen0_size_ptr) > heap_size; >> > >> > if (is_heap_too_small) { >> > bool has_heap_space_left_for_gen0 = >> > heap_size >= (min_gen1_size + min_alignment()); >> > bool is_gen0_too_large = >> > heap_size < (*gen0_size_ptr + min_gen1_size); >> > >> > if (is_gen0_too_large && has_heap_space_left_for_gen0) { >> > .... >> > } > I guess this is a matter of taste. I would actually prefer to keep it > as is. I do appreciate naming parts of complex expressions into local > variables to increase readability, but I don't find this expression > complex enough to motivate it. > >> - As a final possible cleanup of >> TwoGenerationCollectorPolicy::adjust_gen0_size, I think the variable: >> > bool result = false; >> can be removed and instead an early exit can be used in the if >> statement that sets "result" to true and the method can return false >> in the end. > Again a matter of taste. At some point I was told that it was > discouraged to use early exit. I don't remember where I heard/read it, > but until we have some guidelines to dictate early exit I prefer using > a result variable. Random return statements are easily overlooked when > browsing code. > /Jesper > >> >> What do you think of these suggestions? >> >> Thanks, >> Erik >> >>> /Jesper >>> >>> >>> On 16/1/13 2:23 PM, Vitaly Davidovich wrote: >>>> >>>> Looks good Jesper. Maybe just a comment there that NewRatio hasn't >>>> been checked yet but if it's 0, VM will exit later on anyway - >>>> basically, what you said in the email :). >>>> >>>> Cheers >>>> >>>> Sent from my phone >>>> >>>> On Jan 16, 2013 7:49 AM, "Jesper Wilhelmsson" >>>> > >>>> wrote: >>>> >>>> >>>> On 2013-01-16 09:23, Bengt Rutisson wrote: >>>>> On 1/15/13 2:41 PM, Jesper Wilhelmsson wrote: >>>>>> On 2013-01-15 14:32, Vitaly Davidovich wrote: >>>>>>> >>>>>>> Hi Jesper, >>>>>>> >>>>>>> Is NewRatio guaranteed to be non-zero when used inside >>>>>>> recommended_heap_size? >>>>>>> >>>>>> As far as I can see, yes. It defaults to two and is never set to >>>>>> zero. >>>>> >>>>> No, there is no such guarantee this early in the argument >>>>> parsing. The check to verify that NewRatio > 0 is done in >>>>> GenCollectorPolicy::initialize_flags(), which is called later in >>>>> the start up sequence than your call to >>>>> CollectorPolicy::recommended_heap_size() and it is never called >>>>> for G1. >>>>> >>>>> Running with your patch crashes: >>>>> >>>>> java -XX:OldSize=128m -XX:NewRatio=0 -version >>>>> Floating point exception: 8 >>>> >>>> Oh, yes, you're right. Sorry! >>>> >>>> Good catch Vitaly! >>>> >>>> New webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.3 >>>> >>>> >>>> I'm just skipping the calculation if NewRatio is zero. The VM will >>>> abort anyway as soon as it realizes that this is the case. 
>>>> /Jesper >>>> >>>> >>>>> Bengt >>>>>> /Jesper >>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Sent from my phone >>>>>>> >>>>>>> On Jan 15, 2013 8:11 AM, "Jesper Wilhelmsson" >>>>>>> >>>>>> > wrote: >>>>>>> >>>>>>> Jon, >>>>>>> >>>>>>> Thank you for looking at this! I share your concerns and I >>>>>>> have moved the knowledge about policies to CollectorPolicy. >>>>>>> set_heap_size() now simply asks the collector policy if it >>>>>>> has any recommendations regarding the heap size. >>>>>>> >>>>>>> Ideally, since the code knows about young and old >>>>>>> generations, I guess the new function >>>>>>> "recommended_heap_size()" should be placed in >>>>>>> GenCollectorPolicy, but then the code would have to be >>>>>>> duplicated for G1 as well. However, CollectorPolicy already >>>>>>> know about OldSize and NewSize so I think it is OK to put >>>>>>> it there. >>>>>>> >>>>>>> Eventually I think that we should reduce the abstraction >>>>>>> level in the generation policies and merge CollectorPolicy, >>>>>>> GenCollectorPolicy and maybe even >>>>>>> TwoGenerationCollectorPolicy and if possible >>>>>>> G1CollectorPolicy, so I don't worry too much about having >>>>>>> knowledge about the two generations in CollectorPolicy. >>>>>>> >>>>>>> >>>>>>> A new webrev is available here: >>>>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev.2/ >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> /Jesper >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2013-01-14 19:00, Jon Masamitsu wrote: >>>>>>> >>>>>>> Jesper, >>>>>>> >>>>>>> I'm a bit concerned that set_heap_size() now knows >>>>>>> about how >>>>>>> the CollectorPolicy uses OldSize and NewSize. In the >>>>>>> distant >>>>>>> past set_heap_size() did not know what kind of >>>>>>> collector was >>>>>>> going to be used and probably avoided looking at those >>>>>>> parameters for that reason. Today we know that a >>>>>>> generational >>>>>>> collector is to follow but maybe you could hide that >>>>>>> knowledge >>>>>>> in CollectorPolicy somewhere and have set_heap_size() >>>>>>> call into >>>>>>> CollectorPolicy to use that information? >>>>>>> >>>>>>> Jon >>>>>>> >>>>>>> >>>>>>> On 01/14/13 09:10, Jesper Wilhelmsson wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I would like a couple of reviews of a small fix for >>>>>>> JDK-6348447 - Specifying -XX:OldSize crashes 64-bit >>>>>>> VMs >>>>>>> >>>>>>> Webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~jwilhelm/6348447/webrev/ >>>>>>> >>>>>>> >>>>>>> Summary: >>>>>>> When starting HotSpot with an OldSize larger than >>>>>>> the default heap size one will run into a couple of >>>>>>> problems. Basically what happens is that the >>>>>>> OldSize is ignored because it is incompatible with >>>>>>> the heap size. A debug build will assert since a >>>>>>> calculation on the way results in a negative >>>>>>> number, but since it is a size_t an if(x<0) won't >>>>>>> trigger and the assert catches it later on as >>>>>>> incompatible flags. >>>>>>> >>>>>>> Changes: >>>>>>> I have made two changes to fix this. >>>>>>> >>>>>>> The first is to change the calculation in >>>>>>> TwoGenerationCollectorPolicy::adjust_gen0_sizes so >>>>>>> that it won't result in a negative number in the if >>>>>>> statement. This way we will catch the case where >>>>>>> the OldSize is larger than the heap size and adjust >>>>>>> the OldSize instead of the young size. There are >>>>>>> also some cosmetic changes here. For instance the >>>>>>> argument min_gen0_size is actually used for the old >>>>>>> generation size which was a bit confusing >>>>>>> initially. 
I renamed it to min_gen1_size (which it >>>>>>> already was called in the header file). >>>>>>> >>>>>>> The second change is in Arguments::set_heap_size. >>>>>>> My reasoning here is that if the user sets the >>>>>>> OldSize we should probably adjust the heap size to >>>>>>> accommodate that OldSize instead of complaining >>>>>>> that the heap is too small. We determine the heap >>>>>>> size first and the generation sizes later on while >>>>>>> initializing the VM. To be able to fit the >>>>>>> generations if the user specifies sizes on the >>>>>>> command line we need to look at the generation size >>>>>>> flags a little already when setting up the heap >>>>>>> size. >>>>>>> >>>>>>> Thanks, >>>>>>> /Jesper >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >> > From john.cuthbertson at oracle.com Wed Jan 23 22:51:17 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 23 Jan 2013 14:51:17 -0800 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: References: <50F5E6BE.9040901@oracle.com> Message-ID: <51006965.6040105@oracle.com> Hi Vitaly, Thanks for looking over the code changes. I'll respond to your other comments in a separate email. Detailed responses inline.... On 1/15/2013 4:57 PM, Vitaly Davidovich wrote: > > Hi John, > > Wow, that's a giant heap! :) > > I think G1ConcRSLogCacheSize needs to be validated to make sure it's > <= 31; otherwise, I think you get undefined behavior on left shifting > with it. > Good catch. Done. > I don't think you need _def_use_cache -- can be replaced with > G1ConcRSLogCacheSize > 0? > Done. I've added a function that returns the result of the comparison and I use that in place of G1ConcRSLogCacheSize. > I'm sure this is due to my lack of G1 knowledge, but the concurrency > control inside g1HotCardCache is a bit unclear. There's a CAS to claim > the region of cards, there's a HotCache lock for inserting a card. > However, reset_hot_cache() does a naked write of a few fields. Are > there any visibility and ordering constraints that need to be > enforced? Do some of the stores need an OrderAccess barrier of some > sort, depending on what's required? Sorry if I'm just missing it ... > The drain routine is only called from within a GC pause but it is called by multiple GC worker threads. Each worker will claim a chunk of cards using the CAS and refine them. Resetting the boundaries (the values reset by reset_hot_cache()) in the drain routine would be a mistake since a worker thread could see the new boundary values and return, potentially leaving some cards unrefined and some missing entries in remembered sets. I can only clear the fields when the last thread has finished draining the cache. The best place to do this is just before the VM thread re-enables the cache (we know the worker threads will have finished at this point). Since the "drain" doesn't actually drain, perhaps a better name might be refine_all()? The HotCache lock is used when adding entries to the cache. Entries are added by the refinement threads (and there will most likely be more than one). Since the act of adding an entry can also evict an entry we need the lock to guard against hitting the ABA problem. This could result in skipping the refinement of a card, which will lead to missing remembered set entries which are not fun to track down. Draining during the GC is immune from the ABA problem because we're not actually removing entries from the cache. 
We would still be immune, however, if we were removing entries since we would not be adding entries at the same time. Thanks, JohnC From john.cuthbertson at oracle.com Wed Jan 23 23:36:30 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 23 Jan 2013 15:36:30 -0800 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: References: <50F5E6BE.9040901@oracle.com> Message-ID: <510073FE.5010706@oracle.com> Hi Vitaly, Second response. Details inline.... On 1/15/2013 8:49 PM, Vitaly Davidovich wrote: > > A few more comments/suggestions: > > In g1CardCounts::ptr_2_card_num(), I'd assert that card_ptr >= > _ct_bot. This is mostly to avoid a null card_ptr (or some other bogus > value) causing the subtraction to go negative but then wrap around to > a large size_t value that just happens to fit into the card range. > Unlikely and maybe this is too paranoid, so up to you. > Good idea. Done. I'm all for being paranoid. I've also changed the subtraction to use pointer_delta: assert((size_t)card_ptr >= (size_t)_ct_bot, "wraparound?"); size_t card_num = pointer_delta(card_ptr, _ct_bot, sizeof(jbyte)); > Also in this class, it's a bit strange that G1CardCounts::is_hot() > also increments the count. I know the comments in the header say that > count is updated but the name of the method implies it's just a read. > Maybe call add_card_count() and then add an is_hot(int) method and > call that with the return value of add_card_count? > Let me think about that one. I was looking for a much simpler interface that returned whether a card was was hot. How it made that determination was up to it. But I'm leaning toward following your suggestion. > G1CardCounts::clear_region -- there are some casts of const jbyte to > jbyte at bottom of method. Perhaps if ptr_2_card_num() were changed > to be taking const jbyte you wouldn't need the casts. I think marking > as much things const as possible is good in general anyway ... > Good point. Done. > In the various places where values are asserted to be in some range, > it may be useful to add the valid range to the error message so that > if it triggers you get a bit more context/diagnostic info. > I thought I was doing that in most cases. Can you give some examples? Thanks, JohnC From vitalyd at gmail.com Thu Jan 24 00:06:46 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 23 Jan 2013 19:06:46 -0500 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: <510073FE.5010706@oracle.com> References: <50F5E6BE.9040901@oracle.com> <510073FE.5010706@oracle.com> Message-ID: Hi John, Thanks for the feedback. Regarding the asserts question, here's an example g1HotCardCache.cpp): assert(worker_i < (int) (ParallelGCThreads == 0 ? 1 : ParallelGCThreads), "incorrect worker id"); Would be useful to see worker_i here? Another in g1CardCounts.hpp: 67 void check_card_num(size_t card_num, const char* msg) { 68 assert(card_num >= 0 && card_num < _committed_max_card_num, msg); 69 } The msg will have the card_num, but you won't have _committed_max_card_num reported, for example. That's the kind of thing I meant. There may be a few more examples like that, but this came up during quick scan. Thanks Sent from my phone On Jan 23, 2013 6:36 PM, "John Cuthbertson" wrote: > Hi Vitaly, > > Second response. Details inline.... > > On 1/15/2013 8:49 PM, Vitaly Davidovich wrote: > >> >> A few more comments/suggestions: >> >> In g1CardCounts::ptr_2_card_num()**, I'd assert that card_ptr >= >> _ct_bot. 
This is mostly to avoid a null card_ptr (or some other bogus >> value) causing the subtraction to go negative but then wrap around to a >> large size_t value that just happens to fit into the card range. Unlikely >> and maybe this is too paranoid, so up to you. >> >> > Good idea. Done. I'm all for being paranoid. I've also changed the > subtraction to use pointer_delta: > > assert((size_t)card_ptr >= (size_t)_ct_bot, "wraparound?"); > size_t card_num = pointer_delta(card_ptr, _ct_bot, sizeof(jbyte)); > > > Also in this class, it's a bit strange that G1CardCounts::is_hot() also >> increments the count. I know the comments in the header say that count is >> updated but the name of the method implies it's just a read. Maybe call >> add_card_count() and then add an is_hot(int) method and call that with the >> return value of add_card_count? >> >> > Let me think about that one. I was looking for a much simpler interface > that returned whether a card was was hot. How it made that determination > was up to it. But I'm leaning toward following your suggestion. > > G1CardCounts::clear_region -- there are some casts of const jbyte to >> jbyte at bottom of method. Perhaps if ptr_2_card_num() were changed to be >> taking const jbyte you wouldn't need the casts. I think marking as much >> things const as possible is good in general anyway ... >> >> > Good point. Done. > > In the various places where values are asserted to be in some range, it >> may be useful to add the valid range to the error message so that if it >> triggers you get a bit more context/diagnostic info. >> >> > I thought I was doing that in most cases. Can you give some examples? > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu Jan 24 00:19:51 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 23 Jan 2013 19:19:51 -0500 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: <51006965.6040105@oracle.com> References: <50F5E6BE.9040901@oracle.com> <51006965.6040105@oracle.com> Message-ID: Hi John, Thanks for this explanation as well. I see what you're saying about the concurrency control, but what I don't understand is when this is called: void reset_hot_cache() { 107 _hot_cache_idx = 0; _n_hot = 0; 108 } Since these are plain stores, what exactly ensures that they're (promptly) visible to other GC threads? Is there some dependency here, e.g. if you see _n_hot = 0 then _hot_cache_idx must also be zero? I strongly suspect I missed the details in your response that explain why this isn't a concern. Is there only a particular type of thread that can call reset_hot_cache and/or only at a certain point? It kind of sounds like it so don't know if there's an assert that can be added to verify that. Thanks Sent from my phone On Jan 23, 2013 5:51 PM, "John Cuthbertson" wrote: > Hi Vitaly, > > Thanks for looking over the code changes. I'll respond to your other > comments in a separate email. Detailed responses inline.... > > On 1/15/2013 4:57 PM, Vitaly Davidovich wrote: > >> >> Hi John, >> >> Wow, that's a giant heap! :) >> >> I think G1ConcRSLogCacheSize needs to be validated to make sure it's <= >> 31; otherwise, I think you get undefined behavior on left shifting with it. >> >> > Good catch. Done. > > I don't think you need _def_use_cache -- can be replaced with >> G1ConcRSLogCacheSize > 0? >> >> > Done. 
I've added a function that returns the result of the comparison and > I use that in place of G1ConcRSLogCacheSize. > > I'm sure this is due to my lack of G1 knowledge, but the concurrency >> control inside g1HotCardCache is a bit unclear. There's a CAS to claim the >> region of cards, there's a HotCache lock for inserting a card. However, >> reset_hot_cache() does a naked write of a few fields. Are there any >> visibility and ordering constraints that need to be enforced? Do some of >> the stores need an OrderAccess barrier of some sort, depending on what's >> required? Sorry if I'm just missing it ... >> >> > The drain routine is only called from within a GC pause but it is called > by multiple GC worker threads. Each worker will claim a chunk of cards > using the CAS and refine them. Resetting the boundaries (the values reset > by reset_hot_cache()) in the drain routine would be a mistake since a > worker thread could see the new boundary values and return, potentially > leaving some cards unrefined and some missing entries in remembered sets. I > can only clear the fields when the last thread has finished draining the > cache. The best place to do this is just before the VM thread re-enables > the cache (we know the worker threads will have finished at this point). > Since the "drain" doesn't actually drain, perhaps a better name might be > refine_all()? > > The HotCache lock is used when adding entries to the cache. Entries are > added by the refinement threads (and there will most likely be more than > one). Since the act of adding an entry can also evict an entry we need the > lock to guard against hitting the ABA problem. This could result in > skipping the refinement of a card, which will lead to missing remembered > set entries which are not fun to track down. > > Draining during the GC is immune from the ABA problem because we're not > actually removing entries from the cache. We would still be immune, > however, if we were removing entries since we would not be adding entries > at the same time. > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Thu Jan 24 00:36:53 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 23 Jan 2013 16:36:53 -0800 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: References: <50F5E6BE.9040901@oracle.com> <51006965.6040105@oracle.com> Message-ID: <51008225.3060306@oracle.com> Hi Vitalty, On 1/23/2013 4:19 PM, Vitaly Davidovich wrote: > > Hi John, > > Thanks for this explanation as well. I see what you're saying about > the concurrency control, but what I don't understand is when this is > called: > > void reset_hot_cache() { > 107 _hot_cache_idx = 0; _n_hot = 0; > 108 } > > Since these are plain stores, what exactly ensures that they're > (promptly) visible to other GC threads? Is there some dependency here, > e.g. if you see _n_hot = 0 then _hot_cache_idx must also be zero? I > strongly suspect I missed the details in your response that explain > why this isn't a concern. Is there only a particular type of thread > that can call reset_hot_cache and/or only at a certain point? It kind > of sounds like it so don't know if there's an assert that can be added > to verify that. > At the point where this routine is called the GC workers have finished the parallel phases of the GC and are idle. The thread that is running is the VM thread. 
The rest of the VM is at a safepoint so we are, in effect, single threaded. Yes there is an assert we can add here: assert(SafepointSynchronize::is_at_safepoint() && Thread::current()->is_VM_thread(), "..."); JohnC > Thanks > > Sent from my phone > > On Jan 23, 2013 5:51 PM, "John Cuthbertson" > > wrote: > > Hi Vitaly, > > Thanks for looking over the code changes. I'll respond to your > other comments in a separate email. Detailed responses inline.... > > On 1/15/2013 4:57 PM, Vitaly Davidovich wrote: > > > Hi John, > > Wow, that's a giant heap! :) > > I think G1ConcRSLogCacheSize needs to be validated to make > sure it's <= 31; otherwise, I think you get undefined behavior > on left shifting with it. > > > Good catch. Done. > > I don't think you need _def_use_cache -- can be replaced with > G1ConcRSLogCacheSize > 0? > > > Done. I've added a function that returns the result of the > comparison and I use that in place of G1ConcRSLogCacheSize. > > I'm sure this is due to my lack of G1 knowledge, but the > concurrency control inside g1HotCardCache is a bit unclear. > There's a CAS to claim the region of cards, there's a HotCache > lock for inserting a card. However, reset_hot_cache() does a > naked write of a few fields. Are there any visibility and > ordering constraints that need to be enforced? Do some of the > stores need an OrderAccess barrier of some sort, depending on > what's required? Sorry if I'm just missing it ... > > > The drain routine is only called from within a GC pause but it is > called by multiple GC worker threads. Each worker will claim a > chunk of cards using the CAS and refine them. Resetting the > boundaries (the values reset by reset_hot_cache()) in the drain > routine would be a mistake since a worker thread could see the new > boundary values and return, potentially leaving some cards > unrefined and some missing entries in remembered sets. I can only > clear the fields when the last thread has finished draining the > cache. The best place to do this is just before the VM thread > re-enables the cache (we know the worker threads will have > finished at this point). Since the "drain" doesn't actually drain, > perhaps a better name might be refine_all()? > > The HotCache lock is used when adding entries to the cache. > Entries are added by the refinement threads (and there will most > likely be more than one). Since the act of adding an entry can > also evict an entry we need the lock to guard against hitting the > ABA problem. This could result in skipping the refinement of a > card, which will lead to missing remembered set entries which are > not fun to track down. > > Draining during the GC is immune from the ABA problem because > we're not actually removing entries from the cache. We would still > be immune, however, if we were removing entries since we would not > be adding entries at the same time. > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu Jan 24 00:41:12 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 23 Jan 2013 19:41:12 -0500 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: <51008225.3060306@oracle.com> References: <50F5E6BE.9040901@oracle.com> <51006965.6040105@oracle.com> <51008225.3060306@oracle.com> Message-ID: Got it now - thanks. So then does exiting the safepoint guarantee that these writes are flushed to memory so next time GC threads run they see 0s? Or is that not important/enforced elsewhere? 
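(A minimal sketch of the publication argument, using only standard C++ threads. The mutex and condition variable below stand in for the safepoint and work-gang hand-off, a simplifying assumption rather than HotSpot's actual mechanism; the point is that plain stores made while the workers are parked become visible to them through whatever synchronizing operation wakes them up.)

#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

struct HotCacheLike {
  int _hot_cache_idx;
  int _n_hot;
  HotCacheLike() : _hot_cache_idx(5), _n_hot(3) {}
  void reset_hot_cache() {               // plain stores, done single-threaded
    _hot_cache_idx = 0;
    _n_hot = 0;
  }
};

int main() {
  HotCacheLike cache;
  std::mutex m;
  std::condition_variable cv;
  bool resume = false;

  std::thread worker([&] {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [&] { return resume; }); // acquires m; pairs with the release below
    std::printf("worker sees idx=%d n_hot=%d\n", cache._hot_cache_idx, cache._n_hot);
  });

  // "VM thread" work while the worker is known to be parked: no other thread
  // touches the cache, so no atomics or barriers are needed for these stores.
  cache.reset_hot_cache();

  {
    std::lock_guard<std::mutex> lk(m);
    resume = true;                       // releasing m publishes the plain stores
  }
  cv.notify_one();
  worker.join();
  return 0;
}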
Sent from my phone On Jan 23, 2013 7:36 PM, "John Cuthbertson" wrote: > Hi Vitalty, > > On 1/23/2013 4:19 PM, Vitaly Davidovich wrote: > > Hi John, > > Thanks for this explanation as well. I see what you're saying about the > concurrency control, but what I don't understand is when this is called: > > void reset_hot_cache() { > 107 _hot_cache_idx = 0; _n_hot = 0; > 108 } > > Since these are plain stores, what exactly ensures that they're (promptly) > visible to other GC threads? Is there some dependency here, e.g. if you see > _n_hot = 0 then _hot_cache_idx must also be zero? I strongly suspect I > missed the details in your response that explain why this isn't a concern. > Is there only a particular type of thread that can call reset_hot_cache > and/or only at a certain point? It kind of sounds like it so don't know if > there's an assert that can be added to verify that. > > > At the point where this routine is called the GC workers have finished the > parallel phases of the GC and are idle. The thread that is running is the > VM thread. The rest of the VM is at a safepoint so we are, in effect, > single threaded. Yes there is an assert we can add here: > > assert(SafepointSynchronize::is_at_safepoint() && > Thread::current()->is_VM_thread(), "..."); > > JohnC > > Thanks > > Sent from my phone > On Jan 23, 2013 5:51 PM, "John Cuthbertson" > wrote: > >> Hi Vitaly, >> >> Thanks for looking over the code changes. I'll respond to your other >> comments in a separate email. Detailed responses inline.... >> >> On 1/15/2013 4:57 PM, Vitaly Davidovich wrote: >> >>> >>> Hi John, >>> >>> Wow, that's a giant heap! :) >>> >>> I think G1ConcRSLogCacheSize needs to be validated to make sure it's <= >>> 31; otherwise, I think you get undefined behavior on left shifting with it. >>> >>> >> Good catch. Done. >> >> I don't think you need _def_use_cache -- can be replaced with >>> G1ConcRSLogCacheSize > 0? >>> >>> >> Done. I've added a function that returns the result of the comparison and >> I use that in place of G1ConcRSLogCacheSize. >> >> I'm sure this is due to my lack of G1 knowledge, but the concurrency >>> control inside g1HotCardCache is a bit unclear. There's a CAS to claim the >>> region of cards, there's a HotCache lock for inserting a card. However, >>> reset_hot_cache() does a naked write of a few fields. Are there any >>> visibility and ordering constraints that need to be enforced? Do some of >>> the stores need an OrderAccess barrier of some sort, depending on what's >>> required? Sorry if I'm just missing it ... >>> >>> >> The drain routine is only called from within a GC pause but it is called >> by multiple GC worker threads. Each worker will claim a chunk of cards >> using the CAS and refine them. Resetting the boundaries (the values reset >> by reset_hot_cache()) in the drain routine would be a mistake since a >> worker thread could see the new boundary values and return, potentially >> leaving some cards unrefined and some missing entries in remembered sets. I >> can only clear the fields when the last thread has finished draining the >> cache. The best place to do this is just before the VM thread re-enables >> the cache (we know the worker threads will have finished at this point). >> Since the "drain" doesn't actually drain, perhaps a better name might be >> refine_all()? >> >> The HotCache lock is used when adding entries to the cache. Entries are >> added by the refinement threads (and there will most likely be more than >> one). 
Since the act of adding an entry can also evict an entry we need the >> lock to guard against hitting the ABA problem. This could result in >> skipping the refinement of a card, which will lead to missing remembered >> set entries which are not fun to track down. >> >> Draining during the GC is immune from the ABA problem because we're not >> actually removing entries from the cache. We would still be immune, >> however, if we were removing entries since we would not be adding entries >> at the same time. >> >> Thanks, >> >> JohnC >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Thu Jan 24 04:38:19 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 23 Jan 2013 20:38:19 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <5100174F.7060301@oracle.com> References: <50F424A0.6080907@oracle.com> <5100174F.7060301@oracle.com> Message-ID: <5100BABB.4040004@oracle.com> Coleen, Thanks for the review. I delete the print at 1013 (instead of moving it) and reverted get_new_chunk(). I left in the print at 1014. I have 2 webrevs for these changes now. Your suggested changes are in http://cr.openjdk.java.net/~jmasa/8006815/webrev.00/ Webrev with the Min/MaxMetaspaceFreeRatio changes is http://cr.openjdk.java.net/~jmasa/8005452/webrev.02/ Jon On 1/23/2013 9:01 AM, Coleen Phillimore wrote: > > It looks okay except I think passing SpaceManager* to get_new_chunk() > is really gross just to do a print out. I'd rather see the printing > at line 1014 moved to 2170 whether you get a new virtual space or > not. There's already a ton of output from > TraceMetadataChunkAllocation && Verbose, but you can leave the > printing at 1013 so you know which one created a new virtual space. > > Coleen > > > On 01/14/2013 10:30 AM, Jon Masamitsu wrote: >> 8005452: Create new flags for Metaspace resizing policy >> >> Previously the calculation of the metadata capacity at which >> to do a GC (high water mark, HWM) to recover >> unloaded classes used the MinHeapFreeRatio >> and MaxHeapFreeRatio to decide on the next HWM. That >> generally left an excessive amount of unused capacity for >> metadata. This change adds specific flags for metadata >> capacity with defaults more conservative in terms of >> unused capacity. >> >> Added an additional check for doing a GC before expanding >> the metadata capacity. Required adding a new parameter to >> get_new_chunk(). >> >> Added some additional diagnostic prints. >> >> http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ >> >> Thanks. > From yumin.qi at oracle.com Thu Jan 24 06:14:17 2013 From: yumin.qi at oracle.com (Yumin Qi) Date: Wed, 23 Jan 2013 22:14:17 -0800 Subject: RFR: 8005278: Serviceability Agent: jmap -heap and jstack -m fail Message-ID: <5100D139.3040705@oracle.com> Hi, Can I have your comments on fix for 8005278: Serviceability Agent: jmap -heap and jstack -m fail http://cr.openjdk.java.net/~minqi/8005278/ Problems: 1) In JVM, BinaryTreeDictionary is typedef'ed as AFLBinaryTreeDictionary and this name carried to type library for SA. In SA we still use olde name for that; 2) FreeList now is template based which is not reflected in SA; 3) There is a misuse of FIELFINFO_TAG_MASK(which is not in SA code), in SA code FIELDINFO_TAG_SIZE was wrongly used as FIELDINFO_TAG_MASK and lead to not able to find correct field info. Thanks Yumin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david.holmes at oracle.com Thu Jan 24 06:29:04 2013 From: david.holmes at oracle.com (David Holmes) Date: Thu, 24 Jan 2013 16:29:04 +1000 Subject: RFR: 8005278: Serviceability Agent: jmap -heap and jstack -m fail In-Reply-To: <5100D139.3040705@oracle.com> References: <5100D139.3040705@oracle.com> Message-ID: <5100D4B0.4060105@oracle.com> Thanks Yumin this all looks okay to me. David On 24/01/2013 4:14 PM, Yumin Qi wrote: > Hi, > > Can I have your comments on fix for > 8005278: Serviceability Agent: jmap -heap and jstack -m fail > > http://cr.openjdk.java.net/~minqi/8005278/ > > Problems: 1) In JVM, BinaryTreeDictionary is typedef'ed as > AFLBinaryTreeDictionary and this name carried to type library for SA. In > SA we still use olde name for that; 2) FreeList now is template based > which is not reflected in SA; 3) There is a misuse of > FIELFINFO_TAG_MASK(which is not in SA code), in SA code > FIELDINFO_TAG_SIZE was wrongly used as FIELDINFO_TAG_MASK and lead to > not able to find correct field info. > > Thanks > Yumin > > > From erik.helin at oracle.com Thu Jan 24 10:51:13 2013 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 24 Jan 2013 11:51:13 +0100 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes Message-ID: <51011221.8050102@oracle.com> Hi all, here are the HotSpot changes for fixing JDK-8004172. This change uses the new namespace "sun.gc.metaspace" for the metaspace counters and also removes some code from metaspaceCounters.hpp/cpp that is not needed any longer. Note that the tests will continue to fail until the JDK part of the change finds it way into the hotspot-gc forest. The JDK part of the change is also out for review on serviceability-dev at openjdk.java.net. Webrev: HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ Bug: http://bugs.sun.com/view_bug.do?bug_id=8004172 Testing: Run the jstat jtreg tests locally on my machine on a repository where I've applied both the JDK changes and the HotSpot changes. Thanks, Erik From stefan.karlsson at oracle.com Thu Jan 24 13:32:26 2013 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 24 Jan 2013 14:32:26 +0100 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <5100BABB.4040004@oracle.com> References: <50F424A0.6080907@oracle.com> <5100174F.7060301@oracle.com> <5100BABB.4040004@oracle.com> Message-ID: <510137EA.4050102@oracle.com> On 01/24/2013 05:38 AM, Jon Masamitsu wrote: > Coleen, > > Thanks for the review. > > I delete the print at 1013 (instead of moving it) and reverted > get_new_chunk(). I left in the > print at 1014. > > I have 2 webrevs for these changes now. Thanks for splitting this into two changes. > Your suggested changes are in > > http://cr.openjdk.java.net/~jmasa/8006815/webrev.00/ I don't know if this is a reasonable change or not. Why are you checking if we should expand, before trying to allocate in the current virtual space? 982 // The next attempts at allocating a chunk will expand the 983 // Metaspace capacity. Check first if there should be an expansion. 984 if (!MetaspaceGC::should_expand(this, word_size, grow_chunks_by_words)) { 985 return next; 986 } 987 988 // Allocate a chunk out of the current virtual space. 989 if (next == NULL) { 990 next = current_virtual_space()->get_chunk_vs(grow_chunks_by_words); 991 } Shouldn't this line be checking "less than or equal": *! 
if (_(_vsl->capacity_words_sum(_) + expansion_word_size_) < metaspace_size_words ||* capacity_until_GC() == 0) { set_capacity_until_GC(metaspace_size_words); return true; } > Webrev with the Min/MaxMetaspaceFreeRatio changes is > > http://cr.openjdk.java.net/~jmasa/8005452/webrev.02/ This looks good. Though, I think you need to update the descriptions of the new flags: + product(uintx, MinMetaspaceFreeRatio, 10, \ + "Min percentage of heap free after GC to avoid expansion") \ + \ + product(uintx, MaxMetaspaceFreeRatio, 20, \ + "Max percentage of heap free after GC to avoid shrinking") \ thanks, StefanK > > Jon > > On 1/23/2013 9:01 AM, Coleen Phillimore wrote: >> >> It looks okay except I think passing SpaceManager* to get_new_chunk() >> is really gross just to do a print out. I'd rather see the printing >> at line 1014 moved to 2170 whether you get a new virtual space or >> not. There's already a ton of output from >> TraceMetadataChunkAllocation && Verbose, but you can leave the >> printing at 1013 so you know which one created a new virtual space. >> >> Coleen >> >> >> On 01/14/2013 10:30 AM, Jon Masamitsu wrote: >>> 8005452: Create new flags for Metaspace resizing policy >>> >>> Previously the calculation of the metadata capacity at which >>> to do a GC (high water mark, HWM) to recover >>> unloaded classes used the MinHeapFreeRatio >>> and MaxHeapFreeRatio to decide on the next HWM. That >>> generally left an excessive amount of unused capacity for >>> metadata. This change adds specific flags for metadata >>> capacity with defaults more conservative in terms of >>> unused capacity. >>> >>> Added an additional check for doing a GC before expanding >>> the metadata capacity. Required adding a new parameter to >>> get_new_chunk(). >>> >>> Added some additional diagnostic prints. >>> >>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ >>> >>> Thanks. >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Thu Jan 24 15:04:47 2013 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Thu, 24 Jan 2013 15:04:47 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 8004895: NPG: JMapPermCore test failure caused by warnings about missing field Message-ID: <20130124150508.160824751A@hg.openjdk.java.net> Changeset: 3c327c2b6782 Author: jmasa Date: 2013-01-03 15:03 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/3c327c2b6782 8004895: NPG: JMapPermCore test failure caused by warnings about missing field Reviewed-by: johnc ! src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp ! src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.hpp ! src/share/vm/gc_implementation/concurrentMarkSweep/vmStructs_cms.hpp ! src/share/vm/memory/binaryTreeDictionary.cpp ! src/share/vm/memory/binaryTreeDictionary.hpp ! src/share/vm/runtime/vmStructs.cpp From krystal.mo at oracle.com Thu Jan 24 06:33:18 2013 From: krystal.mo at oracle.com (Krystal Mo) Date: Thu, 24 Jan 2013 14:33:18 +0800 Subject: RFR: 8005278: Serviceability Agent: jmap -heap and jstack -m fail In-Reply-To: <5100D139.3040705@oracle.com> References: <5100D139.3040705@oracle.com> Message-ID: <5100D5AE.6010105@oracle.com> Yumin, The FIELDINFO_TAG_MASK part in InstanceKlass is already fixed in JDK-8006403. It should be sync'd to the dev repos soon (or has it already?) I fell in that trap of duplicating the fix as JDK-8006641 already... 
- Kris On 01/24/2013 02:14 PM, Yumin Qi wrote: > Hi, > > Can I have your comments on fix for > 8005278: Serviceability Agent: jmap -heap and jstack -m fail > > http://cr.openjdk.java.net/~minqi/8005278/ > > Problems: 1) In JVM, BinaryTreeDictionary is typedef'ed as > AFLBinaryTreeDictionary and this name carried to type library for SA. > In SA we still use olde name for that; 2) FreeList now is template > based which is not reflected in SA; 3) There is a misuse of > FIELFINFO_TAG_MASK(which is not in SA code), in SA code > FIELDINFO_TAG_SIZE was wrongly used as FIELDINFO_TAG_MASK and lead to > not able to find correct field info. > > Thanks > Yumin > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yumin.qi at oracle.com Thu Jan 24 16:07:41 2013 From: yumin.qi at oracle.com (Yumin Qi) Date: Thu, 24 Jan 2013 08:07:41 -0800 Subject: RFR: 8005278: Serviceability Agent: jmap -heap and jstack -m fail In-Reply-To: <5100D5AE.6010105@oracle.com> References: <5100D139.3040705@oracle.com> <5100D5AE.6010105@oracle.com> Message-ID: <51015C4D.2030504@oracle.com> I haven't seen this fix so will remove it from my diff. I cloned from hotspot-gc which has not had this fix yet. So I will send out another webrev based on hotspot-rt. Thanks Yumin On 1/23/2013 10:33 PM, Krystal Mo wrote: > Yumin, > > The FIELDINFO_TAG_MASK part in InstanceKlass is already fixed in > JDK-8006403. It should be sync'd to the dev repos soon (or has it > already?) > I fell in that trap of duplicating the fix as JDK-8006641 already... > > - Kris > > On 01/24/2013 02:14 PM, Yumin Qi wrote: >> Hi, >> >> Can I have your comments on fix for >> 8005278: Serviceability Agent: jmap -heap and jstack -m fail >> >> http://cr.openjdk.java.net/~minqi/8005278/ >> >> Problems: 1) In JVM, BinaryTreeDictionary is typedef'ed as >> AFLBinaryTreeDictionary and this name carried to type library for SA. >> In SA we still use olde name for that; 2) FreeList now is template >> based which is not reflected in SA; 3) There is a misuse of >> FIELFINFO_TAG_MASK(which is not in SA code), in SA code >> FIELDINFO_TAG_SIZE was wrongly used as FIELDINFO_TAG_MASK and lead to >> not able to find correct field info. >> >> Thanks >> Yumin >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Thu Jan 24 16:35:37 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 24 Jan 2013 08:35:37 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <510137EA.4050102@oracle.com> References: <50F424A0.6080907@oracle.com> <5100174F.7060301@oracle.com> <5100BABB.4040004@oracle.com> <510137EA.4050102@oracle.com> Message-ID: <510162D9.9040609@oracle.com> On 1/24/2013 5:32 AM, Stefan Karlsson wrote: > On 01/24/2013 05:38 AM, Jon Masamitsu wrote: >> Coleen, >> >> Thanks for the review. >> >> I delete the print at 1013 (instead of moving it) and reverted >> get_new_chunk(). I left in the >> print at 1014. >> >> I have 2 webrevs for these changes now. > > Thanks for splitting this into two changes. > >> Your suggested changes are in >> >> http://cr.openjdk.java.net/~jmasa/8006815/webrev.00/ > > I don't know if this is a reasonable change or not. > > Why are you checking if we should expand, before trying to allocate in > the current virtual space? > > 982 // The next attempts at allocating a chunk will expand the > 983 // Metaspace capacity. Check first if there should be an > expansion. 
> 984 if (!MetaspaceGC::should_expand(this, word_size, > grow_chunks_by_words)) { > 985 return next; > 986 } Do you mean that I should put the check should_expand() after line 991. That would be better. Or do you mean that the test should look at the current capacity (as it did before) and not at the capacity after the addition of the chunk (for comparing to the HWM)? > 987 > 988 // Allocate a chunk out of the current virtual space. > 989 if (next == NULL) { > 990 next = > current_virtual_space()->get_chunk_vs(grow_chunks_by_words); > 991 } > > Shouldn't this line be checking "less than or equal": > > *! if (_(_vsl->capacity_words_sum(_) + expansion_word_size_) < > metaspace_size_words ||* > capacity_until_GC() == 0) { > set_capacity_until_GC(metaspace_size_words); > return true; > } Yes. Fixed. > >> Webrev with the Min/MaxMetaspaceFreeRatio changes is >> >> http://cr.openjdk.java.net/~jmasa/8005452/webrev.02/ > > This looks good. > > Though, I think you need to update the descriptions of the new flags: > + product(uintx, MinMetaspaceFreeRatio, > 10, \ > + "Min percentage of heap free after GC to avoid > expansion") \ > + \ > + product(uintx, MaxMetaspaceFreeRatio, > 20, \ > + "Max percentage of heap free after GC to avoid > shrinking") \ Fixed. Jon > > thanks, > StefanK > >> >> Jon >> >> On 1/23/2013 9:01 AM, Coleen Phillimore wrote: >>> >>> It looks okay except I think passing SpaceManager* to >>> get_new_chunk() is really gross just to do a print out. I'd rather >>> see the printing at line 1014 moved to 2170 whether you get a new >>> virtual space or not. There's already a ton of output from >>> TraceMetadataChunkAllocation && Verbose, but you can leave the >>> printing at 1013 so you know which one created a new virtual space. >>> >>> Coleen >>> >>> >>> On 01/14/2013 10:30 AM, Jon Masamitsu wrote: >>>> 8005452: Create new flags for Metaspace resizing policy >>>> >>>> Previously the calculation of the metadata capacity at which >>>> to do a GC (high water mark, HWM) to recover >>>> unloaded classes used the MinHeapFreeRatio >>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>> generally left an excessive amount of unused capacity for >>>> metadata. This change adds specific flags for metadata >>>> capacity with defaults more conservative in terms of >>>> unused capacity. >>>> >>>> Added an additional check for doing a GC before expanding >>>> the metadata capacity. Required adding a new parameter to >>>> get_new_chunk(). >>>> >>>> Added some additional diagnostic prints. >>>> >>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ >>>> >>>> Thanks. >>> > > From jon.masamitsu at oracle.com Thu Jan 24 18:57:24 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 24 Jan 2013 10:57:24 -0800 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <51011221.8050102@oracle.com> References: <51011221.8050102@oracle.com> Message-ID: <51018414.1000103@oracle.com> Erik, I looked at the hotspot changes and they look correct. But I'm not sure that "sun.gc" should be in the name of the counter. Maybe use SUN_RT instead of SUN_GC. Jon On 1/24/2013 2:51 AM, Erik Helin wrote: > Hi all, > > here are the HotSpot changes for fixing JDK-8004172. This change uses > the new namespace "sun.gc.metaspace" for the metaspace counters and > also removes some code from metaspaceCounters.hpp/cpp that is not > needed any longer. > > Note that the tests will continue to fail until the JDK part of the > change finds it way into the hotspot-gc forest. 
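(Side note on the namespace question: SUN_GC versus SUN_RT is what decides whether the counter names start with "sun.gc." or "sun.rt.". A rough sketch of how one of the new counters would be registered — the counter name and initial value below are made up; only PerfDataManager::create_variable, PerfData::U_Bytes and the SUN_GC namespace constant are the existing jvmstat API.)

    EXCEPTION_MARK;
    // SUN_GC supplies the "sun.gc" prefix, so the full counter name here
    // becomes "sun.gc.metaspace.capacity".
    PerfDataManager::create_variable(SUN_GC, "metaspace.capacity",
                                     PerfData::U_Bytes,
                                     (jlong) 0 /* illustrative initial value */,
                                     CHECK);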
> > The JDK part of the change is also out for review on > serviceability-dev at openjdk.java.net. > > Webrev: > HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ > JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ > > Bug: > http://bugs.sun.com/view_bug.do?bug_id=8004172 > > Testing: > Run the jstat jtreg tests locally on my machine on a repository where > I've applied both the JDK changes and the HotSpot changes. > > Thanks, > Erik From john.cuthbertson at oracle.com Thu Jan 24 19:34:32 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 24 Jan 2013 11:34:32 -0800 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: References: <50F5E6BE.9040901@oracle.com> <51006965.6040105@oracle.com> <51008225.3060306@oracle.com> Message-ID: <51018CC8.4040904@oracle.com> Hi Vitaly. I'm not sure it's an issue. I don't recall seeing any unflushed writes from during the GC when the VM leaves the safepoint even on an RMO architecture (Itanium). JohnC On 1/23/2013 4:41 PM, Vitaly Davidovich wrote: > > Got it now - thanks. So then does exiting the safepoint guarantee > that these writes are flushed to memory so next time GC threads run > they see 0s? Or is that not important/enforced elsewhere? > > Sent from my phone > > On Jan 23, 2013 7:36 PM, "John Cuthbertson" > > wrote: > > Hi Vitalty, > > On 1/23/2013 4:19 PM, Vitaly Davidovich wrote: >> >> Hi John, >> >> Thanks for this explanation as well. I see what you're saying >> about the concurrency control, but what I don't understand is >> when this is called: >> >> void reset_hot_cache() { >> 107 _hot_cache_idx = 0; _n_hot = 0; >> 108 } >> >> Since these are plain stores, what exactly ensures that they're >> (promptly) visible to other GC threads? Is there some dependency >> here, e.g. if you see _n_hot = 0 then _hot_cache_idx must also be >> zero? I strongly suspect I missed the details in your response >> that explain why this isn't a concern. Is there only a >> particular type of thread that can call reset_hot_cache and/or >> only at a certain point? It kind of sounds like it so don't know >> if there's an assert that can be added to verify that. >> > > At the point where this routine is called the GC workers have > finished the parallel phases of the GC and are idle. The thread > that is running is the VM thread. The rest of the VM is at a > safepoint so we are, in effect, single threaded. Yes there is an > assert we can add here: > > assert(SafepointSynchronize::is_at_safepoint() && > Thread::current()->is_VM_thread(), "..."); > > JohnC > >> Thanks >> >> Sent from my phone >> >> On Jan 23, 2013 5:51 PM, "John Cuthbertson" >> > > wrote: >> >> Hi Vitaly, >> >> Thanks for looking over the code changes. I'll respond to >> your other comments in a separate email. Detailed responses >> inline.... >> >> On 1/15/2013 4:57 PM, Vitaly Davidovich wrote: >> >> >> Hi John, >> >> Wow, that's a giant heap! :) >> >> I think G1ConcRSLogCacheSize needs to be validated to >> make sure it's <= 31; otherwise, I think you get >> undefined behavior on left shifting with it. >> >> >> Good catch. Done. >> >> I don't think you need _def_use_cache -- can be replaced >> with G1ConcRSLogCacheSize > 0? >> >> >> Done. I've added a function that returns the result of the >> comparison and I use that in place of G1ConcRSLogCacheSize. >> >> I'm sure this is due to my lack of G1 knowledge, but the >> concurrency control inside g1HotCardCache is a bit >> unclear. 
There's a CAS to claim the region of cards, >> there's a HotCache lock for inserting a card. However, >> reset_hot_cache() does a naked write of a few fields. >> Are there any visibility and ordering constraints that >> need to be enforced? Do some of the stores need an >> OrderAccess barrier of some sort, depending on what's >> required? Sorry if I'm just missing it ... >> >> >> The drain routine is only called from within a GC pause but >> it is called by multiple GC worker threads. Each worker will >> claim a chunk of cards using the CAS and refine them. >> Resetting the boundaries (the values reset by >> reset_hot_cache()) in the drain routine would be a mistake >> since a worker thread could see the new boundary values and >> return, potentially leaving some cards unrefined and some >> missing entries in remembered sets. I can only clear the >> fields when the last thread has finished draining the cache. >> The best place to do this is just before the VM thread >> re-enables the cache (we know the worker threads will have >> finished at this point). Since the "drain" doesn't actually >> drain, perhaps a better name might be refine_all()? >> >> The HotCache lock is used when adding entries to the cache. >> Entries are added by the refinement threads (and there will >> most likely be more than one). Since the act of adding an >> entry can also evict an entry we need the lock to guard >> against hitting the ABA problem. This could result in >> skipping the refinement of a card, which will lead to missing >> remembered set entries which are not fun to track down. >> >> Draining during the GC is immune from the ABA problem because >> we're not actually removing entries from the cache. We would >> still be immune, however, if we were removing entries since >> we would not be adding entries at the same time. >> >> Thanks, >> >> JohnC >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamauchi at google.com Thu Jan 24 20:40:43 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Thu, 24 Jan 2013 12:40:43 -0800 Subject: Deallocating memory pages In-Reply-To: References: Message-ID: Hi Vitaly, Thanks for the feedback. It's a good point. I looked into this and a Linux kernel engineer that I know tells me that madvise(MADV_DONTNEED) won't return with error code EAGAIN in the Linux implementation though I don't think it's what the spec necessarily guarantees. I personally haven't seen it fail that way in my experience. That said, it'd be no problem to change it so it does retry several times in a loop for the extra defensiveness. Vitaly (or anyone else), do you have experience with this? Hiroshi On Fri, Jan 18, 2013 at 6:20 PM, Vitaly Davidovich wrote: > Hi Hiroshi, > > I'm not an official reviewer, but I wonder whether deallocate_pages_raw in > os_linux.cpp should handle an EAGAIN return value from madvise. > Specifically, should the code loop on EAGAIN and retry the syscall? Maybe > have some safety value there to stop looping if too many tries (or too much > time) are failing. > > Thanks > > Sent from my phone > On Jan 18, 2013 6:30 PM, "Hiroshi Yamauchi" wrote: > >> http://cr.openjdk.java.net/~hiroshi/webrevs/dhp/webrev.00/ >> >> Hi folks, >> >> I'd like to see if it makes sense to contribute this patch. 
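(For readers who have not opened the webrev, a minimal standalone sketch of the primitive the patch is built around — the helper name and rounding below are illustrative, not the webrev code; madvise(MADV_DONTNEED) is the Linux call it relies on. Only whole pages inside a free chunk can be released, which is why fragmentation matters, and the range stays mapped so the next touch is served with a fresh zero-filled page — the on-demand reallocation cost discussed later in the thread.)

    #include <stdint.h>
    #include <sys/mman.h>

    // Release the physical pages backing the free chunk [start, end) while
    // keeping the virtual address range mapped.
    static bool release_free_chunk(char* start, char* end, size_t page_size) {
      uintptr_t from = ((uintptr_t) start + page_size - 1) & ~(page_size - 1); // round up
      uintptr_t to   = ((uintptr_t) end) & ~(page_size - 1);                   // round down
      if (from >= to) {
        return true;   // chunk smaller than a page: nothing can be released
      }
      return ::madvise((void*) from, to - from, MADV_DONTNEED) == 0;
    }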
>> >> If it's enabled, it helps reduce the JVM memory/RAM footprint by >> deallocating (releasing) the underlying memory pages that correspond to the >> unused or free portions of the heap (more specifically, it calls >> madvise(MADV_DONTNEED) for the bodies of free chunks in the old generation >> without unmapping the heap address space). >> >> Though the worst-case memory footprint (that is, when the heap is full) >> does not change, this helps the JVM bring its RAM usage closer to what it >> actually is using at the moment (that is, occupied by objects) and Java >> applications behave more nicely in shared environments in which multiple >> servers or applications run. >> >> In fact, this has been very useful in certain servers and desktop tools >> that we have at Google and helped save a lot of RAM use. It tries to >> address the issue where a Java server or app runs for a while and almost >> never releases its RAM even when it is mostly idle. >> >> Of course, a higher degree of heap fragmentation deteriorates the utility >> of this because a free chunk smaller than a page cannot be deallocated, but >> it has the advantage of being able to work without shrinking the heap or >> the generation. >> >> Despite the fact that this can slow down things due to the on-demand page >> reallocation that happens when a deallocated page is first touched, the >> performance hit seems not bad. In my measurements, I see a ~1-3% overall >> overhead in an internal server test and a ~0-4% overall overhead in the >> DaCapo benchmarks. >> >> It supports the CMS collector and Linux only in the current form though >> it's probably possible to extend this to other collectors and platforms in >> the future. >> >> I thought this could be useful in wider audience. >> >> Chuck Rasbold has kindly reviewed this change. >> >> Thanks, >> Hiroshi >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Thu Jan 24 21:45:14 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Thu, 24 Jan 2013 22:45:14 +0100 Subject: RFR (S): JDK-8006432 - Ratio flags should be unsigned Message-ID: <5101AB6A.80205@oracle.com> Hi, I'm looking for a couple of reviews for this small change. Bug: JDK-8006432 - Ratio flags should be unsigned Webrev: http://cr.openjdk.java.net/~jwilhelm/8006432/webrev/ Summary: Four flags whose contents are assumed to be unsigned were stored in signed variables. I have changed these to be unsigned instead. Testing: Manual testing and JPRT. /Jesper From vitalyd at gmail.com Thu Jan 24 21:47:22 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 24 Jan 2013 16:47:22 -0500 Subject: Deallocating memory pages In-Reply-To: References: Message-ID: Hi Hiroshi, I don't have any experience with this and a quick Google didn't turn anything up. Browsing Linux kernel source (mm/madvise.c madvise_dontneed) doesn't show anything that would return it (just EINVAL in some cases). I was only going by the man page for madvise, but it's a bit general. I guess I'd ignore this and leave your code as-is since I doubt the behavior will change on Linux. Sorry for the noise. Thanks Sent from my phone On Jan 18, 2013 9:20 PM, "Vitaly Davidovich" wrote: > Hi Hiroshi, > > I'm not an official reviewer, but I wonder whether deallocate_pages_raw in > os_linux.cpp should handle an EAGAIN return value from madvise. > Specifically, should the code loop on EAGAIN and retry the syscall? 
Maybe > have some safety value there to stop looping if too many tries (or too much > time) are failing. > > Thanks > > Sent from my phone > On Jan 18, 2013 6:30 PM, "Hiroshi Yamauchi" wrote: > >> http://cr.openjdk.java.net/~hiroshi/webrevs/dhp/webrev.00/ >> >> Hi folks, >> >> I'd like to see if it makes sense to contribute this patch. >> >> If it's enabled, it helps reduce the JVM memory/RAM footprint by >> deallocating (releasing) the underlying memory pages that correspond to the >> unused or free portions of the heap (more specifically, it calls >> madvise(MADV_DONTNEED) for the bodies of free chunks in the old generation >> without unmapping the heap address space). >> >> Though the worst-case memory footprint (that is, when the heap is full) >> does not change, this helps the JVM bring its RAM usage closer to what it >> actually is using at the moment (that is, occupied by objects) and Java >> applications behave more nicely in shared environments in which multiple >> servers or applications run. >> >> In fact, this has been very useful in certain servers and desktop tools >> that we have at Google and helped save a lot of RAM use. It tries to >> address the issue where a Java server or app runs for a while and almost >> never releases its RAM even when it is mostly idle. >> >> Of course, a higher degree of heap fragmentation deteriorates the utility >> of this because a free chunk smaller than a page cannot be deallocated, but >> it has the advantage of being able to work without shrinking the heap or >> the generation. >> >> Despite the fact that this can slow down things due to the on-demand page >> reallocation that happens when a deallocated page is first touched, the >> performance hit seems not bad. In my measurements, I see a ~1-3% overall >> overhead in an internal server test and a ~0-4% overall overhead in the >> DaCapo benchmarks. >> >> It supports the CMS collector and Linux only in the current form though >> it's probably possible to extend this to other collectors and platforms in >> the future. >> >> I thought this could be useful in wider audience. >> >> Chuck Rasbold has kindly reviewed this change. >> >> Thanks, >> Hiroshi >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamauchi at google.com Thu Jan 24 22:24:40 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Thu, 24 Jan 2013 14:24:40 -0800 Subject: Deallocating memory pages In-Reply-To: References: Message-ID: Vitaly, thanks for confirming that with the kernel source. On Thu, Jan 24, 2013 at 1:47 PM, Vitaly Davidovich wrote: > Hi Hiroshi, > > I don't have any experience with this and a quick Google didn't turn > anything up. Browsing Linux kernel source (mm/madvise.c madvise_dontneed) > doesn't show anything that would return it (just EINVAL in some cases). I > was only going by the man page for madvise, but it's a bit general. I > guess I'd ignore this and leave your code as-is since I doubt the behavior > will change on Linux. Sorry for the noise. > > Thanks > > Sent from my phone > On Jan 18, 2013 9:20 PM, "Vitaly Davidovich" wrote: > >> Hi Hiroshi, >> >> I'm not an official reviewer, but I wonder whether deallocate_pages_raw >> in os_linux.cpp should handle an EAGAIN return value from madvise. >> Specifically, should the code loop on EAGAIN and retry the syscall? Maybe >> have some safety value there to stop looping if too many tries (or too much >> time) are failing. 
>> >> Thanks >> >> Sent from my phone >> On Jan 18, 2013 6:30 PM, "Hiroshi Yamauchi" wrote: >> >>> http://cr.openjdk.java.net/~hiroshi/webrevs/dhp/webrev.00/ >>> >>> Hi folks, >>> >>> I'd like to see if it makes sense to contribute this patch. >>> >>> If it's enabled, it helps reduce the JVM memory/RAM footprint by >>> deallocating (releasing) the underlying memory pages that correspond to the >>> unused or free portions of the heap (more specifically, it calls >>> madvise(MADV_DONTNEED) for the bodies of free chunks in the old generation >>> without unmapping the heap address space). >>> >>> Though the worst-case memory footprint (that is, when the heap is full) >>> does not change, this helps the JVM bring its RAM usage closer to what it >>> actually is using at the moment (that is, occupied by objects) and Java >>> applications behave more nicely in shared environments in which multiple >>> servers or applications run. >>> >>> In fact, this has been very useful in certain servers and desktop tools >>> that we have at Google and helped save a lot of RAM use. It tries to >>> address the issue where a Java server or app runs for a while and almost >>> never releases its RAM even when it is mostly idle. >>> >>> Of course, a higher degree of heap fragmentation deteriorates the >>> utility of this because a free chunk smaller than a page cannot be >>> deallocated, but it has the advantage of being able to work without >>> shrinking the heap or the generation. >>> >>> Despite the fact that this can slow down things due to the on-demand >>> page reallocation that happens when a deallocated page is first touched, >>> the performance hit seems not bad. In my measurements, I see a ~1-3% >>> overall overhead in an internal server test and a ~0-4% overall overhead in >>> the DaCapo benchmarks. >>> >>> It supports the CMS collector and Linux only in the current form though >>> it's probably possible to extend this to other collectors and platforms in >>> the future. >>> >>> I thought this could be useful in wider audience. >>> >>> Chuck Rasbold has kindly reviewed this change. >>> >>> Thanks, >>> Hiroshi >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Thu Jan 24 23:01:54 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 24 Jan 2013 15:01:54 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output Message-ID: <5101BD62.4030504@oracle.com> Hi All, Can I have a couple of volunteers look over this small change? The webrev can be found at: http://cr.openjdk.java.net/~johnc/8006894/webrev.0/ Summary: When G1 calculates the number of marking threads based upon (the develop-only) G1MarkingOverheadPercent or (more usually) ParallelGCThreads, we weren't setting the value of ConcGCThreads. As a result the output of PrintFlagsFinal would always show a zero if ConcGCThreads wasn't specified on the command line: uintx ConcGCThreads = 0 {product} This made it difficult for the performance team to analyze marking behavior and offer advice. With this change we now get the calculated number of marking threads: Using ParallelGCThreads (default: 4): uintx ConcGCThreads := 1 {product} Using G1MarkingOverheadPercent (50): uintx ConcGCThreads := 2 {product} Testing: Command line testing; specjvm98 and dacapo with a low IHOP value (marking threshold). 
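(Not from the webrev, just to make the shape of the fix concrete: after the marking thread count has been derived from ParallelGCThreads, it gets written back into ConcGCThreads ergonomically so PrintFlagsFinal reports the derived ":=" value instead of the untouched default. The scaling formula below is an assumption; FLAG_IS_DEFAULT, FLAG_SET_ERGO and MAX2 are the real flag macros.)

    if (FLAG_IS_DEFAULT(ConcGCThreads) || ConcGCThreads == 0) {
      // Roughly one marking thread per four parallel GC threads (scaling assumed).
      uintx marking_threads = MAX2((uintx) ((ParallelGCThreads + 2) / 4), (uintx) 1);
      FLAG_SET_ERGO(uintx, ConcGCThreads, marking_threads);
    }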
Thanks, JohnC From jon.masamitsu at oracle.com Thu Jan 24 23:45:35 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 24 Jan 2013 15:45:35 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output In-Reply-To: <5101BD62.4030504@oracle.com> References: <5101BD62.4030504@oracle.com> Message-ID: <5101C79F.6050801@oracle.com> John, Change looks good. It wasn't obvious to me that the default for ConcGCThreads is 0 so I had to look at the "if (ConcGCThreads > 0)" a couple of times before I understood that it was really "if (ConcGCThreads-is-user-specified-and-greater-than-0)" Would you consider changing to if (!FLAG_IS_DEFAULT(ConcGCThreads) && ConcGCThreads > 0) It's more wordy and doesn't really change anything but I think it's more readable. Jon On 1/24/2013 3:01 PM, John Cuthbertson wrote: > Hi All, > > Can I have a couple of volunteers look over this small change? The > webrev can be found at: > http://cr.openjdk.java.net/~johnc/8006894/webrev.0/ > > Summary: > When G1 calculates the number of marking threads based upon (the > develop-only) G1MarkingOverheadPercent or (more usually) > ParallelGCThreads, we weren't setting the value of ConcGCThreads. As a > result the output of PrintFlagsFinal would always show a zero if > ConcGCThreads wasn't specified on the command line: > > uintx ConcGCThreads = 0 > {product} > > This made it difficult for the performance team to analyze marking > behavior and offer advice. With this change we now get the calculated > number of marking threads: > > Using ParallelGCThreads (default: 4): > > uintx ConcGCThreads := 1 > {product} > > Using G1MarkingOverheadPercent (50): > > uintx ConcGCThreads := 2 > {product} > > Testing: > Command line testing; specjvm98 and dacapo with a low IHOP value > (marking threshold). > > Thanks, > > JohnC From jon.masamitsu at oracle.com Fri Jan 25 00:00:55 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 24 Jan 2013 16:00:55 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output In-Reply-To: <5101C79F.6050801@oracle.com> References: <5101BD62.4030504@oracle.com> <5101C79F.6050801@oracle.com> Message-ID: <5101CB37.9090102@oracle.com> John, My comment about adding the FLAG_IS_DEFAULT to the test is a little crazy. Unless someone else thinks it has value, then it's brilliant :-). Ignore it unless you get a second yea on it. Jon On 1/24/2013 3:45 PM, Jon Masamitsu wrote: > John, > > Change looks good. > > It wasn't obvious to me that the default for ConcGCThreads is 0 so > I had to look at the "if (ConcGCThreads > 0)" a couple of times > before I understood that it was really > "if (ConcGCThreads-is-user-specified-and-greater-than-0)" > > Would you consider changing to > > if (!FLAG_IS_DEFAULT(ConcGCThreads) && ConcGCThreads > 0) > > It's more wordy and doesn't really change anything but I think > it's more readable. > > Jon > > > > On 1/24/2013 3:01 PM, John Cuthbertson wrote: >> Hi All, >> >> Can I have a couple of volunteers look over this small change? The >> webrev can be found at: >> http://cr.openjdk.java.net/~johnc/8006894/webrev.0/ >> >> Summary: >> When G1 calculates the number of marking threads based upon (the >> develop-only) G1MarkingOverheadPercent or (more usually) >> ParallelGCThreads, we weren't setting the value of ConcGCThreads. 
As >> a result the output of PrintFlagsFinal would always show a zero if >> ConcGCThreads wasn't specified on the command line: >> >> uintx ConcGCThreads = 0 >> {product} >> >> This made it difficult for the performance team to analyze marking >> behavior and offer advice. With this change we now get the calculated >> number of marking threads: >> >> Using ParallelGCThreads (default: 4): >> >> uintx ConcGCThreads := 1 >> {product} >> >> Using G1MarkingOverheadPercent (50): >> >> uintx ConcGCThreads := 2 >> {product} >> >> Testing: >> Command line testing; specjvm98 and dacapo with a low IHOP value >> (marking threshold). >> >> Thanks, >> >> JohnC From john.cuthbertson at oracle.com Fri Jan 25 00:08:34 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 24 Jan 2013 16:08:34 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output In-Reply-To: <5101C79F.6050801@oracle.com> References: <5101BD62.4030504@oracle.com> <5101C79F.6050801@oracle.com> Message-ID: <5101CD02.8020906@oracle.com> Hi Jon, Thanks for the review. I'll make the change you suggested. I was originally toying with adding an assert: assert(FLAG_IS_CMDLINE(...), "..") to capture that a non-zero value had to have been set by the user but decided against it. Thanks, JohnC On 1/24/2013 3:45 PM, Jon Masamitsu wrote: > John, > > Change looks good. > > It wasn't obvious to me that the default for ConcGCThreads is 0 so > I had to look at the "if (ConcGCThreads > 0)" a couple of times > before I understood that it was really > "if (ConcGCThreads-is-user-specified-and-greater-than-0)" > > Would you consider changing to > > if (!FLAG_IS_DEFAULT(ConcGCThreads) && ConcGCThreads > 0) > > It's more wordy and doesn't really change anything but I think > it's more readable. > > Jon > > > > On 1/24/2013 3:01 PM, John Cuthbertson wrote: >> Hi All, >> >> Can I have a couple of volunteers look over this small change? The >> webrev can be found at: >> http://cr.openjdk.java.net/~johnc/8006894/webrev.0/ >> >> Summary: >> When G1 calculates the number of marking threads based upon (the >> develop-only) G1MarkingOverheadPercent or (more usually) >> ParallelGCThreads, we weren't setting the value of ConcGCThreads. As >> a result the output of PrintFlagsFinal would always show a zero if >> ConcGCThreads wasn't specified on the command line: >> >> uintx ConcGCThreads = 0 >> {product} >> >> This made it difficult for the performance team to analyze marking >> behavior and offer advice. With this change we now get the calculated >> number of marking threads: >> >> Using ParallelGCThreads (default: 4): >> >> uintx ConcGCThreads := 1 >> {product} >> >> Using G1MarkingOverheadPercent (50): >> >> uintx ConcGCThreads := 2 >> {product} >> >> Testing: >> Command line testing; specjvm98 and dacapo with a low IHOP value >> (marking threshold). >> >> Thanks, >> >> JohnC From john.cuthbertson at oracle.com Fri Jan 25 00:27:03 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 24 Jan 2013 16:27:03 -0800 Subject: RFR (S): JDK-8006432 - Ratio flags should be unsigned In-Reply-To: <5101AB6A.80205@oracle.com> References: <5101AB6A.80205@oracle.com> Message-ID: <5101D157.1030903@oracle.com> Hi Jesper, Looks good to me. Can you also remove the unused G1InitYoungSurvRatio from g1_globals.hpp? Thanks, JohnC On 1/24/2013 1:45 PM, Jesper Wilhelmsson wrote: > Hi, > > I'm looking for a couple of reviews for this small change. 
> > Bug: JDK-8006432 - Ratio flags should be unsigned > > Webrev: > http://cr.openjdk.java.net/~jwilhelm/8006432/webrev/ > > Summary: > Four flags whose contents are assumed to be unsigned were stored in > signed variables. I have changed these to be unsigned instead. > > Testing: > Manual testing and JPRT. > > /Jesper From ysr1729 at gmail.com Fri Jan 25 01:29:55 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 24 Jan 2013 17:29:55 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output In-Reply-To: <5101BD62.4030504@oracle.com> References: <5101BD62.4030504@oracle.com> Message-ID: Looks good to me too. (Just out of curiosity, what happens with CMS, is it correctly reported/set, or does it have the same issue -- i am not suggesting fixing it given the EOL plans for CMS; just wondered. Hmm, I think in CMS we directly use the flag variable, so should probably report fine.) -- ramki On Thu, Jan 24, 2013 at 3:01 PM, John Cuthbertson < john.cuthbertson at oracle.com> wrote: > Hi All, > > Can I have a couple of volunteers look over this small change? The webrev > can be found at: http://cr.openjdk.java.net/~**johnc/8006894/webrev.0/ > > Summary: > When G1 calculates the number of marking threads based upon (the > develop-only) G1MarkingOverheadPercent or (more usually) ParallelGCThreads, > we weren't setting the value of ConcGCThreads. As a result the output of > PrintFlagsFinal would always show a zero if ConcGCThreads wasn't specified > on the command line: > > uintx ConcGCThreads = 0 > {product} > > This made it difficult for the performance team to analyze marking > behavior and offer advice. With this change we now get the calculated > number of marking threads: > > Using ParallelGCThreads (default: 4): > > uintx ConcGCThreads := 1 > {product} > > Using G1MarkingOverheadPercent (50): > > uintx ConcGCThreads := 2 > {product} > > Testing: > Command line testing; specjvm98 and dacapo with a low IHOP value (marking > threshold). > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Fri Jan 25 01:33:49 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 24 Jan 2013 17:33:49 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output In-Reply-To: References: <5101BD62.4030504@oracle.com> Message-ID: <5101E0FD.1030700@oracle.com> Hi Ramki, Thanks for looking at the change. I'll generate the output for CMS later tonight. From conversations with the perf team I believe CMS is OK in this regard. JohnC On 1/24/2013 5:29 PM, Srinivas Ramakrishna wrote: > Looks good to me too. (Just out of curiosity, what happens with CMS, > is it correctly reported/set, or does it have the same issue -- i am > not suggesting fixing it given the EOL plans for CMS; just wondered. > Hmm, I think in CMS we directly use the flag variable, so should > probably report fine.) > > -- ramki > > On Thu, Jan 24, 2013 at 3:01 PM, John Cuthbertson > > wrote: > > Hi All, > > Can I have a couple of volunteers look over this small change? The > webrev can be found at: > http://cr.openjdk.java.net/~johnc/8006894/webrev.0/ > > > Summary: > When G1 calculates the number of marking threads based upon (the > develop-only) G1MarkingOverheadPercent or (more usually) > ParallelGCThreads, we weren't setting the value of ConcGCThreads. 
> As a result the output of PrintFlagsFinal would always show a zero > if ConcGCThreads wasn't specified on the command line: > > uintx ConcGCThreads = 0 > {product} > > This made it difficult for the performance team to analyze marking > behavior and offer advice. With this change we now get the > calculated number of marking threads: > > Using ParallelGCThreads (default: 4): > > uintx ConcGCThreads := 1 > {product} > > Using G1MarkingOverheadPercent (50): > > uintx ConcGCThreads := 2 > {product} > > Testing: > Command line testing; specjvm98 and dacapo with a low IHOP value > (marking threshold). > > Thanks, > > JohnC > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Fri Jan 25 05:37:50 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 24 Jan 2013 21:37:50 -0800 Subject: Deallocating memory pages In-Reply-To: References: Message-ID: <51021A2E.9050503@oracle.com> Hiroshi, This is a nice feature but I'm asking myself what type of applications would see significant footprint reductions. Would they be 1) applications whose heap usage varies significantly over time so that there are periods when some large fraction of the heap is unused and 2) applications with many objects larger than a page so that freeing those objects could free memory. Is that a good guess or is it more general than that? Jon On 1/18/2013 3:29 PM, Hiroshi Yamauchi wrote: > http://cr.openjdk.java.net/~hiroshi/webrevs/dhp/webrev.00/ > > Hi folks, > > I'd like to see if it makes sense to contribute this patch. > > If it's enabled, it helps reduce the JVM memory/RAM footprint by > deallocating (releasing) the underlying memory pages that correspond to the > unused or free portions of the heap (more specifically, it calls > madvise(MADV_DONTNEED) for the bodies of free chunks in the old generation > without unmapping the heap address space). > > Though the worst-case memory footprint (that is, when the heap is full) > does not change, this helps the JVM bring its RAM usage closer to what it > actually is using at the moment (that is, occupied by objects) and Java > applications behave more nicely in shared environments in which multiple > servers or applications run. > > In fact, this has been very useful in certain servers and desktop tools > that we have at Google and helped save a lot of RAM use. It tries to > address the issue where a Java server or app runs for a while and almost > never releases its RAM even when it is mostly idle. > > Of course, a higher degree of heap fragmentation deteriorates the utility > of this because a free chunk smaller than a page cannot be deallocated, but > it has the advantage of being able to work without shrinking the heap or > the generation. > > Despite the fact that this can slow down things due to the on-demand page > reallocation that happens when a deallocated page is first touched, the > performance hit seems not bad. In my measurements, I see a ~1-3% overall > overhead in an internal server test and a ~0-4% overall overhead in the > DaCapo benchmarks. > > It supports the CMS collector and Linux only in the current form though > it's probably possible to extend this to other collectors and platforms in > the future. > > I thought this could be useful in wider audience. > > Chuck Rasbold has kindly reviewed this change. 
> > Thanks, > Hiroshi > From stefan.karlsson at oracle.com Fri Jan 25 09:37:09 2013 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 25 Jan 2013 10:37:09 +0100 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <510162D9.9040609@oracle.com> References: <50F424A0.6080907@oracle.com> <5100174F.7060301@oracle.com> <5100BABB.4040004@oracle.com> <510137EA.4050102@oracle.com> <510162D9.9040609@oracle.com> Message-ID: <51025245.6050901@oracle.com> On 01/24/2013 05:35 PM, Jon Masamitsu wrote: > > > On 1/24/2013 5:32 AM, Stefan Karlsson wrote: >> On 01/24/2013 05:38 AM, Jon Masamitsu wrote: >>> Coleen, >>> >>> Thanks for the review. >>> >>> I delete the print at 1013 (instead of moving it) and reverted >>> get_new_chunk(). I left in the >>> print at 1014. >>> >>> I have 2 webrevs for these changes now. >> >> Thanks for splitting this into two changes. >> >>> Your suggested changes are in >>> >>> http://cr.openjdk.java.net/~jmasa/8006815/webrev.00/ >> >> I don't know if this is a reasonable change or not. >> >> Why are you checking if we should expand, before trying to allocate >> in the current virtual space? >> >> 982 // The next attempts at allocating a chunk will expand the >> 983 // Metaspace capacity. Check first if there should be an >> expansion. >> 984 if (!MetaspaceGC::should_expand(this, word_size, >> grow_chunks_by_words)) { >> 985 return next; >> 986 } > > Do you mean that I should put the check should_expand() after > line 991. That would be better. Yes. It seemed like you could return NULL without using all memory in the VirtualSpace. StefanK > > Or do you mean that the test should look at the current capacity (as it > did before) and not at the capacity after the addition of the chunk (for > comparing to the HWM)? > >> 987 >> 988 // Allocate a chunk out of the current virtual space. >> 989 if (next == NULL) { >> 990 next = >> current_virtual_space()->get_chunk_vs(grow_chunks_by_words); >> 991 } >> >> Shouldn't this line be checking "less than or equal": >> >> *! if (_(_vsl->capacity_words_sum(_) + expansion_word_size_) < >> metaspace_size_words ||* >> capacity_until_GC() == 0) { >> set_capacity_until_GC(metaspace_size_words); >> return true; >> } > > Yes. Fixed. > >> >>> Webrev with the Min/MaxMetaspaceFreeRatio changes is >>> >>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.02/ >> >> This looks good. >> >> Though, I think you need to update the descriptions of the new flags: >> + product(uintx, MinMetaspaceFreeRatio, >> 10, \ >> + "Min percentage of heap free after GC to avoid >> expansion") \ >> + \ >> + product(uintx, MaxMetaspaceFreeRatio, >> 20, \ >> + "Max percentage of heap free after GC to avoid >> shrinking") \ > > Fixed. > > Jon >> >> thanks, >> StefanK >> >>> >>> Jon >>> >>> On 1/23/2013 9:01 AM, Coleen Phillimore wrote: >>>> >>>> It looks okay except I think passing SpaceManager* to >>>> get_new_chunk() is really gross just to do a print out. I'd rather >>>> see the printing at line 1014 moved to 2170 whether you get a new >>>> virtual space or not. There's already a ton of output from >>>> TraceMetadataChunkAllocation && Verbose, but you can leave the >>>> printing at 1013 so you know which one created a new virtual space. 
>>>> >>>> Coleen >>>> >>>> >>>> On 01/14/2013 10:30 AM, Jon Masamitsu wrote: >>>>> 8005452: Create new flags for Metaspace resizing policy >>>>> >>>>> Previously the calculation of the metadata capacity at which >>>>> to do a GC (high water mark, HWM) to recover >>>>> unloaded classes used the MinHeapFreeRatio >>>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>>> generally left an excessive amount of unused capacity for >>>>> metadata. This change adds specific flags for metadata >>>>> capacity with defaults more conservative in terms of >>>>> unused capacity. >>>>> >>>>> Added an additional check for doing a GC before expanding >>>>> the metadata capacity. Required adding a new parameter to >>>>> get_new_chunk(). >>>>> >>>>> Added some additional diagnostic prints. >>>>> >>>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ >>>>> >>>>> Thanks. >>>> >> >> From jon.masamitsu at oracle.com Fri Jan 25 16:14:00 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 25 Jan 2013 08:14:00 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <51025245.6050901@oracle.com> References: <50F424A0.6080907@oracle.com> <5100174F.7060301@oracle.com> <5100BABB.4040004@oracle.com> <510137EA.4050102@oracle.com> <510162D9.9040609@oracle.com> <51025245.6050901@oracle.com> Message-ID: <5102AF48.2090804@oracle.com> On 1/25/2013 1:37 AM, Stefan Karlsson wrote: > ... >> Do you mean that I should put the check should_expand() after >> line 991. That would be better. > > > Yes. It seemed like you could return NULL without using all memory in > the VirtualSpace. Fixed. Jon > > StefanK > >> >> Or do you mean that the test should look at the current capacity (as it >> did before) and not at the capacity after the addition of the chunk (for >> comparing to the HWM)? >> >>> 987 >>> 988 // Allocate a chunk out of the current virtual space. >>> 989 if (next == NULL) { >>> 990 next = >>> current_virtual_space()->get_chunk_vs(grow_chunks_by_words); >>> 991 } >>> >>> Shouldn't this line be checking "less than or equal": >>> >>> *! if (_(_vsl->capacity_words_sum(_) + expansion_word_size_) < >>> metaspace_size_words ||* >>> capacity_until_GC() == 0) { >>> set_capacity_until_GC(metaspace_size_words); >>> return true; >>> } >> >> Yes. Fixed. >> >>> >>>> Webrev with the Min/MaxMetaspaceFreeRatio changes is >>>> >>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.02/ >>> >>> This looks good. >>> >>> Though, I think you need to update the descriptions of the new flags: >>> + product(uintx, MinMetaspaceFreeRatio, >>> 10, \ >>> + "Min percentage of heap free after GC to avoid >>> expansion") \ >>> + \ >>> + product(uintx, MaxMetaspaceFreeRatio, >>> 20, \ >>> + "Max percentage of heap free after GC to avoid >>> shrinking") \ >> >> Fixed. >> >> Jon >>> >>> thanks, >>> StefanK >>> >>>> >>>> Jon >>>> >>>> On 1/23/2013 9:01 AM, Coleen Phillimore wrote: >>>>> >>>>> It looks okay except I think passing SpaceManager* to >>>>> get_new_chunk() is really gross just to do a print out. I'd >>>>> rather see the printing at line 1014 moved to 2170 whether you get >>>>> a new virtual space or not. There's already a ton of output from >>>>> TraceMetadataChunkAllocation && Verbose, but you can leave the >>>>> printing at 1013 so you know which one created a new virtual space. 
>>>>> >>>>> Coleen >>>>> >>>>> >>>>> On 01/14/2013 10:30 AM, Jon Masamitsu wrote: >>>>>> 8005452: Create new flags for Metaspace resizing policy >>>>>> >>>>>> Previously the calculation of the metadata capacity at which >>>>>> to do a GC (high water mark, HWM) to recover >>>>>> unloaded classes used the MinHeapFreeRatio >>>>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>>>> generally left an excessive amount of unused capacity for >>>>>> metadata. This change adds specific flags for metadata >>>>>> capacity with defaults more conservative in terms of >>>>>> unused capacity. >>>>>> >>>>>> Added an additional check for doing a GC before expanding >>>>>> the metadata capacity. Required adding a new parameter to >>>>>> get_new_chunk(). >>>>>> >>>>>> Added some additional diagnostic prints. >>>>>> >>>>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.00/ >>>>>> >>>>>> Thanks. >>>>> >>> >>> > From jesper.wilhelmsson at oracle.com Fri Jan 25 16:24:49 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 25 Jan 2013 17:24:49 +0100 Subject: RFR (S): JDK-8006432 - Ratio flags should be unsigned In-Reply-To: <5101D157.1030903@oracle.com> References: <5101AB6A.80205@oracle.com> <5101D157.1030903@oracle.com> Message-ID: <5102B1D1.3020509@oracle.com> Hi John, Thanks for the review! I have removed G1InitYoungSurvRatio and updated the webrev and the bug to reflect this. /Jesper On 2013-01-25 01:27, John Cuthbertson wrote: > Hi Jesper, > > Looks good to me. > > Can you also remove the unused G1InitYoungSurvRatio from g1_globals.hpp? > > Thanks, > > JohnC > > On 1/24/2013 1:45 PM, Jesper Wilhelmsson wrote: >> Hi, >> >> I'm looking for a couple of reviews for this small change. >> >> Bug: JDK-8006432 - Ratio flags should be unsigned >> >> Webrev: >> http://cr.openjdk.java.net/~jwilhelm/8006432/webrev/ >> >> Summary: >> Four flags whose contents are assumed to be unsigned were stored in signed >> variables. I have changed these to be unsigned instead. >> >> Testing: >> Manual testing and JPRT. >> >> /Jesper > -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available URL: From john.cuthbertson at oracle.com Fri Jan 25 20:46:37 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 25 Jan 2013 12:46:37 -0800 Subject: RFR(XXS): 8006894: G1: Number of marking threads missing from PrintFlagsFinal output In-Reply-To: <5101E0FD.1030700@oracle.com> References: <5101BD62.4030504@oracle.com> <5101E0FD.1030700@oracle.com> Message-ID: <5102EF2D.9020102@oracle.com> Hi Ramki, CMS does do the correct thing. Here's the code: // Support for multi-threaded concurrent phases if (CMSConcurrentMTEnabled) { if (FLAG_IS_DEFAULT(ConcGCThreads)) { // just for now FLAG_SET_DEFAULT(ConcGCThreads, (ParallelGCThreads + 3)/4); } if (ConcGCThreads > 1) { _conc_workers = new YieldingFlexibleWorkGang("Parallel CMS Threads", ConcGCThreads, true); if (_conc_workers == NULL) { warning("GC/CMS: _conc_workers allocation failure: " "forcing -CMSConcurrentMTEnabled"); CMSConcurrentMTEnabled = false; } else { _conc_workers->initialize_workers(); } } else { CMSConcurrentMTEnabled = false; } } I guess an argument could be made that it should only set ConcGCThreads if the value is 0. If a non-zero default was given then we override that default. Also I think we should set ConcGCThreads using FLAG_SET_ERGO. 
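(Untested sketch of that suggestion against the snippet above — just the shape of the change: FLAG_SET_ERGO records the value as ergonomically chosen, so PrintFlagsFinal prints ":=" for it instead of presenting a derived value as the build default.)

    if (CMSConcurrentMTEnabled) {
      if (FLAG_IS_DEFAULT(ConcGCThreads)) {
        FLAG_SET_ERGO(uintx, ConcGCThreads, (ParallelGCThreads + 3) / 4);
      }
      if (ConcGCThreads > 1) {
        // ... create the YieldingFlexibleWorkGang as before ...
      } else {
        FLAG_SET_ERGO(bool, CMSConcurrentMTEnabled, false);
      }
    }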
Using FLAG_SET_ERGO when we're ergonomically turning off CMSConcurrentMTEnabled (because the # of threads is 1 or the allocation of work gang fails). But the main answer is I see a non-zero value for ConcGCThreads in the PrintFlagsFinal output: uintx ConcGCThreads = 2 {product} JohnC On 1/24/2013 5:33 PM, John Cuthbertson wrote: > Hi Ramki, > > Thanks for looking at the change. I'll generate the output for CMS > later tonight. From conversations with the perf team I believe CMS is > OK in this regard. > > JohnC > > On 1/24/2013 5:29 PM, Srinivas Ramakrishna wrote: >> Looks good to me too. (Just out of curiosity, what happens with CMS, >> is it correctly reported/set, or does it have the same issue -- i am >> not suggesting fixing it given the EOL plans for CMS; just wondered. >> Hmm, I think in CMS we directly use the flag variable, so should >> probably report fine.) >> >> -- ramki >> >> On Thu, Jan 24, 2013 at 3:01 PM, John Cuthbertson >> > wrote: >> >> Hi All, >> >> Can I have a couple of volunteers look over this small change? >> The webrev can be found at: >> http://cr.openjdk.java.net/~johnc/8006894/webrev.0/ >> >> >> Summary: >> When G1 calculates the number of marking threads based upon (the >> develop-only) G1MarkingOverheadPercent or (more usually) >> ParallelGCThreads, we weren't setting the value of ConcGCThreads. >> As a result the output of PrintFlagsFinal would always show a >> zero if ConcGCThreads wasn't specified on the command line: >> >> uintx ConcGCThreads = 0 >> {product} >> >> This made it difficult for the performance team to analyze >> marking behavior and offer advice. With this change we now get >> the calculated number of marking threads: >> >> Using ParallelGCThreads (default: 4): >> >> uintx ConcGCThreads := 1 >> {product} >> >> Using G1MarkingOverheadPercent (50): >> >> uintx ConcGCThreads := 2 >> {product} >> >> Testing: >> Command line testing; specjvm98 and dacapo with a low IHOP value >> (marking threshold). >> >> Thanks, >> >> JohnC >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Fri Jan 25 21:13:27 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 25 Jan 2013 13:13:27 -0800 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: References: <508EB0D7.8020204@oracle.com> <50C108C2.9@oracle.com> <50EF1A85.4010203@oracle.com> <50EF228C.3030009@oracle.com> Message-ID: <5102F577.6000706@oracle.com> Hi Michal, The patch is applied. The new webrev can be found at http://cr.openjdk.java.net/~johnc/7189971/webrev.2/ Thanks, JohnC On 1/17/2013 2:58 AM, Michal Frajt wrote: > Hi John, > > Please apply the attached patch to the webrev. You are right, the setting of the CMS token has been somehow moved back above the method return. Additionally I have fixed the printf of the unsigned loop counter (correct is %u). > > Regards, > Michal > > Od: hotspot-gc-dev-bounces at openjdk.java.net > Komu: hotspot-gc-dev at openjdk.java.net > Kopie: > Datum: Thu, 10 Jan 2013 12:20:28 -0800 > P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS > > >> Hi Michal, >> >> On 1/10/2013 11:46 AM, John Cuthbertson wrote: >>> Hi Michal, >>> >>> Many apologies for the delay in generating a new webrev for this >>> change but here is the new one: >>> http://cr.openjdk.java.net/~johnc/7189971/webrev.1/ >>> >>> Can you verify the webrev to make sure that changes have been applied >>> correctly? 
Looking at the new webrev it seems that the setting of the >>> CMS has been moved back above the return out of the loop. Was this >>> intentional? >> The above should be "... setting of the CMS token has been ...". >> >> JohnC >> >>> I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 >>> and CMSWaitDuration=1500 with CMS. >>> >>> Regards, >>> >>> JohnC >>> >>> On 12/12/2012 4:35 AM, Michal Frajt wrote: >>>> All, >>>> Find the attached patch. It implements proposed recommendations and >>>> requested changes. Please mind that the CMSWaitDuration set to -1 >>>> (never wait) requires new parameter CMSCheckInterval (develop only, >>>> 1000 milliseconds default - constant). The parameter defines the >>>> next CMS cycle start check interval in the case there are no >>>> desynchronization (notifications) events on the CGC_lock. >>>> >>>> Tested with the Solaris/amd64 build >>>> CMS >>>> + CMSWaitDuration>0 OK >>>> + CMSWaitDuration=0 OK >>>> + CMSWaitDuration<0 OK >>>> iCMS >>>> + CMSWaitDuration>0 OK >>>> + CMSWaitDuration=0 OK >>>> + CMSWaitDuration<0 OK >>>> Regards, >>>> Michal >>>> Od: hotspot-gc-dev-bounces at openjdk.java.net >>>> Komu: hotspot-gc-dev at openjdk.java.net >>>> Kopie: >>>> Datum: Fri, 7 Dec 2012 18:48:48 +0100 >>>> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for >>>> non-incremental mode of CMS >>>> >>>>> Hi John/Jon/Ramki, >>>>> >>>>> All proposed recommendations and requested changes have been >>>>> implemented. We are going to test it on Monday. You will get the new >>>>> tested patch soon. >>>>> >>>>> The attached code here just got compiled, no test executed yet, it >>>>> might contain a bug, but you can quickly review it and send your >>>>> comments. >>>>> >>>>> Best regards >>>>> Michal >>>>> >>>>> >>>>> // Wait until the next synchronous GC, a concurrent full gc request, >>>>> // or a timeout, whichever is earlier. 
>>>>> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long >>>>> t_millis) { >>>>> // Wait time in millis or 0 value representing infinite wait for >>>>> a scavenge >>>>> assert(t_millis >= 0, "Wait time for scavenge should be 0 or >>>>> positive"); >>>>> >>>>> GenCollectedHeap* gch = GenCollectedHeap::heap(); >>>>> double start_time_secs = os::elapsedTime(); >>>>> double end_time_secs = start_time_secs + (t_millis / ((double) >>>>> MILLIUNITS)); >>>>> >>>>> // Total collections count before waiting loop >>>>> unsigned int before_count; >>>>> { >>>>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >>>>> before_count = gch->total_collections(); >>>>> } >>>>> >>>>> unsigned int loop_count = 0; >>>>> >>>>> while(!_should_terminate) { >>>>> double now_time = os::elapsedTime(); >>>>> long wait_time_millis; >>>>> >>>>> if(t_millis != 0) { >>>>> // New wait limit >>>>> wait_time_millis = (long) ((end_time_secs - now_time) * >>>>> MILLIUNITS); >>>>> if(wait_time_millis <= 0) { >>>>> // Wait time is over >>>>> break; >>>>> } >>>>> } else { >>>>> // No wait limit, wait if necessary forever >>>>> wait_time_millis = 0; >>>>> } >>>>> >>>>> // Wait until the next event or the remaining timeout >>>>> { >>>>> MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag); >>>>> >>>>> set_CMS_flag(CMS_cms_wants_token); // to provoke notifies >>>>> if (_should_terminate || _collector->_full_gc_requested) { >>>>> return; >>>>> } >>>>> assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); >>>>> CGC_lock->wait(Mutex::_no_safepoint_check_flag, >>>>> wait_time_millis); >>>>> clear_CMS_flag(CMS_cms_wants_token); >>>>> assert(!CMS_flag_is_set(CMS_cms_has_token | >>>>> CMS_cms_wants_token), >>>>> "Should not be set"); >>>>> } >>>>> >>>>> // Extra wait time check before entering the heap lock to get >>>>> the collection count >>>>> if(t_millis != 0 && os::elapsedTime() >= end_time_secs) { >>>>> // Wait time is over >>>>> break; >>>>> } >>>>> >>>>> // Total collections count after the event >>>>> unsigned int after_count; >>>>> { >>>>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); >>>>> after_count = gch->total_collections(); >>>>> } >>>>> >>>>> if(before_count != after_count) { >>>>> // There was a collection - success >>>>> break; >>>>> } >>>>> >>>>> // Too many loops warning >>>>> if(++loop_count == 0) { >>>>> warning("wait_on_cms_lock_for_scavenge() has looped %d >>>>> times", loop_count - 1); >>>>> } >>>>> } >>>>> } >>>>> >>>>> void ConcurrentMarkSweepThread::sleepBeforeNextCycle() { >>>>> while (!_should_terminate) { >>>>> if (CMSIncrementalMode) { >>>>> icms_wait(); >>>>> if(CMSWaitDuration >= 0) { >>>>> // Wait until the next synchronous GC, a concurrent full gc >>>>> // request or a timeout, whichever is earlier. >>>>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); >>>>> } >>>>> return; >>>>> } else { >>>>> if(CMSWaitDuration >= 0) { >>>>> // Wait until the next synchronous GC, a concurrent full gc >>>>> // request or a timeout, whichever is earlier. >>>>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); >>>>> } else { >>>>> // Wait until any cms_lock event not to call >>>>> shouldConcurrentCollect permanently >>>>> wait_on_cms_lock(0); >>>>> } >>>>> } >>>>> // Check if we should start a CMS collection cycle >>>>> if (_collector->shouldConcurrentCollect()) { >>>>> return; >>>>> } >>>>> // .. 
collection criterion not yet met, let's go back >>>>> // and wait some more >>>>> } >>>>> } >>>>> >>>>> Od: hotspot-gc-dev-bounces at openjdk.java.net >>>>> Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" >>>>> john.cuthbertson at oracle.com >>>>> Kopie: hotspot-gc-dev at openjdk.java.net >>>>> Datum: Thu, 6 Dec 2012 23:43:29 -0800 >>>>> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for >>>>> non-incremental mode of CMS >>>>> >>>>>> Hi John -- >>>>>> >>>>>> wrt the changes posted, i see the intent of the code and agree with >>>>>> it. I have a few minor suggestions on the >>>>>> details of how it's implemented. My comments are inline below, >>>>>> interleaved with the code: >>>>>> >>>>>> 317 // Wait until the next synchronous GC, a concurrent full gc >>>>>> request, >>>>>> 318 // or a timeout, whichever is earlier. >>>>>> 319 void >>>>>> ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long >>>>>> t_millis) { >>>>>> 320 // Wait for any cms_lock event when timeout not specified >>>>>> (0 millis) >>>>>> 321 if (t_millis == 0) { >>>>>> 322 wait_on_cms_lock(t_millis); >>>>>> 323 return; >>>>>> 324 } >>>>>> >>>>>> I'd completely avoid the special case above because it would miss the >>>>>> part about waiting for a >>>>>> scavenge, instead dealing with that case in the code in the loop below >>>>>> directly. The idea >>>>>> of the "0" value is not to ask that we return immediately, but that we >>>>>> wait, if necessary >>>>>> forever, for a scavenge. The "0" really represents the value infinity >>>>>> in that sense. This would >>>>>> be in keeping with our use of wait() with a "0" value for timeout at >>>>>> other places in the JVM as >>>>>> well, so it's consistent. >>>>>> >>>>>> 325 >>>>>> 326 GenCollectedHeap* gch = GenCollectedHeap::heap(); >>>>>> 327 double start_time = os::elapsedTime(); >>>>>> 328 double end_time = start_time + (t_millis / 1000.0); >>>>>> >>>>>> Note how, the end_time == start_time for the special case of t_millis >>>>>> == 0, so we need to treat that >>>>>> case specially below. >>>>>> >>>>>> 329 >>>>>> 330 // Total collections count before waiting loop >>>>>> 331 unsigned int before_count; >>>>>> 332 { >>>>>> 333 MutexLockerEx hl(Heap_lock, >>>>>> Mutex::_no_safepoint_check_flag); >>>>>> 334 before_count = gch->total_collections(); >>>>>> 335 } >>>>>> >>>>>> Good. 
>>>>>> >>>>>> 336 >>>>>> 337 while (true) { >>>>>> 338 double now_time = os::elapsedTime(); >>>>>> 339 long wait_time_millis = (long)((end_time - now_time) * >>>>>> 1000.0); >>>>>> 340 >>>>>> 341 if (wait_time_millis <= 0) { >>>>>> 342 // Wait time is over >>>>>> 343 break; >>>>>> 344 } >>>>>> >>>>>> Modify to: >>>>>> if (t_millis != 0) { >>>>>> if (wait_time_millis <= 0) { >>>>>> // Wait time is over >>>>>> break; >>>>>> } >>>>>> } else { >>>>>> wait_time_millis = 0; // for use in wait() below >>>>>> } >>>>>> >>>>>> 345 >>>>>> 346 // Wait until the next event or the remaining timeout >>>>>> 347 { >>>>>> 348 MutexLockerEx x(CGC_lock, >>>>>> Mutex::_no_safepoint_check_flag); >>>>>> 349 if (_should_terminate || _collector->_full_gc_requested) { >>>>>> 350 return; >>>>>> 351 } >>>>>> 352 set_CMS_flag(CMS_cms_wants_token); // to provoke >>>>>> notifies >>>>>> >>>>>> insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); >>>>>> >>>>>> 353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, >>>>>> wait_time_millis); >>>>>> 354 clear_CMS_flag(CMS_cms_wants_token); >>>>>> 355 assert(!CMS_flag_is_set(CMS_cms_has_token | >>>>>> CMS_cms_wants_token), >>>>>> 356 "Should not be set"); >>>>>> 357 } >>>>>> 358 >>>>>> 359 // Extra wait time check before entering the heap lock to >>>>>> get >>>>>> the collection count >>>>>> 360 if (os::elapsedTime() >= end_time) { >>>>>> 361 // Wait time is over >>>>>> 362 break; >>>>>> 363 } >>>>>> >>>>>> Modify above wait time check to make an exception for t_miliis == 0: >>>>>> // Extra wait time check before checking collection count >>>>>> if (t_millis != 0 && os::elapsedTime() >= end_time) { >>>>>> // wait time exceeded >>>>>> break; >>>>>> } >>>>>> >>>>>> 364 >>>>>> 365 // Total collections count after the event >>>>>> 366 unsigned int after_count; >>>>>> 367 { >>>>>> 368 MutexLockerEx hl(Heap_lock, >>>>>> Mutex::_no_safepoint_check_flag); >>>>>> 369 after_count = gch->total_collections(); >>>>>> 370 } >>>>>> 371 >>>>>> 372 if (before_count != after_count) { >>>>>> 373 // There was a collection - success >>>>>> 374 break; >>>>>> 375 } >>>>>> 376 } >>>>>> 377 } >>>>>> >>>>>> While it is true that we do not have a case where the method is called >>>>>> with a time of "0", I think we >>>>>> want that value to be treated correctly as "infinity". For the case >>>>>> where we do not want a wait at all, >>>>>> we should use a small positive value, like "1 ms" to signal that >>>>>> intent, i.e. -XX:CMSWaitDuration=1, >>>>>> reserving CMSWaitDuration=0 to signal infinity. (We could also do that >>>>>> by reserving negative values to >>>>>> signal infinity, but that would make the code in the loop a bit >>>>>> more fiddly.) >>>>>> >>>>>> As mentioned in my previous email, I'd like to see this tested with >>>>>> CMSWaitDuration set to 0, positive and >>>>>> negative values (if necessary, we can reject negative value settings), >>>>>> and with ExplicitGCInvokesConcurrent. >>>>>> >>>>>> Rest looks OK to me, although I am not sure how this behaves with >>>>>> iCMS, as I have forgotten that part of the >>>>>> code. >>>>>> >>>>>> Finally, in current code (before these changes) there are two callers >>>>>> of the former wait_for_cms_lock() method, >>>>>> one here in sleepBeforeNextCycle() and one from the precleaning loop. >>>>>> I think the right thing has been done >>>>>> in terms of leaving the latter alone. 
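For readers following the review, the pattern being settled on above can be reduced to a stand-alone sketch -- plain C++11 with std::condition_variable in place of CGC_lock, and none of the CMS token or Heap_lock handling. A timeout of 0 is treated as "wait forever", and the caller decides success by whether a collection counter advanced:

#include <chrono>
#include <condition_variable>
#include <mutex>

// Illustration only, not HotSpot code. Waits until 'counter' moves past
// 'before' or until 't_millis' elapses; t_millis == 0 means no time limit.
// Returns true if progress was observed, false on timeout.
bool wait_for_progress(std::mutex& m, std::condition_variable& cv,
                       const unsigned long long& counter,
                       unsigned long long before, long t_millis) {
  std::unique_lock<std::mutex> lock(m);
  auto progressed = [&] { return counter != before; };
  if (t_millis == 0) {                          // 0 == infinity, as discussed
    cv.wait(lock, progressed);
    return true;
  }
  return cv.wait_for(lock, std::chrono::milliseconds(t_millis), progressed);
}
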
>>>>>> >>>>>> It would be good if this were checked with CMSInitiatingOccupancy set >>>>>> to 0 (or a small value), CMSWaitDuration set to 0, >>>>>> -+PromotionFailureALot and checking that (1) it does not deadlock (2) >>>>>> CMS cycles start very soon after the end of >>>>>> a scavenge (and not at random times as Michal has observed earlier, >>>>>> although i am guessing that is difficult to test). >>>>>> It would be good to repeat the above test with iCMS as well. >>>>>> >>>>>> thanks! >>>>>> -- ramki >>>>>> >>>>>> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote: >>>>>>> Thanks Jon for the pointer: >>>>>>> >>>>>>> >>>>>>> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote: >>>>>>>> >>>>>>>> On 12/05/12 14:47, Srinivas Ramakrishna wrote: >>>>>>>>> The high level idea looks correct. I'll look at the details in a >>>>>>>>> bit (seriously this time; sorry it dropped off my plate last >>>>>>>>> time I promised). >>>>>>>>> Does anyone have a pointer to the related discussion thread on >>>>>>>>> this aias from earlier in the year, by chance, so one could >>>>>>>>> refresh one's >>>>>>>>> memory of that discussion? >>>>>>>> subj: CMSWaitDuration unstable behavior >>>>>>>> >>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> also: >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html >>>>>>> >>>>>>> On to it later this afternoon, and TTYL w/review. >>>>>>> - ramki From yamauchi at google.com Fri Jan 25 23:12:32 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Fri, 25 Jan 2013 15:12:32 -0800 Subject: Deallocating memory pages In-Reply-To: <51021A2E.9050503@oracle.com> References: <51021A2E.9050503@oracle.com> Message-ID: Hi Jon, I haven't talked to you for a while. I hope you are doing well :) This is a nice feature but I'm asking myself what type > of applications would see significant footprint reductions. > Would they be > > 1) applications whose heap usage varies significantly > over time so that there are periods when some large > fraction of the heap is unused and > Yes. Especially multiple applications with workload variations running on a (shared) machine. Under this sort of environment, it seems to make things work more nicely as a whole from a machine/memory resource utilization point of view because it's not uncommon (in my opinion) to see some applications happen to be currently running with higher workload and needing more RAM while others happen to be currently running with lower workload and needing less RAM at a point in time in a shared machine. In server applications, this sort of workload variations can happen for reasons such as capacity redundancy and time-of-day variations, etc. On desktops, one might keep open all sorts of applications at the same time such as web browsers, developer tools, graphics tools, etc. but might put significant workload (or a temporary memory usage increase) on only one application at a time. In such an environment, if an application that was running with high workload in the past can release some RAM, it'd be much nicer for other applications. > 2) applications with many objects larger than a page so that freeing > those objects could free memory. > Yes, but objects don't necessarily have to be larger than a page as long as free chunks that are left after they are freed get coalesced together into a free chunk that's larger than a page. > > Is that a good guess or is it more general than that? I hope the above makes sense. 
I can discuss more if desired. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Fri Jan 25 23:29:00 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 25 Jan 2013 18:29:00 -0500 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: Hiroshi, I'll second your explanation for #1 - we also have some server workloads that fluctuate. Since the kernel will page out memory if need arises on its own, the main benefit I'm seeing in your patch is that it removes the possibility that kernel will page out the wrong pages. That is, if it's using LRU page replacement (or whatever the latest heuristic might be) and GC just freed up those pages (but ended up marking them dirty in the process), then if kernel needs to swap out it may overlook these pages because they don't fit the replacement policy(I guess it also ensures that these pages don't need swap backing either). Otherwise, if the server app has not been using those extra pages for a while anyway, I'm thinking kernel will pick up on that. Is that right? Just want to make sure I understand. Thanks Sent from my phone On Jan 25, 2013 6:13 PM, "Hiroshi Yamauchi" wrote: > Hi Jon, > > I haven't talked to you for a while. I hope you are doing well :) > > This is a nice feature but I'm asking myself what type >> of applications would see significant footprint reductions. >> Would they be >> >> 1) applications whose heap usage varies significantly >> over time so that there are periods when some large >> fraction of the heap is unused and >> > > Yes. Especially multiple applications with workload variations running on > a (shared) machine. Under this sort of environment, it seems to make things > work more nicely as a whole from a machine/memory resource utilization > point of view because it's not uncommon (in my opinion) to see some > applications happen to be currently running with higher workload and > needing more RAM while others happen to be currently running with lower > workload and needing less RAM at a point in time in a shared machine. > > In server applications, this sort of workload variations can happen for > reasons such as capacity redundancy and time-of-day variations, etc. On > desktops, one might keep open all sorts of applications at the same time > such as web browsers, developer tools, graphics tools, etc. but might put > significant workload (or a temporary memory usage increase) on only one > application at a time. In such an environment, if an application that was > running with high workload in the past can release some RAM, it'd be much > nicer for other applications. > > >> 2) applications with many objects larger than a page so that freeing >> those objects could free memory. >> > > Yes, but objects don't necessarily have to be larger than a page as long > as free chunks that are left after they are freed get coalesced together > into a free chunk that's larger than a page. > > >> >> Is that a good guess or is it more general than that? > > > I hope the above makes sense. I can discuss more if desired. > > Thanks. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jon.masamitsu at oracle.com Fri Jan 25 23:50:13 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 25 Jan 2013 15:50:13 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <50F6DA6C.906@oracle.com> References: <50F424A0.6080907@oracle.com> <50F6DA6C.906@oracle.com> Message-ID: <51031A35.8040004@oracle.com> I've update the webrev2 (now 2 separate webrevs) for review comments. 8005452: NPG: Create new flags for Metaspace resizing policy http://cr.openjdk.java.net/~jmasa/8005452/webrev.03/ 8006815: NPG: Trigger a GC for metadata collection just before the threshold is exceeded. http://cr.openjdk.java.net/~jmasa/8006815/webrev.01/ On 1/16/2013 8:50 AM, Jon Masamitsu wrote: > I've added checks to arguments.cpp that are analogous to the > checks for MinHeapFreeRatio / MaxHeapFreeRatio > > Changes since webrev.00 are in arguments.cpp > > http://cr.openjdk.java.net/~jmasa/8005452/webrev.01/ > > Thanks, Vitaly. > > Jon > > On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: >> Hi Jon, >> >> Does it make sense to validate that the new flags are consistent >> (I.e. max >>> = min)? That is, if user changes one or both such that max< min, >>> should >> VM report an error and not start? >> >> Thanks >> >> Sent from my phone >> On Jan 14, 2013 10:31 AM, "Jon Masamitsu" >> wrote: >> >>> 8005452: Create new flags for Metaspace resizing policy >>> >>> Previously the calculation of the metadata capacity at which >>> to do a GC (high water mark, HWM) to recover >>> unloaded classes used the MinHeapFreeRatio >>> and MaxHeapFreeRatio to decide on the next HWM. That >>> generally left an excessive amount of unused capacity for >>> metadata. This change adds specific flags for metadata >>> capacity with defaults more conservative in terms of >>> unused capacity. >>> >>> Added an additional check for doing a GC before expanding >>> the metadata capacity. Required adding a new parameter to >>> get_new_chunk(). >>> >>> Added some additional diagnostic prints. >>> >>> http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ >>> >>> >>> Thanks. >>> From erik.helin at oracle.com Mon Jan 28 15:14:33 2013 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 28 Jan 2013 16:14:33 +0100 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <51018414.1000103@oracle.com> References: <51011221.8050102@oracle.com> <51018414.1000103@oracle.com> Message-ID: <510695D9.5030603@oracle.com> Jon, thanks for your review! On 01/24/2013 07:57 PM, Jon Masamitsu wrote: > I looked at the hotspot changes and they look correct. But I'm > not sure that "sun.gc" should be in the name of the counter. Maybe > use SUN_RT instead of SUN_GC. I've updated the code to use the SUN_RT namespace instead of the SUN_GC namespace. This also required changes to the JDK code. I've also added better error handling if a Java Out Of Memory exceptions occur is raised in PerfDataManager::create_variable. Finally, I've moved some common code to the function create_ms_variable. Webrev: - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.01/ - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.01/ What do you think? Thanks, Erik > Jon > > On 1/24/2013 2:51 AM, Erik Helin wrote: >> Hi all, >> >> here are the HotSpot changes for fixing JDK-8004172. This change uses >> the new namespace "sun.gc.metaspace" for the metaspace counters and >> also removes some code from metaspaceCounters.hpp/cpp that is not >> needed any longer. 
>> >> Note that the tests will continue to fail until the JDK part of the >> change finds it way into the hotspot-gc forest. >> >> The JDK part of the change is also out for review on >> serviceability-dev at openjdk.java.net. >> >> Webrev: >> HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ >> JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ >> >> Bug: >> http://bugs.sun.com/view_bug.do?bug_id=8004172 >> >> Testing: >> Run the jstat jtreg tests locally on my machine on a repository where >> I've applied both the JDK changes and the HotSpot changes. >> >> Thanks, >> Erik From jesper.wilhelmsson at oracle.com Mon Jan 28 15:08:59 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 28 Jan 2013 16:08:59 +0100 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <51031A35.8040004@oracle.com> References: <50F424A0.6080907@oracle.com> <50F6DA6C.906@oracle.com> <51031A35.8040004@oracle.com> Message-ID: <5106948B.3090408@oracle.com> On 2013-01-26 00:50, Jon Masamitsu wrote: > I've update the webrev2 (now 2 separate webrevs) for review comments. > > 8005452: NPG: Create new flags for Metaspace resizing policy > > http://cr.openjdk.java.net/~jmasa/8005452/webrev.03/ I have looked at the flag changes and they look good. I have a question though. How do we want to handle the case where the user sets only MinMetaspaceFreeRatio = 30 ? With your current change this will give an error because the min is larger than the default max (20). Would it make sense to assume that the user actually wants to use min=30 and increase max to 30 as well? (or slightly more if they can't be equal) And maybe issue a warning that the value of max has been changed. The error would then just be given if the user specifies both flags and they don't work out. I'm not asking you to do this change now but it relates to other changes we have done recently. /Jesper > > 8006815: NPG: Trigger a GC for metadata collection just before the threshold > is exceeded. > > http://cr.openjdk.java.net/~jmasa/8006815/webrev.01/ > > > On 1/16/2013 8:50 AM, Jon Masamitsu wrote: >> I've added checks to arguments.cpp that are analogous to the >> checks for MinHeapFreeRatio / MaxHeapFreeRatio >> >> Changes since webrev.00 are in arguments.cpp >> >> http://cr.openjdk.java.net/~jmasa/8005452/webrev.01/ >> >> Thanks, Vitaly. >> >> Jon >> >> On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: >>> Hi Jon, >>> >>> Does it make sense to validate that the new flags are consistent (I.e. max >>>> = min)? That is, if user changes one or both such that max< min, should >>> VM report an error and not start? >>> >>> Thanks >>> >>> Sent from my phone >>> On Jan 14, 2013 10:31 AM, "Jon Masamitsu" wrote: >>> >>>> 8005452: Create new flags for Metaspace resizing policy >>>> >>>> Previously the calculation of the metadata capacity at which >>>> to do a GC (high water mark, HWM) to recover >>>> unloaded classes used the MinHeapFreeRatio >>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>> generally left an excessive amount of unused capacity for >>>> metadata. This change adds specific flags for metadata >>>> capacity with defaults more conservative in terms of >>>> unused capacity. >>>> >>>> Added an additional check for doing a GC before expanding >>>> the metadata capacity. Required adding a new parameter to >>>> get_new_chunk(). >>>> >>>> Added some additional diagnostic prints. 
>>>> >>>> http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ >>>> >>>> >>>> Thanks. >>>> -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available URL: From erik.helin at oracle.com Mon Jan 28 15:25:24 2013 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 28 Jan 2013 16:25:24 +0100 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <510695D9.5030603@oracle.com> References: <51011221.8050102@oracle.com> <51018414.1000103@oracle.com> <510695D9.5030603@oracle.com> Message-ID: <51069864.30007@oracle.com> Sorry, I got the wrong JDK webrev version in the last email. The JDK change was updated due to a test for jstatd that needed to be updated as well. The latest version of the JDK webrev is: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.02/ Thanks, Erik On 01/28/2013 04:14 PM, Erik Helin wrote: > Jon, > > thanks for your review! > > On 01/24/2013 07:57 PM, Jon Masamitsu wrote: >> I looked at the hotspot changes and they look correct. But I'm >> not sure that "sun.gc" should be in the name of the counter. Maybe >> use SUN_RT instead of SUN_GC. > > I've updated the code to use the SUN_RT namespace instead of the SUN_GC > namespace. This also required changes to the JDK code. > > I've also added better error handling if a Java Out Of Memory exceptions > occur is raised in PerfDataManager::create_variable. > > Finally, I've moved some common code to the function create_ms_variable. > > Webrev: > - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.01/ > - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.01/ > > What do you think? > > Thanks, > Erik > >> Jon >> >> On 1/24/2013 2:51 AM, Erik Helin wrote: >>> Hi all, >>> >>> here are the HotSpot changes for fixing JDK-8004172. This change uses >>> the new namespace "sun.gc.metaspace" for the metaspace counters and >>> also removes some code from metaspaceCounters.hpp/cpp that is not >>> needed any longer. >>> >>> Note that the tests will continue to fail until the JDK part of the >>> change finds it way into the hotspot-gc forest. >>> >>> The JDK part of the change is also out for review on >>> serviceability-dev at openjdk.java.net. >>> >>> Webrev: >>> HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ >>> JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ >>> >>> Bug: >>> http://bugs.sun.com/view_bug.do?bug_id=8004172 >>> >>> Testing: >>> Run the jstat jtreg tests locally on my machine on a repository where >>> I've applied both the JDK changes and the HotSpot changes. >>> >>> Thanks, >>> Erik > From jon.masamitsu at oracle.com Mon Jan 28 17:01:50 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 28 Jan 2013 09:01:50 -0800 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: <5106AEFE.2070502@oracle.com> On 01/25/13 15:12, Hiroshi Yamauchi wrote: ... > In server applications, this sort of workload variations can happen > for reasons such as capacity redundancy and > time-of-day variations, etc. On desktops, one might keep open all > sorts of applications at the same time such as web browsers, developer > tools, graphics tools, etc. but might put significant workload (or a > temporary memory usage increase) on only one application at a time. 
In > such an environment, if an application that was running with high > workload in the past can release some RAM, it'd be much nicer for > other applications. > > > 2) applications with many objects larger than a page so that freeing > those objects could free memory. > > > Yes, but objects don't necessarily have to be larger than a page as > long as free chunks that are left after they are freed get coalesced > together into a free chunk that's larger than a page. So when a coalesced page gets added back to the free lists it can deallocate memory even if the neither of the objects coalesced was greater than a page in size. Cool. Do you see any increase in the sweeping times? Or the young gen collection times? You mention that you don't deallocate the headers of objects on the free list. Was that because you tried deallocation that included the headers and that was worse? I'll get feedback from the other GC guys and let you know what we want to do. Jon -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Mon Jan 28 17:18:38 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 28 Jan 2013 09:18:38 -0800 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 In-Reply-To: <50FD2323.9050702@oracle.com> References: <50FD2323.9050702@oracle.com> Message-ID: <5106B2EE.9060509@oracle.com> Can this test be implemented using a call to System.gc() instead of trying to fill up the heap to provoke a GC? Jon On 01/21/13 03:14, Filipp Zhinkin wrote: > Hi all, > > Would someone review the following regression test please? > > Test verifies that VM will not crash with G1 GC and ParallelGCThreads > == 0. > > To ensure that it is true test allocates array until OOME. > Max heap size is limited by 32M for this test to ensure that GC will > occur. > Since crash could occur only during PLAB resizing after GC, > ResizePLAB option is explicitly turned on. > > http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ > > Thanks, > Filipp. > From john.cuthbertson at oracle.com Mon Jan 28 18:58:54 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Mon, 28 Jan 2013 10:58:54 -0800 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 In-Reply-To: <5106B2EE.9060509@oracle.com> References: <50FD2323.9050702@oracle.com> <5106B2EE.9060509@oracle.com> Message-ID: <5106CA6E.6030405@oracle.com> Hi Filipp, In addition to what Jon suggests (i.e. using System.gc() to guarantee a GC), please add -XX:+ExplicitGCInvokesConcurrent. The addition of this flag will cause G1 to perform an incremental GC (instead of the full GC that a System.gc() call provokes). IIRC the PLAB resizing code is only exercised at the end of an incremental GC. Thanks, JohnC On 1/28/2013 9:18 AM, Jon Masamitsu wrote: > Can this test be implemented using a call to > System.gc() instead of trying to fill up the heap > to provoke a GC? > > Jon > > On 01/21/13 03:14, Filipp Zhinkin wrote: >> Hi all, >> >> Would someone review the following regression test please? >> >> Test verifies that VM will not crash with G1 GC and ParallelGCThreads >> == 0. >> >> To ensure that it is true test allocates array until OOME. >> Max heap size is limited by 32M for this test to ensure that GC will >> occur. >> Since crash could occur only during PLAB resizing after GC, >> ResizePLAB option is explicitly turned on. >> >> http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ >> >> Thanks, >> Filipp. 
>> From jon.masamitsu at oracle.com Mon Jan 28 19:30:03 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 28 Jan 2013 11:30:03 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <5106948B.3090408@oracle.com> References: <50F424A0.6080907@oracle.com> <50F6DA6C.906@oracle.com> <51031A35.8040004@oracle.com> <5106948B.3090408@oracle.com> Message-ID: <5106D1BB.1050804@oracle.com> Jesper, If the user is increasing MinMetaspaceFreeRatio, and MaxMetaspaceFreeRatio is not compatible, maybe we should be forcing the user to think about MaxMetaspaceFreeRatio. It's not obvious to me that something like MaxMetaspaceFreeRatio = MinMetaspaceFreeRatio + 1 is a good choice. Jon On 01/28/13 07:08, Jesper Wilhelmsson wrote: > On 2013-01-26 00:50, Jon Masamitsu wrote: >> I've update the webrev2 (now 2 separate webrevs) for review comments. >> >> 8005452: NPG: Create new flags for Metaspace resizing policy >> >> http://cr.openjdk.java.net/~jmasa/8005452/webrev.03/ > > I have looked at the flag changes and they look good. I have a > question though. How do we want to handle the case where the user sets > only MinMetaspaceFreeRatio = 30 ? With your current change this will > give an error because the min is larger than the default max (20). > > Would it make sense to assume that the user actually wants to use > min=30 and increase max to 30 as well? (or slightly more if they can't > be equal) And maybe issue a warning that the value of max has been > changed. > The error would then just be given if the user specifies both flags > and they don't work out. > > I'm not asking you to do this change now but it relates to other > changes we have done recently. > /Jesper > > >> >> 8006815: NPG: Trigger a GC for metadata collection just before the >> threshold >> is exceeded. >> >> http://cr.openjdk.java.net/~jmasa/8006815/webrev.01/ >> >> >> On 1/16/2013 8:50 AM, Jon Masamitsu wrote: >>> I've added checks to arguments.cpp that are analogous to the >>> checks for MinHeapFreeRatio / MaxHeapFreeRatio >>> >>> Changes since webrev.00 are in arguments.cpp >>> >>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.01/ >>> >>> Thanks, Vitaly. >>> >>> Jon >>> >>> On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: >>>> Hi Jon, >>>> >>>> Does it make sense to validate that the new flags are consistent >>>> (I.e. max >>>>> = min)? That is, if user changes one or both such that max< min, >>>>> should >>>> VM report an error and not start? >>>> >>>> Thanks >>>> >>>> Sent from my phone >>>> On Jan 14, 2013 10:31 AM, "Jon Masamitsu" >>>> wrote: >>>> >>>>> 8005452: Create new flags for Metaspace resizing policy >>>>> >>>>> Previously the calculation of the metadata capacity at which >>>>> to do a GC (high water mark, HWM) to recover >>>>> unloaded classes used the MinHeapFreeRatio >>>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>>> generally left an excessive amount of unused capacity for >>>>> metadata. This change adds specific flags for metadata >>>>> capacity with defaults more conservative in terms of >>>>> unused capacity. >>>>> >>>>> Added an additional check for doing a GC before expanding >>>>> the metadata capacity. Required adding a new parameter to >>>>> get_new_chunk(). >>>>> >>>>> Added some additional diagnostic prints. >>>>> >>>>> http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ >>>>> >>>>> >>>>> >>>>> Thanks. 
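To make the trade-off concrete, here is a stand-alone sketch of the behaviour Jesper floats above: raise a still-default max to match an explicitly set min (with a warning), and only reject when both flags were set explicitly. Whether silently adjusting the max is acceptable is exactly the question Jon raises, and none of this is the actual arguments.cpp code -- min_is_default / max_is_default simply stand in for FLAG_IS_DEFAULT on the two flags.

#include <cstdio>

// Sketch only; flag names reused for readability.
bool check_metaspace_free_ratios(unsigned min_ratio, bool min_is_default,
                                 unsigned& max_ratio, bool max_is_default) {
  if (min_ratio <= max_ratio) {
    return true;                               // consistent, nothing to do
  }
  if (max_is_default && !min_is_default) {
    std::fprintf(stderr,
                 "Warning: raising MaxMetaspaceFreeRatio from %u to %u to "
                 "match MinMetaspaceFreeRatio\n", max_ratio, min_ratio);
    max_ratio = min_ratio;                     // adjust the defaulted flag
    return true;
  }
  std::fprintf(stderr,
               "MinMetaspaceFreeRatio (%u) must be less than or equal to "
               "MaxMetaspaceFreeRatio (%u)\n", min_ratio, max_ratio);
  return false;                                // explicit conflict: reject
}
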
>>>>> From tao.mao at oracle.com Mon Jan 28 20:21:43 2013 From: tao.mao at oracle.com (Tao Mao) Date: Mon, 28 Jan 2013 12:21:43 -0800 Subject: Request for review: 6976350 G1: deal with fragmentation while copying objects during GC Message-ID: <5106DDD7.6090900@oracle.com> 6976350 G1: deal with fragmentation while copying objects during GC https://jbs.oracle.com/bugs/browse/JDK-6976350 webrev: http://cr.openjdk.java.net/~tamao/6976350/webrev.00/ changeset: Basically, we want to reuse more of par-allocation buffers instead of retiring it immediately when it encounters an object larger than its remaining part. (1) instead of previously using one allocation buffer per GC purpose, we use N(=2) buffers per GC purpose and modify the corresponding code. The changeset would easily scale up to whatever N (though Tony Printezis suggests 2, or 3 may be good enough) *(2) Two places of cleanup: allocate_during_gc_slow() is removed due to its never being called. access modifier (public) before trim_queue() is redundant. From john.cuthbertson at oracle.com Mon Jan 28 21:59:45 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Mon, 28 Jan 2013 13:59:45 -0800 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC Message-ID: <5106F4D1.50200@oracle.com> Hi Everyone, Can I have a couple of volunteers look over the changes for this CR? The webrev is at: http://cr.openjdk.java.net/~johnc/8007036/webrev.0/ Summary: When adding old regions to the collection set we don't take into account whether the old regions added so far take us below the G1HeapWastePercent. As a result we could end up adding (and collecting) many more regions than we needed to. The actual number added was the minimum between the number of candidate regions / G1MixedGCCountTarget and 10% of the heap. Currently the calculation of the reclaimable bytes as a percentage of the uses exact arithmetic. It might make sense, at some point in the future, to use inexact arithmetic (rounding) in the decision on whether to continue mixed GCs and use exact arithmetic when adding regions. As part of this change I've also moved a couple routines from CollectionSetChooser to G1CollectorPolicy. I think they "fit" better in G1CollectorPolicy. Testing: GCOld with tenuring threshold = 1 and a marking threshold = 10. Many thanks to Monica for identifying the issue. Thanks, JohnC From vitalyd at gmail.com Mon Jan 28 22:36:31 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Mon, 28 Jan 2013 17:36:31 -0500 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC In-Reply-To: <5106F4D1.50200@oracle.com> References: <5106F4D1.50200@oracle.com> Message-ID: Hi John, In G1CollectorPolicy::calc_min_old_cset_length(), is it possible to get 0 for G1MixedGCCountTarget? If so, will get div by zero there. Thanks Sent from my phone On Jan 28, 2013 5:01 PM, "John Cuthbertson" wrote: > Hi Everyone, > > Can I have a couple of volunteers look over the changes for this CR? The > webrev is at: http://cr.openjdk.java.net/~**johnc/8007036/webrev.0/ > > Summary: > When adding old regions to the collection set we don't take into account > whether the old regions added so far take us below the G1HeapWastePercent. > As a result we could end up adding (and collecting) many more regions than > we needed to. The actual number added was the minimum between the number of > candidate regions / G1MixedGCCountTarget and 10% of the heap. > > Currently the calculation of the reclaimable bytes as a percentage of the > uses exact arithmetic. 
It might make sense, at some point in the future, to > use inexact arithmetic (rounding) in the decision on whether to continue > mixed GCs and use exact arithmetic when adding regions. > > As part of this change I've also moved a couple routines from > CollectionSetChooser to G1CollectorPolicy. I think they "fit" better in > G1CollectorPolicy. > > Testing: > GCOld with tenuring threshold = 1 and a marking threshold = 10. > > Many thanks to Monica for identifying the issue. > > Thanks, > > JohnC > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamauchi at google.com Mon Jan 28 23:15:18 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Mon, 28 Jan 2013 15:15:18 -0800 Subject: Deallocating memory pages In-Reply-To: <5106AEFE.2070502@oracle.com> References: <51021A2E.9050503@oracle.com> <5106AEFE.2070502@oracle.com> Message-ID: > So when a coalesced page gets added back to the free lists it can > deallocate > memory even if the neither of the objects coalesced was greater than a > page in size. Cool. > Exactly. > > Do you see any increase in the sweeping times? Or the young gen > collection times? > In an internal server test, the total GC pause time (which is mostly the young gen collection times) had a ~3% overhead (while the total execution time had a ~1-2% overhead.) The concurrent sweep time had a ~10% overhead while the whole concurrent collection time (from the beginning of the ininitial mark phase to the end of the reset phase) had a ~3-4% overhead. So, yes. The numbers are not completely noise-free, but these are most likely due to the cost of calling madvise and page reallocation. > You mention that you don't deallocate the headers of objects on the free > list. > Was that because you tried deallocation that included the headers and that > was worse? > No, I don't deallocate the header of a free chunk because it contains valid data (the prev, the next pointers, and the size) as it becomes a node in a (doubly-linked) free list. > > I'll get feedback from the other GC guys and let you know what we > want to do. > Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamauchi at google.com Mon Jan 28 23:28:01 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Mon, 28 Jan 2013 15:28:01 -0800 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: On Fri, Jan 25, 2013 at 3:29 PM, Vitaly Davidovich wrote: > Hiroshi, > > I'll second your explanation for #1 - we also have some server workloads > that fluctuate. > > Since the kernel will page out memory if need arises on its own, the main > benefit I'm seeing in your patch is that it removes the possibility that > kernel will page out the wrong pages. That is, if it's using LRU page > replacement (or whatever the latest heuristic might be) and GC just freed > up those pages (but ended up marking them dirty in the process), then if > kernel needs to swap out it may overlook these pages because they don't fit > the replacement policy(I guess it also ensures that these pages don't need > swap backing either). Otherwise, if the server app has not been using those > extra pages for a while anyway, I'm thinking kernel will pick up on that. > > Is that right? Just want to make sure I understand. 
> Vitaly, do you mean that without this feature/patch, the kernel might choose to swap out a page with valid data as opposed to a page that has garbage data and would have been deallocated with this feature/patch if it follows an LRU-like policy or something similar? If so, you are probably right about that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Tue Jan 29 00:11:53 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Mon, 28 Jan 2013 19:11:53 -0500 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: Yes, exactly; I'm trying to understand (for my own sake, really, but maybe others too) what this patch adds over letting kernel manage physical pages on its own given it has a better global view of the system. madvise(MADV_DONTNEED) appears to actively unmap the pages, rather than just marking them (so, e.g., swap decisions can be made later about them); if there's no pressure/shortage for physical pages, this will just create unneeded overhead (both with this initial syscall and later if pages need to mapped again), won't it? Also, if an app has spikey usage but where spikes are frequent, this patch would probably be a net negative since it's not based on ergonomics/statistics/trend/etc. Since your patch makes this new behavior "toggleable" it's probably not an issue. Thanks Sent from my phone On Jan 28, 2013 6:28 PM, "Hiroshi Yamauchi" wrote: > > > > On Fri, Jan 25, 2013 at 3:29 PM, Vitaly Davidovich wrote: > >> Hiroshi, >> >> I'll second your explanation for #1 - we also have some server workloads >> that fluctuate. >> >> Since the kernel will page out memory if need arises on its own, the main >> benefit I'm seeing in your patch is that it removes the possibility that >> kernel will page out the wrong pages. That is, if it's using LRU page >> replacement (or whatever the latest heuristic might be) and GC just freed >> up those pages (but ended up marking them dirty in the process), then if >> kernel needs to swap out it may overlook these pages because they don't fit >> the replacement policy(I guess it also ensures that these pages don't need >> swap backing either). Otherwise, if the server app has not been using those >> extra pages for a while anyway, I'm thinking kernel will pick up on that. >> >> Is that right? Just want to make sure I understand. >> > Vitaly, do you mean that without this feature/patch, the kernel might > choose to swap out a page with valid data as opposed to a page that has > garbage data and would have been deallocated with this feature/patch if it > follows an LRU-like policy or something similar? If so, you are probably > right about that. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Tue Jan 29 00:30:04 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Mon, 28 Jan 2013 19:30:04 -0500 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC In-Reply-To: References: <5106F4D1.50200@oracle.com> Message-ID: In same file, 1829 // Is the amount of uncollected reclaimable space above G1HeapWastePercent? 
1830 size_t reclaimable_bytes = cset_chooser->remaining_reclaimable_bytes(); 1831 double reclaimable_perc = reclaimable_bytes_perc(); 1832 double threshold = (double) G1HeapWastePercent; 1833 if (!over_waste_threshold()) { I think there's going to be some duplicate code running unless compiler helps out: 1) cset_chooser->remaining_reclaimable_bytes() called above + by reclaimable_bytes_perc() 2) reclaimable_bytes_perc() called above + by over_waste_threshold() Don't know if this is a concern or not but thought I'd mention it. Thanks Sent from my phone On Jan 28, 2013 5:36 PM, "Vitaly Davidovich" wrote: > Hi John, > > In G1CollectorPolicy::calc_min_old_cset_length(), is it possible to get 0 > for G1MixedGCCountTarget? If so, will get div by zero there. > > Thanks > > Sent from my phone > On Jan 28, 2013 5:01 PM, "John Cuthbertson" > wrote: > >> Hi Everyone, >> >> Can I have a couple of volunteers look over the changes for this CR? The >> webrev is at: http://cr.openjdk.java.net/~**johnc/8007036/webrev.0/ >> >> Summary: >> When adding old regions to the collection set we don't take into account >> whether the old regions added so far take us below the G1HeapWastePercent. >> As a result we could end up adding (and collecting) many more regions than >> we needed to. The actual number added was the minimum between the number of >> candidate regions / G1MixedGCCountTarget and 10% of the heap. >> >> Currently the calculation of the reclaimable bytes as a percentage of the >> uses exact arithmetic. It might make sense, at some point in the future, to >> use inexact arithmetic (rounding) in the decision on whether to continue >> mixed GCs and use exact arithmetic when adding regions. >> >> As part of this change I've also moved a couple routines from >> CollectionSetChooser to G1CollectorPolicy. I think they "fit" better in >> G1CollectorPolicy. >> >> Testing: >> GCOld with tenuring threshold = 1 and a marking threshold = 10. >> >> Many thanks to Monica for identifying the issue. >> >> Thanks, >> >> JohnC >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From filipp.zhinkin at oracle.com Tue Jan 29 09:15:35 2013 From: filipp.zhinkin at oracle.com (Filipp Zhinkin) Date: Tue, 29 Jan 2013 13:15:35 +0400 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 In-Reply-To: <5106CA6E.6030405@oracle.com> References: <50FD2323.9050702@oracle.com> <5106B2EE.9060509@oracle.com> <5106CA6E.6030405@oracle.com> Message-ID: <51079337.6040108@oracle.com> Hi John, thanks for advice! I'll reimplement the test using System.gc() calls and -XX:+ExplicitGCInvokesConcurrent option. And yes, you're right, PLAB resizes only at the end of incremental GC. Thats why I've tried to provoke GC by filling up the heap instead of calling System.gc() (I've missed ExplicitGCInvokesConcurrent flag before). Thanks, Filipp. On 01/28/2013 10:58 PM, John Cuthbertson wrote: > Hi Filipp, > > In addition to what Jon suggests (i.e. using System.gc() to guarantee > a GC), please add -XX:+ExplicitGCInvokesConcurrent. The addition of > this flag will cause G1 to perform an incremental GC (instead of the > full GC that a System.gc() call provokes). IIRC the PLAB resizing code > is only exercised at the end of an incremental GC. > > Thanks, > > JohnC > > On 1/28/2013 9:18 AM, Jon Masamitsu wrote: >> Can this test be implemented using a call to >> System.gc() instead of trying to fill up the heap >> to provoke a GC? 
>> >> Jon >> >> On 01/21/13 03:14, Filipp Zhinkin wrote: >>> Hi all, >>> >>> Would someone review the following regression test please? >>> >>> Test verifies that VM will not crash with G1 GC and >>> ParallelGCThreads == 0. >>> >>> To ensure that it is true test allocates array until OOME. >>> Max heap size is limited by 32M for this test to ensure that GC will >>> occur. >>> Since crash could occur only during PLAB resizing after GC, >>> ResizePLAB option is explicitly turned on. >>> >>> http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ >>> >>> Thanks, >>> Filipp. >>> > From jesper.wilhelmsson at oracle.com Tue Jan 29 13:18:05 2013 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Tue, 29 Jan 2013 14:18:05 +0100 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <5106D1BB.1050804@oracle.com> References: <50F424A0.6080907@oracle.com> <50F6DA6C.906@oracle.com> <51031A35.8040004@oracle.com> <5106948B.3090408@oracle.com> <5106D1BB.1050804@oracle.com> Message-ID: <5107CC0D.5030409@oracle.com> Jon, OK, in that case I think the error message could be slightly more informative, just so that it is clear to someone that only sets one of them that the flag they set conflicts with the default value of another flag. How about this: jio_fprintf(defaultStream::error_stream(), "MinMetaspaceFreeRatio (%s" UINTX_FORMAT ") must be less than or " "equal to MaxMetaspaceFreeRatio (%s" UINTX_FORMAT ")\n", FLAG_IS_DEFAULT(MinMetaspaceFreeRatio) ? "Default: " : "", MinMetaspaceFreeRatio, FLAG_IS_DEFAULT(MaxMetaspaceFreeRatio) ? "Default: " : "", MaxMetaspaceFreeRatio); If you don't like this I'm fine with your current patch as well. /Jesper On 28/1/13 8:30 PM, Jon Masamitsu wrote: > Jesper, > > If the user is increasing MinMetaspaceFreeRatio, and > MaxMetaspaceFreeRatio is not compatible, maybe we > should be forcing the user to think about MaxMetaspaceFreeRatio. > It's not obvious to me that something like > > MaxMetaspaceFreeRatio = MinMetaspaceFreeRatio + 1 > > is a good choice. > > Jon > > On 01/28/13 07:08, Jesper Wilhelmsson wrote: >> On 2013-01-26 00:50, Jon Masamitsu wrote: >>> I've update the webrev2 (now 2 separate webrevs) for review comments. >>> >>> 8005452: NPG: Create new flags for Metaspace resizing policy >>> >>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.03/ >> >> I have looked at the flag changes and they look good. I have a >> question though. How do we want to handle the case where the user sets >> only MinMetaspaceFreeRatio = 30 ? With your current change this will >> give an error because the min is larger than the default max (20). >> >> Would it make sense to assume that the user actually wants to use >> min=30 and increase max to 30 as well? (or slightly more if they can't >> be equal) And maybe issue a warning that the value of max has been >> changed. >> The error would then just be given if the user specifies both flags >> and they don't work out. >> >> I'm not asking you to do this change now but it relates to other >> changes we have done recently. >> /Jesper >> >> >>> >>> 8006815: NPG: Trigger a GC for metadata collection just before the >>> threshold >>> is exceeded. 
>>> >>> http://cr.openjdk.java.net/~jmasa/8006815/webrev.01/ >>> >>> >>> On 1/16/2013 8:50 AM, Jon Masamitsu wrote: >>>> I've added checks to arguments.cpp that are analogous to the >>>> checks for MinHeapFreeRatio / MaxHeapFreeRatio >>>> >>>> Changes since webrev.00 are in arguments.cpp >>>> >>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.01/ >>>> >>>> Thanks, Vitaly. >>>> >>>> Jon >>>> >>>> On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: >>>>> Hi Jon, >>>>> >>>>> Does it make sense to validate that the new flags are consistent >>>>> (I.e. max >>>>>> = min)? That is, if user changes one or both such that max< min, >>>>>> should >>>>> VM report an error and not start? >>>>> >>>>> Thanks >>>>> >>>>> Sent from my phone >>>>> On Jan 14, 2013 10:31 AM, "Jon Masamitsu" >>>>> wrote: >>>>> >>>>>> 8005452: Create new flags for Metaspace resizing policy >>>>>> >>>>>> Previously the calculation of the metadata capacity at which >>>>>> to do a GC (high water mark, HWM) to recover >>>>>> unloaded classes used the MinHeapFreeRatio >>>>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>>>> generally left an excessive amount of unused capacity for >>>>>> metadata. This change adds specific flags for metadata >>>>>> capacity with defaults more conservative in terms of >>>>>> unused capacity. >>>>>> >>>>>> Added an additional check for doing a GC before expanding >>>>>> the metadata capacity. Required adding a new parameter to >>>>>> get_new_chunk(). >>>>>> >>>>>> Added some additional diagnostic prints. >>>>>> >>>>>> http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ >>>>>> >>>>>> >>>>>> >>>>>> Thanks. >>>>>> From john.cuthbertson at oracle.com Tue Jan 29 18:28:17 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 29 Jan 2013 10:28:17 -0800 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC In-Reply-To: References: <5106F4D1.50200@oracle.com> Message-ID: <510814C1.4000408@oracle.com> Hi Vitaly, Thanks for looking at the code changes. Response inline... On 1/28/2013 2:36 PM, Vitaly Davidovich wrote: > > Hi John, > > In G1CollectorPolicy::calc_min_old_cset_length(), is it possible to > get 0 for G1MixedGCCountTarget? If so, will get div by zero there. > Good catch. There's nothing to stop a user specifying -XX:G1MixedGCCountTarget=0 even though it doesn't make sense. I've added: const size_t gc_num = MAX2((size_t) G1MixedGCCountTarget, 1); in calc_min_old_cset_length(). Thanks, JohnC From john.cuthbertson at oracle.com Tue Jan 29 18:38:49 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 29 Jan 2013 10:38:49 -0800 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC In-Reply-To: References: <5106F4D1.50200@oracle.com> Message-ID: <51081739.8060900@oracle.com> Hi Vitaly, It's a good point and I don't know if it's an issue. The results of the duplicated calls should be dead code if PrintAdaptiveSizePolicy is not enabled. I want consistency between determining whether to continue with mixed GCs and whether to continue to add old regions to the collection set. I also want the PrintAdaptiveSizePolicy output to display the correct and consistent information. I'll rework the code so that over_waste_threshold() takes the reclaimable space as a percentage as a parameter, hence removing the duplicated calls. Thanks, JohnC On 1/28/2013 4:30 PM, Vitaly Davidovich wrote: > > In same file, > > 1829 // Is the amount of uncollected reclaimable space above > G1HeapWastePercent? 
> 1830 size_t reclaimable_bytes = > cset_chooser->remaining_reclaimable_bytes(); > 1831 double reclaimable_perc = reclaimable_bytes_perc(); > > 1832 double threshold = (double) G1HeapWastePercent; > 1833 if (!over_waste_threshold()) { > > I think there's going to be some duplicate code running unless > compiler helps out: > 1) cset_chooser->remaining_reclaimable_bytes() called above + by > reclaimable_bytes_perc() > 2) reclaimable_bytes_perc() called above + by over_waste_threshold() > > Don't know if this is a concern or not but thought I'd mention it. > > Thanks > > Sent from my phone > > From yamauchi at google.com Tue Jan 29 20:23:13 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Tue, 29 Jan 2013 12:23:13 -0800 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: > Yes, exactly; I'm trying to understand (for my own sake, really, but maybe others too) what this patch adds over letting kernel manage physical pages on its own given it has a better global view of the system. madvise(MADV_DONTNEED) appears to actively unmap the pages, rather than just marking them (so, e.g., swap decisions can be made later about them); if there's no pressure/shortage for physical pages, this will just create unneeded overhead (both with this initial syscall and later if pages need to mapped again), won't it? That's basically how the cost/benefit works out. If no other apps need the RAM, no point. If they do, they may get to avoid the very expensive swapping (which, I imagine, one would avoid at almost all cost as the performance falls off a cliff with swapping) or be able to run at all if swap is turned off, and win. The question is how likely they need the RAM and whether you'd be willing to pay the page deallocation cost to get this sort of memory management flexibility. An alternative way to look at it, in my view, is that it'd make it a little bit closer to how many C/C++ applications tend to behave in that they usually release a fair amount of unused pages when they are less loaded (bugs and memory leaks aside). > > Also, if an app has spikey usage but where spikes are frequent, this patch would probably be a net negative since it's not based on ergonomics/statistics/trend/etc. Since your patch makes this new behavior "toggleable" it's probably not an issue. You are probably right about that. It's currently based on a simple strategy of deallocating pages as soon as they are freed (which might be worth improving in the future.) And that'd probably be the worst-case situation. That said, there are a few side notes to keep in mind: As this feature deallocates pages in the old generation only, the frequency of the 'spikes' that would add to the cost of this feature is limited by how fast objects get allocated in the old gen (mostly promotions) and freed by the old-gen collections. And if you have frequent spikes in the old generation, there's a chance that you are having a not-so-great GC behavior and if so, you'd probably like to avoid either way. Also, even if there are spikes, as long as there's a gap between the peaks of the spikes and the maximum heap size, there would be some pages/RAM that this feature could deallocate and make available to other apps. And yes, the feature is disabled by default by a flag. 
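For anyone wanting the mechanics of the page-deallocation feature discussed in this thread in one place, here is a stand-alone sketch of the release step (not the actual patch): the whole pages inside a coalesced free chunk are handed back with madvise(MADV_DONTNEED), while the chunk header that carries the free-list prev/next pointers and the size is left untouched. The helper name and the header_bytes parameter are assumptions made for the sketch.

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

// Release the fully-contained pages of a free chunk back to the OS.
// The first 'header_bytes' of the chunk (the free-list node) are preserved.
static void release_free_chunk_pages(char* chunk, size_t chunk_bytes,
                                     size_t header_bytes) {
  const size_t page = (size_t) sysconf(_SC_PAGESIZE);
  uintptr_t start = (uintptr_t)(chunk + header_bytes);
  uintptr_t end   = (uintptr_t)(chunk + chunk_bytes);
  start = (start + page - 1) & ~(uintptr_t)(page - 1);  // round up past header
  end   = end & ~(uintptr_t)(page - 1);                 // round down
  if (end > start) {
    // The range stays mapped; the kernel may drop the physical pages and will
    // supply zero-filled pages on the next touch.
    madvise((void*) start, (size_t)(end - start), MADV_DONTNEED);
  }
}
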
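And for the 8007036 exchange above (the possible divide-by-zero on G1MixedGCCountTarget and the duplicated reclaimable-percentage calls), a stand-alone sketch of the shape of the two follow-ups -- illustrative names only, not the G1 sources:

#include <algorithm>
#include <cstddef>

struct MixedGCPolicySketch {
  size_t mixed_gc_count_target;   // stands in for G1MixedGCCountTarget
  double heap_waste_percent;      // stands in for G1HeapWastePercent

  // Clamp the target so a user-supplied 0 cannot cause a division by zero.
  size_t min_old_cset_length(size_t candidate_regions) const {
    size_t target = std::max<size_t>(mixed_gc_count_target, (size_t) 1);
    return (candidate_regions + target - 1) / target;   // round up
  }

  // The caller computes the reclaimable percentage once and passes it in, so
  // the "continue mixed GCs?" decision and the region-adding loop agree.
  bool over_waste_threshold(double reclaimable_percent) const {
    return reclaimable_percent > heap_waste_percent;
  }
};
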
From john.cuthbertson at oracle.com Tue Jan 29 21:35:28 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 29 Jan 2013 13:35:28 -0800 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC In-Reply-To: <5106F4D1.50200@oracle.com> References: <5106F4D1.50200@oracle.com> Message-ID: <510840A0.1000709@oracle.com> Hi Everyone, Here's a new webrev based upon feedback from Vitaly: http://cr.openjdk.java.net/~johnc/8007036/webrev.1/ JohnC On 1/28/2013 1:59 PM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers look over the changes for this CR? > The webrev is at: http://cr.openjdk.java.net/~johnc/8007036/webrev.0/ > > Summary: > When adding old regions to the collection set we don't take into > account whether the old regions added so far take us below the > G1HeapWastePercent. As a result we could end up adding (and > collecting) many more regions than we needed to. The actual number > added was the minimum between the number of candidate regions / > G1MixedGCCountTarget and 10% of the heap. > > Currently the calculation of the reclaimable bytes as a percentage of > the uses exact arithmetic. It might make sense, at some point in the > future, to use inexact arithmetic (rounding) in the decision on > whether to continue mixed GCs and use exact arithmetic when adding > regions. > > As part of this change I've also moved a couple routines from > CollectionSetChooser to G1CollectorPolicy. I think they "fit" better > in G1CollectorPolicy. > > Testing: > GCOld with tenuring threshold = 1 and a marking threshold = 10. > > Many thanks to Monica for identifying the issue. > > Thanks, > > JohnC From bernd-2012 at eckenfels.net Tue Jan 29 21:43:03 2013 From: bernd-2012 at eckenfels.net (Bernd Eckenfels) Date: Tue, 29 Jan 2013 22:43:03 +0100 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: Am 29.01.2013, 21:23 Uhr, schrieb Hiroshi Yamauchi : > The question is how likely they need the RAM and whether you'd be > willing to pay the page deallocation cost to get this sort of memory > management flexibility. I wonder if there is any deallocation cost involved at all. The VM can just mark the page as not-dirty and not-used. The only cost to use the page again (at the same place) would be to zero it. (and even that could be avoided if the VMM remembers the original owner and map it backl to the process if that process touches it again and no other process had the need for the page. Kind of same as buffer cache pages. But I guess only some performance tests can answer that. (And add a test for large/hugepages to avoid automatic page splits if they are partially freed) Gruss Bernd -- http://bernd.eckenfels.net From stefan.karlsson at oracle.com Wed Jan 30 09:56:51 2013 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 30 Jan 2013 10:56:51 +0100 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <510695D9.5030603@oracle.com> References: <51011221.8050102@oracle.com> <51018414.1000103@oracle.com> <510695D9.5030603@oracle.com> Message-ID: <7F5DD330-F1DB-467A-9A73-DAC4F87478DA@oracle.com> On 28 jan 2013, at 16:14, Erik Helin wrote: > Jon, > > thanks for your review! > > On 01/24/2013 07:57 PM, Jon Masamitsu wrote: >> I looked at the hotspot changes and they look correct. But I'm >> not sure that "sun.gc" should be in the name of the counter. Maybe >> use SUN_RT instead of SUN_GC. 
> > I've updated the code to use the SUN_RT namespace instead of the SUN_GC namespace. This also required changes to the JDK code. > > I've also added better error handling if a Java Out Of Memory exceptions occur is raised in PerfDataManager::create_variable. > > Finally, I've moved some common code to the function create_ms_variable. > > Webrev: > - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.01/ Would you mind using two indentation levels here: +MetaspaceCounters::MetaspaceCounters() : + _capacity(NULL), + _used(NULL), + _max_capacity(NULL) { I think it would be good to also extract this: 68 const char *counter_name = PerfDataManager::counter_name(ms, "minCapacity"); 69 PerfDataManager::create_constant(SUN_RT, counter_name, PerfData::U_Bytes, into a create_ms_constant, just like you did with create_ms_variable. You should probably use CHECK instead of THREAD. + _max_capacity = create_ms_variable(ms, "maxCapacity", max_capacity, THREAD); + _capacity = create_ms_variable(ms, "capacity", curr_capacity, THREAD); + _used = create_ms_variable(ms, "used", used, THREAD); I think it would be enough to assert that the variables are not NULL. void MetaspaceCounters::update_capacity() { assert(UsePerfData, "Should not be called unless being used"); size_t capacity_in_bytes = MetaspaceAux::capacity_in_bytes(); + if (_capacity != NULL) { thanks, StefanK > - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.01/ > > What do you think? > > Thanks, > Erik > >> Jon >> >> On 1/24/2013 2:51 AM, Erik Helin wrote: >>> Hi all, >>> >>> here are the HotSpot changes for fixing JDK-8004172. This change uses >>> the new namespace "sun.gc.metaspace" for the metaspace counters and >>> also removes some code from metaspaceCounters.hpp/cpp that is not >>> needed any longer. >>> >>> Note that the tests will continue to fail until the JDK part of the >>> change finds it way into the hotspot-gc forest. >>> >>> The JDK part of the change is also out for review on >>> serviceability-dev at openjdk.java.net. >>> >>> Webrev: >>> HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ >>> JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ >>> >>> Bug: >>> http://bugs.sun.com/view_bug.do?bug_id=8004172 >>> >>> Testing: >>> Run the jstat jtreg tests locally on my machine on a repository where >>> I've applied both the JDK changes and the HotSpot changes. >>> >>> Thanks, >>> Erik > -------------- next part -------------- An HTML attachment was scrubbed... URL: From filipp.zhinkin at oracle.com Wed Jan 30 11:21:28 2013 From: filipp.zhinkin at oracle.com (Filipp Zhinkin) Date: Wed, 30 Jan 2013 15:21:28 +0400 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 In-Reply-To: <51079337.6040108@oracle.com> References: <50FD2323.9050702@oracle.com> <5106B2EE.9060509@oracle.com> <5106CA6E.6030405@oracle.com> <51079337.6040108@oracle.com> Message-ID: <51090238.7000004@oracle.com> Here is an updated webrev: http://cr.openjdk.java.net/~kshefov/8000311/webrev.01/ I've added ExplicitGCInvokesConcurrent option and replaced heap-filling by frequent System.gc() calls. Thanks, Filipp. On 01/29/2013 01:15 PM, Filipp Zhinkin wrote: > Hi John, > > thanks for advice! I'll reimplement the test using System.gc() calls > and -XX:+ExplicitGCInvokesConcurrent option. > And yes, you're right, PLAB resizes only at the end of incremental GC. > Thats why I've tried to provoke GC by filling up the heap instead of > calling System.gc() (I've missed ExplicitGCInvokesConcurrent flag > before). 
> > Thanks, > Filipp. > > On 01/28/2013 10:58 PM, John Cuthbertson wrote: >> Hi Filipp, >> >> In addition to what Jon suggests (i.e. using System.gc() to guarantee >> a GC), please add -XX:+ExplicitGCInvokesConcurrent. The addition of >> this flag will cause G1 to perform an incremental GC (instead of the >> full GC that a System.gc() call provokes). IIRC the PLAB resizing >> code is only exercised at the end of an incremental GC. >> >> Thanks, >> >> JohnC >> >> On 1/28/2013 9:18 AM, Jon Masamitsu wrote: >>> Can this test be implemented using a call to >>> System.gc() instead of trying to fill up the heap >>> to provoke a GC? >>> >>> Jon >>> >>> On 01/21/13 03:14, Filipp Zhinkin wrote: >>>> Hi all, >>>> >>>> Would someone review the following regression test please? >>>> >>>> Test verifies that VM will not crash with G1 GC and >>>> ParallelGCThreads == 0. >>>> >>>> To ensure that it is true test allocates array until OOME. >>>> Max heap size is limited by 32M for this test to ensure that GC >>>> will occur. >>>> Since crash could occur only during PLAB resizing after GC, >>>> ResizePLAB option is explicitly turned on. >>>> >>>> http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ >>>> >>>> Thanks, >>>> Filipp. >>>> >> > From michal at frajt.eu Wed Jan 30 11:35:47 2013 From: michal at frajt.eu (Michal Frajt) Date: Wed, 30 Jan 2013 12:35:47 +0100 Subject: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS In-Reply-To: <5102F577.6000706@oracle.com> References: =?iso-8859-1?q?=3C508EB0D7=2E8020204=40oracle=2Ecom=3E_=3CCABzyjykeMB?= =?iso-8859-1?q?3gNgoCkd=2Dn4CgCDhzHUt4pdvNutGo7PqoJ7xtB6g=40mail=2Egm?= =?iso-8859-1?q?ail=2Ecom=3E_=3C50C108C2=2E9=40oracle=2Ecom=3E_=3CCABz?= =?iso-8859-1?q?yjy=3DVeiKEpARPdPAZLpdpibLKZxsCMcuMhfhswAi0CTF5Dg=40ma?= =?iso-8859-1?q?il=2Egmail=2Ecom=3E_=3CCABzyjymQJArReNr2xQ9pYA6kUMqcmx?= =?iso-8859-1?q?W9fpiykic=3DNrw8AaDG5g=40mail=2Egmail=2Ecom=3E_=3CMEO9?= =?iso-8859-1?q?HC=2471BBF3DAB4563B2D26BA972076AA26C1=40frajt=2Eeu=3E_?= =?iso-8859-1?q?=3CMEX4B1=24E1E07BA2F872E0495C06E5D1E52F22E9=40frajt?= =?iso-8859-1?q?=2Eeu=3E_=3C50EF1A85=2E4010203=40oracle=2Ecom=3E_=3C50?= =?iso-8859-1?q?EF228C=2E3030009=40oracle=2Ecom=3E_=3CMGRNT8=2415B65F8?= =?iso-8859-1?q?C7747BCD32CB2F745B9DB2BA9=40frajt=2Eeu=3E_=3C5102F577?= =?iso-8859-1?q?=2E6000706=40oracle=2Ecom=3E?= Message-ID: Hi John, We have verified the new webrev. The changes have been applied correctly. Regards, Michal Od: "John Cuthbertson" john.cuthbertson at oracle.com Komu: "Michal Frajt" michal at frajt.eu Kopie: hotspot-gc-dev at openjdk.java.net Datum: Fri, 25 Jan 2013 13:13:27 -0800 P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS > Hi Michal, > > The patch is applied. The new webrev can be found at > http://cr.openjdk.java.net/~johnc/7189971/webrev.2/ > > Thanks, > > JohnC > > On 1/17/2013 2:58 AM, Michal Frajt wrote: > > Hi John, > > > > Please apply the attached patch to the webrev. You are right, the setting of the CMS token has been somehow moved back above the method return. Additionally I have fixed the printf of the unsigned loop counter (correct is %u). 
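A small self-contained illustration of the %u point (the message text here is made up; the patch quoted later in this thread uses the same pattern with HotSpot's warning() call):

  #include <stdio.h>

  int main() {
    unsigned int loop_count = 4294967295u;    // UINT_MAX, to make the mismatch visible
    printf("looped %u times\n", loop_count);  // correct: prints 4294967295
    printf("looped %d times\n", loop_count);  // %d with an unsigned argument is undefined
                                              // for values above INT_MAX; typically prints
                                              // -1, and -Wformat warns about it
    return 0;
  }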
> > > > Regards, > > Michal > > > > Od: hotspot-gc-dev-bounces at openjdk.java.net > > Komu: hotspot-gc-dev at openjdk.java.net > > Kopie: > > Datum: Thu, 10 Jan 2013 12:20:28 -0800 > > P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for non-incremental mode of CMS > > > > > >> Hi Michal, > >> > >> On 1/10/2013 11:46 AM, John Cuthbertson wrote: > >>> Hi Michal, > >>> > >>> Many apologies for the delay in generating a new webrev for this > >>> change but here is the new one: > >>> http://cr.openjdk.java.net/~johnc/7189971/webrev.1/ > >>> > >>> Can you verify the webrev to make sure that changes have been applied > >>> correctly? Looking at the new webrev it seems that the setting of the > >>> CMS has been moved back above the return out of the loop. Was this > >>> intentional? > >> The above should be "... setting of the CMS token has been ...". > >> > >> JohnC > >> > >>> I've done a couple of sanity tests with GCOld with CMSWaitDuration=0 > >>> and CMSWaitDuration=1500 with CMS. > >>> > >>> Regards, > >>> > >>> JohnC > >>> > >>> On 12/12/2012 4:35 AM, Michal Frajt wrote: > >>>> All, > >>>> Find the attached patch. It implements proposed recommendations and > >>>> requested changes. Please mind that the CMSWaitDuration set to -1 > >>>> (never wait) requires new parameter CMSCheckInterval (develop only, > >>>> 1000 milliseconds default - constant). The parameter defines the > >>>> next CMS cycle start check interval in the case there are no > >>>> desynchronization (notifications) events on the CGC_lock. > >>>> > >>>> Tested with the Solaris/amd64 build > >>>> CMS > >>>> + CMSWaitDuration>0 OK > >>>> + CMSWaitDuration=0 OK > >>>> + CMSWaitDuration<0 OK > >>>> iCMS > >>>> + CMSWaitDuration>0 OK > >>>> + CMSWaitDuration=0 OK > >>>> + CMSWaitDuration<0 OK > >>>> Regards, > >>>> Michal > >>>> Od: hotspot-gc-dev-bounces at openjdk.java.net > >>>> Komu: hotspot-gc-dev at openjdk.java.net > >>>> Kopie: > >>>> Datum: Fri, 7 Dec 2012 18:48:48 +0100 > >>>> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for > >>>> non-incremental mode of CMS > >>>> > >>>>> Hi John/Jon/Ramki, > >>>>> > >>>>> All proposed recommendations and requested changes have been > >>>>> implemented. We are going to test it on Monday. You will get the new > >>>>> tested patch soon. > >>>>> > >>>>> The attached code here just got compiled, no test executed yet, it > >>>>> might contain a bug, but you can quickly review it and send your > >>>>> comments. > >>>>> > >>>>> Best regards > >>>>> Michal > >>>>> > >>>>> > >>>>> // Wait until the next synchronous GC, a concurrent full gc request, > >>>>> // or a timeout, whichever is earlier. 
> >>>>> void ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long > >>>>> t_millis) { > >>>>> // Wait time in millis or 0 value representing infinite wait for > >>>>> a scavenge > >>>>> assert(t_millis >= 0, "Wait time for scavenge should be 0 or > >>>>> positive"); > >>>>> > >>>>> GenCollectedHeap* gch = GenCollectedHeap::heap(); > >>>>> double start_time_secs = os::elapsedTime(); > >>>>> double end_time_secs = start_time_secs + (t_millis / ((double) > >>>>> MILLIUNITS)); > >>>>> > >>>>> // Total collections count before waiting loop > >>>>> unsigned int before_count; > >>>>> { > >>>>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); > >>>>> before_count = gch->total_collections(); > >>>>> } > >>>>> > >>>>> unsigned int loop_count = 0; > >>>>> > >>>>> while(!_should_terminate) { > >>>>> double now_time = os::elapsedTime(); > >>>>> long wait_time_millis; > >>>>> > >>>>> if(t_millis != 0) { > >>>>> // New wait limit > >>>>> wait_time_millis = (long) ((end_time_secs - now_time) * > >>>>> MILLIUNITS); > >>>>> if(wait_time_millis <= 0) { > >>>>> // Wait time is over > >>>>> break; > >>>>> } > >>>>> } else { > >>>>> // No wait limit, wait if necessary forever > >>>>> wait_time_millis = 0; > >>>>> } > >>>>> > >>>>> // Wait until the next event or the remaining timeout > >>>>> { > >>>>> MutexLockerEx x(CGC_lock, Mutex::_no_safepoint_check_flag); > >>>>> > >>>>> set_CMS_flag(CMS_cms_wants_token); // to provoke notifies > >>>>> if (_should_terminate || _collector->_full_gc_requested) { > >>>>> return; > >>>>> } > >>>>> assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); > >>>>> CGC_lock->wait(Mutex::_no_safepoint_check_flag, > >>>>> wait_time_millis); > >>>>> clear_CMS_flag(CMS_cms_wants_token); > >>>>> assert(!CMS_flag_is_set(CMS_cms_has_token | > >>>>> CMS_cms_wants_token), > >>>>> "Should not be set"); > >>>>> } > >>>>> > >>>>> // Extra wait time check before entering the heap lock to get > >>>>> the collection count > >>>>> if(t_millis != 0 && os::elapsedTime() >= end_time_secs) { > >>>>> // Wait time is over > >>>>> break; > >>>>> } > >>>>> > >>>>> // Total collections count after the event > >>>>> unsigned int after_count; > >>>>> { > >>>>> MutexLockerEx hl(Heap_lock, Mutex::_no_safepoint_check_flag); > >>>>> after_count = gch->total_collections(); > >>>>> } > >>>>> > >>>>> if(before_count != after_count) { > >>>>> // There was a collection - success > >>>>> break; > >>>>> } > >>>>> > >>>>> // Too many loops warning > >>>>> if(++loop_count == 0) { > >>>>> warning("wait_on_cms_lock_for_scavenge() has looped %d > >>>>> times", loop_count - 1); > >>>>> } > >>>>> } > >>>>> } > >>>>> > >>>>> void ConcurrentMarkSweepThread::sleepBeforeNextCycle() { > >>>>> while (!_should_terminate) { > >>>>> if (CMSIncrementalMode) { > >>>>> icms_wait(); > >>>>> if(CMSWaitDuration >= 0) { > >>>>> // Wait until the next synchronous GC, a concurrent full gc > >>>>> // request or a timeout, whichever is earlier. > >>>>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); > >>>>> } > >>>>> return; > >>>>> } else { > >>>>> if(CMSWaitDuration >= 0) { > >>>>> // Wait until the next synchronous GC, a concurrent full gc > >>>>> // request or a timeout, whichever is earlier. 
> >>>>> wait_on_cms_lock_for_scavenge(CMSWaitDuration); > >>>>> } else { > >>>>> // Wait until any cms_lock event not to call > >>>>> shouldConcurrentCollect permanently > >>>>> wait_on_cms_lock(0); > >>>>> } > >>>>> } > >>>>> // Check if we should start a CMS collection cycle > >>>>> if (_collector->shouldConcurrentCollect()) { > >>>>> return; > >>>>> } > >>>>> // .. collection criterion not yet met, let's go back > >>>>> // and wait some more > >>>>> } > >>>>> } > >>>>> > >>>>> Od: hotspot-gc-dev-bounces at openjdk.java.net > >>>>> Komu: "Jon Masamitsu" jon.masamitsu at oracle.com,"John Cuthbertson" > >>>>> john.cuthbertson at oracle.com > >>>>> Kopie: hotspot-gc-dev at openjdk.java.net > >>>>> Datum: Thu, 6 Dec 2012 23:43:29 -0800 > >>>>> P?edmet: Re: RFR(S): 7189971: Implement CMSWaitDuration for > >>>>> non-incremental mode of CMS > >>>>> > >>>>>> Hi John -- > >>>>>> > >>>>>> wrt the changes posted, i see the intent of the code and agree with > >>>>>> it. I have a few minor suggestions on the > >>>>>> details of how it's implemented. My comments are inline below, > >>>>>> interleaved with the code: > >>>>>> > >>>>>> 317 // Wait until the next synchronous GC, a concurrent full gc > >>>>>> request, > >>>>>> 318 // or a timeout, whichever is earlier. > >>>>>> 319 void > >>>>>> ConcurrentMarkSweepThread::wait_on_cms_lock_for_scavenge(long > >>>>>> t_millis) { > >>>>>> 320 // Wait for any cms_lock event when timeout not specified > >>>>>> (0 millis) > >>>>>> 321 if (t_millis == 0) { > >>>>>> 322 wait_on_cms_lock(t_millis); > >>>>>> 323 return; > >>>>>> 324 } > >>>>>> > >>>>>> I'd completely avoid the special case above because it would miss the > >>>>>> part about waiting for a > >>>>>> scavenge, instead dealing with that case in the code in the loop below > >>>>>> directly. The idea > >>>>>> of the "0" value is not to ask that we return immediately, but that we > >>>>>> wait, if necessary > >>>>>> forever, for a scavenge. The "0" really represents the value infinity > >>>>>> in that sense. This would > >>>>>> be in keeping with our use of wait() with a "0" value for timeout at > >>>>>> other places in the JVM as > >>>>>> well, so it's consistent. > >>>>>> > >>>>>> 325 > >>>>>> 326 GenCollectedHeap* gch = GenCollectedHeap::heap(); > >>>>>> 327 double start_time = os::elapsedTime(); > >>>>>> 328 double end_time = start_time + (t_millis / 1000.0); > >>>>>> > >>>>>> Note how, the end_time == start_time for the special case of t_millis > >>>>>> == 0, so we need to treat that > >>>>>> case specially below. > >>>>>> > >>>>>> 329 > >>>>>> 330 // Total collections count before waiting loop > >>>>>> 331 unsigned int before_count; > >>>>>> 332 { > >>>>>> 333 MutexLockerEx hl(Heap_lock, > >>>>>> Mutex::_no_safepoint_check_flag); > >>>>>> 334 before_count = gch->total_collections(); > >>>>>> 335 } > >>>>>> > >>>>>> Good. 
> >>>>>> > >>>>>> 336 > >>>>>> 337 while (true) { > >>>>>> 338 double now_time = os::elapsedTime(); > >>>>>> 339 long wait_time_millis = (long)((end_time - now_time) * > >>>>>> 1000.0); > >>>>>> 340 > >>>>>> 341 if (wait_time_millis <= 0) { > >>>>>> 342 // Wait time is over > >>>>>> 343 break; > >>>>>> 344 } > >>>>>> > >>>>>> Modify to: > >>>>>> if (t_millis != 0) { > >>>>>> if (wait_time_millis <= 0) { > >>>>>> // Wait time is over > >>>>>> break; > >>>>>> } > >>>>>> } else { > >>>>>> wait_time_millis = 0; // for use in wait() below > >>>>>> } > >>>>>> > >>>>>> 345 > >>>>>> 346 // Wait until the next event or the remaining timeout > >>>>>> 347 { > >>>>>> 348 MutexLockerEx x(CGC_lock, > >>>>>> Mutex::_no_safepoint_check_flag); > >>>>>> 349 if (_should_terminate || _collector->_full_gc_requested) { > >>>>>> 350 return; > >>>>>> 351 } > >>>>>> 352 set_CMS_flag(CMS_cms_wants_token); // to provoke > >>>>>> notifies > >>>>>> > >>>>>> insert: assert(t_millis == 0 || wait_time_millis > 0, "Sanity"); > >>>>>> > >>>>>> 353 CGC_lock->wait(Mutex::_no_safepoint_check_flag, > >>>>>> wait_time_millis); > >>>>>> 354 clear_CMS_flag(CMS_cms_wants_token); > >>>>>> 355 assert(!CMS_flag_is_set(CMS_cms_has_token | > >>>>>> CMS_cms_wants_token), > >>>>>> 356 "Should not be set"); > >>>>>> 357 } > >>>>>> 358 > >>>>>> 359 // Extra wait time check before entering the heap lock to > >>>>>> get > >>>>>> the collection count > >>>>>> 360 if (os::elapsedTime() >= end_time) { > >>>>>> 361 // Wait time is over > >>>>>> 362 break; > >>>>>> 363 } > >>>>>> > >>>>>> Modify above wait time check to make an exception for t_miliis == 0: > >>>>>> // Extra wait time check before checking collection count > >>>>>> if (t_millis != 0 && os::elapsedTime() >= end_time) { > >>>>>> // wait time exceeded > >>>>>> break; > >>>>>> } > >>>>>> > >>>>>> 364 > >>>>>> 365 // Total collections count after the event > >>>>>> 366 unsigned int after_count; > >>>>>> 367 { > >>>>>> 368 MutexLockerEx hl(Heap_lock, > >>>>>> Mutex::_no_safepoint_check_flag); > >>>>>> 369 after_count = gch->total_collections(); > >>>>>> 370 } > >>>>>> 371 > >>>>>> 372 if (before_count != after_count) { > >>>>>> 373 // There was a collection - success > >>>>>> 374 break; > >>>>>> 375 } > >>>>>> 376 } > >>>>>> 377 } > >>>>>> > >>>>>> While it is true that we do not have a case where the method is called > >>>>>> with a time of "0", I think we > >>>>>> want that value to be treated correctly as "infinity". For the case > >>>>>> where we do not want a wait at all, > >>>>>> we should use a small positive value, like "1 ms" to signal that > >>>>>> intent, i.e. -XX:CMSWaitDuration=1, > >>>>>> reserving CMSWaitDuration=0 to signal infinity. (We could also do that > >>>>>> by reserving negative values to > >>>>>> signal infinity, but that would make the code in the loop a bit > >>>>>> more fiddly.) > >>>>>> > >>>>>> As mentioned in my previous email, I'd like to see this tested with > >>>>>> CMSWaitDuration set to 0, positive and > >>>>>> negative values (if necessary, we can reject negative value settings), > >>>>>> and with ExplicitGCInvokesConcurrent. > >>>>>> > >>>>>> Rest looks OK to me, although I am not sure how this behaves with > >>>>>> iCMS, as I have forgotten that part of the > >>>>>> code. > >>>>>> > >>>>>> Finally, in current code (before these changes) there are two callers > >>>>>> of the former wait_for_cms_lock() method, > >>>>>> one here in sleepBeforeNextCycle() and one from the precleaning loop. 
> >>>>>> I think the right thing has been done > >>>>>> in terms of leaving the latter alone. > >>>>>> > >>>>>> It would be good if this were checked with CMSInitiatingOccupancy set > >>>>>> to 0 (or a small value), CMSWaitDuration set to 0, > >>>>>> -+PromotionFailureALot and checking that (1) it does not deadlock (2) > >>>>>> CMS cycles start very soon after the end of > >>>>>> a scavenge (and not at random times as Michal has observed earlier, > >>>>>> although i am guessing that is difficult to test). > >>>>>> It would be good to repeat the above test with iCMS as well. > >>>>>> > >>>>>> thanks! > >>>>>> -- ramki > >>>>>> > >>>>>> On Thu, Dec 6, 2012 at 1:39 PM, Srinivas Ramakrishna wrote: > >>>>>>> Thanks Jon for the pointer: > >>>>>>> > >>>>>>> > >>>>>>> On Thu, Dec 6, 2012 at 1:06 PM, Jon Masamitsu wrote: > >>>>>>>> > >>>>>>>> On 12/05/12 14:47, Srinivas Ramakrishna wrote: > >>>>>>>>> The high level idea looks correct. I'll look at the details in a > >>>>>>>>> bit (seriously this time; sorry it dropped off my plate last > >>>>>>>>> time I promised). > >>>>>>>>> Does anyone have a pointer to the related discussion thread on > >>>>>>>>> this aias from earlier in the year, by chance, so one could > >>>>>>>>> refresh one's > >>>>>>>>> memory of that discussion? > >>>>>>>> subj: CMSWaitDuration unstable behavior > >>>>>>>> > >>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/thread.html > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>> also: > >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-August/004880.html > >>>>>>> > >>>>>>> On to it later this afternoon, and TTYL w/review. > >>>>>>> - ramki > From bengt.rutisson at oracle.com Wed Jan 30 12:57:34 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 30 Jan 2013 13:57:34 +0100 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 In-Reply-To: References: <50EC99ED.4090903@oracle.com> <50EE0E33.8010408@oracle.com> Message-ID: <510918BE.7010808@oracle.com> Hi John, This looks good to me. I think Vitaly has a point about the fact that we just "know" that _parallel_workers == NULL is equivalent to parallel_marking_threads() == 0. I'm not going to insist on this, but would it make sense to add an assert to convey this information? I'm actually less worried about the case "parallel_marking_threads() > 0 but _parallel_workers == NULL" since that will result in a null de-reference that can be fairly easy to debug. But maybe this assert could be added at the start of the method: assert(_parallel_workers == NULL || parallel_marking_threads() > 0, "work gang not set up correctly"); I'm not sure we need that assert and I'm not convinced this is the right place to have it. But my thought is that this will detect an unexpected state that we would otherwise silently ignore. Bengt This is probably not such a big problem if we On 1/10/13 1:47 AM, Vitaly Davidovich wrote: > > Hi John, > > Thanks for the response. Yeah, I figured it's the same thing since > it's not null iff # of workers > 0. However, if this relationship is > ever broken or perhaps the gang can be set to null at some point even > if workers > 0, then this code will segv again. Hence I thought a > null guard is a bit better, but it was just a side comment - code > looks fine as is. > > Thanks > > Sent from my phone > > On Jan 9, 2013 7:41 PM, "John Cuthbertson" > > wrote: > > Hi Vitaly, > > Thanks for looking over the changes. 
AFAICT checking if > _parallel_workers is not null is equivalent to checking that the > number of parallel marking threads is > 0. I went with the latter > check as other references to the parallel workers work gang are > guarded by it. I'm not sure why the code was originally written > that way but my guess is that, when originally written, the > marking threads (like the concurrent refinement threads currently) > were not in a work gang. > > Thanks, > > JohnC > > On 1/8/2013 8:37 PM, Vitaly Davidovich wrote: >> >> Hi John, >> >> What's the advantage of checking parallel marking thread count > >> 0 rather than checking if parallel workers is not NULL? Is it >> clearer that way? I'm thinking checking for NULL here (perhaps >> with a comment on when NULL can happen) may be a bit more robust >> in case it can be null for some other reason, even if parallel >> marking thread count is > 0. >> >> Looks good though. >> >> Thanks >> >> Sent from my phone >> >> On Jan 8, 2013 5:14 PM, "John Cuthbertson" >> > > wrote: >> >> Hi Everyone, >> >> Can I please have a couple of volunteers look over the fix >> for this CR - the webrev can be found at: >> http://cr.openjdk.java.net/~johnc/8005875/webrev.0/ >> >> >> Summary: >> One of the modules in the Kitchensink test generates a >> VM_PrintThreads vm operation. The JVM crashes when it tries >> to print out G1's concurrent marking worker threads when >> ParallelGCThreads=0 because the work gang has not been >> created. The fix is to add the same check that's used >> elsewhere in G1's concurrent marking. >> >> Testing: >> Kitchensink with ParallelGCThreads=0 >> >> Thanks, >> >> JohnC >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Wed Jan 30 13:14:35 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 30 Jan 2013 14:14:35 +0100 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <50F858DA.8050508@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> Message-ID: <51091CBB.7080500@oracle.com> Hi John, Thanks for doing these changes! Looks good. Ship it! Bengt On 1/17/13 9:02 PM, John Cuthbertson wrote: > Hi Bengt, > > There's a new webrev at: > http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ > > It looks larger than the previous webrev but the most of the change > was tweaking comments. The actual code changes are smaller. > > Testing was the same as before. > > On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >> >> I see. I didn't think about the difference betweeen ParallelGCThreads >> and ParallelRefProcEnabled. BTW, not part of this change, but why do >> we have ParallelRefProcEnabled? And why is it false by default? >> Wouldn't it make more sense to have it just be dependent on >> ParallelGCThreads? > > I don't know and the answer is probably lost in the dark depths of > time - I can only speculate. For G1 we have a CR to turn > ParallelRefProcEnabled on if the number of GC threads > 1. I'm not > sure about the other collectors. > >> >>> Setting it once in weakRefsWork() will not be sufficient. We will >>> run into an assertion failure in >>> ParallelTaskTerminator::offer_termination(). 
>>> >>> During the reference processing, the do_void() method of the >>> complete_gc oop closure (in our case the complete gc oop closure is >>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>> times (in process_phase1, sometimes process_phase2, process_phase3, >>> and process_phaseJNI) >>> >>> Setting the phase sets the number of active tasks (or threads) that >>> the termination protocol in do_marking_step() will wait for. When an >>> invocation of do_marking_step() offers termination, the number of >>> tasks/threads in the terminator instance is decremented. So Setting >>> the phase once will let the first execution of do_marking_step (with >>> termination) from process_phase1() succeed, but subsequent calls to >>> do_marking_step() will result in the assertion failure. >>> >>> We also can't unconditionally set it in the do_void() method or even >>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>> instances of this closure are created by each of the worker threads >>> in the MT-case. >>> >>> Note when processing is multi-threaded the complete_gc instance used >>> is the one passed into the ProcessTask's work method (passed into >>> process_discovered_references() using the task executor instance) >>> which may not necessarily be the same complete gc instance as the >>> one passed directly into process_discovered_references(). >> >> Thanks for this detailed explanation. It really helped! >> >> I understand the issue now, but I still think it is very confusing >> that _cm->set_phase() is called from >> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >> G1CMParDrainMarkingStackClosure::do_void() in the single threaded case. >> >>> It might be possible to record whether processing is MT in the >>> G1CMRefProcTaskExecutor class and always pass the executor instance >>> into process_discovered_references. We could then set processing to >>> MT so that the execute() methods in the executor instance are >>> invoked but call the Proxy class' work method directly. Then we >>> could override the set_single_threaded() routine (called just before >>> process_phaseJNI) to set the phase. >> >> I think this would be a better solution, but if I understand it >> correctly it would mean that we would have to change all the >> collectors to always pass a TaskExecutor. All of them currently pass >> NULL in the non-MT case. I think it would be simpler if they always >> passed a TaskExecutor but it is a pretty big change. > > I wasn't meaning to do that for the other collectors just G1's > concurrent mark reference processor i.e. fool the ref processor into > think it's MT so that the parallel task executor is used but only use > the work gang if reference processing was _really_ MT. > > I decided not to do this as there is an easier way. For the non-MT > case we do not need to enter the termination protocol in > CMTask::do_marking_step(). When there's only one thread we don't need > to use the ParallelTaskTerminator to wait for other threads. And we > certainly don't need stealing. Hence the solution is to only do the > termination and stealing if the closure is instantiated for MT > reference processing. That removes the set_phase call(). > >> Another possibility is to introduce some kind of prepare method to >> the VoidClosure (or maybe in a specialized subclass for ref >> processing). 
Then we could do something like: >> >> complete_gc->prologue(); >> if (mt_processing) { >> RefProcPhase2Task phase2(*this, refs_lists, >> !discovery_is_atomic() /*marks_oops_alive*/); >> task_executor->execute(phase2); >> } else { >> for (uint i = 0; i < _max_num_q; i++) { >> process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc); >> } >> } >> >> G1CMParDrainMarkingStackClosure::prologue() could do the call to >> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >> have to do it. > > The above is a reasonable extension to the reference processing code. > I no longer need this feature for this change but we should submit a > CR for it. I'll do that. > >> BTW, not really part of your change, but above code is duplicated >> three times in ReferenceProcessor::process_discovered_reflist(). >> Would be nice to factor this out to a method. > > Completely agree. Again I'll submit a CR for it. > > Thanks, > > JohnC From bengt.rutisson at oracle.com Wed Jan 30 13:21:08 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 30 Jan 2013 14:21:08 +0100 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> Message-ID: <51091E44.10405@oracle.com> Hi Charlie, On 1/17/13 9:52 PM, Charlie Hunt wrote: > John / Bengt: > > I think I can offer a bit of info on Bengt's earlier question about ParallelProcRefEnabled being disabled by default. > > IIRC, there was one workload that showed a slight perf regression with +ParallelProcRefEnabled. That workload that showed a regression may not be as relevant as it was back when the evaluation / decision was made to disable it by default? Thanks for providing some history for these flags! > You both have probably thought about this already? My reaction is ... I think reasonable defaults would be to enable +ParallelProcRefEnabled for Parallel[Old], CMS and G1 when ParallelGCThreads is greater than 1, and disable -ParallelProcRefEnabled with -XX:+UseSerialGC. This sounds like a good enhancement. John, if you agree, could you file a CR for it? Or would you like me to file it? Bengt > > hths, > > charlie ... > > On Jan 17, 2013, at 3:02 PM, John Cuthbertson wrote: > >> Hi Bengt, >> >> There's a new webrev at: http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ >> >> It looks larger than the previous webrev but the most of the change was >> tweaking comments. The actual code changes are smaller. >> >> Testing was the same as before. >> >> On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >>> I see. I didn't think about the difference betweeen ParallelGCThreads >>> and ParallelRefProcEnabled. BTW, not part of this change, but why do >>> we have ParallelRefProcEnabled? And why is it false by default? >>> Wouldn't it make more sense to have it just be dependent on >>> ParallelGCThreads? >> I don't know and the answer is probably lost in the dark depths of time >> - I can only speculate. For G1 we have a CR to turn >> ParallelRefProcEnabled on if the number of GC threads > 1. I'm not sure >> about the other collectors. >> >>>> Setting it once in weakRefsWork() will not be sufficient. We will run >>>> into an assertion failure in >>>> ParallelTaskTerminator::offer_termination(). 
>>>> >>>> During the reference processing, the do_void() method of the >>>> complete_gc oop closure (in our case the complete gc oop closure is >>>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>>> times (in process_phase1, sometimes process_phase2, process_phase3, >>>> and process_phaseJNI) >>>> >>>> Setting the phase sets the number of active tasks (or threads) that >>>> the termination protocol in do_marking_step() will wait for. When an >>>> invocation of do_marking_step() offers termination, the number of >>>> tasks/threads in the terminator instance is decremented. So Setting >>>> the phase once will let the first execution of do_marking_step (with >>>> termination) from process_phase1() succeed, but subsequent calls to >>>> do_marking_step() will result in the assertion failure. >>>> >>>> We also can't unconditionally set it in the do_void() method or even >>>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>>> instances of this closure are created by each of the worker threads >>>> in the MT-case. >>>> >>>> Note when processing is multi-threaded the complete_gc instance used >>>> is the one passed into the ProcessTask's work method (passed into >>>> process_discovered_references() using the task executor instance) >>>> which may not necessarily be the same complete gc instance as the one >>>> passed directly into process_discovered_references(). >>> Thanks for this detailed explanation. It really helped! >>> >>> I understand the issue now, but I still think it is very confusing >>> that _cm->set_phase() is called from >>> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >>> G1CMParDrainMarkingStackClosure::do_void() in the single threaded case. >>> >>>> It might be possible to record whether processing is MT in the >>>> G1CMRefProcTaskExecutor class and always pass the executor instance >>>> into process_discovered_references. We could then set processing to >>>> MT so that the execute() methods in the executor instance are invoked >>>> but call the Proxy class' work method directly. Then we could >>>> override the set_single_threaded() routine (called just before >>>> process_phaseJNI) to set the phase. >>> I think this would be a better solution, but if I understand it >>> correctly it would mean that we would have to change all the >>> collectors to always pass a TaskExecutor. All of them currently pass >>> NULL in the non-MT case. I think it would be simpler if they always >>> passed a TaskExecutor but it is a pretty big change. >> I wasn't meaning to do that for the other collectors just G1's >> concurrent mark reference processor i.e. fool the ref processor into >> think it's MT so that the parallel task executor is used but only use >> the work gang if reference processing was _really_ MT. >> >> I decided not to do this as there is an easier way. For the non-MT case >> we do not need to enter the termination protocol in >> CMTask::do_marking_step(). When there's only one thread we don't need to >> use the ParallelTaskTerminator to wait for other threads. And we >> certainly don't need stealing. Hence the solution is to only do the >> termination and stealing if the closure is instantiated for MT reference >> processing. That removes the set_phase call(). >> >>> Another possibility is to introduce some kind of prepare method to the >>> VoidClosure (or maybe in a specialized subclass for ref processing). 
>>> Then we could do something like: >>> >>> complete_gc->prologue(); >>> if (mt_processing) { >>> RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() >>> /*marks_oops_alive*/); >>> task_executor->execute(phase2); >>> } else { >>> for (uint i = 0; i < _max_num_q; i++) { >>> process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc); >>> } >>> } >>> >>> G1CMParDrainMarkingStackClosure::prologue() could do the call to >>> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >>> have to do it. >> The above is a reasonable extension to the reference processing code. I >> no longer need this feature for this change but we should submit a CR >> for it. I'll do that. >> >>> BTW, not really part of your change, but above code is duplicated >>> three times in ReferenceProcessor::process_discovered_reflist(). Would >>> be nice to factor this out to a method. >> Completely agree. Again I'll submit a CR for it. >> >> Thanks, >> >> JohnC From erik.helin at oracle.com Wed Jan 30 14:08:18 2013 From: erik.helin at oracle.com (Erik Helin) Date: Wed, 30 Jan 2013 15:08:18 +0100 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <7F5DD330-F1DB-467A-9A73-DAC4F87478DA@oracle.com> References: <51011221.8050102@oracle.com> <51018414.1000103@oracle.com> <510695D9.5030603@oracle.com> <7F5DD330-F1DB-467A-9A73-DAC4F87478DA@oracle.com> Message-ID: <51092952.9030004@oracle.com> Hi Stefan, thanks for reviewing! See new webrev for hotspot at: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.02/ On 01/30/2013 10:56 AM, Stefan Karlsson wrote: > On 28 jan 2013, at 16:14, Erik Helin wrote: >> Webrev: >> - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.01/ > > Would you mind using two indentation levels here: > +MetaspaceCounters::MetaspaceCounters() : > + _capacity(NULL), > + _used(NULL), > + _max_capacity(NULL) { Fixed! On 01/30/2013 10:56 AM, Stefan Karlsson wrote: > I think it would be good to also extract this: > 68 const char *counter_name = PerfDataManager::counter_name(ms, "minCapacity"); > 69 PerfDataManager::create_constant(SUN_RT, counter_name, PerfData::U_Bytes, > > into a create_ms_constant, just like you did with create_ms_variable. Done! On 01/30/2013 10:56 AM, Stefan Karlsson wrote: > You should probably use CHECK instead of THREAD. > + _max_capacity = create_ms_variable(ms, "maxCapacity", max_capacity, THREAD); > + _capacity = create_ms_variable(ms, "capacity", curr_capacity, THREAD); > + _used = create_ms_variable(ms, "used", used, THREAD); Agree, I've updated the code. On 01/30/2013 10:56 AM, Stefan Karlsson wrote: > I think it would be enough to assert that the variables are not NULL. > void MetaspaceCounters::update_capacity() { > assert(UsePerfData, "Should not be called unless being used"); > size_t capacity_in_bytes = MetaspaceAux::capacity_in_bytes(); > + if (_capacity != NULL) { Agree, I've updated the code. Thanks, Erik > thanks, > StefanK > >> - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.01/ >> >> What do you think? >> >> Thanks, >> Erik >> >>> Jon >>> >>> On 1/24/2013 2:51 AM, Erik Helin wrote: >>>> Hi all, >>>> >>>> here are the HotSpot changes for fixing JDK-8004172. This change uses >>>> the new namespace "sun.gc.metaspace" for the metaspace counters and >>>> also removes some code from metaspaceCounters.hpp/cpp that is not >>>> needed any longer. 
>>>> >>>> Note that the tests will continue to fail until the JDK part of the >>>> change finds it way into the hotspot-gc forest. >>>> >>>> The JDK part of the change is also out for review on >>>> serviceability-dev at openjdk.java.net. >>>> >>>> Webrev: >>>> HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ >>>> JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ >>>> >>>> Bug: >>>> http://bugs.sun.com/view_bug.do?bug_id=8004172 >>>> >>>> Testing: >>>> Run the jstat jtreg tests locally on my machine on a repository where >>>> I've applied both the JDK changes and the HotSpot changes. >>>> >>>> Thanks, >>>> Erik >> > > From vitalyd at gmail.com Wed Jan 30 14:19:03 2013 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 30 Jan 2013 09:19:03 -0500 Subject: RFR(S): 8007036: G1: Too many old regions added to last mixed GC In-Reply-To: <510840A0.1000709@oracle.com> References: <5106F4D1.50200@oracle.com> <510840A0.1000709@oracle.com> Message-ID: Hi John, This nicely addresses my comments - thanks. Vitaly Sent from my phone On Jan 29, 2013 4:36 PM, "John Cuthbertson" wrote: > Hi Everyone, > > Here's a new webrev based upon feedback from Vitaly: > http://cr.openjdk.java.net/~**johnc/8007036/webrev.1/ > > JohnC > > > On 1/28/2013 1:59 PM, John Cuthbertson wrote: > >> Hi Everyone, >> >> Can I have a couple of volunteers look over the changes for this CR? The >> webrev is at: http://cr.openjdk.java.net/~**johnc/8007036/webrev.0/ >> >> Summary: >> When adding old regions to the collection set we don't take into account >> whether the old regions added so far take us below the G1HeapWastePercent. >> As a result we could end up adding (and collecting) many more regions than >> we needed to. The actual number added was the minimum between the number of >> candidate regions / G1MixedGCCountTarget and 10% of the heap. >> >> Currently the calculation of the reclaimable bytes as a percentage of the >> uses exact arithmetic. It might make sense, at some point in the future, to >> use inexact arithmetic (rounding) in the decision on whether to continue >> mixed GCs and use exact arithmetic when adding regions. >> >> As part of this change I've also moved a couple routines from >> CollectionSetChooser to G1CollectorPolicy. I think they "fit" better in >> G1CollectorPolicy. >> >> Testing: >> GCOld with tenuring threshold = 1 and a marking threshold = 10. >> >> Many thanks to Monica for identifying the issue. >> >> Thanks, >> >> JohnC >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nhann at chalmers.se Wed Jan 30 14:43:20 2013 From: nhann at chalmers.se (Dang Nhan Nguyen) Date: Wed, 30 Jan 2013 14:43:20 +0000 Subject: Parallel Scavenge: algorithm ParallelOldGC Message-ID: Hi all, I would like to ask about Parallel Scavenge GC: Which parallel mark-compact algorithm is used for the old generation in Parallel Scavenge GC (when XX:+UseParallelOldGC is set)? (Reference to publication(s) are appreciated) Thanks, /Nhan Nguyen -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jon.masamitsu at oracle.com Wed Jan 30 15:29:17 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 30 Jan 2013 07:29:17 -0800 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <510695D9.5030603@oracle.com> References: <51011221.8050102@oracle.com> <51018414.1000103@oracle.com> <510695D9.5030603@oracle.com> Message-ID: <51093C4D.70103@oracle.com> Erik, Sorry to make extra work for you but I've been convinced that SUN_GC is the right name to use. Even though the metadata may seem (in my head) more like a runtime quantity, it (class metadata) has been associated with GC historically so should continue to be associated with GC. So the SUN_GC is appropriate. Again, my apologies for flip-flopping on this. Jon On 1/28/2013 7:14 AM, Erik Helin wrote: > Jon, > > thanks for your review! > > On 01/24/2013 07:57 PM, Jon Masamitsu wrote: >> I looked at the hotspot changes and they look correct. But I'm >> not sure that "sun.gc" should be in the name of the counter. Maybe >> use SUN_RT instead of SUN_GC. > > I've updated the code to use the SUN_RT namespace instead of the > SUN_GC namespace. This also required changes to the JDK code. > > I've also added better error handling if a Java Out Of Memory > exceptions occur is raised in PerfDataManager::create_variable. > > Finally, I've moved some common code to the function create_ms_variable. > > Webrev: > - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.01/ > - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.01/ > > What do you think? > > Thanks, > Erik > >> Jon >> >> On 1/24/2013 2:51 AM, Erik Helin wrote: >>> Hi all, >>> >>> here are the HotSpot changes for fixing JDK-8004172. This change uses >>> the new namespace "sun.gc.metaspace" for the metaspace counters and >>> also removes some code from metaspaceCounters.hpp/cpp that is not >>> needed any longer. >>> >>> Note that the tests will continue to fail until the JDK part of the >>> change finds it way into the hotspot-gc forest. >>> >>> The JDK part of the change is also out for review on >>> serviceability-dev at openjdk.java.net. >>> >>> Webrev: >>> HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ >>> JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ >>> >>> Bug: >>> http://bugs.sun.com/view_bug.do?bug_id=8004172 >>> >>> Testing: >>> Run the jstat jtreg tests locally on my machine on a repository where >>> I've applied both the JDK changes and the HotSpot changes. >>> >>> Thanks, >>> Erik > From john.cuthbertson at oracle.com Wed Jan 30 17:05:51 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 09:05:51 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <51091E44.10405@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> <51091E44.10405@oracle.com> Message-ID: <510952EF.5090806@oracle.com> Hi Bengt, We already have a CR for G1. We'll just broaden the scope of that. I'll add Charlies email as a comment. JohnC On 1/30/2013 5:21 AM, Bengt Rutisson wrote: > > Hi Charlie, > > On 1/17/13 9:52 PM, Charlie Hunt wrote: >> John / Bengt: >> >> I think I can offer a bit of info on Bengt's earlier question about >> ParallelProcRefEnabled being disabled by default. 
>> >> IIRC, there was one workload that showed a slight perf regression >> with +ParallelProcRefEnabled. That workload that showed a regression >> may not be as relevant as it was back when the evaluation / decision >> was made to disable it by default? > > Thanks for providing some history for these flags! > >> You both have probably thought about this already? My reaction is >> ... I think reasonable defaults would be to enable >> +ParallelProcRefEnabled for Parallel[Old], CMS and G1 when >> ParallelGCThreads is greater than 1, and disable >> -ParallelProcRefEnabled with -XX:+UseSerialGC. > > This sounds like a good enhancement. John, if you agree, could you > file a CR for it? Or would you like me to file it? > > Bengt > >> >> hths, >> >> charlie ... >> >> On Jan 17, 2013, at 3:02 PM, John Cuthbertson wrote: >> >>> Hi Bengt, >>> >>> There's a new webrev at: >>> http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ >>> >>> It looks larger than the previous webrev but the most of the change was >>> tweaking comments. The actual code changes are smaller. >>> >>> Testing was the same as before. >>> >>> On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >>>> I see. I didn't think about the difference betweeen ParallelGCThreads >>>> and ParallelRefProcEnabled. BTW, not part of this change, but why do >>>> we have ParallelRefProcEnabled? And why is it false by default? >>>> Wouldn't it make more sense to have it just be dependent on >>>> ParallelGCThreads? >>> I don't know and the answer is probably lost in the dark depths of time >>> - I can only speculate. For G1 we have a CR to turn >>> ParallelRefProcEnabled on if the number of GC threads > 1. I'm not sure >>> about the other collectors. >>> >>>>> Setting it once in weakRefsWork() will not be sufficient. We will run >>>>> into an assertion failure in >>>>> ParallelTaskTerminator::offer_termination(). >>>>> >>>>> During the reference processing, the do_void() method of the >>>>> complete_gc oop closure (in our case the complete gc oop closure is >>>>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>>>> times (in process_phase1, sometimes process_phase2, process_phase3, >>>>> and process_phaseJNI) >>>>> >>>>> Setting the phase sets the number of active tasks (or threads) that >>>>> the termination protocol in do_marking_step() will wait for. When an >>>>> invocation of do_marking_step() offers termination, the number of >>>>> tasks/threads in the terminator instance is decremented. So Setting >>>>> the phase once will let the first execution of do_marking_step (with >>>>> termination) from process_phase1() succeed, but subsequent calls to >>>>> do_marking_step() will result in the assertion failure. >>>>> >>>>> We also can't unconditionally set it in the do_void() method or even >>>>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>>>> instances of this closure are created by each of the worker threads >>>>> in the MT-case. >>>>> >>>>> Note when processing is multi-threaded the complete_gc instance used >>>>> is the one passed into the ProcessTask's work method (passed into >>>>> process_discovered_references() using the task executor instance) >>>>> which may not necessarily be the same complete gc instance as the one >>>>> passed directly into process_discovered_references(). >>>> Thanks for this detailed explanation. It really helped! 
>>>> >>>> I understand the issue now, but I still think it is very confusing >>>> that _cm->set_phase() is called from >>>> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >>>> G1CMParDrainMarkingStackClosure::do_void() in the single threaded >>>> case. >>>> >>>>> It might be possible to record whether processing is MT in the >>>>> G1CMRefProcTaskExecutor class and always pass the executor instance >>>>> into process_discovered_references. We could then set processing to >>>>> MT so that the execute() methods in the executor instance are invoked >>>>> but call the Proxy class' work method directly. Then we could >>>>> override the set_single_threaded() routine (called just before >>>>> process_phaseJNI) to set the phase. >>>> I think this would be a better solution, but if I understand it >>>> correctly it would mean that we would have to change all the >>>> collectors to always pass a TaskExecutor. All of them currently pass >>>> NULL in the non-MT case. I think it would be simpler if they always >>>> passed a TaskExecutor but it is a pretty big change. >>> I wasn't meaning to do that for the other collectors just G1's >>> concurrent mark reference processor i.e. fool the ref processor into >>> think it's MT so that the parallel task executor is used but only use >>> the work gang if reference processing was _really_ MT. >>> >>> I decided not to do this as there is an easier way. For the non-MT case >>> we do not need to enter the termination protocol in >>> CMTask::do_marking_step(). When there's only one thread we don't >>> need to >>> use the ParallelTaskTerminator to wait for other threads. And we >>> certainly don't need stealing. Hence the solution is to only do the >>> termination and stealing if the closure is instantiated for MT >>> reference >>> processing. That removes the set_phase call(). >>> >>>> Another possibility is to introduce some kind of prepare method to the >>>> VoidClosure (or maybe in a specialized subclass for ref processing). >>>> Then we could do something like: >>>> >>>> complete_gc->prologue(); >>>> if (mt_processing) { >>>> RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() >>>> /*marks_oops_alive*/); >>>> task_executor->execute(phase2); >>>> } else { >>>> for (uint i = 0; i < _max_num_q; i++) { >>>> process_phase2(refs_lists[i], is_alive, keep_alive, >>>> complete_gc); >>>> } >>>> } >>>> >>>> G1CMParDrainMarkingStackClosure::prologue() could do the call to >>>> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >>>> have to do it. >>> The above is a reasonable extension to the reference processing code. I >>> no longer need this feature for this change but we should submit a CR >>> for it. I'll do that. >>> >>>> BTW, not really part of your change, but above code is duplicated >>>> three times in ReferenceProcessor::process_discovered_reflist(). Would >>>> be nice to factor this out to a method. >>> Completely agree. Again I'll submit a CR for it. 
>>> >>> Thanks, >>> >>> JohnC > From john.cuthbertson at oracle.com Wed Jan 30 17:07:49 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 09:07:49 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <51091CBB.7080500@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> <51091CBB.7080500@oracle.com> Message-ID: <51095365.7090004@oracle.com> Hi Bengt, Thanks. And I agree these changes look better. Thanks for insisting. :) JohnC On 1/30/2013 5:14 AM, Bengt Rutisson wrote: > > Hi John, > > Thanks for doing these changes! Looks good. > > Ship it! > Bengt > > > On 1/17/13 9:02 PM, John Cuthbertson wrote: >> Hi Bengt, >> >> There's a new webrev at: >> http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ >> >> It looks larger than the previous webrev but the most of the change >> was tweaking comments. The actual code changes are smaller. >> >> Testing was the same as before. >> >> On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >>> >>> I see. I didn't think about the difference betweeen >>> ParallelGCThreads and ParallelRefProcEnabled. BTW, not part of this >>> change, but why do we have ParallelRefProcEnabled? And why is it >>> false by default? Wouldn't it make more sense to have it just be >>> dependent on ParallelGCThreads? >> >> I don't know and the answer is probably lost in the dark depths of >> time - I can only speculate. For G1 we have a CR to turn >> ParallelRefProcEnabled on if the number of GC threads > 1. I'm not >> sure about the other collectors. >> >>> >>>> Setting it once in weakRefsWork() will not be sufficient. We will >>>> run into an assertion failure in >>>> ParallelTaskTerminator::offer_termination(). >>>> >>>> During the reference processing, the do_void() method of the >>>> complete_gc oop closure (in our case the complete gc oop closure is >>>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>>> times (in process_phase1, sometimes process_phase2, process_phase3, >>>> and process_phaseJNI) >>>> >>>> Setting the phase sets the number of active tasks (or threads) that >>>> the termination protocol in do_marking_step() will wait for. When >>>> an invocation of do_marking_step() offers termination, the number >>>> of tasks/threads in the terminator instance is decremented. So >>>> Setting the phase once will let the first execution of >>>> do_marking_step (with termination) from process_phase1() succeed, >>>> but subsequent calls to do_marking_step() will result in the >>>> assertion failure. >>>> >>>> We also can't unconditionally set it in the do_void() method or >>>> even the constructor of G1CMParDrainMarkingStackClosure. Separate >>>> instances of this closure are created by each of the worker threads >>>> in the MT-case. >>>> >>>> Note when processing is multi-threaded the complete_gc instance >>>> used is the one passed into the ProcessTask's work method (passed >>>> into process_discovered_references() using the task executor >>>> instance) which may not necessarily be the same complete gc >>>> instance as the one passed directly into >>>> process_discovered_references(). >>> >>> Thanks for this detailed explanation. It really helped! 
>>> >>> I understand the issue now, but I still think it is very confusing >>> that _cm->set_phase() is called from >>> G1CMRefProcTaskExecutor::execute() in the multithreaded case and >>> from G1CMParDrainMarkingStackClosure::do_void() in the single >>> threaded case. >>> >>>> It might be possible to record whether processing is MT in the >>>> G1CMRefProcTaskExecutor class and always pass the executor instance >>>> into process_discovered_references. We could then set processing to >>>> MT so that the execute() methods in the executor instance are >>>> invoked but call the Proxy class' work method directly. Then we >>>> could override the set_single_threaded() routine (called just >>>> before process_phaseJNI) to set the phase. >>> >>> I think this would be a better solution, but if I understand it >>> correctly it would mean that we would have to change all the >>> collectors to always pass a TaskExecutor. All of them currently pass >>> NULL in the non-MT case. I think it would be simpler if they always >>> passed a TaskExecutor but it is a pretty big change. >> >> I wasn't meaning to do that for the other collectors just G1's >> concurrent mark reference processor i.e. fool the ref processor into >> think it's MT so that the parallel task executor is used but only use >> the work gang if reference processing was _really_ MT. >> >> I decided not to do this as there is an easier way. For the non-MT >> case we do not need to enter the termination protocol in >> CMTask::do_marking_step(). When there's only one thread we don't need >> to use the ParallelTaskTerminator to wait for other threads. And we >> certainly don't need stealing. Hence the solution is to only do the >> termination and stealing if the closure is instantiated for MT >> reference processing. That removes the set_phase call(). >> >>> Another possibility is to introduce some kind of prepare method to >>> the VoidClosure (or maybe in a specialized subclass for ref >>> processing). Then we could do something like: >>> >>> complete_gc->prologue(); >>> if (mt_processing) { >>> RefProcPhase2Task phase2(*this, refs_lists, >>> !discovery_is_atomic() /*marks_oops_alive*/); >>> task_executor->execute(phase2); >>> } else { >>> for (uint i = 0; i < _max_num_q; i++) { >>> process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc); >>> } >>> } >>> >>> G1CMParDrainMarkingStackClosure::prologue() could do the call to >>> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >>> have to do it. >> >> The above is a reasonable extension to the reference processing code. >> I no longer need this feature for this change but we should submit a >> CR for it. I'll do that. >> >>> BTW, not really part of your change, but above code is duplicated >>> three times in ReferenceProcessor::process_discovered_reflist(). >>> Would be nice to factor this out to a method. >> >> Completely agree. Again I'll submit a CR for it. >> >> Thanks, >> >> JohnC > From john.cuthbertson at oracle.com Wed Jan 30 17:15:47 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 09:15:47 -0800 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 In-Reply-To: <51090238.7000004@oracle.com> References: <50FD2323.9050702@oracle.com> <5106B2EE.9060509@oracle.com> <5106CA6E.6030405@oracle.com> <51079337.6040108@oracle.com> <51090238.7000004@oracle.com> Message-ID: <51095543.1050505@oracle.com> Hi Filipp, The test looks good. Thank you. 
JohnC On 1/30/2013 3:21 AM, Filipp Zhinkin wrote: > Here is an updated webrev: > http://cr.openjdk.java.net/~kshefov/8000311/webrev.01/ > > I've added ExplicitGCInvokesConcurrent option and replaced > heap-filling by frequent System.gc() calls. > > Thanks, > Filipp. > > On 01/29/2013 01:15 PM, Filipp Zhinkin wrote: >> Hi John, >> >> thanks for advice! I'll reimplement the test using System.gc() calls >> and -XX:+ExplicitGCInvokesConcurrent option. >> And yes, you're right, PLAB resizes only at the end of incremental >> GC. Thats why I've tried to provoke GC by filling up the heap instead >> of calling System.gc() (I've missed ExplicitGCInvokesConcurrent flag >> before). >> >> Thanks, >> Filipp. >> >> On 01/28/2013 10:58 PM, John Cuthbertson wrote: >>> Hi Filipp, >>> >>> In addition to what Jon suggests (i.e. using System.gc() to >>> guarantee a GC), please add -XX:+ExplicitGCInvokesConcurrent. The >>> addition of this flag will cause G1 to perform an incremental GC >>> (instead of the full GC that a System.gc() call provokes). IIRC the >>> PLAB resizing code is only exercised at the end of an incremental GC. >>> >>> Thanks, >>> >>> JohnC >>> >>> On 1/28/2013 9:18 AM, Jon Masamitsu wrote: >>>> Can this test be implemented using a call to >>>> System.gc() instead of trying to fill up the heap >>>> to provoke a GC? >>>> >>>> Jon >>>> >>>> On 01/21/13 03:14, Filipp Zhinkin wrote: >>>>> Hi all, >>>>> >>>>> Would someone review the following regression test please? >>>>> >>>>> Test verifies that VM will not crash with G1 GC and >>>>> ParallelGCThreads == 0. >>>>> >>>>> To ensure that it is true test allocates array until OOME. >>>>> Max heap size is limited by 32M for this test to ensure that GC >>>>> will occur. >>>>> Since crash could occur only during PLAB resizing after GC, >>>>> ResizePLAB option is explicitly turned on. >>>>> >>>>> http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ >>>>> >>>>> Thanks, >>>>> Filipp. >>>>> >>> >> > From john.cuthbertson at oracle.com Wed Jan 30 17:27:12 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 09:27:12 -0800 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 In-Reply-To: <510918BE.7010808@oracle.com> References: <50EC99ED.4090903@oracle.com> <50EE0E33.8010408@oracle.com> <510918BE.7010808@oracle.com> Message-ID: <510957F0.1070704@oracle.com> Hi Bengt, Thanks for looking over the change. Vitaly does have a point but IMO we change all the checks to check for null or none because (a) his point would be equally valid other places where we check the # of marking threads, and (b) it would better to consistent so that we don't have to search for multiple patterns if we ever need to change this. With that said I have an idea - a new webrev will be published shortly. JohnC On 1/30/2013 4:57 AM, Bengt Rutisson wrote: > > Hi John, > > This looks good to me. > > I think Vitaly has a point about the fact that we just "know" that > _parallel_workers == NULL is equivalent to parallel_marking_threads() > == 0. > > I'm not going to insist on this, but would it make sense to add an > assert to convey this information? I'm actually less worried about the > case "parallel_marking_threads() > 0 but _parallel_workers == NULL" > since that will result in a null de-reference that can be fairly easy > to debug. 
> > But maybe this assert could be added at the start of the method: > > assert(_parallel_workers == NULL || parallel_marking_threads() > 0, > "work gang not set up correctly"); > > I'm not sure we need that assert and I'm not convinced this is the > right place to have it. But my thought is that this will detect an > unexpected state that we would otherwise silently ignore. > > Bengt > > > This is probably not such a big problem if we > > On 1/10/13 1:47 AM, Vitaly Davidovich wrote: >> >> Hi John, >> >> Thanks for the response. Yeah, I figured it's the same thing since >> it's not null iff # of workers > 0. However, if this relationship is >> ever broken or perhaps the gang can be set to null at some point even >> if workers > 0, then this code will segv again. Hence I thought a >> null guard is a bit better, but it was just a side comment - code >> looks fine as is. >> >> Thanks >> >> Sent from my phone >> >> On Jan 9, 2013 7:41 PM, "John Cuthbertson" >> > wrote: >> >> Hi Vitaly, >> >> Thanks for looking over the changes. AFAICT checking if >> _parallel_workers is not null is equivalent to checking that the >> number of parallel marking threads is > 0. I went with the latter >> check as other references to the parallel workers work gang are >> guarded by it. I'm not sure why the code was originally written >> that way but my guess is that, when originally written, the >> marking threads (like the concurrent refinement threads >> currently) were not in a work gang. >> >> Thanks, >> >> JohnC >> >> On 1/8/2013 8:37 PM, Vitaly Davidovich wrote: >>> >>> Hi John, >>> >>> What's the advantage of checking parallel marking thread count > >>> 0 rather than checking if parallel workers is not NULL? Is it >>> clearer that way? I'm thinking checking for NULL here (perhaps >>> with a comment on when NULL can happen) may be a bit more robust >>> in case it can be null for some other reason, even if parallel >>> marking thread count is > 0. >>> >>> Looks good though. >>> >>> Thanks >>> >>> Sent from my phone >>> >>> On Jan 8, 2013 5:14 PM, "John Cuthbertson" >>> >> > wrote: >>> >>> Hi Everyone, >>> >>> Can I please have a couple of volunteers look over the fix >>> for this CR - the webrev can be found at: >>> http://cr.openjdk.java.net/~johnc/8005875/webrev.0/ >>> >>> >>> Summary: >>> One of the modules in the Kitchensink test generates a >>> VM_PrintThreads vm operation. The JVM crashes when it tries >>> to print out G1's concurrent marking worker threads when >>> ParallelGCThreads=0 because the work gang has not been >>> created. The fix is to add the same check that's used >>> elsewhere in G1's concurrent marking. >>> >>> Testing: >>> Kitchensink with ParallelGCThreads=0 >>> >>> Thanks, >>> >>> JohnC >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Wed Jan 30 17:32:21 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 30 Jan 2013 09:32:21 -0800 Subject: request for review (s) - 8005452: Create new flags for Metaspace resizing policy In-Reply-To: <5107CC0D.5030409@oracle.com> References: <50F424A0.6080907@oracle.com> <50F6DA6C.906@oracle.com> <51031A35.8040004@oracle.com> <5106948B.3090408@oracle.com> <5106D1BB.1050804@oracle.com> <5107CC0D.5030409@oracle.com> Message-ID: <51095925.5030705@oracle.com> I added you improved message. Thanks. 
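The improved message is the jio_fprintf() call Jesper suggests in the mail quoted below; folded into the Min/Max consistency check in arguments.cpp it presumably ends up looking roughly like this (the wrapper function is an assumption for illustration; only the jio_fprintf() call itself comes from the thread):

    // Sketch of the check in arguments.cpp; the function wrapper is assumed,
    // the message text is Jesper's suggestion.
    static bool verify_metaspace_free_ratios() {
      if (MinMetaspaceFreeRatio > MaxMetaspaceFreeRatio) {
        jio_fprintf(defaultStream::error_stream(),
                    "MinMetaspaceFreeRatio (%s" UINTX_FORMAT ") must be less than or "
                    "equal to MaxMetaspaceFreeRatio (%s" UINTX_FORMAT ")\n",
                    FLAG_IS_DEFAULT(MinMetaspaceFreeRatio) ? "Default: " : "",
                    MinMetaspaceFreeRatio,
                    FLAG_IS_DEFAULT(MaxMetaspaceFreeRatio) ? "Default: " : "",
                    MaxMetaspaceFreeRatio);
        return false;   // argument checking fails and the VM will not start
      }
      return true;
    }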
Jon On 1/29/2013 5:18 AM, Jesper Wilhelmsson wrote: > Jon, > > OK, in that case I think the error message could be slightly more > informative, just so that it is clear to someone that only sets one of > them that the flag they set conflicts with the default value of > another flag. How about this: > > jio_fprintf(defaultStream::error_stream(), > "MinMetaspaceFreeRatio (%s" UINTX_FORMAT ") must be less > than or " > "equal to MaxMetaspaceFreeRatio (%s" UINTX_FORMAT ")\n", > FLAG_IS_DEFAULT(MinMetaspaceFreeRatio) ? "Default: " : "", > MinMetaspaceFreeRatio, > FLAG_IS_DEFAULT(MaxMetaspaceFreeRatio) ? "Default: " : "", > MaxMetaspaceFreeRatio); > > If you don't like this I'm fine with your current patch as well. > /Jesper > > > On 28/1/13 8:30 PM, Jon Masamitsu wrote: >> Jesper, >> >> If the user is increasing MinMetaspaceFreeRatio, and >> MaxMetaspaceFreeRatio is not compatible, maybe we >> should be forcing the user to think about MaxMetaspaceFreeRatio. >> It's not obvious to me that something like >> >> MaxMetaspaceFreeRatio = MinMetaspaceFreeRatio + 1 >> >> is a good choice. >> >> Jon >> >> On 01/28/13 07:08, Jesper Wilhelmsson wrote: >>> On 2013-01-26 00:50, Jon Masamitsu wrote: >>>> I've update the webrev2 (now 2 separate webrevs) for review comments. >>>> >>>> 8005452: NPG: Create new flags for Metaspace resizing policy >>>> >>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.03/ >>> >>> I have looked at the flag changes and they look good. I have a >>> question though. How do we want to handle the case where the user sets >>> only MinMetaspaceFreeRatio = 30 ? With your current change this will >>> give an error because the min is larger than the default max (20). >>> >>> Would it make sense to assume that the user actually wants to use >>> min=30 and increase max to 30 as well? (or slightly more if they can't >>> be equal) And maybe issue a warning that the value of max has been >>> changed. >>> The error would then just be given if the user specifies both flags >>> and they don't work out. >>> >>> I'm not asking you to do this change now but it relates to other >>> changes we have done recently. >>> /Jesper >>> >>> >>>> >>>> 8006815: NPG: Trigger a GC for metadata collection just before the >>>> threshold >>>> is exceeded. >>>> >>>> http://cr.openjdk.java.net/~jmasa/8006815/webrev.01/ >>>> >>>> >>>> On 1/16/2013 8:50 AM, Jon Masamitsu wrote: >>>>> I've added checks to arguments.cpp that are analogous to the >>>>> checks for MinHeapFreeRatio / MaxHeapFreeRatio >>>>> >>>>> Changes since webrev.00 are in arguments.cpp >>>>> >>>>> http://cr.openjdk.java.net/~jmasa/8005452/webrev.01/ >>>>> >>>>> Thanks, Vitaly. >>>>> >>>>> Jon >>>>> >>>>> On 1/15/2013 5:55 AM, Vitaly Davidovich wrote: >>>>>> Hi Jon, >>>>>> >>>>>> Does it make sense to validate that the new flags are consistent >>>>>> (I.e. max >>>>>>> = min)? That is, if user changes one or both such that max< min, >>>>>>> should >>>>>> VM report an error and not start? >>>>>> >>>>>> Thanks >>>>>> >>>>>> Sent from my phone >>>>>> On Jan 14, 2013 10:31 AM, "Jon Masamitsu" >>>>>> wrote: >>>>>> >>>>>>> 8005452: Create new flags for Metaspace resizing policy >>>>>>> >>>>>>> Previously the calculation of the metadata capacity at which >>>>>>> to do a GC (high water mark, HWM) to recover >>>>>>> unloaded classes used the MinHeapFreeRatio >>>>>>> and MaxHeapFreeRatio to decide on the next HWM. That >>>>>>> generally left an excessive amount of unused capacity for >>>>>>> metadata. 
This change adds specific flags for metadata >>>>>>> capacity with defaults more conservative in terms of >>>>>>> unused capacity. >>>>>>> >>>>>>> Added an additional check for doing a GC before expanding >>>>>>> the metadata capacity. Required adding a new parameter to >>>>>>> get_new_chunk(). >>>>>>> >>>>>>> Added some additional diagnostic prints. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~**jmasa/8005452/webrev.00/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks. >>>>>>> From jon.masamitsu at oracle.com Wed Jan 30 17:51:37 2013 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 30 Jan 2013 09:51:37 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <51091E44.10405@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> <51091E44.10405@oracle.com> Message-ID: <51095DA9.6050600@oracle.com> On 1/30/2013 5:21 AM, Bengt Rutisson wrote: > > Hi Charlie, > > On 1/17/13 9:52 PM, Charlie Hunt wrote: >> John / Bengt: >> >> I think I can offer a bit of info on Bengt's earlier question about >> ParallelProcRefEnabled being disabled by default. >> >> IIRC, there was one workload that showed a slight perf regression >> with +ParallelProcRefEnabled. That workload that showed a regression >> may not be as relevant as it was back when the evaluation / decision >> was made to disable it by default? > > Thanks for providing some history for these flags! > >> You both have probably thought about this already? My reaction is >> ... I think reasonable defaults would be to enable >> +ParallelProcRefEnabled for Parallel[Old], CMS and G1 when >> ParallelGCThreads is greater than 1, and disable >> -ParallelProcRefEnabled with -XX:+UseSerialGC. > > This sounds like a good enhancement. John, if you agree, could you > file a CR for it? Or would you like me to file it? I think the regression was with benchmarks that did not have much Reference processing but still had to pay the build up/tear down cost for the parallel work. Andrew did the work and I think there was lots of performance data that went into turning it off by default. Jon > > Bengt > >> >> hths, >> >> charlie ... >> >> On Jan 17, 2013, at 3:02 PM, John Cuthbertson wrote: >> >>> Hi Bengt, >>> >>> There's a new webrev at: >>> http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ >>> >>> It looks larger than the previous webrev but the most of the change was >>> tweaking comments. The actual code changes are smaller. >>> >>> Testing was the same as before. >>> >>> On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >>>> I see. I didn't think about the difference betweeen ParallelGCThreads >>>> and ParallelRefProcEnabled. BTW, not part of this change, but why do >>>> we have ParallelRefProcEnabled? And why is it false by default? >>>> Wouldn't it make more sense to have it just be dependent on >>>> ParallelGCThreads? >>> I don't know and the answer is probably lost in the dark depths of time >>> - I can only speculate. For G1 we have a CR to turn >>> ParallelRefProcEnabled on if the number of GC threads > 1. I'm not sure >>> about the other collectors. >>> >>>>> Setting it once in weakRefsWork() will not be sufficient. We will run >>>>> into an assertion failure in >>>>> ParallelTaskTerminator::offer_termination(). 
>>>>> >>>>> During the reference processing, the do_void() method of the >>>>> complete_gc oop closure (in our case the complete gc oop closure is >>>>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>>>> times (in process_phase1, sometimes process_phase2, process_phase3, >>>>> and process_phaseJNI) >>>>> >>>>> Setting the phase sets the number of active tasks (or threads) that >>>>> the termination protocol in do_marking_step() will wait for. When an >>>>> invocation of do_marking_step() offers termination, the number of >>>>> tasks/threads in the terminator instance is decremented. So Setting >>>>> the phase once will let the first execution of do_marking_step (with >>>>> termination) from process_phase1() succeed, but subsequent calls to >>>>> do_marking_step() will result in the assertion failure. >>>>> >>>>> We also can't unconditionally set it in the do_void() method or even >>>>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>>>> instances of this closure are created by each of the worker threads >>>>> in the MT-case. >>>>> >>>>> Note when processing is multi-threaded the complete_gc instance used >>>>> is the one passed into the ProcessTask's work method (passed into >>>>> process_discovered_references() using the task executor instance) >>>>> which may not necessarily be the same complete gc instance as the one >>>>> passed directly into process_discovered_references(). >>>> Thanks for this detailed explanation. It really helped! >>>> >>>> I understand the issue now, but I still think it is very confusing >>>> that _cm->set_phase() is called from >>>> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >>>> G1CMParDrainMarkingStackClosure::do_void() in the single threaded >>>> case. >>>> >>>>> It might be possible to record whether processing is MT in the >>>>> G1CMRefProcTaskExecutor class and always pass the executor instance >>>>> into process_discovered_references. We could then set processing to >>>>> MT so that the execute() methods in the executor instance are invoked >>>>> but call the Proxy class' work method directly. Then we could >>>>> override the set_single_threaded() routine (called just before >>>>> process_phaseJNI) to set the phase. >>>> I think this would be a better solution, but if I understand it >>>> correctly it would mean that we would have to change all the >>>> collectors to always pass a TaskExecutor. All of them currently pass >>>> NULL in the non-MT case. I think it would be simpler if they always >>>> passed a TaskExecutor but it is a pretty big change. >>> I wasn't meaning to do that for the other collectors just G1's >>> concurrent mark reference processor i.e. fool the ref processor into >>> think it's MT so that the parallel task executor is used but only use >>> the work gang if reference processing was _really_ MT. >>> >>> I decided not to do this as there is an easier way. For the non-MT case >>> we do not need to enter the termination protocol in >>> CMTask::do_marking_step(). When there's only one thread we don't >>> need to >>> use the ParallelTaskTerminator to wait for other threads. And we >>> certainly don't need stealing. Hence the solution is to only do the >>> termination and stealing if the closure is instantiated for MT >>> reference >>> processing. That removes the set_phase call(). >>> >>>> Another possibility is to introduce some kind of prepare method to the >>>> VoidClosure (or maybe in a specialized subclass for ref processing). 
>>>> Then we could do something like: >>>> >>>> complete_gc->prologue(); >>>> if (mt_processing) { >>>> RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() >>>> /*marks_oops_alive*/); >>>> task_executor->execute(phase2); >>>> } else { >>>> for (uint i = 0; i < _max_num_q; i++) { >>>> process_phase2(refs_lists[i], is_alive, keep_alive, >>>> complete_gc); >>>> } >>>> } >>>> >>>> G1CMParDrainMarkingStackClosure::prologue() could do the call to >>>> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >>>> have to do it. >>> The above is a reasonable extension to the reference processing code. I >>> no longer need this feature for this change but we should submit a CR >>> for it. I'll do that. >>> >>>> BTW, not really part of your change, but above code is duplicated >>>> three times in ReferenceProcessor::process_discovered_reflist(). Would >>>> be nice to factor this out to a method. >>> Completely agree. Again I'll submit a CR for it. >>> >>> Thanks, >>> >>> JohnC > From john.cuthbertson at oracle.com Wed Jan 30 19:19:11 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 11:19:11 -0800 Subject: RFR(XS): 8001384: G1: assert(!is_null(v)) failed: narrow oop value can never be zero Message-ID: <5109722F.9080309@oracle.com> Hi Everyone, Can I have a couple volunteers review the changes for this CR - the webrev can be found at: http://cr.openjdk.java.net/~johnc/8001384/webrev.0/ Background: The ReduceInitialCardMarks optimization allows the JIT compiler, in some circumstances, to skip generation of the card marks associated with the initializing stores of a newly allocated object. The skipped card marks are then elided into a single deferred operation. The deferred card marks are recorded in a field in the allocating thread. Typically deferred card marks are flushed (and the associated cards dirtied) when another set of card marks is to be deferred for the same thread, or at the start of the next GC (in CollectedHeap::ensure_parseability()). The problem here was that the deferred card marks, if any, for a given thread were not being flushed when that thread exited. As a result we would end up with missing (card marks) write barriers, (in the case of G1) missing RSet entries, and dangling references. The fix is, obviously, flush any deferred cards marks before the thread exits, and before flushing the G1 dirty card queue for the thread. Although the problem was found by G1's marking verification (VerifyDuringGC) occasionally detecting missing RSet entries and dangling references, I believe this issue affects all the collectors. Testing: runThese bigapp on the failing machine with IHOP=10 and marking verification; runThese on my local workstation with IHOP=5 and marking verification; gc test suite to sanity test the other collectors. 
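The ordering constraint described in the fix amounts to the following shape in JavaThread::exit(); this is a simplified sketch rather than the posted webrev, and the heap-side helper name flush_deferred_store_barrier() is an assumption:

    // Simplified sketch, not the posted webrev; the heap-side helper name is assumed.
    void JavaThread::exit(bool destroy_vm, ExitType exit_type) {
      // ...
      // Flush any card marks this thread deferred under ReduceInitialCardMarks,
      // dirtying the corresponding cards. For G1 this may enqueue entries on the
      // thread's dirty card queue.
      Universe::heap()->flush_deferred_store_barrier(this);

      // Only afterwards flush the G1 barrier queues, so that entries added by the
      // flush above are not lost when the thread leaves the active-thread list.
      if (UseG1GC) {
        flush_barrier_queues();
      }
      // ...
    }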
Thanks, JohnC From john.cuthbertson at oracle.com Wed Jan 30 19:43:24 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 11:43:24 -0800 Subject: RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap In-Reply-To: <50F5E6BE.9040901@oracle.com> References: <50F5E6BE.9040901@oracle.com> Message-ID: <510977DC.4040300@oracle.com> Hi Everyone, Here's a new webrev based upon comments from Vitaly: http://cr.openjdk.java.net/~johnc/7176479/webrev.1/ Thanks, JohnC On 1/15/2013 3:31 PM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of people look over the changes for this CR - the > webrev can be found at: > http://cr.openjdk.java.net/~johnc/7176479/webrev.0/ > > Background: > The issue here was that we were encoding the card index into the card > counts table entries along with the GC number so that we could > determine if the count associated with was valid. We had a check to > ensure that the maximum card index could be encoded in an int. With > such large heap size - the number of cards could not be encoded and so > the check failed. > > The previous mechanism was an attempt to solve the problem of one > thread arriving late to the actual GC work. The thread in question was > being held up zeroing the card counts table at the start of the GC. > The card counts table is used to determine which cards are being > refined frequently. Once a card has been refined frequently enough, > further refinements of that card are delayed by placing the card into > a fixed size evicting table - the hot card cache. The card would then > be refined when it was evicted from the hot card cache or when the > cache was drained during the next GC. > > To solve the problem of zeroing we added an epoch (GC number) to the > entries in the counts table and, eliminate the increase in footprint, > we made the counts table into a cache which would expand if needed. > This approach had some negatives: we might have to refine two cards > during a single refinement operation, hashing the card, and performing > CAS operations increasing the overhead of concurrent refinement. Also > expanding the counts table during a GC incurred a penalty. > > This approach also limited the heap size to just under 1TB - which the > systems team ran into. > > The new approach effectively undoes the previous mechanism and > re-simplifies the card counts table. > > Summary of Changes: > The hot card cache and card counts table have been moved from the > concurrent refinement code into their own files. > > The hot card cache can now exist independently of whether the counts > table exists. In this case refining a card once adds it to the hot > card cache, i.e. all cards are treated as 'hot'. > > The interface to the hot card cache has been simplified - a simple > query and a simple drain routine. This simplifies the calling code in > g1RemSet.cpp and results in up to only a single card being refined for > every call to "refine_card" instead of possibly two. This should > reduce the overhead of concurrent refinement. > > The number of cards that the hot card cache can hold before cards > start getting evicted is controlled by the flag G1ConcRSLogCacheSize, > which is now product flag. The default value is 10 giving a hot card > cache that can hold 1K cards. > > The card counts table has been greatly simplified. It is a simple > array of counts how many times a card has been refined. The space for > the table is now allocated from virtual memory instead of C heap. 
The > space for the table is committed when the heap is initially committed > and the spans the committed size of the heap. When the committed size > of the heap is expanded, the counts table is also expanded to cover > the newly expanded heap. If we fail to commit the memory for the > counts table, cards that map to the uncommitted space will be treated > as cold, i.e. they will be refined immediately. Having a simpler > counts table also should reduce the overhead of concurrent refinement > (there is no need to hash the card index and there are no CAS > operations) Having a simpler interface will allow us to change the > underlying data structure to an alternative that's perhaps more sparse > in the future. > > During an incremental GC we no longer zero the entire counts table. We > now zero the cards spanned by a region when the region is freed (i.e. > when we free the collection set at the end of a GC and when we free > regions at the end of a cleanup). If a card was "hot" before a GC > then we will consider it hot after the GC and the first refinement > after the GC will insert the card into the hot card cache. > Furthermore, since we don't refine cards in young regions, we only > need to clear the counts associated with cards spanned by non-young > regions. > > During a full GC we still discard the entries in the hot card cache > and zero the counts for all the cards in the heap. > > Testing: > GC Test suite with MaxTenuringThreshold=0 (to increase the amount of > refinement) and a low IHOP value (to force cleanups). > SPECjbb2005 with a 1.5TB heap size and 256GB young size, > MaxTenuringThreshold=0 and a low IHOP value (1%). The systems team are > continuing to test with very large heaps. > From ysr1729 at gmail.com Wed Jan 30 19:58:24 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 30 Jan 2013 11:58:24 -0800 Subject: G1 concurrent phase durations do not state the time units ("secs") Message-ID: Hi John, all -- I'm using 7u9, perhaps this has been fixed subsequently. 
Here's an example of the missing units (and the inconsistency):- The young and mixed pauses print duration units of "secs" :- 2013-01-30T01:46:45.652-0800: 9522.134: [GC pause (young), 1.10620800 secs] 2013-01-30T01:47:03.563-0800: 9540.045: [GC pause (young), 0.90593900 secs] 2013-01-30T01:47:18.619-0800: 9555.101: [GC pause (young), 0.81425400 secs] 2013-01-30T01:47:59.536-0800: 9596.018: [GC pause (young), 0.84935400 secs] 2013-01-30T01:48:12.097-0800: 9608.579: [GC pause (young) (initial-mark), 0.50153600 secs] 2013-01-30T01:48:59.189-0800: 9655.671: [GC pause (young), 0.02815400 secs] 2013-01-30T01:51:53.952-0800: 9830.457: [GC pause (mixed), 0.66860800 secs] 2013-01-30T01:53:20.704-0800: 9917.186: [GC pause (mixed), 0.47479200 secs] 2013-01-30T01:54:41.098-0800: 9997.579: [GC pause (mixed), 0.72149500 secs] 2013-01-30T01:55:58.944-0800: 10075.426: [GC pause (young), 0.32158300 secs] The concurrent phases are often missing the units, but not always (mark phase duration prints "sec", others are mum):- 2013-01-30T01:12:10.711-0800: 7447.193: [GC concurrent-root-region-scan-end, 1.0222980] 2013-01-30T01:12:41.386-0800: 7477.868: [GC concurrent-mark-end, 30.6749800 sec] 2013-01-30T01:12:41.626-0800: 7478.108: [GC concurrent-cleanup-end, 0.0063520] 2013-01-30T01:24:18.588-0800: 8175.070: [GC concurrent-root-region-scan-end, 0.5868510] 2013-01-30T01:25:01.089-0800: 8217.571: [GC concurrent-mark-end, 42.5016130 sec] 2013-01-30T01:25:01.321-0800: 8217.803: [GC concurrent-cleanup-end, 0.0057450] 2013-01-30T01:36:27.063-0800: 8903.545: [GC concurrent-root-region-scan-end, 0.3746230] 2013-01-30T01:37:18.642-0800: 8955.124: [GC concurrent-mark-end, 51.5794260 sec] 2013-01-30T01:37:18.869-0800: 8955.351: [GC concurrent-cleanup-end, 0.0048270] 2013-01-30T01:48:13.162-0800: 9609.644: [GC concurrent-root-region-scan-end, 0.5630820] 2013-01-30T01:48:55.513-0800: 9651.995: [GC concurrent-mark-end, 42.3504330 sec] 2013-01-30T01:48:55.769-0800: 9652.251: [GC concurrent-cleanup-end, 0.0041170] Would be nice to have it be consistent across G1 and indeed across all collectors, if not already the case. Also makes for more consistent parsing of logs. thanks! -- ramki -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.cuthbertson at oracle.com Wed Jan 30 20:17:41 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 12:17:41 -0800 Subject: G1 concurrent phase durations do not state the time units ("secs") In-Reply-To: References: Message-ID: <51097FE5.1090402@oracle.com> Hi Ramki, Thanks for the report. Just checked with a log that I generated this morning and the issue is still there. It looks like the units weren't added to a couple of prints in concurrentMarkThread.cpp I'll submit a CR. Expect a webrev later today or early tomorrow. Thanks, JohnC On 1/30/2013 11:58 AM, Srinivas Ramakrishna wrote: > > Hi John, all -- > > I'm using 7u9, perhaps this has been fixed subsequently. 
Here's an > example of the missing units (and the inconsistency):- > > The young and mixed pauses print duration units of "secs" :- > > 2013-01-30T01:46:45.652-0800: 9522.134: [GC pause (young), 1.10620800 > secs] > 2013-01-30T01:47:03.563-0800: 9540.045: [GC pause (young), 0.90593900 > secs] > 2013-01-30T01:47:18.619-0800: 9555.101: [GC pause (young), 0.81425400 > secs] > 2013-01-30T01:47:59.536-0800: 9596.018: [GC pause (young), 0.84935400 > secs] > 2013-01-30T01:48:12.097-0800: 9608.579: [GC pause (young) > (initial-mark), 0.50153600 secs] > 2013-01-30T01:48:59.189-0800: 9655.671: [GC pause (young), 0.02815400 > secs] > 2013-01-30T01:51:53.952-0800: 9830.457: [GC pause (mixed), 0.66860800 > secs] > 2013-01-30T01:53:20.704-0800: 9917.186: [GC pause (mixed), 0.47479200 > secs] > 2013-01-30T01:54:41.098-0800: 9997.579: [GC pause (mixed), 0.72149500 > secs] > 2013-01-30T01:55:58.944-0800: 10075.426: [GC pause (young), 0.32158300 > secs] > > > The concurrent phases are often missing the units, but not always > (mark phase duration prints "sec", others are mum):- > > 2013-01-30T01:12:10.711-0800: 7447.193: [GC > concurrent-root-region-scan-end, 1.0222980] > 2013-01-30T01:12:41.386-0800: 7477.868: [GC concurrent-mark-end, > 30.6749800 sec] > 2013-01-30T01:12:41.626-0800: 7478.108: [GC concurrent-cleanup-end, > 0.0063520] > 2013-01-30T01:24:18.588-0800: 8175.070: [GC > concurrent-root-region-scan-end, 0.5868510] > 2013-01-30T01:25:01.089-0800: 8217.571: [GC concurrent-mark-end, > 42.5016130 sec] > 2013-01-30T01:25:01.321-0800: 8217.803: [GC concurrent-cleanup-end, > 0.0057450] > 2013-01-30T01:36:27.063-0800: 8903.545: [GC > concurrent-root-region-scan-end, 0.3746230] > 2013-01-30T01:37:18.642-0800: 8955.124: [GC concurrent-mark-end, > 51.5794260 sec] > 2013-01-30T01:37:18.869-0800: 8955.351: [GC concurrent-cleanup-end, > 0.0048270] > 2013-01-30T01:48:13.162-0800: 9609.644: [GC > concurrent-root-region-scan-end, 0.5630820] > 2013-01-30T01:48:55.513-0800: 9651.995: [GC concurrent-mark-end, > 42.3504330 sec] > 2013-01-30T01:48:55.769-0800: 9652.251: [GC concurrent-cleanup-end, > 0.0041170] > > > Would be nice to have it be consistent across G1 and indeed across all > collectors, if not already the case. Also makes for more consistent > parsing of logs. > > thanks! > -- ramki From kirk at kodewerk.com Wed Jan 30 20:25:13 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Wed, 30 Jan 2013 21:25:13 +0100 Subject: G1 concurrent phase durations do not state the time units ("secs") In-Reply-To: <51097FE5.1090402@oracle.com> References: <51097FE5.1090402@oracle.com> Message-ID: <35C8C843-CD7C-4B58-B25F-CBAFE54F4F7C@kodewerk.com> Hi John, Should I add that memory is reported in M,K, or B. I was really happy with just K being reported. Any comments on that? Regards, Kirk On 2013-01-30, at 9:17 PM, John Cuthbertson wrote: > Hi Ramki, > > Thanks for the report. Just checked with a log that I generated this morning and the issue is still there. It looks like the units weren't added to a couple of prints in concurrentMarkThread.cpp > > I'll submit a CR. Expect a webrev later today or early tomorrow. > > Thanks, > > JohnC > > On 1/30/2013 11:58 AM, Srinivas Ramakrishna wrote: >> >> Hi John, all -- >> >> I'm using 7u9, perhaps this has been fixed subsequently. 
Here's an example of the missing units (and the inconsistency):- >> >> The young and mixed pauses print duration units of "secs" :- >> >> 2013-01-30T01:46:45.652-0800: 9522.134: [GC pause (young), 1.10620800 secs] >> 2013-01-30T01:47:03.563-0800: 9540.045: [GC pause (young), 0.90593900 secs] >> 2013-01-30T01:47:18.619-0800: 9555.101: [GC pause (young), 0.81425400 secs] >> 2013-01-30T01:47:59.536-0800: 9596.018: [GC pause (young), 0.84935400 secs] >> 2013-01-30T01:48:12.097-0800: 9608.579: [GC pause (young) (initial-mark), 0.50153600 secs] >> 2013-01-30T01:48:59.189-0800: 9655.671: [GC pause (young), 0.02815400 secs] >> 2013-01-30T01:51:53.952-0800: 9830.457: [GC pause (mixed), 0.66860800 secs] >> 2013-01-30T01:53:20.704-0800: 9917.186: [GC pause (mixed), 0.47479200 secs] >> 2013-01-30T01:54:41.098-0800: 9997.579: [GC pause (mixed), 0.72149500 secs] >> 2013-01-30T01:55:58.944-0800: 10075.426: [GC pause (young), 0.32158300 secs] >> >> >> The concurrent phases are often missing the units, but not always (mark phase duration prints "sec", others are mum):- >> >> 2013-01-30T01:12:10.711-0800: 7447.193: [GC concurrent-root-region-scan-end, 1.0222980] >> 2013-01-30T01:12:41.386-0800: 7477.868: [GC concurrent-mark-end, 30.6749800 sec] >> 2013-01-30T01:12:41.626-0800: 7478.108: [GC concurrent-cleanup-end, 0.0063520] >> 2013-01-30T01:24:18.588-0800: 8175.070: [GC concurrent-root-region-scan-end, 0.5868510] >> 2013-01-30T01:25:01.089-0800: 8217.571: [GC concurrent-mark-end, 42.5016130 sec] >> 2013-01-30T01:25:01.321-0800: 8217.803: [GC concurrent-cleanup-end, 0.0057450] >> 2013-01-30T01:36:27.063-0800: 8903.545: [GC concurrent-root-region-scan-end, 0.3746230] >> 2013-01-30T01:37:18.642-0800: 8955.124: [GC concurrent-mark-end, 51.5794260 sec] >> 2013-01-30T01:37:18.869-0800: 8955.351: [GC concurrent-cleanup-end, 0.0048270] >> 2013-01-30T01:48:13.162-0800: 9609.644: [GC concurrent-root-region-scan-end, 0.5630820] >> 2013-01-30T01:48:55.513-0800: 9651.995: [GC concurrent-mark-end, 42.3504330 sec] >> 2013-01-30T01:48:55.769-0800: 9652.251: [GC concurrent-cleanup-end, 0.0041170] >> >> >> Would be nice to have it be consistent across G1 and indeed across all collectors, if not already the case. Also makes for more consistent parsing of logs. >> >> thanks! >> -- ramki > From bengt.rutisson at oracle.com Wed Jan 30 21:18:09 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 30 Jan 2013 22:18:09 +0100 Subject: RFR(XS): 8001384: G1: assert(!is_null(v)) failed: narrow oop value can never be zero In-Reply-To: <5109722F.9080309@oracle.com> References: <5109722F.9080309@oracle.com> Message-ID: <51098E11.70204@oracle.com> Hi John, Nice detective work to find this issue! Based on your very good explanation I think this change looks good! One small detail. I think the comment in JavaThread::exit() is a little confusing. I think you are right in your explanation (and code change) that the deferred card marks need to be flushed for all collectors. But the comment seems to suggest that it only affects G1: 1899 // Flush any deferred card marks. Flushing may add cards to this 1900 // thread's G1 dirty card queue so we have to do this before 1901 // flushing the G1 barrier queues. I would prefer to move this comment in to the G1 specific section where we flush the barrier queues: 1906 // We must flush G1-related buffers before removing a thread from 1907 // the list of active threads. 
1908 if (UseG1GC) { 1909 flush_barrier_queues(); 1910 } The comment could say something like "For G1, flushing the deferred card marks may add cards to the thread's dirty card queue. So we need to flush the barrier queues after we have flushed the deferred card marks.". Also, should we update the title of the bug? Should it have the "G1:" prefix since it is not a G1 specific fix? Thanks, Bengt On 1/30/13 8:19 PM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple volunteers review the changes for this CR - the > webrev can be found at: > http://cr.openjdk.java.net/~johnc/8001384/webrev.0/ > > Background: > The ReduceInitialCardMarks optimization allows the JIT compiler, in > some circumstances, to skip generation of the card marks associated > with the initializing stores of a newly allocated object. The skipped > card marks are then elided into a single deferred operation. > > The deferred card marks are recorded in a field in the allocating > thread. Typically deferred card marks are flushed (and the associated > cards dirtied) when another set of card marks is to be deferred for > the same thread, or at the start of the next GC (in > CollectedHeap::ensure_parseability()). > > The problem here was that the deferred card marks, if any, for a given > thread were not being flushed when that thread exited. As a result we > would end up with missing (card marks) write barriers, (in the case of > G1) missing RSet entries, and dangling references. > > The fix is, obviously, flush any deferred cards marks before the > thread exits, and before flushing the G1 dirty card queue for the thread. > > Although the problem was found by G1's marking verification > (VerifyDuringGC) occasionally detecting missing RSet entries and > dangling references, I believe this issue affects all the collectors. > > Testing: > runThese bigapp on the failing machine with IHOP=10 and marking > verification; > runThese on my local workstation with IHOP=5 and marking verification; > gc test suite to sanity test the other collectors. > > Thanks, > > JohnC > > > > From john.cuthbertson at oracle.com Wed Jan 30 21:51:47 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 30 Jan 2013 13:51:47 -0800 Subject: RFR(XS): 8001384: G1: assert(!is_null(v)) failed: narrow oop value can never be zero In-Reply-To: <51098E11.70204@oracle.com> References: <5109722F.9080309@oracle.com> <51098E11.70204@oracle.com> Message-ID: <510995F3.1020500@oracle.com> Hi Bengt, Thanks for looking ove the change On 1/30/2013 1:18 PM, Bengt Rutisson wrote: > > Hi John, > > Nice detective work to find this issue! Based on your very good > explanation I think this change looks good! > > One small detail. I think the comment in JavaThread::exit() is a > little confusing. I think you are right in your explanation (and code > change) that the deferred card marks need to be flushed for all > collectors. But the comment seems to suggest that it only affects G1: > > 1899 // Flush any deferred card marks. Flushing may add cards to this > 1900 // thread's G1 dirty card queue so we have to do this before > 1901 // flushing the G1 barrier queues. > > I would prefer to move this comment in to the G1 specific section > where we flush the barrier queues: > > 1906 // We must flush G1-related buffers before removing a thread from > 1907 // the list of active threads. 
> 1908 if (UseG1GC) { > 1909 flush_barrier_queues(); > 1910 } > > The comment could say something like "For G1, flushing the deferred > card marks may add cards to the thread's dirty card queue. So we need > to flush the barrier queues after we have flushed the deferred card > marks.". Good suggestions. Done. I would say the latter part a bit differently since we need to flush G1 barrier queues anyway but now we have an ordering imposed upon us: // We must flush the G1-related buffers before removing a thread // from the list of active threads. We must do this after any deferred // card marks have been flushed (above) so that any entries that are // added to the thread's dirty card queue as a result are not lost. if (UseG1GC) { .. } > Also, should we update the title of the bug? Should it have the "G1:" > prefix since it is not a G1 specific fix? I'm not sure and don't really have a preference. My only concern is that the original problem was reported against G1 (and I don't think I've seen any issue with the other collectors that could be traced to this). Keeping the G1 might make it easier for SQE and others to find this CR when triaging other G1 test failures. Thanks, JohnC From kirk at kodewerk.com Wed Jan 30 22:40:57 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Wed, 30 Jan 2013 23:40:57 +0100 Subject: G1 concurrent phase durations do not state the time units ("secs") In-Reply-To: <51097FE5.1090402@oracle.com> References: <51097FE5.1090402@oracle.com> Message-ID: <41178332-08A3-429F-94C4-4952EFDCBA0A@kodewerk.com> Hi John, Since we're on the subject of consistency, I have this record that reports in "sec" instead of "secs". 634.958: [GC concurrent-mark-end, 0.1525020 sec] and I have memory being reported as 252M->252M(397M) here; 568.675: [GC cleanup 252M->252M(397M), 0.0055868 secs] as well as 77M(77M)->0B(77M) here; [Eden: 77M(77M)->0B(77M) Survivors: 2048K->2048K Heap: 324M(397M)->247M(397M)] Either format works but it would be nice to have one and not both. i'll repeat my request to report in K. B seems unnecessarily granular where as M might be too corse for smaller heap sizes.. but then, maybe not. Still sorting out the potential effects on analytics. I guess you've picked up on the other places where secs is missing. Regards, Kirk On 2013-01-30, at 9:17 PM, John Cuthbertson wrote: > Hi Ramki, > > Thanks for the report. Just checked with a log that I generated this morning and the issue is still there. It looks like the units weren't added to a couple of prints in concurrentMarkThread.cpp > > I'll submit a CR. Expect a webrev later today or early tomorrow. > > Thanks, > > JohnC > > On 1/30/2013 11:58 AM, Srinivas Ramakrishna wrote: >> >> Hi John, all -- >> >> I'm using 7u9, perhaps this has been fixed subsequently. 
Here's an example of the missing units (and the inconsistency):- >> >> The young and mixed pauses print duration units of "secs" :- >> >> 2013-01-30T01:46:45.652-0800: 9522.134: [GC pause (young), 1.10620800 secs] >> 2013-01-30T01:47:03.563-0800: 9540.045: [GC pause (young), 0.90593900 secs] >> 2013-01-30T01:47:18.619-0800: 9555.101: [GC pause (young), 0.81425400 secs] >> 2013-01-30T01:47:59.536-0800: 9596.018: [GC pause (young), 0.84935400 secs] >> 2013-01-30T01:48:12.097-0800: 9608.579: [GC pause (young) (initial-mark), 0.50153600 secs] >> 2013-01-30T01:48:59.189-0800: 9655.671: [GC pause (young), 0.02815400 secs] >> 2013-01-30T01:51:53.952-0800: 9830.457: [GC pause (mixed), 0.66860800 secs] >> 2013-01-30T01:53:20.704-0800: 9917.186: [GC pause (mixed), 0.47479200 secs] >> 2013-01-30T01:54:41.098-0800: 9997.579: [GC pause (mixed), 0.72149500 secs] >> 2013-01-30T01:55:58.944-0800: 10075.426: [GC pause (young), 0.32158300 secs] >> >> >> The concurrent phases are often missing the units, but not always (mark phase duration prints "sec", others are mum):- >> >> 2013-01-30T01:12:10.711-0800: 7447.193: [GC concurrent-root-region-scan-end, 1.0222980] >> 2013-01-30T01:12:41.386-0800: 7477.868: [GC concurrent-mark-end, 30.6749800 sec] >> 2013-01-30T01:12:41.626-0800: 7478.108: [GC concurrent-cleanup-end, 0.0063520] >> 2013-01-30T01:24:18.588-0800: 8175.070: [GC concurrent-root-region-scan-end, 0.5868510] >> 2013-01-30T01:25:01.089-0800: 8217.571: [GC concurrent-mark-end, 42.5016130 sec] >> 2013-01-30T01:25:01.321-0800: 8217.803: [GC concurrent-cleanup-end, 0.0057450] >> 2013-01-30T01:36:27.063-0800: 8903.545: [GC concurrent-root-region-scan-end, 0.3746230] >> 2013-01-30T01:37:18.642-0800: 8955.124: [GC concurrent-mark-end, 51.5794260 sec] >> 2013-01-30T01:37:18.869-0800: 8955.351: [GC concurrent-cleanup-end, 0.0048270] >> 2013-01-30T01:48:13.162-0800: 9609.644: [GC concurrent-root-region-scan-end, 0.5630820] >> 2013-01-30T01:48:55.513-0800: 9651.995: [GC concurrent-mark-end, 42.3504330 sec] >> 2013-01-30T01:48:55.769-0800: 9652.251: [GC concurrent-cleanup-end, 0.0041170] >> >> >> Would be nice to have it be consistent across G1 and indeed across all collectors, if not already the case. Also makes for more consistent parsing of logs. >> >> thanks! >> -- ramki > From tao.mao at oracle.com Thu Jan 31 00:37:13 2013 From: tao.mao at oracle.com (Tao Mao) Date: Wed, 30 Jan 2013 16:37:13 -0800 Subject: Request for review: 8007053: Refactor SizePolicy code for consistency across collectors Message-ID: <5109BCB9.7040706@oracle.com> 8007053: Refactor SizePolicy code for consistency across collectors https://jbs.oracle.com/bugs/browse/JDK-8007053 webrev: http://cr.openjdk.java.net/~tamao/8007053/webrev.00/ changeset: 1. rename a bunch of functions across collectors related to compute and resize young/tenured gens unified function names: (1) compute_*: compute_survivor_space_size_and_threshold() compute_generations_free_space() [compute_eden_space_size() + compute_tenured_generation_free_space()] (2) resize_*_gen: resize_young_gen() resize_tenured_gen() 2. split compute_generations_free_space() into two functions: compute_eden_space_size() + compute_tenured_generation_free_space() each of which (if needed) can be reused without executing an overhead of the other. *3. in src/share/vm/memory/collectorPolicy.cpp a minor bug in initializing an AdaptiveSizePolicy instance is caught. 
MaxGCMinorPauseMillis -> MaxGCPauseMillis testing: refworkload test cases: jetstream, scimark, specjbb2000, specjbb2005, specjvm98 GC options: -XX:+UseParallelGC -Xmx512m no regression found from the performance results; PrintGCStats and CompareGCStats show no abnormal variations. From bengt.rutisson at oracle.com Thu Jan 31 07:35:52 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 31 Jan 2013 08:35:52 +0100 Subject: RFR(XS): 8001384: G1: assert(!is_null(v)) failed: narrow oop value can never be zero In-Reply-To: <510995F3.1020500@oracle.com> References: <5109722F.9080309@oracle.com> <51098E11.70204@oracle.com> <510995F3.1020500@oracle.com> Message-ID: <510A1ED8.3000408@oracle.com> Hi John, On 1/30/13 10:51 PM, John Cuthbertson wrote: > Hi Bengt, > > Thanks for looking ove the change > > On 1/30/2013 1:18 PM, Bengt Rutisson wrote: >> >> Hi John, >> >> Nice detective work to find this issue! Based on your very good >> explanation I think this change looks good! >> >> One small detail. I think the comment in JavaThread::exit() is a >> little confusing. I think you are right in your explanation (and code >> change) that the deferred card marks need to be flushed for all >> collectors. But the comment seems to suggest that it only affects G1: >> >> 1899 // Flush any deferred card marks. Flushing may add cards to this >> 1900 // thread's G1 dirty card queue so we have to do this before >> 1901 // flushing the G1 barrier queues. >> >> I would prefer to move this comment in to the G1 specific section >> where we flush the barrier queues: >> >> 1906 // We must flush G1-related buffers before removing a thread from >> 1907 // the list of active threads. >> 1908 if (UseG1GC) { >> 1909 flush_barrier_queues(); >> 1910 } >> >> The comment could say something like "For G1, flushing the deferred >> card marks may add cards to the thread's dirty card queue. So we need >> to flush the barrier queues after we have flushed the deferred card >> marks.". > > Good suggestions. Done. I would say the latter part a bit differently > since we need to flush G1 barrier queues anyway but now we have an > ordering imposed upon us: > > // We must flush the G1-related buffers before removing a thread > // from the list of active threads. We must do this after any deferred > // card marks have been flushed (above) so that any entries that are > // added to the thread's dirty card queue as a result are not lost. > if (UseG1GC) { > .. > } Yes, much better! :) >> Also, should we update the title of the bug? Should it have the "G1:" >> prefix since it is not a G1 specific fix? > > I'm not sure and don't really have a preference. My only concern is > that the original problem was reported against G1 (and I don't think > I've seen any issue with the other collectors that could be traced to > this). Keeping the G1 might make it easier for SQE and others to find > this CR when triaging other G1 test failures. OK. I don't have strong opinions about it. Let's leave the title as it is. 
Bengt > > Thanks, > > JohnC > From bengt.rutisson at oracle.com Thu Jan 31 08:00:11 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 31 Jan 2013 09:00:11 +0100 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <51095DA9.6050600@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> <51091E44.10405@oracle.com> <51095DA9.6050600@oracle.com> Message-ID: <510A248B.3030501@oracle.com> Jon, On 1/30/13 6:51 PM, Jon Masamitsu wrote: > > > On 1/30/2013 5:21 AM, Bengt Rutisson wrote: >> >> Hi Charlie, >> >> On 1/17/13 9:52 PM, Charlie Hunt wrote: >>> John / Bengt: >>> >>> I think I can offer a bit of info on Bengt's earlier question about >>> ParallelProcRefEnabled being disabled by default. >>> >>> IIRC, there was one workload that showed a slight perf regression >>> with +ParallelProcRefEnabled. That workload that showed a >>> regression may not be as relevant as it was back when the evaluation >>> / decision was made to disable it by default? >> >> Thanks for providing some history for these flags! >> >>> You both have probably thought about this already? My reaction is >>> ... I think reasonable defaults would be to enable >>> +ParallelProcRefEnabled for Parallel[Old], CMS and G1 when >>> ParallelGCThreads is greater than 1, and disable >>> -ParallelProcRefEnabled with -XX:+UseSerialGC. >> >> This sounds like a good enhancement. John, if you agree, could you >> file a CR for it? Or would you like me to file it? > > I think the regression was with benchmarks that did not have much > Reference processing > but still had to pay the build up/tear down cost for the parallel > work. Andrew did the work and > I think there was lots of performance data that went into turning it > off by default. Thanks, that is good to know. Do you remember if the benchmark had regressions for all GCs or just some of them? I think Charlies suggestion was to make parallel reference processing true by default for the GCs where we think it would be beneficial. Not necessarily for all of them. Do you know if it is possible to find the original performance data some where? Thanks, Bengt > > Jon > >> >> Bengt >> >>> >>> hths, >>> >>> charlie ... >>> >>> On Jan 17, 2013, at 3:02 PM, John Cuthbertson wrote: >>> >>>> Hi Bengt, >>>> >>>> There's a new webrev at: >>>> http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ >>>> >>>> It looks larger than the previous webrev but the most of the change >>>> was >>>> tweaking comments. The actual code changes are smaller. >>>> >>>> Testing was the same as before. >>>> >>>> On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >>>>> I see. I didn't think about the difference betweeen ParallelGCThreads >>>>> and ParallelRefProcEnabled. BTW, not part of this change, but why do >>>>> we have ParallelRefProcEnabled? And why is it false by default? >>>>> Wouldn't it make more sense to have it just be dependent on >>>>> ParallelGCThreads? >>>> I don't know and the answer is probably lost in the dark depths of >>>> time >>>> - I can only speculate. For G1 we have a CR to turn >>>> ParallelRefProcEnabled on if the number of GC threads > 1. I'm not >>>> sure >>>> about the other collectors. >>>> >>>>>> Setting it once in weakRefsWork() will not be sufficient. 
We will >>>>>> run >>>>>> into an assertion failure in >>>>>> ParallelTaskTerminator::offer_termination(). >>>>>> >>>>>> During the reference processing, the do_void() method of the >>>>>> complete_gc oop closure (in our case the complete gc oop closure is >>>>>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>>>>> times (in process_phase1, sometimes process_phase2, process_phase3, >>>>>> and process_phaseJNI) >>>>>> >>>>>> Setting the phase sets the number of active tasks (or threads) that >>>>>> the termination protocol in do_marking_step() will wait for. When an >>>>>> invocation of do_marking_step() offers termination, the number of >>>>>> tasks/threads in the terminator instance is decremented. So Setting >>>>>> the phase once will let the first execution of do_marking_step (with >>>>>> termination) from process_phase1() succeed, but subsequent calls to >>>>>> do_marking_step() will result in the assertion failure. >>>>>> >>>>>> We also can't unconditionally set it in the do_void() method or even >>>>>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>>>>> instances of this closure are created by each of the worker threads >>>>>> in the MT-case. >>>>>> >>>>>> Note when processing is multi-threaded the complete_gc instance used >>>>>> is the one passed into the ProcessTask's work method (passed into >>>>>> process_discovered_references() using the task executor instance) >>>>>> which may not necessarily be the same complete gc instance as the >>>>>> one >>>>>> passed directly into process_discovered_references(). >>>>> Thanks for this detailed explanation. It really helped! >>>>> >>>>> I understand the issue now, but I still think it is very confusing >>>>> that _cm->set_phase() is called from >>>>> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >>>>> G1CMParDrainMarkingStackClosure::do_void() in the single threaded >>>>> case. >>>>> >>>>>> It might be possible to record whether processing is MT in the >>>>>> G1CMRefProcTaskExecutor class and always pass the executor instance >>>>>> into process_discovered_references. We could then set processing to >>>>>> MT so that the execute() methods in the executor instance are >>>>>> invoked >>>>>> but call the Proxy class' work method directly. Then we could >>>>>> override the set_single_threaded() routine (called just before >>>>>> process_phaseJNI) to set the phase. >>>>> I think this would be a better solution, but if I understand it >>>>> correctly it would mean that we would have to change all the >>>>> collectors to always pass a TaskExecutor. All of them currently pass >>>>> NULL in the non-MT case. I think it would be simpler if they always >>>>> passed a TaskExecutor but it is a pretty big change. >>>> I wasn't meaning to do that for the other collectors just G1's >>>> concurrent mark reference processor i.e. fool the ref processor into >>>> think it's MT so that the parallel task executor is used but only use >>>> the work gang if reference processing was _really_ MT. >>>> >>>> I decided not to do this as there is an easier way. For the non-MT >>>> case >>>> we do not need to enter the termination protocol in >>>> CMTask::do_marking_step(). When there's only one thread we don't >>>> need to >>>> use the ParallelTaskTerminator to wait for other threads. And we >>>> certainly don't need stealing. Hence the solution is to only do the >>>> termination and stealing if the closure is instantiated for MT >>>> reference >>>> processing. That removes the set_phase call(). 
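The approach described just above, doing the stealing and termination only when the closure is instantiated for MT reference processing, comes down to something like this (closure and parameter names are assumptions for illustration, not the posted webrev):

    // Illustration only; names are assumed. The serial instantiation never enters
    // the termination protocol, so no set_phase() call is needed beforehand.
    class G1CMDrainMarkingStackClosure : public VoidClosure {
      CMTask* _task;
      bool    _is_serial;   // true when built for single-threaded ref processing
    public:
      G1CMDrainMarkingStackClosure(CMTask* task, bool is_serial)
        : _task(task), _is_serial(is_serial) { }

      void do_void() {
        _task->do_marking_step(1000000000.0 /* effectively unbounded time target */,
                               !_is_serial  /* do_stealing    */,
                               !_is_serial  /* do_termination */);
      }
    };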
>>>> >>>>> Another possibility is to introduce some kind of prepare method to >>>>> the >>>>> VoidClosure (or maybe in a specialized subclass for ref processing). >>>>> Then we could do something like: >>>>> >>>>> complete_gc->prologue(); >>>>> if (mt_processing) { >>>>> RefProcPhase2Task phase2(*this, refs_lists, >>>>> !discovery_is_atomic() >>>>> /*marks_oops_alive*/); >>>>> task_executor->execute(phase2); >>>>> } else { >>>>> for (uint i = 0; i < _max_num_q; i++) { >>>>> process_phase2(refs_lists[i], is_alive, keep_alive, >>>>> complete_gc); >>>>> } >>>>> } >>>>> >>>>> G1CMParDrainMarkingStackClosure::prologue() could do the call to >>>>> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >>>>> have to do it. >>>> The above is a reasonable extension to the reference processing >>>> code. I >>>> no longer need this feature for this change but we should submit a CR >>>> for it. I'll do that. >>>> >>>>> BTW, not really part of your change, but above code is duplicated >>>>> three times in ReferenceProcessor::process_discovered_reflist(). >>>>> Would >>>>> be nice to factor this out to a method. >>>> Completely agree. Again I'll submit a CR for it. >>>> >>>> Thanks, >>>> >>>> JohnC >> From filipp.zhinkin at oracle.com Thu Jan 31 13:25:25 2013 From: filipp.zhinkin at oracle.com (Filipp Zhinkin) Date: Thu, 31 Jan 2013 17:25:25 +0400 Subject: Request for review: 8006628: NEED_TEST for JDK-8002870 In-Reply-To: <51095543.1050505@oracle.com> References: <50FD2323.9050702@oracle.com> <5106B2EE.9060509@oracle.com> <5106CA6E.6030405@oracle.com> <51079337.6040108@oracle.com> <51090238.7000004@oracle.com> <51095543.1050505@oracle.com> Message-ID: <510A70C5.2000404@oracle.com> John, thank you. FIlipp. On 01/30/2013 09:15 PM, John Cuthbertson wrote: > Hi Filipp, > > The test looks good. Thank you. > > JohnC > > On 1/30/2013 3:21 AM, Filipp Zhinkin wrote: >> Here is an updated webrev: >> http://cr.openjdk.java.net/~kshefov/8000311/webrev.01/ >> >> I've added ExplicitGCInvokesConcurrent option and replaced >> heap-filling by frequent System.gc() calls. >> >> Thanks, >> Filipp. >> >> On 01/29/2013 01:15 PM, Filipp Zhinkin wrote: >>> Hi John, >>> >>> thanks for advice! I'll reimplement the test using System.gc() calls >>> and -XX:+ExplicitGCInvokesConcurrent option. >>> And yes, you're right, PLAB resizes only at the end of incremental >>> GC. Thats why I've tried to provoke GC by filling up the heap >>> instead of calling System.gc() (I've missed >>> ExplicitGCInvokesConcurrent flag before). >>> >>> Thanks, >>> Filipp. >>> >>> On 01/28/2013 10:58 PM, John Cuthbertson wrote: >>>> Hi Filipp, >>>> >>>> In addition to what Jon suggests (i.e. using System.gc() to >>>> guarantee a GC), please add -XX:+ExplicitGCInvokesConcurrent. The >>>> addition of this flag will cause G1 to perform an incremental GC >>>> (instead of the full GC that a System.gc() call provokes). IIRC the >>>> PLAB resizing code is only exercised at the end of an incremental GC. >>>> >>>> Thanks, >>>> >>>> JohnC >>>> >>>> On 1/28/2013 9:18 AM, Jon Masamitsu wrote: >>>>> Can this test be implemented using a call to >>>>> System.gc() instead of trying to fill up the heap >>>>> to provoke a GC? >>>>> >>>>> Jon >>>>> >>>>> On 01/21/13 03:14, Filipp Zhinkin wrote: >>>>>> Hi all, >>>>>> >>>>>> Would someone review the following regression test please? >>>>>> >>>>>> Test verifies that VM will not crash with G1 GC and >>>>>> ParallelGCThreads == 0. >>>>>> >>>>>> To ensure that it is true test allocates array until OOME. 
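(A rough sketch of that allocation loop, just to make the test idea concrete - the class name and constants below are invented for illustration and are not the actual webrev; the real test is the one linked above:)

    // Run with something like:
    //   java -XX:+UseG1GC -XX:ParallelGCThreads=0 -XX:+ResizePLAB -Xmx32m TestG1ZeroGCThreadsSketch
    // Passing simply means the VM reaches the OOME without crashing during PLAB resizing.
    import java.util.ArrayList;
    import java.util.List;

    public class TestG1ZeroGCThreadsSketch {
        public static void main(String[] args) {
            List<byte[]> garbage = new ArrayList<byte[]>();
            try {
                while (true) {
                    garbage.add(new byte[64 * 1024]);   // keep allocating until the heap fills up
                }
            } catch (OutOfMemoryError expected) {
                garbage.clear();                        // surviving to this point is the pass criterion
            }
        }
    }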
>>>>>> Max heap size is limited by 32M for this test to ensure that GC >>>>>> will occur. >>>>>> Since crash could occur only during PLAB resizing after GC, >>>>>> ResizePLAB option is explicitly turned on. >>>>>> >>>>>> http://cr.openjdk.java.net/~kshefov/8000311/webrev.00/ >>>>>> >>>>>> Thanks, >>>>>> Filipp. >>>>>> >>>> >>> >> > From michal at frajt.eu Thu Jan 31 14:12:56 2013 From: michal at frajt.eu (Michal Frajt) Date: Thu, 31 Jan 2013 15:12:56 +0100 Subject: CMS vs G1 - Scan RS very long Message-ID: Hi all, After the iCMS got officially deprecated we decided to compare the G1 collector with our best tuned (i)CMS setup. Unfortunately we are not able to make the G1 young collection running any closer to the ParNew. Actually we wanted to compare the G1 concurrent marking STW pauses with the CMS initial-mark and remark STW pauses but already incredibly long running G1 young collections are unacceptable for us. We were able to recognize that the very long G1 young collections are caused by the scanning remembered sets. There is not much documentation about G1 internals but we were able to understand that the size of the remembered sets is related to the amount of mutating references from old regions (cards) to young regions. Unfortunately all our applications mutate permanently thousands references from old objects to young objects. We are testing with the latest OpenJDK7u extended by the 7189971 patch and CMSTriggerInterval implementation. The attached GC log files represent two very equal applications processing very similar data sets, one running the G1, second running the CMS collector. The OpenJDK7u has an extra output of _pending_cards (when G1TraceConcRefinement activated) which somehow relates to the remembered sets size. Young Comparison (both 128m, survivor ratio 5, max tenuring 15) CMS - invoked every ~20 sec, avg. stop 60ms G1 - invoked every ~16 sec, avg. stop 410ms !!! It there anything what could help us to reduce the Scan RS time or the G1 is simply not targeted for applications mutating heavily old region objects? CMS parameters -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:CMSWaitDuration=60000 -XX:+CMSScavengeBeforeRemark -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2 G1 parameters (mind MaxNewSize not specified) -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:G1MixedGCCountTarget=16 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 G1 log file GC young pause [GC pause (young) [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 23697, predicted base time: 32.88 ms, remaining time: 167.12 ms, target pause time: 200.00 ms] [Parallel Time: 389.8 ms, GC Workers: 8] >>>> [Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9] <<<< [Eden: 119.0M(119.0M)->0.0B(118.0M) Survivors: 9216.0K->10.0M Heap: 1801.6M(2048.0M)->1685.7M(2048.0M)] Regards, Michal -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cmsvsg1.tar.gz Type: application/x-gzip Size: 345199 bytes Desc: not available URL: From bengt.rutisson at oracle.com Thu Jan 31 14:22:23 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 31 Jan 2013 15:22:23 +0100 Subject: Request for review: 8007053: Refactor SizePolicy code for consistency across collectors In-Reply-To: <5109BCB9.7040706@oracle.com> References: <5109BCB9.7040706@oracle.com> Message-ID: <510A7E1F.8030809@oracle.com> Hi Tao, I have only looked briefly at your webrev, but I have a request before I look more at it. Could you split it up into a few different changes? I find it difficult get a grasp of the changes when there are so many changed files and they contain changes for different reasons. You probably know better than me how to split this up. But I think I would like to have at least these three separate changes: * Renaming of methods * Splitting up compute_generations_free_space() * Changes to use MaxGCPauseMillis instead of MaxGCMinorPauseMillis If this division does not make sense to you I have probably drawn the wrong conclusions. But we need to split the review up in some way to make it easier to review. I'm fine with other ways of splitting it up if you find more natural ways of splitting it up. With splitting up the changes I mean filing separate bugs and submitting separate webrevs, so that they can be pushed as separate changesets. Also, I don't think you should include the white space changes to these two files: psGCAdaptivePolicyCounters.hpp adaptiveSizePolicy.hpp Thanks, Bengt On 1/31/13 1:37 AM, Tao Mao wrote: > 8007053: Refactor SizePolicy code for consistency across collectors > https://jbs.oracle.com/bugs/browse/JDK-8007053 > > webrev: > http://cr.openjdk.java.net/~tamao/8007053/webrev.00/ > > changeset: > 1. rename a bunch of functions across collectors related to compute > and resize young/tenured gens > unified function names: > (1) compute_*: > compute_survivor_space_size_and_threshold() > compute_generations_free_space() [compute_eden_space_size() + > compute_tenured_generation_free_space()] > (2) resize_*_gen: > resize_young_gen() > resize_tenured_gen() > > 2. split compute_generations_free_space() into two functions: > compute_eden_space_size() + compute_tenured_generation_free_space() > each of which (if needed) can be reused without executing an overhead > of the other. > > *3. in src/share/vm/memory/collectorPolicy.cpp > a minor bug in initializing an AdaptiveSizePolicy instance is caught. > MaxGCMinorPauseMillis -> MaxGCPauseMillis > > testing: > refworkload test cases: jetstream, scimark, specjbb2000, specjbb2005, > specjvm98 > GC options: -XX:+UseParallelGC -Xmx512m > no regression found from the performance results; > PrintGCStats and CompareGCStats show no abnormal variations. > From erik.helin at oracle.com Thu Jan 31 14:37:02 2013 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 31 Jan 2013 15:37:02 +0100 Subject: RFR (S): 8004172: Update jstat counter names to reflect metaspace changes In-Reply-To: <51093C4D.70103@oracle.com> References: <51011221.8050102@oracle.com> <51018414.1000103@oracle.com> <510695D9.5030603@oracle.com> <51093C4D.70103@oracle.com> Message-ID: <510A818E.3060003@oracle.com> Jon, On 01/30/2013 04:29 PM, Jon Masamitsu wrote: > Sorry to make extra work for you but I've been convinced > that SUN_GC is the right name to use. 
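(For reference, once the counters end up under the sun.gc namespace they can be inspected from the command line; the exact counter names are whatever the webrev defines, but something along these lines should show them:

    jcmd <pid> PerfCounter.print | grep sun.gc.metaspace
)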
Even though the > metadata may seem (in my head) more like a runtime quantity, > it (class metadata) has been associated with GC historically > so should continue to be associated with GC. So the > SUN_GC is appropriate. I've uploaded new webrevs: - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.03/ - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.03/ On 01/30/2013 04:29 PM, Jon Masamitsu wrote: > Again, my apologies for flip-flopping > on this. No problems! Erik > Jon > > On 1/28/2013 7:14 AM, Erik Helin wrote: >> Jon, >> >> thanks for your review! >> >> On 01/24/2013 07:57 PM, Jon Masamitsu wrote: >>> I looked at the hotspot changes and they look correct. But I'm >>> not sure that "sun.gc" should be in the name of the counter. Maybe >>> use SUN_RT instead of SUN_GC. >> >> I've updated the code to use the SUN_RT namespace instead of the >> SUN_GC namespace. This also required changes to the JDK code. >> >> I've also added better error handling if a Java Out Of Memory >> exceptions occur is raised in PerfDataManager::create_variable. >> >> Finally, I've moved some common code to the function create_ms_variable. >> >> Webrev: >> - hotspot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.01/ >> - jdk: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.01/ >> >> What do you think? >> >> Thanks, >> Erik >> >>> Jon >>> >>> On 1/24/2013 2:51 AM, Erik Helin wrote: >>>> Hi all, >>>> >>>> here are the HotSpot changes for fixing JDK-8004172. This change uses >>>> the new namespace "sun.gc.metaspace" for the metaspace counters and >>>> also removes some code from metaspaceCounters.hpp/cpp that is not >>>> needed any longer. >>>> >>>> Note that the tests will continue to fail until the JDK part of the >>>> change finds it way into the hotspot-gc forest. >>>> >>>> The JDK part of the change is also out for review on >>>> serviceability-dev at openjdk.java.net. >>>> >>>> Webrev: >>>> HotSpot: http://cr.openjdk.java.net/~ehelin/8004172/hotspot/webrev.00/ >>>> JDK: http://cr.openjdk.java.net/~ehelin/8004172/jdk/webrev.00/ >>>> >>>> Bug: >>>> http://bugs.sun.com/view_bug.do?bug_id=8004172 >>>> >>>> Testing: >>>> Run the jstat jtreg tests locally on my machine on a repository where >>>> I've applied both the JDK changes and the HotSpot changes. >>>> >>>> Thanks, >>>> Erik >> From kirk at kodewerk.com Thu Jan 31 14:47:58 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Thu, 31 Jan 2013 15:47:58 +0100 Subject: CMS vs G1 - Scan RS very long In-Reply-To: References: Message-ID: Hi, I'd like to add to Michal's comment to say that i've to add that I've seen very similar results in recent tuning efforts for low latency. In this case we didn't have a lot of mutation in old gen but I wasn't able to get young gen pauses times down to any where near what I could get to with the CMS collector. Unfortunately I've not been able to characterize the problem as well as you have as we had other fish to fry and I only had a limited amount of time to look at GC. That said I still will be able to run more experiments in the next two weeks. What I did notice is that young gen started reducing it's size but where as I calculated that a 15m eden was optimal, it stopped down sizing @ 40m. I'd be interested if anyone has any suggestions on how to get the young gen shrink if it's not shrinking enough on it's own. I'm hesitant to fix the size as there are times when heap should grow but under normal load I would hope that it would return to the smaller size. 
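One knob that might be worth a try here (sketch only - on a 7u build the lower bound on the young gen is the experimental G1DefaultMinNewGenPercent flag, so the exact spelling depends on the build):

    java -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:G1DefaultMinNewGenPercent=5 ...

Whether that actually lets the ergonomics shrink eden that far depends on the overall heap size and the pause time goal.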
Over all I'd have to say that this is an application where I definitively would have recommended iCMS even though the hardware has 24 cores. It's very disappointing that iCMS has been depreciated even though there are many using it. I did a quick scan of my GC log DB and I'm seeing about 15% of the logs showing an icms_dc tag. Regards, Kirk On 2013-01-31, at 3:12 PM, "Michal Frajt" wrote: > Hi all, > > After the iCMS got officially deprecated we decided to compare the G1 collector with our best tuned (i)CMS setup. Unfortunately we are not able to make the G1 young collection running any closer to the ParNew. Actually we wanted to compare the G1 concurrent marking STW pauses with the CMS initial-mark and remark STW pauses but already incredibly long running G1 young collections are unacceptable for us. > > We were able to recognize that the very long G1 young collections are caused by the scanning remembered sets. There is not much documentation about G1 internals but we were able to understand that the size of the remembered sets is related to the amount of mutating references from old regions (cards) to young regions. Unfortunately all our applications mutate permanently thousands references from old objects to young objects. > > We are testing with the latest OpenJDK7u extended by the 7189971 patch and CMSTriggerInterval implementation. The attached GC log files represent two very equal applications processing very similar data sets, one running the G1, second running the CMS collector. The OpenJDK7u has an extra output of _pending_cards (when G1TraceConcRefinement activated) which somehow relates to the remembered sets size. > > Young Comparison (both 128m, survivor ratio 5, max tenuring 15) > CMS - invoked every ~20 sec, avg. stop 60ms > G1 - invoked every ~16 sec, avg. stop 410ms !!! > > It there anything what could help us to reduce the Scan RS time or the G1 is simply not targeted for applications mutating heavily old region objects? 
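(Just to make the access pattern concrete for anyone trying to reproduce this: the kind of old-to-young churn described above can be approximated with a toy loop like the one below. Class and field names are invented for illustration - this is not the real application, only the shape of the mutation:)

    import java.util.Random;

    // Every iteration re-points fields of long-lived (old generation) objects at
    // brand-new (young) objects. This is exactly the old->young pointer traffic
    // that lands in the G1 remembered sets / the CMS card table.
    public class OldToYoungChurn {
        static final class Holder { Object ref; }       // instances become old after a few collections

        public static void main(String[] args) {
            Holder[] oldObjects = new Holder[1000000];   // eventually promoted to the old gen
            for (int i = 0; i < oldObjects.length; i++) {
                oldObjects[i] = new Holder();
            }
            Random rnd = new Random(42);
            while (true) {
                for (int i = 0; i < 10000; i++) {        // thousands of old->young mutations per pass
                    oldObjects[rnd.nextInt(oldObjects.length)].ref = new byte[128];
                }
            }
        }
    }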
> > CMS parameters > -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:CMSWaitDuration=60000 -XX:+CMSScavengeBeforeRemark -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2 > > G1 parameters (mind MaxNewSize not specified) > -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:G1MixedGCCountTarget=16 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 > > G1 log file GC young pause > [GC pause (young) [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 23697, predicted base time: 32.88 ms, remaining > time: 167.12 ms, target pause time: 200.00 ms] > [Parallel Time: 389.8 ms, GC Workers: 8] >>>>> > [Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9] > <<<< > [Eden: 119.0M(119.0M)->0.0B(118.0M) Survivors: 9216.0K->10.0M Heap: 1801.6M(2048.0M)->1685.7M(2048.0M)] > > Regards, > Michal > > > From bengt.rutisson at oracle.com Thu Jan 31 13:41:55 2013 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 31 Jan 2013 14:41:55 +0100 Subject: Request for review: 8007053: Refactor SizePolicy code for consistency across collectors In-Reply-To: <5109BCB9.7040706@oracle.com> References: <5109BCB9.7040706@oracle.com> Message-ID: <510A74A3.1060309@oracle.com> Hi Tao, I have only looked briefly at your webrev, but I have a request before I look more at it. Could you split it up into a few different changes? I find it difficult get a grasp of the changes when there are so many changed files and they contain changes for different reasons. You probably know better than me how to split this up. But I think I would like to have at least these three separate changes: * Renaming of methods * Splitting up compute_generations_free_space() * Changes to use MaxGCPauseMillis instead of MaxGCMinorPauseMillis If this division does not make sense to you I have probably drawn the wrong conclusions. But we need to split the review up in some way to make it easier to review. I'm fine with other ways of splitting it up if you find more natural ways of splitting it up. With splitting up the changes I mean filing separate bugs and submitting separate webrevs, so that they can be pushed as separate changesets. Also, I don't think you should include the white space changes to these two files: psGCAdaptivePolicyCounters.hpp adaptiveSizePolicy.hpp Thanks, Bengt On 1/31/13 1:37 AM, Tao Mao wrote: > 8007053: Refactor SizePolicy code for consistency across collectors > https://jbs.oracle.com/bugs/browse/JDK-8007053 > > webrev: > http://cr.openjdk.java.net/~tamao/8007053/webrev.00/ > > changeset: > 1. rename a bunch of functions across collectors related to compute > and resize young/tenured gens > unified function names: > (1) compute_*: > compute_survivor_space_size_and_threshold() > compute_generations_free_space() [compute_eden_space_size() + > compute_tenured_generation_free_space()] > (2) resize_*_gen: > resize_young_gen() > resize_tenured_gen() > > 2. split compute_generations_free_space() into two functions: > compute_eden_space_size() + compute_tenured_generation_free_space() > each of which (if needed) can be reused without executing an overhead > of the other. > > *3. 
in src/share/vm/memory/collectorPolicy.cpp > a minor bug in initializing an AdaptiveSizePolicy instance is caught. > MaxGCMinorPauseMillis -> MaxGCPauseMillis > > testing: > refworkload test cases: jetstream, scimark, specjbb2000, specjbb2005, > specjvm98 > GC options: -XX:+UseParallelGC -Xmx512m > no regression found from the performance results; > PrintGCStats and CompareGCStats show no abnormal variations. > From michal at frajt.eu Thu Jan 31 15:47:31 2013 From: michal at frajt.eu (Michal Frajt) Date: Thu, 31 Jan 2013 16:47:31 +0100 Subject: CMS vs G1 - Scan RS very long In-Reply-To: References: =?iso-8859-1?q?=3CMHHU5K=247B881215177709189455C25810630205=40frajt?= =?iso-8859-1?q?=2Eeu=3E_=3CC91FF7F8=2D86FC=2D4202=2DB8EC=2DED56E7F3FF?= =?iso-8859-1?q?FF=40kodewerk=2Ecom=3E?= Message-ID: Hi Kirk, ? We found the default calculated minimum eden size too large as well. You can shrink the eden to the "optimal" size by specifying the new size parameter (-XX:NewSize=16m). If not specified the minimum eden size is adaptively calculated using the G1DefaultMinNewGenPercent parameter from the overall heap size (mind that some parameters got recently renamed again). If you specify the NewRatio, you won't be able to control min and max eden sizes at all. The CMS vs G1 test was done with exactly same eden and survivor setting in order to keep the young collections invoked with the same frequency (CMS/ParNew ~20sec, G1 ~16sec). The G1 was delivering same results for eden size around 12MB but it was invoked 10 times more frequently than ParNew with 128MB. A low eden size results into low survivor space, with not much aging, all gets promoted to the old regions, occupied old regions invokes concurrent marking frequently followed by very long mixed modes - it is simply not very sustainable setup. The CMS/ParNew collection is for us currently minimum 5 times faster than the G1 young collection. iCMS - we are changing into the CMS with the CMSTriggerInterval to keep the old gen collected in regular intervals without waiting for the occupation level being reached. It might be even better than the iCMS as the marking is done without incremental interruptions which can reduce amount of the remarking as there is less time passed between the mark and remark phases (I might be wrong). Regards, Michal ? Od: "Kirk Pepperdine" kirk at kodewerk.com Komu: "Michal Frajt" michal at frajt.eu Kopie: hotspot-gc-dev at openjdk.java.net Datum: Thu, 31 Jan 2013 15:47:58 +0100 P?edmet: Re: CMS vs G1 - Scan RS very long > Hi, > > I'd like to add to Michal's comment to say that i've to add that I've seen very similar results in recent tuning efforts for low latency. In this case we didn't have a lot of mutation in old gen but I wasn't able to get young gen pauses times down to any where near what I could get to with the CMS collector. Unfortunately I've not been able to characterize the problem as well as you have as we had other fish to fry and I only had a limited amount of time to look at GC. That said I still will be able to run more experiments in the next two weeks. > > What I did notice is that young gen started reducing it's size but where as I calculated that a 15m eden was optimal, it stopped down sizing @ 40m. I'd be interested if anyone has any suggestions on how to get the young gen shrink if it's not shrinking enough on it's own. I'm hesitant to fix the size as there are times when heap should grow but under normal load I would hope that it would return to the smaller size. 
> > Over all I'd have to say that this is an application where I definitively would have recommended iCMS even though the hardware has 24 cores. It's very disappointing that iCMS has been depreciated even though there are many using it. I did a quick scan of my GC log DB and I'm seeing about 15% of the logs showing an icms_dc tag. > > Regards, > Kirk > On 2013-01-31, at 3:12 PM, "Michal Frajt" wrote: > > > Hi all, > > > > After the iCMS got officially deprecated we decided to compare the G1 collector with our best tuned (i)CMS setup. Unfortunately we are not able to make the G1 young collection running any closer to the ParNew. Actually we wanted to compare the G1 concurrent marking STW pauses with the CMS initial-mark and remark STW pauses but already incredibly long running G1 young collections are unacceptable for us. > > > > We were able to recognize that the very long G1 young collections are caused by the scanning remembered sets. There is not much documentation about G1 internals but we were able to understand that the size of the remembered sets is related to the amount of mutating references from old regions (cards) to young regions. Unfortunately all our applications mutate permanently thousands references from old objects to young objects. > > > > We are testing with the latest OpenJDK7u extended by the 7189971 patch and CMSTriggerInterval implementation. The attached GC log files represent two very equal applications processing very similar data sets, one running the G1, second running the CMS collector. The OpenJDK7u has an extra output of _pending_cards (when G1TraceConcRefinement activated) which somehow relates to the remembered sets size. > > > > Young Comparison (both 128m, survivor ratio 5, max tenuring 15) > > CMS - invoked every ~20 sec, avg. stop 60ms > > G1 - invoked every ~16 sec, avg. stop 410ms !!! > > > > It there anything what could help us to reduce the Scan RS time or the G1 is simply not targeted for applications mutating heavily old region objects? 
> > > > CMS parameters > > -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:CMSWaitDuration=60000 -XX:+CMSScavengeBeforeRemark -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2 > > > > G1 parameters (mind MaxNewSize not specified) > > -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:G1MixedGCCountTarget=16 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 > > > > G1 log file GC young pause > > [GC pause (young) [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 23697, predicted base time: 32.88 ms, remaining > > time: 167.12 ms, target pause time: 200.00 ms] > > [Parallel Time: 389.8 ms, GC Workers: 8] > >>>>> > > [Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9] > > <<<< > > [Eden: 119.0M(119.0M)->0.0B(118.0M) Survivors: 9216.0K->10.0M Heap: 1801.6M(2048.0M)->1685.7M(2048.0M)] > > > > Regards, > > Michal > > > > > > > From chunt at salesforce.com Thu Jan 31 15:52:45 2013 From: chunt at salesforce.com (Charlie Hunt) Date: Thu, 31 Jan 2013 07:52:45 -0800 Subject: RFR(S): 8005032: G1: Cleanup serial reference processing closures in concurrent marking In-Reply-To: <510A248B.3030501@oracle.com> References: <50EC6C90.4060502@oracle.com> <50ED8E5C.4010109@oracle.com> <50EF4D88.2050906@oracle.com> <50F51F00.6040008@oracle.com> <50F858DA.8050508@oracle.com> <8DABB858-E8CD-4F94-B6F8-08F5374CC138@salesforce.com> <51091E44.10405@oracle.com> <51095DA9.6050600@oracle.com> <510A248B.3030501@oracle.com> Message-ID: <15E183B9-7A8E-416A-AD7A-641CBAE065FE@salesforce.com> I'm adding Dave Keenan to the cc list. He may remember the specific testing that was involved and configs experienced regressions. Charlie Sent from my iPhone On Jan 31, 2013, at 12:03 AM, "Bengt Rutisson" wrote: > > Jon, > > On 1/30/13 6:51 PM, Jon Masamitsu wrote: >> >> >> On 1/30/2013 5:21 AM, Bengt Rutisson wrote: >>> >>> Hi Charlie, >>> >>> On 1/17/13 9:52 PM, Charlie Hunt wrote: >>>> John / Bengt: >>>> >>>> I think I can offer a bit of info on Bengt's earlier question about >>>> ParallelProcRefEnabled being disabled by default. >>>> >>>> IIRC, there was one workload that showed a slight perf regression >>>> with +ParallelProcRefEnabled. That workload that showed a >>>> regression may not be as relevant as it was back when the evaluation >>>> / decision was made to disable it by default? >>> >>> Thanks for providing some history for these flags! >>> >>>> You both have probably thought about this already? My reaction is >>>> ... I think reasonable defaults would be to enable >>>> +ParallelProcRefEnabled for Parallel[Old], CMS and G1 when >>>> ParallelGCThreads is greater than 1, and disable >>>> -ParallelProcRefEnabled with -XX:+UseSerialGC. >>> >>> This sounds like a good enhancement. John, if you agree, could you >>> file a CR for it? Or would you like me to file it? >> >> I think the regression was with benchmarks that did not have much >> Reference processing >> but still had to pay the build up/tear down cost for the parallel >> work. Andrew did the work and >> I think there was lots of performance data that went into turning it >> off by default. > > Thanks, that is good to know. Do you remember if the benchmark had > regressions for all GCs or just some of them? 
I think Charlies > suggestion was to make parallel reference processing true by default for > the GCs where we think it would be beneficial. Not necessarily for all > of them. > > Do you know if it is possible to find the original performance data some > where? > > Thanks, > Bengt > >> >> Jon >> >>> >>> Bengt >>> >>>> >>>> hths, >>>> >>>> charlie ... >>>> >>>> On Jan 17, 2013, at 3:02 PM, John Cuthbertson wrote: >>>> >>>>> Hi Bengt, >>>>> >>>>> There's a new webrev at: >>>>> http://cr.openjdk.java.net/~johnc/8005032/webrev.1/ >>>>> >>>>> It looks larger than the previous webrev but the most of the change >>>>> was >>>>> tweaking comments. The actual code changes are smaller. >>>>> >>>>> Testing was the same as before. >>>>> >>>>> On 1/15/2013 1:18 AM, Bengt Rutisson wrote: >>>>>> I see. I didn't think about the difference betweeen ParallelGCThreads >>>>>> and ParallelRefProcEnabled. BTW, not part of this change, but why do >>>>>> we have ParallelRefProcEnabled? And why is it false by default? >>>>>> Wouldn't it make more sense to have it just be dependent on >>>>>> ParallelGCThreads? >>>>> I don't know and the answer is probably lost in the dark depths of >>>>> time >>>>> - I can only speculate. For G1 we have a CR to turn >>>>> ParallelRefProcEnabled on if the number of GC threads > 1. I'm not >>>>> sure >>>>> about the other collectors. >>>>> >>>>>>> Setting it once in weakRefsWork() will not be sufficient. We will >>>>>>> run >>>>>>> into an assertion failure in >>>>>>> ParallelTaskTerminator::offer_termination(). >>>>>>> >>>>>>> During the reference processing, the do_void() method of the >>>>>>> complete_gc oop closure (in our case the complete gc oop closure is >>>>>>> an instance of G1CMParDrainMarkingStackClosure) is called multiple >>>>>>> times (in process_phase1, sometimes process_phase2, process_phase3, >>>>>>> and process_phaseJNI) >>>>>>> >>>>>>> Setting the phase sets the number of active tasks (or threads) that >>>>>>> the termination protocol in do_marking_step() will wait for. When an >>>>>>> invocation of do_marking_step() offers termination, the number of >>>>>>> tasks/threads in the terminator instance is decremented. So Setting >>>>>>> the phase once will let the first execution of do_marking_step (with >>>>>>> termination) from process_phase1() succeed, but subsequent calls to >>>>>>> do_marking_step() will result in the assertion failure. >>>>>>> >>>>>>> We also can't unconditionally set it in the do_void() method or even >>>>>>> the constructor of G1CMParDrainMarkingStackClosure. Separate >>>>>>> instances of this closure are created by each of the worker threads >>>>>>> in the MT-case. >>>>>>> >>>>>>> Note when processing is multi-threaded the complete_gc instance used >>>>>>> is the one passed into the ProcessTask's work method (passed into >>>>>>> process_discovered_references() using the task executor instance) >>>>>>> which may not necessarily be the same complete gc instance as the >>>>>>> one >>>>>>> passed directly into process_discovered_references(). >>>>>> Thanks for this detailed explanation. It really helped! >>>>>> >>>>>> I understand the issue now, but I still think it is very confusing >>>>>> that _cm->set_phase() is called from >>>>>> G1CMRefProcTaskExecutor::execute() in the multithreaded case and from >>>>>> G1CMParDrainMarkingStackClosure::do_void() in the single threaded >>>>>> case. 
>>>>>> >>>>>>> It might be possible to record whether processing is MT in the >>>>>>> G1CMRefProcTaskExecutor class and always pass the executor instance >>>>>>> into process_discovered_references. We could then set processing to >>>>>>> MT so that the execute() methods in the executor instance are >>>>>>> invoked >>>>>>> but call the Proxy class' work method directly. Then we could >>>>>>> override the set_single_threaded() routine (called just before >>>>>>> process_phaseJNI) to set the phase. >>>>>> I think this would be a better solution, but if I understand it >>>>>> correctly it would mean that we would have to change all the >>>>>> collectors to always pass a TaskExecutor. All of them currently pass >>>>>> NULL in the non-MT case. I think it would be simpler if they always >>>>>> passed a TaskExecutor but it is a pretty big change. >>>>> I wasn't meaning to do that for the other collectors just G1's >>>>> concurrent mark reference processor i.e. fool the ref processor into >>>>> think it's MT so that the parallel task executor is used but only use >>>>> the work gang if reference processing was _really_ MT. >>>>> >>>>> I decided not to do this as there is an easier way. For the non-MT >>>>> case >>>>> we do not need to enter the termination protocol in >>>>> CMTask::do_marking_step(). When there's only one thread we don't >>>>> need to >>>>> use the ParallelTaskTerminator to wait for other threads. And we >>>>> certainly don't need stealing. Hence the solution is to only do the >>>>> termination and stealing if the closure is instantiated for MT >>>>> reference >>>>> processing. That removes the set_phase call(). >>>>> >>>>>> Another possibility is to introduce some kind of prepare method to >>>>>> the >>>>>> VoidClosure (or maybe in a specialized subclass for ref processing). >>>>>> Then we could do something like: >>>>>> >>>>>> complete_gc->prologue(); >>>>>> if (mt_processing) { >>>>>> RefProcPhase2Task phase2(*this, refs_lists, >>>>>> !discovery_is_atomic() >>>>>> /*marks_oops_alive*/); >>>>>> task_executor->execute(phase2); >>>>>> } else { >>>>>> for (uint i = 0; i < _max_num_q; i++) { >>>>>> process_phase2(refs_lists[i], is_alive, keep_alive, >>>>>> complete_gc); >>>>>> } >>>>>> } >>>>>> >>>>>> G1CMParDrainMarkingStackClosure::prologue() could do the call to >>>>>> _cm->set_phase(). And G1CMRefProcTaskExecutor::execute() would not >>>>>> have to do it. >>>>> The above is a reasonable extension to the reference processing >>>>> code. I >>>>> no longer need this feature for this change but we should submit a CR >>>>> for it. I'll do that. >>>>> >>>>>> BTW, not really part of your change, but above code is duplicated >>>>>> three times in ReferenceProcessor::process_discovered_reflist(). >>>>>> Would >>>>>> be nice to factor this out to a method. >>>>> Completely agree. Again I'll submit a CR for it. >>>>> >>>>> Thanks, >>>>> >>>>> JohnC > From john.cuthbertson at oracle.com Thu Jan 31 17:56:49 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 31 Jan 2013 09:56:49 -0800 Subject: CMS vs G1 - Scan RS very long In-Reply-To: References: Message-ID: <510AB061.6010403@oracle.com> Hi Michal, I haven't looked at the logs yet but from your description it sounds like a large number of the RSets have been coarsened and/or the fine grain tables have gotten very dense. The RSet for a heap region tracks incoming references to objects in that region. There are three levels on granularity: sparse, fine, and coarse. 
The sparse and fine entries are encoded in open hash tables based upon the region containing the references. As the number of references from region A that point into region B increase, the number of cards in the hash table entry for region A increase (it's actually a bitmap with one bit per card) and as the number of regions that contain references that point into B increase, the number of region entries in fine grain table increase. Once we run out of space in the fine grain table (i.e. we can't add another bitmap for another region) we evict one of the densest region bitmaps and say "coarsen" that entry. When we have a coarsened entry we have to search the entire referencing region looking for incoming references compared to searching specific cards with the sparse and fine entries. I'll take a look at your logs to see if I can confirm. JohnC On 1/31/2013 6:12 AM, Michal Frajt wrote: > Hi all, > > After the iCMS got officially deprecated we decided to compare the G1 collector with our best tuned (i)CMS setup. Unfortunately we are not able to make the G1 young collection running any closer to the ParNew. Actually we wanted to compare the G1 concurrent marking STW pauses with the CMS initial-mark and remark STW pauses but already incredibly long running G1 young collections are unacceptable for us. > > We were able to recognize that the very long G1 young collections are caused by the scanning remembered sets. There is not much documentation about G1 internals but we were able to understand that the size of the remembered sets is related to the amount of mutating references from old regions (cards) to young regions. Unfortunately all our applications mutate permanently thousands references from old objects to young objects. > > We are testing with the latest OpenJDK7u extended by the 7189971 patch and CMSTriggerInterval implementation. The attached GC log files represent two very equal applications processing very similar data sets, one running the G1, second running the CMS collector. The OpenJDK7u has an extra output of _pending_cards (when G1TraceConcRefinement activated) which somehow relates to the remembered sets size. > > Young Comparison (both 128m, survivor ratio 5, max tenuring 15) > CMS - invoked every ~20 sec, avg. stop 60ms > G1 - invoked every ~16 sec, avg. stop 410ms !!! > > It there anything what could help us to reduce the Scan RS time or the G1 is simply not targeted for applications mutating heavily old region objects? 
> > CMS parameters > -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:CMSWaitDuration=60000 -XX:+CMSScavengeBeforeRemark -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2 > > G1 parameters (mind MaxNewSize not specified) > -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:G1MixedGCCountTarget=16 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 > > G1 log file GC young pause > [GC pause (young) [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 23697, predicted base time: 32.88 ms, remaining > time: 167.12 ms, target pause time: 200.00 ms] > [Parallel Time: 389.8 ms, GC Workers: 8] > >>>> > [Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9] > <<<< > [Eden: 119.0M(119.0M)->0.0B(118.0M) Survivors: 9216.0K->10.0M Heap: 1801.6M(2048.0M)->1685.7M(2048.0M)] > > Regards, > Michal > > From john.cuthbertson at oracle.com Thu Jan 31 18:07:21 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 31 Jan 2013 10:07:21 -0800 Subject: CMS vs G1 - Scan RS very long In-Reply-To: References: Message-ID: <510AB2D9.7070602@oracle.com> Hi Kirk, Please see my reply to Michal about high RSet scan times. Further response inline... On 1/31/2013 6:47 AM, Kirk Pepperdine wrote: > Hi, > > I'd like to add to Michal's comment to say that i've to add that I've seen very similar results in recent tuning efforts for low latency. In this case we didn't have a lot of mutation in old gen but I wasn't able to get young gen pauses times down to any where near what I could get to with the CMS collector. Unfortunately I've not been able to characterize the problem as well as you have as we had other fish to fry and I only had a limited amount of time to look at GC. That said I still will be able to run more experiments in the next two weeks. > > What I did notice is that young gen started reducing it's size but where as I calculated that a 15m eden was optimal, it stopped down sizing @ 40m. I'd be interested if anyone has any suggestions on how to get the young gen shrink if it's not shrinking enough on it's own. I'm hesitant to fix the size as there are times when heap should grow but under normal load I would hope that it would return to the smaller size. I believe you are running into the G1YoungSizePercent problem. This is the lower bound on the young gen as a percentage of the heap. The default was 20. I just recently changed the name of the flag and made the default 5. In 7u this flag is G1DefaultMinNewGenPercent so try -XX:+UnlockExperimentalVMOptions -XX:G1DefaultMinNewGenPercent=5 and see if your young gen gets reduced sufficiently for your needs. Cheers, JohnC From john.cuthbertson at oracle.com Thu Jan 31 18:24:44 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 31 Jan 2013 10:24:44 -0800 Subject: G1 concurrent phase durations do not state the time units ("secs") In-Reply-To: <35C8C843-CD7C-4B58-B25F-CBAFE54F4F7C@kodewerk.com> References: <51097FE5.1090402@oracle.com> <35C8C843-CD7C-4B58-B25F-CBAFE54F4F7C@kodewerk.com> Message-ID: <510AB6EC.80705@oracle.com> Hi Kirk, Apologies for the delay in getting back to you - I took yesterday afternoon off. 
On 1/30/2013 12:25 PM, Kirk Pepperdine wrote: > Hi John, > > Should I add that memory is reported in M,K, or B. I was really happy with just K being reported. Any comments on that? And now I think there's 'G' as well. I don't think I can comment on this, unfortunately. I always thought the GC logs did this unit conversion. At least I think I remember seeing it in the 1.4.2 time frame. I would be very nervous about making such a change since it affects all the collectors. Changing something like this would be controversial. There will be some, like you, who would welcome the change and there will be others who will complain bitterly. It's probably not going to happen soon, and we would need to make it switchable. Cheers, JohnC From john.cuthbertson at oracle.com Thu Jan 31 18:30:48 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 31 Jan 2013 10:30:48 -0800 Subject: G1 concurrent phase durations do not state the time units ("secs") In-Reply-To: <41178332-08A3-429F-94C4-4952EFDCBA0A@kodewerk.com> References: <51097FE5.1090402@oracle.com> <41178332-08A3-429F-94C4-4952EFDCBA0A@kodewerk.com> Message-ID: <510AB858.90408@oracle.com> Hi Kirk, This is probably associated with using and instance of the TraceTimer class in some cases and direct prints in others. I agree this should be cleaned up. I'll standardize on 'secs' since I think that is what the other collectors (though their TraceTimers) report. The reason why the cleanup reports the memory is because the cleanup pause can free memory. Any region that's found to have no live data within it is freed immediately during the cleanup pause. The others don't. Basically if a GC event causes some kind of heap transition, we print that transition. Cheers, JohnC On 1/30/2013 2:40 PM, Kirk Pepperdine wrote: > Hi John, > > Since we're on the subject of consistency, I have this record that reports in "sec" instead of "secs". > > 634.958: [GC concurrent-mark-end, 0.1525020 sec] > > and I have memory being reported as 252M->252M(397M) here; > > 568.675: [GC cleanup 252M->252M(397M), 0.0055868 secs] > > as well as 77M(77M)->0B(77M) here; > > [Eden: 77M(77M)->0B(77M) Survivors: 2048K->2048K Heap: 324M(397M)->247M(397M)] > > Either format works but it would be nice to have one and not both. > i'll repeat my request to report in K. B seems unnecessarily granular where as M might be too corse for smaller heap sizes.. but then, maybe not. Still sorting out the potential effects on analytics. > > I guess you've picked up on the other places where secs is missing. > > Regards, > Kirk > > > > On 2013-01-30, at 9:17 PM, John Cuthbertson wrote: > >> Hi Ramki, >> >> Thanks for the report. Just checked with a log that I generated this morning and the issue is still there. It looks like the units weren't added to a couple of prints in concurrentMarkThread.cpp >> >> I'll submit a CR. Expect a webrev later today or early tomorrow. >> >> Thanks, >> >> JohnC >> >> On 1/30/2013 11:58 AM, Srinivas Ramakrishna wrote: >>> Hi John, all -- >>> >>> I'm using 7u9, perhaps this has been fixed subsequently. 
Here's an example of the missing units (and the inconsistency):- >>> >>> The young and mixed pauses print duration units of "secs" :- >>> >>> 2013-01-30T01:46:45.652-0800: 9522.134: [GC pause (young), 1.10620800 secs] >>> 2013-01-30T01:47:03.563-0800: 9540.045: [GC pause (young), 0.90593900 secs] >>> 2013-01-30T01:47:18.619-0800: 9555.101: [GC pause (young), 0.81425400 secs] >>> 2013-01-30T01:47:59.536-0800: 9596.018: [GC pause (young), 0.84935400 secs] >>> 2013-01-30T01:48:12.097-0800: 9608.579: [GC pause (young) (initial-mark), 0.50153600 secs] >>> 2013-01-30T01:48:59.189-0800: 9655.671: [GC pause (young), 0.02815400 secs] >>> 2013-01-30T01:51:53.952-0800: 9830.457: [GC pause (mixed), 0.66860800 secs] >>> 2013-01-30T01:53:20.704-0800: 9917.186: [GC pause (mixed), 0.47479200 secs] >>> 2013-01-30T01:54:41.098-0800: 9997.579: [GC pause (mixed), 0.72149500 secs] >>> 2013-01-30T01:55:58.944-0800: 10075.426: [GC pause (young), 0.32158300 secs] >>> >>> >>> The concurrent phases are often missing the units, but not always (mark phase duration prints "sec", others are mum):- >>> >>> 2013-01-30T01:12:10.711-0800: 7447.193: [GC concurrent-root-region-scan-end, 1.0222980] >>> 2013-01-30T01:12:41.386-0800: 7477.868: [GC concurrent-mark-end, 30.6749800 sec] >>> 2013-01-30T01:12:41.626-0800: 7478.108: [GC concurrent-cleanup-end, 0.0063520] >>> 2013-01-30T01:24:18.588-0800: 8175.070: [GC concurrent-root-region-scan-end, 0.5868510] >>> 2013-01-30T01:25:01.089-0800: 8217.571: [GC concurrent-mark-end, 42.5016130 sec] >>> 2013-01-30T01:25:01.321-0800: 8217.803: [GC concurrent-cleanup-end, 0.0057450] >>> 2013-01-30T01:36:27.063-0800: 8903.545: [GC concurrent-root-region-scan-end, 0.3746230] >>> 2013-01-30T01:37:18.642-0800: 8955.124: [GC concurrent-mark-end, 51.5794260 sec] >>> 2013-01-30T01:37:18.869-0800: 8955.351: [GC concurrent-cleanup-end, 0.0048270] >>> 2013-01-30T01:48:13.162-0800: 9609.644: [GC concurrent-root-region-scan-end, 0.5630820] >>> 2013-01-30T01:48:55.513-0800: 9651.995: [GC concurrent-mark-end, 42.3504330 sec] >>> 2013-01-30T01:48:55.769-0800: 9652.251: [GC concurrent-cleanup-end, 0.0041170] >>> >>> >>> Would be nice to have it be consistent across G1 and indeed across all collectors, if not already the case. Also makes for more consistent parsing of logs. >>> >>> thanks! >>> -- ramki From monica.beckwith at oracle.com Thu Jan 31 18:44:46 2013 From: monica.beckwith at oracle.com (Monica Beckwith) Date: Thu, 31 Jan 2013 12:44:46 -0600 Subject: CMS vs G1 - Scan RS very long In-Reply-To: <510AB061.6010403@oracle.com> References: <510AB061.6010403@oracle.com> Message-ID: <510ABB9E.6050105@oracle.com> Hi Michal (and John), I did look at Michal's log files (BTW, both are the same G1 logs). And I can confirm that scan RS is the issue... here's the plot: The plot above only shows the max obj copy times (for all pauses), max RS Scan times and Parallel Time. So yes, scan RS is the culprit. Also, looking at your logs, it seems that for mixedGCs the reclaimable bytes don't cross 6%. So can you please try increasing the G1HeapWastePercent to 10? (the latest builds will have 10 as the default value). Please let me know if that improves your RT. -Monica On 1/31/2013 11:56 AM, John Cuthbertson wrote: > Hi Michal, > > I haven't looked at the logs yet but from your description it sounds > like a large number of the RSets have been coarsened and/or the fine > grain tables have gotten very dense. > > The RSet for a heap region tracks incoming references to objects in > that region. 
There are three levels on granularity: sparse, fine, and > coarse. The sparse and fine entries are encoded in open hash tables > based upon the region containing the references. As the number of > references from region A that point into region B increase, the number > of cards in the hash table entry for region A increase (it's actually > a bitmap with one bit per card) and as the number of regions that > contain references that point into B increase, the number of region > entries in fine grain table increase. Once we run out of space in the > fine grain table (i.e. we can't add another bitmap for another region) > we evict one of the densest region bitmaps and say "coarsen" that > entry. When we have a coarsened entry we have to search the entire > referencing region looking for incoming references compared to > searching specific cards with the sparse and fine entries. > > I'll take a look at your logs to see if I can confirm. > > JohnC > > > On 1/31/2013 6:12 AM, Michal Frajt wrote: >> Hi all, >> >> After the iCMS got officially deprecated we decided to compare the G1 >> collector with our best tuned (i)CMS setup. Unfortunately we are not >> able to make the G1 young collection running any closer to the >> ParNew. Actually we wanted to compare the G1 concurrent marking STW >> pauses with the CMS initial-mark and remark STW pauses but already >> incredibly long running G1 young collections are unacceptable for us. >> >> We were able to recognize that the very long G1 young collections are >> caused by the scanning remembered sets. There is not much >> documentation about G1 internals but we were able to understand that >> the size of the remembered sets is related to the amount of mutating >> references from old regions (cards) to young regions. Unfortunately >> all our applications mutate permanently thousands references from old >> objects to young objects. >> >> We are testing with the latest OpenJDK7u extended by the 7189971 >> patch and CMSTriggerInterval implementation. The attached GC log >> files represent two very equal applications processing very similar >> data sets, one running the G1, second running the CMS collector. The >> OpenJDK7u has an extra output of _pending_cards (when >> G1TraceConcRefinement activated) which somehow relates to the >> remembered sets size. >> >> Young Comparison (both 128m, survivor ratio 5, max tenuring 15) >> CMS - invoked every ~20 sec, avg. stop 60ms >> G1 - invoked every ~16 sec, avg. stop 410ms !!! >> >> It there anything what could help us to reduce the Scan RS time or >> the G1 is simply not targeted for applications mutating heavily old >> region objects? 
>> >> CMS parameters >> -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:MaxNewSize=128m >> -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 >> -XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M >> -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled >> -XX:CMSWaitDuration=60000 -XX:+CMSScavengeBeforeRemark >> -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC >> -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2 >> >> G1 parameters (mind MaxNewSize not specified) >> -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:PermSize=128m >> -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:+UseG1GC >> -XX:MaxGCPauseMillis=200 -XX:G1MixedGCCountTarget=16 >> -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 >> >> G1 log file GC young pause >> [GC pause (young) [G1Ergonomics (CSet Construction) start choosing >> CSet, _pending_cards: 23697, predicted base time: 32.88 ms, remaining >> time: 167.12 ms, target pause time: 200.00 ms] >> [Parallel Time: 389.8 ms, GC Workers: 8] >> >>>> >> [Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, >> Sum: 2642.9] >> <<<< >> [Eden: 119.0M(119.0M)->0.0B(118.0M) Survivors: 9216.0K->10.0M Heap: >> 1801.6M(2048.0M)->1685.7M(2048.0M)] >> >> Regards, >> Michal >> >> > -- Oracle Monica Beckwith | Java Performance Engineer VOIP: +1 512 401 1274 Texas Green Oracle Oracle is committed to developing practices and products that help protect the environment -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: iaadbbcg.png Type: image/png Size: 41189 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: oracle_sig_logo.gif Type: image/gif Size: 658 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: green-for-email-sig_0.gif Type: image/gif Size: 356 bytes Desc: not available URL: From kirk at kodewerk.com Thu Jan 31 20:24:12 2013 From: kirk at kodewerk.com (Kirk Pepperdine) Date: Thu, 31 Jan 2013 21:24:12 +0100 Subject: G1 concurrent phase durations do not state the time units ("secs") In-Reply-To: <510AB6EC.80705@oracle.com> References: <51097FE5.1090402@oracle.com> <35C8C843-CD7C-4B58-B25F-CBAFE54F4F7C@kodewerk.com> <510AB6EC.80705@oracle.com> Message-ID: <7AC4CB92-6006-497C-B3E0-98DDAEC3B5E9@kodewerk.com> Hi John, No need for apologies for having a day off. We all need to do that. As others have said here, documentation on G1 is still quite sparse so I really appreciate your responses. You can probably guess that I've looked at 100s of GC logs over the years from just about every version of the JVM there is. Can't say I've seen every combination of log thats out there but I've yet to see anything but K used as units. I just scanned the code for tty->print_cr... didn't see anything but a K format. Anyways, it not about the format. My only concern is the crossness in the measure. As for the memory summary, it's not very important.. it just seems like a consistency issue but then I've not looked at enough G1 logs to know better. Regards, Kirk On 2013-01-31, at 7:24 PM, John Cuthbertson wrote: > Hi Kirk, > > Apologies for the delay in getting back to you - I took yesterday afternoon off. > > On 1/30/2013 12:25 PM, Kirk Pepperdine wrote: >> Hi John, >> >> Should I add that memory is reported in M,K, or B. I was really happy with just K being reported. Any comments on that? > > And now I think there's 'G' as well. 
> > I don't think I can comment on this, unfortunately. I always thought the GC logs did this unit conversion. At least I think I remember seeing it in the 1.4.2 time frame. I would be very nervous about making such a change since it affects all the collectors. Changing something like this would be controversial. There will be some, like you, who would welcome the change and there will be others who will complain bitterly. It's probably not going to happen soon, and we would need to make it switchable. > > Cheers, > > JohnC From john.cuthbertson at oracle.com Thu Jan 31 22:08:52 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 31 Jan 2013 14:08:52 -0800 Subject: RFR(XS): 8005875: G1: Kitchensink fails with ParallelGCThreads=0 In-Reply-To: <50EC99ED.4090903@oracle.com> References: <50EC99ED.4090903@oracle.com> Message-ID: <510AEB74.4080309@oracle.com> Hi Everyone, Here's a new webrev based upon feedback from Vitaly and Bengt: http://cr.openjdk.java.net/~johnc/8005875/webrev.1/ I've wrapped the check with the asserts suggested by Bengt into a small routine. Testing: GC test suite with ConcGCThreads=3 Kitchensink with ParallelGCThreads=0: Thanks, JohnC On 1/8/2013 2:13 PM, John Cuthbertson wrote: > Hi Everyone, > > Can I please have a couple of volunteers look over the fix for this CR > - the webrev can be found at: > http://cr.openjdk.java.net/~johnc/8005875/webrev.0/ > > Summary: > One of the modules in the Kitchensink test generates a VM_PrintThreads > vm operation. The JVM crashes when it tries to print out G1's > concurrent marking worker threads when ParallelGCThreads=0 because the > work gang has not been created. The fix is to add the same check > that's used elsewhere in G1's concurrent marking. > > Testing: > Kitchensink with ParallelGCThreads=0 > > Thanks, > > JohnC From yamauchi at google.com Thu Jan 31 22:33:20 2013 From: yamauchi at google.com (Hiroshi Yamauchi) Date: Thu, 31 Jan 2013 14:33:20 -0800 Subject: Deallocating memory pages In-Reply-To: References: <51021A2E.9050503@oracle.com> Message-ID: Hi Bernd, > I wonder if there is any deallocation cost involved at all. The VM can > just mark the page as not-dirty and not-used. The only cost to use the > page again (at the same place) would be to zero it. (and even that could > be avoided if the VMM remembers the original owner and map it backl to the > process if that process touches it again and no other process had the need > for the page. Kind of same as buffer cache pages. I'd not be surprised if it's the case (though I am not very familiar with how the kernel or the VMM does things.) > > But I guess only some performance tests can answer that. (And add a test > for large/hugepages to avoid automatic page splits if they are partially > freed) Right.
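For what it's worth, a very rough way to run such a test from the Java side might look like the sketch below (class name made up; one would watch the process RSS externally, e.g. with top or /proc/<pid>/status, and whether anything is actually handed back to the OS depends on the collector and on the -Xms/-Xmx settings used):

    import java.util.ArrayList;
    import java.util.List;

    public class PageReleaseProbe {
        public static void main(String[] args) throws InterruptedException {
            List<byte[]> chunks = new ArrayList<byte[]>();
            for (int i = 0; i < 512; i++) {
                chunks.add(new byte[1024 * 1024]);       // touch roughly 512 MB of heap
            }
            System.out.println("allocated - check RSS");
            Thread.sleep(30000);

            chunks.clear();                              // make it all unreachable
            System.gc();                                 // request a full collection
            System.out.println("collected - check RSS again");
            Thread.sleep(30000);
        }
    }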