Strange Long ParNew GC Pauses (Sample Code Included)
Jon Masamitsu
jon.masamitsu at oracle.com
Fri May 8 21:25:10 UTC 2015
Ramki,
I compared the memory/blockOffsetTable.* files between 6u35 and 7 and there
were essentially no differences. The .cpp file had some changes in
assertions.
The other two were the same except for comments.
Between 6 and 6u35 (didn't pin point it), we changed from the rather
weak N_powers=3 :-) to N_powers=14.
There seem to be two problematic cases.
1) Spike in ParNew times that lasts for 2 (or a few) GC's
2) Continually growing ParNew times
From just reading the CR
https://bugs.openjdk.java.net/browse/JDK-8079274
the test case seems to reproduce 1). The test case
has a large object that lives till the application exits.
If it were a broken logarithmic BOT, I would not expect
the ParNew pauses to shrink again.
Jon
On 5/8/2015 11:49 AM, Srinivas Ramakrishna wrote:
> This is indeed interesting...
>
> Jesper, did you figure out why (as James seems to note) the problem is
> seen in JDK 8 but not in JDK 6 (he's using ParNew in both cases).
> Also, isn't the BOT logarithmic, which will allow threads to "quickly"
> traverse back to the start of even a large object? Did something
> change in the BOT that might have led to what James sees as a
> regression, or did the range that he characterizes as a "blind spot"
> merely move from one place to another?
>
> Unfortunately, I have not been keeping up with the changes in this
> area and am not working with JDK 6 code much, so can't offer any
> clues, but am surprised that this would be a regression. (I would
> expect the same performance pathologies in this code to exist in 6 and
> forward -- has it been established conclusively that this is indeed a
> regression?)
>
> thanks!
> -- ramki
>
> On Fri, May 8, 2015 at 6:40 AM, James Lent <jlent at tivo.com
> <mailto:jlent at tivo.com>> wrote:
>
> The explanation of the problem added by Jesper Wilhelmsson on
> 2015-05-06 08:59 to JDK-8079274 is interesting, but, not very
> satisfying.
>
> 1) The problem affects more than this one specific breakpoint. It
> impacts objects in at least the range of about 480M to 512M. The
> comment "A larger object probably triggers some higher level code
> that are able to skip scanning of this part of the memory" does
> not inspire confidence. It makes even more concerned that we do
> not understand this issue and that there could be other triggers.
> It seems like "the higher level code" has a 32M blind spot.
> Further I can create scenarios where the GC code takes
> progressively longer to clear the issue. Presumably it completes
> the lower block scan during each GC cycle. Why does it take
> longer and longer each time (my guess is it has something to do
> with the number of objects moved to survivor space). If I cache 1
> in 100 instead of 1 in 10 then there are 11 progressively slower
> ParNewGC (after several fast one) and the issue takes minutes to
> clear:
>
> $ /usr/lib/jvm/java-7-oracle/bin/java -verbose:gc
> -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -Xmx4G -Xms4G LongParNewPause 536870384
> 100 100
> 0.308: [GC0.308: [ParNew: 272640K->3029K(306688K), 0.0113180 secs]
> 796927K->527316K(4160256K), 0.0114040 secs] [Times: user=0.05
> sys=0.00, real=0.02 secs]
> 0.367: [GC0.367: [ParNew: 275669K->7766K(306688K), 0.0140840 secs]
> 799956K->532053K(4160256K), 0.0141630 secs] [Times: user=0.04
> sys=0.01, real=0.02 secs]
> 0.430: [GC0.430: [ParNew: 280406K->10057K(306688K), 0.0120810
> secs] 804693K->534344K(4160256K), 0.0121690 secs] [Times:
> user=0.05 sys=0.00, real=0.02 secs]
> 0.490: [GC0.490: [ParNew: 282697K->15383K(306688K), 0.0128540
> secs] 806984K->539670K(4160256K), 0.0129390 secs] [Times:
> user=0.05 sys=0.00, real=0.02 secs]
> 0.551: [GC0.551: [ParNew: 288023K->21006K(306688K), 0.0146480
> secs] 812310K->545294K(4160256K), 0.0147650 secs] [Times:
> user=0.05 sys=0.01, real=0.01 secs]
> 0.613: [GC0.613: [ParNew: 293646K->26805K(306688K), 0.0141250
> secs] 817934K->551092K(4160256K), 0.0142050 secs] [Times:
> user=0.05 sys=0.00, real=0.01 secs]
> 0.675: [GC0.675: [ParNew: 299445K->24258K(306688K), 0.4816840
> secs] 823732K->551436K(4160256K), 0.4817620 secs] [Times:
> user=1.90 sys=0.00, real=0.48 secs]
> 1.205: [GC1.205: [ParNew: 296898K->17914K(306688K), 3.4216550
> secs] 824076K->547788K(4160256K), 3.4217180 secs] [Times:
> user=13.50 sys=0.00, real=3.42 secs]
> 4.674: [GC4.674: [ParNew: 290554K->17433K(306688K), 6.3406820
> secs] 820428K->550031K(4160256K), 6.3407410 secs] [Times:
> user=25.00 sys=0.02, real=6.34 secs]
> 11.062: [GC11.062: [ParNew: 290073K->17315K(306688K), 10.9493130
> secs] 822671K->552636K(4160256K), 10.9493890 secs] [Times:
> user=42.98 sys=0.03, real=10.95 secs]
> 22.059: [GC22.059: [ParNew: 289955K->17758K(306688K), 16.0539650
> secs] 825276K->555798K(4160256K), 16.0540250 secs] [Times:
> user=62.44 sys=0.09, real=16.05 secs]
> 38.161: [GC38.161: [ParNew: 290398K->17745K(306688K), 20.6900820
> secs] 828438K->558512K(4160256K), 20.6901420 secs] [Times:
> user=80.79 sys=0.08, real=20.69 secs]
> 58.898: [GC58.898: [ParNew: 290385K->21826K(306688K), 24.6781700
> secs] 831152K->565310K(4160256K), 24.6782710 secs] [Times:
> user=95.58 sys=0.05, real=24.68 secs]
> 83.625: [GC83.625: [ParNew: 294466K->21834K(306688K), 30.3501800
> secs] 837950K->568044K(4160256K), 30.3502610 secs] [Times:
> user=115.80 sys=0.11, real=30.35 secs]
> 114.022: [GC114.022: [ParNew: 294474K->21836K(306688K), 33.6648800
> secs] 840684K->570772K(4160256K), 33.6649390 secs] [Times:
> user=127.38 sys=0.14, real=33.67 secs]
> 147.736: [GC147.736: [ParNew: 294476K->18454K(306688K), 38.9828430
> secs] 843412K->570117K(4160256K), 38.9829040 secs] [Times:
> user=149.14 sys=0.15, real=38.99 secs]
> 186.766: [GC186.766: [ParNew: 291094K->26602K(306688K), 39.8726730
> secs] 842757K->580977K(4160256K), 39.8727330 secs] [Times:
> user=156.35 sys=0.11, real=39.88 secs]
> 226.686: [GC226.686: [ParNew: 299242K->26612K(306688K), 0.0174470
> secs] 853617K->583714K(4160256K), 0.0175300 secs] [Times:
> user=0.06 sys=0.00, real=0.02 secs]
>
> 2) The suggestion to use only 1 thread is not acceptable to me.
> Use parallel GC with one thread? Further it did not work for me
> (in fact it made things worse). "user" time does now equal "real"
> time (and the new size is 1/4 as big) so I think the parameter
> took effect:
>
> $ /usr/lib/jvm/java-8-oracle/bin/java -verbose:gc
> -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -Xmx4G -Xms4G -XX:ParallelGCThreads=1
> LongParNewPause 536870384 100 10
> 0.226: [GC (Allocation Failure) 0.226: [ParNew:
> 68160K->6980K(76672K), 0.0489756 secs] 592447K->531267K(4185792K),
> 0.0490718 secs] [Times: user=0.05 sys=0.00, real=0.05 secs]
> 0.291: [GC (Allocation Failure) 0.291: [ParNew:
> 75140K->8512K(76672K), 29.8707281 secs]
> 599427K->539479K(4185792K), 29.8707954 secs] [Times: user=29.80
> sys=0.00, real=29.88 secs]
> 30.180: [GC (Allocation Failure) 30.180: [ParNew:
> 76672K->8093K(76672K), 30.6357699 secs]
> 607639K->545772K(4185792K), 30.6358257 secs] [Times: user=30.57
> sys=0.00, real=30.63 secs]
> 60.837: [GC (Allocation Failure) 60.837: [ParNew:
> 76253K->6830K(76672K), 69.2115255 secs]
> 613932K->552169K(4185792K), 69.2115783 secs] [Times: user=69.05
> sys=0.01, real=69.22 secs]
> 130.068: [GC (Allocation Failure) 130.068: [ParNew:
> 74990K->8226K(76672K), 132.5118903 secs]
> 620329K->560381K(4185792K), 132.5119459 secs] [Times: user=132.21
> sys=0.00, real=132.52 secs]
> 262.603: [GC (Allocation Failure) 262.603: [ParNew:
> 76386K->8512K(76672K), 0.0400138 secs] 628541K->568748K(4185792K),
> 0.0400687 secs] [Times: user=0.04 sys=0.00, real=0.04 secs]
>
> 3) Applications should not be expected to tune the GC settings to
> account for the size of one static object. The connection between
> the size of this object and the GC performance was not easy to
> even find. Further the parameter ParGCCardsPerStrideChunk is
> pretty low level and not one I think many application developers
> are familiar with.
>
> Error: VM option 'ParGCCardsPerStrideChunk' is diagnostic and must
> be enabled via -XX:+UnlockDiagnosticVMOptions.
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.
>
> 4) I can not reproduce this issue in Oracle 6. This appears to be
> a regression. How can fixing a regression be called an
> enhancement request?
>
>
>
>
>
>
>
> ------------------------------------------------------------------------
>
> This email and any attachments may contain confidential and
> privileged material for the sole use of the intended recipient.
> Any review, copying, or distribution of this email (or any
> attachments) by others is prohibited. If you are not the intended
> recipient, please contact the sender immediately and permanently
> delete this email and any attachments. No employee or agent of
> TiVo Inc. is authorized to conclude any binding agreement on
> behalf of TiVo Inc. by email. Binding agreements with TiVo Inc.
> may only be made by a signed written agreement.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20150508/4c12ad68/attachment.htm>
More information about the hotspot-gc-dev
mailing list