From bernd.eckenfels at googlemail.com Mon Oct 1 09:37:16 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Mon, 1 Oct 2012 18:37:16 +0200 Subject: Initial mark times Message-ID: Hello, in my logfile (Linux x64 1.6.0_33 d64), I see the following initial-mark which tooks 12real seconds. But if I look at the CMSStatistics I see no scanning thread took nearly as long as 12s. In addition to that a see a ParNew 10s after the initial-mark message. Does this perhaps mean, that the initial-mark was started at 11:05:33 but did wait for the newGC. If yes, how long was the STW time? 2012-09-27T11:04:41.986-0800: [GC [ParNew: 13937708K->463860K(15099520K), 0.1200170 secs] 15849137K->2387363K(48653952K), 0.1222950 secs] [Times: user=1.62 sys=0.01, real=0.12 secs] 2012-09-27T11:05:33.839-0800: [GC [1 CMS-initial-mark: 1923502K(33554432K)] 15661857K(48653952K), 12.1543770 secs] [Times: user=12.11 sys=0.00, real=12.16 secs] Finished cms space scanning in 4th thread: 0.207 sec Finished cms space scanning in 0th thread: 0.207 sec Finished cms space scanning in 1th thread: 0.207 sec Finished cms space scanning in 7th thread: 0.207 sec Finished cms space scanning in 13th thread: 0.207 sec Finished cms space scanning in 10th thread: 0.207 sec Finished cms space scanning in 8th thread: 0.207 sec Finished cms space scanning in 2th thread: 0.207 sec Finished cms space scanning in 3th thread: 0.207 sec Finished cms space scanning in 9th thread: 0.207 sec Finished cms space scanning in 11th thread: 0.207 sec Finished cms space scanning in 6th thread: 0.207 sec Finished cms space scanning in 5th thread: 0.208 sec Finished cms space scanning in 12th thread: 0.208 sec Finished perm space scanning in 11th thread: 0.051 sec Finished perm space scanning in 1th thread: 0.051 sec Finished perm space scanning in 3th thread: 0.051 sec Finished perm space scanning in 10th thread: 0.051 sec Finished perm space scanning in 8th thread: 0.051 sec Finished perm space scanning in 7th thread: 0.051 sec Finished perm space scanning in 13th thread: 0.051 sec Finished perm space scanning in 9th thread: 0.051 sec Finished perm space scanning in 6th thread: 0.051 sec Finished perm space scanning in 4th thread: 0.051 sec Finished perm space scanning in 5th thread: 0.051 sec Finished perm space scanning in 2th thread: 0.065 sec Finished perm space scanning in 0th thread: 0.089 sec 2012-09-27T11:05:46.300-0800: [GC [ParNew: 13885684K->569855K(15099520K), 0.1279470 secs] 15809187K->2493358K(48653952K), 0.1305190 secs] [Times: user=1.91 sys=0.19, real=0.13 secs] Finished perm space scanning in 12th thread: 0.400 sec Finished work stealing in 12th thread: 0.000 sec Finished work stealing in 1th thread: 0.350 sec Finished work stealing in 5th thread: 0.348 sec Finished work stealing in 2th thread: 0.335 sec Finished work stealing in 6th thread: 0.349 sec Finished work stealing in 9th thread: 0.349 sec Finished work stealing in 3th thread: 0.350 sec Finished work stealing in 4th thread: 0.349 sec Finished work stealing in 10th thread: 0.350 sec Finished work stealing in 8th thread: 0.350 sec Finished work stealing in 11th thread: 0.350 sec Finished work stealing in 0th thread: 0.312 sec Finished work stealing in 13th thread: 0.349 sec Finished work stealing in 7th thread: 0.350 sec 2012-09-27T11:05:46.604-0800: [CMS-concurrent-mark: 0.320/0.608 secs] (CMS-concurrent-mark yielded 63 times) [Times: user=7.42 sys=0.45, real=0.61 secs] (cardTable: 1474 cards, re-scanned 39687 cards, 2 iterations) 2012-09-27T11:05:46.788-0800: 
[CMS-concurrent-preclean: 0.177/0.183 secs] (CMS-concurrent-preclean yielded 0 times) The initial mark was starting 52s after the ParNew, it was therefore working on a larger dataset (I guess), but it does not explain the long wallclock time (based on the thread statistics). I see similiar patterns multiple times in this logfile: 2012-09-28T02:05:33.476-0800: [GC [ParNew: 13925727K->588390K(15099520K), 0.1312600 secs] 15435756K->2098425K(48653952K), 0.1342830 secs] [Times: user=2.37 sys=0.02, real=0.13 secs] 2012-09-28T02:08:16.854-0800: [GC [1 CMS-initial-mark: 1510035K(33554432K)] 14963114K(48653952K), 11.8353220 secs] [Times: user=11.83 sys=0.01, real=11.83 secs] 2012-09-28T02:08:28.708-0800: [GC [ParNew: 13453093K->660355K(15099520K), 0.1979760 secs] 14963128K->2196124K(48653952K), 0.2009570 secs] [Times: user=2.52 sys=0.00, real=0.20 secs] Finished cms space scanning in 1th thread: 0.558 sec Finished cms space scanning in 13th thread: 0.558 sec Finished cms space scanning in 10th thread: 0.558 sec Finished cms space scanning in 12th thread: 0.558 sec Finished cms space scanning in 8th thread: 0.558 sec Finished cms space scanning in 2th thread: 0.558 sec Finished cms space scanning in 11th thread: 0.558 sec Finished cms space scanning in 4th thread: 0.558 sec Finished cms space scanning in 5th thread: 0.558 sec Finished cms space scanning in 7th thread: 0.558 sec Finished cms space scanning in 3th thread: 0.558 sec Finished cms space scanning in 6th thread: 0.559 sec Finished cms space scanning in 9th thread: 0.560 sec Finished cms space scanning in 0th thread: 0.563 sec Finished perm space scanning in 9th thread: 0.050 sec Finished perm space scanning in 7th thread: 0.053 sec Finished perm space scanning in 12th thread: 0.053 sec Finished perm space scanning in 2th thread: 0.053 sec Finished perm space scanning in 1th thread: 0.053 sec Finished perm space scanning in 5th thread: 0.053 sec Finished perm space scanning in 4th thread: 0.053 sec Finished perm space scanning in 13th thread: 0.053 sec Finished perm space scanning in 11th thread: 0.053 sec Finished perm space scanning in 8th thread: 0.053 sec Finished perm space scanning in 3th thread: 0.053 sec Finished perm space scanning in 10th thread: 0.062 sec Finished perm space scanning in 0th thread: 0.096 sec Finished perm space scanning in 6th thread: 0.189 sec Finished work stealing in 8th thread: 0.137 sec Finished work stealing in 13th thread: 0.137 sec Finished work stealing in 0th thread: 0.089 sec Finished work stealing in 6th thread: 0.000 sec Finished work stealing in 7th thread: 0.137 sec Finished work stealing in 1th thread: 0.137 sec Finished work stealing in 11th thread: 0.137 sec Finished work stealing in 10th thread: 0.128 sec Finished work stealing in 4th thread: 0.137 sec Finished work stealing in 9th thread: 0.137 sec Finished work stealing in 2th thread: 0.137 sec Finished work stealing in 3th thread: 0.137 sec Finished work stealing in 5th thread: 0.137 sec Finished work stealing in 12th thread: 0.137 sec 2012-09-28T02:08:29.440-0800: [CMS-concurrent-mark: 0.402/0.748 secs] (CMS-concurrent-mark yielded 9 times) [Times: user=12.12 sys=0.30, real=0.75 secs] (cardTable: 5149 cards, re-scanned 56814 cards, 2 iterations) and here: 2012-09-28T09:07:35.539-0800: [GC [ParNew: 13892816K->612713K(15099520K), 0.1219960 secs] 15663242K->2392275K(48653952K), 0.1249600 secs] [Times: user=2.30 sys=0.03, real=0.13 secs] 2012-09-28T09:09:39.083-0800: [GC [1 CMS-initial-mark: 1779561K(33554432K)] 11806175K(48653952K), 8.4227640 
secs] [Times: user=8.39 sys=0.02, real=8.43 secs] 2012-09-28T09:09:47.523-0800: [GC [ParNew: 10026619K->586782K(15099520K), 0.1064270 secs] 11806181K->2366343K(48653952K), 0.1094300 secs] [Times: user=1.92 sys=0.01, real=0.11 secs] Finished cms space scanning in 8th thread: 0.364 sec Finished cms space scanning in 9th thread: 0.364 sec Finished cms space scanning in 6th thread: 0.364 sec Finished cms space scanning in 0th thread: 0.364 sec Finished cms space scanning in 2th thread: 0.364 sec Finished cms space scanning in 7th thread: 0.364 sec Finished cms space scanning in 4th thread: 0.364 sec Finished cms space scanning in 13th thread: 0.364 sec Finished cms space scanning in 3th thread: 0.364 sec Finished cms space scanning in 11th thread: 0.364 sec Finished cms space scanning in 5th thread: 0.365 sec Finished cms space scanning in 12th thread: 0.368 sec Finished cms space scanning in 1th thread: 0.369 sec Finished cms space scanning in 10th thread: 0.392 sec Finished perm space scanning in 11th thread: 0.046 sec Finished perm space scanning in 10th thread: 0.018 sec Finished perm space scanning in 4th thread: 0.047 sec Finished perm space scanning in 12th thread: 0.042 sec Finished perm space scanning in 6th thread: 0.047 sec Finished perm space scanning in 5th thread: 0.046 sec Finished perm space scanning in 0th thread: 0.047 sec Finished perm space scanning in 3th thread: 0.047 sec Finished perm space scanning in 9th thread: 0.047 sec Finished perm space scanning in 7th thread: 0.047 sec Finished perm space scanning in 8th thread: 0.048 sec Finished perm space scanning in 1th thread: 0.051 sec Finished perm space scanning in 2th thread: 0.080 sec Finished perm space scanning in 13th thread: 0.168 sec Finished work stealing in 13th thread: 0.000 sec Finished work stealing in 11th thread: 0.121 sec Finished work stealing in 6th thread: 0.121 sec Finished work stealing in 8th thread: 0.120 sec Finished work stealing in 3th thread: 0.121 sec Finished work stealing in 0th thread: 0.121 sec Finished work stealing in 4th thread: 0.121 sec Finished work stealing in 2th thread: 0.088 sec Finished work stealing in 10th thread: 0.121 sec Finished work stealing in 5th thread: 0.121 sec Finished work stealing in 12th thread: 0.121 sec Finished work stealing in 1th thread: 0.112 sec Finished work stealing in 9th thread: 0.121 sec Finished work stealing in 7th thread: 0.121 sec 2012-09-28T09:09:48.040-0800: [CMS-concurrent-mark: 0.354/0.532 secs] (CMS-concurrent-mark yielded 3 times) [Times: user=7.92 sys=0.24, real=0.53 secs] (cardTable: 3342 cards, re-scanned 18695 cards, 2 iterations) This is by the way using -XX:+CMSScavengeBeforeRemark but I guess this is not affecting the behaviour here, right? Greetings Bernd From java at java4.info Mon Oct 1 10:43:54 2012 From: java at java4.info (Florian Binder) Date: Mon, 01 Oct 2012 19:43:54 +0200 Subject: Initial mark times In-Reply-To: References: Message-ID: <5069D65A.3010901@java4.info> Hi Bernd, you have a very large young generation. This results, as you already mentioned, in a very large dataset for the initial-marking. The initial marking is done single threaded and therefore takes a lot of time. Your initial-mark phase ended at 11:05:33. It stopped the world for about 12 seconds (started at 11:05:21). The "Finished cms..." information is for the concurrent marking phase which took 0.61 sec. Have you tried to reduce your young generation? Maybe 1gb is enough? This is all as far as I know. 
Hope somebody can confirm it or correct me, if I am wrong ;-) -Flo Am 01.10.2012 18:37, schrieb Bernd Eckenfels: > Hello, > > in my logfile (Linux x64 1.6.0_33 d64), I see the following > initial-mark which tooks 12real seconds. But if I look at the > CMSStatistics I see no scanning thread took nearly as long as 12s. In > addition to that a see a ParNew 10s after the initial-mark message. > Does this perhaps mean, that the initial-mark was started at 11:05:33 > but did wait for the newGC. > > If yes, how long was the STW time? > > 2012-09-27T11:04:41.986-0800: [GC [ParNew: > 13937708K->463860K(15099520K), 0.1200170 secs] > 15849137K->2387363K(48653952K), 0.1222950 secs] [Times: user=1.62 > sys=0.01, real=0.12 secs] > > 2012-09-27T11:05:33.839-0800: [GC [1 CMS-initial-mark: > 1923502K(33554432K)] 15661857K(48653952K), 12.1543770 secs] [Times: > user=12.11 sys=0.00, real=12.16 secs] > Finished cms space scanning in 4th thread: 0.207 sec > Finished cms space scanning in 0th thread: 0.207 sec > Finished cms space scanning in 1th thread: 0.207 sec > Finished cms space scanning in 7th thread: 0.207 sec > Finished cms space scanning in 13th thread: 0.207 sec > Finished cms space scanning in 10th thread: 0.207 sec > Finished cms space scanning in 8th thread: 0.207 sec > Finished cms space scanning in 2th thread: 0.207 sec > Finished cms space scanning in 3th thread: 0.207 sec > Finished cms space scanning in 9th thread: 0.207 sec > Finished cms space scanning in 11th thread: 0.207 sec > Finished cms space scanning in 6th thread: 0.207 sec > Finished cms space scanning in 5th thread: 0.208 sec > Finished cms space scanning in 12th thread: 0.208 sec > Finished perm space scanning in 11th thread: 0.051 sec > Finished perm space scanning in 1th thread: 0.051 sec > Finished perm space scanning in 3th thread: 0.051 sec > Finished perm space scanning in 10th thread: 0.051 sec > Finished perm space scanning in 8th thread: 0.051 sec > Finished perm space scanning in 7th thread: 0.051 sec > Finished perm space scanning in 13th thread: 0.051 sec > Finished perm space scanning in 9th thread: 0.051 sec > Finished perm space scanning in 6th thread: 0.051 sec > Finished perm space scanning in 4th thread: 0.051 sec > Finished perm space scanning in 5th thread: 0.051 sec > Finished perm space scanning in 2th thread: 0.065 sec > Finished perm space scanning in 0th thread: 0.089 sec > > 2012-09-27T11:05:46.300-0800: [GC [ParNew: > 13885684K->569855K(15099520K), 0.1279470 secs] > 15809187K->2493358K(48653952K), 0.1305190 secs] [Times: user=1.91 > sys=0.19, real=0.13 secs] > > Finished perm space scanning in 12th thread: 0.400 sec > Finished work stealing in 12th thread: 0.000 sec > Finished work stealing in 1th thread: 0.350 sec > Finished work stealing in 5th thread: 0.348 sec > Finished work stealing in 2th thread: 0.335 sec > Finished work stealing in 6th thread: 0.349 sec > Finished work stealing in 9th thread: 0.349 sec > Finished work stealing in 3th thread: 0.350 sec > Finished work stealing in 4th thread: 0.349 sec > Finished work stealing in 10th thread: 0.350 sec > Finished work stealing in 8th thread: 0.350 sec > Finished work stealing in 11th thread: 0.350 sec > Finished work stealing in 0th thread: 0.312 sec > Finished work stealing in 13th thread: 0.349 sec > Finished work stealing in 7th thread: 0.350 sec > > 2012-09-27T11:05:46.604-0800: [CMS-concurrent-mark: 0.320/0.608 secs] > (CMS-concurrent-mark yielded 63 times) > [Times: user=7.42 sys=0.45, real=0.61 secs] > (cardTable: 1474 cards, re-scanned 
39687 cards, 2 iterations) > > 2012-09-27T11:05:46.788-0800: [CMS-concurrent-preclean: 0.177/0.183 > secs] (CMS-concurrent-preclean yielded 0 times) > > The initial mark was starting 52s after the ParNew, it was therefore > working on a larger dataset (I guess), but it does not explain the > long wallclock time (based on the thread statistics). I see similiar > patterns multiple times in this logfile: > > 2012-09-28T02:05:33.476-0800: [GC [ParNew: > 13925727K->588390K(15099520K), 0.1312600 secs] > 15435756K->2098425K(48653952K), 0.1342830 secs] [Times: user=2.37 > sys=0.02, real=0.13 secs] > 2012-09-28T02:08:16.854-0800: [GC [1 CMS-initial-mark: > 1510035K(33554432K)] 14963114K(48653952K), 11.8353220 secs] [Times: > user=11.83 sys=0.01, real=11.83 secs] > > 2012-09-28T02:08:28.708-0800: [GC [ParNew: > 13453093K->660355K(15099520K), 0.1979760 secs] > 14963128K->2196124K(48653952K), 0.2009570 secs] [Times: user=2.52 > sys=0.00, real=0.20 secs] > > Finished cms space scanning in 1th thread: 0.558 sec > Finished cms space scanning in 13th thread: 0.558 sec > Finished cms space scanning in 10th thread: 0.558 sec > Finished cms space scanning in 12th thread: 0.558 sec > Finished cms space scanning in 8th thread: 0.558 sec > Finished cms space scanning in 2th thread: 0.558 sec > Finished cms space scanning in 11th thread: 0.558 sec > Finished cms space scanning in 4th thread: 0.558 sec > Finished cms space scanning in 5th thread: 0.558 sec > Finished cms space scanning in 7th thread: 0.558 sec > Finished cms space scanning in 3th thread: 0.558 sec > Finished cms space scanning in 6th thread: 0.559 sec > Finished cms space scanning in 9th thread: 0.560 sec > Finished cms space scanning in 0th thread: 0.563 sec > Finished perm space scanning in 9th thread: 0.050 sec > Finished perm space scanning in 7th thread: 0.053 sec > Finished perm space scanning in 12th thread: 0.053 sec > Finished perm space scanning in 2th thread: 0.053 sec > Finished perm space scanning in 1th thread: 0.053 sec > Finished perm space scanning in 5th thread: 0.053 sec > Finished perm space scanning in 4th thread: 0.053 sec > Finished perm space scanning in 13th thread: 0.053 sec > Finished perm space scanning in 11th thread: 0.053 sec > Finished perm space scanning in 8th thread: 0.053 sec > Finished perm space scanning in 3th thread: 0.053 sec > Finished perm space scanning in 10th thread: 0.062 sec > Finished perm space scanning in 0th thread: 0.096 sec > Finished perm space scanning in 6th thread: 0.189 sec > Finished work stealing in 8th thread: 0.137 sec > Finished work stealing in 13th thread: 0.137 sec > Finished work stealing in 0th thread: 0.089 sec > Finished work stealing in 6th thread: 0.000 sec > Finished work stealing in 7th thread: 0.137 sec > Finished work stealing in 1th thread: 0.137 sec > Finished work stealing in 11th thread: 0.137 sec > Finished work stealing in 10th thread: 0.128 sec > Finished work stealing in 4th thread: 0.137 sec > Finished work stealing in 9th thread: 0.137 sec > Finished work stealing in 2th thread: 0.137 sec > Finished work stealing in 3th thread: 0.137 sec > Finished work stealing in 5th thread: 0.137 sec > Finished work stealing in 12th thread: 0.137 sec > > 2012-09-28T02:08:29.440-0800: [CMS-concurrent-mark: 0.402/0.748 secs] > (CMS-concurrent-mark yielded 9 times) > [Times: user=12.12 sys=0.30, real=0.75 secs] > (cardTable: 5149 cards, re-scanned 56814 cards, 2 iterations) > > and here: > > 2012-09-28T09:07:35.539-0800: [GC [ParNew: > 13892816K->612713K(15099520K), 0.1219960 
secs] > 15663242K->2392275K(48653952K), 0.1249600 secs] [Times: user=2.30 > sys=0.03, real=0.13 secs] > 2012-09-28T09:09:39.083-0800: [GC [1 CMS-initial-mark: > 1779561K(33554432K)] 11806175K(48653952K), 8.4227640 secs] [Times: > user=8.39 sys=0.02, real=8.43 secs] > 2012-09-28T09:09:47.523-0800: [GC [ParNew: > 10026619K->586782K(15099520K), 0.1064270 secs] > 11806181K->2366343K(48653952K), 0.1094300 secs] [Times: user=1.92 > sys=0.01, real=0.11 secs] > Finished cms space scanning in 8th thread: 0.364 sec > Finished cms space scanning in 9th thread: 0.364 sec > Finished cms space scanning in 6th thread: 0.364 sec > Finished cms space scanning in 0th thread: 0.364 sec > Finished cms space scanning in 2th thread: 0.364 sec > Finished cms space scanning in 7th thread: 0.364 sec > Finished cms space scanning in 4th thread: 0.364 sec > Finished cms space scanning in 13th thread: 0.364 sec > Finished cms space scanning in 3th thread: 0.364 sec > Finished cms space scanning in 11th thread: 0.364 sec > Finished cms space scanning in 5th thread: 0.365 sec > Finished cms space scanning in 12th thread: 0.368 sec > Finished cms space scanning in 1th thread: 0.369 sec > Finished cms space scanning in 10th thread: 0.392 sec > Finished perm space scanning in 11th thread: 0.046 sec > Finished perm space scanning in 10th thread: 0.018 sec > Finished perm space scanning in 4th thread: 0.047 sec > Finished perm space scanning in 12th thread: 0.042 sec > Finished perm space scanning in 6th thread: 0.047 sec > Finished perm space scanning in 5th thread: 0.046 sec > Finished perm space scanning in 0th thread: 0.047 sec > Finished perm space scanning in 3th thread: 0.047 sec > Finished perm space scanning in 9th thread: 0.047 sec > Finished perm space scanning in 7th thread: 0.047 sec > Finished perm space scanning in 8th thread: 0.048 sec > Finished perm space scanning in 1th thread: 0.051 sec > Finished perm space scanning in 2th thread: 0.080 sec > Finished perm space scanning in 13th thread: 0.168 sec > Finished work stealing in 13th thread: 0.000 sec > Finished work stealing in 11th thread: 0.121 sec > Finished work stealing in 6th thread: 0.121 sec > Finished work stealing in 8th thread: 0.120 sec > Finished work stealing in 3th thread: 0.121 sec > Finished work stealing in 0th thread: 0.121 sec > Finished work stealing in 4th thread: 0.121 sec > Finished work stealing in 2th thread: 0.088 sec > Finished work stealing in 10th thread: 0.121 sec > Finished work stealing in 5th thread: 0.121 sec > Finished work stealing in 12th thread: 0.121 sec > Finished work stealing in 1th thread: 0.112 sec > Finished work stealing in 9th thread: 0.121 sec > Finished work stealing in 7th thread: 0.121 sec > 2012-09-28T09:09:48.040-0800: [CMS-concurrent-mark: 0.354/0.532 secs] > (CMS-concurrent-mark yielded 3 times) > [Times: user=7.92 sys=0.24, real=0.53 secs] > (cardTable: 3342 cards, re-scanned 18695 cards, 2 iterations) > > This is by the way using -XX:+CMSScavengeBeforeRemark but I guess this > is not affecting the behaviour here, right? 
> > Greetings
> > Bernd
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From bernd.eckenfels at googlemail.com Mon Oct 1 13:04:37 2012
From: bernd.eckenfels at googlemail.com (Bernd Eckenfels)
Date: Mon, 1 Oct 2012 22:04:37 +0200
Subject: Initial mark times
In-Reply-To: <5069D65A.3010901@java4.info>
References: <5069D65A.3010901@java4.info>
Message-ID:

Hello Florian,

On Mon, Oct 1, 2012 at 7:43 PM, Florian Binder wrote:
> The initial marking is done single threaded and therefore takes a lot of
> time.
> Your initial-mark phase ended at 11:05:33. It stopped the world for
> about 12 seconds (started at 11:05:21).
> The "Finished cms..." information is for the concurrent marking phase
> which took 0.61 sec.

Oh ok, that makes sense. That was misleading me. Yes, it is sadly single
threaded.

> Have you tried to reduce your young generation? Maybe 1gb is enough?

It is unfortunately not a system where I can experiment. I already
reduced it a bit, but I guess I really have to go to a 10s frequency
(and adjust -XX:CMSWaitDuration=).

Greetings
Bernd

From java at java4.info Mon Oct 1 14:39:43 2012
From: java at java4.info (Florian Binder)
Date: Mon, 01 Oct 2012 23:39:43 +0200
Subject: Initial mark times
In-Reply-To:
References: <5069D65A.3010901@java4.info>
Message-ID: <506A0D9F.7010103@java4.info>

Hi Bernd,

your posted gc log shows that you have less than 2.5G of live data (in
the old generation). Is the application still in its startup phase? In
this situation I would bet ParallelOldGC would be faster than 12s ;-)

I am wondering what triggered the concurrent collection. Maybe your perm
gen is full/too small?

-Flo

Am 01.10.2012 22:04, schrieb Bernd Eckenfels:
> Hello Florian,
>
> On Mon, Oct 1, 2012 at 7:43 PM, Florian Binder wrote:
>> The initial marking is done single threaded and therefore takes a lot of
>> time.
>> Your initial-mark phase ended at 11:05:33. It stopped the world for
>> about 12 seconds (started at 11:05:21).
>> The "Finished cms..." information is for the concurrent marking phase
>> which took 0.61 sec.
> Oh ok, that makes sense. That was misleading me. Yes, it is sadly single
> threaded.
>
>> Have you tried to reduce your young generation? Maybe 1gb is enough?
> It is unfortunately not a system where I can experiment. I already
> reduced it a bit, but I guess I really have to go to a 10s frequency
> (and adjust -XX:CMSWaitDuration=).
>
> Greetings
> Bernd
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From bernd.eckenfels at googlemail.com Tue Oct 2 20:43:11 2012
From: bernd.eckenfels at googlemail.com (Bernd Eckenfels)
Date: Wed, 3 Oct 2012 05:43:11 +0200
Subject: Erratic(?) CMS behaviour every 5d
In-Reply-To:
References:
Message-ID:

Hello,

On Fri, Sep 28, 2012 at 7:48 PM, Bernd Eckenfels wrote:
> The jstat utility has a -gccause option, do you think that will help
> me as soon as the system gets into the thrashing mode?

I am currently seeing only "CMS Initial Mark" and "CMS Final Remark" as
causes for OGC and "GCLocker" as cause for ParNew, so I think that jstat
will not help me later on, when the machine starts to go wild again. The
PrintCMSStatistics=2 output was not helping me to get to the root cause.
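For reference, the GC cause that jstat -gccause reports is also exposed
programmatically on later JDKs (7u4 and newer, so not on the 1.6.0_33
build discussed in this thread) through
com.sun.management.GarbageCollectionNotificationInfo. A minimal sketch of
such a listener, assuming one of those newer JDKs; the class name and the
output format are illustrative only:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;
import com.sun.management.GarbageCollectionNotificationInfo;

// Subscribes to GC notifications and prints the collector name, the cause
// reported by the JVM, and the duration of each collection.
public class GcCauseWatcher {
    public static void main(String[] args) throws Exception {
        NotificationListener listener = new NotificationListener() {
            public void handleNotification(Notification n, Object handback) {
                if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                        .equals(n.getType())) {
                    return;
                }
                GarbageCollectionNotificationInfo info =
                        GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
                System.out.println(info.getGcName() + ": cause=" + info.getGcCause()
                        + " action=" + info.getGcAction()
                        + " duration=" + info.getGcInfo().getDuration() + "ms");
            }
        };
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // The platform GC MXBeans also implement NotificationEmitter.
            ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
        }
        Thread.sleep(Long.MAX_VALUE); // keep the process alive to receive notifications
    }
}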
I will play around with PrintCMSInitiationStatistics, but I really think
the GC causes (which statistic was over what limit) should be more
prominently available.

Regarding my "problem": I do see a minor growth in the PGU, so I guess
this is the reason. My first step was to resize the PG; later on, some
heap dumping to find the class loader leak is in order.

Bernd

From bernd.eckenfels at googlemail.com Tue Oct 2 21:26:31 2012
From: bernd.eckenfels at googlemail.com (Bernd Eckenfels)
Date: Wed, 03 Oct 2012 06:26:31 +0200
Subject: PermGen occupancy triggering CMS?
In-Reply-To:
References:
Message-ID:

Am 28.09.2012, 20:39 Uhr, schrieb Bernd Eckenfels :
> So this seems like 80% should trigger

It works like this:

src\share\vm\gc_implementation\concurrentMarkSweep\concurrentMarkSweepGeneration.cpp:276
ConcurrentMarkSweepGeneration::init_initiating_occupancy:

If CMSInitiatingPermOccupancyFraction >= 0, then that ratio is used;
otherwise it uses

_initiating_occupancy = (100% - MinHeapFreeRatio%) + (CMSTriggerPermRatio% * MinHeapFreeRatio%)

(Not sure if it makes sense to actually consider MinHeapFreeRatio for
PermGen, but anyway...)

With the defaults CMSTriggerPermRatio=80 and MinHeapFreeRatio=40 this
results in (100 - 40) + 0.80 * 40 = 60 + 32 = 92% as the occupancy
trigger. I can see that also in the logs:

CMSCollector shouldConcurrentCollect: 8.147 time_until_cms_gen_full 12.9501194 free=2640592 contiguous_available=1427177472 promotion_rate=159296 cms_allocation_rate=0 occupancy=0.5314856 initiatingOccupancy=0.9200000 _initiatingPermOccupancy=0.9200000_

Bernd

From michael.finocchiaro at gmail.com Wed Oct 3 12:50:03 2012
From: michael.finocchiaro at gmail.com (Michael Finocchiaro)
Date: Wed, 3 Oct 2012 21:50:03 +0200
Subject: Histograms in jVisualVM-VisualGC plugin "not supported"
Message-ID:

Perhaps the wrong forum for this question - if so please redirect me.
I upgraded my Java to Java7 but with the same parameters for starting my
Tomcat, I am no longer able to see the histograms in the VisualGC plugin
to jVisualVM that were so useful for checking the tuning of SurvivorRatio
back when I ran on Java6. Is this a known limitation on Java7 or did I
miss a command-line parameter somewhere?
Cheers,
Fino

Michael Finocchiaro
michael.finocchiaro at gmail.com
Mobile Telephone: +33 6 46 59 36 62
MSN: le_fino at hotmail.com
Twitter: le_fino
Skype: michael.finocchiaro
Blog: http://mfinocchiaro.wordpress.com
LinkedIn: http://fr.linkedin.com/in/mfinocchiaro

From rednaxelafx at gmail.com Thu Oct 4 09:56:00 2012
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Fri, 5 Oct 2012 00:56:00 +0800
Subject: Histograms in jVisualVM-VisualGC plugin "not supported"
In-Reply-To:
References:
Message-ID:

Hi Fino,

It's possible that you're using a different GC strategy in your Java7
environment from your Java6 one.
The VisualGC plugin for VisualVM doesn't support displaying histogram
information for ParallelGC; it works with other GCs in HotSpot, e.g.
DefNew/ParNew/G1. This hasn't changed from JDK6 to JDK7.

Please take a look at what GC you're using in the two environments
you're comparing.

- Kris

On Thu, Oct 4, 2012 at 3:50 AM, Michael Finocchiaro <
michael.finocchiaro at gmail.com> wrote:
> Perhaps the wrong forum for this question - if so please redirect me.
> I upgraded my Java to Java7 but with the same parameters for starting my > Tomcat, I am no longer able to see the histograms in the VisualGC plugin to > jVisualVM that were so useful for checking the tuning of SurvivorRatio back > when I ran on Java6. Is this a known limitation on Java7 or did I miss a > command-line parameter somewhere? > Cheers, > Fino > > Michael Finocchiaro > michael.finocchiaro at gmail.com > Mobile Telephone: +33 6 46 59 36 62 > MSN: le_fino at hotmail.com > Twitter: le_fino > Skype: michael.finocchiaro > Blog: http://mfinocchiaro.wordpress.com > LinkedIn: http://fr.linkedin.com/in/mfinocchiaro > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121005/9d6aa857/attachment.html From michael.finocchiaro at gmail.com Fri Oct 5 02:28:28 2012 From: michael.finocchiaro at gmail.com (Michael Finocchiaro) Date: Fri, 5 Oct 2012 11:28:28 +0200 Subject: Histograms in jVisualVM-VisualGC plugin "not supported" In-Reply-To: References: Message-ID: Thanks Krystal, You are totally right - I needed to explicitly switch using -XX:+UseParNewGC to see the Histograms again. Thanks :) BTW, does G1 use a similar technique or is there a new parallel algorithm for scavange on new for G1? Cheers, Fino Michael Finocchiaro michael.finocchiaro at gmail.com Mobile Telephone: +33 6 46 59 36 62 MSN: le_fino at hotmail.com Twitter: le_fino Skype: michael.finocchiaro Blog: http://mfinocchiaro.wordpress.com LinkedIn: http://fr.linkedin.com/in/mfinocchiaro On Thu, Oct 4, 2012 at 6:56 PM, Krystal Mok wrote: > Hi Fino, > > It's possible that you're using a different GC strategy in your Java7 > environment from your Java6 one. > The VisualGC plugin for VisualVM doesn't support displaying histogram > information for ParallelGC; it works with other GCs in HotSpot, e.g. > DefNew/ParNew/G1. This hasn't changed from JDK6 to JDK7. > > Please take a look at what GC you're using in the two environments you're > comparing. > > - Kris > > On Thu, Oct 4, 2012 at 3:50 AM, Michael Finocchiaro < > michael.finocchiaro at gmail.com> wrote: > >> Perhaps the wrong forum for this question - if so please redirect me. >> I upgraded my Java to Java7 but with the same parameters for starting my >> Tomcat, I am no longer able to see the histograms in the VisualGC plugin to >> jVisualVM that were so useful for checking the tuning of SurvivorRatio back >> when I ran on Java6. Is this a known limitation on Java7 or did I miss a >> command-line parameter somewhere? >> Cheers, >> Fino >> >> Michael Finocchiaro >> michael.finocchiaro at gmail.com >> Mobile Telephone: +33 6 46 59 36 62 >> MSN: le_fino at hotmail.com >> Twitter: le_fino >> Skype: michael.finocchiaro >> Blog: http://mfinocchiaro.wordpress.com >> LinkedIn: http://fr.linkedin.com/in/mfinocchiaro >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121005/690f8a73/attachment.html From john.cuthbertson at oracle.com Fri Oct 5 09:16:21 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 05 Oct 2012 09:16:21 -0700 Subject: Histograms in jVisualVM-VisualGC plugin "not supported" In-Reply-To: References: Message-ID: <506F07D5.7090908@oracle.com> Hi Michael, You can find some background on the hotspot collectors at: https://blogs.oracle.com/jonthecollector/entry/our_collectors G1 is the latest collector in hotspot and is still evolving. AFAIK the monitoring and management tools do work with G1. We would love to hear other wise. There may be differences/inconsistencies because the heap in G1 is made up of heap regions and the different generations are purely logical rather than physical. The the sizes and the regions that make up the generations frequently change and G1sometimes returns the, as per the spec, invalid or unsupported value for some of the management api routines. Regards, JohnC On 10/5/2012 2:28 AM, Michael Finocchiaro wrote: > Thanks Krystal, > You are totally right - I needed to explicitly switch using > -XX:+UseParNewGC to see the Histograms again. > Thanks :) > BTW, does G1 use a similar technique or is there a new parallel > algorithm for scavange on new for G1? > Cheers, > Fino > > Michael Finocchiaro > michael.finocchiaro at gmail.com > Mobile Telephone: +33 6 46 59 36 62 > MSN: le_fino at hotmail.com > Twitter: le_fino > Skype: michael.finocchiaro > Blog: http://mfinocchiaro.wordpress.com > LinkedIn: http://fr.linkedin.com/in/mfinocchiaro > > > On Thu, Oct 4, 2012 at 6:56 PM, Krystal Mok > wrote: > > Hi Fino, > > It's possible that you're using a different GC strategy in your > Java7 environment from your Java6 one. > The VisualGC plugin for VisualVM doesn't support displaying > histogram information for ParallelGC; it works with other GCs in > HotSpot, e.g. DefNew/ParNew/G1. This hasn't changed from JDK6 to JDK7. > > Please take a look at what GC you're using in the two environments > you're comparing. > > - Kris > > On Thu, Oct 4, 2012 at 3:50 AM, Michael Finocchiaro > > wrote: > > Perhaps the wrong forum for this question - if so please > redirect me. > I upgraded my Java to Java7 but with the same parameters for > starting my Tomcat, I am no longer able to see the histograms > in the VisualGC plugin to jVisualVM that were so useful for > checking the tuning of SurvivorRatio back when I ran on Java6. > Is this a known limitation on Java7 or did I miss a > command-line parameter somewhere? > Cheers, > Fino > > Michael Finocchiaro > michael.finocchiaro at gmail.com > > Mobile Telephone: +33 6 46 59 36 62 > > MSN: le_fino at hotmail.com > Twitter: le_fino > Skype: michael.finocchiaro > Blog: http://mfinocchiaro.wordpress.com > LinkedIn: http://fr.linkedin.com/in/mfinocchiaro > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121005/4e795f5b/attachment.html From bernd.eckenfels at googlemail.com Fri Oct 5 09:27:51 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Fri, 05 Oct 2012 18:27:51 +0200 Subject: Can a full codecache trigger CMS? (was: Erratic(?) CMS behaviour every 5d) In-Reply-To: References: Message-ID: Hello, my problem continues, this time after 7d the erratic behaviour started. This time I was very closely monitoring the system, and the PermGen seems not to be the cause: Pool: CMS Perm Gen (Non-heap memory) Peak Usage : init:104857600, used:279972328, committed:314572800, max:314572800 Current Usage : init:104857600, used:216017192, committed:314572800, max:314572800 What I do see however is, that the code cache slowly approaches the maximum, could that trigger repeating CMS intervalls? Pool: Code Cache (Non-heap memory) Peak Usage : init:2555904, used:49850048, committed:50266112, max:50331648 Current Usage : init:2555904, used:49850048, committed:50266112, max:50331648 This time I will restart the system with OccupancyOnly setting (to 10%). I also increase the CodeCache and enable swweping in it. The PermGen will also be 1G max, just to be sure. Here is the current log with CMSStatiscs=2 (which does not help I think). https://mft.seeburger.de/portal-seefx/~public/52caa5cc-dfb6-41db-a096-d4f4eb6aba1b?download Here is the rest of the memory pools, not that oldgen is at 5%, so unlikely cause for concurrent mode failures. Memory Information Total Memory Pools: 5 Pool: Par Eden Space (Heap memory) Peak Usage : init:13743947776, used:13743947776, committed:13743947776, max:13743947776 Current Usage : init:13743947776, used:723224384, committed:13743947776, max:13743947776 Pool: Par Survivor Space (Heap memory) Peak Usage : init:1717960704, used:1693771776, committed:1717960704, max:1717960704 Current Usage : init:1717960704, used:227068920, committed:1717960704, max:1717960704 Pool: CMS Old Gen (Heap memory) Peak Usage : init:34359738368, used:3018247512, committed:34359738368, max:34359738368 Current Usage : init:34359738368, used:1787484240, committed:34359738368, max:34359738368 Greetings Bernd From dhd at exnet.com Fri Oct 5 09:41:58 2012 From: dhd at exnet.com (Damon Hart-Davis) Date: Fri, 5 Oct 2012 17:41:58 +0100 Subject: Can a full codecache trigger CMS? (was: Erratic(?) CMS behaviour every 5d) In-Reply-To: References: Message-ID: <46E9A405-C3A0-4278-B9BE-B392961650AC@exnet.com> Hi, Have you tried -XX:+UseCodeCacheFlushing to let old code get evicted in case that helps? Rgds Damon On 5 Oct 2012, at 17:27, Bernd Eckenfels wrote: > Hello, > > my problem continues, this time after 7d the erratic behaviour started. > This time I was very closely monitoring the system, and the PermGen seems > not to be the cause: > > Pool: CMS Perm Gen (Non-heap memory) > Peak Usage : init:104857600, used:279972328, committed:314572800, > max:314572800 > Current Usage : init:104857600, used:216017192, committed:314572800, > max:314572800 > > What I do see however is, that the code cache slowly approaches the > maximum, could that trigger repeating CMS intervalls? > > Pool: Code Cache (Non-heap memory) > Peak Usage : init:2555904, used:49850048, committed:50266112, max:50331648 > Current Usage : init:2555904, used:49850048, committed:50266112, > max:50331648 > > This time I will restart the system with OccupancyOnly setting (to 10%). I > also increase the CodeCache and enable swweping in it. 
The PermGen will > also be 1G max, just to be sure. > > Here is the current log with CMSStatiscs=2 (which does not help I think). > https://mft.seeburger.de/portal-seefx/~public/52caa5cc-dfb6-41db-a096-d4f4eb6aba1b?download > > > Here is the rest of the memory pools, not that oldgen is at 5%, so > unlikely cause for concurrent mode failures. > > Memory Information > Total Memory Pools: 5 > > > Pool: Par Eden Space (Heap memory) > Peak Usage : init:13743947776, used:13743947776, committed:13743947776, > max:13743947776 > Current Usage : init:13743947776, used:723224384, committed:13743947776, > max:13743947776 > > Pool: Par Survivor Space (Heap memory) > Peak Usage : init:1717960704, used:1693771776, committed:1717960704, > max:1717960704 > Current Usage : init:1717960704, used:227068920, committed:1717960704, > max:1717960704 > > Pool: CMS Old Gen (Heap memory) > Peak Usage : init:34359738368, used:3018247512, committed:34359738368, > max:34359738368 > Current Usage : init:34359738368, used:1787484240, committed:34359738368, > max:34359738368 > > Greetings > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From bernd-2012 at eckenfels.net Fri Oct 5 09:48:26 2012 From: bernd-2012 at eckenfels.net (Bernd Eckenfels) Date: Fri, 05 Oct 2012 18:48:26 +0200 Subject: Can a full codecache trigger CMS? (was: Erratic(?) CMS behaviour every 5d) In-Reply-To: <46E9A405-C3A0-4278-B9BE-B392961650AC@exnet.com> References: <46E9A405-C3A0-4278-B9BE-B392961650AC@exnet.com> Message-ID: Am 05.10.2012, 18:41 Uhr, schrieb Damon Hart-Davis : > Have you tried -XX:+UseCodeCacheFlushing to let old code get evicted in > case that helps? This problem just occured, for the next restart I will use that flag, together with increasing the cache to 500M. Do you know if it is related to CMS initiation? Gruss Bernd From dhd at exnet.com Fri Oct 5 10:53:23 2012 From: dhd at exnet.com (Damon Hart-Davis) Date: Fri, 5 Oct 2012 18:53:23 +0100 Subject: Can a full codecache trigger CMS? (was: Erratic(?) CMS behaviour every 5d) In-Reply-To: References: <46E9A405-C3A0-4278-B9BE-B392961650AC@exnet.com> Message-ID: <2E99C45F-7EA9-41D7-9EEF-6116FFEAF80A@exnet.com> Hi, I have no idea, sorry, but I am using that flag on my very memory-constrained system and things seem to be operating smoothly. Rgds Damon On 5 Oct 2012, at 17:48, Bernd Eckenfels wrote: > Am 05.10.2012, 18:41 Uhr, schrieb Damon Hart-Davis : >> Have you tried -XX:+UseCodeCacheFlushing to let old code get evicted in >> case that helps? > > This problem just occured, for the next restart I will use that flag, > together with increasing the cache to 500M. Do you know if it is related > to CMS initiation? > > Gruss > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From jon.masamitsu at oracle.com Fri Oct 5 11:10:12 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 05 Oct 2012 11:10:12 -0700 Subject: Fwd: Re: Can a full codecache trigger CMS? Message-ID: <506F2284.1030205@oracle.com> On 10/5/2012 9:48 AM, Bernd Eckenfels wrote: > Am 05.10.2012, 18:41 Uhr, schrieb Damon Hart-Davis: >> Have you tried -XX:+UseCodeCacheFlushing to let old code get evicted in >> case that helps? > This problem just occured, for the next restart I will use that flag, > together with increasing the cache to 500M. 
Do you know if it is related
> to CMS initiation?

No. There is no connection I know of between the code cache filling and
CMS behavior as you're seeing it.

Jon

> Gruss
> Bernd
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From bernd.eckenfels at googlemail.com Fri Oct 5 12:03:34 2012
From: bernd.eckenfels at googlemail.com (Bernd Eckenfels)
Date: Fri, 5 Oct 2012 21:03:34 +0200
Subject: Can a full codecache trigger CMS?
In-Reply-To: <506F2284.1030205@oracle.com>
References: <506F2284.1030205@oracle.com>
Message-ID:

I read that in the event of a code cache sweep or a full code cache it
may trigger a FullGC (which makes sense to get rid of unused classes).
Since I run with ExplicitGCInvokesConcurrent, could that lead to a CMS
cycle via this route?

We really need GC cause reasons in the gclog and better event diagnostics.

(One thing I appreciate about the IBM JRE is the clean GC log and the
trace buffers.)

Bernd

--
bernd.eckenfels.net

Am 05.10.2012 um 20:10 schrieb Jon Masamitsu :
> On 10/5/2012 9:48 AM, Bernd Eckenfels wrote:
>
>> Am 05.10.2012, 18:41 Uhr, schrieb Damon Hart-Davis:
>>> Have you tried -XX:+UseCodeCacheFlushing to let old code get evicted in
>>> case that helps?
>> This problem just occurred, for the next restart I will use that flag,
>> together with increasing the cache to 500M. Do you know if it is related
>> to CMS initiation?
>
> No. There is no connection I know of between the code cache filling and
> CMS behavior as you're seeing it.
>
> Jon
>
>> Gruss
>> Bernd
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From jon.masamitsu at oracle.com Fri Oct 5 15:22:25 2012
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Fri, 05 Oct 2012 15:22:25 -0700
Subject: Can a full codecache trigger CMS?
In-Reply-To:
References: <506F2284.1030205@oracle.com>
Message-ID: <506F5DA1.7010004@oracle.com>

On 10/5/2012 12:03 PM, Bernd Eckenfels wrote:
> I read that in the event of a code cache sweep or a full code cache it
> may trigger a FullGC (which makes sense to get rid of unused classes).

I don't know the answer here. I don't think the code cache has to do
with the number of loaded classes so much as the amount of compiled
code. I remember before code cache flushing the code cache would fill
up and then compilations would shut down. I don't recall a full code
cache causing a GC, but that's JIT stuff so I don't know for sure.

Jon

> Since I run with ExplicitGCInvokesConcurrent, could that lead to a CMS
> cycle via this route?
>
> We really need GC cause reasons in the gclog and better event diagnostics.
>
> (One thing I appreciate about the IBM JRE is the clean GC log and the
> trace buffers.)
>
> Bernd
>

From rozdev29 at gmail.com Thu Oct 11 16:27:39 2012
From: rozdev29 at gmail.com (roz dev)
Date: Thu, 11 Oct 2012 16:27:39 -0700
Subject: How to alert for heap fragmentation
Message-ID:

Hi All

I am using Java 6u23, with CMS GC. I see that sometime Application gets
paused for longer time because of excessive heap fragmentation.
I have enabled PrintFLSStatistics flag and following is the log 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: -668151027 Max Chunk Size: 1976112973 Number of Blocks: 175445 Av. Block Size: 20672 Tree Height: 78 Before GC: Statistics for BinaryTreeDictionary: ------------------------------------ Total Free Space: 10926 Max Chunk Size: 1660 Number of Blocks: 22 Av. Block Size: 496 Tree Height: 7 I would like to know from people about the way they track Heap Fragmentation and how do we alert for this situation? We use Nagios and I am wondering if there is a way to parse these logs and know the max chunk size so that we can alert for it. Any inputs are welcome. -Saroj -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121011/3c1ad25e/attachment.html From ysr1729 at gmail.com Thu Oct 11 19:50:21 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 11 Oct 2012 19:50:21 -0700 Subject: How to alert for heap fragmentation In-Reply-To: References: Message-ID: In the absence of fragmentation, one would normally expect the max chunk size of the CMS generation to stabilize at some reasonable value, say after some 10's of CMS GC cycles. If it doesn't, you should try and use a larger heap, or otherwise reshape the heap to reduce promotion rates. In my experience, CMS seems to work best if its "duty cycle" is of the order of 1-2 %, i.e. there are 50 to 100 times more scavenges during the interval that it's not running vs the interva during which it is running. Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the string "Max Chunk Size:" and pick the numeric component of every (4n+1)th match. The max chunk size will typically cycle within a small band, once it has stabilized, returning always to a high value following a CMS cycle's completion. If the upper envelope of this keeps steadily declining over some 10's of CMS GC cycles, then you are probably seeing fragmentation that will eventually succumb to fragmentation. You can probably calibrate a threshold for the upper envelope so that if it falls below that threshold you will be alerted by Nagios that a closer look is in order. At least something along those lines should work. The toughest part is designing your "filter" to detect the fall in the upper envelope. You will probably want to plot the metric, then see what kind of filter will detect the condition.... Sorry this isn't much concrete help, but hopefully it gives you some ideas to work in the right direction... -- ramki On Thu, Oct 11, 2012 at 4:27 PM, roz dev wrote: > Hi All > > I am using Java 6u23, with CMS GC. I see that sometime Application gets > paused for longer time because of excessive heap fragmentation. > > I have enabled PrintFLSStatistics flag and following is the log > > > 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: -668151027 > Max Chunk Size: 1976112973 > Number of Blocks: 175445 > Av. Block Size: 20672 > Tree Height: 78 > Before GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: 10926 > Max Chunk Size: 1660 > Number of Blocks: 22 > Av. Block Size: 496 > Tree Height: 7 > > > I would like to know from people about the way they track Heap > Fragmentation and how do we alert for this situation? 
> > We use Nagios and I am wondering if there is a way to parse these logs and > know the max chunk size so that we can alert for it. > > Any inputs are welcome. > > -Saroj > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121011/f6d365ea/attachment.html From todd at cloudera.com Thu Oct 11 21:11:58 2012 From: todd at cloudera.com (Todd Lipcon) Date: Thu, 11 Oct 2012 21:11:58 -0700 Subject: How to alert for heap fragmentation In-Reply-To: References: Message-ID: Hey Ramki, Do you know if there's any plan to offer the FLS statistics as a metric via JMX or some other interface in the future? It would be nice to be able to monitor fragmentation without having to actually log and parse the gc logs. -Todd On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna wrote: > In the absence of fragmentation, one would normally expect the max chunk > size of the CMS generation > to stabilize at some reasonable value, say after some 10's of CMS GC > cycles. If it doesn't, you should try > and use a larger heap, or otherwise reshape the heap to reduce promotion > rates. In my experience, > CMS seems to work best if its "duty cycle" is of the order of 1-2 %, i.e. > there are 50 to 100 times more > scavenges during the interval that it's not running vs the interva during > which it is running. > > Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the string > "Max Chunk Size:" and pick the > numeric component of every (4n+1)th match. The max chunk size will > typically cycle within a small band, > once it has stabilized, returning always to a high value following a CMS > cycle's completion. If the upper envelope > of this keeps steadily declining over some 10's of CMS GC cycles, then you > are probably seeing fragmentation > that will eventually succumb to fragmentation. > > You can probably calibrate a threshold for the upper envelope so that if > it falls below that threshold you will > be alerted by Nagios that a closer look is in order. > > At least something along those lines should work. The toughest part is > designing your "filter" to detect the > fall in the upper envelope. You will probably want to plot the metric, > then see what kind of filter will detect > the condition.... Sorry this isn't much concrete help, but hopefully it > gives you some ideas to work in > the right direction... > > -- ramki > > On Thu, Oct 11, 2012 at 4:27 PM, roz dev wrote: > >> Hi All >> >> I am using Java 6u23, with CMS GC. I see that sometime Application gets >> paused for longer time because of excessive heap fragmentation. >> >> I have enabled PrintFLSStatistics flag and following is the log >> >> >> 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: >> Statistics for BinaryTreeDictionary: >> ------------------------------------ >> Total Free Space: -668151027 >> Max Chunk Size: 1976112973 >> Number of Blocks: 175445 >> Av. Block Size: 20672 >> Tree Height: 78 >> Before GC: >> Statistics for BinaryTreeDictionary: >> ------------------------------------ >> Total Free Space: 10926 >> Max Chunk Size: 1660 >> Number of Blocks: 22 >> Av. Block Size: 496 >> Tree Height: 7 >> >> >> I would like to know from people about the way they track Heap >> Fragmentation and how do we alert for this situation? 
>> >> We use Nagios and I am wondering if there is a way to parse these logs >> and know the max chunk size so that we can alert for it. >> >> Any inputs are welcome. >> >> -Saroj >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121011/0b373cdd/attachment.html From ysr1729 at gmail.com Thu Oct 11 23:30:58 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 11 Oct 2012 23:30:58 -0700 Subject: How to alert for heap fragmentation In-Reply-To: References: Message-ID: Todd, good question :-) @Jesper et al, do you know the answer to Todd's question? I agree that exposing all of these stats via suitable JMX/Mbean interfaces would be quite useful.... The other possibility would be to log in the manner of HP's gc logs (CSV format with suitable header), or jstat logs, so parsing cost would be minimal. Then higher level, general tools like Kafka could consume the log/event streams, apply suitable filters and inform/alert interested monitoring agents. @Todd & Saroj: Can you perhaps give some scenarios on how you might make use of information such as this (more concretely say CMS fragmentation at a specific JVM)? Would it be used only for "read-only" monitoring and alerting, or do you see this as part of an automated data-centric control system of sorts. The answer is kind of important, because something like the latter can be accomplished today via gc log parsing (however kludgey that might be) and something like Kafka/Zookeeper. On the other hand, I am not sure if the latency of that kind of thing would fit well into a more automated and fast-reacting data center control system or load-balancer where a more direct JMX/MBean like interface might work better. Or was your interest purely of the "development-debugging-performance-measurement" kind, rather than of production JVMs? Anyway, thinking out loud here... Thoughts/Comments/Suggestions? -- ramki On Thu, Oct 11, 2012 at 9:11 PM, Todd Lipcon wrote: > Hey Ramki, > > Do you know if there's any plan to offer the FLS statistics as a metric > via JMX or some other interface in the future? It would be nice to be able > to monitor fragmentation without having to actually log and parse the gc > logs. > > -Todd > > > On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna wrote: > >> In the absence of fragmentation, one would normally expect the max chunk >> size of the CMS generation >> to stabilize at some reasonable value, say after some 10's of CMS GC >> cycles. If it doesn't, you should try >> and use a larger heap, or otherwise reshape the heap to reduce promotion >> rates. In my experience, >> CMS seems to work best if its "duty cycle" is of the order of 1-2 %, i.e. >> there are 50 to 100 times more >> scavenges during the interval that it's not running vs the interva during >> which it is running. >> >> Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the string >> "Max Chunk Size:" and pick the >> numeric component of every (4n+1)th match. 
The max chunk size will >> typically cycle within a small band, >> once it has stabilized, returning always to a high value following a CMS >> cycle's completion. If the upper envelope >> of this keeps steadily declining over some 10's of CMS GC cycles, then >> you are probably seeing fragmentation >> that will eventually succumb to fragmentation. >> >> You can probably calibrate a threshold for the upper envelope so that if >> it falls below that threshold you will >> be alerted by Nagios that a closer look is in order. >> >> At least something along those lines should work. The toughest part is >> designing your "filter" to detect the >> fall in the upper envelope. You will probably want to plot the metric, >> then see what kind of filter will detect >> the condition.... Sorry this isn't much concrete help, but hopefully it >> gives you some ideas to work in >> the right direction... >> >> -- ramki >> >> On Thu, Oct 11, 2012 at 4:27 PM, roz dev wrote: >> >>> Hi All >>> >>> I am using Java 6u23, with CMS GC. I see that sometime Application gets >>> paused for longer time because of excessive heap fragmentation. >>> >>> I have enabled PrintFLSStatistics flag and following is the log >>> >>> >>> 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: -668151027 >>> Max Chunk Size: 1976112973 >>> Number of Blocks: 175445 >>> Av. Block Size: 20672 >>> Tree Height: 78 >>> Before GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: 10926 >>> Max Chunk Size: 1660 >>> Number of Blocks: 22 >>> Av. Block Size: 496 >>> Tree Height: 7 >>> >>> >>> I would like to know from people about the way they track Heap >>> Fragmentation and how do we alert for this situation? >>> >>> We use Nagios and I am wondering if there is a way to parse these logs >>> and know the max chunk size so that we can alert for it. >>> >>> Any inputs are welcome. >>> >>> -Saroj >>> >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > > > -- > Todd Lipcon > Software Engineer, Cloudera > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121011/a1562179/attachment-0001.html From todd at cloudera.com Thu Oct 11 23:41:48 2012 From: todd at cloudera.com (Todd Lipcon) Date: Thu, 11 Oct 2012 23:41:48 -0700 Subject: How to alert for heap fragmentation In-Reply-To: References: Message-ID: Hi Ramki. Answers inline below: On Thu, Oct 11, 2012 at 11:30 PM, Srinivas Ramakrishna wrote: > > Todd, good question :-) > > @Jesper et al, do you know the answer to Todd's question? I agree that > exposing all of these stats via suitable JMX/Mbean interfaces would be > quite useful.... The other possibility would be to log in the manner of > HP's gc logs (CSV format with suitable header), or jstat logs, so parsing > cost would be minimal. Then higher level, general tools like Kafka could > consume the log/event streams, apply suitable filters and inform/alert > interested monitoring agents. 
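As a concrete illustration of the grep-style check described above (scan a
log written with -XX:PrintFLSStatistics=2 for "Max Chunk Size:" and watch
how the value trends), here is a rough sketch. The every-(4n+1)th-match
sampling (assumed to pick the old gen dictionary printed before each GC),
the threshold argument, and the class name are all assumptions that would
need calibrating against real logs:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Reports the most recently sampled "Max Chunk Size:" value from a GC log
// and exits with a Nagios-style WARNING code if it fell below a threshold.
public class MaxChunkCheck {
    private static final Pattern MAX_CHUNK = Pattern.compile("Max Chunk Size: (-?\\d+)");

    public static void main(String[] args) throws Exception {
        String logFile = args[0];                 // path to the GC log
        long threshold = Long.parseLong(args[1]); // alert threshold, units as printed in the log
        long latest = Long.MAX_VALUE;
        int match = 0;
        BufferedReader in = new BufferedReader(new FileReader(logFile));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = MAX_CHUNK.matcher(line);
                while (m.find()) {
                    // Keep every (4n+1)th match, assumed to be the CMS old gen
                    // dictionary as it looked before the collection.
                    if (match++ % 4 == 0) {
                        latest = Long.parseLong(m.group(1));
                    }
                }
            }
        } finally {
            in.close();
        }
        if (latest < threshold) {
            System.out.println("WARNING: max chunk size down to " + latest);
            System.exit(1);
        }
        System.out.println("OK: max chunk size " + latest);
    }
}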
> > Parsing CSV is one possibility, but somewhat painful, because you have all the usual issues with log rolling, compatibility between versions, etc. Certainly better than parsing the internal dump format that PrintFLSStatistics exposes at the moment, though :) > @Todd & Saroj: Can you perhaps give some scenarios on how you might make > use of information such as this (more concretely say CMS fragmentation at a > specific JVM)? Would it be used only for "read-only" monitoring and > alerting, or do you see this as part of an automated data-centric control > system of sorts. The answer is kind of important, because something like > the latter can be accomplished today via gc log parsing (however kludgey > that might be) and something like Kafka/Zookeeper. On the other hand, I am > not sure if the latency of that kind of thing would fit well into a more > automated and fast-reacting data center control system or load-balancer > where a more direct JMX/MBean like interface might work better. Or was your > interest purely of the "development-debugging-performance-measurement" > kind, rather than of production JVMs? Anyway, thinking out loud here... > > Just to give some context, one of the main products where I work is software which monitors large Hadoop clusters. Most of our daemons are low-heap, but a few, notably the HBase Region Server, can have large heaps and suffer from fragmentation. I wrote a few blog posts about this last year (starting here: http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ ) So, if we can monitor fragmentation, I think there would be two useful things we could do: 1) If we notice that the heap is becoming really fragmented, we know a full GC is imminent. HBase has the capability to shift load between servers at runtime -- so we could simply ask the load balancer to move all load off the fragmented server, initiate a full GC manually, and then move the region back. Less gracefully, we could have the server do a clean shutdown, which would be handled by our normal fault tolerance. This is actually better than a lengthy GC pause, because we can detect a clean shutdown immediately whereas the GC pause will take 30+ seconds before various heartbeats/sessions expire. 2) Our monitoring software already measures various JVM metrics and exposes them to operators (eg percentage of time spent in GC, heap usage after last GC, etc). If an operator suspects that GC is an issue, he or she can watch this metric or even set an alert. For some use cases, a fragmentation-induced STW GC is nearly catastrophic. An administrator should be able to quickly look at one of these metrics and tell whether the fragmentation is stable or if it's creeping towards an STW, in which case they need to re-evaluate GC tuning, live set size, etc. Hope that helps with motivation for the feature. -Todd > On Thu, Oct 11, 2012 at 9:11 PM, Todd Lipcon wrote: > >> Hey Ramki, >> >> Do you know if there's any plan to offer the FLS statistics as a metric >> via JMX or some other interface in the future? It would be nice to be able >> to monitor fragmentation without having to actually log and parse the gc >> logs. >> >> -Todd >> >> >> On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna wrote: >> >>> In the absence of fragmentation, one would normally expect the max chunk >>> size of the CMS generation >>> to stabilize at some reasonable value, say after some 10's of CMS GC >>> cycles. 
If it doesn't, you should try >>> and use a larger heap, or otherwise reshape the heap to reduce promotion >>> rates. In my experience, >>> CMS seems to work best if its "duty cycle" is of the order of 1-2 %, >>> i.e. there are 50 to 100 times more >>> scavenges during the interval that it's not running vs the interva >>> during which it is running. >>> >>> Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the string >>> "Max Chunk Size:" and pick the >>> numeric component of every (4n+1)th match. The max chunk size will >>> typically cycle within a small band, >>> once it has stabilized, returning always to a high value following a CMS >>> cycle's completion. If the upper envelope >>> of this keeps steadily declining over some 10's of CMS GC cycles, then >>> you are probably seeing fragmentation >>> that will eventually succumb to fragmentation. >>> >>> You can probably calibrate a threshold for the upper envelope so that if >>> it falls below that threshold you will >>> be alerted by Nagios that a closer look is in order. >>> >>> At least something along those lines should work. The toughest part is >>> designing your "filter" to detect the >>> fall in the upper envelope. You will probably want to plot the metric, >>> then see what kind of filter will detect >>> the condition.... Sorry this isn't much concrete help, but hopefully it >>> gives you some ideas to work in >>> the right direction... >>> >>> -- ramki >>> >>> On Thu, Oct 11, 2012 at 4:27 PM, roz dev wrote: >>> >>>> Hi All >>>> >>>> I am using Java 6u23, with CMS GC. I see that sometime Application gets >>>> paused for longer time because of excessive heap fragmentation. >>>> >>>> I have enabled PrintFLSStatistics flag and following is the log >>>> >>>> >>>> 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: >>>> Statistics for BinaryTreeDictionary: >>>> ------------------------------------ >>>> Total Free Space: -668151027 >>>> Max Chunk Size: 1976112973 >>>> Number of Blocks: 175445 >>>> Av. Block Size: 20672 >>>> Tree Height: 78 >>>> Before GC: >>>> Statistics for BinaryTreeDictionary: >>>> ------------------------------------ >>>> Total Free Space: 10926 >>>> Max Chunk Size: 1660 >>>> Number of Blocks: 22 >>>> Av. Block Size: 496 >>>> Tree Height: 7 >>>> >>>> >>>> I would like to know from people about the way they track Heap >>>> Fragmentation and how do we alert for this situation? >>>> >>>> We use Nagios and I am wondering if there is a way to parse these logs >>>> and know the max chunk size so that we can alert for it. >>>> >>>> Any inputs are welcome. >>>> >>>> -Saroj >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> >> >> -- >> Todd Lipcon >> Software Engineer, Cloudera >> > > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121011/2ce6fee8/attachment.html From jesper.wilhelmsson at oracle.com Fri Oct 12 07:28:48 2012 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 12 Oct 2012 16:28:48 +0200 Subject: How to alert for heap fragmentation In-Reply-To: References: Message-ID: <50782920.40507@oracle.com> Ramki, Todd, There are several projects in the pipeline for cleaning up verbose logs, reporting more/better data and improving the JVM monitoring infrastructure in different ways. Exactly what data we will add and what logging that will be improved is not decided yet but I wouldn't have too high hopes that CMS is first out. Our prime target for logging improvements lately has been G1 which, by the way, might be worth while checking out if you are worried about fragmentation. We have done some initial attempts along the lines of JEP 158 [1], again mainly for G1, and we are currently working with GC support for the event-based JVM tracing described in JEP 167 [2]. In the latter JEP the Parallel collectors (Parallel Scavenge and Parallel Old) will likely be first out with a few events. Have a look at these JEPs for more details. [1] http://openjdk.java.net/jeps/158 [2] http://openjdk.java.net/jeps/167 Best regards, /Jesper On 2012-10-12 08:30, Srinivas Ramakrishna wrote: > > Todd, good question :-) > > @Jesper et al, do you know the answer to Todd's question? I agree that > exposing all of these stats via suitable JMX/Mbean interfaces would be quite > useful.... The other possibility would be to log in the manner of HP's gc logs > (CSV format with suitable header), or jstat logs, so parsing cost would be > minimal. Then higher level, general tools like Kafka could consume the > log/event streams, apply suitable filters and inform/alert interested > monitoring agents. > > @Todd & Saroj: Can you perhaps give some scenarios on how you might make use > of information such as this (more concretely say CMS fragmentation at a > specific JVM)? Would it be used only for "read-only" monitoring and alerting, > or do you see this as part of an automated data-centric control system of > sorts. The answer is kind of important, because something like the latter can > be accomplished today via gc log parsing (however kludgey that might be) and > something like Kafka/Zookeeper. On the other hand, I am not sure if the > latency of that kind of thing would fit well into a more automated and > fast-reacting data center control system or load-balancer where a more direct > JMX/MBean like interface might work better. Or was your interest purely of the > "development-debugging-performance-measurement" kind, rather than of > production JVMs? Anyway, thinking out loud here... > > Thoughts/Comments/Suggestions? > -- ramki > > On Thu, Oct 11, 2012 at 9:11 PM, Todd Lipcon > wrote: > > Hey Ramki, > > Do you know if there's any plan to offer the FLS statistics as a metric > via JMX or some other interface in the future? It would be nice to be able > to monitor fragmentation without having to actually log and parse the gc logs. > > -Todd > > > On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna > wrote: > > In the absence of fragmentation, one would normally expect the max > chunk size of the CMS generation > to stabilize at some reasonable value, say after some 10's of CMS GC > cycles. If it doesn't, you should try > and use a larger heap, or otherwise reshape the heap to reduce > promotion rates. 
In my experience, > CMS seems to work best if its "duty cycle" is of the order of 1-2 %, > i.e. there are 50 to 100 times more > scavenges during the interval that it's not running vs the interva > during which it is running. > > Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the string > "Max Chunk Size:" and pick the > numeric component of every (4n+1)th match. The max chunk size will > typically cycle within a small band, > once it has stabilized, returning always to a high value following a > CMS cycle's completion. If the upper envelope > of this keeps steadily declining over some 10's of CMS GC cycles, then > you are probably seeing fragmentation > that will eventually succumb to fragmentation. > > You can probably calibrate a threshold for the upper envelope so that > if it falls below that threshold you will > be alerted by Nagios that a closer look is in order. > > At least something along those lines should work. The toughest part is > designing your "filter" to detect the > fall in the upper envelope. You will probably want to plot the metric, > then see what kind of filter will detect > the condition.... Sorry this isn't much concrete help, but hopefully > it gives you some ideas to work in > the right direction... > > -- ramki > > On Thu, Oct 11, 2012 at 4:27 PM, roz dev > wrote: > > Hi All > > I am using Java 6u23, with CMS GC. I see that sometime Application > gets paused for longer time because of excessive heap fragmentation. > > I have enabled PrintFLSStatistics flag and following is the log > > > 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: -668151027 > Max Chunk Size: 1976112973 > Number of Blocks: 175445 > Av. Block Size: 20672 > Tree Height: 78 > Before GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: 10926 > Max Chunk Size: 1660 > Number of Blocks: 22 > Av. Block Size: 496 > Tree Height: 7 > > > I would like to know from people about the way they track Heap > Fragmentation and how do we alert for this situation? > > We use Nagios and I am wondering if there is a way to parse these > logs and know the max chunk size so that we can alert for it. > > Any inputs are welcome. > > -Saroj > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121012/cc8087e8/jesper_wilhelmsson-0001.vcf From daubman at gmail.com Tue Oct 16 05:54:19 2012 From: daubman at gmail.com (Aaron Daubman) Date: Tue, 16 Oct 2012 08:54:19 -0400 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? 
Message-ID: Greetings, I recently switched to trying G1GC for several webapps and am trying now to make sense of the GC behavior. I have a few hopefully not-too-newbie questions: 1) Is there any up-to-date documentation on tuning G1GC? (many articles about it seem to be pre-7u4 where there were apparently more experimental flags available for tuning). Also, is this currently the full set of flags available: http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html#G1Options 2) Is there a list anywhere of tools known to process G1GC logs or even instrument the JVM to visualize properly (unless I am missing something, the latest VisualVM 1.3.4 displays the same visualization for G1 as it does for other GCs, still showing it as purely generational and I have no idea what it is doing with the histogram that always shows 1-15 occupied and 0 empty... with a crazy large number for current and very small number for target) I have used https://github.com/chewiebug/GCViewer but haven't found it to be all that useful for detailed analysis (e.g. am I collecting because I am fragmented? is it large objects being allocated causing problems?) Are there any tools that show the "cards concept" of G1 to help better understand sizing? 3) There have been some historical posts on this list over the past year or so discussing almost-ready-to-release tools for working with G1GC logs - have any of these become available? (e.g. http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-August/002982.html ) 4) In one webapp, the constant reason for frequent GCs is listed by VisualVM as "G1: humongous allocation attempt". However, searching to try and figure out what this means results in only a very few bug entries (e.g. 7018286) that do not explain at all what this means, or what the threshold for being declared 'humongous' is - were can I find out more about this reason for minor GC? 5) I think I am perhaps just missing something given the lack of capability of the current suite of tools, however this one really bothers/intrigues me. One webapp where I switched to G1GC does quite poorly (somewhat inexplicably) for 12 or so hours (in terms of response times). This app has a med/large (16G) heap size, and typically runs at 1/2 to 1/4 heap utilization. On occasion, even though overall heap utilization is below 40%, there will occur a (I've only ever seen single) major GC, and, without explanation, the application runs with order-of-magnitude better performance after this (heap utilization remains about 40% after major GC... eden and old commit size go up a small amount, nothing else changes that I can see). What would cause this markedly better performance after a major GC, and how can I dig in to this better? (I am frustrated, since it seems I should be able to figure out what JVM flags to use that would cause performance to be always as good as it is after the mysterious full GC, but I haven't been able to figure anything out so far). Thanks, Aaron From sbordet at intalio.com Tue Oct 16 06:19:13 2012 From: sbordet at intalio.com (Simone Bordet) Date: Tue, 16 Oct 2012 15:19:13 +0200 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? In-Reply-To: References: Message-ID: Hi, On Tue, Oct 16, 2012 at 2:54 PM, Aaron Daubman wrote: > Greetings, > > I recently switched to trying G1GC for several webapps and am trying > now to make sense of the GC behavior. 
> I have a few hopefully not-too-newbie questions: > > 1) Is there any up-to-date documentation on tuning G1GC? (many > articles about it seem to be pre-7u4 where there were apparently more > experimental flags available for tuning). Also, is this currently the > full set of flags available: > http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html#G1Options Perhaps most up to date information is from the JavaOne sessions just held. See in particular: https://oracleus.activeevents.com/connect/sessionDetail.ww?SESSION_ID=7236 https://oracleus.activeevents.com/connect/sessionDetail.ww?SESSION_ID=6583 > 4) In one webapp, the constant reason for frequent GCs is listed by > VisualVM as "G1: humongous allocation attempt". However, searching to > try and figure out what this means results in only a very few bug > entries (e.g. 7018286) that do not explain at all what this means, or > what the threshold for being declared 'humongous' is - were can I find > out more about this reason for minor GC? Humongous allocation means an allocation for a large object (a "humongous object"). Normally these are large byte[], or objects that contain large byte[]. In G1, I believe that "humongous" is when the size is greater than half a region, which by default are 1 MiB, but not 100% sure. > 5) I think I am perhaps just missing something given the lack of > capability of the current suite of tools, however this one really > bothers/intrigues me. > One webapp where I switched to G1GC does quite poorly (somewhat > inexplicably) for 12 or so hours (in terms of response times). This > app has a med/large (16G) heap size, and typically runs at 1/2 to 1/4 > heap utilization. On occasion, even though overall heap utilization is > below 40%, there will occur a (I've only ever seen single) major GC, > and, without explanation, the application runs with order-of-magnitude > better performance after this (heap utilization remains about 40% > after major GC... eden and old commit size go up a small amount, > nothing else changes that I can see). What would cause this markedly > better performance after a major GC, and how can I dig in to this > better? > (I am frustrated, since it seems I should be able to figure out what > JVM flags to use that would cause performance to be always as good as > it is after the mysterious full GC, but I haven't been able to figure > anything out so far). Please record logs with the flags suggested in the 2 presentations I linked above. In particular, I use: -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy Once you have the log, attach it here, and hopefully someone will look at it. Keep in mind that G1 is still "young" and while it will be the future, today it may not be up to par with the other collectors. Simon -- http://cometd.org http://webtide.com Developer advice, services and support from the Jetty & CometD experts. ---- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz From monica.beckwith at oracle.com Tue Oct 16 06:42:40 2012 From: monica.beckwith at oracle.com (Monica Beckwith) Date: Tue, 16 Oct 2012 08:42:40 -0500 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? In-Reply-To: References: Message-ID: <507D6450.8070801@oracle.com> Thanks! Simone. 
:) Aaron - Please look at the J1 presentation that Simone has linked here (and feel free to ask questions/ provide comments). And like Simone mentioned, please send in the GC logs for the case where G1 does poorly with "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" enabled on the command line. Looking at the GC log(s), we can hopefully identify the cause of that major GC (which I am assuming to be a Full GC). We should also be able to work on improving (or at-least explaining) your response times, but we would like to know a) what is your goal, b) what are the important factors (e.g. sizing limitations, etc) and c) if you are working off some comparison (e.g. a previously tuned garbage collector). Thank you for trying G1! -Monica On 10/16/2012 8:19 AM, Simone Bordet wrote: > Hi, > > On Tue, Oct 16, 2012 at 2:54 PM, Aaron Daubman wrote: >> Greetings, >> >> I recently switched to trying G1GC for several webapps and am trying >> now to make sense of the GC behavior. >> I have a few hopefully not-too-newbie questions: >> >> 1) Is there any up-to-date documentation on tuning G1GC? (many >> articles about it seem to be pre-7u4 where there were apparently more >> experimental flags available for tuning). Also, is this currently the >> full set of flags available: >> http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html#G1Options > Perhaps most up to date information is from the JavaOne sessions just held. > See in particular: > https://oracleus.activeevents.com/connect/sessionDetail.ww?SESSION_ID=7236 > https://oracleus.activeevents.com/connect/sessionDetail.ww?SESSION_ID=6583 > >> 4) In one webapp, the constant reason for frequent GCs is listed by >> VisualVM as "G1: humongous allocation attempt". However, searching to >> try and figure out what this means results in only a very few bug >> entries (e.g. 7018286) that do not explain at all what this means, or >> what the threshold for being declared 'humongous' is - were can I find >> out more about this reason for minor GC? > Humongous allocation means an allocation for a large object (a > "humongous object"). > Normally these are large byte[], or objects that contain large byte[]. > > In G1, I believe that "humongous" is when the size is greater than > half a region, which by default are 1 MiB, but not 100% sure. > >> 5) I think I am perhaps just missing something given the lack of >> capability of the current suite of tools, however this one really >> bothers/intrigues me. >> One webapp where I switched to G1GC does quite poorly (somewhat >> inexplicably) for 12 or so hours (in terms of response times). This >> app has a med/large (16G) heap size, and typically runs at 1/2 to 1/4 >> heap utilization. On occasion, even though overall heap utilization is >> below 40%, there will occur a (I've only ever seen single) major GC, >> and, without explanation, the application runs with order-of-magnitude >> better performance after this (heap utilization remains about 40% >> after major GC... eden and old commit size go up a small amount, >> nothing else changes that I can see). What would cause this markedly >> better performance after a major GC, and how can I dig in to this >> better? >> (I am frustrated, since it seems I should be able to figure out what >> JVM flags to use that would cause performance to be always as good as >> it is after the mysterious full GC, but I haven't been able to figure >> anything out so far). > Please record logs with the flags suggested in the 2 presentations I > linked above. 
> In particular, I use: > -XX:+PrintGCDetails > -XX:+PrintGCDateStamps > -XX:+PrintAdaptiveSizePolicy > > Once you have the log, attach it here, and hopefully someone will look at it. > > Keep in mind that G1 is still "young" and while it will be the future, > today it may not be up to par with the other collectors. > > Simon -- Oracle Monica Beckwith | Java Performance Engineer VOIP: +1 512 401 1274 Texas Green Oracle Oracle is committed to developing practices and products that help protect the environment -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121016/9b7675ef/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: oracle_sig_logo.gif Type: image/gif Size: 658 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121016/9b7675ef/oracle_sig_logo.gif -------------- next part -------------- A non-text attachment was scrubbed... Name: green-for-email-sig_0.gif Type: image/gif Size: 356 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121016/9b7675ef/green-for-email-sig_0.gif From todd at cloudera.com Tue Oct 16 14:39:18 2012 From: todd at cloudera.com (Todd Lipcon) Date: Tue, 16 Oct 2012 14:39:18 -0700 Subject: How to alert for heap fragmentation In-Reply-To: <50782920.40507@oracle.com> References: <50782920.40507@oracle.com> Message-ID: Hi Jesper, Thanks to the links to those JEPs. From my perspective here: JEP158: unifying GC logging is definitely appreciated, but still leaves us to write a parser which is a bit inconvenient. We already have a bunch of infrastructure to poll JMX for our Java processes, and if it were a simple MBean to track fragmentation (the same way we track committed heap and gc time for example), that would be far better IMO. JEP167: we might make use of this, but its focus on events doesn't seem to map directly to what we're doing. I guess the FLS statistics could be exposed as event properties after a collection, which would be good enough, but still require coding to the JVMTI API, etc, vs just the simple polling we already have a lot of tooling for. So to summarize my thinking: everyone's got stuff that reads JMX already, and the more you can add to the existing exposed interface, the better. Regarding G1, I did give it a try a year or so ago and ran into a lot of bad behavior that caused it to full GC far more than CMS for our workload. I haven't given it a try on the latest, and I think there were some changes around 6 months ago which were supposed to address the issues I saw. -Todd On Fri, Oct 12, 2012 at 7:28 AM, Jesper Wilhelmsson wrote: > Ramki, Todd, > > There are several projects in the pipeline for cleaning up verbose logs, > reporting more/better data and improving the JVM monitoring infrastructure > in different ways. > > Exactly what data we will add and what logging that will be improved is not > decided yet but I wouldn't have too high hopes that CMS is first out. Our > prime target for logging improvements lately has been G1 which, by the way, > might be worth while checking out if you are worried about fragmentation. > > We have done some initial attempts along the lines of JEP 158 [1], again > mainly for G1, and we are currently working with GC support for the > event-based JVM tracing described in JEP 167 [2]. 
In the latter JEP the > Parallel collectors (Parallel Scavenge and Parallel Old) will likely be > first out with a few events. Have a look at these JEPs for more details. > > [1] http://openjdk.java.net/jeps/158 > [2] http://openjdk.java.net/jeps/167 > > Best regards, > /Jesper > > > On 2012-10-12 08:30, Srinivas Ramakrishna wrote: >> >> >> Todd, good question :-) >> >> @Jesper et al, do you know the answer to Todd's question? I agree that >> exposing all of these stats via suitable JMX/Mbean interfaces would be >> quite >> useful.... The other possibility would be to log in the manner of HP's gc >> logs >> (CSV format with suitable header), or jstat logs, so parsing cost would be >> minimal. Then higher level, general tools like Kafka could consume the >> log/event streams, apply suitable filters and inform/alert interested >> monitoring agents. >> >> @Todd & Saroj: Can you perhaps give some scenarios on how you might make >> use >> of information such as this (more concretely say CMS fragmentation at a >> specific JVM)? Would it be used only for "read-only" monitoring and >> alerting, >> or do you see this as part of an automated data-centric control system of >> sorts. The answer is kind of important, because something like the latter >> can >> be accomplished today via gc log parsing (however kludgey that might be) >> and >> something like Kafka/Zookeeper. On the other hand, I am not sure if the >> latency of that kind of thing would fit well into a more automated and >> fast-reacting data center control system or load-balancer where a more >> direct >> JMX/MBean like interface might work better. Or was your interest purely of >> the >> "development-debugging-performance-measurement" kind, rather than of >> production JVMs? Anyway, thinking out loud here... >> >> Thoughts/Comments/Suggestions? >> -- ramki >> >> On Thu, Oct 11, 2012 at 9:11 PM, Todd Lipcon > > wrote: >> >> Hey Ramki, >> >> Do you know if there's any plan to offer the FLS statistics as a >> metric >> via JMX or some other interface in the future? It would be nice to be >> able >> to monitor fragmentation without having to actually log and parse the >> gc logs. >> >> -Todd >> >> >> On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna >> > > wrote: >> >> In the absence of fragmentation, one would normally expect the max >> chunk size of the CMS generation >> to stabilize at some reasonable value, say after some 10's of CMS >> GC >> cycles. If it doesn't, you should try >> and use a larger heap, or otherwise reshape the heap to reduce >> promotion rates. In my experience, >> CMS seems to work best if its "duty cycle" is of the order of 1-2 >> %, >> i.e. there are 50 to 100 times more >> scavenges during the interval that it's not running vs the interva >> during which it is running. >> >> Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the >> string >> "Max Chunk Size:" and pick the >> numeric component of every (4n+1)th match. The max chunk size will >> typically cycle within a small band, >> once it has stabilized, returning always to a high value following >> a >> CMS cycle's completion. If the upper envelope >> of this keeps steadily declining over some 10's of CMS GC cycles, >> then >> you are probably seeing fragmentation >> that will eventually succumb to fragmentation. >> >> You can probably calibrate a threshold for the upper envelope so >> that >> if it falls below that threshold you will >> be alerted by Nagios that a closer look is in order. >> >> At least something along those lines should work. 
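One shape such a threshold/envelope check could take -- purely a sketch, where the window size, hard floor and decline ratio are made-up knobs that would have to be calibrated against plots of your own logs:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Sketch of an "upper envelope" check: feed it the post-CMS-cycle
    // "Max Chunk Size" samples and it flags either a hard threshold breach
    // or a steady-looking decline across a sliding window.
    public class EnvelopeFilter {
        private final Deque<Long> window = new ArrayDeque<Long>();
        private final int windowSize;        // e.g. on the order of 10-30 CMS cycles
        private final long hardFloor;        // absolute "too small" threshold
        private final double declineRatio;   // e.g. 0.7 = newest < 70% of oldest

        public EnvelopeFilter(int windowSize, long hardFloor, double declineRatio) {
            this.windowSize = windowSize;
            this.hardFloor = hardFloor;
            this.declineRatio = declineRatio;
        }

        /** Returns true if the latest sample warrants an alert. */
        public boolean addSample(long maxChunkSize) {
            window.addLast(maxChunkSize);
            if (window.size() > windowSize) {
                window.removeFirst();
            }
            if (maxChunkSize < hardFloor) {
                return true;                 // envelope fell below the floor
            }
            if (window.size() == windowSize
                    && window.peekLast() < window.peekFirst() * declineRatio) {
                return true;                 // steady-looking decline over the window
            }
            return false;
        }
    }

Comparing the newest sample against the oldest in the window is crude; a fitted slope or windowed maxima would be less noisy, but the calibration problem is the same either way.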
The toughest >> part is >> designing your "filter" to detect the >> fall in the upper envelope. You will probably want to plot the >> metric, >> then see what kind of filter will detect >> the condition.... Sorry this isn't much concrete help, but >> hopefully >> it gives you some ideas to work in >> the right direction... >> >> -- ramki >> >> On Thu, Oct 11, 2012 at 4:27 PM, roz dev > > wrote: >> >> Hi All >> >> I am using Java 6u23, with CMS GC. I see that sometime >> Application >> gets paused for longer time because of excessive heap >> fragmentation. >> >> I have enabled PrintFLSStatistics flag and following is the >> log >> >> >> 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: >> Statistics for BinaryTreeDictionary: >> ------------------------------------ >> Total Free Space: -668151027 >> Max Chunk Size: 1976112973 >> Number of Blocks: 175445 >> Av. Block Size: 20672 >> Tree Height: 78 >> Before GC: >> Statistics for BinaryTreeDictionary: >> ------------------------------------ >> Total Free Space: 10926 >> Max Chunk Size: 1660 >> Number of Blocks: 22 >> Av. Block Size: 496 >> Tree Height: 7 >> >> >> I would like to know from people about the way they track Heap >> Fragmentation and how do we alert for this situation? >> >> We use Nagios and I am wondering if there is a way to parse >> these >> logs and know the max chunk size so that we can alert for it. >> >> Any inputs are welcome. >> >> -Saroj >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> >> -- >> Todd Lipcon >> Software Engineer, Cloudera >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > -- Todd Lipcon Software Engineer, Cloudera From sbordet at intalio.com Wed Oct 17 01:50:08 2012 From: sbordet at intalio.com (Simone Bordet) Date: Wed, 17 Oct 2012 10:50:08 +0200 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? In-Reply-To: References: <507D6450.8070801@oracle.com> Message-ID: Aaron, my take on your problem is (short story) that you have a too small heap. The long story is that most (the vast majority) of young GCs are marked as (young) (initial-mark), and since G1 starts the initial marking when the overall heap is 45% occupied, you are always in the case where you are 45% or above occupied and therefore there always is a young GC with initial-mark. Just before the Full GC event, your heap is 10 GiB, but after a collection goes to ~4.2 GiB, which is dangerously close to 4.5 GiB that is the limit at which the initial-mark starts. It takes less than 10s to fill those 300 MiB, only to start another initial-mark. If you happen to know your resident data size, make sure it occupies way less than 45%: ideally you want resident data + young gen < 45% for a few young GC (then overflow in old generation will trigger an initial mark, but that's ok if it's every 5-10 - or more - young GCs). Alternatively, you can play with -XX:InitiatingHeapOccupancyPercent=45 and increase the value. 
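Concretely -- with illustrative values only, to be validated against your own logs rather than taken as a recommendation -- that advice translates into a starting command line along the lines of:

    -Xms20g -Xmx20g -XX:+UseG1GC -XX:MaxGCPauseMillis=50
    -XX:InitiatingHeapOccupancyPercent=60
    -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy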
A target of 50 ms will keep your young generation small-ish, so you need to watch the logs, understand how big the young generation stabilizes and eventually resize your heap to avoid wasting space. After the marking, G1 will piggyback the "sweep" of the old generation regions in a young GC. By default G1 sweeps the old generation garbage that it marked in 4 young GC passes, and those are marked as (mixed). Therefore you must aim for at least 4 young, mixed GC (without initial-mark) before seeing again one with initial-mark. I would start trying with -Xms=20G -Xmx=20G (or even bigger if you can), and monitor the logs. >From those you should get an understanding of your resident data size, the overflow rate from young to old, and the young size. Once you have those, you may be able to reduce the heap size to save space and eventually tune the InitiatingHeapOccupancyPercent. Add -XX:+PrintAdaptiveSizePolicy. Monica, what does G1 do when it could not sweep the old generation regions because every young GC was also a initial-mark, and therefore never performed a mixed GC ? Or initial-mark also does a mixed GC ? Finally, after the full GC the heap was resized from 10 GiB to 16 GiB. That alone could explain why the frequency of GC diminished, and that's why I think it's best if you start with -Xms==-Xmx: you have a stable environment to compare apples to apples. Thanks ! Simon -- http://cometd.org http://webtide.com Developer advice, services and support from the Jetty & CometD experts. ---- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz From jesper.wilhelmsson at oracle.com Wed Oct 17 02:59:22 2012 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 17 Oct 2012 11:59:22 +0200 Subject: How to alert for heap fragmentation In-Reply-To: References: <50782920.40507@oracle.com> Message-ID: <507E817A.9020909@oracle.com> Hi Todd, I don't have a strong opinion on JMX vs other ways of getting data from the VM. The serviceability team and other powers are deciding the policy here and the GC team mainly do as we are told. G1 is constantly improving and if it was a while since you tried it it may be a good idea to try it again, especially if what you tried was before 7u4. 7u4 was a big step forward for G1 and it was the release where we officially started to support G1. /Jesper On 2012-10-16 23:39, Todd Lipcon wrote: > Hi Jesper, > > Thanks to the links to those JEPs. From my perspective here: > > JEP158: unifying GC logging is definitely appreciated, but still > leaves us to write a parser which is a bit inconvenient. We already > have a bunch of infrastructure to poll JMX for our Java processes, and > if it were a simple MBean to track fragmentation (the same way we > track committed heap and gc time for example), that would be far > better IMO. > > JEP167: we might make use of this, but its focus on events doesn't > seem to map directly to what we're doing. I guess the FLS statistics > could be exposed as event properties after a collection, which would > be good enough, but still require coding to the JVMTI API, etc, vs > just the simple polling we already have a lot of tooling for. > > So to summarize my thinking: everyone's got stuff that reads JMX > already, and the more you can add to the existing exposed interface, > the better. 
> > Regarding G1, I did give it a try a year or so ago and ran into a lot > of bad behavior that caused it to full GC far more than CMS for our > workload. I haven't given it a try on the latest, and I think there > were some changes around 6 months ago which were supposed to address > the issues I saw. > > -Todd > > On Fri, Oct 12, 2012 at 7:28 AM, Jesper Wilhelmsson > wrote: >> Ramki, Todd, >> >> There are several projects in the pipeline for cleaning up verbose logs, >> reporting more/better data and improving the JVM monitoring infrastructure >> in different ways. >> >> Exactly what data we will add and what logging that will be improved is not >> decided yet but I wouldn't have too high hopes that CMS is first out. Our >> prime target for logging improvements lately has been G1 which, by the way, >> might be worth while checking out if you are worried about fragmentation. >> >> We have done some initial attempts along the lines of JEP 158 [1], again >> mainly for G1, and we are currently working with GC support for the >> event-based JVM tracing described in JEP 167 [2]. In the latter JEP the >> Parallel collectors (Parallel Scavenge and Parallel Old) will likely be >> first out with a few events. Have a look at these JEPs for more details. >> >> [1] http://openjdk.java.net/jeps/158 >> [2] http://openjdk.java.net/jeps/167 >> >> Best regards, >> /Jesper >> >> >> On 2012-10-12 08:30, Srinivas Ramakrishna wrote: >>> >>> >>> Todd, good question :-) >>> >>> @Jesper et al, do you know the answer to Todd's question? I agree that >>> exposing all of these stats via suitable JMX/Mbean interfaces would be >>> quite >>> useful.... The other possibility would be to log in the manner of HP's gc >>> logs >>> (CSV format with suitable header), or jstat logs, so parsing cost would be >>> minimal. Then higher level, general tools like Kafka could consume the >>> log/event streams, apply suitable filters and inform/alert interested >>> monitoring agents. >>> >>> @Todd & Saroj: Can you perhaps give some scenarios on how you might make >>> use >>> of information such as this (more concretely say CMS fragmentation at a >>> specific JVM)? Would it be used only for "read-only" monitoring and >>> alerting, >>> or do you see this as part of an automated data-centric control system of >>> sorts. The answer is kind of important, because something like the latter >>> can >>> be accomplished today via gc log parsing (however kludgey that might be) >>> and >>> something like Kafka/Zookeeper. On the other hand, I am not sure if the >>> latency of that kind of thing would fit well into a more automated and >>> fast-reacting data center control system or load-balancer where a more >>> direct >>> JMX/MBean like interface might work better. Or was your interest purely of >>> the >>> "development-debugging-performance-measurement" kind, rather than of >>> production JVMs? Anyway, thinking out loud here... >>> >>> Thoughts/Comments/Suggestions? >>> -- ramki >>> >>> On Thu, Oct 11, 2012 at 9:11 PM, Todd Lipcon >> > wrote: >>> >>> Hey Ramki, >>> >>> Do you know if there's any plan to offer the FLS statistics as a >>> metric >>> via JMX or some other interface in the future? It would be nice to be >>> able >>> to monitor fragmentation without having to actually log and parse the >>> gc logs. 
>>> >>> -Todd >>> >>> >>> On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna >>> >> > wrote: >>> >>> In the absence of fragmentation, one would normally expect the max >>> chunk size of the CMS generation >>> to stabilize at some reasonable value, say after some 10's of CMS >>> GC >>> cycles. If it doesn't, you should try >>> and use a larger heap, or otherwise reshape the heap to reduce >>> promotion rates. In my experience, >>> CMS seems to work best if its "duty cycle" is of the order of 1-2 >>> %, >>> i.e. there are 50 to 100 times more >>> scavenges during the interval that it's not running vs the interva >>> during which it is running. >>> >>> Have Nagios grep the GC log file w/PrintFLSStatistics=2 for the >>> string >>> "Max Chunk Size:" and pick the >>> numeric component of every (4n+1)th match. The max chunk size will >>> typically cycle within a small band, >>> once it has stabilized, returning always to a high value following >>> a >>> CMS cycle's completion. If the upper envelope >>> of this keeps steadily declining over some 10's of CMS GC cycles, >>> then >>> you are probably seeing fragmentation >>> that will eventually succumb to fragmentation. >>> >>> You can probably calibrate a threshold for the upper envelope so >>> that >>> if it falls below that threshold you will >>> be alerted by Nagios that a closer look is in order. >>> >>> At least something along those lines should work. The toughest >>> part is >>> designing your "filter" to detect the >>> fall in the upper envelope. You will probably want to plot the >>> metric, >>> then see what kind of filter will detect >>> the condition.... Sorry this isn't much concrete help, but >>> hopefully >>> it gives you some ideas to work in >>> the right direction... >>> >>> -- ramki >>> >>> On Thu, Oct 11, 2012 at 4:27 PM, roz dev >> > wrote: >>> >>> Hi All >>> >>> I am using Java 6u23, with CMS GC. I see that sometime >>> Application >>> gets paused for longer time because of excessive heap >>> fragmentation. >>> >>> I have enabled PrintFLSStatistics flag and following is the >>> log >>> >>> >>> 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: -668151027 >>> Max Chunk Size: 1976112973 >>> Number of Blocks: 175445 >>> Av. Block Size: 20672 >>> Tree Height: 78 >>> Before GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: 10926 >>> Max Chunk Size: 1660 >>> Number of Blocks: 22 >>> Av. Block Size: 496 >>> Tree Height: 7 >>> >>> >>> I would like to know from people about the way they track Heap >>> Fragmentation and how do we alert for this situation? >>> >>> We use Nagios and I am wondering if there is a way to parse >>> these >>> logs and know the max chunk size so that we can alert for it. >>> >>> Any inputs are welcome. 
>>> >>> -Saroj >>> >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> >>> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> >>> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> >>> -- >>> Todd Lipcon >>> Software Engineer, Cloudera >>> >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: jesper_wilhelmsson.vcf Type: text/x-vcard Size: 236 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121017/bef071ce/jesper_wilhelmsson.vcf From monica.beckwith at oracle.com Wed Oct 17 13:53:51 2012 From: monica.beckwith at oracle.com (Monica Beckwith) Date: Wed, 17 Oct 2012 15:53:51 -0500 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? In-Reply-To: References: <507D6450.8070801@oracle.com> Message-ID: <507F1ADF.1060001@oracle.com> Hi Aaron, I'd like to start by noting that by explicitly setting NewRatio, you are basically not letting your nursery adaptively grow (upto 80% of heap) or shrink (down to 20% of heap) which is needed by G1 to try to meet its pause time goal. In your case, it works out OK though. Anyway, here's what I understand from your log - * After your first 4 young collections the heap occupancy crosses 45% of 8G (3.6Gs) and hence you see the "initial mark" piggy-backed on the young collection o The time spent in copying here by itself was 1.1sec (on an average) for the 33GC threads * After plotting your heap occupancy and GC Histo, it seems that your heap occupancy stays around 3.5Gs till the first expansion to ~10Gs and then stays at around 4Gs (Note: with this expansion the new heap occupancy limit is around 4.5Gs) and it seems that you are allocating "humongous objects" o The reason being that the next multitude of initial-marks are not showing Eden reaching its full capacity. o And the assumption is that since the heap is near its Marking Threshold limit, the humongous object allocation is triggering the initial mark. * I also spotted some (528 in number) "concurrent-mark-reset-for-overflow" and when I consulted with John Cuthbertson, he mentioned that we should try to increase the MarkStackSize to alleviate that issue. So quick recommendations based on your application "phases" that we see from you GC logs - 1. Increase your InitiatingHeapOccupancyPercent. 70 could be a safe number for you. 2. Increase your MarkStackSize. 3. Print with GC cause enabled (PrintGCCause)//, so that you can understand what's causing all those pauses. 4. Also, looking at your command-line, I wanted to quickly highlight that UseBiasedLocking and UseTLAB are probably enabled by default. Also, UseNUMA will have no effect. Hope the above helps your application. Regards, Monica Beckwith On 10/16/2012 9:01 PM, Aaron Daubman wrote: > Simone, Monica, > > Thanks for the helpful pointers to presentations - I'm psyched to make > it through quite a few of the JavaOne presentations. 
In the mean time, > it looks like one of the JVMs I'm monitoring experienced the > mysterious full GC that helps improve performance afterward at ~23:26 > GMT today. I'm attaching some zipped screenshots that perhaps will > provide some context of memory utilization as well as the full (zipped > as it is ~18M) GC log file. Any insight you can provide or suggestions > to improve performance, set up the JVM to avoid this full GC, > or especially to figure out how to start the JVM up in such a way as > to perform as it does after these rare full GC events occur would be > wonderful. (some responses to questions in-line at the bottom) > You can't tell from the scaled-out AppDynamics graph, but avg GC time > goes from a consistent ~125ms/min before the full GC to a periodic > 0-20ms/min after. (7 minor collections per minute down to 0-1 per minute) > > Relevant JVM flags used for this log file were: > -XX:-UseSplitVerifier > -XX:+UseG1GC > -XX:MaxGCPauseMillis=50 > -Xms8g > -Xmx16g > -XX:NewRatio=6 > -XX:+UseNUMA > -XX:+UseBiasedLocking > -XX:+UseTLAB > -XX:+AggressiveOpts > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/sas01/solr.dump > -XX:+PrintGCDateStamps > -XX:+PrintGCTimeStamps > -XX:+PrintGCDetails > -Xloggc:/var/log/solr/GCLogFile.log > > To jump right to the full GC event, just look for: > 2012-10-16T23:26:08.048+0000: 45896.289: [Full GC > 11734M->3646M(16384M), 16.2904780 secs] > It is the only "Full GC" event in the log. > > The screenshots from https://github.com/chewiebug/GCViewer center > around the Full GC event, and the AppDynamics graphs should make the > event timing pretty obvious. > > On Tue, Oct 16, 2012 at 9:42 AM, Monica Beckwith > > wrote: > > ... We should also be able to work on improving (or at-least > explaining) your response times, but we would like to know a) what > is your goal, b) what are the important factors (e.g. sizing > limitations, etc) and c) if you are working off some comparison > (e.g. a previously tuned garbage collector). > > > I am currently trying to get a grip on real application requirements - > this particular JVM shares a 48-core 256G ram FusioIO disk > over-provisioned system with several other currently-poorly-tuned > large JVMs - so suffice it to say that sizing limitations, within > reason, are not a current first-rate concern. (I recently dropped it's > previously 50G heap down to the current 16G, and see typical heap > utilization of only 4-8G) > > a) the near-term goal is to reduce overall (mean) response time, as > well as to smooth the incredible (10ms to 30,000ms) variance in > response time > Some background: This is a jetty servlet servicing requests by > processing data from a ~70G Solr index file. There is an unfortunate > amount of significant garbage produced per-request (allocation of > several large objects per-request is unavoidable - object pooling > might be worth looking in to here, and string interning in the latest > just-released solr 4.0 may also help, but both of these would be > longer-term efforts). ~10Mb dynamically created JSON is generated and > transmitted per-request by this servlet. We are currently at the stage > where higher request-rates actually cause better performance (likely > due to staying more active and thus preventing some paging out of the > 70G MMaped solr index file) > > b) it would be nice to reduce the JVM memory footprint to enable > deployment of this JVM on more commodity-like hardware (32-128G RAM, > 16 cores, etc...) 
but this is secondary to improving performance and > significantly reducing response-time variance. I can't think of any > real important limiting factors > > c) a previous un-tuned (JDK6 with default GC, 50G heap) which I moved > to CMS but still untuned both had better overall response times and > less variance than the current configuration. However, I've been > unable to stick to the "change one thing at a time" principle due to > difficulties making changes to production, etc... and so there were > many changes (JDK6 to 7, jetty 6 to 9, solr 3.3 to 3.6.1, java > collections to fastutil primitive collections, etc...) made all at > once, so it is difficult to point just at GC - however, going back to > the strange better performance after these sometimes-daily full GC > events, it "smells" like at least a GC-related issue. > > Thanks again! > Aaron > -- Oracle Monica Beckwith | Java Performance Engineer VOIP: +1 512 401 1274 Texas Green Oracle Oracle is committed to developing practices and products that help protect the environment -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121017/8b5c2e7d/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: oracle_sig_logo.gif Type: image/gif Size: 658 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121017/8b5c2e7d/oracle_sig_logo.gif -------------- next part -------------- A non-text attachment was scrubbed... Name: green-for-email-sig_0.gif Type: image/gif Size: 356 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121017/8b5c2e7d/green-for-email-sig_0.gif From daubman at gmail.com Wed Oct 17 20:36:14 2012 From: daubman at gmail.com (Aaron Daubman) Date: Wed, 17 Oct 2012 23:36:14 -0400 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? In-Reply-To: <507F1ADF.1060001@oracle.com> References: <507D6450.8070801@oracle.com> <507F1ADF.1060001@oracle.com> Message-ID: Hi Monica, Thanks for the reply and very helpful analysis - some followup in-line below: I'd like to start by noting that by explicitly setting NewRatio, you are > basically not letting your nursery adaptively grow (upto 80% of heap) or > shrink (down to 20% of heap) which is needed by G1 to try to meet its pause > time goal. In your case, it works out OK though. > Could you say some more about NewRatio? The setting I used was one out of several attempts that seemed to do OK - is this best to not set with G1GC? (what is and isn't adaptive? it seems that even MarkStackSize is on its way to becoming adaptive?) > Anyway, here's what I understand from your log - > > - After your first 4 young collections the heap occupancy crosses 45% > of 8G (3.6Gs) and hence you see the "initial mark" piggy-backed on the > young collection > - The time spent in copying here by itself was 1.1sec (on an > average) for the 33GC threads > - After plotting your heap occupancy and GC Histo, it seems that your > heap occupancy stays around 3.5Gs till the first expansion to ~10Gs and > then stays at around 4Gs (Note: with this expansion the new heap occupancy > limit is around 4.5Gs) and it seems that you are allocating "humongous > objects" > - The reason being that the next multitude of initial-marks are not > showing Eden reaching its full capacity. 
> - And the assumption is that since the heap is near its Marking > Threshold limit, the humongous object allocation is triggering the initial > mark. > - I also spotted some (528 in number) > "concurrent-mark-reset-for-overflow" and when I consulted with John > Cuthbertson, he mentioned that we should try to increase the MarkStackSize > to alleviate that issue. > > Would you please point me to more documentation on this, as well as suggest how to increase it (and to what)? > > > So quick recommendations based on your application "phases" that we see > from you GC logs - > > 1. Increase your InitiatingHeapOccupancyPercent. 70 could be a safe > number for you. > 2. Increase your MarkStackSize. > > How / to what? (what are sane limits?) > > 1. Print with GC cause enabled (PrintGCCause)**, so that you can > understand what's causing all those pauses. > > Adding -XX:+PrintGCCause resulted in the JVM failing to launch with this error: ---snip--- Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. ---snip--- > > 1. Also, looking at your command-line, I wanted to quickly highlight > that UseBiasedLocking and UseTLAB are probably enabled by default. Also, > UseNUMA will have no effect. > > What would be a good way for me to find this out? (e.g. is there an up-to-date list of flags and defaults for current JDK versions?) Thanks again! Aaron -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121017/57b48265/attachment-0001.html From bernd.eckenfels at googlemail.com Wed Oct 17 21:50:38 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Thu, 18 Oct 2012 06:50:38 +0200 Subject: Help processing G1GC logs (visualization tools) - also, mysterious great performance after inexplicable full GC? In-Reply-To: References: <507D6450.8070801@oracle.com> <507F1ADF.1060001@oracle.com> Message-ID: Am 18.10.2012, 05:36 Uhr, schrieb Aaron Daubman : > What would be a good way for me to find this out? (e.g. is there an > up-to-date list of flags and defaults for current JDK versions?) You can add "-XX:+PrintCommandLineFlags", this way the VM will print in the first line after start the command line flags you have specified as well as the ones which have been enabled/disabled automatically. 
For the defaults (= sign) and specified (:= sign) values of the vm options, you can use something like this: C:\Users\eckenfel>java -XX:+UseG1GC -XX:+PrintFlagsFinal -version | find "Stack" uintx CMSRevisitStackSize = 1048576 {product} intx CompilerThreadStackSize = 0 {pd product} intx G1MarkRegionStackSize = 1048576 {product} uintx GCDrainStackTargetSize = 64 {product} bool JavaMonitorsInStackTrace = true {product} uintx MarkStackSize = 16777216 {product} uintx MarkStackSizeMax = 536870912 {product} intx MaxJavaStackTraceDepth = 1024 {product} bool OmitStackTraceInFastThrow = true {product} intx OnStackReplacePercentage = 140 {pd product} uintx PreserveMarkStackSize = 1024 {product} intx StackRedPages = 1 {pd product} intx StackShadowPages = 6 {pd product} bool StackTraceInThrowable = true {product} intx StackYellowPages = 2 {pd product} intx ThreadStackSize = 0 {pd product} bool UseOnStackReplacement = true {pd product} intx VMThreadStackSize = 0 {pd product} java version "1.7.0_07" Java(TM) SE Runtime Environment (build 1.7.0_07-b10) Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode) Bernd From ysr1729 at gmail.com Thu Oct 18 09:43:42 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 18 Oct 2012 09:43:42 -0700 Subject: Extremely long parnew/cms promotion failure scenario? Message-ID: Has anyone come across extremely long (upwards of 10 minutes) promotion failure unwinding scenarios when using any of the collectors, but especially with ParNew/CMS? I recently came across one such occurrence with ParNew/CMS that, with a 40 GB young gen took upwards of 10 minutes to "unwind". I looked through the code and I can see that the unwinding steps can be a source of slowdown as we iterate single-threaded (DefNew) through the large Eden to fix up self-forwarded objects, but that still wouldn't seem to explain such a large pause, even with a 40 GB young gen. I am looking through the promotion failure paths to see what might be the cause of such a large pause, but if anyone has experienced this kind of scenario before or has any conjectures or insights, I'd appreciate it. thanks! -- ramki -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121018/0858a0d9/attachment.html From ysr1729 at gmail.com Thu Oct 18 09:45:42 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 18 Oct 2012 09:45:42 -0700 Subject: Erratic(?) CMS behaviour every 5d In-Reply-To: References: Message-ID: Hi Bernd -- Out of curiosity, did you get to the bottom of this? -- ramki On Tue, Oct 2, 2012 at 8:43 PM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Hello, > > > > On Fri, Sep 28, 2012 at 7:48 PM, Bernd Eckenfels > wrote: > > The jstat utility has a -gccause option, do you think that will help > > me as soon as the system gets into the trashing mode? > > I am currently seeing only "CMS Initial Mark" and "CMS Final Remark" > as causes for OGC and "GCLocker" as cause for ParNew, so I think that > jstat will not help me later on, when the machine starts to go wild > again. The PrintCMSStatistics=2 was not helping me to get to the root. > I will play around with PrintCMSInitiationStatistics but I really > think the GC causes (which statistic was over what limit) should be > more prominently available. > > Regarding my "problem": I do see a minor growth in the PGU, so I guess > this is the reason. 
My first step was to resize the PG and later on > some heap dumping to find the class loader leak is in order. > > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121018/44735e5e/attachment.html From Peter.B.Kessler at Oracle.COM Thu Oct 18 13:20:21 2012 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Thu, 18 Oct 2012 13:20:21 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: Message-ID: <50806485.8080808@Oracle.COM> IIRC, promotion failure still has to finish the evacuation attempt (and some objects may get promoted while the ones that fail get self-looped). That part is the usual multi-threaded object graph walk, with failed PLAB allocations thrown in to slow you down. Then you get to start the pass that deals with the self-loops, which you say is single-threaded. Undoing the self-loops is in address order, but it walks by the object sizes, so probably it mostly misses in the cache. 40GB at the average object size (call them 40 bytes to make the math easy) is a lot of cache misses. How fast is your memory system? Probably faster than (10minutes / (40GB / 40bytes)) per cache miss. Is it possible you are paging? Maybe not when things are running smoothly, but maybe a 10 minute stall on one service causes things to back up (and grow the heap of) other services on the same machine? I'm guessing. ... peter Srinivas Ramakrishna wrote: > > Has anyone come across extremely long (upwards of 10 minutes) promotion > failure unwinding scenarios when using any of the collectors, but > especially with ParNew/CMS? > I recently came across one such occurrence with ParNew/CMS that, with a > 40 GB young gen took upwards of 10 minutes to "unwind". I looked through > the code and I can see > that the unwinding steps can be a source of slowdown as we iterate > single-threaded (DefNew) through the large Eden to fix up self-forwarded > objects, but that still wouldn't > seem to explain such a large pause, even with a 40 GB young gen. I am > looking through the promotion failure paths to see what might be the > cause of such a large pause, > but if anyone has experienced this kind of scenario before or has any > conjectures or insights, I'd appreciate it. > > thanks! > -- ramki > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From bernd.eckenfels at googlemail.com Thu Oct 18 14:13:56 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Thu, 18 Oct 2012 23:13:56 +0200 Subject: Erratic(?) CMS behaviour every 5d In-Reply-To: References: Message-ID: Hello, unfortunately not. The fact that it always took a week to occur slowed me down, and meanwhile the time-critical transactions are split off to secondary machines with smaller heaps. I am not sure if it will still occur on the old (larger) server. But I will send updates in case I can reproduce anything. Bernd On 18.10.2012 at 18:45, Srinivas Ramakrishna wrote: > Hi Bernd -- Out of curiosity, did you get to the bottom of this? 
From ysr1729 at gmail.com Thu Oct 18 15:47:30 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 18 Oct 2012 15:47:30 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: <50806485.8080808@Oracle.COM> References: <50806485.8080808@Oracle.COM> Message-ID: Thanks Peter... the possibility of paging or related issue of VM system did occur to me, especially because system time shows up as somewhat high here. The problem is that this server runs without swap :-) so the time is going elsewhere. The cache miss theory is interesting (but would not show up as system time), and your back of the envelope calculation gives about 0.8 us for fetching a cache line, although i am pretty sure the cache miss predictor would probably figure out the misses and stream in the cache lines since as you say we are going in address order). I'd expect it to be no worse than when we do an "initial mark pause on a full Eden", give or take a little, and this is some 30 x worse. One possibility I am looking at is the part where we self-loop. I suspect the ParNew/CMS combination running with multiple worker threads is hit hard here, if the failure happens very early say -- from what i saw of that code recently, we don't consult the flag that says we failed so we should just return and self-loop. Rather we retry allocation for each subsequent object, fail that and then do the self-loop. The repeated failed attempts might be adding up, especially since the access involves looking at the shared pool. I'll look at how that is done, and see if we can do a fast fail after the first failure happens, rather than try and do the rest of the scavenge, since we'll need to do a fixup anyway. thanks for the discussion and i'll update as and when i do some more investigations. Keep those ideas coming, and I'll submit a bug report once i have spent a few more cycles looking at the available data and ruminating. - ramki On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler < Peter.B.Kessler at oracle.com> wrote: > IIRC, promotion failure still has to finish the evacuation attempt (and > some objects may get promoted while the ones that fail get self-looped). > That part is the usual multi-threaded object graph walk, with failed PLAB > allocations thrown in to slow you down. Then you get to start the pass > that deals with the self-loops, which you say is single-threaded. Undoing > the self-loops is in address order, but it walks by the object sizes, so > probably it mostly misses in the cache. 40GB at the average object size > (call them 40 bytes to make the math easy) is a lot of cache misses. How > fast is your memory system? Probably faster than (10minutes / (40GB / > 40bytes)) per cache miss. > > Is it possible you are paging? Maybe not when things are running > smoothly, but maybe a 10 minute stall on one service causes things to back > up (and grow the heap of) other services on the same machine? I'm guessing. > > ... peter > > Srinivas Ramakrishna wrote: > >> >> Has anyone come across extremely long (upwards of 10 minutes) promotion >> failure unwinding scenarios when using any of the collectors, but >> especially with ParNew/CMS? >> I recently came across one such occurrence with ParNew/CMS that, with a >> 40 GB young gen took upwards of 10 minutes to "unwind". 
I looked through >> the code and I can see >> that the unwinding steps can be a source of slowdown as we iterate >> single-threaded (DefNew) through the large Eden to fix up self-forwarded >> objects, but that still wouldn't >> seem to explain such a large pause, even with a 40 GB young gen. I am >> looking through the promotion failure paths to see what might be the cause >> of such a large pause, >> but if anyone has experienced this kind of scenario before or has any >> conjectures or insights, I'd appreciate it. >> >> thanks! >> -- ramki >> >> >> ------------------------------**------------------------------** >> ------------ >> >> ______________________________**_________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.**net >> http://mail.openjdk.java.net/**mailman/listinfo/hotspot-gc-**use >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121018/e6b9ffa8/attachment.html From ysr1729 at gmail.com Thu Oct 18 17:05:41 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 18 Oct 2012 17:05:41 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: <50806485.8080808@Oracle.COM> Message-ID: System data show high context switching in vicinity of event and points at the futile allocation bottleneck as a possible theory with some legs.... more later. -- ramki On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna wrote: > Thanks Peter... the possibility of paging or related issue of VM system > did occur to me, especially because system time shows up as > somewhat high here. The problem is that this server runs without swap :-) > so the time is going elsewhere. > > The cache miss theory is interesting (but would not show up as system > time), and your back of the envelope calculation gives about > 0.8 us for fetching a cache line, although i am pretty sure the cache miss > predictor would probably figure out the misses and stream in the > cache lines since as you say we are going in address order). I'd expect it > to be no worse than when we do an "initial mark pause on a full Eden", give > or > take a little, and this is some 30 x worse. > > One possibility I am looking at is the part where we self-loop. I suspect > the ParNew/CMS combination running with multiple worker threads > is hit hard here, if the failure happens very early say -- from what i saw > of that code recently, we don't consult the flag that says we failed > so we should just return and self-loop. Rather we retry allocation for > each subsequent object, fail that and then do the self-loop. The repeated > failed attempts might be adding up, especially since the access involves > looking at the shared pool. I'll look at how that is done, and see if we can > do a fast fail after the first failure happens, rather than try and do the > rest of the scavenge, since we'll need to do a fixup anyway. > > thanks for the discussion and i'll update as and when i do some more > investigations. Keep those ideas coming, and I'll submit a bug report once > i have spent a few more cycles looking at the available data and > ruminating. > > - ramki > > > On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler < > Peter.B.Kessler at oracle.com> wrote: > >> IIRC, promotion failure still has to finish the evacuation attempt (and >> some objects may get promoted while the ones that fail get self-looped). 
>> That part is the usual multi-threaded object graph walk, with failed PLAB >> allocations thrown in to slow you down. Then you get to start the pass >> that deals with the self-loops, which you say is single-threaded. Undoing >> the self-loops is in address order, but it walks by the object sizes, so >> probably it mostly misses in the cache. 40GB at the average object size >> (call them 40 bytes to make the math easy) is a lot of cache misses. How >> fast is your memory system? Probably faster than (10minutes / (40GB / >> 40bytes)) per cache miss. >> >> Is it possible you are paging? Maybe not when things are running >> smoothly, but maybe a 10 minute stall on one service causes things to back >> up (and grow the heap of) other services on the same machine? I'm guessing. >> >> ... peter >> >> Srinivas Ramakrishna wrote: >> >>> >>> Has anyone come across extremely long (upwards of 10 minutes) promotion >>> failure unwinding scenarios when using any of the collectors, but >>> especially with ParNew/CMS? >>> I recently came across one such occurrence with ParNew/CMS that, with a >>> 40 GB young gen took upwards of 10 minutes to "unwind". I looked through >>> the code and I can see >>> that the unwinding steps can be a source of slowdown as we iterate >>> single-threaded (DefNew) through the large Eden to fix up self-forwarded >>> objects, but that still wouldn't >>> seem to explain such a large pause, even with a 40 GB young gen. I am >>> looking through the promotion failure paths to see what might be the cause >>> of such a large pause, >>> but if anyone has experienced this kind of scenario before or has any >>> conjectures or insights, I'd appreciate it. >>> >>> thanks! >>> -- ramki >>> >>> >>> ------------------------------**------------------------------** >>> ------------ >>> >>> ______________________________**_________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.**net >>> http://mail.openjdk.java.net/**mailman/listinfo/hotspot-gc-**use >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121018/726b3a16/attachment.html From Peter.B.Kessler at Oracle.COM Thu Oct 18 17:27:25 2012 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Thu, 18 Oct 2012 17:27:25 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: <50806485.8080808@Oracle.COM> Message-ID: <50809E6D.6020704@Oracle.COM> When there's no room in the old generation and a worker has filled its PLAB to capacity, but it still has instances to try to promote, does it try to allocate a new PLAB, and fail? That would lead to each of the workers eventually failing to allocate a new PLAB for each promotion attempt. IIRC, PLAB allocation grabs a real lock (since it happens so rarely :-). In the promotion failure case, that lock could get incandescent. Maybe it's gone unnoticed because for modest young generations it doesn't stay hot enough for long enough for people to witness the supernova? Having a young generation the size you do would exacerbate the problem. If you have lots of workers, that would increase the amount of contention, too. PLAB allocation might be a place where you could put a test for having failed promotion, so just return null and let the worker self-loop this instance. That would keep the test off the fast-path (when things are going well). I'm still guessing. ... 
peter Srinivas Ramakrishna wrote: > System data show high context switching in vicinity of event and points > at the futile allocation bottleneck as a possible theory with some legs.... > > more later. > -- ramki > > On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna > wrote: > > Thanks Peter... the possibility of paging or related issue of VM > system did occur to me, especially because system time shows up as > somewhat high here. The problem is that this server runs without > swap :-) so the time is going elsewhere. > > The cache miss theory is interesting (but would not show up as > system time), and your back of the envelope calculation gives about > 0.8 us for fetching a cache line, although i am pretty sure the > cache miss predictor would probably figure out the misses and stream > in the > cache lines since as you say we are going in address order). I'd > expect it to be no worse than when we do an "initial mark pause on a > full Eden", give or > take a little, and this is some 30 x worse. > > One possibility I am looking at is the part where we self-loop. I > suspect the ParNew/CMS combination running with multiple worker threads > is hit hard here, if the failure happens very early say -- from what > i saw of that code recently, we don't consult the flag that says we > failed > so we should just return and self-loop. Rather we retry allocation > for each subsequent object, fail that and then do the self-loop. The > repeated > failed attempts might be adding up, especially since the access > involves looking at the shared pool. I'll look at how that is done, > and see if we can > do a fast fail after the first failure happens, rather than try and > do the rest of the scavenge, since we'll need to do a fixup anyway. > > thanks for the discussion and i'll update as and when i do some more > investigations. Keep those ideas coming, and I'll submit a bug > report once > i have spent a few more cycles looking at the available data and > ruminating. > > - ramki > > > On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler > > wrote: > > IIRC, promotion failure still has to finish the evacuation > attempt (and some objects may get promoted while the ones that > fail get self-looped). That part is the usual multi-threaded > object graph walk, with failed PLAB allocations thrown in to > slow you down. Then you get to start the pass that deals with > the self-loops, which you say is single-threaded. Undoing the > self-loops is in address order, but it walks by the object > sizes, so probably it mostly misses in the cache. 40GB at the > average object size (call them 40 bytes to make the math easy) > is a lot of cache misses. How fast is your memory system? > Probably faster than (10minutes / (40GB / 40bytes)) per cache miss. > > Is it possible you are paging? Maybe not when things are > running smoothly, but maybe a 10 minute stall on one service > causes things to back up (and grow the heap of) other services > on the same machine? I'm guessing. > > ... peter > > Srinivas Ramakrishna wrote: > > > Has anyone come across extremely long (upwards of 10 > minutes) promotion failure unwinding scenarios when using > any of the collectors, but especially with ParNew/CMS? > I recently came across one such occurrence with ParNew/CMS > that, with a 40 GB young gen took upwards of 10 minutes to > "unwind". 
I looked through the code and I can see > that the unwinding steps can be a source of slowdown as we > iterate single-threaded (DefNew) through the large Eden to > fix up self-forwarded objects, but that still wouldn't > seem to explain such a large pause, even with a 40 GB young > gen. I am looking through the promotion failure paths to see > what might be the cause of such a large pause, > but if anyone has experienced this kind of scenario before > or has any conjectures or insights, I'd appreciate it. > > thanks! > -- ramki > > > ------------------------------__------------------------------__------------ > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > From ysr1729 at gmail.com Fri Oct 19 01:40:58 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 19 Oct 2012 01:40:58 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: <50809E6D.6020704@Oracle.COM> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> Message-ID: On Thu, Oct 18, 2012 at 5:27 PM, Peter B. Kessler < Peter.B.Kessler at oracle.com> wrote: > When there's no room in the old generation and a worker has filled its > PLAB to capacity, but it still has instances to try to promote, does it try > to allocate a new PLAB, and fail? That would lead to each of the workers > eventually failing to allocate a new PLAB for each promotion attempt. > IIRC, PLAB allocation grabs a real lock (since it happens so rarely :-). > In the promotion failure case, that lock could get incandescent. Maybe > it's gone unnoticed because for modest young generations it doesn't stay > hot enough for long enough for people to witness the supernova? Having a > young generation the size you do would exacerbate the problem. If you have > lots of workers, that would increase the amount of contention, too. > Yes, that's exactly my thinking too. For the case of CMS, the PLAB's are "local free block lists" and the allocation from the shared global pool is even worse and more heavyweight than an atomic pointer bump, with a lock protecting several layers of checks. > > PLAB allocation might be a place where you could put a test for having > failed promotion, so just return null and let the worker self-loop this > instance. That would keep the test off the fast-path (when things are > going well). > Yes, that's a good idea and might well be sufficient, and was also my first thought. However, I also wonder about whether just moving the promotion failure test a volatile read into the fast path of the copy routine, and immediately failing all subsequent copies after the first failure (and indeed via the global flag propagating that failure across all the workers immediately) won't just be quicker without having added that much in the fast path. It seems that in that case we may be able to even avoid the self-looping and the subsequent single-threaded fixup. The first thread that fails sets the volatile global, so any subsequent thread artificially fails all subsequent copies of uncopied objects. Any object reference found pointing to an object in Eden or From space that hasn't yet been copied will call the copy routine which will (artificially) fail and return the original address. 
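To make the control flow concrete, here is a minimal sketch of that idea in plain Java -- this is not HotSpot code, and every name in it (PromotionFailureSketch, promotionFailed, tryAllocateInOldGen, copyObject) is made up for illustration: the first worker whose promotion allocation fails flips one shared flag, and every later copy attempt, on any worker, reads that flag on the fast path and immediately hands back the original address instead of re-taking the shared allocation lock.

import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only -- NOT HotSpot code; all names are invented for illustration.
class PromotionFailureSketch {
    // Set once by the first worker whose promotion allocation fails; a read
    // of this flag on the copy fast path is cheap compared with re-taking
    // the shared free-list lock for every subsequent object.
    static final AtomicBoolean promotionFailed = new AtomicBoolean(false);

    // Stand-in for "get space in the old gen / a fresh PLAB"; here it always
    // fails, to mimic a full old generation.
    static long tryAllocateInOldGen(int sizeInBytes) {
        return -1L;
    }

    // Stand-in for the per-object copy routine each GC worker runs.
    static long copyObject(long fromAddr, int sizeInBytes) {
        if (promotionFailed.get()) {
            // Fail fast: an earlier copy already failed somewhere, so skip the
            // allocation attempt entirely and hand back the original address
            // (the later fix-up pass deals with it, as discussed above).
            return fromAddr;
        }
        long toAddr = tryAllocateInOldGen(sizeInBytes);
        if (toAddr < 0) {
            promotionFailed.set(true); // propagate the failure to all workers
            return fromAddr;
        }
        // ... real code would copy the object to toAddr here ...
        return toAddr;
    }

    public static void main(String[] args) {
        System.out.println(copyObject(0x1000L, 40)); // first attempt fails slowly
        System.out.println(copyObject(0x2000L, 40)); // later attempts fail fast
    }
}

Since the flag only ever moves from false to true, a plain volatile boolean would serve just as well as the AtomicBoolean; the fast path needs only a read, never a CAS.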
I'll do some experiments and there may lurk devils in the details, but it seems to me that this will work and be much more efficient in the slow case, without making the fast path that much slower. > > I'm still guessing. Your guesses are good, and very helpful, and I think we are on the right track with this one as regards the cause of the slowdown. I'll update. -- ramki > > > ... peter > > Srinivas Ramakrishna wrote: > >> System data show high context switching in vicinity of event and points >> at the futile allocation bottleneck as a possible theory with some legs.... >> >> more later. >> -- ramki >> >> On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna > ysr1729 at gmail.com>> wrote: >> >> Thanks Peter... the possibility of paging or related issue of VM >> system did occur to me, especially because system time shows up as >> somewhat high here. The problem is that this server runs without >> swap :-) so the time is going elsewhere. >> >> The cache miss theory is interesting (but would not show up as >> system time), and your back of the envelope calculation gives about >> 0.8 us for fetching a cache line, although i am pretty sure the >> cache miss predictor would probably figure out the misses and stream >> in the >> cache lines since as you say we are going in address order). I'd >> expect it to be no worse than when we do an "initial mark pause on a >> full Eden", give or >> take a little, and this is some 30 x worse. >> >> One possibility I am looking at is the part where we self-loop. I >> suspect the ParNew/CMS combination running with multiple worker >> threads >> is hit hard here, if the failure happens very early say -- from what >> i saw of that code recently, we don't consult the flag that says we >> failed >> so we should just return and self-loop. Rather we retry allocation >> for each subsequent object, fail that and then do the self-loop. The >> repeated >> failed attempts might be adding up, especially since the access >> involves looking at the shared pool. I'll look at how that is done, >> and see if we can >> do a fast fail after the first failure happens, rather than try and >> do the rest of the scavenge, since we'll need to do a fixup anyway. >> >> thanks for the discussion and i'll update as and when i do some more >> investigations. Keep those ideas coming, and I'll submit a bug >> report once >> i have spent a few more cycles looking at the available data and >> ruminating. >> >> - ramki >> >> >> On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler >> >> >> wrote: >> >> IIRC, promotion failure still has to finish the evacuation >> attempt (and some objects may get promoted while the ones that >> fail get self-looped). That part is the usual multi-threaded >> object graph walk, with failed PLAB allocations thrown in to >> slow you down. Then you get to start the pass that deals with >> the self-loops, which you say is single-threaded. Undoing the >> self-loops is in address order, but it walks by the object >> sizes, so probably it mostly misses in the cache. 40GB at the >> average object size (call them 40 bytes to make the math easy) >> is a lot of cache misses. How fast is your memory system? >> Probably faster than (10minutes / (40GB / 40bytes)) per cache >> miss. >> >> Is it possible you are paging? Maybe not when things are >> running smoothly, but maybe a 10 minute stall on one service >> causes things to back up (and grow the heap of) other services >> on the same machine? I'm guessing. >> >> ... 
peter >> >> Srinivas Ramakrishna wrote: >> >> >> Has anyone come across extremely long (upwards of 10 >> minutes) promotion failure unwinding scenarios when using >> any of the collectors, but especially with ParNew/CMS? >> I recently came across one such occurrence with ParNew/CMS >> that, with a 40 GB young gen took upwards of 10 minutes to >> "unwind". I looked through the code and I can see >> that the unwinding steps can be a source of slowdown as we >> iterate single-threaded (DefNew) through the large Eden to >> fix up self-forwarded objects, but that still wouldn't >> seem to explain such a large pause, even with a 40 GB young >> gen. I am looking through the promotion failure paths to see >> what might be the cause of such a large pause, >> but if anyone has experienced this kind of scenario before >> or has any conjectures or insights, I'd appreciate it. >> >> thanks! >> -- ramki >> >> >> ------------------------------** >> __----------------------------**--__------------ >> >> ______________________________**___________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.__**net >> >> > >> http://mail.openjdk.java.net/_** >> _mailman/listinfo/hotspot-gc-_**_use >> > *use > >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121019/a4310c6d/attachment.html From fuyou001 at gmail.com Fri Oct 19 06:13:52 2012 From: fuyou001 at gmail.com (fuyou) Date: Fri, 19 Oct 2012 21:13:52 +0800 Subject: In 32 bit Server,what's the size of Java.Lang.Object Message-ID: hi all I am issues about in 32 bit Server,what's the size of Java.Lang.Object in this blog how-much-memory-is-used-by-my-java java.lang.Object: 2 * 4 (Object header),the size of Object is 2*4 but in the sessioCON5135_PDF_5135_0001.pdf (page 9) Object Metadata: 3 slots of data (4 for arrays, Class: pointer to class information,Flags: shape, hash code, etc, Lock: flatlock or pointer to inflated monito) so the size of Object is 3*4 who is right? -- ============================================= fuyou001 Best Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121019/0e5998e1/attachment.html From vitalyd at gmail.com Fri Oct 19 06:27:37 2012 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 19 Oct 2012 09:27:37 -0400 Subject: In 32 bit Server,what's the size of Java.Lang.Object In-Reply-To: References: Message-ID: The presentation may be talking about IBM's VM. Hotspot obj header is two words; arrays have an extra int32 for length. Sent from my phone On Oct 19, 2012 9:15 AM, "fuyou" wrote: > hi all > I am issues about in 32 bit Server,what's the size of > Java.Lang.Object > in this blog how-much-memory-is-used-by-my-java > java.lang.Object: 2 * 4 (Object header),the size of Object is 2*4 > but in the sessioCON5135_PDF_5135_0001.pdf (page > 9) Object Metadata: 3 slots of data (4 for arrays, Class: pointer to > class information,Flags: shape, hash code, etc, Lock: flatlock or pointer > to inflated monito) > so the size of Object is 3*4 > > who is right? > > -- > ============================================= > > fuyou001 > Best Regards > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121019/d627c02e/attachment.html From chunt at salesforce.com Fri Oct 19 06:36:44 2012 From: chunt at salesforce.com (Charlie Hunt) Date: Fri, 19 Oct 2012 06:36:44 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> Message-ID: <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> Interesting discussion. :-) Ramki's observation of high context switches to me suggests active locking as a possible culprit. Fwiw, based on your discussion it looks like you're headed down a path that makes sense. charlie... On Oct 19, 2012, at 3:40 AM, Srinivas Ramakrishna wrote: On Thu, Oct 18, 2012 at 5:27 PM, Peter B. Kessler > wrote: When there's no room in the old generation and a worker has filled its PLAB to capacity, but it still has instances to try to promote, does it try to allocate a new PLAB, and fail? That would lead to each of the workers eventually failing to allocate a new PLAB for each promotion attempt. IIRC, PLAB allocation grabs a real lock (since it happens so rarely :-). In the promotion failure case, that lock could get incandescent. Maybe it's gone unnoticed because for modest young generations it doesn't stay hot enough for long enough for people to witness the supernova? Having a young generation the size you do would exacerbate the problem. If you have lots of workers, that would increase the amount of contention, too. Yes, that's exactly my thinking too. For the case of CMS, the PLAB's are "local free block lists" and the allocation from the shared global pool is even worse and more heavyweight than an atomic pointer bump, with a lock protecting several layers of checks. PLAB allocation might be a place where you could put a test for having failed promotion, so just return null and let the worker self-loop this instance. That would keep the test off the fast-path (when things are going well). Yes, that's a good idea and might well be sufficient, and was also my first thought. However, I also wonder about whether just moving the promotion failure test a volatile read into the fast path of the copy routine, and immediately failing all subsequent copies after the first failure (and indeed via the global flag propagating that failure across all the workers immediately) won't just be quicker without having added that much in the fast path. It seems that in that case we may be able to even avoid the self-looping and the subsequent single-threaded fixup. The first thread that fails sets the volatile global, so any subsequent thread artificially fails all subsequent copies of uncopied objects. Any object reference found pointing to an object in Eden or From space that hasn't yet been copied will call the copy routine which will (artificially) fail and return the original address. I'll do some experiments and there may lurk devils in the details, but it seems to me that this will work and be much more efficient in the slow case, without making the fast path that much slower. I'm still guessing. Your guesses are good, and very helpful, and I think we are on the right track with this one as regards the cause of the slowdown. I'll update. -- ramki ... peter Srinivas Ramakrishna wrote: System data show high context switching in vicinity of event and points at the futile allocation bottleneck as a possible theory with some legs.... more later. 
-- ramki On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna >> wrote: Thanks Peter... the possibility of paging or related issue of VM system did occur to me, especially because system time shows up as somewhat high here. The problem is that this server runs without swap :-) so the time is going elsewhere. The cache miss theory is interesting (but would not show up as system time), and your back of the envelope calculation gives about 0.8 us for fetching a cache line, although i am pretty sure the cache miss predictor would probably figure out the misses and stream in the cache lines since as you say we are going in address order). I'd expect it to be no worse than when we do an "initial mark pause on a full Eden", give or take a little, and this is some 30 x worse. One possibility I am looking at is the part where we self-loop. I suspect the ParNew/CMS combination running with multiple worker threads is hit hard here, if the failure happens very early say -- from what i saw of that code recently, we don't consult the flag that says we failed so we should just return and self-loop. Rather we retry allocation for each subsequent object, fail that and then do the self-loop. The repeated failed attempts might be adding up, especially since the access involves looking at the shared pool. I'll look at how that is done, and see if we can do a fast fail after the first failure happens, rather than try and do the rest of the scavenge, since we'll need to do a fixup anyway. thanks for the discussion and i'll update as and when i do some more investigations. Keep those ideas coming, and I'll submit a bug report once i have spent a few more cycles looking at the available data and ruminating. - ramki On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler >> wrote: IIRC, promotion failure still has to finish the evacuation attempt (and some objects may get promoted while the ones that fail get self-looped). That part is the usual multi-threaded object graph walk, with failed PLAB allocations thrown in to slow you down. Then you get to start the pass that deals with the self-loops, which you say is single-threaded. Undoing the self-loops is in address order, but it walks by the object sizes, so probably it mostly misses in the cache. 40GB at the average object size (call them 40 bytes to make the math easy) is a lot of cache misses. How fast is your memory system? Probably faster than (10minutes / (40GB / 40bytes)) per cache miss. Is it possible you are paging? Maybe not when things are running smoothly, but maybe a 10 minute stall on one service causes things to back up (and grow the heap of) other services on the same machine? I'm guessing. ... peter Srinivas Ramakrishna wrote: Has anyone come across extremely long (upwards of 10 minutes) promotion failure unwinding scenarios when using any of the collectors, but especially with ParNew/CMS? I recently came across one such occurrence with ParNew/CMS that, with a 40 GB young gen took upwards of 10 minutes to "unwind". I looked through the code and I can see that the unwinding steps can be a source of slowdown as we iterate single-threaded (DefNew) through the large Eden to fix up self-forwarded objects, but that still wouldn't seem to explain such a large pause, even with a 40 GB young gen. I am looking through the promotion failure paths to see what might be the cause of such a large pause, but if anyone has experienced this kind of scenario before or has any conjectures or insights, I'd appreciate it. thanks! 
-- ramki ------------------------------__------------------------------__------------ _________________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.__net > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121019/8c10cb65/attachment-0001.html From rednaxelafx at gmail.com Fri Oct 19 07:08:42 2012 From: rednaxelafx at gmail.com (Krystal Mok) Date: Fri, 19 Oct 2012 22:08:42 +0800 Subject: In 32 bit Server,what's the size of Java.Lang.Object In-Reply-To: References: Message-ID: Yes, Vitaly is right. That presentation described the situation in IBM's J9 VM, which is different from Oracle JDK/OpenJDK's HotSpot VM. J9 uses a dedicated word for locks, where as HotSpot folds that into the mark word. So an object with no explicit fields in a 32-bit HotSpot VM uses 8 bytes, 4 for the mark word and 4 for the klass pointer. - Kris On Fri, Oct 19, 2012 at 9:27 PM, Vitaly Davidovich wrote: > The presentation may be talking about IBM's VM. Hotspot obj header is two > words; arrays have an extra int32 for length. > > Sent from my phone > On Oct 19, 2012 9:15 AM, "fuyou" wrote: > >> hi all >> I am issues about in 32 bit Server,what's the size of >> Java.Lang.Object >> in this blog how-much-memory-is-used-by-my-java >> java.lang.Object: 2 * 4 (Object header),the size of Object is 2*4 >> but in the sessioCON5135_PDF_5135_0001.pdf (page >> 9) Object Metadata: 3 slots of data (4 for arrays, Class: pointer to >> class information,Flags: shape, hash code, etc, Lock: flatlock or pointer >> to inflated monito) >> so the size of Object is 3*4 >> >> who is right? >> >> -- >> ============================================= >> >> fuyou001 >> Best Regards >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121019/a3d1eb39/attachment.html From dawid.weiss at gmail.com Fri Oct 19 10:21:06 2012 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Fri, 19 Oct 2012 19:21:06 +0200 Subject: In 32 bit Server,what's the size of Java.Lang.Object In-Reply-To: References: Message-ID: Take a look at the memory use estimator in Apache Lucene or stand-alone extracted project here: https://github.com/dweiss/java-sizeof I gave a presentation about this topic a while ago at the local JUG -- the slideshow is here: http://www.slideshare.net/DawidWeiss/sizeofobject-how-much-memory-objects-take-on-jvms-and-when-this-may-matter There is no recording but it should give you a hint at how alignment, paddings and other things work (across different JVMs). 
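To check the figures from this thread on a particular VM, one option is a tiny -javaagent; the sketch below is my own assumption (the class name SizeAgent and the jar packaging with a Premain-Class manifest entry are not from this thread), and the byte counts in the comments are the 32-bit HotSpot numbers quoted above -- 64-bit and compressed-oops configurations, and other VMs, will report different values.

import java.lang.instrument.Instrumentation;

// Sketch of a minimal sizing agent; build it into a jar whose manifest
// contains "Premain-Class: SizeAgent" and run with -javaagent:sizeagent.jar
public class SizeAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        // 32-bit HotSpot: 8 bytes = 4-byte mark word + 4-byte klass pointer
        System.out.println("new Object(): " + inst.getObjectSize(new Object()) + " bytes");
        // arrays add a 4-byte length field, then pad to 8-byte alignment
        System.out.println("new int[0]:   " + inst.getObjectSize(new int[0]) + " bytes");
    }
}

Instrumentation.getObjectSize reports an implementation-specific approximation of the shallow size, which is exactly what you want when comparing header layouts across VMs.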
Dawid On Fri, Oct 19, 2012 at 3:13 PM, fuyou wrote: > hi all > I am issues about in 32 bit Server,what's the size of Java.Lang.Object > in this blog how-much-memory-is-used-by-my-java java.lang.Object: 2 * 4 > (Object header),the size of Object is 2*4 > but in the sessioCON5135_PDF_5135_0001.pdf (page 9) Object Metadata: 3 > slots of data (4 for arrays, Class: pointer to class information,Flags: > shape, hash code, etc, Lock: flatlock or pointer to inflated monito) > so the size of Object is 3*4 > > who is right? > > -- > ============================================= > > fuyou001 > Best Regards > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From jobrien at ieee.org Fri Oct 19 11:13:51 2012 From: jobrien at ieee.org (John O'Brien) Date: Fri, 19 Oct 2012 11:13:51 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> Message-ID: Srinivas, I am interested in how this is resolved. Can I clarify that you are referring to the GC-- events in the garbage log where the minor gc unwinds and turns into a major GC? I'd like to figure out if this is something different than what I've seen before. I have seen GC times blow out is when Transparent Huge Pages interfered when it was doing its coalescing at the same time as a GC. Can you clarify what kernel version you are on and if huge/large pages are enabled? This may help you but it will help me follow the discussion better. Thanks, John On Fri, Oct 19, 2012 at 6:36 AM, Charlie Hunt wrote: > Interesting discussion. :-) > > Ramki's observation of high context switches to me suggests active locking > as a possible culprit. Fwiw, based on your discussion it looks like you're > headed down a path that makes sense. > > charlie... > > On Oct 19, 2012, at 3:40 AM, Srinivas Ramakrishna wrote: > > > > On Thu, Oct 18, 2012 at 5:27 PM, Peter B. Kessler > wrote: >> >> When there's no room in the old generation and a worker has filled its >> PLAB to capacity, but it still has instances to try to promote, does it try >> to allocate a new PLAB, and fail? That would lead to each of the workers >> eventually failing to allocate a new PLAB for each promotion attempt. IIRC, >> PLAB allocation grabs a real lock (since it happens so rarely :-). In the >> promotion failure case, that lock could get incandescent. Maybe it's gone >> unnoticed because for modest young generations it doesn't stay hot enough >> for long enough for people to witness the supernova? Having a young >> generation the size you do would exacerbate the problem. If you have lots >> of workers, that would increase the amount of contention, too. > > > Yes, that's exactly my thinking too. For the case of CMS, the PLAB's are > "local free block lists" and the allocation from the shared global pool is > even worse and more heavyweight than an atomic pointer bump, with a lock > protecting several layers of checks. > >> >> >> PLAB allocation might be a place where you could put a test for having >> failed promotion, so just return null and let the worker self-loop this >> instance. That would keep the test off the fast-path (when things are going >> well). > > > Yes, that's a good idea and might well be sufficient, and was also my first > thought. 
However, I also wonder about whether just moving the promotion > failure test a volatile read into the fast path of the copy routine, and > immediately failing all subsequent copies after the first failure (and > indeed via the > global flag propagating that failure across all the workers immediately) > won't just be quicker without having added that much in the fast path. It > seems > that in that case we may be able to even avoid the self-looping and the > subsequent single-threaded fixup. The first thread that fails sets the > volatile > global, so any subsequent thread artificially fails all subsequent copies of > uncopied objects. Any object reference found pointing to an object in Eden > or From space that hasn't yet been copied will call the copy routine which > will (artificially) fail and return the original address. > > I'll do some experiments and there may lurk devils in the details, but it > seems to me that this will work and be much more efficient in the > slow case, without making the fast path that much slower. > >> >> >> I'm still guessing. > > > Your guesses are good, and very helpful, and I think we are on the right > track with this one as regards the cause of the slowdown. > > I'll update. > > -- ramki > >> >> >> >> ... peter >> >> Srinivas Ramakrishna wrote: >>> >>> System data show high context switching in vicinity of event and points >>> at the futile allocation bottleneck as a possible theory with some legs.... >>> >>> more later. >>> -- ramki >>> >>> On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna >> > wrote: >>> >>> Thanks Peter... the possibility of paging or related issue of VM >>> system did occur to me, especially because system time shows up as >>> somewhat high here. The problem is that this server runs without >>> swap :-) so the time is going elsewhere. >>> >>> The cache miss theory is interesting (but would not show up as >>> system time), and your back of the envelope calculation gives about >>> 0.8 us for fetching a cache line, although i am pretty sure the >>> cache miss predictor would probably figure out the misses and stream >>> in the >>> cache lines since as you say we are going in address order). I'd >>> expect it to be no worse than when we do an "initial mark pause on a >>> full Eden", give or >>> take a little, and this is some 30 x worse. >>> >>> One possibility I am looking at is the part where we self-loop. I >>> suspect the ParNew/CMS combination running with multiple worker >>> threads >>> is hit hard here, if the failure happens very early say -- from what >>> i saw of that code recently, we don't consult the flag that says we >>> failed >>> so we should just return and self-loop. Rather we retry allocation >>> for each subsequent object, fail that and then do the self-loop. The >>> repeated >>> failed attempts might be adding up, especially since the access >>> involves looking at the shared pool. I'll look at how that is done, >>> and see if we can >>> do a fast fail after the first failure happens, rather than try and >>> do the rest of the scavenge, since we'll need to do a fixup anyway. >>> >>> thanks for the discussion and i'll update as and when i do some more >>> investigations. Keep those ideas coming, and I'll submit a bug >>> report once >>> i have spent a few more cycles looking at the available data and >>> ruminating. >>> >>> - ramki >>> >>> >>> On Thu, Oct 18, 2012 at 1:20 PM, Peter B. 
Kessler >>> > >>> wrote: >>> >>> IIRC, promotion failure still has to finish the evacuation >>> attempt (and some objects may get promoted while the ones that >>> fail get self-looped). That part is the usual multi-threaded >>> object graph walk, with failed PLAB allocations thrown in to >>> slow you down. Then you get to start the pass that deals with >>> the self-loops, which you say is single-threaded. Undoing the >>> self-loops is in address order, but it walks by the object >>> sizes, so probably it mostly misses in the cache. 40GB at the >>> average object size (call them 40 bytes to make the math easy) >>> is a lot of cache misses. How fast is your memory system? >>> Probably faster than (10minutes / (40GB / 40bytes)) per cache >>> miss. >>> >>> Is it possible you are paging? Maybe not when things are >>> running smoothly, but maybe a 10 minute stall on one service >>> causes things to back up (and grow the heap of) other services >>> on the same machine? I'm guessing. >>> >>> ... peter >>> >>> Srinivas Ramakrishna wrote: >>> >>> >>> Has anyone come across extremely long (upwards of 10 >>> minutes) promotion failure unwinding scenarios when using >>> any of the collectors, but especially with ParNew/CMS? >>> I recently came across one such occurrence with ParNew/CMS >>> that, with a 40 GB young gen took upwards of 10 minutes to >>> "unwind". I looked through the code and I can see >>> that the unwinding steps can be a source of slowdown as we >>> iterate single-threaded (DefNew) through the large Eden to >>> fix up self-forwarded objects, but that still wouldn't >>> seem to explain such a large pause, even with a 40 GB young >>> gen. I am looking through the promotion failure paths to see >>> what might be the cause of such a large pause, >>> but if anyone has experienced this kind of scenario before >>> or has any conjectures or insights, I'd appreciate it. >>> >>> thanks! >>> -- ramki >>> >>> >>> >>> ------------------------------__------------------------------__------------ >>> >>> _________________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.__net >>> >>> >>> http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use >>> >>> >>> >>> >>> > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From ysr1729 at gmail.com Fri Oct 19 14:14:51 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 19 Oct 2012 14:14:51 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> Message-ID: Hi John -- Interesting... I was not aware of this issue. The kernel version i was running is 2.6.18 and, AFAICT, THP came in in 2.6.38, yes? We also do not explicitly enabled huge pages although we probably should. The interesting part is that even though there is fairly high system time, it accounts only for a quarter of the elapsed time, so it can't just be huge page coalescing getting in the way, and even if it is, there must be something bigger afoot here that accounts for the rest of the time. 
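(As an aside for anyone following along on a kernel new enough to have THP at all, you can usually check whether it is active with

cat /sys/kernel/mm/transparent_hugepage/enabled

-- iirc RHEL 6 kernels expose it as /sys/kernel/mm/redhat_transparent_hugepage/enabled instead -- and

grep -i huge /proc/meminfo

shows the static hugepage counters plus, on newer kernels, an AnonHugePages line for THP-backed memory. Paths and field names vary by distro, so treat these as illustrative.)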
I checked meminfo and it showed no huge pages, free, used or reserved on the system. thanks! -- ramki On Fri, Oct 19, 2012 at 11:13 AM, John O'Brien wrote: > Srinivas, > > I am interested in how this is resolved. Can I clarify that you are > referring to the GC-- events in the garbage log where the minor gc > unwinds and turns into a major GC? > > I'd like to figure out if this is something different than what I've > seen before. I have seen GC times blow out is when Transparent Huge > Pages interfered when it was doing its coalescing at the same time as > a GC. Can you clarify what kernel version you are on and if huge/large > pages are enabled? > > This may help you but it will help me follow the discussion better. > > Thanks, > John > > On Fri, Oct 19, 2012 at 6:36 AM, Charlie Hunt > wrote: > > Interesting discussion. :-) > > > > Ramki's observation of high context switches to me suggests active > locking > > as a possible culprit. Fwiw, based on your discussion it looks like > you're > > headed down a path that makes sense. > > > > charlie... > > > > On Oct 19, 2012, at 3:40 AM, Srinivas Ramakrishna wrote: > > > > > > > > On Thu, Oct 18, 2012 at 5:27 PM, Peter B. Kessler > > wrote: > >> > >> When there's no room in the old generation and a worker has filled its > >> PLAB to capacity, but it still has instances to try to promote, does it > try > >> to allocate a new PLAB, and fail? That would lead to each of the > workers > >> eventually failing to allocate a new PLAB for each promotion attempt. > IIRC, > >> PLAB allocation grabs a real lock (since it happens so rarely :-). In > the > >> promotion failure case, that lock could get incandescent. Maybe it's > gone > >> unnoticed because for modest young generations it doesn't stay hot > enough > >> for long enough for people to witness the supernova? Having a young > >> generation the size you do would exacerbate the problem. If you have > lots > >> of workers, that would increase the amount of contention, too. > > > > > > Yes, that's exactly my thinking too. For the case of CMS, the PLAB's are > > "local free block lists" and the allocation from the shared global pool > is > > even worse and more heavyweight than an atomic pointer bump, with a lock > > protecting several layers of checks. > > > >> > >> > >> PLAB allocation might be a place where you could put a test for having > >> failed promotion, so just return null and let the worker self-loop this > >> instance. That would keep the test off the fast-path (when things are > going > >> well). > > > > > > Yes, that's a good idea and might well be sufficient, and was also my > first > > thought. However, I also wonder about whether just moving the promotion > > failure test a volatile read into the fast path of the copy routine, and > > immediately failing all subsequent copies after the first failure (and > > indeed via the > > global flag propagating that failure across all the workers immediately) > > won't just be quicker without having added that much in the fast path. It > > seems > > that in that case we may be able to even avoid the self-looping and the > > subsequent single-threaded fixup. The first thread that fails sets the > > volatile > > global, so any subsequent thread artificially fails all subsequent > copies of > > uncopied objects. Any object reference found pointing to an object in > Eden > > or From space that hasn't yet been copied will call the copy routine > which > > will (artificially) fail and return the original address. 
> > > > I'll do some experiments and there may lurk devils in the details, but it > > seems to me that this will work and be much more efficient in the > > slow case, without making the fast path that much slower. > > > >> > >> > >> I'm still guessing. > > > > > > Your guesses are good, and very helpful, and I think we are on the right > > track with this one as regards the cause of the slowdown. > > > > I'll update. > > > > -- ramki > > > >> > >> > >> > >> ... peter > >> > >> Srinivas Ramakrishna wrote: > >>> > >>> System data show high context switching in vicinity of event and points > >>> at the futile allocation bottleneck as a possible theory with some > legs.... > >>> > >>> more later. > >>> -- ramki > >>> > >>> On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna < > ysr1729 at gmail.com > >>> > wrote: > >>> > >>> Thanks Peter... the possibility of paging or related issue of VM > >>> system did occur to me, especially because system time shows up as > >>> somewhat high here. The problem is that this server runs without > >>> swap :-) so the time is going elsewhere. > >>> > >>> The cache miss theory is interesting (but would not show up as > >>> system time), and your back of the envelope calculation gives about > >>> 0.8 us for fetching a cache line, although i am pretty sure the > >>> cache miss predictor would probably figure out the misses and > stream > >>> in the > >>> cache lines since as you say we are going in address order). I'd > >>> expect it to be no worse than when we do an "initial mark pause on > a > >>> full Eden", give or > >>> take a little, and this is some 30 x worse. > >>> > >>> One possibility I am looking at is the part where we self-loop. I > >>> suspect the ParNew/CMS combination running with multiple worker > >>> threads > >>> is hit hard here, if the failure happens very early say -- from > what > >>> i saw of that code recently, we don't consult the flag that says we > >>> failed > >>> so we should just return and self-loop. Rather we retry allocation > >>> for each subsequent object, fail that and then do the self-loop. > The > >>> repeated > >>> failed attempts might be adding up, especially since the access > >>> involves looking at the shared pool. I'll look at how that is done, > >>> and see if we can > >>> do a fast fail after the first failure happens, rather than try and > >>> do the rest of the scavenge, since we'll need to do a fixup anyway. > >>> > >>> thanks for the discussion and i'll update as and when i do some > more > >>> investigations. Keep those ideas coming, and I'll submit a bug > >>> report once > >>> i have spent a few more cycles looking at the available data and > >>> ruminating. > >>> > >>> - ramki > >>> > >>> > >>> On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler > >>> > > >>> wrote: > >>> > >>> IIRC, promotion failure still has to finish the evacuation > >>> attempt (and some objects may get promoted while the ones that > >>> fail get self-looped). That part is the usual multi-threaded > >>> object graph walk, with failed PLAB allocations thrown in to > >>> slow you down. Then you get to start the pass that deals with > >>> the self-loops, which you say is single-threaded. Undoing the > >>> self-loops is in address order, but it walks by the object > >>> sizes, so probably it mostly misses in the cache. 40GB at the > >>> average object size (call them 40 bytes to make the math easy) > >>> is a lot of cache misses. How fast is your memory system? > >>> Probably faster than (10minutes / (40GB / 40bytes)) per cache > >>> miss. 
> >>> > >>> Is it possible you are paging? Maybe not when things are > >>> running smoothly, but maybe a 10 minute stall on one service > >>> causes things to back up (and grow the heap of) other services > >>> on the same machine? I'm guessing. > >>> > >>> ... peter > >>> > >>> Srinivas Ramakrishna wrote: > >>> > >>> > >>> Has anyone come across extremely long (upwards of 10 > >>> minutes) promotion failure unwinding scenarios when using > >>> any of the collectors, but especially with ParNew/CMS? > >>> I recently came across one such occurrence with ParNew/CMS > >>> that, with a 40 GB young gen took upwards of 10 minutes to > >>> "unwind". I looked through the code and I can see > >>> that the unwinding steps can be a source of slowdown as we > >>> iterate single-threaded (DefNew) through the large Eden to > >>> fix up self-forwarded objects, but that still wouldn't > >>> seem to explain such a large pause, even with a 40 GB young > >>> gen. I am looking through the promotion failure paths to > see > >>> what might be the cause of such a large pause, > >>> but if anyone has experienced this kind of scenario before > >>> or has any conjectures or insights, I'd appreciate it. > >>> > >>> thanks! > >>> -- ramki > >>> > >>> > >>> > >>> > ------------------------------__------------------------------__------------ > >>> > >>> _________________________________________________ > >>> hotspot-gc-use mailing list > >>> hotspot-gc-use at openjdk.java.__net > >>> > >>> > >>> http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > >>> > >>> > >>> > >>> > >>> > > > > _______________________________________________ > > hotspot-gc-use mailing list > > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > > > _______________________________________________ > > hotspot-gc-use mailing list > > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121019/e32975e6/attachment.html From tro at ordix.de Thu Oct 25 01:47:52 2012 From: tro at ordix.de (Thomas Rohde) Date: Thu, 25 Oct 2012 10:47:52 +0200 Subject: from space and to space size is different and varies In-Reply-To: References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> Message-ID: <5088FCB8.4070701@ordix.de> Hi Folks, up to yesterday I always thought, that from-space and to-space have always the same size. 
In a GC log of a colleague I saw the following and was wondering about it: 22.10.2012 00:01:59 {Heap before gc invocations=590: 22.10.2012 00:01:59 PSYoungGen total 72768K, used 70928K [0xcdc00000, 0xd3800000, 0xf8800000) 22.10.2012 00:01:59 eden space 54528K, 100% used [0xcdc00000,0xd1140000,0xd1140000) 22.10.2012 00:01:59 from space 18240K, 89% used [0xd1460000,0xd2464168,0xd2630000) 22.10.2012 00:01:59 to space 18048K, 0% used [0xd2660000,0xd2660000,0xd3800000) 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334889K [0x78400000, 0xcdc00000, 0xcdc00000) 22.10.2012 00:01:59 object space 1400832K, 23% used [0x78400000,0x8cb0a778,0xcdc00000) 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K [0x74400000, 0x75800000, 0x78400000) 22.10.2012 00:01:59 object space 20480K, 97% used [0x74400000,0x75787568,0x75800000) 22.10.2012 00:01:59 299113.923: [GC [PSYoungGen: 70928K->15357K(72512K)] 405818K->350331K(1473344K), 0.0717791 secs] 22.10.2012 00:01:59 Heap after gc invocations=590: 22.10.2012 00:01:59 PSYoungGen total 72512K, used 15357K [0xcdc00000, 0xd3800000, 0xf8800000) 22.10.2012 00:01:59 eden space 54464K, 0% used [0xcdc00000,0xcdc00000,0xd1130000) 22.10.2012 00:01:59 from space 18048K, 85% used [0xd2660000,0xd355f7d8,0xd3800000) 22.10.2012 00:01:59 to space 18112K, 0% used [0xd14a0000,0xd14a0000,0xd2650000) 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334973K [0x78400000, 0xcdc00000, 0xcdc00000) 22.10.2012 00:01:59 object space 1400832K, 23% used [0x78400000,0x8cb1f778,0xcdc00000) 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K [0x74400000, 0x75800000, 0x78400000) 22.10.2012 00:01:59 object space 20480K, 97% used [0x74400000,0x75787568,0x75800000) 22.10.2012 00:01:59 } Before GC from space has 18240K and to space has 18048K. After GC from space has 18048K and to space has 18112K. java version "1.5.0_20" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_20-b02) Java HotSpot(TM) Server VM (build 1.5.0_20-b02, mixed mode) 1. Why is the size of from space and to space not equal? 2. Why is the size always changing? Bye, Thomas From chunt at salesforce.com Thu Oct 25 05:23:16 2012 From: chunt at salesforce.com (Charlie Hunt) Date: Thu, 25 Oct 2012 05:23:16 -0700 Subject: from space and to space size is different and varies In-Reply-To: <5088FCB8.4070701@ordix.de> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> <5088FCB8.4070701@ordix.de> Message-ID: <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> Hi Thomas, It is common for Parallel GC, and Parallel Old GC to adjust survivor sizes when -XX:+UseAdaptiveSizePolicy is enabled. And, -XX:+UseAdaptiveSizePolicy is enabled by default with either -XX:+UseParallelGC and -XX:+UseParallelOldGC, (iirc, the latter is not available in a Java 5 HotSpot VM). If you disable adaptive size policy, via -XX:-UseAdaptiveSizePolicy, survivor sizes should remain the same size. hths, charlie ... On Oct 25, 2012, at 3:47 AM, Thomas Rohde wrote: > Hi Folks, > > up to yesterday I always thought, that from-space and to-space have > always the same size. 
In a GC log of a colleague I saw the following and > was wondering about it: > > 22.10.2012 00:01:59 {Heap before gc invocations=590: > 22.10.2012 00:01:59 PSYoungGen total 72768K, used 70928K > [0xcdc00000, 0xd3800000, 0xf8800000) > 22.10.2012 00:01:59 eden space 54528K, 100% used > [0xcdc00000,0xd1140000,0xd1140000) > 22.10.2012 00:01:59 from space 18240K, 89% used > [0xd1460000,0xd2464168,0xd2630000) > 22.10.2012 00:01:59 to space 18048K, 0% used > [0xd2660000,0xd2660000,0xd3800000) > 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334889K > [0x78400000, 0xcdc00000, 0xcdc00000) > 22.10.2012 00:01:59 object space 1400832K, 23% used > [0x78400000,0x8cb0a778,0xcdc00000) > 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K > [0x74400000, 0x75800000, 0x78400000) > 22.10.2012 00:01:59 object space 20480K, 97% used > [0x74400000,0x75787568,0x75800000) > 22.10.2012 00:01:59 299113.923: [GC [PSYoungGen: 70928K->15357K(72512K)] > 405818K->350331K(1473344K), 0.0717791 secs] > 22.10.2012 00:01:59 Heap after gc invocations=590: > 22.10.2012 00:01:59 PSYoungGen total 72512K, used 15357K > [0xcdc00000, 0xd3800000, 0xf8800000) > 22.10.2012 00:01:59 eden space 54464K, 0% used > [0xcdc00000,0xcdc00000,0xd1130000) > 22.10.2012 00:01:59 from space 18048K, 85% used > [0xd2660000,0xd355f7d8,0xd3800000) > 22.10.2012 00:01:59 to space 18112K, 0% used > [0xd14a0000,0xd14a0000,0xd2650000) > 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334973K > [0x78400000, 0xcdc00000, 0xcdc00000) > 22.10.2012 00:01:59 object space 1400832K, 23% used > [0x78400000,0x8cb1f778,0xcdc00000) > 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K > [0x74400000, 0x75800000, 0x78400000) > 22.10.2012 00:01:59 object space 20480K, 97% used > [0x74400000,0x75787568,0x75800000) > 22.10.2012 00:01:59 } > > Before GC from space has 18240K and to space has 18048K. > After GC from space has 18048K and to space has 18112K. > > java version "1.5.0_20" > Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_20-b02) > Java HotSpot(TM) Server VM (build 1.5.0_20-b02, mixed mode) > > 1. Why is the size of from space and to space not equal? > 2. Why is the size always changing? > > Bye, > Thomas > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From tro at ordix.de Fri Oct 26 01:32:32 2012 From: tro at ordix.de (Thomas Rohde) Date: Fri, 26 Oct 2012 10:32:32 +0200 Subject: from space and to space size is different and varies In-Reply-To: <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> <5088FCB8.4070701@ordix.de> <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> Message-ID: <508A4AA0.1090709@ordix.de> Hi Charlie, thanks for your reply. I could reproduce this with -XX:+UseAdaptiveSizePolicy and Java 1.7. My misbelief that from space and to space are always the same size. Thought that -XX:+UseAdaptiveSizePolicy would change both of them in the same manner. Thanks! Thomas Am 25.10.2012 14:23, schrieb Charlie Hunt: > Hi Thomas, > > It is common for Parallel GC, and Parallel Old GC to adjust survivor sizes when -XX:+UseAdaptiveSizePolicy is enabled. And, -XX:+UseAdaptiveSizePolicy is enabled by default with either -XX:+UseParallelGC and -XX:+UseParallelOldGC, (iirc, the latter is not available in a Java 5 HotSpot VM). 
> > If you disable adaptive size policy, via -XX:-UseAdaptiveSizePolicy, survivor sizes should remain the same size. > > hths, > > charlie ... > > On Oct 25, 2012, at 3:47 AM, Thomas Rohde wrote: > >> Hi Folks, >> >> up to yesterday I always thought, that from-space and to-space have >> always the same size. In a GC log of a colleague I saw the following and >> was wondering about it: >> >> 22.10.2012 00:01:59 {Heap before gc invocations=590: >> 22.10.2012 00:01:59 PSYoungGen total 72768K, used 70928K >> [0xcdc00000, 0xd3800000, 0xf8800000) >> 22.10.2012 00:01:59 eden space 54528K, 100% used >> [0xcdc00000,0xd1140000,0xd1140000) >> 22.10.2012 00:01:59 from space 18240K, 89% used >> [0xd1460000,0xd2464168,0xd2630000) >> 22.10.2012 00:01:59 to space 18048K, 0% used >> [0xd2660000,0xd2660000,0xd3800000) >> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334889K >> [0x78400000, 0xcdc00000, 0xcdc00000) >> 22.10.2012 00:01:59 object space 1400832K, 23% used >> [0x78400000,0x8cb0a778,0xcdc00000) >> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >> [0x74400000, 0x75800000, 0x78400000) >> 22.10.2012 00:01:59 object space 20480K, 97% used >> [0x74400000,0x75787568,0x75800000) >> 22.10.2012 00:01:59 299113.923: [GC [PSYoungGen: 70928K->15357K(72512K)] >> 405818K->350331K(1473344K), 0.0717791 secs] >> 22.10.2012 00:01:59 Heap after gc invocations=590: >> 22.10.2012 00:01:59 PSYoungGen total 72512K, used 15357K >> [0xcdc00000, 0xd3800000, 0xf8800000) >> 22.10.2012 00:01:59 eden space 54464K, 0% used >> [0xcdc00000,0xcdc00000,0xd1130000) >> 22.10.2012 00:01:59 from space 18048K, 85% used >> [0xd2660000,0xd355f7d8,0xd3800000) >> 22.10.2012 00:01:59 to space 18112K, 0% used >> [0xd14a0000,0xd14a0000,0xd2650000) >> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334973K >> [0x78400000, 0xcdc00000, 0xcdc00000) >> 22.10.2012 00:01:59 object space 1400832K, 23% used >> [0x78400000,0x8cb1f778,0xcdc00000) >> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >> [0x74400000, 0x75800000, 0x78400000) >> 22.10.2012 00:01:59 object space 20480K, 97% used >> [0x74400000,0x75787568,0x75800000) >> 22.10.2012 00:01:59 } >> >> Before GC from space has 18240K and to space has 18048K. >> After GC from space has 18048K and to space has 18112K. >> >> java version "1.5.0_20" >> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_20-b02) >> Java HotSpot(TM) Server VM (build 1.5.0_20-b02, mixed mode) >> >> 1. Why is the size of from space and to space not equal? >> 2. Why is the size always changing? >> >> Bye, >> Thomas >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -- Dipl.-Wirt. Inf. (FH) Thomas Rohde Senior Consultant Architekturberatung ORDIX AG Westernmauer 12 - 16 33098 Paderborn Tel: 05251 / 1063-0 Mobil: 0163 / 6734966 Fax: 0180 / 1673490 mailto:tro at ordix.de http://www.ordix.de ORDIX AG - Aktiengesellschaft f?r Softwareentwicklung, Schulung, Beratung und Systemintegration Vorsitzender des Aufsichtsrates: Prof. Dr. 
Hermann Johannes Vorstand: Wolfgang K?gler (Vorsitzender), Benedikt Georgi, Christoph Lafeld, Axel R?ber Firmensitz: Westernmauer 12 - 16, 33098 Paderborn, Tel: 05251 / 1063-0, Fax: 0180 / 1 67 34 90 Amtsgericht Paderborn, HRB 2941, Ust-IdNr.DE 126333767, Steuernummer: 339/5866/0142 From chunt at salesforce.com Fri Oct 26 06:13:55 2012 From: chunt at salesforce.com (Charlie Hunt) Date: Fri, 26 Oct 2012 06:13:55 -0700 Subject: from space and to space size is different and varies In-Reply-To: <508A4AA0.1090709@ordix.de> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> <5088FCB8.4070701@ordix.de> <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> <508A4AA0.1090709@ordix.de> Message-ID: Hi Thomas, If you monitor your Java app with VisualVM's VisualGC plugin you can observe the behavior. You can launch VisualVM via jvisualvm. It's in the JDKs bin dir. Then go to the Update Center to get the VisualGC plugin. If you run your Java app with -XX:+UseParallelGC (you don't need to specify -XX:+UseAdaptiveSizePolicy, it is enabled by default), you should see s to/from spaces sized differently on VisualGC. Then, if you disable adaptive sizing, note the '-' character after the -XX:, via -XX:-UseAdaptiveSizePolicy, you should see that to/from spaces are staying the same size. I will double check as well. What I just described is what I would expect to see. Changing to/from sizes is by design with adaptive sizing enabled. Charlie Sent from my iPhone On Oct 26, 2012, at 3:34 AM, "Thomas Rohde" wrote: > Hi Charlie, > > thanks for your reply. I could reproduce this with > -XX:+UseAdaptiveSizePolicy and Java 1.7. > > My misbelief that from space and to space are always the same size. > Thought that -XX:+UseAdaptiveSizePolicy would change both of them in the > same manner. > > Thanks! > Thomas > > > Am 25.10.2012 14:23, schrieb Charlie Hunt: >> Hi Thomas, >> >> It is common for Parallel GC, and Parallel Old GC to adjust survivor sizes when -XX:+UseAdaptiveSizePolicy is enabled. And, -XX:+UseAdaptiveSizePolicy is enabled by default with either -XX:+UseParallelGC and -XX:+UseParallelOldGC, (iirc, the latter is not available in a Java 5 HotSpot VM). >> >> If you disable adaptive size policy, via -XX:-UseAdaptiveSizePolicy, survivor sizes should remain the same size. >> >> hths, >> >> charlie ... >> >> On Oct 25, 2012, at 3:47 AM, Thomas Rohde wrote: >> >>> Hi Folks, >>> >>> up to yesterday I always thought, that from-space and to-space have >>> always the same size. 
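To make the experiment Charlie describes concrete, the two runs would look roughly like the following; MyApp, the 512m young gen and the survivor ratio are placeholders, not values from this thread. With the first command the adaptive policy resizes the survivor spaces as it sees fit (watch them in VisualGC, or via -XX:+PrintGCDetails in the log); with the second they stay fixed, here at 64m each since 512m / (6 + 2) = 64m.

    java -XX:+UseParallelGC -XX:+PrintGCDetails -Xmn512m MyApp
    java -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=6 -Xmn512m -XX:+PrintGCDetails MyApp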
In a GC log of a colleague I saw the following and >>> was wondering about it: >>> >>> 22.10.2012 00:01:59 {Heap before gc invocations=590: >>> 22.10.2012 00:01:59 PSYoungGen total 72768K, used 70928K >>> [0xcdc00000, 0xd3800000, 0xf8800000) >>> 22.10.2012 00:01:59 eden space 54528K, 100% used >>> [0xcdc00000,0xd1140000,0xd1140000) >>> 22.10.2012 00:01:59 from space 18240K, 89% used >>> [0xd1460000,0xd2464168,0xd2630000) >>> 22.10.2012 00:01:59 to space 18048K, 0% used >>> [0xd2660000,0xd2660000,0xd3800000) >>> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334889K >>> [0x78400000, 0xcdc00000, 0xcdc00000) >>> 22.10.2012 00:01:59 object space 1400832K, 23% used >>> [0x78400000,0x8cb0a778,0xcdc00000) >>> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >>> [0x74400000, 0x75800000, 0x78400000) >>> 22.10.2012 00:01:59 object space 20480K, 97% used >>> [0x74400000,0x75787568,0x75800000) >>> 22.10.2012 00:01:59 299113.923: [GC [PSYoungGen: 70928K->15357K(72512K)] >>> 405818K->350331K(1473344K), 0.0717791 secs] >>> 22.10.2012 00:01:59 Heap after gc invocations=590: >>> 22.10.2012 00:01:59 PSYoungGen total 72512K, used 15357K >>> [0xcdc00000, 0xd3800000, 0xf8800000) >>> 22.10.2012 00:01:59 eden space 54464K, 0% used >>> [0xcdc00000,0xcdc00000,0xd1130000) >>> 22.10.2012 00:01:59 from space 18048K, 85% used >>> [0xd2660000,0xd355f7d8,0xd3800000) >>> 22.10.2012 00:01:59 to space 18112K, 0% used >>> [0xd14a0000,0xd14a0000,0xd2650000) >>> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334973K >>> [0x78400000, 0xcdc00000, 0xcdc00000) >>> 22.10.2012 00:01:59 object space 1400832K, 23% used >>> [0x78400000,0x8cb1f778,0xcdc00000) >>> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >>> [0x74400000, 0x75800000, 0x78400000) >>> 22.10.2012 00:01:59 object space 20480K, 97% used >>> [0x74400000,0x75787568,0x75800000) >>> 22.10.2012 00:01:59 } >>> >>> Before GC from space has 18240K and to space has 18048K. >>> After GC from space has 18048K and to space has 18112K. >>> >>> java version "1.5.0_20" >>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_20-b02) >>> Java HotSpot(TM) Server VM (build 1.5.0_20-b02, mixed mode) >>> >>> 1. Why is the size of from space and to space not equal? >>> 2. Why is the size always changing? >>> >>> Bye, >>> Thomas >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -- > Dipl.-Wirt. Inf. (FH) > Thomas Rohde > Senior Consultant > Architekturberatung > > ORDIX AG > Westernmauer 12 - 16 > 33098 Paderborn > > Tel: 05251 / 1063-0 > Mobil: 0163 / 6734966 > Fax: 0180 / 1673490 > mailto:tro at ordix.de > http://www.ordix.de > > ORDIX AG - Aktiengesellschaft f?r Softwareentwicklung, Schulung, > Beratung und Systemintegration > Vorsitzender des Aufsichtsrates: Prof. Dr. 
Hermann Johannes > Vorstand: Wolfgang K?gler (Vorsitzender), Benedikt Georgi, Christoph > Lafeld, Axel R?ber > Firmensitz: Westernmauer 12 - 16, 33098 Paderborn, Tel: 05251 / 1063-0, > Fax: 0180 / 1 67 34 90 > Amtsgericht Paderborn, HRB 2941, Ust-IdNr.DE 126333767, Steuernummer: > 339/5866/0142 > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From bernd.eckenfels at googlemail.com Sat Oct 27 20:51:13 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Sun, 28 Oct 2012 04:51:13 +0100 Subject: from space and to space size is different and varies In-Reply-To: References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> <5088FCB8.4070701@ordix.de> <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> <508A4AA0.1090709@ordix.de> Message-ID: Am 26.10.2012, 15:13 Uhr, schrieb Charlie Hunt : > Changing to/from sizes is by design with adaptive sizing enabled. Well resizin survivor spaces is the design goal, not the imbalanced size of the regions. But of course for growin or shrinkin it makes sense to have differences. Bernd From jon.masamitsu at oracle.com Mon Oct 29 07:46:24 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 29 Oct 2012 07:46:24 -0700 Subject: from space and to space size is different and varies In-Reply-To: <508A4AA0.1090709@ordix.de> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> <5088FCB8.4070701@ordix.de> <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> <508A4AA0.1090709@ordix.de> Message-ID: <508E96C0.5040004@oracle.com> On 10/26/2012 1:32 AM, Thomas Rohde wrote: > Hi Charlie, > > thanks for your reply. I could reproduce this with > -XX:+UseAdaptiveSizePolicy and Java 1.7. > > My misbelief that from space and to space are always the same size. > Thought that -XX:+UseAdaptiveSizePolicy would change both of them in the > same manner. UseAdaptiveSizePolicy changes them both at the same time if it can. Excuse the ascii art. +==========+ |unused area| +==========+ | from-space| +==========+ | to-space | +==========+ | eden | +==========+ The space for expansion is above from-space. from-space typically has live data in it. from-space can be expanded into the unused area but to-space cannot (the live data in from-space has to stay where it is). After the next collection the roles of from-space and to-space are switched and the "new" from-space (the former to-space) can be expanded into the unused area. This means it takes a collections or two for an expansion to take effect. It could have been done other ways. This is how it is currently done. Jon > Thanks! > Thomas > > > Am 25.10.2012 14:23, schrieb Charlie Hunt: >> Hi Thomas, >> >> It is common for Parallel GC, and Parallel Old GC to adjust survivor sizes when -XX:+UseAdaptiveSizePolicy is enabled. And, -XX:+UseAdaptiveSizePolicy is enabled by default with either -XX:+UseParallelGC and -XX:+UseParallelOldGC, (iirc, the latter is not available in a Java 5 HotSpot VM). >> >> If you disable adaptive size policy, via -XX:-UseAdaptiveSizePolicy, survivor sizes should remain the same size. >> >> hths, >> >> charlie ... >> >> On Oct 25, 2012, at 3:47 AM, Thomas Rohde wrote: >> >>> Hi Folks, >>> >>> up to yesterday I always thought, that from-space and to-space have >>> always the same size. 
In a GC log of a colleague I saw the following and >>> was wondering about it: >>> >>> 22.10.2012 00:01:59 {Heap before gc invocations=590: >>> 22.10.2012 00:01:59 PSYoungGen total 72768K, used 70928K >>> [0xcdc00000, 0xd3800000, 0xf8800000) >>> 22.10.2012 00:01:59 eden space 54528K, 100% used >>> [0xcdc00000,0xd1140000,0xd1140000) >>> 22.10.2012 00:01:59 from space 18240K, 89% used >>> [0xd1460000,0xd2464168,0xd2630000) >>> 22.10.2012 00:01:59 to space 18048K, 0% used >>> [0xd2660000,0xd2660000,0xd3800000) >>> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334889K >>> [0x78400000, 0xcdc00000, 0xcdc00000) >>> 22.10.2012 00:01:59 object space 1400832K, 23% used >>> [0x78400000,0x8cb0a778,0xcdc00000) >>> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >>> [0x74400000, 0x75800000, 0x78400000) >>> 22.10.2012 00:01:59 object space 20480K, 97% used >>> [0x74400000,0x75787568,0x75800000) >>> 22.10.2012 00:01:59 299113.923: [GC [PSYoungGen: 70928K->15357K(72512K)] >>> 405818K->350331K(1473344K), 0.0717791 secs] >>> 22.10.2012 00:01:59 Heap after gc invocations=590: >>> 22.10.2012 00:01:59 PSYoungGen total 72512K, used 15357K >>> [0xcdc00000, 0xd3800000, 0xf8800000) >>> 22.10.2012 00:01:59 eden space 54464K, 0% used >>> [0xcdc00000,0xcdc00000,0xd1130000) >>> 22.10.2012 00:01:59 from space 18048K, 85% used >>> [0xd2660000,0xd355f7d8,0xd3800000) >>> 22.10.2012 00:01:59 to space 18112K, 0% used >>> [0xd14a0000,0xd14a0000,0xd2650000) >>> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334973K >>> [0x78400000, 0xcdc00000, 0xcdc00000) >>> 22.10.2012 00:01:59 object space 1400832K, 23% used >>> [0x78400000,0x8cb1f778,0xcdc00000) >>> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >>> [0x74400000, 0x75800000, 0x78400000) >>> 22.10.2012 00:01:59 object space 20480K, 97% used >>> [0x74400000,0x75787568,0x75800000) >>> 22.10.2012 00:01:59 } >>> >>> Before GC from space has 18240K and to space has 18048K. >>> After GC from space has 18048K and to space has 18112K. >>> >>> java version "1.5.0_20" >>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_20-b02) >>> Java HotSpot(TM) Server VM (build 1.5.0_20-b02, mixed mode) >>> >>> 1. Why is the size of from space and to space not equal? >>> 2. Why is the size always changing? >>> >>> Bye, >>> Thomas >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From tro at ordix.de Mon Oct 29 09:11:31 2012 From: tro at ordix.de (Thomas Rohde) Date: Mon, 29 Oct 2012 17:11:31 +0100 Subject: from space and to space size is different and varies In-Reply-To: <508E96C0.5040004@oracle.com> References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> <725B363C-BDA7-492C-85D1-8CF706137583@salesforce.com> <5088FCB8.4070701@ordix.de> <357B56AA-DE53-4E2D-9E7B-E8714165B703@salesforce.com> <508A4AA0.1090709@ordix.de> <508E96C0.5040004@oracle.com> Message-ID: <508EAAB3.3060203@ordix.de> Hey together! Thanks for explaining. Bye, Thomas Am 29.10.2012 15:46, schrieb Jon Masamitsu: > > On 10/26/2012 1:32 AM, Thomas Rohde wrote: >> Hi Charlie, >> >> thanks for your reply. I could reproduce this with >> -XX:+UseAdaptiveSizePolicy and Java 1.7. >> >> My misbelief that from space and to space are always the same size. >> Thought that -XX:+UseAdaptiveSizePolicy would change both of them in the >> same manner. > UseAdaptiveSizePolicy changes them both at the same time if it > can. Excuse the ascii art. 
> > +==========+ > |unused area| > +==========+ > | from-space| > +==========+ > | to-space | > +==========+ > | eden | > +==========+ > > The space for expansion is above from-space. from-space > typically has live data in it. from-space can be expanded into > the unused area but to-space cannot (the live data in > from-space has to stay where it is). After the next collection > the roles of from-space and to-space are switched and > the "new" from-space (the former to-space) can be expanded into > the unused area. This means it takes a collections or two for an > expansion to > take effect. It could have been done other ways. This is how > it is currently done. > > Jon > > > >> Thanks! >> Thomas >> >> >> Am 25.10.2012 14:23, schrieb Charlie Hunt: >>> Hi Thomas, >>> >>> It is common for Parallel GC, and Parallel Old GC to adjust survivor sizes when -XX:+UseAdaptiveSizePolicy is enabled. And, -XX:+UseAdaptiveSizePolicy is enabled by default with either -XX:+UseParallelGC and -XX:+UseParallelOldGC, (iirc, the latter is not available in a Java 5 HotSpot VM). >>> >>> If you disable adaptive size policy, via -XX:-UseAdaptiveSizePolicy, survivor sizes should remain the same size. >>> >>> hths, >>> >>> charlie ... >>> >>> On Oct 25, 2012, at 3:47 AM, Thomas Rohde wrote: >>> >>>> Hi Folks, >>>> >>>> up to yesterday I always thought, that from-space and to-space have >>>> always the same size. In a GC log of a colleague I saw the following and >>>> was wondering about it: >>>> >>>> 22.10.2012 00:01:59 {Heap before gc invocations=590: >>>> 22.10.2012 00:01:59 PSYoungGen total 72768K, used 70928K >>>> [0xcdc00000, 0xd3800000, 0xf8800000) >>>> 22.10.2012 00:01:59 eden space 54528K, 100% used >>>> [0xcdc00000,0xd1140000,0xd1140000) >>>> 22.10.2012 00:01:59 from space 18240K, 89% used >>>> [0xd1460000,0xd2464168,0xd2630000) >>>> 22.10.2012 00:01:59 to space 18048K, 0% used >>>> [0xd2660000,0xd2660000,0xd3800000) >>>> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334889K >>>> [0x78400000, 0xcdc00000, 0xcdc00000) >>>> 22.10.2012 00:01:59 object space 1400832K, 23% used >>>> [0x78400000,0x8cb0a778,0xcdc00000) >>>> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >>>> [0x74400000, 0x75800000, 0x78400000) >>>> 22.10.2012 00:01:59 object space 20480K, 97% used >>>> [0x74400000,0x75787568,0x75800000) >>>> 22.10.2012 00:01:59 299113.923: [GC [PSYoungGen: 70928K->15357K(72512K)] >>>> 405818K->350331K(1473344K), 0.0717791 secs] >>>> 22.10.2012 00:01:59 Heap after gc invocations=590: >>>> 22.10.2012 00:01:59 PSYoungGen total 72512K, used 15357K >>>> [0xcdc00000, 0xd3800000, 0xf8800000) >>>> 22.10.2012 00:01:59 eden space 54464K, 0% used >>>> [0xcdc00000,0xcdc00000,0xd1130000) >>>> 22.10.2012 00:01:59 from space 18048K, 85% used >>>> [0xd2660000,0xd355f7d8,0xd3800000) >>>> 22.10.2012 00:01:59 to space 18112K, 0% used >>>> [0xd14a0000,0xd14a0000,0xd2650000) >>>> 22.10.2012 00:01:59 PSOldGen total 1400832K, used 334973K >>>> [0x78400000, 0xcdc00000, 0xcdc00000) >>>> 22.10.2012 00:01:59 object space 1400832K, 23% used >>>> [0x78400000,0x8cb1f778,0xcdc00000) >>>> 22.10.2012 00:01:59 PSPermGen total 20480K, used 19997K >>>> [0x74400000, 0x75800000, 0x78400000) >>>> 22.10.2012 00:01:59 object space 20480K, 97% used >>>> [0x74400000,0x75787568,0x75800000) >>>> 22.10.2012 00:01:59 } >>>> >>>> Before GC from space has 18240K and to space has 18048K. >>>> After GC from space has 18048K and to space has 18112K. 
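As a footnote to Jon's diagram and Charlie's VisualGC suggestion, the resizing can also be watched from inside the process with the standard java.lang.management API. The sketch below is illustrative only: the class name is made up, the allocation loop is just there to force young collections, and exactly how the survivor pool maps onto from/to space is a collector detail, but the pool's committed size does move around while the adaptive policy is resizing the survivors.

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.util.ArrayList;
    import java.util.List;

    public class WatchSurvivor {
        public static void main(String[] args) throws InterruptedException {
            List<byte[]> keep = new ArrayList<byte[]>();
            for (int i = 0; i < 10000000; i++) {
                keep.add(new byte[8 * 1024]);          // churn garbage to trigger young GCs
                if (keep.size() > 4096) keep.clear();  // let most of it die young
                if (i % 100000 == 0) {
                    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                        if (pool.getName().contains("Survivor")) {   // e.g. "PS Survivor Space"
                            System.out.println(pool.getName() + " committed = "
                                    + pool.getUsage().getCommitted() / 1024 + "K");
                        }
                    }
                    Thread.sleep(100);
                }
            }
        }
    }

Running it once with -XX:+UseParallelGC and once with -XX:-UseAdaptiveSizePolicy added shows the difference Charlie and Jon describe.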
>>>> >>>> java version "1.5.0_20" >>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_20-b02) >>>> Java HotSpot(TM) Server VM (build 1.5.0_20-b02, mixed mode) >>>> >>>> 1. Why is the size of from space and to space not equal? >>>> 2. Why is the size always changing? >>>> >>>> Bye, >>>> Thomas >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -- Dipl.-Wirt. Inf. (FH) Thomas Rohde Senior Consultant Architekturberatung ORDIX AG Westernmauer 12 - 16 33098 Paderborn Tel: 05251 / 1063-0 Mobil: 0163 / 6734966 Fax: 0180 / 1673490 mailto:tro at ordix.de http://www.ordix.de ORDIX AG - Aktiengesellschaft f?r Softwareentwicklung, Schulung, Beratung und Systemintegration Vorsitzender des Aufsichtsrates: Prof. Dr. Hermann Johannes Vorstand: Wolfgang K?gler (Vorsitzender), Benedikt Georgi, Christoph Lafeld, Axel R?ber Firmensitz: Westernmauer 12 - 16, 33098 Paderborn, Tel: 05251 / 1063-0, Fax: 0180 / 1 67 34 90 Amtsgericht Paderborn, HRB 2941, Ust-IdNr.DE 126333767, Steuernummer: 339/5866/0142 From ysr1729 at gmail.com Tue Oct 30 09:49:14 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 30 Oct 2012 09:49:14 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> Message-ID: Hi Peter, all -- On Fri, Oct 19, 2012 at 1:40 AM, Srinivas Ramakrishna wrote: > > > On Thu, Oct 18, 2012 at 5:27 PM, Peter B. Kessler < > Peter.B.Kessler at oracle.com> wrote: > >> When there's no room in the old generation and a worker has filled its >> PLAB to capacity, but it still has instances to try to promote, does it try >> to allocate a new PLAB, and fail? That would lead to each of the workers >> eventually failing to allocate a new PLAB for each promotion attempt. >> IIRC, PLAB allocation grabs a real lock (since it happens so rarely :-). >> In the promotion failure case, that lock could get incandescent. Maybe >> it's gone unnoticed because for modest young generations it doesn't stay >> hot enough for long enough for people to witness the supernova? Having a >> young generation the size you do would exacerbate the problem. If you have >> lots of workers, that would increase the amount of contention, too. >> > > Yes, that's exactly my thinking too. For the case of CMS, the PLAB's are > "local free block lists" and the allocation from the shared global pool is > even worse and more heavyweight than an atomic pointer bump, with a lock > protecting several layers of checks. > > >> >> PLAB allocation might be a place where you could put a test for having >> failed promotion, so just return null and let the worker self-loop this >> instance. That would keep the test off the fast-path (when things are >> going well). >> > > Yes, that's a good idea and might well be sufficient, and was also my > first thought. However, I also wonder about whether just moving the > promotion > failure test a volatile read into the fast path of the copy routine, and > immediately failing all subsequent copies after the first failure (and > indeed via the > global flag propagating that failure across all the workers immediately) > won't just be quicker without having added that much in the fast path. 
It > seems > that in that case we may be able to even avoid the self-looping and the > subsequent single-threaded fixup. The first thread that fails sets the > volatile > global, so any subsequent thread artificially fails all subsequent copies > of uncopied objects. Any object reference found pointing to an object in > Eden > or From space that hasn't yet been copied will call the copy routine which > will (artificially) fail and return the original address. > > I'll do some experiments and there may lurk devils in the details, but it > seems to me that this will work and be much more efficient in the > slow case, without making the fast path that much slower. > I got back to thinking about this issue with a view to implementing the second approach above, but ran into what's an obvious problem which should have occurred to me. We need a way of terminating the failure scan while at the same time making sure to update all references that point at old copies of objects that have already been copied. We can do one or the other easily, but not both at the same time without avoiding the linear walk of Eden and From spaces, which I had wanted to avoid on account of its single-threadedness. (1) The simplest termination protocol is that once each thread sees that promotion has failed, it does not attempt to copy the target object (if it has not yet been copied), but simply, returns the object from the copy routine. It also does not push the object on its work stack. Thus the workers will eventually terminate their scan of the card-table and of the objects on their work stacks after having updated any references to moved objects encountered during the remainder of the scan. This can however still leave a subset of the references of objects in the young generation that were not copied pointing to the old copies of objects that were copied. Fixing those would require a (linear) scan of Eden and From survivor spaces (which today is single-threaded, although it conceivably could be made multi-threaded by remembering TLAB allocation boundaries and using those as the parallel chunks of work). (2) If we didn't want to do the linear scan mentioned above, then at termination, we should have scanned all reachable objects in the young gen. In that case, to terminate the scan we would have to remember when the tracing had already scanned an object, so the trace would not need to process that object. Previously, this was done by looking at the location of the object (not in Eden or From Space) or at its header (a forwarding pointer). The use of a self-loop forwarding pointer for failed copies requires another pass to clear it which is the linear scan of Eden and From spaces that we wanted to avoid above. The self-looping also requires evacuation and the later rematerialization of the header contents during the scan, both of which carry additional expense. Even if we had a spare header bit -- which we well might in the 64-bit case, i think, although i haven't looked recently -- we would still need to clear that bit without doing a linear pass over Eden and From spaces. We could conceivably do so as part of the later full gc that is to follow shortly, but that would add at least some bulk via a test to the full gc closures. So it almost seems as though the later scan of Eden and From spaces is unavoidable, no matter what. Unless someone is able to think of a clever way of avoiding it while still quickly terminating the tracing after signalling global promotion failure. 
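To make the termination protocol sketched in (1) concrete, the control flow amounts to something like the following. This is only a schematic in Java form, not HotSpot code: tryCopy and pushOnWorkStack are invented stand-ins for the real PLAB copy and work-stealing queue, stubbed out only so the sketch compiles.

    public class PromotionFailureSketch {
        static volatile boolean promotionFailed = false;

        static Object copyObject(Object obj) {
            if (promotionFailed) {
                return obj;                // leave it in place; do not queue it for scanning
            }
            Object copy = tryCopy(obj);
            if (copy == null) {            // first failure seen by this worker
                promotionFailed = true;    // volatile write: every worker now fails fast
                return obj;
            }
            pushOnWorkStack(copy);
            return copy;
        }

        static Object tryCopy(Object obj)     { return null; }  // stub: pretend the copy failed
        static void pushOnWorkStack(Object o) { }               // stub
    }

The catch is exactly the one described above: terminating this way can still leave references in the uncopied young-gen objects pointing at the old copies of objects that did get copied, so some later fix-up pass over Eden and From space is still needed.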
Given that, my first though was to use the approach outlined in (1) where we terminate quickly, then do a linear walk of Eden and From and fix up any stale references that were left pointing at old copies of successfully copied objects. The advantage of that appears to be two-fold: firstly the algorithm terminates the trace very quickly and does not pay any time or space penalty of self-looping and header evacuation. The disadvantage however is that the linear walk might need to do much more work in scanning all objects and fixing their "stale" references, even when those objects may themselves be dead. To evaluate the rough relative tradeoff between the two schemes, one could do the following back of the envelope calculation: . Suppose that the object graph is random and that some small fraction s of young gen objects y survive a scavenge on average. . Then the work done in the linear walk of scheme (1) is O(y + e) where y is the total number of objects, dead or alive, in Eden and From space, and e is the total number of edges (references) in those objects. . In the case of scheme (2), the work done in the linear walk is O(y) where y is the total number of objects, dead or alive, in Eden and From space. The space used and the work done for evacuating the non-prototypical headers would be O(f.s.y) where s and y are as defined above, and f is the fraction of survivors that have non-prototypical headers. Also, the multiplicative constant factor in the big-O for scheme (1) seems to be much higher than for scheme (2): while scheme (2) merely examines the header of each object and fixes up the self-loop (or clears the header bit if using a space header bit), scheme (1) needs to iterate over the references in each object, examine the header of the target object of that reference and determine if the reference must be updated, and do so if needed. Thus although (1) may terminate the trace very quickly, the linear scan of Eden and From spaces for the "fix-up" phase is likely to be much more expensive. It could well be that (2) is the superior scheme under those circumstances, which would bring us back to self-looping (or its moral equivalent a header bit). The header bit avoids the time and space expense of the spooling of non-prototypical headers. If we optimized the acquisition of spooling space for headers, we might be almost on par with the header bit use and it would leave the fast path of normal scavenge unaffected as is the case today. (Use of the header bit would need an extra test.) After thinking over this, it appears as though the simplest route is to do what Peter mentioned earlier, which is to just have the allocation of the promotion buffers have fast-fail paths (after promotion failure) that do not take locks, and just cut down on that expense. Let me know if anyone has any comments/feedback/ideas: otherwise I'll go through the CMS (and ParallelOld) allocation paths and place "fast-fail gates" along the slow allocation path where we go to refill the local buffers. (Later, we can also see about the possibility of using a header bit to signal a failed copy, rather than a self-loop, and see if it fetches any gains by saving on header spooling while not affecting the fast path too much, and of multi-threading the "linear-scan" by using TLAB boundaries of the previous epoch. But those would be later optimizations.) -- ramki > >> >> I'm still guessing. > > > Your guesses are good, and very helpful, and I think we are on the right > track with this one as regards the cause of the slowdown. 
> > I'll update. > > -- ramki > > >> >> >> ... peter >> >> Srinivas Ramakrishna wrote: >> >>> System data show high context switching in vicinity of event and points >>> at the futile allocation bottleneck as a possible theory with some legs.... >>> >>> more later. >>> -- ramki >>> >>> On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna >> ysr1729 at gmail.com>> wrote: >>> >>> Thanks Peter... the possibility of paging or related issue of VM >>> system did occur to me, especially because system time shows up as >>> somewhat high here. The problem is that this server runs without >>> swap :-) so the time is going elsewhere. >>> >>> The cache miss theory is interesting (but would not show up as >>> system time), and your back of the envelope calculation gives about >>> 0.8 us for fetching a cache line, although i am pretty sure the >>> cache miss predictor would probably figure out the misses and stream >>> in the >>> cache lines since as you say we are going in address order). I'd >>> expect it to be no worse than when we do an "initial mark pause on a >>> full Eden", give or >>> take a little, and this is some 30 x worse. >>> >>> One possibility I am looking at is the part where we self-loop. I >>> suspect the ParNew/CMS combination running with multiple worker >>> threads >>> is hit hard here, if the failure happens very early say -- from what >>> i saw of that code recently, we don't consult the flag that says we >>> failed >>> so we should just return and self-loop. Rather we retry allocation >>> for each subsequent object, fail that and then do the self-loop. The >>> repeated >>> failed attempts might be adding up, especially since the access >>> involves looking at the shared pool. I'll look at how that is done, >>> and see if we can >>> do a fast fail after the first failure happens, rather than try and >>> do the rest of the scavenge, since we'll need to do a fixup anyway. >>> >>> thanks for the discussion and i'll update as and when i do some more >>> investigations. Keep those ideas coming, and I'll submit a bug >>> report once >>> i have spent a few more cycles looking at the available data and >>> ruminating. >>> >>> - ramki >>> >>> >>> On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler >>> >> >>> wrote: >>> >>> IIRC, promotion failure still has to finish the evacuation >>> attempt (and some objects may get promoted while the ones that >>> fail get self-looped). That part is the usual multi-threaded >>> object graph walk, with failed PLAB allocations thrown in to >>> slow you down. Then you get to start the pass that deals with >>> the self-loops, which you say is single-threaded. Undoing the >>> self-loops is in address order, but it walks by the object >>> sizes, so probably it mostly misses in the cache. 40GB at the >>> average object size (call them 40 bytes to make the math easy) >>> is a lot of cache misses. How fast is your memory system? >>> Probably faster than (10minutes / (40GB / 40bytes)) per cache >>> miss. >>> >>> Is it possible you are paging? Maybe not when things are >>> running smoothly, but maybe a 10 minute stall on one service >>> causes things to back up (and grow the heap of) other services >>> on the same machine? I'm guessing. >>> >>> ... peter >>> >>> Srinivas Ramakrishna wrote: >>> >>> >>> Has anyone come across extremely long (upwards of 10 >>> minutes) promotion failure unwinding scenarios when using >>> any of the collectors, but especially with ParNew/CMS? 
>>> I recently came across one such occurrence with ParNew/CMS >>> that, with a 40 GB young gen took upwards of 10 minutes to >>> "unwind". I looked through the code and I can see >>> that the unwinding steps can be a source of slowdown as we >>> iterate single-threaded (DefNew) through the large Eden to >>> fix up self-forwarded objects, but that still wouldn't >>> seem to explain such a large pause, even with a 40 GB young >>> gen. I am looking through the promotion failure paths to see >>> what might be the cause of such a large pause, >>> but if anyone has experienced this kind of scenario before >>> or has any conjectures or insights, I'd appreciate it. >>> >>> thanks! >>> -- ramki >>> >>> >>> ------------------------------** >>> __----------------------------**--__------------ >>> >>> ______________________________**___________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.__**net >>> >>> > >>> http://mail.openjdk.java.net/_** >>> _mailman/listinfo/hotspot-gc-_**_use >>> >> **use > >>> >>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121030/46ebba9b/attachment-0001.html From Peter.B.Kessler at Oracle.COM Tue Oct 30 13:04:55 2012 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Tue, 30 Oct 2012 13:04:55 -0700 Subject: Extremely long parnew/cms promotion failure scenario? In-Reply-To: References: <50806485.8080808@Oracle.COM> <50809E6D.6020704@Oracle.COM> Message-ID: <509032E7.6010408@Oracle.COM> Some comments inline. ... peter Srinivas Ramakrishna wrote: > Hi Peter, all -- > > On Fri, Oct 19, 2012 at 1:40 AM, Srinivas Ramakrishna > > wrote: > > > > On Thu, Oct 18, 2012 at 5:27 PM, Peter B. Kessler > > > wrote: > > When there's no room in the old generation and a worker has > filled its PLAB to capacity, but it still has instances to try > to promote, does it try to allocate a new PLAB, and fail? > That would lead to each of the workers eventually failing to > allocate a new PLAB for each promotion attempt. IIRC, PLAB > allocation grabs a real lock (since it happens so rarely :-). > In the promotion failure case, that lock could get > incandescent. Maybe it's gone unnoticed because for modest > young generations it doesn't stay hot enough for long enough > for people to witness the supernova? Having a young > generation the size you do would exacerbate the problem. If > you have lots of workers, that would increase the amount of > contention, too. > > > Yes, that's exactly my thinking too. For the case of CMS, the > PLAB's are "local free block lists" and the allocation from the > shared global pool is > even worse and more heavyweight than an atomic pointer bump, with > a lock protecting several layers of checks. > > > > PLAB allocation might be a place where you could put a test > for having failed promotion, so just return null and let the > worker self-loop this instance. That would keep the test off > the fast-path (when things are going well). > > > Yes, that's a good idea and might well be sufficient, and was also > my first thought. However, I also wonder about whether just moving > the promotion > failure test a volatile read into the fast path of the copy > routine, and immediately failing all subsequent copies after the > first failure (and indeed via the > global flag propagating that failure across all the workers > immediately) won't just be quicker without having added that much > in the fast path. 
It seems > that in that case we may be able to even avoid the self-looping > and the subsequent single-threaded fixup. The first thread that > fails sets the volatile > global, so any subsequent thread artificially fails all subsequent > copies of uncopied objects. Any object reference found pointing to > an object in Eden > or From space that hasn't yet been copied will call the copy > routine which will (artificially) fail and return the original > address. > > I'll do some experiments and there may lurk devils in the details, > but it seems to me that this will work and be much more efficient > in the > slow case, without making the fast path that much slower. > > > I got back to thinking about this issue with a view to implementing > the second approach above, but ran into what's an obvious problem > which should > have occurred to me. We need a way of terminating the failure scan > while at the same time making sure to update all references that point > at old copies > of objects that have already been copied. > > We can do one or the other easily, but not both at the same time > without avoiding the linear walk of Eden and From spaces, which I had > wanted to avoid > on account of its single-threadedness. > > (1) The simplest termination protocol is that once each thread sees > that promotion has failed, it does not attempt to copy the target > object (if it has not yet > been copied), but simply, returns the object from the copy > routine. It also does not push the object on its work stack. Thus the > workers will eventually > terminate their scan of the card-table and of the objects on > their work stacks after having updated any references to moved objects > encountered during > the remainder of the scan. This can however still leave a subset > of the references of objects in the young generation that were not > copied pointing to > the old copies of objects that were copied. Fixing those would > require a (linear) scan of Eden and From survivor spaces (which today > is single-threaded, > although it conceivably could be made multi-threaded by > remembering TLAB allocation boundaries and using those as the parallel > chunks of work). Not having block-offset tables for the eden and survivor spaces make it hard to parallelize that walk. Remembering TLAB boundaries (as object starts, as you suggest) is a good idea, except if humongous objects are allocated directly in the eden (or from) without the benefit of a TLAB allocation. But that doesn't seem that hard to work into a scheme that remembers TLAB boundaries, since it happens infrequently. (Don't forget the To-space: it may also have objects that were copied to it, but which won't, in one of your schemes, necessarily have their contents updated.) > > (2) If we didn't want to do the linear scan mentioned above, then at > termination, we should have scanned all reachable objects in the young > gen. In that > case, to terminate the scan we would have to remember when the > tracing had already scanned an object, so the trace would not need to > process that > object. Previously, this was done by looking at the location of > the object (not in Eden or From Space) or at its header (a forwarding > pointer). The use of > a self-loop forwarding pointer for failed copies requires > another pass to clear it which is the linear scan of Eden and From > spaces that we wanted to > avoid above. 
The self-looping also requires evacuation and the > later rematerialization of the header contents during the scan, both > of which carry > additional expense. Even if we had a spare header bit -- which > we well might in the 64-bit case, i think, although i haven't looked > recently -- we would > still need to clear that bit without doing a linear pass over > Eden and From spaces. We could conceivably do so as part of the later > full gc that is to follow > shortly, but that would add at least some bulk via a test to the > full gc closures. Non-prototypical headers have to be evacuated and rematerialized any time you want to install forwarding pointers. Worrying about them might be a red herring. It would be interesting to have some numbers on how frequent they are ("It depends.") and how much time and space it takes to displace them. > > So it almost seems as though the later scan of Eden and From spaces is > unavoidable, no matter what. Unless someone is able to think of a clever > way of avoiding it while still quickly terminating the tracing after > signalling global promotion failure. If you can find two bits instead of just one, you could have an epoch counter (modulo 3) that would tell you whether you need to look at any particular object during this collection, or reset the bits. Does that help? I'm not sure what you are using these bits for. Are you expecting back-to-back promotion failures? Even so, there would be a full re-allocation of the eden in between, which would reset the bits in the newly-allocated objects. Keeping the bits outside of the heap would let you clear them more efficiently, at the cost of an additional memory stall whenever you had to read both the bit for an object and the object itself. If you just read in address-order through the bits and through memory, maybe the prefetchers (software and hardware) could be made to work. I don't like having to look at dead objects in the young generation, because there are so many of them. (Maybe especially in a 40GB young generation!) But I also can't see a way around it (yet :-). At least in the parallel collector, one point of the full young generation scan is to change the klass for (runs of) dead objects to be, if I recall correctly, int[], so the full collection that's coming doesn't have to (redundantly) look at the individual dead objects, and doesn't have to think about whether there are any interior pointers in that region. Does it help to know that the immediately following full collection is going to iterate over all the objects in the heap, dead or alive? Tangling the code for the old generation collector with the code for the young generation collector doesn't sound like a good idea. ... peter > > Given that, my first though was to use the approach outlined in (1) > where we terminate quickly, then do a linear walk of Eden and From and > fix up any stale > references that were left pointing at old copies of successfully > copied objects. The advantage of that appears to be two-fold: firstly > the algorithm > terminates the trace very quickly and does not pay any time or space > penalty of self-looping and header evacuation. The disadvantage however is > that the linear walk might need to do much more work in scanning all > objects and fixing their "stale" references, even when those objects may > themselves be dead. > > To evaluate the rough relative tradeoff between the two schemes, one > could do the following back of the envelope calculation: > > . 
Suppose that the object graph is random and that some small fraction > s of young gen objects y survive a scavenge on average. > . Then the work done in the linear walk of scheme (1) is O(y + e) > where y is the total number of objects, dead or alive, in Eden and > From space, > and e is the total number of edges (references) in those objects. > . In the case of scheme (2), the work done in the linear walk is O(y) > where y is the total number of objects, dead or alive, in Eden and > From space. The > space used and the work done for evacuating the non-prototypical > headers would be O(f.s.y) where s and y are as defined above, and f is > the fraction > of survivors that have non-prototypical headers. > > Also, the multiplicative constant factor in the big-O for scheme (1) > seems to be much higher than for scheme (2): while scheme (2) merely > examines > the header of each object and fixes up the self-loop (or clears the > header bit if using a space header bit), scheme (1) needs to iterate > over the references > in each object, examine the header of the target object of that > reference and determine if the reference must be updated, and do so if > needed. > > Thus although (1) may terminate the trace very quickly, the linear > scan of Eden and From spaces for the "fix-up" phase is likely to be > much more > expensive. > > It could well be that (2) is the superior scheme under those > circumstances, which would bring us back to self-looping (or its moral > equivalent a header bit). > The header bit avoids the time and space expense of the spooling of > non-prototypical headers. If we optimized the acquisition of spooling > space for > headers, we might be almost on par with the header bit use and it > would leave the fast path of normal scavenge unaffected as is the case > today. > (Use of the header bit would need an extra test.) > > After thinking over this, it appears as though the simplest route is > to do what Peter mentioned earlier, which is to just have the > allocation of the promotion > buffers have fast-fail paths (after promotion failure) that do not > take locks, and just cut down on that expense. > > Let me know if anyone has any comments/feedback/ideas: otherwise I'll > go through the CMS (and ParallelOld) allocation paths and place > "fast-fail gates" along the slow allocation path where we go to refill > the local buffers. (Later, we can also see about the possibility of > using a header bit > to signal a failed copy, rather than a self-loop, and see if it > fetches any gains by saving on header spooling while not affecting the > fast path too much, > and of multi-threading the "linear-scan" by using TLAB boundaries of > the previous epoch. But those would be later optimizations.) > > -- ramki > > > > > I'm still guessing. > > > Your guesses are good, and very helpful, and I think we are on the > right track with this one as regards the cause of the slowdown. > > I'll update. > > -- ramki > > > > > ... peter > > Srinivas Ramakrishna wrote: > > System data show high context switching in vicinity of > event and points at the futile allocation bottleneck as a > possible theory with some legs.... > > more later. > -- ramki > > On Thu, Oct 18, 2012 at 3:47 PM, Srinivas Ramakrishna > > >> wrote: > > Thanks Peter... the possibility of paging or related > issue of VM > system did occur to me, especially because system time > shows up as > somewhat high here. The problem is that this server > runs without > swap :-) so the time is going elsewhere. 
> > The cache miss theory is interesting (but would not > show up as > system time), and your back of the envelope > calculation gives about > 0.8 us for fetching a cache line, although i am pretty > sure the > cache miss predictor would probably figure out the > misses and stream > in the > cache lines since as you say we are going in address > order). I'd > expect it to be no worse than when we do an "initial > mark pause on a > full Eden", give or > take a little, and this is some 30 x worse. > > One possibility I am looking at is the part where we > self-loop. I > suspect the ParNew/CMS combination running with > multiple worker threads > is hit hard here, if the failure happens very early > say -- from what > i saw of that code recently, we don't consult the flag > that says we > failed > so we should just return and self-loop. Rather we > retry allocation > for each subsequent object, fail that and then do the > self-loop. The > repeated > failed attempts might be adding up, especially since > the access > involves looking at the shared pool. I'll look at how > that is done, > and see if we can > do a fast fail after the first failure happens, rather > than try and > do the rest of the scavenge, since we'll need to do a > fixup anyway. > > thanks for the discussion and i'll update as and when > i do some more > investigations. Keep those ideas coming, and I'll > submit a bug > report once > i have spent a few more cycles looking at the > available data and > ruminating. > > - ramki > > > On Thu, Oct 18, 2012 at 1:20 PM, Peter B. Kessler > > >> wrote: > > IIRC, promotion failure still has to finish the > evacuation > attempt (and some objects may get promoted while > the ones that > fail get self-looped). That part is the usual > multi-threaded > object graph walk, with failed PLAB allocations > thrown in to > slow you down. Then you get to start the pass > that deals with > the self-loops, which you say is single-threaded. > Undoing the > self-loops is in address order, but it walks by > the object > sizes, so probably it mostly misses in the cache. > 40GB at the > average object size (call them 40 bytes to make > the math easy) > is a lot of cache misses. How fast is your memory > system? > Probably faster than (10minutes / (40GB / > 40bytes)) per cache miss. > > Is it possible you are paging? Maybe not when > things are > running smoothly, but maybe a 10 minute stall on > one service > causes things to back up (and grow the heap of) > other services > on the same machine? I'm guessing. > > ... peter > > Srinivas Ramakrishna wrote: > > > Has anyone come across extremely long (upwards > of 10 > minutes) promotion failure unwinding scenarios > when using > any of the collectors, but especially with > ParNew/CMS? > I recently came across one such occurrence > with ParNew/CMS > that, with a 40 GB young gen took upwards of > 10 minutes to > "unwind". I looked through the code and I can see > that the unwinding steps can be a source of > slowdown as we > iterate single-threaded (DefNew) through the > large Eden to > fix up self-forwarded objects, but that still > wouldn't > seem to explain such a large pause, even with > a 40 GB young > gen. I am looking through the promotion > failure paths to see > what might be the cause of such a large pause, > but if anyone has experienced this kind of > scenario before > or has any conjectures or insights, I'd > appreciate it. > > thanks! 
> -- ramki > > > > ------------------------------__------------------------------__------------ > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121030/9c32274c/attachment-0001.html
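As a closing illustration of the "fast-fail gate" that Peter suggested earlier in the thread and that Ramki settles on above: once promotion has failed, the slow path that refills a worker's promotion buffer can bail out before it ever touches the shared pool and its lock, which is where the contention was suspected to come from. Again a schematic in Java form with invented names, not the actual CMS or ParallelOld allocation code.

    public class PlabRefillSketch {
        static volatile boolean promotionFailed = false;
        static final Object sharedPoolLock = new Object();

        // Called on the slow path when a worker's local promotion buffer is exhausted.
        // Returns 0 to mean "no buffer available"; the caller then treats the copy as failed.
        static long refillPlab(long wordsNeeded) {
            if (promotionFailed) {
                return 0L;                     // gate: skip the shared pool and its lock entirely
            }
            synchronized (sharedPoolLock) {    // stand-in for the heavyweight shared-pool lock
                long chunk = allocateFromSharedPool(wordsNeeded);   // invented helper
                if (chunk == 0L) {
                    promotionFailed = true;    // remember the failure for every later attempt
                }
                return chunk;
            }
        }

        static long allocateFromSharedPool(long wordsNeeded) { return 0L; }  // stub: old gen full
    }

The design point from the discussion is the trade-off between the two placements: a gate here stays entirely off the fast path, as Peter preferred, while Ramki's alternative of a volatile check at the top of the copy routine fails faster at the cost of one extra read per copied object.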