Long pauses with parallel collector

Karl Rosenbaum kalle at rosenbaum.se
Tue Mar 18 14:23:29 UTC 2008


Hi

We're experiencing long pauses (>60 seconds) using the parallel 
collector and the serial compacting old generation collector (the 
default choice on a server-class machine, right?). Unfortunately we have 
no logs from the gc, since this is a customer's installation and we 
can't play around with that. But I do have memory statistics along with 
response time statistics. Before the problems started we didn't really 
notice the collections but recently the collections took more than 60 
seconds. This is measured using wget on a "ping"-servlet that returns 
some memory statistics. The RESPTIME is measured by timing the 
wget-command. FREE MEM, MEM TOTAL and MEM ALLOCATED are the results of

                 long freeMem = Runtime.getRuntime().freeMemory();
                 long totalMem = Runtime.getRuntime().totalMemory();
                 long allocMem = Runtime.getRuntime().maxMemory();


This is an example of the statistics before the problems started:

TIME		FREE MEM	MEM TOTAL	MEM ALLOCATED	RESPTIME
10:59:49	1499233752	12724273152	12724273152	0.20
10:59:59	1431576712	12724273152	12724273152	0.16
11:00:09	1392930432	12724273152	12724273152	0.14
11:00:19	1385974568	12724273152	12724273152	0.13
11:00:30	1365510896	12724273152	12724273152	0.13
11:00:40	1248830048	12724273152	12724273152	0.19
11:00:50	1164298568	12724273152	12724273152	0.13
11:01:00	1122874896	12724273152	12724273152	0.15
11:01:10	1085027216	12724273152	12724273152	0.13
11:01:20	952254960	12724273152	12724273152	0.14
11:01:31	385638352	12724273152	12724273152	0.30
11:01:41	3905940544	12717326336	12717326336	0.13
11:01:51	3286915952	12717326336	12717326336	0.34
11:02:01	1862466680	12717326336	12717326336	0.17
11:02:12	1214561176	12717326336	12717326336	0.13
11:02:22	937846336	12717326336	12717326336	0.13
11:02:32	830626208	12717326336	12717326336	0.13
11:02:42	591847776	12717326336	12717326336	0.13
11:02:52	486858808	12717326336	12717326336	0.15
11:03:02	405118840	12717326336	12717326336	0.12
11:03:12	379997776	12717326336	12717326336	0.17
11:03:23	4231164944	12723027968	12723027968	0.13
11:03:33	2869501088	12723027968	12723027968	0.14
11:03:43	2848014968	12723027968	12723027968	0.15
11:03:53	2789025152	12723027968	12723027968	0.13
11:04:03	2141846392	12723027968	12723027968	0.14
11:04:13	1936386592	12723027968	12723027968	0.13
11:04:24	1922345160	12723027968	12723027968	0.13
11:04:34	1907951992	12723027968	12723027968	0.14
11:04:44	1884581280	12723027968	12723027968	0.15
11:04:54	1842599720	12723027968	12723027968	0.14
11:05:04	1833357144	12723027968	12723027968	0.14
11:05:14	819218528	12723027968	12723027968	0.15
11:05:25	766194152	12723027968	12723027968	0.14
11:05:35	738061968	12723027968	12723027968	0.14
11:05:45	716541120	12723027968	12723027968	0.13
11:05:55	690620328	12723027968	12723027968	0.13
11:06:05	601896808	12723027968	12723027968	0.13
11:06:15	3986587288	12721192960	12721192960	0.19
11:06:26	3981820120	12721192960	12721192960	0.19
11:06:36	3908686896	12721192960	12721192960	0.13
11:06:46	3896673928	12721192960	12721192960	0.13


This is an example of the statistics after the problems started:

TIME		FREE MEM	MEM TOTAL	MEM ALLOCATED	RESPTIME
15:26:13	132104472	12878741504	12878741504	0.11
15:26:23	89372112	12878741504	12878741504	0.21
15:26:33	61732520	12878741504	12878741504	0.12
15:26:43	15401280	12878741504	12878741504	12.14
15:27:05	4074793624	12878741504	12878741504	64.57
15:28:20	3378764376	12878741504	12878741504	0.12
15:28:30	2858626704	12878741504	12878741504	0.11
15:28:40	491173584	12878741504	12878741504	0.11
15:28:50	414802056	12878741504	12878741504	0.92
15:29:01	298133928	12878741504	12878741504	0.20
15:29:12	4164253720	12878741504	12878741504	64.36
15:30:26	3662076152	12878741504	12878741504	0.15
15:30:36	1823630328	12878741504	12878741504	0.11
15:30:46	1338510584	12878741504	12878741504	0.12
15:30:56	505714912	12878741504	12878741504	0.11
15:31:07	320903520	12878741504	12878741504	0.11
15:31:17	4179464600	12878741504	12878741504	71.62
15:32:38	3001104680	12878741504	12878741504	0.47
15:32:49	2593399336	12878741504	12878741504	0.13
15:32:59	2478360368	12878741504	12878741504	0.11
15:33:09	2391502968	12878741504	12878741504	0.11
15:33:19	785466304	12878741504	12878741504	0.13
15:33:29	4004574384	12878741504	12878741504	62.87
15:34:42	2643636392	12878741504	12878741504	0.11
15:34:52	2387674096	12878741504	12878741504	0.15
15:35:03	2226430872	12878741504	12878741504	0.11
15:35:13	1016694456	12878741504	12878741504	0.11
15:35:23	728910648	12878741504	12878741504	0.11
15:35:33	238095528	12878741504	12878741504	7.11
15:35:50	215109328	12878741504	12878741504	0.11
15:36:00	4025070120	12878741504	12878741504	69.14
15:37:19	2294767768	12878741504	12878741504	0.12
15:37:30	1897269848	12878741504	12878741504	0.11
15:37:40	1832375800	12878741504	12878741504	0.30
15:37:50	1741032976	12878741504	12878741504	0.11
15:38:00	1641480032	12878741504	12878741504	0.18
15:38:10	774869968	12878741504	12878741504	7.20
15:38:28	634489992	12878741504	12878741504	0.10
15:38:38	274370880	12878741504	12878741504	3.11
15:38:51	68896328	12878741504	12878741504	0.11
15:39:01	31211120	12878741504	12878741504	0.15
15:39:11	3997617032	12878741504	12878741504	60.77

As you can see, every collection results in a >60 seconds pause time. 
The server have been restarted once between the two log snippets and 
they are from different dates. The problem lasted indefinitely and all 
collections seem to take this long.

We have now applied the CMS collector to "kill" these symptoms, but we 
still need to understand what was going on here.

Do you have any suggestions on what the cause of this might be?

The server is a Solaris 10 sparc machine with 16GB memory and 4 cpus. 
The java version is 1.5.0_11

These are the java parameters:
root at fp # pargs 10502

10502:  /usr/jdk/instances/jdk1.5.0/bin/sparcv9/java -Xmx12288m 
-Dmx=12288 -Djava.awt.h
argv[0]: /usr/jdk/instances/jdk1.5.0/bin/sparcv9/java
argv[1]: -Xmx12288m
argv[2]: -Dmx=12288
argv[3]: -Djava.awt.headless=true
argv[4]: -Dfile.encoding=UTF-8
argv[5]: -Djava.library.path=/opt/FOCAfp/lib
argv[6]: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
argv[7]: -Djava.endorsed.dirs=/opt/FOCAtomcat/common/endorsed
argv[8]: -classpath
argv[9]: 
:/opt/FOCAtomcat/bin/bootstrap.jar:/opt/FOCAtomcat/bin/commons-logging-api.jar
argv[10]: -Dcatalina.base=/opt/FOCAtomcat
argv[11]: -Dcatalina.home=/opt/FOCAtomcat
argv[12]: -Djava.io.tmpdir=/opt/FOCAtomcat/temp
argv[13]: org.apache.catalina.startup.Bootstrap
argv[14]: start


Best regards,
Karl Rosenbaum
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use



More information about the hotspot-gc-dev mailing list