RFR: 8217170: gc/arguments/TestUseCompressedOopsErgo.java timed out

Kim Barrett kim.barrett at oracle.com
Tue Jul 9 21:54:19 UTC 2019


> On Jul 9, 2019, at 1:18 PM, Tony Printezis <tprintezis at twitter.com> wrote:
> 
> Hi Kim,
> 
> I’ll be happy to try troubleshooting this. It’s clearly of interest to us as we do use CMS and it’d be nice to get the tests to finish more quickly. I tried reproducing the 20-second durations for the child processes with CMS but I was not able to. They seem to have more or less the same duration as with the other GCs. Do you remember which tests you saw this issue in? Also, what type of machine did you run on (“small” laptop, “larger” workstation, etc.)?

I'm in the middle of a couple of other things right now, so would be
really happy to have someone else poke at this for a bit.

I filed JDK-8227414 for followup investigation.

Markus Gronlund recently attached a hacky patch (windows_process_abort.patch)
to JDK-8217170 to help with the investigation.  That probably ought to get
copied to JDK-8227414.

The test I've mostly been looking at is TestUseCompressedOopsErgo,
which runs over a dozen subprocesses.

There's a fair amount of information in JDK-8217170 comments, but I'll
try to summarize some of it here to save you from lots of scrolling.

The machines where I've been able to fairly reliably provoke the
problem are a couple of Windows VMs in our build&test farm. They are
configured with 8 cores and 60G of memory. They are known to be
"slow", so we don't use them for builds, only for tests. jtreg test
concurrency is 4, I think. Because they are VMs, the hypervisor might
be running other VMs that are also running tests. (Indeed, those two
machines are on the same hypervisor.) Some of the issue may be related
to load on the VM and/or hypervisor; at least, load seems to be useful
in provocation.

I've seen the problem to a similar or lesser degree on other Windows
machines: the reported waitFor time is unusually long (at least
seconds, where it is usually well under a second). There's a bit of
logging in the .jtr files to help spot that, added by JDK-8219149.
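
For anyone wanting to check for slow subprocess completion outside
jtreg, here is a minimal sketch. It is not the test library's logging
from JDK-8219149; the class name WaitForTiming, the Timing holder, and
the choice of "java -version" as the child command are all assumptions
made for illustration. It only times Process.start() and
Process.waitFor() for a dozen short-lived child JVMs.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical standalone helper; not part of the JDK test library.
public class WaitForTiming {

    /** Exit code plus wall-clock milliseconds spent in start() and waitFor(). */
    public static final class Timing {
        public final int exitCode;
        public final long startMillis;
        public final long waitForMillis;

        Timing(int exitCode, long startMillis, long waitForMillis) {
            this.exitCode = exitCode;
            this.startMillis = startMillis;
            this.waitForMillis = waitForMillis;
        }
    }

    /** Launches the command, discards its output, and times start() and waitFor(). */
    public static Timing time(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Discard child output so a full pipe buffer cannot stall the child.
        pb.redirectErrorStream(true);
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);

        long t0 = System.nanoTime();
        Process p = pb.start();
        long t1 = System.nanoTime();
        int exitCode = p.waitFor();   // includes the child's own run time
        long t2 = System.nanoTime();

        return new Timing(exitCode,
                          TimeUnit.NANOSECONDS.toMillis(t1 - t0),
                          TimeUnit.NANOSECONDS.toMillis(t2 - t1));
    }

    public static void main(String[] args) throws Exception {
        // Use the JVM that is running this program as the child JVM.
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        // Assumption: a dozen trivial children, loosely mirroring a test
        // that runs over a dozen subprocesses. Add GC flags as needed.
        for (int i = 0; i < 12; i++) {
            Timing t = time(List.of(java, "-version"));
            System.out.println("child " + i
                    + ": exit=" + t.exitCode
                    + " start=" + t.startMillis + "ms"
                    + " waitFor=" + t.waitForMillis + "ms");
        }
    }
}
```

Note that the waitFor figure here covers the child's whole run, so it
is only useful for comparing runs against each other on the same
machine, not as an absolute measure of process-exit overhead.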

I *think* I've seen the problem to a lesser degree on Linux.  I've not
looked at Mac or Solaris at all.

Something I find puzzling: in all the runs of
TestUseCompressedOopsErgo that I've looked at, I think either all of
the subprocesses were quick, or all were multi-second slow.

> BTW, meant to say along with my review: I tried your patch locally and it shaved off around 35% of the elapsed time for the gc/arguments tests on my workstation. So a nice improvement! Thanks!

Oh, that’s nice!  I hadn’t measured that.




More information about the hotspot-gc-dev mailing list