<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
John and Uwe,<br>
<br>
I followed the original instruction sent out by Uwe to reproduce
the test. I got it up and running on my Windows x64 workstation
using a 32 bit binary. The test hangs every time I run it.<br>
<br>
John, I think your proxy issues are due to the fact that ant picks
up its proxy setting from Java. So you need to set the system
properties http.proxyHost and http.proxyPort. I did this by
exporting the _JAVA_OPTIONS environment variable as:<br>
<br>
_JAVA_OPTIONS=-Dhttp.proxyHost=&lt;oracle www proxy&gt;
-Dhttp.proxyPort=&lt;oracle proxy port&gt;<br>
<br>
Let me know if this does not work for you. We can try to debug it
offline.<br>
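<br>
As a minimal sketch for a bash-like shell, the export could look like
the following. The host proxy.example.com and the port 8080 are
placeholder values assumed purely for illustration, so substitute the
real proxy host and port. In csh or tcsh, setenv would be used instead
of export.<br>
<pre>
# Placeholder proxy host and port (assumptions, not real settings).
PROXY_HOST=proxy.example.com
PROXY_PORT=8080
# HotSpot JVMs started from this shell, including the one running ant,
# read _JAVA_OPTIONS and apply the listed system properties.
export _JAVA_OPTIONS="-Dhttp.proxyHost=${PROXY_HOST} -Dhttp.proxyPort=${PROXY_PORT}"
# Print the value so the setting can be checked before running ant.
echo "${_JAVA_OPTIONS}"
</pre>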
<br>
Since I could catch the hang in a debugger I could confirm both
that the hang is indeed related to the recent change to the
DrainMarkingStackClosures and that the problem is that we enter
the termination protocol even when reference processing is single
threaded.<br>
<br>
Looking at the comment in the constructor for
G1CMDrainMarkingStackClosure:<br>
<br>
// We only allow stealing and only enter the termination
protocol<br>
// in CMTask::do_marking_step() if this closure is being
instantiated<br>
// for parallel reference processing.<br>
_do_stealing = _do_termination = is_par;<br>
<br>
I came up with a patch that makes the test work again. But I leave
it to you, John, to figure out if this is the right way to solve
the problem.<br>
<br>
diff --git a/src/share/vm/gc_implementation/g1/concurrentMark.cpp
b/src/share/vm/gc_implementation/g1/concurrentMark.cpp<br>
--- a/src/share/vm/gc_implementation/g1/concurrentMark.cpp<br>
+++ b/src/share/vm/gc_implementation/g1/concurrentMark.cpp<br>
@@ -4336,7 +4336,9 @@<br>
gclog_or_tty->print_cr("[%u] detected overflow",
_worker_id);<br>
}<br>
<br>
+ if (do_stealing || do_termination) {<br>
_cm->enter_first_sync_barrier(_worker_id);<br>
+ }<br>
// When we exit this sync barrier we know that all tasks
have<br>
// stopped doing marking work. So, it's now safe to<br>
// re-initialise our data structures. At the end of this
method,<br>
@@ -4347,8 +4349,10 @@<br>
// We clear the local state of this task...<br>
clear_region_fields();<br>
<br>
+ if (do_stealing || do_termination) {<br>
// ...and enter the second barrier.<br>
_cm->enter_second_sync_barrier(_worker_id);<br>
+ }<br>
// At this point everything has bee re-initialised and
we're<br>
// ready to restart.<br>
} <br>
<br>
<br>
Thanks,<br>
Bengt<br>
<br>
On 3/7/13 7:44 AM, Uwe Schindler wrote:<br>
</div>
<blockquote cite="mid:00f601ce1aff$27a76eb0$76f64c10$@apache.org"
type="cite">
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US">Hi John,<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US">I only have time to work on a setup this
evening German time, because I am on a business trip today.
I will come back to you. Unfortunately, I failed to quickly
set up an easy classpath without Ivy downloading the JARs. <o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US">Uwe<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US">-----<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US">Uwe Schindler<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US"><a class="moz-txt-link-abbreviated" href="mailto:uschindler@apache.org">uschindler@apache.org</a> <o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"
lang="EN-US">Apache Lucene PMC Member / Committer<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A">Bremen,
Germany<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"><a class="moz-txt-link-freetext" href="http://lucene.apache.org/">http://lucene.apache.org/</a><o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#44546A"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm
0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF
1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext">From:</span></b><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext">
John Cuthbertson [<a class="moz-txt-link-freetext" href="mailto:john.cuthbertson@oracle.com">mailto:john.cuthbertson@oracle.com</a>]
<br>
<b>Sent:</b> Thursday, March 07, 2013 12:49 AM<br>
<b>To:</b> Uwe Schindler<br>
<b>Cc:</b> 'Bengt Rutisson';
<a class="moz-txt-link-abbreviated" href="mailto:hotspot-gc-dev@openjdk.java.net">hotspot-gc-dev@openjdk.java.net</a>; <a class="moz-txt-link-abbreviated" href="mailto:dev@lucene.apache.org">dev@lucene.apache.org</a><br>
<b>Subject:</b> Re: JVM hanging when using G1GC on
JDK8 b78 or b79 (Linux 32 bit)<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi Uwe,<br>
<br>
An update:<br>
<br>
I have downloaded ant and the Lucene source.<br>
<br>
I attempted the ivy-bootstrap but it failed to download the
ivy-2.3.0.jar file - even after setting:<br>
<br>
ANT_OPTS=-Dhttp.proxyHost=<...>
-Dhttp.proxyPort=<...><br>
<br>
So I manually downloaded and placed it into the ANT library
and now get:<br>
<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal">ivy-bootstrap1:<br>
[mkdir] Skipping /home/jcuthber/.ant/lib because it
already exists.<br>
[echo] installing ivy 2.3.0 to /home/jcuthber/.ant/lib<br>
[get] Getting: <a moz-do-not-send="true"
href="http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar">http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar</a><br>
[get] To: /home/jcuthber/.ant/lib/ivy-2.3.0.jar<br>
[get] Error getting <a moz-do-not-send="true"
href="http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar">http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar</a>
to /home/jcuthber/.ant/lib/ivy-2.3.0.jar<br>
[available] Found: /home/jcuthber/.ant/lib/ivy-2.3.0.jar<br>
<br>
ivy-bootstrap2:<br>
Skipped because property 'ivy.bootstrap1.success' set.<br>
<br>
ivy-checksum:<br>
<br>
ivy-bootstrap:<br>
<br>
BUILD SUCCESSFUL<br>
Total time: 3 minutes 46 seconds<o:p></o:p></p>
<p class="MsoNormal">Presumably I have to build the Lucene
source before executing the tests. That seemed to go OK.<br>
<br>
When I run the analysis/uima tests it seems to get hung up
at the "resolve" target - even without specifying G1:<br>
<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal">cairnapple{jcuthber}:408> cd
analysis/uima/<br>
cairnapple{jcuthber}:409> ls -l<br>
total 29<br>
-rw-r--r-- 1 jcuthber staff 1473 Dec 10 10:39
build.xml<br>
-rw-rw-r-- 1 jcuthber staff 6895 Mar 6 15:20
hotspot.log<br>
-rw-r--r-- 1 jcuthber staff 1316 Mar 30 2012
ivy.xml<br>
drwxr-xr-x 2 jcuthber staff 2 Mar 5 07:42 lib/<br>
drwxr-xr-x 6 jcuthber staff 6 Mar 5 07:42 src/<o:p></o:p></p>
<p class="MsoNormal"><br>
<br>
<o:p></o:p></p>
<p class="MsoNormal">ivy-configure:<br>
[ivy:configure] Loading <a moz-do-not-send="true"
href="jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar%21/org/apache/ivy/core/settings/ivy.properties">jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivy.properties</a><br>
[ivy:configure] :: Apache Ivy 2.3.0 - 20130110142753 :: <a
moz-do-not-send="true" href="http://ant.apache.org/ivy/">http://ant.apache.org/ivy/</a>
::<br>
[ivy:configure] jakarta commons httpclient not found: using
jdk url handling<br>
[ivy:configure] :: loading settings :: file =
/export/bugs/8009536/lucene-5.0-2013-03-05_15-37-06/ivy-settings.xml<br>
[ivy:configure] no default ivy user dir defined: set to
/home/jcuthber/.ivy2<br>
[ivy:configure] including url: <a moz-do-not-send="true"
href="jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar%21/org/apache/ivy/core/settings/ivysettings-public.xml">jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-public.xml</a><br>
[ivy:configure] no default cache defined: set to
/home/jcuthber/.ivy2/cache<br>
[ivy:configure] including url: <a moz-do-not-send="true"
href="jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar%21/org/apache/ivy/core/settings/ivysettings-shared.xml">jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-shared.xml</a><br>
[ivy:configure] including url: <a moz-do-not-send="true"
href="jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar%21/org/apache/ivy/core/settings/ivysettings-local.xml">jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-local.xml</a><br>
[ivy:configure] including url: <a moz-do-not-send="true"
href="jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar%21/org/apache/ivy/core/settings/ivysettings-main-chain.xml">jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-main-chain.xml</a><br>
[ivy:configure] settings loaded (289ms)<br>
[ivy:configure] default cache:
/home/jcuthber/.ivy2/cache<br>
[ivy:configure] default resolver: default<br>
[ivy:configure] -- 7 resolvers:<br>
[ivy:configure] working-chinese-mirror [ibiblio]<br>
[ivy:configure] main [chain] [shared, public]<br>
[ivy:configure] local [file]<br>
[ivy:configure] shared [file]<br>
[ivy:configure] sonatype-releases [ibiblio]<br>
[ivy:configure] public [ibiblio]<br>
[ivy:configure] default [chain] [local, main,
sonatype-releases, working-chinese-mirror]<br>
<br>
resolve:<br>
[ivy:retrieve] no resolved descriptor found: launching
default resolve<br>
Overriding previous definition of property "ivy.version"<br>
[ivy:retrieve] using ivy parser to parse <a
moz-do-not-send="true"
href="file:///%5C%5Cexport%5Cbugs%5C8009536%5Clucene-5.0-2013-03-05_15-37-06%5Canalysis%5Cuima%5Civy.xml">file:/export/bugs/8009536/lucene-5.0-2013-03-05_15-37-06/analysis/uima/ivy.xml</a><br>
[ivy:retrieve] :: resolving dependencies ::
org.apache.lucene#analyzers-uima;working@cairnapple<br>
[ivy:retrieve] confs: [default]<br>
[ivy:retrieve] validate = true<br>
[ivy:retrieve] refresh = false<br>
[ivy:retrieve] resolving dependencies for configuration
'default'<br>
[ivy:retrieve] == resolving dependencies for
org.apache.lucene#analyzers-uima;working@cairnapple
[default]<br>
[ivy:retrieve] == resolving dependencies
org.apache.lucene#analyzers-uima;working@cairnapple->org.apache.uima#Tagger;2.3.1
[default->*]<br>
[ivy:retrieve] default: Checking cache for: dependency:
org.apache.uima#Tagger;2.3.1 {*=[*]}<br>
[ivy:retrieve] don't use cache for
org.apache.uima#Tagger;2.3.1: checkModified=true<br>
[ivy:retrieve] tried
/home/jcuthber/.ivy2/local/org.apache.uima/Tagger/2.3.1/ivys/ivy.xml<br>
[ivy:retrieve] tried
/home/jcuthber/.ivy2/local/org.apache.uima/Tagger/2.3.1/jars/Tagger.jar<br>
[ivy:retrieve] local: no ivy file nor artifact found for
org.apache.uima#Tagger;2.3.1<br>
[ivy:retrieve] main: Checking cache for: dependency:
org.apache.uima#Tagger;2.3.1 {*=[*]}<br>
[ivy:retrieve] tried
/home/jcuthber/.ivy2/shared/org.apache.uima/Tagger/2.3.1/ivys/ivy.xml<br>
[ivy:retrieve] tried
/home/jcuthber/.ivy2/shared/org.apache.uima/Tagger/2.3.1/jars/Tagger.jar<br>
[ivy:retrieve] shared: no ivy file nor artifact found for
org.apache.uima#Tagger;2.3.1<br>
[ivy:retrieve] tried <a moz-do-not-send="true"
href="http://repo1.maven.org/maven2/org/apache/uima/Tagger/2.3.1/Tagger-2.3.1.pom">http://repo1.maven.org/maven2/org/apache/uima/Tagger/2.3.1/Tagger-2.3.1.pom</a><o:p></o:p></p>
<p class="MsoNormal">and there it hangs - presumably trying to
access <a moz-do-not-send="true"
href="http://repo1.maven.org/maven2/org/apache/uima/Tagger/2.3.1/Tagger-2.3.1.pom">http://repo1.maven.org/maven2/org/apache/uima/Tagger/2.3.1/Tagger-2.3.1.pom</a><br>
<br>
There must be something with our proxy settings that
won't allow this.<br>
<br>
JohnC<br>
<br>
<br>
On 03/06/13 11:15, Uwe Schindler wrote: <o:p></o:p></p>
<pre>Hi,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>That's unfortunately not so easy, because of project dependencies. To run the test you have to compile Lucene Core then the specific module + the test framework (which is special for Lucene) and download some JARs from Maven central (JAR hell, as usual).<o:p></o:p></pre>
<pre>If you give me some time, I would collect all needed JAR files from my local checkout and provide you the correct cmd line + a ZIP file with maybe a shell script to startup. It should be doable, but needs some work to collect all dependencies for the classpath.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>If you want to do it quicker (should be quite fast to do):<o:p></o:p></pre>
<pre>- Download the ANT 1.8.2 binary archive (unfortunately ANT 1.8.4 has a bug that prevents it from working out of the box with Java 8): <a moz-do-not-send="true" href="http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.2-bin.tar.gz">http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.2-bin.tar.gz</a> - I just wonder: isn't ANT needed to build the JDK class library itself? I remember that the FreeBSD OpenJDK build downloads ANT and does a large part of the compilation using ANT...<o:p></o:p></pre>
<pre>- put the ANT bin/ dir into your PATH<o:p></o:p></pre>
<pre>- download the Apache Lucene source code from Jenkins: <a moz-do-not-send="true" href="https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/lucene-5.0-2013-03-05_15-37-06-src.tgz">https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/lucene-5.0-2013-03-05_15-37-06-src.tgz</a><o:p></o:p></pre>
<pre>- go to extracted lucene source dir, call "ant ivy-bootstrap" (this will download Apache IVY, so all dependencies can be downloaded from Maven Central)<o:p></o:p></pre>
<pre>- change to the module that fails: # cd analysis/uima<o:p></o:p></pre>
<pre>- execute: # ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3 -Dtests.jvms=1 test<o:p></o:p></pre>
<pre>- In a parallel console you might be able to attach to the process. The build in the main console runs inside ANT, and the test framework spawns separate worker JVM instances to execute the tests. This makes it hard to reproduce standalone (the command line passed to the child JVM is veeeeery long).<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>I will work on putting together a precompiled ZIP file with all needed JARs + the command line. Just tell me if you managed with the above howto; then I don’t need to do this.<o:p></o:p></pre>
<pre>Uwe<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>-----<o:p></o:p></pre>
<pre>Uwe Schindler<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="mailto:uschindler@apache.org">uschindler@apache.org</a> <o:p></o:p></pre>
<pre>Apache Lucene PMC Member / Committer<o:p></o:p></pre>
<pre>Bremen, Germany<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="http://lucene.apache.org/">http://lucene.apache.org/</a><o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>-----Original Message-----<o:p></o:p></pre>
<pre>From: John Cuthbertson [<a moz-do-not-send="true" href="mailto:john.cuthbertson@oracle.com">mailto:john.cuthbertson@oracle.com</a>]<o:p></o:p></pre>
<pre>Sent: Wednesday, March 06, 2013 7:51 PM<o:p></o:p></pre>
<pre>To: Uwe Schindler<o:p></o:p></pre>
<pre>Cc: 'Bengt Rutisson'; <a moz-do-not-send="true" href="mailto:hotspot-gc-dev@openjdk.java.net">hotspot-gc-dev@openjdk.java.net</a>;<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="mailto:dev@lucene.apache.org">dev@lucene.apache.org</a><o:p></o:p></pre>
<pre>Subject: Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Hi Uwe,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>I've downloaded lucene-5.0-2013-03-05_15-37-06.zip from<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="https://builds.apache.org/job/Lucene-Artifacts">https://builds.apache.org/job/Lucene-Artifacts</a>-<o:p></o:p></pre>
<pre>trunk/2212/artifact/lucene/dist/<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>I don't have ant on my workstation so do you have a java command line to<o:p></o:p></pre>
<pre>run the test(s) that generate the error?<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Thanks,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>JohnC<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>On 3/6/2013 3:16 AM, Uwe Schindler wrote:<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Hi,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>I think this is a VM bug and the thread dumps that Uwe produced are<o:p></o:p></pre>
<pre>enough to start tracking down the root cause.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>I hope it is enough! If I can help with more details, tell me what I should do<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>to track this down. Unfortunately, we have no isolated test case (like a small<o:p></o:p></pre>
<pre>java class that triggers this bug) - you have to run the test cases of this<o:p></o:p></pre>
<pre>Lucene module. It only happens there, not in any other Lucene test suite. It<o:p></o:p></pre>
<pre>may be caused by a lot of GC activity in this "UIMA" module or a specific test.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>On 3/6/13 8:52 AM, David Holmes wrote:<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>If the VM is completely unresponsive then it suggests we are at a<o:p></o:p></pre>
<pre>safepoint.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>Yes, we are hanging during a stop-the-world GC, so we are at a safepoint.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>The GC threads are not "hung" in os::park(), they are parked - waiting<o:p></o:p></pre>
<pre>to be notified of something.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>It looks like the reference processing thread is stuck in a loop<o:p></o:p></pre>
<pre>where it does wait(). So, the VM is hanging even if that stack trace<o:p></o:p></pre>
<pre>also ends up in os::park().<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>The thing is to find out why they are not being woken up.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>Actually, in this case we should probably not even be calling wait...<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Can the gdb log be posted somewhere? I don't know if the attachment<o:p></o:p></pre>
<pre>made it to the original posting on hotspot-gc but it's no longer<o:p></o:p></pre>
<pre>available on hotspot-dev.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>I received the attachment with the original email. I've attached it<o:p></o:p></pre>
<pre>to the bug report that I created: 8009536. You can find it there if<o:p></o:p></pre>
<pre>you want to. But I think we have a fairly good idea of what change<o:p></o:p></pre>
<pre>caused the hang.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>If it helps: Unfortunately, we had some problems with recent JDK builds,<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
<pre>because javac and javadoc tools were not working correctly, failing to build<o:p></o:p></pre>
<pre>our source code. Since b78 this was fixed. Until this was fixed, we used build<o:p></o:p></pre>
<pre>b65 (which was the last one working) and the G1GC hangs did not appear on<o:p></o:p></pre>
<pre>this version. So it must have been introduced by a change between b65 and b78.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Uwe<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Bengt<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Thanks,<o:p></o:p></pre>
<pre>David<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>On 6/03/2013 4:07 PM, Krystal Mok wrote:<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Hi Uwe,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>If you can attach gdb to it, then jstack -m and jstack -F should<o:p></o:p></pre>
<pre>also work; that'll get you the Java stack trace.<o:p></o:p></pre>
<pre>(But it probably doesn't matter in this case, because the hang is<o:p></o:p></pre>
<pre>probably a bug in the VM).<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>- Kris<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>On Wed, Mar 6, 2013 at 5:48 AM, Uwe Schindler<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
</blockquote>
<pre><a moz-do-not-send="true" href="mailto:uschindler@apache.org">&lt;uschindler@apache.org&gt;</a><o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>wrote:<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Hi,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>For a few months we have been extensively testing various preview<o:p></o:p></pre>
<pre>builds of JDK 8 for compatibility with Apache Lucene and Solr, so<o:p></o:p></pre>
<pre>we can find any bugs early and prevent the problems we had with<o:p></o:p></pre>
<pre>the release of Java 7 two years ago. Currently we have a Linux<o:p></o:p></pre>
<pre>(Ubuntu 64bit) Jenkins machine that has various JDKs (JDK 6, JDK<o:p></o:p></pre>
<pre>7, JDK 8 snapshot, IBM J9, older JRockit) installed, choosing a<o:p></o:p></pre>
<pre>different one with different hotspot and garbage collector<o:p></o:p></pre>
<pre>settings on every run of the test suite (which takes approx. 30-45<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<pre>minutes).<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>JDK 8 b79 works so far very well on Linux, we found some strange<o:p></o:p></pre>
<pre>behavior in early versions (maybe compiler errors), but no longer<o:p></o:p></pre>
<pre>at the moment. There is one configuration that constantly and<o:p></o:p></pre>
<pre>reproducibly hangs in one module that is tested: The configuration<o:p></o:p></pre>
<pre>uses JDK 8 b79 (same for b78), 32 bit, and G1GC (server or client<o:p></o:p></pre>
<pre>does not matter). The JVM running the tests hangs and is unresponsive<o:p></o:p></pre>
<pre>(jstack or kill -3 have no effect/cannot connect, standard kill<o:p></o:p></pre>
<pre>does not stop it, only kill -9 actually kills it). It can be<o:p></o:p></pre>
<pre>reproduced in this Lucene module 100% (it hangs always).<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>I was able to connect with GDB to the JVM and get a stack trace on<o:p></o:p></pre>
<pre>all threads (see attachment, dump.txt). As you see all threads of<o:p></o:p></pre>
<pre>G1GC seem to hang in a syscall (os::park(), a conditional wait in the<o:p></o:p></pre>
<pre>pthread library). Unfortunately that’s all I can give you. A Java<o:p></o:p></pre>
<pre>stacktrace is not possible because the JVM reacts to neither kill<o:p></o:p></pre>
<pre>-3 nor jstack. With all other garbage collectors it passes the<o:p></o:p></pre>
<pre>test without hangs in a few seconds, with 32 bit G1GC it can stand<o:p></o:p></pre>
<pre>still for hours. The 64 bit JVM passes with G1GC, so only the 32<o:p></o:p></pre>
<pre>bit variant is affected. Client or Server VM makes no difference.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>To reproduce:<o:p></o:p></pre>
<pre>- Use a 32 bit JDK 8 b78 or b79 (tested on Linux 64 bit, but this<o:p></o:p></pre>
<pre>should not matter)<o:p></o:p></pre>
<pre>- Download Lucene Source code (e.g. the snapshot version we were<o:p></o:p></pre>
<pre>testing with:<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="https://builds.apache.org/job/Lucene-Artifacts">https://builds.apache.org/job/Lucene-Artifacts</a>-<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
</blockquote>
</blockquote>
<pre>trunk/2212/artifact/lucene/dist/)<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>- change to directory lucene/analysis/uima and run:<o:p></o:p></pre>
<pre> ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3<o:p></o:p></pre>
<pre>-Dtests.jvms=1 test<o:p></o:p></pre>
<pre>After a while the test framework prints "stalled" messages<o:p></o:p></pre>
<pre>(because the child VM actually running the test no longer<o:p></o:p></pre>
<pre>responds). The PID is also printed. Try to get a stack trace or kill it, no<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<pre>response.<o:p></o:p></pre>
<pre> <o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote
style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>Only kill -9 helps. Choosing another garbage collector in the<o:p></o:p></pre>
<pre>above command line makes the test finish after a few seconds, e.g.<o:p></o:p></pre>
<pre>-Dargs="-server -XX:+UseConcMarkSweepGC"<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>I posted this bug report directly to the mailing list, because<o:p></o:p></pre>
<pre>with earlier bug reports, there seems to be a problem with<o:p></o:p></pre>
<pre>bugs.sun.com - there is no response from any reviewer after<o:p></o:p></pre>
<pre>several weeks and we were able to help to find and fix javadoc and<o:p></o:p></pre>
<pre>javac-compiler bugs early. So I hope you can help for this bug, too.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Uwe<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>-----<o:p></o:p></pre>
<pre>Uwe Schindler<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="mailto:uschindler@apache.org">uschindler@apache.org</a><o:p></o:p></pre>
<pre>Apache Lucene PMC Member / Committer Bremen, Germany<o:p></o:p></pre>
<pre><a moz-do-not-send="true" href="http://lucene.apache.org/">http://lucene.apache.org/</a><o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<pre><o:p> </o:p></pre>
<pre> <o:p></o:p></pre>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</blockquote>
<br>
</body>
</html>