G1GC/ JIT compilation bug hunt.
Dawid Weiss
dawid.weiss at gmail.com
Tue Aug 20 02:32:24 PDT 2013
One follow-up with respect to those encoded integers -- I swapped the
correct and incorrect sequence in my last e-mail, sorry. I then also
added a sysout here:
@@ -451,6 +453,7 @@
termFreq = 1;
} else {
termFreq = freq.readVInt();
+ System.err.print("# (tf): " + termFreq + " ");
}
}
With this the assertion is no longer hit but it shows where the
difference in numbers comes from:
Correct sequence is:
# (eof)
# (code): 0
# (tf): 3
# (code): 4
# (tf): 3
# (code): 2
# (tf): 5
# (code): 2
# (tf): 2
# (code): 3
# (code): 5
# (code): 2
And the incorrect, assertion-hit run shows:
# (eof)
# (code): 0
# (code): 3
# (code): 4
# (code): 3
# (code): 2
# (code): 5
# (code): 2
# (code): 2
# (code): 3
So there's something odd happening after the first readvint which
returns 0 -- the next number that should be read by:
termFreq = freq.readVInt();
is re-read as the next code.
Dawid
On Tue, Aug 20, 2013 at 11:11 AM, Dawid Weiss <dawid.weiss at gmail.com> wrote:
> Thanks for comments guys.
>
>> It is great that you can build fastdebug VM. I assume that if I give you a
>> patch to try you can also build with it.
>
> Yes, absolutely.
>
>> First, you can run with next options and then send zipped output (related
>> part before the method compilation and optoassembler output, don't use hsdis
>> for that) and hs_err file when test fails:
>
> I did that. Used the options you requested -- a full log of all I did
> is included in the ZIP file here:
> http://www.cs.put.poznan.pl/dweiss/tmp/debug-files.zip
>
> Cat debug-files\debug-log.txt. There are several runs included in that
> ZIP file, in short:
>
> - jdk18-fastdebug, b102, full run (no abort on assertion, explained in
> debug-log.txt):
> 001-no-abort-on-assertion
> 002-no-abort-on-assertion
>
> - jdk18-fastdebug, b102, abort on assertion (includes mem dumps)
> 003-with-abort-on-assertion
> 004-with-abort-on-assertion
>
> - jdk18-fastdebug, hsx tip, abort on assertion (includes mem dumps)
> 005-hsx-tip
> 006-hsx-tip
>
> - jdk18-fastdebug, hsx tip, excluded readvint method inlining (passed
> build, so only compilation logs available).
> 007-excluded-readvint
>
>> Looking on java code in FreqProxTermsWriterPerField::flush() and based on
>> LUCENE-5168 report generated code somehow missed EOF check. Am I right? This
>> is why the Assert is thrown?
>
> It's not the only method it trips on. Depending on the seed the
> problem shows up in different places. For the dumps I included the
> issue seems to be here:
>
>> } else {
>> final int code = freq.readVInt();
>> if (!readTermFreq) {
>> docID += code;
>> termFreq = -1;
>
> If I add sysouts as in:
>
> Index: core/src/java/org/apache/lucene/index/FreqProxTermsWriterPerField.java
> ===================================================================
> --- core/src/java/org/apache/lucene/index/FreqProxTermsWriterPerField.java
> (revision 1512807)
> +++ core/src/java/org/apache/lucene/index/FreqProxTermsWriterPerField.java
> (working copy)
> @@ -427,6 +427,7 @@
> //System.out.println(" cycle");
> final int termFreq;
> if (freq.eof()) {
> + System.err.print("# (eof)");
> if (postings.lastDocCodes[termID] != -1) {
> // Return last doc
> docID = postings.lastDocIDs[termID];
> @@ -442,6 +443,7 @@
> }
> } else {
> final int code = freq.readVInt();
> + System.err.print("# (code): " + code + " ");
> if (!readTermFreq) {
> docID += code;
> termFreq = -1;
>
> then for a constant seed you'll get an identical sequence of values.
> Once the assertion hits though, the sequence deviates. Comparing a
> passing run (excluding readvint) and a failing run I get a lot of
> initial aligned outputs and then a deviation:
>
> (correct run):
> # (eof)
> # (eof)
> # (eof)
> # (code): 0
> # (code): 3
> # (code): 4
> # (code): 3
>
> (incorrect run):
> # (eof)
> # (eof)
> # (eof)
> # (code): 0
> # (code): 4 <- wtf?
> # (code): 2
> # (code): 2
>
> How can a variable encoded integer be misinterpreted from 3 to 4 is
> beyond me, sorry. But it's not an EOF condition I think, at least the
> deviation happens where the original run had was entering (code)
> branch.
>
>> Could you also send latest version of FreqProxTermsWriterPerField.java you
>> are testing?
>
> I'm testing against a fixed branch (the info is included in debug-log.txt).
>
> If there's anything else I can do, let me know. I'm also curious how
> you approach debugging this thing. It may be time-based so I don't
> think it'll reproduce on your machine out of the box, but who knows.
> All I know is that, for example, if you add one more sysout before the
> while loop it suddenly stops reproducing so probably something gets
> compiled differently... Wild :)
>
> Dawid
More information about the hotspot-dev
mailing list