[aarch64-port-dev ] SEGVs in C1 compiler with langtools test
Edward Nevill
edward.nevill at linaro.org
Wed Nov 27 03:31:39 PST 2013
Hi,
I am seeing SEGVs in the langtools section of the JTReg tests with the client compiler.
These are provoked when I run the tests with a high level of concurrency (i.e. -conc:50), and they occur more frequently when I disable thread-local allocation (-XX:-UseTLAB).
The SEGVs only occur with the C1 compiler, not with C2 and not with -Xint.
The following is the command I used to provoke it:
/work/images/j2sdk-client-release/bin/java -client -XX:-UseTLAB -XX:-UseCompilerSafepoints -jar lib/jtreg.jar -timeout:10 -othervm -conc:50 -vmoption:-Xint -v1 -a -ignore:quiet -w:work_langtools/JTwork -r:report_langtools/JTreport -jdk:/work/images/j2sdk-client-release langtools/test
Note that I am running the actual tests with -Xint (-vmoption:-Xint), so it is the test harness itself that is generating the SEGV.
The typical symptom is that within 30 seconds of starting the test I get:
Directory "report_langtools/JTreport" not found: creating
Directory "work_langtools/JTwork" not found: creating
Directory "work_langtools/JTwork/scratch" not found: creating
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f7fd1b34b69, pid=26426, tid=140186555778816
#
# JRE version: OpenJDK Runtime Environment (8.0) (build 1.8.0-internal-ed_2013_11_27_09_54-b00)
# Java VM: OpenJDK 64-Bit Client VM (25.0-b52 mixed mode linux-aarch64 )
# Problematic frame:
# V [libjvm.so+0x21ab69] DefNewGeneration::copy_to_survivor_space(oopDesc*)+0xe9
If it gets to the stage where it completes the first test, the test suite then usually runs to completion.
The problem is more common with the release version, although I have seen it on the slowdebug version as well. It also happens on both the builtin sim and the RTSM model.
Note that in order to provoke it, the work and report directories (i.e. work_langtools and report_langtools) must be removed prior to running the test. If there are preexisting work and report directories lying around from a previous test, the problem does not seem to occur.
The actual routine in which the SEGV occurs varies. I have seen SEGVs in
FastScanClosure::do_oop(oopDesc**)
DefNewGeneration::copy_to_survivor_space(oopDesc*)
markOopDesc::displaced_mark_helper(...)
MarkSweep::follow_stack()
but copy_to_survivor_space seems the most common.
I have tried modifying eden_allocate to allocate an extra 8 bytes and place a guard word (0xdeadbeaf) 8 bytes before the object. This seems to make the problem go away. I then tried modifying the builtin sim to trap any overwrites of the guard word. No overwrites were trapped, yet the guard word still appeared to have been overwritten. This seems to indicate that it is being overwritten in native C code.
I have timed out on this, having looked at it for the best part of a week, and would appreciate some fresh input.
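For anyone who wants to picture the guard-word experiment, here is a minimal standalone sketch of the idea. It is not the HotSpot change itself: ToyEden, guarded_allocate and guard_intact are hypothetical names invented for illustration, and the toy bump-pointer heap only stands in for eden. Only the guard value and the "extra 8 bytes before the object" layout come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Guard value as described in the message; everything else here is hypothetical.
constexpr std::uint64_t kGuardWord = 0xdeadbeafULL;
constexpr std::size_t kGuardBytes = sizeof(std::uint64_t);

// Toy bump-pointer heap standing in for eden (not HotSpot code).
struct ToyEden {
    alignas(8) unsigned char heap[1024];
    std::size_t top = 0;  // offset of the next free byte
};

// Allocate `size` bytes plus 8 extra bytes, and write the guard word into
// the 8 bytes immediately before the returned object.
// Returns nullptr when the toy heap cannot satisfy the request.
void* guarded_allocate(ToyEden& eden, std::size_t size) {
    assert(eden.top % 8 == 0);  // allocations stay 8-byte aligned
    if (size > sizeof(eden.heap)) return nullptr;
    std::size_t aligned = (size + 7) & ~std::size_t{7};
    if (aligned + kGuardBytes > sizeof(eden.heap) - eden.top) return nullptr;
    unsigned char* base = eden.heap + eden.top;
    std::memcpy(base, &kGuardWord, kGuardBytes);
    eden.top += kGuardBytes + aligned;
    return base + kGuardBytes;
}

// Report whether the guard word in front of `obj` still holds its value.
bool guard_intact(const void* obj) {
    std::uint64_t word;
    std::memcpy(&word, static_cast<const unsigned char*>(obj) - kGuardBytes,
                kGuardBytes);
    return word == kGuardWord;
}
```

A check like guard_intact only detects the damage after the fact; it does not identify the writer, which is why trapping the store in the simulator was the next step.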
I have placed a tar file at
http://people.linaro.org/~edward.nevill/jtreg_segv.tgz
Within this there is a 'do_langtools' shell script which should provoke the above. If the fault does not happen within about 30 seconds, it is not going to occur at all.
Sorry this is not much of a reproducer, but some bugs just don't lend themselves to being reproduced.
All the best,
Ed.