segfault with 8u5
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Sep 12 19:06:46 UTC 2014
On 9/12/14 11:25 AM, Martin Traverso wrote:
> At this point, I can safely say we're not seeing this crash anymore
> (woohoo!).
Good!
>
> Would you mind explaining what causes the issue and what might trigger it?
You have an exception handler in a method (a 'catch' block) and there are
several code paths into it. It is somewhere in
HiveMetastoreApiStats$1::call() @ bci:135, which is called from
RetryDriver::run().
There are 3 paths into it. One path, at HiveMetastoreApiStats$1::call()
@ bci:97, throws some general exception. Another, at
HiveMetastoreApiStats$1::call() @ bci:118, throws NoSuchObjectException.
The third path is unknown (from your output). It could be that the compiler
has not processed that bytecode yet, or that it proved the path is never taken.
The failing code tries to construct a Phi of the incoming exception classes,
but it misses the check that a class may not be defined on some paths.
I added this missing check.
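To make the shape concrete, here is a minimal Java sketch of a call() method
whose single 'catch' block is reached from more than one throwing path. It is
purely illustrative: the class name, the MissingObjectException type and the
lookup/wrap/run helpers are all hypothetical, it is not the real
HiveMetastoreApiStats code, and it is not known to reproduce the crash.

```java
import java.util.concurrent.Callable;

final class CatchShapeSketch {
    /** Hypothetical stand-in for a specific exception such as NoSuchObjectException. */
    static final class MissingObjectException extends Exception {
        MissingObjectException(String message) {
            super(message);
        }
    }

    /** Hypothetical backend call: mode 1 and mode 2 fail in two different ways. */
    static String lookup(int mode) throws Exception {
        if (mode == 1) {
            throw new Exception("general failure");
        }
        if (mode == 2) {
            throw new MissingObjectException("no such object");
        }
        return "ok";
    }

    /** Anonymous Callable (compiled as CatchShapeSketch$1) with one shared handler. */
    static Callable<String> wrap(final int mode, final StringBuilder log) {
        return new Callable<String>() {
            @Override
            public String call() throws Exception {
                try {
                    return lookup(mode);
                } catch (Exception e) {
                    // Both throwing paths merge here; record which class arrived.
                    log.append(e.getClass().getSimpleName()).append(';');
                    throw e;
                }
            }
        };
    }

    /** Runs one mode and reports either the result or the class seen by the handler. */
    static String run(int mode) {
        StringBuilder log = new StringBuilder();
        try {
            return wrap(mode, log).call();
        } catch (Exception e) {
            return "failed:" + log;
        }
    }

    public static void main(String[] args) {
        for (int mode = 0; mode <= 2; mode++) {
            System.out.println(run(mode));
        }
    }
}
```

The handler sees an exception of a different class on each path, which is the
situation where the compiler merges the incoming exception classes.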
It would be nice if you could give a Java code example which shows what
HiveMetastoreApiStats$1::call() does. It doesn't need to be exactly that
method, which could be proprietary and closed. I will file a bug and I
need a small test case to show the problem.
Regards,
Vladimir
>
> Thanks!
> Martin
>
>
> On Thu, Sep 11, 2014 at 8:08 PM, Martin Traverso
> <mtraverso at gmail.com> wrote:
>
> Good news (so far)!
>
> I've had the system running for a couple of hours with this patch
> without problems (it would crash within a few minutes before). I'll
> leave it running a bit longer to be sure, but this looks promising.
>
> Thanks!
> Martin
>
>
> On Thu, Sep 11, 2014 at 6:03 PM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>
> Nice!
>
> You can remove the changes I gave you, apply the following code, and see
> if it helps. It may crash in another place.
>
> Vladimir
>
> diff -r fe1f65b0a2d8 src/share/vm/opto/doCall.cpp
> --- a/src/share/vm/opto/doCall.cpp Wed Sep 10 09:05:31 2014 -0700
> +++ b/src/share/vm/opto/doCall.cpp Thu Sep 11 18:00:29 2014 -0700
> @@ -802,6 +802,11 @@
>      if( ex_node->is_Phi() ) {
>        ex_klass_node = new (C) PhiNode( ex_node->in(0), TypeKlassPtr::OBJECT );
>        for( uint i = 1; i < ex_node->req(); i++ ) {
> +        Node* ex_in = ex_node->in(i);
> +        if (ex_in == top() || ex_in == NULL) {
> +          ex_klass_node->init_req(i, top());
> +          continue;
> +        }
>          Node* p = basic_plus_adr( ex_node->in(i), ex_node->in(i), oopDesc::klass_offset_in_bytes() );
>          Node* k = _gvn.transform( LoadKlassNode::make(_gvn, immutable_memory(), p, TypeInstPtr::KLASS, TypeKlassPtr::OBJECT) );
>          ex_klass_node->init_req( i, k );
>
>
>
> On 9/11/14 5:31 PM, Martin Traverso wrote:
>
> Vladimir,
>
> Did you get the following message in the output before the crash?
> "*** Exception not InstPtr"
>
>
> Not that I can tell.
>
>
> It could happen if it is a dead part of the IR graph.
> Can you add the following debug output code to Hotspot's
> doCall.cpp (patch attached) and test with it?
>
>
> Here's the output:
>
> ******** Expecting TypeClassPtr 1 ********
>
> *** compilation continue
>
> *** save_ex_node:
>
> 900 CastPP === 899 575 [[ 891 909 909 1016 932 1110 958 978 970 993 1016 1004 1004 1013 ]] #java/lang/Exception:NotNull * Oop:java/lang/Exception:NotNull * !jvms: HiveMetastoreApiStats$1::call @ bci:97 RetryDriver::run @ bci:25
>
> 1013 CheckCastPP === 1011 900 [[ 923 1021 1021 1031 1040 1050 1066 1085 1077 1100 1110 1176 1176 ]] #org/apache/hadoop/hive/metastore/api/NoSuchObjectException:NotNull:exact * Oop:org/apache/hadoop/hive/metastore/api/NoSuchObjectException:NotNull:exact * !jvms: HiveMetastoreApiStats$1::call @ bci:118 RetryDriver::run @ bci:25
>
> 1000 Region === 1000 1097 _ 990 [[ 1000 942 1107 1108 1109 1110 1121 1124 1175 ]] !jvms: HiveMetastoreApiStats$1::call @ bci:135 RetryDriver::run @ bci:25
>
> 1110 Phi === 1000 1013 1 900 [[ 942 1115 1132 1150 1142 1165 1115 1172 1172 ]] #java/lang/Exception:NotNull * Oop:java/lang/Exception:NotNull * !jvms: HiveMetastoreApiStats$1::call @ bci:135 RetryDriver::run @ bci:25
>
> *** ex_node:
>
> 900 CastPP === 899 575 [[ 891 909 909 1016 932 1110 958 978 970 993 1016 1004 1004 1013 ]] #java/lang/Exception:NotNull * Oop:java/lang/Exception:NotNull * !jvms: HiveMetastoreApiStats$1::call @ bci:97 RetryDriver::run @ bci:25
>
> 1013 CheckCastPP === 1011 900 [[ 923 1021 1021 1031 1040 1050 1066 1085 1077 1100 1110 1176 1176 ]] #org/apache/hadoop/hive/metastore/api/NoSuchObjectException:NotNull:exact * Oop:org/apache/hadoop/hive/metastore/api/NoSuchObjectException:NotNull:exact * !jvms: HiveMetastoreApiStats$1::call @ bci:118 RetryDriver::run @ bci:25
>
> 1000 Region === 1000 1097 _ 990 [[ 1000 942 1107 1108 1109 1110 1121 1124 1175 ]] !jvms: HiveMetastoreApiStats$1::call @ bci:135 RetryDriver::run @ bci:25
>
> 1110 Phi === 1000 1013 1 900 [[ 942 1115 1132 1150 1142 1165 1115 1172 1172 ]] #java/lang/Exception:NotNull * Oop:java/lang/Exception:NotNull * !jvms: HiveMetastoreApiStats$1::call @ bci:135 RetryDriver::run @ bci:25
>
>
>
> I may need to ask you to do more such testing, if that is
> fine with you.
>
>
> Absolutely! Just let me know.
>
> Thanks,
> Martin
>
>
>
>
> Regards,
> Vladimir
>
> On 9/11/14 2:05 PM, Martin Traverso wrote:
>
> Hi Vladimir,
>
> I finally got around to building a fastdebug VM. I don't see the
> first crash anymore (great!), but the second one still happens.
> Here's the output:
>
> https://gist.github.com/martint/abea9be3df700236ec0b
>
> Let me know if there's any other information you'd like me
> to gather.
>
> Thanks!
> Martin
>
> On Wed, Jul 30, 2014 at 10:29 AM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>
> Martin,
>
> It would also be nice if you could build a fastdebug VM and run
> with it:
> cd hotspot/make; make fastdebug LP64=1
>
> Thanks,
> Vladimir
>
>
> On 7/30/14 9:40 AM, Vladimir Kozlov wrote:
>
> 8029381 was fixed in jdk9 and 8u20 (which should be released soon):
>
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/0b9500028980
>
> I will look into the C2 crash more. I don't remember any recent
> problems in catch_inline_exceptions().
>
> Regards,
> Vladimir
>
> On 5/19/14 12:24 PM, Martin Traverso wrote:
>
> The failure happened in the C1 JIT compiler (first tier).
> You can try to switch off tiered compilation with
> -XX:-TieredCompilation.
>
>
> That seems to have worked around this particular issue.
>
> However, we ran into another crash:
>
> Stack: [0x00000000430c8000,0x00000000431c9000], sp=0x00000000431c5980, free space=1014k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>
> V [libjvm.so+0x814115] LoadKlassNode::make(PhaseGVN&, Node*, Node*, TypePtr const*, TypeKlassPtr const*)+0x45
> V [libjvm.so+0x51ab06] Parse::catch_inline_exceptions(SafePointNode*)+0x936
> V [libjvm.so+0x8c1a5a] Parse::do_exceptions()+0xba
> V [libjvm.so+0x8c6100] Parse::do_one_block()+0x180
> V [libjvm.so+0x8c6377] Parse::do_all_blocks()+0x127
> V [libjvm.so+0x8c95d3] Parse::Parse(JVMState*, ciMethod*, float, Parse*)+0x15a3
> V [libjvm.so+0x3b6529] ParseGenerator::generate(JVMState*, Parse*)+0x99
> V [libjvm.so+0x3b7202] PredictedCallGenerator::generate(JVMState*, Parse*)+0x2a2
> V [libjvm.so+0x51aefd] Parse::do_call()+0x1cd
> V [libjvm.so+0x8d3c7a] Parse::do_one_bytecode()+0x32da
> V [libjvm.so+0x8c60f8] Parse::do_one_block()+0x178
> V [libjvm.so+0x8c6377] Parse::do_all_blocks()+0x127
> V [libjvm.so+0x8c95d3] Parse::Parse(JVMState*, ciMethod*, float, Parse*)+0x15a3
> V [libjvm.so+0x3b6529] ParseGenerator::generate(JVMState*, Parse*)+0x99
> V [libjvm.so+0x46111c] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool)+0x128c
> V [libjvm.so+0x3b5008] C2Compiler::compile_method(ciEnv*, ciMethod*, int)+0x198
> V [libjvm.so+0x46982a] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xc8a
> V [libjvm.so+0x46c230] CompileBroker::compiler_thread_loop()+0x620
> V [libjvm.so+0x9e303f] JavaThread::thread_main_inner()+0xdf
> V [libjvm.so+0x9e3205] JavaThread::run()+0x1b5
> V [libjvm.so+0x8a00c8] java_start(Thread*)+0x108
>
>
>
> Full dump here:
> https://gist.github.com/martint/783cf3e30c17fc897423
>
>
> The only bug I found which could be related is the following:
>
> https://bugs.openjdk.java.net/browse/JDK-8029381
>
>
> Unfortunately, I can't see this bug report. I get redirected
> to the login screen.
>
>
> Thanks!
> Martin
>
>
>
>
>