Need some help in debugging jvm corruption/crashes in C++/JNI code
David Holmes
david.holmes at oracle.com
Tue Jan 14 02:22:15 UTC 2020
Hi,
On 14/01/2020 9:56 am, Bwmat . wrote:
> Hello,
>
>
> I just joined this mailing list to hopefully get a bit of help from someone
> familiar with hotspot.
These mailing lists are not really for end user application debugging
help. However, as you have deep dived into hotspot internals ... :)
In a debug build you can use itable/vtable logging to see how the
itable/vtable is constructed and see if anything odd appears there.
-Xlog:itables*=trace,vtables*=trace
You can also try running with -Xcheck:jni.
FYI most file attachments are stripped by the mailing list so your log
did not get included.
Cheers,
David
>
>
> We found a crash in a C++ library we have that uses JNI, and I’ve been
> trying to debug it for a couple of days. I’m working on Windows, and I
> managed to get a TTD (time travel debugging, a feature in the preview
> version of WinDbg) trace of when the problem occurs, and I built myself a
> debug JVM from source to help debug, but I’m completely unfamiliar with
> hotspot.
>
>
>
> I’ve also attached a log file created one time that the crash occurred (the
> symptoms aren’t always the same: sometimes it doesn’t crash at all, but
> just returns an invalid value from a method call, so my test app quits
> early; sometimes it crashes from an assertion within hotspot, not
> always the same one).
>
>
>
> One weird thing is that I found the issue while doing some testing, and a
> certain SQL query triggers the issue, but another, analogous query does
> not. This is significant, since all of the SQL processing is written in
> Java, so very little changes in the native part of the application between
> the case that works and the case that fails, which makes me think it’s a
> jvm issue. I’m currently debugging a java 8 openjdk (since I used an
> article about how to build it that used that version, and that’s what we’re
> using internally anyway), but I also reproduced the issue on a java 13
> oracle jvm, so if it IS a JVM bug, it doesn’t seem to have been fixed yet.
> (I’m quite aware that it’s probably still our fault somehow, though.)
>
>
>
> The issue seems to somehow be caused by the wrong java method being invoked
> by a JNI call, or maybe the right method, but on the wrong object.
>
>
>
> Early on, while debugging using Eclipse’s remote debugger, I found that,
> right before crashing, I was able to hit a breakpoint in a method on a type
> T, but the debugger told me that the ‘this’ reference was actually of type
> String!
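
A 'this' of an unrelated type suggests checking the receiver before each call. As far as I know, plain JNI calls do not validate that the object matches the class the jmethodID came from. The Java sketch below shows the kind of check I mean, using reflection as a stand-in for the native side (where IsInstanceOf would play the same role). The class name, sampleColumn() and isValidReceiver() are made up for illustration, and IColumn is cut down to one method.

```java
import java.lang.reflect.Method;

class ReceiverCheckSketch {
    // Cut-down stand-in for the IColumn interface described in the post.
    interface IColumn {
        String getLabel();
    }

    // Hypothetical sample implementation, used only to have a valid receiver.
    static IColumn sampleColumn() {
        return new IColumn() {
            @Override public String getLabel() { return "price"; }
        };
    }

    // True only if 'receiver' is an instance of the type that declares 'method'.
    static boolean isValidReceiver(Method method, Object receiver) {
        return receiver != null && method.getDeclaringClass().isInstance(receiver);
    }

    public static void main(String[] args) throws Exception {
        Method getLabel = IColumn.class.getMethod("getLabel");
        // A real IColumn passes the check; a String does not.
        System.out.println(isValidReceiver(getLabel, sampleColumn()));
        System.out.println(isValidReceiver(getLabel, "not a column"));
    }
}
```

If the equivalent check fails in the native code just before the bad call, the problem is more likely a stale or mixed-up object reference than method resolution.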
>
>
>
> Later on, while debugging the internals of hotspot in my TTD trace (which
> allowed me to get quite far without much understanding of what’s going
> on), I found that the crash in the trace occurs when the wrong method is
> invoked on an object, returning a long, which is later interpreted as a
> string reference, and the jvm notices it isn’t actually a string when the
> native code tries to get the string length.
>
>
>
> To lay out the situation without going into too many details of our code,
> we have an interface IColumn, that has a few methods that get attributes
> about a SQL column. Some methods return primitives, some return strings.
> The native code is trying to call IColumn.getLabel(), so it passes a
> jmethodID that was generated via a successful call to JNI’s GetMethodID,
> passing the class object of IColumn. For the receiver of the call, it
> passes an instance of a type which implements IColumn, and has no
> superclasses, but is a private static nested class.
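
For anyone trying to reproduce this, the setup described above can be sketched in pure Java roughly as follows. This is only a minimal model, not the actual code: getDisplaySize() and getLabel() come from the post, while the class names and return values are invented. Reflection stands in for the JNI side, with the lookup on the interface mirroring GetMethodID and invoke() mirroring CallObjectMethod.

```java
import java.lang.reflect.Method;

class IColumnSketch {
    // Interface shape taken from the post; only these two methods are named there.
    interface IColumn {
        long getDisplaySize();
        String getLabel();
    }

    // Hypothetical implementer: a private static nested class with no explicit superclass.
    private static final class ColumnImpl implements IColumn {
        @Override public long getDisplaySize() { return 42L; }
        @Override public String getLabel() { return "price"; }
    }

    // Look the method up on the interface, then invoke it on an implementing instance.
    static Object callGetLabel() throws Exception {
        Method getLabel = IColumn.class.getMethod("getLabel");
        IColumn receiver = new ColumnImpl();
        return getLabel.invoke(receiver);
    }

    public static void main(String[] args) throws Exception {
        Object result = callGetLabel();
        // The result should be a String, never the long from getDisplaySize().
        System.out.println(result.getClass().getName() + ": " + result);
    }
}
```

A standalone JNI test with the same shape would show whether the misdispatch depends on the rest of the application.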
>
>
>
> In jni_invoke_nonstatic(), it goes into the ‘else if
> (!m->has_itable_index())’ branch, which seems wrong from what little I
> gleaned from https://wiki.openjdk.java.net/display/HotSpot/InterfaceCalls
>
> It then resolves the wrong method, getDisplaySize(), which returns a long
> (I note that getDisplaySize() is the method declared directly before
> getLabel() in the declaration of IColumn, so maybe an off-by-one error
> somewhere?), and then invokes it, dooming the process to a future crash.
>
>
>
> If I go backwards in time to when the _vtable_index field of the Method
> (the one originally resolved using GetMethodID, and later passed into
> CallObjectMethodV) is set, it’s in klassVtable::initialize_vtable(), on the
> line ‘mh()->set_vtable_index(initialized); // set primary vtable index’,
> and that’s the last time it’s set.
>
> Am I right in thinking that it should have instead been set to a negative
> value (so that it was an ‘itable index’), since it’s an interface method?
> If so, where would that happen? What else should I check?
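
On the negative-value question, here is a small Java model of how I understand the _vtable_index field to be shared between vtable and itable indices. It is based on my recollection of the VtableIndexFlag enum and the has_itable_index()/itable_index() accessors in method.hpp, so treat the constant and the arithmetic as assumptions to check against your own source tree. The Java names are invented for illustration.

```java
class VtableIndexSketch {
    // Sentinel as I recall it from Method::VtableIndexFlag (assumption, please verify).
    static final int ITABLE_INDEX_MAX = -10;

    // Itable indices are stored growing downward from the sentinel.
    static int encodeItableIndex(int itableIndex) {
        return ITABLE_INDEX_MAX - itableIndex;
    }

    // Models has_itable_index(): true when the field holds an encoded itable index.
    static boolean hasItableIndex(int vtableIndexField) {
        return vtableIndexField <= ITABLE_INDEX_MAX;
    }

    // Models itable_index(): recovers the itable index from the stored field.
    static int decodeItableIndex(int vtableIndexField) {
        if (!hasItableIndex(vtableIndexField)) {
            throw new IllegalArgumentException("field does not hold an itable index");
        }
        return ITABLE_INDEX_MAX - vtableIndexField;
    }

    public static void main(String[] args) {
        // A non-negative field is a plain vtable index; values at or below the sentinel are itable indices.
        for (int field : new int[] {3, -10, -11}) {
            System.out.println(field + " hasItableIndex=" + hasItableIndex(field));
        }
    }
}
```

Under this model, a Method that belongs to an interface would be expected to hold a value at or below the sentinel, so a non-negative value on the Method behind your jmethodID is worth confirming in the debugger.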
>
>
>
> Any suggestions are welcome, thanks in advance.
>