RFR(m): 8185712: [windows] Improve native symbol decoder
Thomas Stüfe
thomas.stuefe at gmail.com
Tue Sep 5 18:20:27 UTC 2017
Hi Zengyu,
thanks a lot for the review!
New webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.03/webrev/
Delta to last:
http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-native-symbol-resolver/webrev.02-to-03/webrev/
Please find comments inline.
On Tue, Sep 5, 2017 at 4:22 PM, Zhengyu Gu <zgu at redhat.com> wrote:
> Hi Thomas,
>
> Looks good overall.
>
> symbolengine.cpp:
>
> Is there reason to use ::malloc()/::free() instead of
> os::malloc()/os::free() ?
>
>
Yes, this is deliberate, as is the use of Windows CriticalSection objects
instead of os::mutex. This coding gets executed during error handling, and
I want to prevent circular errors. os::malloc() may crash and cause error
handling to be reentered etc.
>
> Line #602:
> buf might not be null-terminated if filename is longer than buffer len.
>
>
>
Good catch! Fixed and added regression tests (gtests).
Thanks, Thomas
> Thanks,
>
> -Zhengyu
>
>
New Webrev:
>
>
>
>
>
>
>
> On 09/05/2017 09:05 AM, Thomas Stüfe wrote:
>
>> Hi Goetz,
>>
>> thank you for your review!
>>
>> New Webrev:
>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-
>> improve-native-symbol-resolver/webrev.02
>>
>> Delta to last:
>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-
>> improve-native-symbol-resolver/webrev.01-to-02/webrev/
>>
>> The only change is that I removed the -XX:InitializeDbgHelpEarly switch to
>> avoid having to file a CSR.
>>
>> Please find further comments inline:
>>
>>
>> On Mon, Sep 4, 2017 at 5:08 PM, Lindenmaier, Goetz <
>> goetz.lindenmaier at sap.com> wrote:
>>
>> Hi Thomas,
>>>
>>> I had a look at your change. Great somebody finally fixes
>>> the windows symbol printing, thanks a lot!
>>>
>>> The code looks good, I'm just not sure whether you
>>> need new files symbolengine.c|hpp. Isn't that
>>> just what should go to decoder_windows.h|cpp and
>>> class Decoder?
>>> You would also get rid of the redirections in decoder_windows.cpp.
>>>
>>>
>>> As we discussed, I see your point, but would prefer to leave the change
>> for
>> the moment as it is.
>>
>> A similar change to this one - doing away with the AbstractDecoder object
>> instantiation layer - will be coming for AIX, where it does not make much
>> sense either, and I propose to do a separate cleanup or simplification
>> change once that is done, merging decoder_windows.cpp and
>> symbolengine.cpp/hpp. Unless I hear more objections from other reviewers,
>> I'd prefer to do this in a later patch.
>>
>>
>> In shutdown() you comment
>>> // There is no reason ever to shut down the decoder.
>>> ... I think you can remove that function altogether, i.e. also
>>> from the shared code, I don't see where it is ever called.
>>>
>>>
>>> Totally agree...
>>
>>
>> Also, I think, you can just delete Decoder::can_decode_C_frame_in_vm()
>>> from the code. The only place where it is used, in frame.cpp,
>>> calls dll_address_to_duntion_name(). This returns useful information
>>> also in the case of the NullDecoder, which now is the only one to
>>> return false in that function.
>>>
>>>
>> totally agree also here, but would also prefer both issues in a separate
>> change. In fact, Ioi opened a bug for this a while ago:
>> https://bugs.openjdk.java.net/browse/JDK-8144855 - and I would like to
>> fix
>> it under that bug. Reason is, in this change, I'd like to avoid changing
>> shared sources as much as possible and keep this change windows only.
>>
>>
>>
>>> Globals_windows.hpp needs Copyright adaption, please.
>>> This is not introduced by your change, but maybe
>>> you can also fix the copyright in decoder.hpp, which
>>> says " 1997, 2015, 2017" ... should only name two
>>> years ...
>>>
>>>
>>> Not needed anymore: since I removed the -XX:InitializeDbgHelpEarly
>> switch,
>> globals_windows.hpp is reverted to its original state. Do you still want
>> me
>> to fix the date?
>>
>> Thanks for the review work!
>>
>> ..Thomas
>>
>>
>> Best regards,
>>> Goetz.
>>>
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-
>>>> bounces at openjdk.java.net] On Behalf Of Thomas Stüfe
>>>> Sent: Mittwoch, 30. August 2017 14:34
>>>> To: hotspot-runtime-dev at openjdk.java.net
>>>> Subject: RFR(m): 8185712: [windows] Improve native symbol decoder
>>>>
>>>> Hi all,
>>>>
>>>> May I please have reviews for the following change.
>>>>
>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185712
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185712-windows-improve-
>>>> native-symbol-resolver/webrev.01/webrev/
>>>>
>>>> (This is the followup to: https://bugs.openjdk.java.net/
>>>>
>>> browse/JDK-8186349)
>>>
>>>>
>>>> -------------
>>>>
>>>> Basically, this is a reimplementation of the layer around the Windows
>>>> Symbol API (the API used to resolve debug symbols). The old
>>>> implementation
>>>> had a number of errors and shortcomings which together caused the
>>>> Windows
>>>> native symbol resolution (and hence callstacks in error logs) to be a
>>>> bit
>>>> of a lottery. The aim of this reimplementation is to make the code more
>>>> robust and easier to maintain.
>>>>
>>>> The problems with the existing implementation are listed in detail in
>>>> the
>>>> bug description.
>>>>
>>>> The new implementation:
>>>>
>>>> - uses the new centralized WindowsDbgHelper class, which wraps the
>>>> dbghelp.dll loading, introduced with JDK-8186349
>>>>
>>>> - Completely bypasses the "create two instances of AbstractDecoder class
>>>> and synchronize access to them" scheme in decoder.cpp. It does not make
>>>> sense for windows, where we have to synchronize each access to the
>>>> dbghelp.dll anyway - this is done one layer below in WindowsDbgHelper.
>>>>
>>> The
>>>
>>>> static methods of the shared Decoder class now directly access the
>>>> static
>>>> methods in the new SymbolEngine class, see decoder_windows.cpp.
>>>>
>>>> - The layer wrapping the Symbol API lives in the new
>>>> symbolengine.cpp/hpp
>>>> files. The coding takes care of properly initializing (once) the symbol
>>>>
>>> API
>>>
>>>> and of assembling the pdb search path.
>>>>
>>>> - Pdb search path construction is changed: where before we just added
>>>> jdk
>>>> and jvm bin directories, we now just add all directories of all loaded
>>>>
>>> DLLs
>>>
>>>> (which, of course, include the jdk and jvm bin directories). That way we
>>>> have a high chance of catching pdb files of third party libraries, as
>>>>
>>> long
>>>
>>>> as they follow the convention of putting the pdb files beside the dlls.
>>>> This means it is easier to analyse crashes where third party DLLs are
>>>> involved.
>>>>
>>>> - On Windows, we now have source file and line number in the callstack.
>>>>
>>>> - There is a new parameter, diagnostic and windows-only,
>>>> called "InitializeDbgHelpEarly". That parameter is by default off. If
>>>> on,
>>>> it causes the symbol engine to be initialized early, which increases the
>>>> chance of good callstacks later on (because the initialization does not
>>>> have to run in an error situation).
>>>>
>>>> - Added tests: gtests and a jtreg test which tests the callstack
>>>>
>>> printing.
>>>
>>>> All tests windows only. There is no technical reason for making them
>>>> windows only, but I wanted to keep disturbances to other platforms to a
>>>> minimum and these kind of tests can be shaky.
>>>>
>>>> Thanks a lot for reviewing this!
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>
>>>
More information about the hotspot-runtime-dev
mailing list