RFR(M): 8247515: OSX pc_to_symbol() lookup does not work with core files

Kevin Walls kevin.walls at oracle.com
Wed Jul 22 09:45:24 UTC 2020


Thanks Chris, yes looks good, I like that we check the library bounds 
before calling nearest_symbol.

--
Kevin


On 21/07/2020 21:05, Chris Plummer wrote:
> Hi Serguei and Kevin,
>
> The webrev has been updated:
>
> http://cr.openjdk.java.net/~cjplummer/8247515/webrev.02/index.html
> https://bugs.openjdk.java.net/browse/JDK-8247515
>
> Two issues were addressed:
>
> (1) Code in symbol_for_pc() assumed the caller had first checked to 
> make sure that the symbol is in a library, where-as some callers 
> assume NULL will be returned if the symbol is not in a library. This 
> is the case for pstack for example, which first blindly does a pc to 
> symbol lookup, and only if that returns null does it then check if the 
> pc is in the codecache or interpreter. The logic in symbol_for_pc() 
> assumed that if the pc was greater than the start address of the last 
> library in the list, then it must be in that library. So in stack 
> traces the frames for compiled or interpreted pc's showed up as the 
> last symbol in the last library, plus some very large offset. The fix 
> is to now track the size of libraries so we can do a proper range check.
>
> (2) There are issues with finding system libraries. See [1] 
> JDK-8249779. Because of this I disabled support for trying to locate 
> them. This was done in ps_core.c, and there are "JDK-8249779" comment 
> references in the code where I did this. The end result of this is 
> that PMap for core files won't show system libraries, and PStack for 
> core files won't show symbols for addresses in system libraries. Note 
> that currently support for PMap and PStack is disabled for OSX, but I 
> will shortly send out a review to enable them for OSX core files as 
> part of the fix for [2] JDK-8248882.
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8249779
> [2] https://bugs.openjdk.java.net/browse/JDK-8248882
>
> thanks,
>
> Chris
>
> On 7/14/20 5:46 PM, serguei.spitsyn at oracle.com wrote:
>> Okay, I'll wait for new webrev version to review.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/14/20 17:40, Chris Plummer wrote:
>>> Hi Serguie,
>>>
>>> Thanks for reviewing. This webrev is in limbo right now as I 
>>> discovered some issues that Kevin and I have been discussing off 
>>> line. One is that the code assumes the caller has first checked to 
>>> make sure that the symbol is in a library, where-as the actual 
>>> callers assume NULL will be returned if the symbol is not in a 
>>> library. The end result is that we end up returning a symbol, even 
>>> for address in the code cache or interpreter. So in stack traces 
>>> these frame show up as the last symbol in the last library, plus 
>>> some very large offset. I have a fix for that were now I track the 
>>> size of each library. But there is another issue with the code that 
>>> tries to discover all libraries in the core file. It misses a very 
>>> large number of system libraries. We understand why, but are not 
>>> sure of the solution. I might just change to code to only worry 
>>> about user libraries (JDK libs and other JNI libs).
>>>
>>> Some comments below also.
>>>
>>> On 7/14/20 4:37 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chris,
>>>>
>>>> I like the suggestion from Kevin below.
>>> I'm not sure which suggestion since I updated the webrev based on 
>>> his initial suggestion.
>>>>
>>>> I have a couple of minor comments so far.
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8247515/webrev.01/src/jdk.hotspot.agent/macosx/native/libsaproc/libproc_impl.c.frames.html
>>>> 313 if (!lib->next || lib->next->base >= addr) {
>>>> I wonder if the check above has to be:
>>>> 313 if (!lib->next || lib->next->base > addr) {
>>> Yes, > would be better, although this goes away with my fix that 
>>> tracks the size of each library.
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8247515/webrev.01/src/jdk.hotspot.agent/macosx/native/libsaproc/symtab.c.frames.html
>>>> 417 if (offset_from_sym >= 0) { // ignore symbols that comes after 
>>>> "offset"
>>>> Replace: comes => come
>>> Ok.
>>>
>>> thanks,
>>>
>>> Chris
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/8/20 03:23, Kevin Walls wrote:
>>>>>
>>>>>
>>>>> Sure -- I was thinking lowest_offset_from_sym initialising 
>>>>> starting at a high positive integer (that would now be PTRDIFF_MAX 
>>>>> I think) to save a comparison with e.g. -1, you can just check if 
>>>>> the new offset is less than lowest_offset_from_sym
>>>>>
>>>>> With the ptrdiff_t change you made, this all looks good to me 
>>>>> however you decide. 8-)
>>>>>
>>>>>
>>>>>
>>>>> On 07/07/2020 21:17, Chris Plummer wrote:
>>>>>> Hi Kevin,
>>>>>>
>>>>>> Thanks for the review. Yes, that lack of initialization of 
>>>>>> lowest_offset_from_sym is a bug. I'm real surprised the compiler 
>>>>>> didn't catch it as it will be uninitialized garbage the first 
>>>>>> time it is referenced. Fortunately usually the eventual offset is 
>>>>>> very small if not 0, so probably this never prevented a proper 
>>>>>> match. I think there's also another bug:
>>>>>>
>>>>>>  415       uintptr_t offset_from_sym = offset - sym->offset;
>>>>>>
>>>>>> "offset" is the passed in offset, essentially the address of the 
>>>>>> symbol we are interested in, but given as an offset from the 
>>>>>> start of the DSO. "sym->offset" is also an offset from the start 
>>>>>> of the DSO. It could be located before or after "offset". This 
>>>>>> means the math could result in a negative number, which when 
>>>>>> converted to unsigned would be a very large positive number. This 
>>>>>> happens whenever you check a symbol that is actually located 
>>>>>> after the address you are looking up. The end result is harmless, 
>>>>>> because it just means there's no way we will match that symbol, 
>>>>>> which is what you want, but it would be good to clean this up.
>>>>>>
>>>>>> I think what is best is to use ptrdiff_t and initialize 
>>>>>> lowest_offset_from_sym to -1. I've updated the webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8247515/webrev.01/index.html
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/7/20 4:09 AM, Kevin Walls wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Yes I think this looks good.
>>>>>>>
>>>>>>> Question: In nearest_symbol, do we need to initialize 
>>>>>>> lowest_offset_from_sym to something impossibly high, as if it 
>>>>>>> defaults to zero we never find a better/nearer result?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> On 07/07/2020 06:10, Chris Plummer wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Please help review the following:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8247515/webrev.00/index.html
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247515
>>>>>>>>
>>>>>>>> The CR contains a description of the issues being addressed. 
>>>>>>>> There is also no test for this symbol lookup support yet. It 
>>>>>>>> will be there after I push JDK-8247516 and  JDK-8247514, which 
>>>>>>>> are both blocked by the CR.
>>>>>>>>
>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8247516
>>>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8247514
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200722/038e7d2b/attachment.htm>


More information about the serviceability-dev mailing list