RFR: 8333994: NMT: call stacks should show source information [v5]

Sat Jun 22 10:37:27 UTC 2024

On Tue, 18 Jun 2024 20:41:01 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:

>>> 
>>> I have the same question. Did dwarf decoder performance improve? If so, could you point me to the PR? Thanks!
>> 
>> I completely forgot that this had been an issue. The comment was even written by me :(
>> 
>> No, Elf decoder is still slow. But I have found myself too many times staring at NMT output now trying to make sense of the offsets. Missing source info in combination with the small stack size of 4 makes investigations a pain.
>> 
>> I added a simple caching mechanism to aid printing. It's pretty straightforward, but still I am not sure it is worth the complexity. Here are the numbers:
>> 
>> Running all NMT jtreg tests:
>> - Stock JVM (no source info): 40 seconds
>> - Source info: 2 min 30 seconds
>> - Source info + caching: 1 min 15 seconds
>> 
>> I think that is acceptable. Any more intricate caching would be over the complexity-benefit line.
>> 
>> @gerard-ziemski 
>> 
>> The cost is with Dwarf parsing, not dladdr. dladdr is cheap. But feel free to make Dwarf parsing cheaper, that would be surely welcome.
> 
> I simply pointed out your own old concern. If you are happy with the final performance now, then I'm good.
> 
> I will look at the cache shortly.

@gerard-ziemski @jdksjolen I did a small revamp. The strings are now stored in an Arena. That allows for compact storage and prevents truncation should a frame be longer than 1024 chars (which will probably never happen). Storage cost is ~96K for a detail report, compared to ~700K with fixed-size 1024-byte entries.
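
To illustrate the idea outside of HotSpot, here is a minimal standalone sketch of a decode cache backed by arena-style string storage. It is not the code in the PR: `StringArena`, `FrameTextCache`, the 4096-byte chunk size, and keying the cache by frame address are all assumptions made for the example, and the arena here is a simple bump allocator rather than HotSpot's `Arena` class. Each string occupies only its own length plus the terminator, and a string longer than a chunk gets a dedicated allocation instead of being truncated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical bump allocator for strings; NOT HotSpot's Arena class.
class StringArena {
  static constexpr std::size_t kChunkSize = 4096;  // assumed chunk size
  std::vector<std::unique_ptr<char[]>> _chunks;    // owns all storage
  char* _current = nullptr;                        // chunk being filled
  std::size_t _used = 0;                           // bytes used in _current
  std::size_t _total = 0;                          // payload bytes stored

public:
  // Copies s, including the terminating NUL, into the arena.
  // The returned pointer stays valid for the lifetime of the arena.
  const char* store(const std::string& s) {
    const std::size_t len = s.size() + 1;
    char* dest;
    if (len > kChunkSize) {
      // Oversized string: give it a dedicated chunk, no truncation.
      _chunks.push_back(std::make_unique<char[]>(len));
      dest = _chunks.back().get();
    } else {
      if (_current == nullptr || _used + len > kChunkSize) {
        _chunks.push_back(std::make_unique<char[]>(kChunkSize));
        _current = _chunks.back().get();
        _used = 0;
      }
      dest = _current + _used;
      _used += len;
    }
    std::memcpy(dest, s.c_str(), len);
    _total += len;
    return dest;
  }

  std::size_t bytes_stored() const { return _total; }
};

// Hypothetical cache: frame address -> decoded text held in the arena.
class FrameTextCache {
  StringArena _arena;
  std::unordered_map<std::uintptr_t, const char*> _map;

public:
  // Returns the cached text for pc; calls the (slow) decoder only on a miss.
  template <typename Decoder>
  const char* lookup(std::uintptr_t pc, Decoder decode) {
    auto it = _map.find(pc);
    if (it != _map.end()) {
      return it->second;
    }
    const char* text = _arena.store(decode(pc));
    _map.emplace(pc, text);
    return text;
  }

  std::size_t bytes_stored() const { return _arena.bytes_stored(); }
};
```

With fixed-size entries every cached frame would cost 1024 bytes regardless of its length; here a short frame description costs only as many bytes as it has characters, plus one.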

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19655#issuecomment-2183974599