RFR: 8295646: Ignore zero pairs in address descriptors read by dwarf parser

Christian Hagedorn chagedorn at openjdk.org
Wed Oct 19 10:22:27 UTC 2022


On Wed, 19 Oct 2022 08:22:01 GMT, Xiaolin Zheng <xlinzheng at openjdk.org> wrote:

> RISC-V generates debuginfo like
> 
> 
>> readelf --debug-dump=aranges build/linux-riscv64-server-fastdebug/images/test/hotspot/gtest/server/libjvm.so
> 
> ... 
>   Length:                   1756
>   Version:                  2
>   Offset into .debug_info:  0x4bc5e9
>   Pointer Size:             8
>   Segment Size:             0
> 
>     Address            Length
>     0000000000344ece 0000000000004a2c
>     0000000000000000 0000000000000000     <=
>     0000000000000000 0000000000000000     <=
>     0000000000000000 0000000000000000     <=
>     00000000003498fa 0000000000000016
>     0000000000349910 0000000000000016
>     ....
>     000000000026d5b8 0000000000000b9a
>     000000000034a532 0000000000000628
>     000000000034ab5a 00000000000002ac
>     0000000000000000 0000000000000000     <=
>     0000000000000000 0000000000000000
>     0000000000000000 0000000000000000
>     000000000034ae06 0000000000000bee
>     000000000034b9f4 0000000000000660
>     000000000034c054 00000000000005aa
>     0000000000000000 0000000000000000
>     0000000000000000 0000000000000000     <=
>     000000000034c5fe 0000000000000af2
>     000000000034d0f0 0000000000000f16
>     000000000034e006 0000000000000b4a
>     0000000000000000 0000000000000000
>     0000000000000000 0000000000000000
>     000000000026e152 000000000000000e
>     0000000000000000 0000000000000000
> 
> 
> Our dwarf parser uses `address == 0 && size == 0` in `is_terminating_entry()` to detect the end of an arange set. (gdb's dwarf parser did the same until this April [1] and hit the same issue on RISC-V.) On RISC-V's debuginfo this stops parsing early at an "apparent terminator", as described in [1], so the results are wrong and tests fail. `_header._unit_length` is read but not used, although it is the real length that determines where the set ends, so we can use it to compute the end position instead of relying on the `address == 0 && size == 0` check.
> 
> `readelf` has no such issue because it uses the same approach to determine the end position [2].
> 
> The tests added along with the dwarf parser patch all pass on x86_64, aarch64, and riscv64.
> A tier1 sanity run is in progress.
> 
> Thanks,
> Xiaolin
> 
> [1] https://github.com/bminor/binutils-gdb/commit/1a7c41d5ece7d0d1aa77d8019ee46f03181854fa
> [2] https://github.com/bminor/binutils-gdb/blob/fd320c4c29c9a1915d24a68a167a5fd6d2c27e60/binutils/dwarf.c#L7594

Marked as reviewed by chagedorn (Reviewer).

Hi Xiaolin

> Our dwarf parser uses `address == 0 && size == 0` in `is_terminating_entry()` to detect the end of an arange set. (gdb's dwarf parser did the same until this April [1] and hit the same issue on RISC-V.) On RISC-V's debuginfo this stops parsing early, so the results are wrong and tests fail. `_header._unit_length` is read but not used, although it is the real length that determines where the set ends, so we can use it to compute the end position instead of relying on the `address == 0 && size == 0` check.

It's interesting that the emitted format does not comply with the official DWARF spec. I've run into similar inconsistencies in other places where GCC does something different. Anyway, in that case your fix makes sense: read the entire set based on the `_unit_length` field instead of relying on `(0, 0)` being the terminating entry (which would normally give the same result).

We could additionally assert that the real terminating entry, i.e. the last one in the set, is indeed `(0, 0)` as required by the spec. But you would need to check whether that holds on RISC-V.
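
To make the idea concrete, here is a rough, standalone sketch of length-based parsing. It is not the HotSpot code: `AddressRange`, `ArangeSet`, `read_at()` and `parse_arange_set()` are hypothetical names, and it assumes 32-bit DWARF, an address size of 8, a segment size of 0, and a file byte order that matches the host. It derives the end of the set from `unit_length`, skips `(0, 0)` pairs in the middle, and records whether the final tuple is `(0, 0)` so that a caller could assert on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical types for illustration; not HotSpot's actual classes.
struct AddressRange {
  uint64_t address;
  uint64_t length;
};

struct ArangeSet {
  std::vector<AddressRange> ranges;  // all non-zero (address, length) pairs
  bool ends_with_zero_pair = false;  // true if the final tuple was (0, 0)
  size_t end_offset = 0;             // offset of the first byte after the set
};

// Copy a fixed-size value out of the buffer. Assumes the file's byte order
// matches the host's byte order.
template <typename T>
T read_at(const std::vector<uint8_t>& buf, size_t pos) {
  T value;
  std::memcpy(&value, buf.data() + pos, sizeof(T));
  return value;
}

// Parse one set of .debug_aranges starting at set_start. Returns false for
// truncated input or for layouts this sketch does not handle.
bool parse_arange_set(const std::vector<uint8_t>& section, size_t set_start,
                      ArangeSet* out) {
  assert(out != nullptr);
  *out = ArangeSet();

  // 32-bit DWARF header: unit_length(4) version(2) debug_info_offset(4)
  // address_size(1) segment_size(1) = 12 bytes.
  const size_t header_size = 12;
  if (set_start > section.size() || section.size() - set_start < header_size) {
    return false;
  }
  const uint32_t unit_length = read_at<uint32_t>(section, set_start);
  if (unit_length == 0xffffffffu) return false;  // 64-bit DWARF: not handled
  if (unit_length > section.size() - set_start - 4) return false;

  // unit_length does not include the length field itself.
  const size_t set_end = set_start + 4 + unit_length;

  const uint8_t address_size = section[set_start + 10];
  const uint8_t segment_size = section[set_start + 11];
  if (address_size != 8 || segment_size != 0) return false;

  // Tuples are 2 * address_size bytes; the header is padded up to a multiple
  // of the tuple size, so the first tuple sits 16 bytes into the set.
  const size_t tuple_size = 16;
  size_t pos = set_start + 16;
  while (pos <= set_end && set_end - pos >= tuple_size) {
    const uint64_t address = read_at<uint64_t>(section, pos);
    const uint64_t length = read_at<uint64_t>(section, pos + 8);
    pos += tuple_size;
    const bool is_last = set_end - pos < tuple_size;
    if (address == 0 && length == 0) {
      // A zero pair only counts as the terminator in the last slot;
      // anywhere else it is skipped instead of ending the loop.
      if (is_last) out->ends_with_zero_pair = true;
      continue;
    }
    out->ranges.push_back({address, length});
  }
  out->end_offset = set_end;
  return true;
}
```

A caller could then check `ends_with_zero_pair` in a debug build, provided the last tuple really is `(0, 0)` on every platform.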

Thanks,
Christian

-------------

PR: https://git.openjdk.org/jdk/pull/10758


More information about the hotspot-dev mailing list