RFR: 8261710: SA DSO objects have sizes that are too large
Daniel D.Daugherty
dcubed at openjdk.java.net
Tue Feb 16 22:17:40 UTC 2021
On Tue, 16 Feb 2021 12:19:38 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8250750 has some interesting details that I forgot about. Note this is referencing the state of the code before fixing JDK-8250750, which changed how `lib->memsz` is calculated.
>>
>>> `lib->memsz` comes from the size of the `LC_SEGMENT_64` that the library was discovered in. However, the library can actually span multiple segments. In this case of the vtable address, the address was in the segment that follows the initial `LC_SEGMENT_64`. Because of this `lib->memsz` is too small, resulting in symbol lookups being restricted to addresses that are in the initial segment.
>>
>>> The simplest approach to fixing this seems to be locating the largest offset found in the symbol table, round that up to a page boundary, and use it as `lib->memsz`. I've implemented this and it seems to be working.
>>
>> So it's not perfect, but I think it's good enough for our needs and better than using the file size and ending up with a size that is too big. I think the main way that this could fail is if you use `findpc` on an address that is beyond the last symbol (if it is even possible for there to be memory mapped at an address after the last symbol). In this case `findpc` will just say it doesn't know where the address is rather than saying it is in the dso + some very large offset. For symbols in the dso, they will always be found since lib`->memsz` now covers all symbols.
>>
>> Also, since I never did figure out how to determine the size of a mach-o symbol, it's possible that the last symbol extends beyond the page boundary. If that is the case and you use `findpc` on and address that is part of the symbol but beyond the page boundary, `findpc` won't find it and say it doesn't know what the address is. So again, not perfect, but this issue can only happen with the last symbol in the dso, and only if the symbol crosses a page boundary, and only if the searched for address is after the page boundary. If you search for the start of the symbol, you'll still find it.
>>
>> Having said all that, I think maybe we can get the correct size by taking the address of the highest symbol and then looking up the `map_info` that it is in. Then it's just a matter of adding `map->vaddr + map->memsz`. But this assumes that there are no more maps for the dso after the one that has the last symbol, so maybe still not perfect. I'm not sure any of this extra work is worth it. When I dove into the mach-o support last year to fix some issues, it ended up being a huge rat hole that took way more of my time then I had expected. It was very badly broken, far worse then I first thought, and many things about mach-o core files were a mystery, and still remain so. For example, see https://bugs.openjdk.java.net/browse/JDK-8249779. We never did figure out how to create a link map.
>
> Thanks for explanation!
> I understood Marc-O is complex, and to use `lib->memsz` is reasonable as you said.
>
> I updated patch for macOS. I cannot test it, but all of tier 1 servicerbility tests were passed on GitHub Actions. Could you review again?
While I was characterizing:
JDK-8261844 serviceability/sa/ClhsdbFindPC.java#id1 failed with "'In code in NMethod for LingeredAppWithTrivialMain.main' missing from stdout/stderr"
I determined that I could easily reproduce serviceability/sa/ClhsdbFindPC.java failures
on my Ubuntu 16.04 test machine. I've applied this version of this PR at @plummercj's
request:
$ git log HEAD^!
commit c9a34723e34550301e1b5fd06185e0b0977da78e (HEAD -> pull/2563)
Author: Yasumasa Suenaga <suenaga at oss.nttdata.com>
Date: Tue Feb 16 18:05:43 2021 +0900
Remove unnecessary code
and retested serviceability/sa/ClhsdbFindPC.java:
$ do_java_test serviceability/sa/ClhsdbFindPC.java 2>&1 | tee -a do_java_test.8261710.log
INFO: GNUMAKE=make
INFO: GNUMAKE version is: GNU Make 4.1
INFO: JTREG options:
INFO: JOBS=16
INFO: TEST_MODE=othervm
INFO: EXTRA_PROBLEM_LISTS=ProblemList-extra.txt
INFO: VM_OPTIONS=
INFO: test_val=serviceability/sa/ClhsdbFindPC.java
Test Config: linux-x86_64-normal-server-release
INFO: TIMEOUT_FACTOR=4
Done testing
Test Run linux-x86_64-normal-server-release time: 0.55 minutes.
TEST TOTAL PASS FAIL ERROR
jtreg:open/test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java
>> 4 3 1 0 <<
1 failure(s) found in log=do_java_test.linux-x86_64-normal-server-release.log
TEST: serviceability/sa/ClhsdbFindPC.java#id3
ERROR: Failed to instantiate timeout handler: jdk.test.failurehandler.jtreg.GatherProcessInfoTimeoutHandler: file does not exist
Test Config: linux-x86_64-normal-server-fastdebug
INFO: TIMEOUT_FACTOR=6
Done testing
Test Run linux-x86_64-normal-server-fastdebug time: 0.87 minutes.
TEST TOTAL PASS FAIL ERROR
jtreg:open/test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java
4 4 0 0
Test Config: linux-x86_64-normal-server-slowdebug
INFO: TIMEOUT_FACTOR=12
Done testing
Test Run linux-x86_64-normal-server-slowdebug time: 2.97 minutes.
TEST TOTAL PASS FAIL ERROR
jtreg:open/test/hotspot/jtreg/serviceability/sa/ClhsdbFindPC.java
>> 4 3 1 0 <<
1 failure(s) found in log=do_java_test.linux-x86_64-normal-server-slowdebug.log
TEST: serviceability/sa/ClhsdbFindPC.java#id3
LOG: build/linux-x86_64-normal-server-slowdebug/test-support/jtreg_open_test_hotspot_jtreg_serviceability_sa_ClhsdbFindPC_java/serviceability/sa/ClhsdbFindPC_id3.jtr
Saving build/linux-x86_64-normal-server-slowdebug/test-support/jtreg_open_test_hotspot_jtreg_serviceability_sa_ClhsdbFindPC_java/serviceability/sa/ClhsdbFindPC_id3.jtr as /work/shared/bug_hunt/8261844_for_jdk17.git/test_failures.2021-02-16-171036/ClhsdbFindPC_id3.jtr.slowdebug
Saving /work/shared/bug_hunt/8261844_for_jdk17.git/build/linux-x86_64-normal-server-slowdebug/test-support/jtreg_open_test_hotspot_jtreg_serviceability_sa_ClhsdbFindPC_java/serviceability/sa/ClhsdbFindPC_id3/hs_err_pid16470.log as /work/shared/bug_hunt/8261844_for_jdk17.git/test_failures.2021-02-16-171036/hs_err_pid16470.log
Moving /work/shared/bug_hunt/8261844_for_jdk17.git/build/linux-x86_64-normal-server-slowdebug/test-support/jtreg_open_test_hotspot_jtreg_serviceability_sa_ClhsdbFindPC_java/serviceability/sa/ClhsdbFindPC_id3/core to /work/shared/bug_hunt/8261844_for_jdk17.git/test_failures.2021-02-16-171036/core.16470
Total test time: 4.40 minutes.
-------------
PR: https://git.openjdk.java.net/jdk/pull/2563
More information about the serviceability-dev
mailing list