RFR: 8282306: os::is_first_C_frame(frame*) crashes on invalid link access [v3]

Thomas Stuefe stuefe at openjdk.java.net
Thu Feb 24 06:21:08 UTC 2022


On Wed, 23 Feb 2022 22:51:44 GMT, Johannes Bechberger <duke at openjdk.java.net> wrote:

>> This PR introduces a new method `can_access_link` into the frame class to check the accessibility of the link information. It furthermore adds a new `os::is_first_C_frame(frame*, Thread*)` that uses the `can_access_link` method
>> and the passed thread object to check the validity of frame pointer, stack pointer, sender frame pointer and sender stack pointer. This should reduce the possibilities for crashes.
>
> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Use safefetch

Hi Johannes,

thanks for taking my suggestion. This is better, and helps beyond your AsyncGetCallTrace scenario (e.g. in NMT).

safefetch works as an unconditional sub routine call to a prolog-free piece of code which does a single load. Basically:


(1) jump <safefetch pc> -> (2) load from questionable address -> (3) return


and the signal handler knows how to handle things if a segfault happens at (2).

So, for the standard case, if no fault happens, you pay for a subroutine call and a load. This is as cheap as it gets, but still not as cheap as a single inline load would be.

---

Still, I'm not sure I would add this to such a low-level function as frame::link(), at least not without analyzing the callers. Most of the callers of frame::link don't seem to be that performance-sensitive that a sub-routine call would throw them off. But I'm not sure here.

Moreover, even though your solution is beautifully simple, I don't like "lying" at this level. There may be cases where we rather have an honest crash when dereferencing an invalid frame, because we may want to analyze the root cause.

What I actually had in mind - sorry I was not too clear in my first review - was to use SafeFetch inside is_first_C_frame to check the validity of the link before dereferencing it. `is_first_C_frame()` is not super performace-critical, so it should be fine to use safefetch here. 

Note that we have `os::is_readable_pointer()` which encapsulates SafeFetch for checking pointer validity. So I imagine something like this:


bool frame::link_is_valid() {
	return os::is_readable_pointer(link);
}

...

bool os::is_first_C_frame(frame* fr) {
...
  // If the link address is invalid we are not walkable beyond this point.
  if (!fr->link_is_valid()) return true;
}


@dholmes-ora : the motivation is to harden a piece of code which may run in unsafe situations in production scenarios. Examples: AsyncGetCallTrace, stack printing in error reports, stack printing in NMT... Error handling has its secondary crash guards, but the other scenarios are "naked". And we have downstream additional facilities which use VM stack printing.

About a test, I agree, that would be nice. But one would have to "fake" an invalid stack. Maybe a new error reporting test where one deliberately overwrites portions of the stack and then tries to print the stack. However, I imagine things could be brittle, because the OS may catch a stack overwrite first. It's not totally trivial, maybe something for a separate RFE?

Cheers, Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/7591


More information about the hotspot-dev mailing list