RFR: 8319850: PrintInlining should print which methods are late inlines [v11]
Johan Sjölen
jsjolen at openjdk.org
Thu Nov 21 19:12:21 UTC 2024
On Thu, 21 Nov 2024 14:49:42 GMT, theoweidmannoracle <duke at openjdk.org> wrote:
>> In https://github.com/openjdk/jdk/pull/16595 @caojoshua previously suggested changes to indicate which calls were inlined late, when printing inlines. This PR re-introduces the changes from the previously closed PR and fixes a minor issue where asserts were triggered.
>>
>> Concerns were raised by @rwestrel in the previous PR:
>>
>>> When InlineTree::ok_to_inline() is called, some diagnostic message is recorded for the call site. Do I understand right that with this patch, if the call is inlined late, then that message is dropped and replaced by a new "late inline.." message? If that's the case, isn't it the case that sometimes the InlineTree::ok_to_inline() has some useful information that's lost when late inlining happens?
>>
>> As already pointed out in the PR by @caojoshua, this does not matter for string/methodhandle/vector/boxing late inlines, as they are [only performed if ok_to_inline() returns true](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/doCall.cpp#L189). This is also the only call to ok_to_inline().
>>
>> The only other location, where late inline call generators are created, are calls to CallGenerator::for_late_inline_virtual(), which creates a LateInlineVirtualCallGenerator. LateInlineVirtualCallGenerator (introduced in https://github.com/openjdk/jdk/pull/1550) does not really perform inlining but rather performs strength reduction from virtual to static calls. As can be verified by running the according test `test/hotspot/jtreg/compiler/c2/irTests/TestPostParseCallDevirtualization.java`, this does not affect the printing for inlining:
>>
>>
>> 5022 1026 3 compiler.c2.irTests.TestPostParseCallDevirtualization::callHelper (7 bytes) made not entrant
>> @ 1 compiler.c2.irTests.TestPostParseCallDevirtualization$I::method (0 bytes) failed to inline: virtual call
>>
>>
>> Thus, as far as I can tell, the proposed changes by @caojoshua do not lose any useful information about why no inlining prior to the late inlining occurred.
>
> theoweidmannoracle has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix more style issues
Hi Theo,
I've got some style comments, but more importantly I think that there are some bugs in the allocation scheme.
The `InlinePrinter` object is stored inside of `Compile`, so it could dictate the lifetime of its objects by having its own `Arena`, it doesn't have to share `_comp_arena`. That's up to you, I'm just stating that that is a design possibility.
Let's look at the real bug, starting in `IPInlineAttempt`.
```c++
struct IPInlineAttempt : public ArenaObj {
IPInlineAttempt(InliningResult result);
const InliningResult result;
stringStream msg;
};
It's a bug that this compiles. `stringStream` is an object with a non-trivial destructor which heap allocates. Arena allocated objects never have their destructor run, so each `IPInlineAttempt` that is allocated may potentially leak memory (depends on whether the string grows larger than the small buffer pre-allocated for it or not). I should file a ticket regarding this.
The fix for that is to separate out the `stringStream`s and `CHeap` allocate them. Here's an idea:
```c++
class InlinePrinter {
GrowableArray<stringStream, mtCompiler> _streams;
using StreamIndex = int;
struct IPInlineAttempt : public ArenaObj {
const InliningResult result;
const StreamIndex stream;
};
stringStream& stream_of(IPInlineAttempt& a) {
return _streams.at(a.stream);
}
};
This is slightly different, since a resizing of the backing GA will invalidate the pointer to the stream. If you don't want that, then you need a different container that we don't have. That'd be a nice addition, actually. I looked at your usages, seems like it's fine that we don't have address stable streams. Am I wrong?
With that change `IPInlineAttempt`s are just values and so you can simplify `IPInlineSite`:
```c++
class IPInlineSite : public ArenaObj {
GrowableArray<IPInlineAttempt> _attempts;
GrowableArray<IPInlineSite*> _children;
};
That's nice. No more suspicions of where that `IPInlineAttempt` resides or whether it can be changed.
Anyway, there are probably more and maybe more elegant ways of solving the memory leak problem.
I've asked another developer to have a look at this and to confirm whether I'm right or not on this matter.
Cheers!
src/hotspot/share/opto/printinlining.cpp line 34:
> 32: }
> 33:
> 34: InlinePrinter::IPInlineAttempt::IPInlineAttempt(InliningResult result) : result(result) {
Hm, here the `msg` isn't explicitly initialized. Does that leave it uninitialized?
src/hotspot/share/opto/printinlining.cpp line 41:
> 39: return &_nullStream;
> 40: }
> 41: auto attempt = locate_call(state, method)->add(result);
Style: Expand the `auto` into actual types. We typically only use `auto` for lambda functions.
src/hotspot/share/opto/printinlining.cpp line 45:
> 43: attempt->msg.print("%s", msg);
> 44: }
> 45: return &attempt->msg; // IPInlineAttempts are heap allocated so this address is safe
Surely arena allocated?
src/hotspot/share/opto/printinlining.cpp line 61:
> 59:
> 60: return locate_call(state->caller(), nullptr)->at_bci(state->bci(), create_for);
> 61: }
Can we be on the safe side and convert this into an iterative process instead, so that we don't have to worry about stack usage?
src/hotspot/share/opto/printinlining.hpp line 51:
> 49: * @returns An output stream which stores the message associated with this attempt. The buffer stays valid until InlinePrinter is deallocated.
> 50: * You can print arbitrary information to this stream but do not add line breaks, as this will break formatting.
> 51: */
Style: We typically don't use `@param`, `@returns` in Hotspot.
Consider this a nit, I'm not familiar enough with C2 codebase to know whether this adheres to C2 style.
src/hotspot/share/opto/printinlining.hpp line 57:
> 55: * Prints all collected inlining information to the given output stream.
> 56: */
> 57: void dump(outputStream* tty);
Style: Typically called `print_on` in Hotspot
src/hotspot/share/opto/printinlining.hpp line 95:
> 93: ciMethod* const _method;
> 94: GrowableArray<IPInlineAttempt*> _attempts;
> 95: GrowableArray<IPInlineSite*> _children;
Style: Private members are placed at the start in HotSpot.
-------------
PR Review: https://git.openjdk.org/jdk/pull/21899#pullrequestreview-2452174604
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852602107
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852582978
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852584341
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852679781
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852569011
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852567039
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852567843
More information about the hotspot-compiler-dev
mailing list