RFR: 8319850: PrintInlining should print which methods are late inlines [v11]

Johan Sjölen jsjolen at openjdk.org
Thu Nov 21 19:12:21 UTC 2024


On Thu, 21 Nov 2024 14:49:42 GMT, theoweidmannoracle <duke at openjdk.org> wrote:

>> In https://github.com/openjdk/jdk/pull/16595 @caojoshua previously suggested changes to indicate which calls were inlined late, when printing inlines. This PR re-introduces the changes from the previously closed PR and fixes a minor issue where asserts were triggered.
>> 
>> Concerns were raised by @rwestrel in the previous PR:
>> 
>>> When InlineTree::ok_to_inline() is called, some diagnostic message is recorded for the call site. Do I understand right that with this patch, if the call is inlined late, then that message is dropped and replaced by a new "late inline.." message? If that's the case, isn't it the case that sometimes the InlineTree::ok_to_inline() has some useful information that's lost when late inlining happens?
>> 
>> As already pointed out in the PR by @caojoshua, this does not matter for string/methodhandle/vector/boxing late inlines, as they are [only performed if ok_to_inline() returns true](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/doCall.cpp#L189). This is also the only call to ok_to_inline().  
>> 
>> The only other location, where late inline call generators are created, are calls to CallGenerator::for_late_inline_virtual(), which creates a LateInlineVirtualCallGenerator. LateInlineVirtualCallGenerator (introduced in https://github.com/openjdk/jdk/pull/1550) does not really perform inlining but rather performs strength reduction from virtual to static calls. As can be verified by running the according test `test/hotspot/jtreg/compiler/c2/irTests/TestPostParseCallDevirtualization.java`, this does not affect the printing for inlining:
>> 
>> 
>> 5022 1026       3       compiler.c2.irTests.TestPostParseCallDevirtualization::callHelper (7 bytes)   made not entrant
>>                               @ 1   compiler.c2.irTests.TestPostParseCallDevirtualization$I::method (0 bytes)   failed to inline: virtual call
>> 
>> 
>> Thus, as far as I can tell, the proposed changes by @caojoshua do not lose any useful information about why no inlining prior to the late inlining occurred.
>
> theoweidmannoracle has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix more style issues

Hi Theo,

I've got some style comments, but more importantly I think that there are some bugs in the allocation scheme.

The `InlinePrinter` object is stored inside of `Compile`, so it could dictate the lifetime of its objects by having its own `Arena`, it doesn't have to share `_comp_arena`. That's up to you, I'm just stating that that is a design possibility.

Let's look at the real bug, starting in `IPInlineAttempt`.

```c++
  struct IPInlineAttempt : public ArenaObj {
    IPInlineAttempt(InliningResult result);
    const InliningResult result;
    stringStream msg;
  };


It's a bug that this compiles. `stringStream` is an object with a non-trivial destructor which heap allocates. Arena allocated objects never have their destructor run, so each `IPInlineAttempt` that is allocated may potentially leak memory (depends on whether the string grows larger than the small buffer pre-allocated for it or not). I should file a ticket regarding this.

The fix for that is to separate out the `stringStream`s and `CHeap` allocate them. Here's an idea:

```c++
class InlinePrinter {
  GrowableArray<stringStream, mtCompiler> _streams;
  using StreamIndex = int;

  struct IPInlineAttempt : public ArenaObj {
    const InliningResult result;
    const StreamIndex stream;
  };
  stringStream& stream_of(IPInlineAttempt& a) {
    return _streams.at(a.stream);
  }
};


This is slightly different, since a resizing of the backing GA will invalidate the pointer to the stream. If you don't want that, then you need a different container that we don't have. That'd be a nice addition, actually. I looked at your usages, seems like it's fine that we don't have address stable streams. Am I wrong?

With that change `IPInlineAttempt`s are just values and so you can simplify `IPInlineSite`:

```c++
class IPInlineSite : public ArenaObj {
  GrowableArray<IPInlineAttempt> _attempts;
  GrowableArray<IPInlineSite*> _children;
};


That's nice. No more suspicions of where that `IPInlineAttempt` resides or whether it can be changed.

Anyway, there are probably more and maybe more elegant ways of solving the memory leak problem.

I've asked another developer to have a look at this and to confirm whether I'm right or not on this matter.

Cheers!

src/hotspot/share/opto/printinlining.cpp line 34:

> 32: }
> 33: 
> 34: InlinePrinter::IPInlineAttempt::IPInlineAttempt(InliningResult result) : result(result) {

Hm, here the `msg` isn't explicitly initialized. Does that leave it uninitialized?

src/hotspot/share/opto/printinlining.cpp line 41:

> 39:     return &_nullStream;
> 40:   }
> 41:   auto attempt = locate_call(state, method)->add(result);

Style: Expand the `auto` into actual types. We typically only use `auto` for lambda functions.

src/hotspot/share/opto/printinlining.cpp line 45:

> 43:     attempt->msg.print("%s", msg);
> 44:   }
> 45:   return &attempt->msg; // IPInlineAttempts are heap allocated so this address is safe

Surely arena allocated?

src/hotspot/share/opto/printinlining.cpp line 61:

> 59: 
> 60:   return locate_call(state->caller(), nullptr)->at_bci(state->bci(), create_for);
> 61: }

Can we be on the safe side and convert this into an iterative process instead, so that we don't have to worry about stack usage?

src/hotspot/share/opto/printinlining.hpp line 51:

> 49:    * @returns An output stream which stores the message associated with this attempt. The buffer stays valid until InlinePrinter is deallocated.
> 50:    *          You can print arbitrary information to this stream but do not add line breaks, as this will break formatting.
> 51:    */

Style: We typically don't use `@param`, `@returns` in Hotspot.

Consider this a nit, I'm not familiar enough with C2 codebase to know whether this adheres to C2 style.

src/hotspot/share/opto/printinlining.hpp line 57:

> 55:    * Prints all collected inlining information to the given output stream.
> 56:    */
> 57:   void dump(outputStream* tty);

Style: Typically called `print_on` in Hotspot

src/hotspot/share/opto/printinlining.hpp line 95:

> 93:     ciMethod* const _method;
> 94:     GrowableArray<IPInlineAttempt*> _attempts;
> 95:     GrowableArray<IPInlineSite*> _children;

Style: Private members are placed at the start in HotSpot.

-------------

PR Review: https://git.openjdk.org/jdk/pull/21899#pullrequestreview-2452174604
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852602107
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852582978
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852584341
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852679781
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852569011
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852567039
PR Review Comment: https://git.openjdk.org/jdk/pull/21899#discussion_r1852567843


More information about the hotspot-compiler-dev mailing list