RFR: 8279016: JFR Leak Profiler is broken with Shenandoah

Erik Gahlin egahlin at openjdk.org
Fri Jul 26 11:43:35 UTC 2024


On Fri, 26 Jul 2024 08:36:30 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 60:
>> 
>>> 58:             addToTestField();
>>> 59: 
>>> 60:             // Let GC catch up before we stop the recording and do the old object sample
>> 
>> We have worked hard to remove Thread.sleep, in the JFR tests, so we can run them in parallel, i.e. -conc:64, without introducing failures. I think the test was written before event streaming existed. The test could perhaps be rewritten to use an event stream, i.e:
>> 
>>     try (RecordingStream rs = new RecordingStream()) {
>>       rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity");
>>       rs.onEvent(e -> {
>>         if (hasValidField(e)) {
>>           rs.close();
>>         }
>>       });
>>       rs.startAsync();
>>       addToTestField();
>>       rs.awaitTermination();
>>     }
>
> I agree using `sleep` here is not great. But I could not find a better way to do it.
> 
> When I was looking at this test, I noticed that `OldObjectSample` event is only emitted when recording is stopped: https://github.com/openjdk/jdk/blob/7f11935461a6ff1ef0fac800fb4264a519e21061/src/jdk.jfr/share/classes/jdk/jfr/internal/OldObjectSample.java#L41-L42 -- this is why I thought to add `sleep` right before we stop the recording.
> 
> So going async does not really help, because of this circularity: we only stop the recording when event is received, which we would not receive until recording is stopped. Sure enough, this times out:
> 
> 
>     public static void main(String[] args) throws Exception {
>         WhiteBox.setWriteAllObjectSamples(true);
> 
>         try (RecordingStream rs = new RecordingStream()) {
>             rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity");
>             rs.onEvent(e -> {
>                 try {
>                     if (hasValidField(e)) {
>                         rs.close();
>                     }
>                 } catch (Exception ex) {
>                     ex.printStackTrace();
>                 }
>             });
> 
>             rs.startAsync();
>             addToTestField();
>             rs.awaitTermination();
>         }
>     }

I see the problem now. 

As I understand it, this only happens with Shenandoah. Would it be possible to sleep only when using that GC? Or perhaps repeat on failure, increasing the sleep time with a few seconds för every iteration, starting with 0s?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1692942006


More information about the hotspot-jfr-dev mailing list