RFR: 8279016: JFR Leak Profiler is broken with Shenandoah
Erik Gahlin
egahlin at openjdk.org
Fri Jul 26 11:43:35 UTC 2024
On Fri, 26 Jul 2024 08:36:30 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 60:
>>
>>> 58: addToTestField();
>>> 59:
>>> 60: // Let GC catch up before we stop the recording and do the old object sample
>>
>> We have worked hard to remove Thread.sleep, in the JFR tests, so we can run them in parallel, i.e. -conc:64, without introducing failures. I think the test was written before event streaming existed. The test could perhaps be rewritten to use an event stream, i.e:
>>
>> try (RecordingStream rs = new RecordingStream()) {
>> rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity");
>> rs.onEvent(e -> {
>> if (hasValidField(e)) {
>> rs.close();
>> }
>> });
>> rs.startAsync();
>> addToTestField();
>> rs.awaitTermination();
>> }
>
> I agree using `sleep` here is not great. But I could not find a better way to do it.
>
> When I was looking at this test, I noticed that `OldObjectSample` event is only emitted when recording is stopped: https://github.com/openjdk/jdk/blob/7f11935461a6ff1ef0fac800fb4264a519e21061/src/jdk.jfr/share/classes/jdk/jfr/internal/OldObjectSample.java#L41-L42 -- this is why I thought to add `sleep` right before we stop the recording.
>
> So going async does not really help, because of this circularity: we only stop the recording when event is received, which we would not receive until recording is stopped. Sure enough, this times out:
>
>
> public static void main(String[] args) throws Exception {
> WhiteBox.setWriteAllObjectSamples(true);
>
> try (RecordingStream rs = new RecordingStream()) {
> rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity");
> rs.onEvent(e -> {
> try {
> if (hasValidField(e)) {
> rs.close();
> }
> } catch (Exception ex) {
> ex.printStackTrace();
> }
> });
>
> rs.startAsync();
> addToTestField();
> rs.awaitTermination();
> }
> }
I see the problem now.
As I understand it, this only happens with Shenandoah. Would it be possible to sleep only when using that GC? Or perhaps repeat on failure, increasing the sleep time with a few seconds för every iteration, starting with 0s?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1692942006
More information about the hotspot-jfr-dev
mailing list