[14] RFR(M): 8234059: Stress test fails with "Unexpected Exception in thread JFR Event Stream"

Erik Gahlin erik.gahlin at oracle.com
Thu Dec 19 15:43:22 UTC 2019


Reviewed offline. Looks good.

Erik

On 2019-12-19 02:35, Markus Gronlund wrote:
> Greetings,
>
> Kindly asking for reviews for the following changeset.
>
> In short, the problem can manifest when recording is stopped completely (no recordings running) and later started again. On the subsequent start, artifacts that were tagged for the epoch that was running when the first recording stopped still carry their tag bits for that epoch. When recording resumes and the same epoch is switched back in, the system fails to notify correctly about new, epoch-relative, tagged artifacts, because the artifact is determined to carry the correct, albeit now stale, bits. In this situation only the event, but not the corresponding constant pool artifact, is serialized, which later causes a resolution problem for the parser.
>
> The solution consists of two parts. The major part introduces an explicit tag-clearing pass at recording start (added as part of the overall clear()). The minor part turns the bit check for an already tagged artifact into a composite bit check that also involves an explicitly set CLEARED_BIT. This avoids very rare cases where a just-cleared bit is set again by a tagging thread (a trade-off that follows from the asymmetry of using cas() for bit clear versus non-cas() for bit set). Now, at the tag site, even if the correct bits for the current epoch are set, if the CLEARED_BIT is also set the artifact is treated as an 'untagged' artifact and a notification is sent.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8234059
> Webrev: http://cr.openjdk.java.net/~mgronlun/8234059/webrev01/
> Test: jdk_jfr
>
> Thanks
> Markus
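
For readers following the quoted description, the stale-tag scenario and the composite check can be sketched roughly as below. This is a simplified, single-threaded illustration and not the code in the webrev: the bit layout, the Artifact struct, and the functions epoch_bit(), is_tagged_simple(), is_tagged_composite(), tag(), clear_tags() and notified_after_restart() are all hypothetical names. Only CLEARED_BIT is a name taken from the message. The sketch also assumes that tagging drops CLEARED_BIT again, which the message does not state.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical bit layout, chosen for this sketch only.
constexpr std::uint32_t EPOCH_0_BIT = 1u << 0;
constexpr std::uint32_t EPOCH_1_BIT = 1u << 1;
constexpr std::uint32_t CLEARED_BIT = 1u << 2;
constexpr std::uint32_t EPOCH_BITS  = EPOCH_0_BIT | EPOCH_1_BIT;

// Stand-in for a constant pool artifact that carries tag bits.
struct Artifact {
  std::atomic<std::uint32_t> bits{0};
};

constexpr std::uint32_t epoch_bit(int epoch) {
  return epoch == 0 ? EPOCH_0_BIT : EPOCH_1_BIT;
}

// Simple check: the artifact counts as tagged if the current epoch bit is set.
bool is_tagged_simple(const Artifact& a, int epoch) {
  return (a.bits.load() & epoch_bit(epoch)) != 0;
}

// Composite check: the epoch bit must be set AND CLEARED_BIT must not be set.
bool is_tagged_composite(const Artifact& a, int epoch) {
  const std::uint32_t mask = epoch_bit(epoch) | CLEARED_BIT;
  return (a.bits.load() & mask) == epoch_bit(epoch);
}

// Tag site. Returns true when a notification would be sent, that is, when
// the artifact was considered untagged for the current epoch.
bool tag(Artifact& a, int epoch, bool composite) {
  const bool tagged = composite ? is_tagged_composite(a, epoch)
                                : is_tagged_simple(a, epoch);
  if (tagged) {
    return false;
  }
  // Non-CAS bit set (load, modify, store).
  // Assumption of this sketch: tagging also drops CLEARED_BIT.
  const std::uint32_t old_bits = a.bits.load(std::memory_order_relaxed);
  a.bits.store((old_bits | epoch_bit(epoch)) & ~CLEARED_BIT,
               std::memory_order_relaxed);
  return true;
}

// Clearing pass at recording start: a CAS loop that drops the epoch bits
// and sets CLEARED_BIT.
void clear_tags(Artifact& a) {
  std::uint32_t old_bits = a.bits.load();
  while (!a.bits.compare_exchange_weak(
             old_bits, (old_bits & ~EPOCH_BITS) | CLEARED_BIT)) {
    // old_bits is refreshed by compare_exchange_weak on failure; retry.
  }
}

// Stop/restart scenario: tag in epoch 0, stop, optionally clear, then tag
// again in the same epoch. Returns whether the second tag notifies.
bool notified_after_restart(bool clear_on_start) {
  Artifact a;
  const int epoch = 0;
  tag(a, epoch, true);   // first recording tags the artifact
  if (clear_on_start) {  // recording restarts, with or without clearing
    clear_tags(a);
  }
  return tag(a, epoch, true);
}
```

In this sketch, notified_after_restart(false) returns false, which corresponds to the reported failure (the event is written but no constant pool entry follows), while notified_after_restart(true) returns true. An artifact whose bits are EPOCH_0_BIT | CLEARED_BIT, the state a racing non-cas() set could leave behind, passes is_tagged_simple() but not is_tagged_composite(), so the tag site still notifies.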


More information about the hotspot-jfr-dev mailing list