RFR 8242263: Diagnose synchronization on primitive wrappers
Markus Gronlund
markus.gronlund at oracle.com
Thu Aug 20 09:29:21 UTC 2020
Hi Patricio,
The incremental update looks ok.
Thank you
Markus
-----Original Message-----
From: Patricio Chilano
Sent: den 20 augusti 2020 08:24
To: Markus Gronlund <markus.gronlund at oracle.com>; David Holmes <david.holmes at oracle.com>; Erik Gahlin <erik.gahlin at oracle.com>
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR 8242263: Diagnose synchronization on primitive wrappers
Hi Markus,
Thanks for looking at this.
On 8/19/20 3:20 PM, Markus Gronlund wrote:
> Hi Patricio,
>
> Touching on something that Erik alluded to, and that is that the flag:
>
> 811 diagnostic(intx, DiagnoseSyncOnPrimitiveWrappers, 0, \
> 812 "Detect and take action upon identifying synchronization on " \
> 813 "primitive wrappers. Modes: " \
> 814 "0: off; " \
> 815 "1: exit with fatal error; " \
> 816 "2: log message to stdout. Log options can be customized " \
> 817 " with -Xlog:primitivewrappers; " \
> 818 "3: emit JFR event") \
> 819 range(0, 3)
>
> has become quite rich - also trying to auto-activate functionality if certain parts are enabled - for example it ergonomically enables UL.
>
> For JFR, this is not the best situation, because there all event control is already detailed in the .jfc configuration files. The flag as it is suggested would now add yet another place to know about to have the JFR events be generated. The flag mechanics can be simplified, at least for the "3. emit JFR event" option. This is because it is already implied when DiagnoseSyncOnPrimitiveWrappers is set (they are set to "enabled" in the .jfc's). It can therefore be removed as an option.
Ok, makes sense. I removed it as an extra option. The JFR event will just be posted now if JFR is running for the non fatal case.
> In general, and to simplify matters, maybe just let the individual subsystems (UL, JFR) configure if, and if so what they will do if information comes routed their way by means of this flag? If so, you might only need two options: "1. Exit with fatal error", "2. Log information" (send to UL, send to JFR) - (perhaps you want to provide convenience by activating the UL tags in this case).
Right, in this case I need to activate the UL tag. Since the only purpose of this flag is to alert when an event occurs, just enabling the flag means I should output something without having to force the user to specify yet additional flags. That's why by default I activate the UL tag primitivewrappers to print to stdout. I could also just print to stdout without using UL, but then the user will not be able to change the output file unless I either add another flag, which I think would be too much for this, or I actually allow the user to specify that through UL and then when having to print I check if the tag is enabled to decide how to print. But for that last scenario since I will have to print anyways I might as well always use UL and configure the tag by default to print to stdout.
I was also considering maybe to remove the posting of the JFR event altogether, but looking at this last version I think there is no harm in keeping it. If JFR is enabled we will just post an event when this flag is set to logging.
Anyways, below is v6, let me know if you are okay with the changes:
Inc: http://cr.openjdk.java.net/~pchilanomate/8242263/v6/inc/webrev/
Full: http://cr.openjdk.java.net/~pchilanomate/8242263/v6/webrev/
Thanks!
Patricio
> Thanks
> Markus
>
>
> -----Original Message-----
> From: Patricio Chilano
> Sent: den 12 augusti 2020 00:14
> To: David Holmes <david.holmes at oracle.com>; Erik Gahlin
> <erik.gahlin at oracle.com>
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR 8242263: Diagnose synchronization on primitive
> wrappers
>
>
> On 8/10/20 9:40 PM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 11/08/2020 6:03 am, Patricio Chilano wrote:
>>> Hi Erik,
>>>
>>> Thanks for looking into this and for the offline discussions.
>> If there are still issues with this regarding JFR then I would
>> suggest we simply drop the use of JFR for reporting this.
> Ok, I'll wait to see about the JFR changes.
>
>> Meanwhile looking at v5 incremental that all seems fine to me.
> Great, thanks for looking into this David!
>
>
> Thanks,
> Patricio
>> Thanks,
>> David
>> -----
>>
>>> On 8/10/20 10:19 AM, Erik Gahlin wrote:
>>>> Hi Patricio,
>>>>
>>>> I have tried to review the JFR changes, but I need more information
>>>> on how the feature is going to be used.
>>>>
>>>> Is this something temporary that will be used gain insight for the
>>>> Valhalla project, or do you think this will be needed longer term?
>>>> If it is longer term, the bar is higher on what can be accepted.
>>>>
>>>> If this is a temporary feature, we could mark the event as
>>>> experimental in metadata.xml, similar to what we did with GC events
>>>> during the development of ZGC. Events that are experimental do not
>>>> show up by default in visualization tools such as JMC and can be
>>>> removed when they are not needed, or we have a better solution.
>>>>
>>> Right, it's temporary so we can mark it as experimental. I also
>>> changed the category name to be "Java Virtual Machine, Diagnostics".
>>>
>>>> For events to be enabled in default.jfc, they should not cause more
>>>> than 1% overhead, not even in pathological cases. To me, it seems
>>>> this could happen if you make a loop where it is triggered all the
>>>> time.
>>>>
>>>> For events in profile.jfc, the overhead should still be low (1-2%),
>>>> but the target is the typical application. For example, allocation
>>>> profiling is only enabled in profile.jfc. Some application that are
>>>> allocation intensive could cause a higher overhead than 1%, but
>>>> that is OK because that configuration is only to be used for a
>>>> short period of time. That said, there is still a budget on how
>>>> much space in a recording an event can take up, so it also depends
>>>> on how important the information is for the user. To me, it seems
>>>> this event will not be that important for the average user, which
>>>> makes me think it should be disabled by default.
>>>>
>>> After our off-list discussion I left it as enabled. I also removed
>>> the threshold setting and added the startTime=false in metadata.xml.
>>>
>>>> To complicate things, I noticed that aa command line flag is also
>>>> used to enable the event. This is something we have worked hard to
>>>> avoid when it comes to JFR events. All configuration should happen
>>>> using configuration files. We had problem with this in the past
>>>> where users gets confused why their events are not enabled. When we
>>>> ported JFR to Hotspot, we got rid of those of those flags/events
>>>> and it is a much better situation. Again, if this a temporary event
>>>> to discover usage pattern for a release or two, it might be OK, but
>>>> if we believe this event to stick around, I think we should look
>>>> into alternatives, or not use JFR at all for this.
>>>>
>>> Yes, I see your point. Since we want to diagnose synchronization on
>>> these classes but we don't want to affect performance the flag has
>>> to be specified at startup otherwise the diagnostic instrumentation
>>> is disabled, so I don't see a clear way to enable the events from
>>> JFR alone. In any case, this will be a temporary event.
>>>
>>> Here is v5. Let me know if you are okay with the changes.
>>>
>>> Inc: http://cr.openjdk.java.net/~pchilanomate/8242263/v5/inc/webrev/
>>> Full: http://cr.openjdk.java.net/~pchilanomate/8242263/v5/webrev/
>>>
>>> Thanks Erik!
>>>
>>> Patricio
>>>> Thanks
>>>>
>>>> Erik
>>>>
>>>>> On 28 Jul 2020, at 21:16, Patricio Chilano
>>>>> <patricio.chilano.mateo at oracle.com
>>>>> <mailto:patricio.chilano.mateo at oracle.com>> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Please review the following change that adds diagnostic
>>>>> capabilities when synchronizing on primitive wrapper classes.
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8242263
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~pchilanomate/8242263/v1/webrev/
>>>>>
>>>>> The new flag allows to identify synchronization on these classes
>>>>> and to take one of the following actions: exit the VM with fatal
>>>>> error, log a warning message, or issue a JFR event. The
>>>>> implementation uses a simple approach where a check is added at
>>>>> the beginning of the monitorenter generated code when the flag is
>>>>> enabled to check whether the object is of a primitive wrapper
>>>>> class. If it is, we jump to the slow path, otherwise we just
>>>>> continue as always. The extra instructions will be: load the klass
>>>>> of the object, load the _access_flags field for that klass, AND
>>>>> with a constant, and branch based on the result. The code will
>>>>> only be generated whenever the new opt-in diagnostic flag is
>>>>> enabled so performance won't be affected when off.
>>>>>
>>>>> In addition to the purpose described in the description of the
>>>>> bug, this flag will also be useful when trying to diagnose
>>>>> possible synchronization issues if these classes ever become
>>>>> inline types as part of the Valhalla project.
>>>>>
>>>>> I added test SyncOnPrimitiveWrapperTest.java that tests for the
>>>>> exit and logging cases. I added test
>>>>> TestSyncOnPrimitiveWrapperEvent.java to test for the JFR event case.
>>>>> I tested the patch running tiers1-6 in mach5 with the flag set to
>>>>> DiagnoseSyncOnPrimitiveWrappers=2.
>>>>> I checked it builds with arm32 and ppc but can't run any tests on
>>>>> those platforms, so it would be good if somebody can run the new
>>>>> test included in the patch.
>>>>>
>>>>> Let me know if you think I should run or add any more tests.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Patricio
More information about the hotspot-runtime-dev
mailing list