RFR 8191216: SimpleTimeZone.clone() has a data race on cache fields
David Holmes
david.holmes at oracle.com
Tue Nov 28 07:17:00 UTC 2017
Hi Peter,
I like what you have done here. That said the general thread-unsafeness
of the code in SimpleTimeZone still causes me concern - but what you are
doing is not breaking anything more than it is already broken.
David
On 25/11/2017 9:32 AM, Peter Levart wrote:
> Hi,
>
> @Venkat: Sorry for late response, but I had to try something 1st.
>
> This is an official request for reviewing a patch for fixing a data race
> between cloning a SimpleTimeZone object and lazily initializing its 3
> cache fields which may produce a clone with inconsistent cache state.
> Here's Jira issue with details:
>
> https://bugs.openjdk.java.net/browse/JDK-8191216
>
> Venkat has proposed to simply call invalidateCache() on the clone before
> returning it from SimpleTimeZone.clone() method:
>
> public Object clone()
> {
> SimpleTimeZone clone = (SimpleTimeZone) super.clone();
> clone.invalidateCache();
> return clone;
> }
>
> This fixes the issue and for the case of TimeZone.getDefault() which is
> called from ZoneId.systemDefault() even elides synchronization in
> clone.invalidateCache() and allocation of the clone, so JITed
> ZoneId.systemDefault() is unaffected by the patch. Initially I was
> satisfied with the fix, but then I tested something. Suppose someone
> sets default zone to SimpleTimeZone:
>
> TimeZone.setDefault(
> new SimpleTimeZone(3600000,
> "Europe/Paris",
> Calendar.MARCH, -1, Calendar.SUNDAY,
> 3600000, SimpleTimeZone.UTC_TIME,
> Calendar.OCTOBER, -1, Calendar.SUNDAY,
> 3600000, SimpleTimeZone.UTC_TIME,
> 3600000)
> );
>
> And then calls some methods that initialize the cache of the internal
> shared TimeZone.defaultTimeZone instance:
>
> new Date().toString();
>
> The code which after that tries to obtain default time zone and
> calculate the offset from UTC at some current point in time, for example:
>
> TimeZone.getDefault().getOffset(now)
>
> can't use the cached state because it has been invalidated in the
> returned clone. Such code has to re-compute the offset every time it
> gets new clone. I measured this with a JMH benchmark and got the
> following result:
>
> Default:
>
> Benchmark Mode Cnt Score Error Units
> ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op
> ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
>
> Venkat's patch - invalidateCache():
>
> Benchmark Mode Cnt Score Error Units
> ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op
> ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
>
> We see that ZoneId.systemDefault() is unaffected, but
> TimeZone.getDefault().getOffset(now) becomes 3x slower.
>
> This is not good, so I tried an alternative fix for the issue - simply
> making the SimpleTimeZone.clone() synchronized. Access to cache fields
> is already synchronized, so this should fix the race. Here's the JMH result:
>
> Default:
>
> Benchmark Mode Cnt Score Error Units
> ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op
> ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
>
> Synchronized clone():
>
> Benchmark Mode Cnt Score Error Units
> ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op
> ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
>
> We see that caching works again, but synchronization has some overhead
> which is not big for single-threaded access, but might get bigger when
> multiple threads try to get default zone concurrently.
>
> So I created a 3rd variant of the fix which I'm presenting here and
> requesting a review for:
>
> http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_race/webrev.01/
>
> The JMH benchmark shows this:
>
> Default:
>
> Benchmark Mode Cnt Score Error Units
> ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op
> ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
>
> Cache object:
>
> Benchmark Mode Cnt Score Error Units
> ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op
> ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
>
> Not only does the fix not affect ZoneId.systemDefault() which is not
> surprising, but it also speeds-up cache lookup in single-threaded
> benchmark and certainly eliminates possible contention among threads
> looking up the shared instance.
>
> I have run the test/jdk/java/util/TimeZone and
> test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused
> by compilation errors (package sun.util.locale.provider is not visible),
> while 59 other tests pass.
>
> So, what do you think?
>
> Regards, Peter
>
>
> Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
>> Thanks Peter for sponsoring this patch. Is there anything else that
>> needs to be done from my end for this patch to be integrated? Please
>> let me know.
>>
>> -Venkat
>>
>>
>> On 14/11/17 8:46 PM, Peter Levart wrote:
>>> Hi Venkat,
>>>
>>> I created the following issue:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8191216
>>>
>>> I can also sponsor this patch and push it for JDK 10.
>>>
>>> The patch that you are proposing looks good to me. Does anybody have
>>> anything else to say?
>>>
>>> Regards, Peter
>>>
>>>
>>> On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
>>>> Thanks David, Peter for your review and comments. As I am new to the
>>>> community, can you please help me to open a bug and integrate the
>>>> changes into code base?
>>>>
>>>> -Venkat
>>>>
>>>> On 12/11/17 2:19 AM, Peter Levart wrote:
>>>>> Hi David, Venkat,
>>>>>
>>>>> On 11/11/17 21:15, Peter Levart wrote:
>>>>>> For example, take the following method:
>>>>>>
>>>>>> String defaultTZID() {
>>>>>> return TimeZone.getDefault().getID();
>>>>>> }
>>>>>>
>>>>>> When JIT compiles it and inlines invocations to other methods
>>>>>> within it, it can prove that cloned TimeZone instance never
>>>>>> escapes the call to defaultTZID() and can therefore skip
>>>>>> allocating the instance on heap.
>>>>>>
>>>>>> But this is fragile. If code in invoked methods changes, they may
>>>>>> not get inlined or EA may not be able to prove that the cloned
>>>>>> instance can't escape and allocation may be introduced.
>>>>>> ZoneId.systemDefault() is a hot method and it would be nice if we
>>>>>> manage to keep it allocation free.
>>>>>
>>>>> Well, I tried the following variant of SimpleTimeZone.clone() patch:
>>>>>
>>>>> public Object clone()
>>>>> {
>>>>> SimpleTimeZone tz = (SimpleTimeZone) super.clone();
>>>>> // like tz.invalidateCache() but without holding a lock on
>>>>> clone
>>>>> tz.cacheYear = tz.startYear - 1;
>>>>> tz.cacheStart = tz.cacheEnd = 0;
>>>>> return tz;
>>>>> }
>>>>>
>>>>>
>>>>> ...and the JMH benchmark with gc profiling shows that
>>>>> ZoneId.systemDefault() still manages to get JIT-compiled without
>>>>> introducing allocation.
>>>>>
>>>>> Even the following (original Venkat's) patch:
>>>>>
>>>>> public Object clone()
>>>>> {
>>>>> SimpleTimeZone tz = (SimpleTimeZone) super.clone();
>>>>> tz.invalidateCache();
>>>>> return tz;
>>>>> }
>>>>>
>>>>> ...does the same and the locking in invalidateCache() is elided
>>>>> too. Allocation and lock-elision go hand-in-hand. When object
>>>>> doesn't escape, allocation on heap may be eliminated and locks on
>>>>> that instance elided.
>>>>>
>>>>> So this is better than synchronizing on the original instance
>>>>> during .clone() execution as it has potential to avoid locking
>>>>> overhead.
>>>>>
>>>>> So Venkat, go ahead. My fear was unjustified.
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>
>>>
>>>
>>
>
More information about the core-libs-dev
mailing list