Incorrect validation of DST in java.util.SimpleTimeZone
Hi, In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST. When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination. We considered two approaches to fix the issue. 1)Synchronize access to cloning default timezone object when cache is being modified. 2)Invalidate the cache while returning the clone. We preferred the second option as synchronization is more expensive. We have attached the patch and jtreg testcase. Please review. -- Regards -Venkat
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review. On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
-- Regards -Venkat
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body. Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help. [1] http://openjdk.java.net/guide/webrevHelp.html Regards, Sean. On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Thanks Sean. I am pasting the patch here: --- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; } /** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +} On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
-- Regards -Venkat
AFAICS SimpleTimeZone is simply not thread-safe. It has state that can be modified concurrently without synchronization and with fields not even declared volatile. Only the "cache" makes an attempt to use synchronization. So clone() is never guaranteed to actually produce a copy with valid/consistent field values. The suggested patch certainly improves the situation by at least resetting the cache of the cloned instance before returning it. David On 11/11/2017 3:53 PM, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Hi David, On 11/11/2017 07:51 AM, David Holmes wrote:
AFAICS SimpleTimeZone is simply not thread-safe. It has state that can be modified concurrently without synchronization and with fields not even declared volatile. Only the "cache" makes an attempt to use synchronization. So clone() is never guaranteed to actually produce a copy with valid/consistent field values.
The suggested patch certainly improves the situation by at least resetting the cache of the cloned instance before returning it.
The instance of SimpleTimeZone that is shared among threads (internally in JDK) is the defaultTimeZone instance (obtained through package-private TimeZone.getDefaultRef() method). I checked the usages and they seem to be "read-only" - not modifying the instance, just obtaining information from it. The cache OTOH, as you say, is synchronized. TimeZone and subclasses seem to be designed to be modified by single thread only, but can be used from multiple threads to read the information from them, including lazily computed and cached information. Usage withing JDK seems to comply with that. Venkat's patch therefore correctly fixes the remaining issue that is observed when the shared SimpleTimeZone instance is being cloned while also being accessed from multiple threads in read-only mode. Invalidating cache of the cloned instance just before returning it from clone() method means that instance obtained from TimeZone.getDefault() will never get cached state from original instance and will always have to re-compute it, but I think this is still better than synchronizing on the original instance which may never be optimized away (i.e. elided) by JIT. Regards, Peter
David
On 11/11/2017 3:53 PM, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Hi Peter, On 15/11/2017 1:02 AM, Peter Levart wrote:
Hi David,
On 11/11/2017 07:51 AM, David Holmes wrote:
AFAICS SimpleTimeZone is simply not thread-safe. It has state that can be modified concurrently without synchronization and with fields not even declared volatile. Only the "cache" makes an attempt to use synchronization. So clone() is never guaranteed to actually produce a copy with valid/consistent field values.
The suggested patch certainly improves the situation by at least resetting the cache of the cloned instance before returning it.
The instance of SimpleTimeZone that is shared among threads (internally in JDK) is the defaultTimeZone instance (obtained through package-private TimeZone.getDefaultRef() method). I checked the usages and they seem to be "read-only" - not modifying the instance, just obtaining information from it. The cache OTOH, as you say, is synchronized.
The initial problem statement was: "When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, ..." so that doesn't seem to be read-only. Though perhaps it is a very specific race.
TimeZone and subclasses seem to be designed to be modified by single thread only, but can be used from multiple threads to read the information from them, including lazily computed and cached information. Usage withing JDK seems to comply with that.
There's certainly no documentation of any such intent, or design. Seems more like the synchronization has been added (or not) based on how it is used within JDK rather than considering the actual API of the public types.
Venkat's patch therefore correctly fixes the remaining issue that is
Okay. As I said it certainly makes things better. Cheers, David
observed when the shared SimpleTimeZone instance is being cloned while also being accessed from multiple threads in read-only mode. Invalidating cache of the cloned instance just before returning it from clone() method means that instance obtained from TimeZone.getDefault() will never get cached state from original instance and will always have to re-compute it, but I think this is still better than synchronizing on the original instance which may never be optimized away (i.e. elided) by JIT.
Regards, Peter
David
On 11/11/2017 3:53 PM, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Hi David, On 11/14/2017 10:28 PM, David Holmes wrote:
Hi Peter,
On 15/11/2017 1:02 AM, Peter Levart wrote:
Hi David,
On 11/11/2017 07:51 AM, David Holmes wrote:
AFAICS SimpleTimeZone is simply not thread-safe. It has state that can be modified concurrently without synchronization and with fields not even declared volatile. Only the "cache" makes an attempt to use synchronization. So clone() is never guaranteed to actually produce a copy with valid/consistent field values.
The suggested patch certainly improves the situation by at least resetting the cache of the cloned instance before returning it.
The instance of SimpleTimeZone that is shared among threads (internally in JDK) is the defaultTimeZone instance (obtained through package-private TimeZone.getDefaultRef() method). I checked the usages and they seem to be "read-only" - not modifying the instance, just obtaining information from it. The cache OTOH, as you say, is synchronized.
The initial problem statement was:
"When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, ..."
so that doesn't seem to be read-only. Though perhaps it is a very specific race.
User code may do that with its own instances, but that would be invalid usage. There is no evidence (at least I haven't spotted it yet) that JDK code does the same too. As far as I have checked, internal JDK code is aware of the fact that defaultTimeZone instance is a shared instance. For example, take a protected java.util.Calendar no-arg constructor. It initializes the Calendar instance with the result of TimeZone.getDefaultRef() which returns a shared instance. But it also sets Calendar's 'sharedZone' flag, marking that the TimeZone instance it references is a shared instance. Methods that would expose such instance to user code are careful not to do that. For example: public TimeZone getTimeZone() { // If the TimeZone object is shared by other Calendar instances, then // create a clone. if (sharedZone) { zone = (TimeZone) zone.clone(); sharedZone = false; } return zone; } (BTW, this method may expose shared instance to user code if it is invoked concurrently from multiple threads on the same Calendar instance - there's no attempt to prevent writes to zone and sharedZone fields to be observed in non-program order by some concurrent thread) So I believe the situation is not so critical as it seemed at first. There may be other concurrency bugs like in above getTimeZone() method lurking, but the real problem here is cache* fields that may be modified while using "read-only" part of API. All such accesses are synchronized, except in the Object.clone() which reads them without holding the lock.
TimeZone and subclasses seem to be designed to be modified by single thread only, but can be used from multiple threads to read the information from them, including lazily computed and cached information. Usage withing JDK seems to comply with that.
There's certainly no documentation of any such intent, or design. Seems more like the synchronization has been added (or not) based on how it is used within JDK rather than considering the actual API of the public types.
Mercurial history does not go that far in the past to be able to see if synchronization for cache* fields was added at some point and why. My conclusions are based solely on the state of current code. Regards, Peter
Venkat's patch therefore correctly fixes the remaining issue that is
Okay. As I said it certainly makes things better.
Cheers, David
observed when the shared SimpleTimeZone instance is being cloned while also being accessed from multiple threads in read-only mode. Invalidating cache of the cloned instance just before returning it from clone() method means that instance obtained from TimeZone.getDefault() will never get cached state from original instance and will always have to re-compute it, but I think this is still better than synchronizing on the original instance which may never be optimized away (i.e. elided) by JIT.
Regards, Peter
David
On 11/11/2017 3:53 PM, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote: > Hi, > > In a multi-threaded environment, when java.util.SimpleTimeZone > object is used to create a default timezone, there can be a race > condition between the methods java.util.Timezone.getDefault() > and java.util.Timezone.getDefaultRef() which can result in > inconsistency of cache that is used to validate a particular > time/date in DST. > > When a thread is cloning a default timezone object > (SimpleTimeZone) and at the same time if a different thread > modifies the time/year values, then the cache values (cacheYear, > cacheStart, cacheEnd) can become inconsistent which leads to > incorrect DST determination. > > We considered two approaches to fix the issue. > > 1)Synchronize access to cloning default timezone object when > cache is being modified. > > 2)Invalidate the cache while returning the clone. > > We preferred the second option as synchronization is more > expensive. > > We have attached the patch and jtreg testcase. Please review. >
Hi Venkateswara R Chintala, I would like to remind that TimeZone.clone() is also in the code path of java.time.ZoneId.systemDefault() where it was relied on to be optimized by JIT to never actually allocate the cloned TimeZone object by employing escape analysis. It would be nice to verify if this patch keeps that behavior. To test this, consider a JMH benchmark that simply invokes ZoneId.systemDefault() and returns it from the test method. 1st verify that unmodified code, when JITed, performs no allocations. Then test with modified code by applying the patch and see if there's any difference. Hint: use "-prof gc" command line options to run the benchmarks.jar. Regards, Peter On 11/11/17 06:53, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Hi Peter, On 11/11/2017 8:06 PM, Peter Levart wrote:
Hi Venkateswara R Chintala,
I would like to remind that TimeZone.clone() is also in the code path of java.time.ZoneId.systemDefault() where it was relied on to be optimized by JIT to never actually allocate the cloned TimeZone object by employing escape analysis.
remind? Where is this documented? And given: public static TimeZone getDefault() { return (TimeZone) getDefaultRef().clone(); } how can it not allocate?? David -----
It would be nice to verify if this patch keeps that behavior. To test this, consider a JMH benchmark that simply invokes ZoneId.systemDefault() and returns it from the test method. 1st verify that unmodified code, when JITed, performs no allocations. Then test with modified code by applying the patch and see if there's any difference. Hint: use "-prof gc" command line options to run the benchmarks.jar.
Regards, Peter
On 11/11/17 06:53, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Hi David, On 11/11/17 13:37, David Holmes wrote:
Hi Peter,
On 11/11/2017 8:06 PM, Peter Levart wrote:
Hi Venkateswara R Chintala,
I would like to remind that TimeZone.clone() is also in the code path of java.time.ZoneId.systemDefault() where it was relied on to be optimized by JIT to never actually allocate the cloned TimeZone object by employing escape analysis.
remind? Where is this documented?
Unfortunately it is not documented. My bad. I did test it at the time and determined that ZoneId.systemDefault(), with my patch applied, did not do allocations when JIT kicked-in. So we settled for an easier variant: http://cr.openjdk.java.net/~plevart/jdk9-dev/ZoneId.systemDefault/webrev.04/ instead of a more complicated one that uses SharedSecrets to avoid cloning: http://cr.openjdk.java.net/~plevart/jdk9-dev/ZoneId.systemDefault/webrev.01/
And given:
public static TimeZone getDefault() { return (TimeZone) getDefaultRef().clone(); }
how can it not allocate??
When JIT compiles some method and inlines called methods, it also analyses all allocations performed to see if some of them can be proved to not escape the invocation of that compiled method. It then eliminates allocations of such objects on heap and rather generates code that keeps their state in registers or stack. For example, take the following method: String defaultTZID() { return TimeZone.getDefault().getID(); } When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap. But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free. Regards, Peter
David -----
It would be nice to verify if this patch keeps that behavior. To test this, consider a JMH benchmark that simply invokes ZoneId.systemDefault() and returns it from the test method. 1st verify that unmodified code, when JITed, performs no allocations. Then test with modified code by applying the patch and see if there's any difference. Hint: use "-prof gc" command line options to run the benchmarks.jar.
Regards, Peter
On 11/11/17 06:53, Venkateswara R Chintala wrote:
Thanks Sean. I am pasting the patch here:
--- old/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.643867420 +0530 +++ new/src/java.base/share/classes/java/util/SimpleTimeZone.java 2017-11-11 11:17:38.375870421 +0530 @@ -868,7 +868,11 @@ */ public Object clone() { - return super.clone(); + // Invalidate the time zone cache while cloning as it + // can be inconsistent due to race condition. + SimpleTimeZone tz = (SimpleTimeZone) super.clone(); + tz.invalidateCache(); + return tz; }
/** --- /dev/null 2017-11-02 17:09:59.155627814 +0530 +++ new/test/java/util/TimeZone/SimpleTimeZoneTest.java 2017-11-11 11:17:38.867864912 +0530 @@ -0,0 +1,55 @@ +/* + * @test + * @summary Tests the race condition between java.util.TimeZone.getDefault() and java.util.GregorianCalendar() + * @run main SimpleTimeZoneTest +*/ + +import java.util.Calendar; +import java.util.GregorianCalendar; +import java.util.SimpleTimeZone; +import java.util.TimeZone; + +public class SimpleTimeZoneTest extends Thread { + Calendar cal; + + public SimpleTimeZoneTest (Calendar cal) { + this.cal = cal; + } + + public static void main (String[] args) { + TimeZone stz = new SimpleTimeZone(7200000, "Asia/Jerusalem", Calendar.MARCH, 27, 0, 3600000, Calendar.SEPTEMBER, 16, 0, 3600000); + TimeZone.setDefault(stz); + + SimpleTimeZoneTest stt = new SimpleTimeZoneTest(new GregorianCalendar()); + stt.setDaemon(true); + stt.start(); + + for (int i = 0; i < 50000; i++) { + Calendar cal = new GregorianCalendar(); + cal.clear(); + cal.getTimeInMillis(); + cal.set(2014, 2, 2); + cal.clear(); + cal.getTimeInMillis(); + cal.set(1970, 2, 2); + } + + } + + public void run() { + while (true) { + cal.setTimeZone(TimeZone.getDefault()); + cal.clear(); + cal.set(2008, 9, 9); + Calendar calInst = java.util.Calendar.getInstance(); + calInst.setTimeInMillis(cal.getTimeInMillis()); + + if (calInst.get(java.util.Calendar.HOUR_OF_DAY) != cal.get(java.util.Calendar.HOUR_OF_DAY) || + calInst.get(java.util.Calendar.MINUTE) != cal.get(java.util.Calendar.MINUTE) || + calInst.get(java.util.Calendar.SECOND) != cal.get(java.util.Calendar.SECOND) || + calInst.get(java.util.Calendar.MILLISECOND) != cal.get(java.util.Calendar.MILLISECOND)) { + throw new RuntimeException("Test failed"); + } + } + } +}
On 10/11/17 9:29 PM, Seán Coffey wrote:
I think the OpenJDK mailing lists accept attachments if in patch format. If it's a simple short patch, it's acceptable to paste it into email body.
Easiest solution is to use webrev[1]. If you can't upload this to cr.openjdk.java.net - then one of your colleagues may be able to help.
[1] http://openjdk.java.net/guide/webrevHelp.html
Regards, Sean.
On 10/11/17 12:18, Venkateswara R Chintala wrote:
Looks like the patch attached earlier is not visible. As this is my first contribution, please let me know how I can send the patch for review.
On 10/11/17 5:37 PM, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
We have attached the patch and jtreg testcase. Please review.
Hi David, Venkat, On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch: public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; } ...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation. Even the following (original Venkat's) patch: public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; } ...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided. So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead. So Venkat, go ahead. My fear was unjustified. Regards, Peter
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base? -Venkat On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi Venkat, I created the following issue: https://bugs.openjdk.java.net/browse/JDK-8191216 I can also sponsor this patch and push it for JDK 10. The patch that you are proposing looks good to me. Does anybody have anything else to say? Regards, Peter On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know. -Venkat On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
-- Regards -Venkat
Hi, @Venkat: Sorry for late response, but I had to try something 1st. This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details: https://bugs.openjdk.java.net/browse/JDK-8191216 Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method: public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; } This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone: TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) ); And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance: new Date().toString(); The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example: TimeZone.getDefault().getOffset(now) can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result: Default: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op Venkat's patch - invalidateCache(): Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower. This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result: Default: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op Synchronized clone(): Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently. So I created a 3rd variant of the fix which I'm presenting here and requesting a review for: http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r... The JMH benchmark shows this: Default: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op Cache object: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance. I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass. So, what do you think? Regards, Peter Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi Peter, I like what you have done here. That said the general thread-unsafeness of the code in SimpleTimeZone still causes me concern - but what you are doing is not breaking anything more than it is already broken. David On 25/11/2017 9:32 AM, Peter Levart wrote:
Hi,
@Venkat: Sorry for late response, but I had to try something 1st.
This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details:
https://bugs.openjdk.java.net/browse/JDK-8191216
Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method:
public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; }
This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone:
TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) );
And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance:
new Date().toString();
The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example:
TimeZone.getDefault().getOffset(now)
can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Venkat's patch - invalidateCache():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower.
This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Synchronized clone():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently.
So I created a 3rd variant of the fix which I'm presenting here and requesting a review for:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The JMH benchmark shows this:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Cache object:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance.
I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass.
So, what do you think?
Regards, Peter
Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi David, Can I consider your comment as a Review? I'd like to get this patch into JDK10 if possible. Regards, Peter On 11/28/2017 08:17 AM, David Holmes wrote:
Hi Peter,
I like what you have done here. That said the general thread-unsafeness of the code in SimpleTimeZone still causes me concern - but what you are doing is not breaking anything more than it is already broken.
David
On 25/11/2017 9:32 AM, Peter Levart wrote:
Hi,
@Venkat: Sorry for late response, but I had to try something 1st.
This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details:
https://bugs.openjdk.java.net/browse/JDK-8191216
Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method:
public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; }
This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone:
TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) );
And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance:
new Date().toString();
The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example:
TimeZone.getDefault().getOffset(now)
can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Venkat's patch - invalidateCache():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower.
This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Synchronized clone():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently.
So I created a 3rd variant of the fix which I'm presenting here and requesting a review for:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The JMH benchmark shows this:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Cache object:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance.
I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass.
So, what do you think?
Regards, Peter
Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote: > For example, take the following method: > > String defaultTZID() { > return TimeZone.getDefault().getID(); > } > > When JIT compiles it and inlines invocations to other methods > within it, it can prove that cloned TimeZone instance never > escapes the call to defaultTZID() and can therefore skip > allocating the instance on heap. > > But this is fragile. If code in invoked methods changes, they > may not get inlined or EA may not be able to prove that the > cloned instance can't escape and allocation may be introduced. > ZoneId.systemDefault() is a hot method and it would be nice if > we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi Peter, On 6/12/2017 9:08 PM, Peter Levart wrote:
Hi David,
Can I consider your comment as a Review? I'd like to get this patch into JDK10 if possible.
No sorry. I see what you're doing and I think it is okay but the regular owners/maintainers of this code need to have the say on any changes here. David
Regards, Peter
On 11/28/2017 08:17 AM, David Holmes wrote:
Hi Peter,
I like what you have done here. That said the general thread-unsafeness of the code in SimpleTimeZone still causes me concern - but what you are doing is not breaking anything more than it is already broken.
David
On 25/11/2017 9:32 AM, Peter Levart wrote:
Hi,
@Venkat: Sorry for late response, but I had to try something 1st.
This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details:
https://bugs.openjdk.java.net/browse/JDK-8191216
Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method:
public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; }
This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone:
TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) );
And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance:
new Date().toString();
The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example:
TimeZone.getDefault().getOffset(now)
can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Venkat's patch - invalidateCache():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower.
This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Synchronized clone():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently.
So I created a 3rd variant of the fix which I'm presenting here and requesting a review for:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The JMH benchmark shows this:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Cache object:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance.
I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass.
So, what do you think?
Regards, Peter
Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote: > Hi David, Venkat, > > On 11/11/17 21:15, Peter Levart wrote: >> For example, take the following method: >> >> String defaultTZID() { >> return TimeZone.getDefault().getID(); >> } >> >> When JIT compiles it and inlines invocations to other methods >> within it, it can prove that cloned TimeZone instance never >> escapes the call to defaultTZID() and can therefore skip >> allocating the instance on heap. >> >> But this is fragile. If code in invoked methods changes, they >> may not get inlined or EA may not be able to prove that the >> cloned instance can't escape and allocation may be introduced. >> ZoneId.systemDefault() is a hot method and it would be nice if >> we manage to keep it allocation free. > > Well, I tried the following variant of SimpleTimeZone.clone() patch: > > public Object clone() > { > SimpleTimeZone tz = (SimpleTimeZone) super.clone(); > // like tz.invalidateCache() but without holding a lock > on clone > tz.cacheYear = tz.startYear - 1; > tz.cacheStart = tz.cacheEnd = 0; > return tz; > } > > > ...and the JMH benchmark with gc profiling shows that > ZoneId.systemDefault() still manages to get JIT-compiled without > introducing allocation. > > Even the following (original Venkat's) patch: > > public Object clone() > { > SimpleTimeZone tz = (SimpleTimeZone) super.clone(); > tz.invalidateCache(); > return tz; > } > > ...does the same and the locking in invalidateCache() is elided > too. Allocation and lock-elision go hand-in-hand. When object > doesn't escape, allocation on heap may be eliminated and locks on > that instance elided. > > So this is better than synchronizing on the original instance > during .clone() execution as it has potential to avoid locking > overhead. > > So Venkat, go ahead. My fear was unjustified. > > Regards, Peter >
On 06/12/2017 11:32, David Holmes wrote:
Hi Peter,
On 6/12/2017 9:08 PM, Peter Levart wrote:
Hi David,
Can I consider your comment as a Review? I'd like to get this patch into JDK10 if possible.
No sorry. I see what you're doing and I think it is okay but the regular owners/maintainers of this code need to have the say on any changes here. I think this class is normally maintained on i18n-dev but I think introducing the Cache object looks good and making this much easier to understand.
-Alan
Hi, On 12/06/2017 02:30 PM, Alan Bateman wrote:
I think this class is normally maintained on i18n-dev but I think introducing the Cache object looks good and making this much easier to understand.
-Alan
Thanks Alan, I'm forwarding to i18n-dev to see if maintainers of that part of JDK have any comments... This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details: https://bugs.openjdk.java.net/browse/JDK-8191216 Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method: public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; } This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone: TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) ); And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance: new Date().toString(); The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example: TimeZone.getDefault().getOffset(now) can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result: Default: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op Venkat's patch - invalidateCache(): Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower. This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result: Default: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op Synchronized clone(): Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently. So I created a 3rd variant of the fix which I'm presenting here and requesting a review for: http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r... The JMH benchmark shows this: Default: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op Cache object: Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance. I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass. So, what do you think? Regards, Peter Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi Peter, Venkat, Thank you for the fix. It looks good to me. Improved performance is a nice bonus! Would you be able to provide with a regression test? Naoto On 12/6/17 6:10 AM, Peter Levart wrote:
Hi,
On 12/06/2017 02:30 PM, Alan Bateman wrote:
I think this class is normally maintained on i18n-dev but I think introducing the Cache object looks good and making this much easier to understand.
-Alan
Thanks Alan, I'm forwarding to i18n-dev to see if maintainers of that part of JDK have any comments...
This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details:
https://bugs.openjdk.java.net/browse/JDK-8191216
Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method:
public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; }
This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone:
TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) );
And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance:
new Date().toString();
The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example:
TimeZone.getDefault().getOffset(now)
can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Venkat's patch - invalidateCache():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower.
This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Synchronized clone():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently.
So I created a 3rd variant of the fix which I'm presenting here and requesting a review for:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The JMH benchmark shows this:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Cache object:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance.
I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass.
So, what do you think?
Regards, Peter
Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote:
For example, take the following method:
String defaultTZID() { return TimeZone.getDefault().getID(); }
When JIT compiles it and inlines invocations to other methods within it, it can prove that cloned TimeZone instance never escapes the call to defaultTZID() and can therefore skip allocating the instance on heap.
But this is fragile. If code in invoked methods changes, they may not get inlined or EA may not be able to prove that the cloned instance can't escape and allocation may be introduced. ZoneId.systemDefault() is a hot method and it would be nice if we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi Naoto, Thank you for reviewing. Naoto Sato je 06. 12. 2017 ob 20:41 napisal:
Hi Peter, Venkat,
Thank you for the fix. It looks good to me. Improved performance is a nice bonus! Would you be able to provide with a regression test?
Sure, here it is: http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r... The test reliably detects several races in 2 seconds of runtime which cause incorrect offset calculations in JDK 9: shared TZ: 7363986 correct, 0 incorrect calculations cloned TZ: 3960264 correct, 1180 incorrect calculations Exception in thread "main" java.lang.AssertionError: 1180 fatal data races detected at SimpleTimeZoneCloneRaceTest.main(SimpleTimeZoneCloneRaceTest.java:86) With patch applied, there are no failures of course. Can this go into JDK 10 ? Regards, Peter
Naoto
On 12/6/17 6:10 AM, Peter Levart wrote:
Hi,
On 12/06/2017 02:30 PM, Alan Bateman wrote:
I think this class is normally maintained on i18n-dev but I think introducing the Cache object looks good and making this much easier to understand.
-Alan
Thanks Alan, I'm forwarding to i18n-dev to see if maintainers of that part of JDK have any comments...
This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details:
https://bugs.openjdk.java.net/browse/JDK-8191216
Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method:
public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; }
This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone:
TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) );
And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance:
new Date().toString();
The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example:
TimeZone.getDefault().getOffset(now)
can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Venkat's patch - invalidateCache():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower.
This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Synchronized clone():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently.
So I created a 3rd variant of the fix which I'm presenting here and requesting a review for:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The JMH benchmark shows this:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Cache object:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance.
I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass.
So, what do you think?
Regards, Peter
Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote:
Hi David, Venkat,
On 11/11/17 21:15, Peter Levart wrote: > For example, take the following method: > > String defaultTZID() { > return TimeZone.getDefault().getID(); > } > > When JIT compiles it and inlines invocations to other methods > within it, it can prove that cloned TimeZone instance never > escapes the call to defaultTZID() and can therefore skip > allocating the instance on heap. > > But this is fragile. If code in invoked methods changes, they > may not get inlined or EA may not be able to prove that the > cloned instance can't escape and allocation may be introduced. > ZoneId.systemDefault() is a hot method and it would be nice if > we manage to keep it allocation free.
Well, I tried the following variant of SimpleTimeZone.clone() patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); // like tz.invalidateCache() but without holding a lock on clone tz.cacheYear = tz.startYear - 1; tz.cacheStart = tz.cacheEnd = 0; return tz; }
...and the JMH benchmark with gc profiling shows that ZoneId.systemDefault() still manages to get JIT-compiled without introducing allocation.
Even the following (original Venkat's) patch:
public Object clone() { SimpleTimeZone tz = (SimpleTimeZone) super.clone(); tz.invalidateCache(); return tz; }
...does the same and the locking in invalidateCache() is elided too. Allocation and lock-elision go hand-in-hand. When object doesn't escape, allocation on heap may be eliminated and locks on that instance elided.
So this is better than synchronizing on the original instance during .clone() execution as it has potential to avoid locking overhead.
So Venkat, go ahead. My fear was unjustified.
Regards, Peter
Hi Peter, Thanks for the tests. Looks good to me. One nit: it should throw an Exception instead of AssertionError when the test fails. No further review is needed.
Can this go into JDK 10 ?
You can push it before the JDK 10 fork. Naoto On 12/9/17 2:33 PM, Peter Levart wrote:
Hi Naoto,
Thank you for reviewing.
Naoto Sato je 06. 12. 2017 ob 20:41 napisal:
Hi Peter, Venkat,
Thank you for the fix. It looks good to me. Improved performance is a nice bonus! Would you be able to provide with a regression test?
Sure, here it is:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The test reliably detects several races in 2 seconds of runtime which cause incorrect offset calculations in JDK 9:
shared TZ: 7363986 correct, 0 incorrect calculations cloned TZ: 3960264 correct, 1180 incorrect calculations Exception in thread "main" java.lang.AssertionError: 1180 fatal data races detected at SimpleTimeZoneCloneRaceTest.main(SimpleTimeZoneCloneRaceTest.java:86)
With patch applied, there are no failures of course.
Can this go into JDK 10 ?
Regards, Peter
Naoto
On 12/6/17 6:10 AM, Peter Levart wrote:
Hi,
On 12/06/2017 02:30 PM, Alan Bateman wrote:
I think this class is normally maintained on i18n-dev but I think introducing the Cache object looks good and making this much easier to understand.
-Alan
Thanks Alan, I'm forwarding to i18n-dev to see if maintainers of that part of JDK have any comments...
This is an official request for reviewing a patch for fixing a data race between cloning a SimpleTimeZone object and lazily initializing its 3 cache fields which may produce a clone with inconsistent cache state. Here's Jira issue with details:
https://bugs.openjdk.java.net/browse/JDK-8191216
Venkat has proposed to simply call invalidateCache() on the clone before returning it from SimpleTimeZone.clone() method:
public Object clone() { SimpleTimeZone clone = (SimpleTimeZone) super.clone(); clone.invalidateCache(); return clone; }
This fixes the issue and for the case of TimeZone.getDefault() which is called from ZoneId.systemDefault() even elides synchronization in clone.invalidateCache() and allocation of the clone, so JITed ZoneId.systemDefault() is unaffected by the patch. Initially I was satisfied with the fix, but then I tested something. Suppose someone sets default zone to SimpleTimeZone:
TimeZone.setDefault( new SimpleTimeZone(3600000, "Europe/Paris", Calendar.MARCH, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, Calendar.OCTOBER, -1, Calendar.SUNDAY, 3600000, SimpleTimeZone.UTC_TIME, 3600000) );
And then calls some methods that initialize the cache of the internal shared TimeZone.defaultTimeZone instance:
new Date().toString();
The code which after that tries to obtain default time zone and calculate the offset from UTC at some current point in time, for example:
TimeZone.getDefault().getOffset(now)
can't use the cached state because it has been invalidated in the returned clone. Such code has to re-compute the offset every time it gets new clone. I measured this with a JMH benchmark and got the following result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Venkat's patch - invalidateCache():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 179,476 ± 1,942 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,538 ± 0,073 ns/op
We see that ZoneId.systemDefault() is unaffected, but TimeZone.getDefault().getOffset(now) becomes 3x slower.
This is not good, so I tried an alternative fix for the issue - simply making the SimpleTimeZone.clone() synchronized. Access to cache fields is already synchronized, so this should fix the race. Here's the JMH result:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Synchronized clone():
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 58,103 ± 0,936 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 4,154 ± 0,034 ns/op
We see that caching works again, but synchronization has some overhead which is not big for single-threaded access, but might get bigger when multiple threads try to get default zone concurrently.
So I created a 3rd variant of the fix which I'm presenting here and requesting a review for:
http://cr.openjdk.java.net/~plevart/jdk10-dev/8191216_SimpleTimeZone_clone_r...
The JMH benchmark shows this:
Default:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 57,168 ± 0,501 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,558 ± 0,040 ns/op
Cache object:
Benchmark Mode Cnt Score Error Units ZoneIdBench.TimeZone_getDefault_getOffset avgt 10 42,860 ± 0,274 ns/op ZoneIdBench.ZoneId_systemDefault avgt 10 3,545 ± 0,057 ns/op
Not only does the fix not affect ZoneId.systemDefault() which is not surprising, but it also speeds-up cache lookup in single-threaded benchmark and certainly eliminates possible contention among threads looking up the shared instance.
I have run the test/jdk/java/util/TimeZone and test/jdk/java/util/Calendar jtreg tests and there were 7 failures caused by compilation errors (package sun.util.locale.provider is not visible), while 59 other tests pass.
So, what do you think?
Regards, Peter
Venkateswara R Chintala je 21. 11. 2017 ob 10:14 napisal:
Thanks Peter for sponsoring this patch. Is there anything else that needs to be done from my end for this patch to be integrated? Please let me know.
-Venkat
On 14/11/17 8:46 PM, Peter Levart wrote:
Hi Venkat,
I created the following issue:
https://bugs.openjdk.java.net/browse/JDK-8191216
I can also sponsor this patch and push it for JDK 10.
The patch that you are proposing looks good to me. Does anybody have anything else to say?
Regards, Peter
On 11/13/2017 11:28 AM, Venkateswara R Chintala wrote:
Thanks David, Peter for your review and comments. As I am new to the community, can you please help me to open a bug and integrate the changes into code base?
-Venkat
On 12/11/17 2:19 AM, Peter Levart wrote: > Hi David, Venkat, > > On 11/11/17 21:15, Peter Levart wrote: >> For example, take the following method: >> >> String defaultTZID() { >> return TimeZone.getDefault().getID(); >> } >> >> When JIT compiles it and inlines invocations to other methods >> within it, it can prove that cloned TimeZone instance never >> escapes the call to defaultTZID() and can therefore skip >> allocating the instance on heap. >> >> But this is fragile. If code in invoked methods changes, they >> may not get inlined or EA may not be able to prove that the >> cloned instance can't escape and allocation may be introduced. >> ZoneId.systemDefault() is a hot method and it would be nice if >> we manage to keep it allocation free. > > Well, I tried the following variant of SimpleTimeZone.clone() patch: > > public Object clone() > { > SimpleTimeZone tz = (SimpleTimeZone) super.clone(); > // like tz.invalidateCache() but without holding a lock > on clone > tz.cacheYear = tz.startYear - 1; > tz.cacheStart = tz.cacheEnd = 0; > return tz; > } > > > ...and the JMH benchmark with gc profiling shows that > ZoneId.systemDefault() still manages to get JIT-compiled without > introducing allocation. > > Even the following (original Venkat's) patch: > > public Object clone() > { > SimpleTimeZone tz = (SimpleTimeZone) super.clone(); > tz.invalidateCache(); > return tz; > } > > ...does the same and the locking in invalidateCache() is elided > too. Allocation and lock-elision go hand-in-hand. When object > doesn't escape, allocation on heap may be eliminated and locks on > that instance elided. > > So this is better than synchronizing on the original instance > during .clone() execution as it has potential to avoid locking > overhead. > > So Venkat, go ahead. My fear was unjustified. > > Regards, Peter >
Thanks Naoto, Alan, David and Venkat. The change is in. Regards, Peter Naoto Sato je 11. 12. 2017 ob 19:41 napisal:
Hi Peter,
Thanks for the tests. Looks good to me.
One nit: it should throw an Exception instead of AssertionError when the test fails. No further review is needed.
Can this go into JDK 10 ?
You can push it before the JDK 10 fork.
Naoto
Hi Venkat, On 11/10/17 13:07, Venkateswara R Chintala wrote:
Hi,
In a multi-threaded environment, when java.util.SimpleTimeZone object is used to create a default timezone, there can be a race condition between the methods java.util.Timezone.getDefault() and java.util.Timezone.getDefaultRef() which can result in inconsistency of cache that is used to validate a particular time/date in DST.
When a thread is cloning a default timezone object (SimpleTimeZone) and at the same time if a different thread modifies the time/year values, then the cache values (cacheYear, cacheStart, cacheEnd) can become inconsistent which leads to incorrect DST determination.
We considered two approaches to fix the issue.
1)Synchronize access to cloning default timezone object when cache is being modified.
2)Invalidate the cache while returning the clone.
We preferred the second option as synchronization is more expensive.
Unfortunately, the SimpleTimeZone.invalidateCache() is also synchronized. So of the same cost, and it may inhibit JIT optimization. Perhaps it would be best to synchronize cloning and then arrange the ZoneId.systemDefault() codepath to not involve cloning. I refrained from doing that in a patch for: https://bugs.openjdk.java.net/browse/JDK-8074002 simply because it was easier and benchmarks showed that cloning is optimized away. But now we should reconsider that and use TimeZone.getDefaultRef() from the ZoneId.systemDefault() (introducing JavaUtilAccess into SharedSecrets mechanism)... Regards, Peter
participants (6)
-
Alan Bateman
-
David Holmes
-
Naoto Sato
-
Peter Levart
-
Seán Coffey
-
Venkateswara R Chintala