[12] RFR: 8215194: Initial size of UnicodeBlock map is incorrect
Hi, Please review the fix for the following issue: https://bugs.openjdk.java.net/browse/JDK-8215194 The proposed fix is located at: http://cr.openjdk.java.net/~naoto/8215194/webrev.00/ This one line fix is for the correctness of the initial map size of Character.UnicodeBlock. Naoto
Hi Naoto, Since the value changes from time to time, it would give it some visibility if it were defined using a private final int (or float) private final int MAP_CAPACITY = 667; Though I suppose the test can't use the value without using reflection. But it would lower the maintenance in the long term. $.02, Roger On 12/11/2018 09:51 AM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
Hi Roger, Thanks. I updated it as suggested (incl. test using reflection): http://cr.openjdk.java.net/~naoto/8215194/webrev.01/ Naoto On 12/11/18 7:57 AM, Roger Riggs wrote:
Hi Naoto,
Since the value changes from time to time, it would give it some visibility if it were defined using a private final int (or float) private final int MAP_CAPACITY = 667;
Though I suppose the test can't use the value without using reflection. But it would lower the maintenance in the long term.
$.02, Roger
On 12/11/2018 09:51 AM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
Hi Naoto, Looks ok, I intended only the number of elements to be defined as a constant. The other factors can be hard coded. In the test, you will still have to edit the test when the number changes. I meant to avoid that edit. Though then may be there is not need for the test at all. Roger On 12/11/2018 12:59 PM, Naoto Sato wrote:
Hi Roger,
Thanks. I updated it as suggested (incl. test using reflection):
http://cr.openjdk.java.net/~naoto/8215194/webrev.01/
Naoto
On 12/11/18 7:57 AM, Roger Riggs wrote:
Hi Naoto,
Since the value changes from time to time, it would give it some visibility if it were defined using a private final int (or float) private final int MAP_CAPACITY = 667;
Though I suppose the test can't use the value without using reflection. But it would lower the maintenance in the long term.
$.02, Roger
On 12/11/2018 09:51 AM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
Hi Roger, On 12/11/18 10:12 AM, Roger Riggs wrote:
Hi Naoto,
Looks ok,
I intended only the number of elements to be defined as a constant. The other factors can be hard coded.
OK, I revised it again: http://cr.openjdk.java.net/~naoto/8215194/webrev.02/
In the test, you will still have to edit the test when the number changes. I meant to avoid that edit. Though then may be there is not need for the test at all.
Yes, if we just reflectively use the constant in Character.UnicodeBlock in the test, it is guaranteed to succeed no matter what. Thus I added the assertion. Still, the test ofHashMap() succeeds till the block additions surpasses that 1024 capacity (thus this was overlooked on Unicode 11 upgrade). Naoto
Roger
On 12/11/2018 12:59 PM, Naoto Sato wrote:
Hi Roger,
Thanks. I updated it as suggested (incl. test using reflection):
http://cr.openjdk.java.net/~naoto/8215194/webrev.01/
Naoto
On 12/11/18 7:57 AM, Roger Riggs wrote:
Hi Naoto,
Since the value changes from time to time, it would give it some visibility if it were defined using a private final int (or float) private final int MAP_CAPACITY = 667;
Though I suppose the test can't use the value without using reflection. But it would lower the maintenance in the long term.
$.02, Roger
On 12/11/2018 09:51 AM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
Hi Naoto! The fix looks good, thank you! I wonder if the test can be updated to also verify the optimal capacity of HashMap<String, Character.UnicodeScript> aliases; On 12/11/18 10:33 AM, Naoto Sato wrote:
Hi Roger,
On 12/11/18 10:12 AM, Roger Riggs wrote:
Hi Naoto,
Looks ok,
I intended only the number of elements to be defined as a constant. The other factors can be hard coded.
OK, I revised it again:
http://cr.openjdk.java.net/~naoto/8215194/webrev.02/
In the test, you will still have to edit the test when the number changes. I meant to avoid that edit. Though then may be there is not need for the test at all.
Yes, if we just reflectively use the constant in Character.UnicodeBlock in the test, it is guaranteed to succeed no matter what. OptimalCapacity.ofHashMap uses the initialCapacity to verify that a HashMap created with that initialCapacity will not reallocate its internal array after inserting the factual number of elements.
So, if one day the number of entities is increased, but the constant NUM_ENTITIES is not updated, then the test will catch it. With kind regards, Ivan
Thus I added the assertion. Still, the test ofHashMap() succeeds till the block additions surpasses that 1024 capacity (thus this was overlooked on Unicode 11 upgrade).
Naoto
Roger
On 12/11/2018 12:59 PM, Naoto Sato wrote:
Hi Roger,
Thanks. I updated it as suggested (incl. test using reflection):
http://cr.openjdk.java.net/~naoto/8215194/webrev.01/
Naoto
On 12/11/18 7:57 AM, Roger Riggs wrote:
Hi Naoto,
Since the value changes from time to time, it would give it some visibility if it were defined using a private final int (or float) private final int MAP_CAPACITY = 667;
Though I suppose the test can't use the value without using reflection. But it would lower the maintenance in the long term.
$.02, Roger
On 12/11/2018 09:51 AM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
-- With kind regards, Ivan Gerasimov
Hi Ivan, Thank you for the review. On 12/11/18 10:49 AM, Ivan Gerasimov wrote:
OptimalCapacity.ofHashMap uses the initialCapacity to verify that a HashMap created with that initialCapacity will not reallocate its internal array after inserting the factual number of elements.
So, if one day the number of entities is increased, but the constant NUM_ENTITIES is not updated, then the test will catch it.
Actually the reason I am fixing it is that, the test did not detect the discrepancy with Unicode 11 upgrade, where it just increased the entry numbers by the true block increases (11 blocks), without adding their aliases. --- --- a/src/java.base/share/classes/java/lang/Character.java Wed Nov 14 13:15:54 2018 +0100 +++ b/src/java.base/share/classes/java/lang/Character.java Wed Nov 21 14:24:31 2018 +0530 @@ -43,7 +43,7 @@ * a character's category (lowercase letter, digit, etc.) and for converting * characters from uppercase to lowercase and vice versa. * <p> - * Character information is based on the Unicode Standard, version 10.0.0. + * Character information is based on the Unicode Standard, version 11.0.0. * <p> * The methods and data of class {@code Character} are defined by * the information in the <i>UnicodeData</i> file that is part of the @@ -681,11 +681,11 @@ */ public static final class UnicodeBlock extends Subset { /** - * 638 - the expected number of entities + * 649 - the expected number of entities * 0.75 - the default load factor of HashMap */ private static Map<String, UnicodeBlock> map = - new HashMap<>((int)(638 / 0.75f + 1.0f)); + new HashMap<>((int)(649 / 0.75f + 1.0f)); /** * Creates a UnicodeBlock with the given identifier name. --- However, the HashMap allocates the entry size to the nearest power of two number, i.e., 1024 for both numbers 649 and 667. Thus the test passes even if there is discrepancy. Naoto
Hi Naoto, while the here fix looks good, I can't help wondering why the map isn't final? /Claes On 2018-12-11 19:59, Naoto Sato wrote:
Hi Ivan,
Thank you for the review.
On 12/11/18 10:49 AM, Ivan Gerasimov wrote:
OptimalCapacity.ofHashMap uses the initialCapacity to verify that a HashMap created with that initialCapacity will not reallocate its internal array after inserting the factual number of elements.
So, if one day the number of entities is increased, but the constant NUM_ENTITIES is not updated, then the test will catch it.
Actually the reason I am fixing it is that, the test did not detect the discrepancy with Unicode 11 upgrade, where it just increased the entry numbers by the true block increases (11 blocks), without adding their aliases.
---
--- a/src/java.base/share/classes/java/lang/Character.java Wed Nov 14 13:15:54 2018 +0100 +++ b/src/java.base/share/classes/java/lang/Character.java Wed Nov 21 14:24:31 2018 +0530 @@ -43,7 +43,7 @@ * a character's category (lowercase letter, digit, etc.) and for converting * characters from uppercase to lowercase and vice versa. * <p> - * Character information is based on the Unicode Standard, version 10.0.0. + * Character information is based on the Unicode Standard, version 11.0.0. * <p> * The methods and data of class {@code Character} are defined by * the information in the <i>UnicodeData</i> file that is part of the @@ -681,11 +681,11 @@ */ public static final class UnicodeBlock extends Subset { /** - * 638 - the expected number of entities + * 649 - the expected number of entities * 0.75 - the default load factor of HashMap */ private static Map<String, UnicodeBlock> map = - new HashMap<>((int)(638 / 0.75f + 1.0f)); + new HashMap<>((int)(649 / 0.75f + 1.0f));
/** * Creates a UnicodeBlock with the given identifier name. ---
However, the HashMap allocates the entry size to the nearest power of two number, i.e., 1024 for both numbers 649 and 667. Thus the test passes even if there is discrepancy.
Naoto
Hi Claes, I don't see any particular reason for that. In fact, I was wondering the same thing, but just left it as it was. Since I've already pushed the changeset, I will fix it on the next occasion. Naoto On 12/11/18 2:31 PM, Claes Redestad wrote:
Hi Naoto,
while the here fix looks good, I can't help wondering why the map isn't final?
/Claes
On 2018-12-11 19:59, Naoto Sato wrote:
Hi Ivan,
Thank you for the review.
On 12/11/18 10:49 AM, Ivan Gerasimov wrote:
OptimalCapacity.ofHashMap uses the initialCapacity to verify that a HashMap created with that initialCapacity will not reallocate its internal array after inserting the factual number of elements.
So, if one day the number of entities is increased, but the constant NUM_ENTITIES is not updated, then the test will catch it.
Actually the reason I am fixing it is that, the test did not detect the discrepancy with Unicode 11 upgrade, where it just increased the entry numbers by the true block increases (11 blocks), without adding their aliases.
---
--- a/src/java.base/share/classes/java/lang/Character.java Wed Nov 14 13:15:54 2018 +0100 +++ b/src/java.base/share/classes/java/lang/Character.java Wed Nov 21 14:24:31 2018 +0530 @@ -43,7 +43,7 @@ * a character's category (lowercase letter, digit, etc.) and for converting * characters from uppercase to lowercase and vice versa. * <p> - * Character information is based on the Unicode Standard, version 10.0.0. + * Character information is based on the Unicode Standard, version 11.0.0. * <p> * The methods and data of class {@code Character} are defined by * the information in the <i>UnicodeData</i> file that is part of the @@ -681,11 +681,11 @@ */ public static final class UnicodeBlock extends Subset { /** - * 638 - the expected number of entities + * 649 - the expected number of entities * 0.75 - the default load factor of HashMap */ private static Map<String, UnicodeBlock> map = - new HashMap<>((int)(638 / 0.75f + 1.0f)); + new HashMap<>((int)(649 / 0.75f + 1.0f));
/** * Creates a UnicodeBlock with the given identifier name. ---
However, the HashMap allocates the entry size to the nearest power of two number, i.e., 1024 for both numbers 649 and 667. Thus the test passes even if there is discrepancy.
Naoto
Looks fine. Thanks, Roger On 12/11/2018 01:33 PM, Naoto Sato wrote:
Hi Roger,
On 12/11/18 10:12 AM, Roger Riggs wrote:
Hi Naoto,
Looks ok,
I intended only the number of elements to be defined as a constant. The other factors can be hard coded.
OK, I revised it again:
http://cr.openjdk.java.net/~naoto/8215194/webrev.02/
In the test, you will still have to edit the test when the number changes. I meant to avoid that edit. Though then may be there is not need for the test at all.
Yes, if we just reflectively use the constant in Character.UnicodeBlock in the test, it is guaranteed to succeed no matter what. Thus I added the assertion. Still, the test ofHashMap() succeeds till the block additions surpasses that 1024 capacity (thus this was overlooked on Unicode 11 upgrade).
Naoto
Roger
On 12/11/2018 12:59 PM, Naoto Sato wrote:
Hi Roger,
Thanks. I updated it as suggested (incl. test using reflection):
http://cr.openjdk.java.net/~naoto/8215194/webrev.01/
Naoto
On 12/11/18 7:57 AM, Roger Riggs wrote:
Hi Naoto,
Since the value changes from time to time, it would give it some visibility if it were defined using a private final int (or float) private final int MAP_CAPACITY = 667;
Though I suppose the test can't use the value without using reflection. But it would lower the maintenance in the long term.
$.02, Roger
On 12/11/2018 09:51 AM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
Hi Naoto, Thanks for fixing this. Your fix looks good to me. Thanks, Rachna On 12/11/18 8:21 PM, Naoto Sato wrote:
Hi,
Please review the fix for the following issue:
https://bugs.openjdk.java.net/browse/JDK-8215194
The proposed fix is located at:
http://cr.openjdk.java.net/~naoto/8215194/webrev.00/
This one line fix is for the correctness of the initial map size of Character.UnicodeBlock.
Naoto
-- Thanks, Rachna
participants (5)
-
Claes Redestad
-
Ivan Gerasimov
-
Naoto Sato
-
Rachna Goel
-
Roger Riggs