Inconsistencies in the return value (type) of string functions toLower() and toUpper().
Sasi Peri
pvssasikala at gmail.com
Mon Jun 20 03:29:08 UTC 2022
Hello,
*Issue details*
toLowerCase() and toUpperCase() sometimes return a new String object and
sometimes return the input string itself (for example, an interned string
literal), depending on the content of the input string (and also on the
VM/JDK in use).
For example:
- On the HotSpot VM, if the input is a string literal that *already*
consists only of lowercase letters, toLowerCase() returns the same string
literal; otherwise it converts the letters to lowercase and returns a new
String object.
- However, OpenJ9 (e.g. the IBM JDK 8 distro) always returns a new
String object, not the literal.
This behavior is inconsistent and hard to predict: you cannot always tell
whether the result is a new String object or an interned string from the
pool, which matters particularly from a unit-testing standpoint.
*Sample code to showcase the above behavior*
package com.bugs;

import java.util.Locale;

public class TestStringFunction
{
    public static void main(String[] args)
    {
        String s1 = "abc";
        String s2 = "ABC";
        System.out.println("----- case: when string already lower ----------");
        testIfEqualsLower(s1);
        System.out.println("----- case: when string with upper case ----------");
        testIfEqualsLower(s2);
    }

    private static void testIfEqualsLower(String s)
    {
        // Reference comparison: true only if toLowerCase() returned the interned literal itself.
        if (s.toLowerCase() == "abc")
        {
            System.out.println("YES - literal");
        }
        // Content comparison: true whenever the characters match.
        if (s.toLowerCase().equals("abc"))
        {
            System.out.println("YES – equals func");
        }
    }
}
*Output*
----- case: when string already lower ----------
YES - literal
YES – equals func
----- case: when string with upper case ----------
YES – equals func
*Why could this be an issue or bug prone?*
- Suppose a unit test is written for a method doAThing() that performs
toLowerCase()/toUpperCase() conversions somewhere in the middle of the code
and applies logic based on the result.
- Although the general guidance is to always compare strings of unknown
origin with equals(), developers sometimes make a mistake (i.e. suppose
they used == in the code or in a unit test).
- If code review does not catch it, this behavior can let all unit tests
pass, as long as the unit tests are written with an all-lowercase string as
input.
- The mistake could then make it to prod and show up only when the code
hits an input string with all uppercase or mixed-case letters, which could
be several sprints later, at which point it is not easy to track down.
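A minimal sketch of the scenario described above. The class name DoAThingSketch and the methods doAThingBuggy() and doAThingFixed() are hypothetical and only stand in for real application code; Locale.ROOT is passed purely so that the result does not depend on the default locale.

```java
import java.util.Locale;

public class DoAThingSketch {
    // Hypothetical buggy variant: compares references, so the result depends on
    // whether toLowerCase() happened to return the interned literal itself.
    public static boolean doAThingBuggy(String input) {
        return input.toLowerCase(Locale.ROOT) == "abc";
    }

    // Hypothetical fixed variant: compares contents, so the result does not
    // depend on which String object toLowerCase() returns.
    public static boolean doAThingFixed(String input) {
        return input.toLowerCase(Locale.ROOT).equals("abc");
    }

    public static void main(String[] args) {
        for (String in : new String[] {"abc", "ABC", "aBc"}) {
            System.out.println(in + ": buggy=" + doAThingBuggy(in)
                    + ", fixed=" + doAThingFixed(in));
        }
    }
}
```

A unit test that only feeds "abc" to the buggy variant may pass on one VM and fail on another, while the fixed variant gives the same answer for all three inputs.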
*Suggestion*
It would be great if these methods always created and returned a new String
object, rather than sometimes returning an interned string, as OpenJ9/IBM
does (with some JDK versions).
It may not be a good idea to always return the intern()'ed value instead,
since that would fill up the intern pool with these (mostly) short-lived
objects.
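To illustrate what "always a new String object" would mean for a caller, here is a small sketch at the application level. It is not the proposed JDK change; the class AlwaysNewLower and the helper toLowerCaseAsNewObject() are hypothetical names used only for this example.

```java
import java.util.Locale;

public class AlwaysNewLower {
    // Hypothetical helper, not a JDK method: copies the lowercased result so
    // that the caller always receives a distinct String object.
    public static String toLowerCaseAsNewObject(String s) {
        return new String(s.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String literal = "abc";
        String lowered = toLowerCaseAsNewObject(literal);
        // A `new` expression always creates a fresh object, so this prints false.
        System.out.println(lowered == literal);
        // The contents are still equal, so this prints true.
        System.out.println(lowered.equals(literal));
    }
}
```

With a result like this, an accidental == comparison fails for every input, including all-lowercase ones, so the mistake would show up in the first unit test run.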
If this is agreed/approved, I can make a change and commit.
Regards
- SP