Inconsistencies in the return value (type) of string functions toLower() and toUpper().

Sasi Peri pvssasikala at gmail.com
Mon Jun 20 03:29:08 UTC 2022


Hello,

*Issue details*

toLower() and toUpper() return a new String object sometimes and sometimes
a string literal, based on the input string type (and also sometimes based
on the VM/jdk type)



For example

-       HotSpot VM, If the input is a string literal, which *already* has
all “lower case letters”, toLower would return the same string literal, if
not it will convert all letters to lower and returns a new String() object.

-       However, openJ9 (e.g. IBM jdk8 ditsro, ) always returns a new
String object not a literal.



This behavior is non deterministic, inconsistent, you cannot always predict
if the outcome is a new string object OR an interned string from pool
(particularly from unit testing stand point).



*Sample code to show case above behavior*

*package* com.bugs;

*import* java.util.Locale;



*public* *class* TestStringFunction

{



*  public* *static* *void* main(String args[])

  {

    String s1 = "abc";

    String s2 = "ABC";



    System.*out*.println("----- case: when string already lower ----------"
);

*    testIfEqualsLower*(s1);



    System.*out*.println("----- case: when string with upper case
----------");

*    testIfEqualsLower*(s2);





  }



*  private* *static* *void* testIfEqualsLower(String s)

  {



*     if*(s.toLowerCase() == "abc")

     {

        System.*out*.println("YES - literal");

     }



*     if*(s.toLowerCase().equals("abc"))

     {

        System.*out*.println("YES – equals func");

     }



   }

}





*Out put*



----- case: when string already lower ----------

YES - literal

YES – equals func



----- case: when string with upper case ----------

YES - equals func





*Why this could be an issue or bug prone?*

-       Suppose an unit test is written, for a method doAThing(), that has
toLower/Upper conversions in the middle of the code somewhere, and apply
logic based on that.
-       Though general guidance to compare unknown string (types) is always
using equals, sometimes developers can make a mistake (i.e. suppose they
used == in a unit test)
-       If the code review did not catch it, this behavior can cause all
unit tests passed, as long unit test is written with “small case string” as
input.
-       It could potentially make it to prod, and can be realized only when
it hits a case when input string has all uppercase OR mixed case letters,
which could be after multiple sprints, at the point not easily detectable.



*Suggestion*

It would be great if we always can make a new String() and return always a
String object not an interned string sometimes, as openJ9/ibm does (with
some jdk versions).

It may not be good idea, to always return “.interned” value and fill up the
intern pool, for these short-lived objects (as they are most times).



If this is agreed/approved, I can make a change and commit.


Regards

- SP
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/core-libs-dev/attachments/20220619/4810e963/attachment-0001.htm>


More information about the core-libs-dev mailing list