RFR: JDK-8032012, String.toLowerCase/toUpperCase performance improvement

Xueming Shen xueming.shen at oracle.com
Thu Feb 6 17:40:44 UTC 2014


Paul,

toUpperCaseEx is overridden in CharacterData00/Latin1. Those two are
under gensrc/java/lang. It might be possible to combine them some day
(we'd need to dig through a decade of history, and there is probably a
compatibility concern...), but that is definitely beyond the scope of
this "improvement" :-)
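
For illustration only, here is a minimal sketch of the observable
difference between a one-to-one mapping and a full case mapping, using
only public API. toUpperCaseEx is package-private and cannot be called
from outside java.lang, so the sketch does not touch it; the class and
method names (UpperCaseSketch, singleUpper, fullUpper) are hypothetical.
That this difference is the reason the String path relies on the Ex
variant is an assumption based on reading the sources, not something
the sketch proves.

```java
import java.util.Locale;

public class UpperCaseSketch {
    // One-to-one mapping: a single code point in, a single code point out.
    static int singleUpper(int codePoint) {
        return Character.toUpperCase(codePoint);
    }

    // Full mapping: the result may be longer than the input.
    static String fullUpper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        int sharpS = 0x00DF; // LATIN SMALL LETTER SHARP S
        System.out.printf("Character.toUpperCase(U+%04X) = U+%04X%n",
                sharpS, singleUpper(sharpS));
        System.out.println("String.toUpperCase = " + fullUpper("\u00DF"));
    }
}
```

Character.toUpperCase has no single-char uppercase form to return for
U+00DF, while String.toUpperCase expands it to two chars.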

Ulf,

The webrev has been updated to use isBmpCodePoint() as suggested. Another
benefit of using isBmpCodePoint() is that some Character.ERROR checks
are no longer necessary:

             if (lowerChar != Character.ERROR &&
                 lowerChar < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
                 result[resultOffset++] = (char)lowerChar;
             } else {

-->

                       if (Character.isBmpCodePoint(lowerChar)) {
                           result[resultOffset++] = (char)lowerChar;
                       } else {
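
A small sketch of why the explicit ERROR test becomes redundant,
assuming the sentinel is 0xFFFFFFFF as in the package-private
Character.ERROR (redeclared locally here because it is not accessible
outside java.lang). The class and method names (BmpCheckSketch,
oldCheck, newCheck) are hypothetical. isBmpCodePoint() tests whether
the upper 16 bits are zero, which is false for the sentinel.

```java
public class BmpCheckSketch {
    // Assumption: mirrors the package-private Character.ERROR sentinel.
    static final int ERROR = 0xFFFFFFFF;

    // Condition before the change: explicit sentinel test plus range test.
    static boolean oldCheck(int c) {
        return c != ERROR && c < Character.MIN_SUPPLEMENTARY_CODE_POINT;
    }

    // Condition after the change: a single BMP test.
    static boolean newCheck(int c) {
        return Character.isBmpCodePoint(c);
    }

    public static void main(String[] args) {
        int[] samples = { 'a', 0xFFFF, 0x10000, Character.MAX_CODE_POINT, ERROR };
        for (int c : samples) {
            System.out.printf("0x%08X old=%b new=%b%n", c, oldCheck(c), newCheck(c));
        }
    }
}
```

The two conditions agree for every valid code point and for the
sentinel. They would differ only for other negative values, which the
case-mapping methods are not expected to return.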


Are we good to go? :-) It took much longer than I would have expected,
but I'm happy with the latest result.

http://cr.openjdk.java.net/~sherman/8032012/webrev/

-Sherman

On 02/06/2014 01:44 AM, Paul Sandoz wrote:
> On Feb 6, 2014, at 5:37 AM, Xueming Shen <xueming.shen at oracle.com> wrote:
>
>> Fair enough. I don't think it's going to be a measurable difference. I have updated the webrev
>> to use the Character.isSurrogate() for better readability.
>>
>> http://cr.openjdk.java.net/~sherman/8032012/webrev
>>
> One last point, sorry :-)
>
> Can you use Character.toUpperCase rather than Character.toUpperCaseEx ? [*].
>
> Paul.
>
> [*] They both seem to point to the same code, so either Character.toUpperCase can incorrectly return Character.ERROR or Character.toUpperCaseEx never returns ERROR:
>
>      public static int toUpperCase(int codePoint) {
>          return CharacterData.of(codePoint).toUpperCase(codePoint);
>      }
>
>      static int toUpperCaseEx(int codePoint) {
>          assert isValidCodePoint(codePoint);
>          return CharacterData.of(codePoint).toUpperCaseEx(codePoint);
>      }
>
>
>
> abstract class CharacterData {
>      ...
>      abstract int toUpperCase(int ch);
>      ...
>
>      //need to implement for JSR204
>      int toUpperCaseEx(int ch) {
>          return toUpperCase(ch);
>      }
>



