Re: RFR: JDK-8032012, , String.toLowerCase/toUpperCase performance improvement

21 Jan 2014


      Hi Paul, thanks for reviewing the changeset, comment inlined below.

On 01/20/2014 09:24 AM, Paul Sandoz wrote:
...
Some quick comments.
In String.toLowerCase:
- it would be nice to get rid of the pseudo goto using the "scan" labelled block.
webrev has been updated to remove the pseudo goto by checking the "first" against
"len" after the loop break.
...
- there is no need for the "localeDependent" local variable.
I just tried to not throw away the result of this "localeDependent", which is still needed
in the toXCaseEx() methods.
...
- you might be able to optimize by doing (could depend on the answer to the next point):
int c = (int)value[i];
  int lc = Character.toLowerCase(c);
  if (.....) { result[i] = (char)lc; } else { return toLowerCaseEx(result, i, locale, localeDependent); }
- Do you need to check ERROR for the result of toLowerCase?
2586             if (c == Character.ERROR ||
Yes, Character.toLowerCase() should never return ERROR (while the package private
Character.toUpperCaseEx() will). In theory there is no need to check if the return
value of  Character.toUpperCase(int) > min_supplementary_code_point in our loop,
because there is no bmp character returns a supplementary code point as its lower
case. But since it's a data driven mapping table, there is no guarantee the unicode
data table is not going to change in the "future", so I still keep the check. The webrev
has been updated to use one single "if" inside the loop.

I have a "single if" implementation for the toUpperCase() as well, though don't sure
which one is better/faster :-)

http://cr.openjdk.java.net/~sherman/8032012/

-Sherman

Re: RFR: JDK-8032012, , String.toLowerCase/toUpperCase performance improvement

Xueming Shen