RFR: Regex exponential backtracking issue --- more cleanup/tuning
Xueming Shen
xueming.shen at oracle.com
Fri Mar 18 18:28:54 UTC 2016
Thanks Roger!
I have pulled those regex changes together at
http://cr.openjdk.java.net/~sherman/regexBackTrack.Lamnda.CanonEQ/webrev
So I'm updating the responding change there.
On 03/18/2016 10:03 AM, Roger Riggs wrote:
> Hi Sherman,
>
> A few comments:
>
> src/java.base/share/classes/java/util/regex/CharPredicates.java:
>
> - The static fields like ALPHABETIC should be final
> - line: 139/140 the posix and uprops HashMaps could have an initializer for size to avoid resizing (12 and 18):
> - 224: typo: "categoreis"
> - 232: double space: "return props.get"
>
Updated. I'm making the CharPredicates class final.
> - Remove debugging prints to "PrintPattern.pmap..." before commit
Removed (it's convenient to have them there to just comment in for debugging)
> - 288: can you resort "Pf" and "Pi" to be with the other P<x> entries; maybe sort the whole list.
>
Those on the top with single-line-def are sorted/listed by the order of their types defined
in Character.java.
> - 305: Is the definition of "C" missing the equivalent of "Cn" unassigned or is that an intentional difference?
>
My memory can't tell, but I would assume it was an "intentional" difference. I
will dig a little deep to see if it's the update of the spec. This has not be touched
for a long time, I'm not sure if it can be easily put back, if it's NOT intentional.
> -323: "LD" the definition of LD is not in the reference (is it specified somewhere?)
It's the letter-and-digit. looks like a "home-made" one. Again, not sure if this
was in the spec in the old day. And, don't think we can just remove it.
> -333: the referenced spec defines the names of the character classes but not the contents; are they elsewhere?
http://www.unicode.org/reports/tr44/
5.7.1 General Category Values
> -374: add "final" to static declarations
Updated the whole class to final.
>
> Pattern.java:
> -2651: stray trailing "{" after comment
> -2746: indentation of arguments
Updated
> -5493: javadoc the description of Predicate interface and implementations
> -5637: if you want it to be the first initializer put it at the top of the file, not buried at the end.
> and make them final.
>
The "original" comment is a little "misleading. The Pattern class itself is final though.
> That's a start, I need to look a bit more closely at the algorithmic change that prompted the fix.
>
thanks! I have also asked Masayoshi to help review the canonical-equivalents part.
Sherman
More information about the core-libs-dev
mailing list