Codereview request for 7189363: Regex Pattern compilation buggy for special sequences

Alan Bateman Alan.Bateman at oracle.com
Thu Aug 9 08:45:27 UTC 2012


On 08/08/2012 22:58, Xueming Shen wrote:
> Hi
>
> It appears the optimization (flattening a binary tree structure into a
> branch/switch) we put into JDK6 for the alternation operation [1] has a
> problem if the first construct is a group "(...)" followed by a
> greedy/reluctant "once or not at all" quantifier. For example, the regex
> "(a)?bc|d" or "(a)??bc|d" finds no match for input "d", while "a?bc|d"
> or "a??bc|d" works fine.
>
> The root cause is that expr() mistakenly uses the Branch node from the
> sub-expression (in the above case, the Branch node for "(a)?") instead of
> its own Branch node, because of the incorrect use of "prev instanceof
> Branch" (see the diff), which basically messes up the internal regex node
> tree [2].
>
> The webrev is at
> http://cr.openjdk.java.net/~sherman/7189363/webrev
This looks okay to me too. I'm also surprised it hasn't been noticed 
before now.
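
For anyone who wants to reproduce the report, here is a minimal sketch of
the four cases from the description above. The class and method names
(AlternationCheck, found) are made up for illustration and are not part of
the webrev; only java.util.regex.Pattern and Matcher.find() are used. On a
build without the fix, the two group forms are the ones reported to miss
the match for "d"; with the fix, all four should report a match.

```java
import java.util.regex.Pattern;

// Hypothetical reproduction helper, not part of the proposed change.
class AlternationCheck {

    // Returns true if the regex finds a match anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // The two group forms are the ones described as failing;
        // the two non-group forms are described as working.
        String[] regexes = { "(a)?bc|d", "(a)??bc|d", "a?bc|d", "a??bc|d" };
        for (String regex : regexes) {
            System.out.println(regex + " finds \"d\": " + found(regex, "d"));
        }
    }
}
```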

-Alan.


