JDK 8: Second Release Candidate

Xueming Shen xueming.shen at oracle.com
Sat Feb 15 16:34:45 UTC 2014


Created an issue for this one.

https://bugs.openjdk.java.net/browse/JDK-8035042

Need to go dig out some history. requireEnd is primarily designed for 
j.u.Scanner.

-Sherman

On 2/15/14 8:02 AM, Martin Buchholz wrote:
> Java has had for a while now a bug in its regex system which I'd like 
> to see fixed.
>
> The short of it is that the \z pattern does not set 'requireEnd', 
> and it should.
>
>    public void endTest()
>    {
>       Matcher m = Pattern.compile( "\\z" ).matcher( "" );
>       m.find();
>       System.out.println( m.requireEnd() );
>       assert ( m.requireEnd() );
>    }
>
> This prints 'false'.  It shouldn't take much thought to convince 
> yourself that if the end of input is required, then 'requireEnd()' 
> should always be true.  There's never a case for the \z pattern where 
> you want to match less than all of the input.  The above code snippet 
> would make a fine unit test for this bug, btw.
>
> You can see the results of this bug if you use other parts of the API, 
> for example java.util.Scanner.  Since the requireEnd() method always 
> returns false for \z, the Scanner will match the end of its own 
> internal buffer (usually 1024 characters) and not the end of input.
>
>
>    public void demo()
>    {
>       StringBuilder str = new StringBuilder( 4 * 1024 );
>       for( int i = 0; i < 1024; i++ ) {
>          str.append( i );
>          str.append( ',' );
>       }
>       Scanner s = new Scanner( str.toString() );
>       String result = s.useDelimiter( "\\z" ).next();
>       String expected = str.toString();
>       System.out.println( result.length()+", "+expected.length() );
>       assert( expected.equals( result ) );
>    }
>
> Output:
> C:\Users\Brenden\Dev\proj\Test2\build\classes>java -version
> java version "1.8.0"
> Java(TM) SE Runtime Environment (build 1.8.0-b129)
> Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
>
> C:\Users\Brenden\Dev\proj\Test2\build\classes>java -cp . 
> quicktest.RegexBug
> 1024, 4010
>
> You can see that the length of the matched string is 1024, not 4010 
> from the original string.
>
> Finally, if you need more convincing, you can test the \Z (capital Z) 
> pattern, which does essentially the same thing as \z.  \Z always sets 
> its requireEnd() flag, and it works as expected in the tests above.
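
For anyone who wants to compare the two anchors side by side, here is a 
minimal sketch that runs the same find() with \z and \Z on empty input and 
prints what requireEnd() reports. The class name RequireEndProbe and the 
helper requireEndAfterFind are hypothetical names made up for this 
illustration; only Pattern.compile, Matcher.find and Matcher.requireEnd 
are real API. What \z prints depends on the JDK build you run it on.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RequireEndProbe {
    // Hypothetical helper (not part of the JDK): performs a single find()
    // and returns whatever Matcher.requireEnd() reports afterwards.
    static boolean requireEndAfterFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.find();
        return m.requireEnd();
    }

    public static void main(String[] args) {
        // \z: the case reported above; the result may differ between JDK builds.
        System.out.println("\\z requireEnd: " + requireEndAfterFind("\\z", ""));
        // \Z: the case the report says behaves as expected.
        System.out.println("\\Z requireEnd: " + requireEndAfterFind("\\Z", ""));
    }
}
```

If the two lines disagree on a given build, that build still shows the 
behaviour described in the report.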



