JDK 8: Second Release Candidate
Xueming Shen
xueming.shen at oracle.com
Sat Feb 15 16:34:45 UTC 2014
Created an issue for this one.
https://bugs.openjdk.java.net/browse/JDK-8035042
Need to go dig out some history. requireEnd is primarily designed for
j.u.Scanner.
-Sherman
On 2/15/14 8:02 AM, Martin Buchholz wrote:
> Java has had for a while now a bug in its regex system which I'd like
> to see fixed.
>
> The short of it is that the \z pattern does not make requireEnd()
> return true, and it should.
>
> public void endTest()
> {
>     Matcher m = Pattern.compile( "\\z" ).matcher( "" );
>     m.find();
>     System.out.println( m.requireEnd() );
>     assert ( m.requireEnd() );
> }
>
> This prints 'false'. It shouldn't take much thought to convince
> yourself that if the end of input is required, then requireEnd()
> should always be true. There's never a case for the \z pattern where
> you want to match less than all of the input. The above code snippet
> would make a fine unit test for this bug, btw.
>
> You can see the results of this bug if you use other parts of the API,
> for example java.util.Scanner. Since requireEnd() always returns
> false for \z, the Scanner matches at the end of its own internal
> buffer (usually 1024 characters) and not at the end of input.
>
>
> public void demo()
> {
>     StringBuilder str = new StringBuilder( 4 * 1024 );
>     for( int i = 0; i < 1024; i++ ) {
>         str.append( i );
>         str.append( ',' );
>     }
>     Scanner s = new Scanner( str.toString() );
>     String result = s.useDelimiter( "\\z" ).next();
>     String expected = str.toString();
>     System.out.println( result.length()+", "+expected.length() );
>     assert( expected.equals( result ) );
> }
>
> Output:
> C:\Users\Brenden\Dev\proj\Test2\build\classes>java -version
> java version "1.8.0"
> Java(TM) SE Runtime Environment (build 1.8.0-b129)
> Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
>
> C:\Users\Brenden\Dev\proj\Test2\build\classes>java -cp .
> quicktest.RegexBug
> 1024, 4010
>
> You can see that the length of the matched string is 1024, not the
> 4010 characters of the original string.
>
> Finally, if you need more convincing, you can test the \Z (capital Z)
> pattern, which does essentially the same thing as \z. \Z always sets
> the requireEnd flag, and it works as expected in the tests above.