JDK 8: Second Release Candidate
Brenden Towey
brendentowey at gmail.com
Fri Feb 14 10:54:07 PST 2014
I'd like to report a bug. Java's regex engine has had a long-standing
bug that I'd like to see fixed.
The short of it is that the \z pattern does not set the 'requireEnd'
flag (Matcher.requireEnd() returns false after a match), and it should.
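// needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
public void endTest()
{
    Matcher m = Pattern.compile( "\\z" ).matcher( "" );
    m.find();   // \z matches at the end of the (empty) input
    System.out.println( m.requireEnd() );
    assert ( m.requireEnd() );   // only checked when run with -ea
}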
This prints 'false'. It shouldn't take much thought to convince
yourself that if a pattern requires the end of input, then requireEnd()
should return true: \z can only match at the very end of input, so any
additional input would turn the match into a non-match. The code snippet
above would make a fine unit test for this bug, btw.
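For anyone unsure what the flag means: the Matcher javadoc says requireEnd() returns true if more input could change a positive match into a negative one. Here is a minimal sketch of that contract using '$' instead of \z, so it does not depend on the bug. The class and helper names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is illustrative only.
class RequireEndDemo {

    // Runs find() once and reports whether the match depends on the end of input.
    static boolean requireEndAfterFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalStateException("no match for " + regex);
        }
        return m.requireEnd();
    }

    public static void main(String[] args) {
        // "foo$" matches only because the input stops after "foo";
        // appending more characters could turn the match into a non-match.
        System.out.println(requireEndAfterFind("foo$", "foo"));
        // A plain "foo" stays matched no matter what is appended.
        System.out.println(requireEndAfterFind("foo", "foo"));
    }
}
```

The first call should report true and the second false. The same reasoning applies to \z, which is why a 'false' there looks wrong.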
You can see the results of this bug if you use other parts of the API,
for example java.util.Scanner. Since requireEnd() always returns false
for \z, the Scanner treats the end of its own internal buffer (usually
1024 characters) as the end of input and returns a token that stops
there.
// needs: import java.util.Scanner;
public void demo()
{
    // Build "0,1,2,...,1023," as the input.
    StringBuilder str = new StringBuilder( 4 * 1024 );
    for( int i = 0; i < 1024; i++ ) {
        str.append( i );
        str.append( ',' );
    }
    Scanner s = new Scanner( str.toString() );
    // With \z as the delimiter, the single token should be the whole input.
    String result = s.useDelimiter( "\\z" ).next();
    String expected = str.toString();
    System.out.println( result.length()+", "+expected.length() );
    assert( expected.equals( result ) );
}
Output:
C:\Users\Brenden\Dev\proj\Test2\build\classes>java -version
java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0-b129)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
C:\Users\Brenden\Dev\proj\Test2\build\classes>java -cp . quicktest.RegexBug
1024, 4010
You can see that the length of the matched string is 1024, not the 4010
characters of the original string.
Finally, if you need more convincing, you can test the \Z (capital Z)
pattern, which does essentially the same thing as \z (it also matches
before a final line terminator). \Z always sets its requireEnd flag,
and it works as expected in the tests above.
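A sketch of that comparison, run against the same comma-separated input as the Scanner demo. The class and helper names are hypothetical. It only prints what the running JDK does for each anchor, and it makes no claim about the \z result on builds other than the one shown above.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison harness; names are illustrative only.
class EndAnchorComparison {

    // requireEnd() after matching the given anchor against empty input.
    static boolean requireEndOnEmpty(String regex) {
        Matcher m = Pattern.compile(regex).matcher("");
        m.find();
        return m.requireEnd();
    }

    // Length of the first Scanner token when the anchor is used as the delimiter.
    static int firstTokenLength(String input, String delimiter) {
        try (Scanner s = new Scanner(input)) {
            return s.useDelimiter(delimiter).next().length();
        }
    }

    // Same input as the Scanner demo: "0,1,...,1023,"
    static String sampleInput() {
        StringBuilder str = new StringBuilder(4 * 1024);
        for (int i = 0; i < 1024; i++) {
            str.append(i).append(',');
        }
        return str.toString();
    }

    public static void main(String[] args) {
        String input = sampleInput();
        for (String anchor : new String[] { "\\Z", "\\z" }) {
            System.out.println(anchor
                    + " requireEnd=" + requireEndOnEmpty(anchor)
                    + " tokenLength=" + firstTokenLength(input, anchor)
                    + " inputLength=" + input.length());
        }
    }
}
```

For \Z the token length should equal the input length. A shorter token for \z is the symptom described in this report.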
Summary: \z should always set its 'requireEnd' flag.
Thanks for taking the time to read this!
On 2/11/2014 2:31 PM, mark.reinhold at oracle.com wrote:
> Last week a serious flaw in a new API was reported [1]. We decided to
> fix that bug, along with an unrelated JCK failure on Mac OS X [2], so
> we now have a second JDK 8 Release Candidate, build 129.
>
> Binaries available here, as usual: https://jdk8.java.net/download.html
>
>