JDK 9 RFR(s): 8150488: add note to Scanner.findAll() regarding possible infinite streams
Timo Kinnunen
timo.kinnunen at gmail.com
Thu Mar 30 15:56:41 UTC 2017
Hi,
I guess this somewhat contrived example also wouldn’t work?
String s = "\\b\\w+|\\G|\\B";
String t = "Matcher m = Pattern.compile(s).matcher(t);\n";
Matcher m = Pattern.compile(s).matcher(t);
while (m.find()) {
    System.out.println("'" + m.group() + "'");
}
// Outputs:
// 'Matcher'
// ''
// 'm'
// ''
// ''
// ''
// 'Pattern'
// ''
// 'compile'
// ''
// 's'
// ''
// ''
// 'matcher'
// ''
// 't'
// ''
// ''
// ''
// ''
From: Xueming Shen
Sent: Thursday, March 30, 2017 05:41
To: core-libs-dev at openjdk.java.net
Subject: Re: JDK 9 RFR(s): 8150488: add note to Scanner.findAll() regarding possible infinite streams
On 3/29/17, 5:56 PM, Stuart Marks wrote:
> Hi all,
>
> Please review these non-normative textual additions to the
> Scanner.findAll() method docs. These methods were added earlier in JDK
> 9; there's a small pitfall if the regex can match zero characters.
>
Stuart,
This might make the API itself practically useless. It may be easy to
spot whether a regex can produce a zero-width match when the regex is
simple, but it gets harder and harder, if not impossible, as the regex
gets more complicated, especially in the scenario where the use site is
embedded deep inside a library implementation.
The alternative is to "fix" it, perhaps the way Matcher.find() does: if
the previous match was a zero-width match (first == last), step the
cursor forward by one before the next try. I know
Scanner.findPatternInBuffer() sets a new region every time it is
invoked, which complicates things, but I would assume it is still worth
trying, for example by utilizing "hasNextResult"/matcher.end(). I'm not
sure without looking into the code: does
while (hasNext(pattern)) {
    next(pattern);
}
have the same issue when the pattern matches zero-width?
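For reference, the Matcher.find() behavior mentioned above can be observed with a small sketch. The class and method names (ZeroWidthFindDemo, collectMatches) and the sample pattern "a*" are made up for illustration; only Pattern and Matcher come from the JDK. The loop terminates even though "a*" can match zero characters, because find() moves one position forward after an empty match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZeroWidthFindDemo {
    // Hypothetical helper: collects every match reported by repeated find() calls.
    static List<String> collectMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            // After an empty match, the next find() starts one position later,
            // so this loop cannot spin forever at the same index.
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // "a*" matches the empty string before each 'b' and at the end of input.
        for (String match : collectMatches("a*", "baab")) {
            System.out.println("'" + match + "'");
        }
    }
}
```

With the input "baab" this should report four matches: an empty one at index 0, "aa", an empty one at index 3, and an empty one at the end of the input.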
Thanks!
-Sherman
> Thanks,
>
> s'marks
>
>
> # HG changeset patch
> # User smarks
> # Date 1490749958 25200
> # Tue Mar 28 18:12:38 2017 -0700
> # Node ID 6b43c4698752779793d58813f46d3687c17dde75
> # Parent fb54b256d751ae3191e9cef42ff9f5630931f047
> 8150488: add note to Scanner.findAll() regarding possible infinite streams
> Reviewed-by: XXX
>
> diff -r fb54b256d751 -r 6b43c4698752 src/java.base/share/classes/java/util/Scanner.java
> --- a/src/java.base/share/classes/java/util/Scanner.java Mon Mar 27 15:12:01 2017 -0700
> +++ b/src/java.base/share/classes/java/util/Scanner.java Tue Mar 28 18:12:38 2017 -0700
> @@ -2808,6 +2808,10 @@
> * }
> * }</pre>
> *
> + * <p>The pattern must always match at least one character. If the pattern
> + * can match zero characters, the result will be an infinite stream
> + * of empty matches.
> + *
> * @param pattern the pattern to be matched
> * @return a sequential stream of match results
> * @throws NullPointerException if pattern is null
> @@ -2829,6 +2833,11 @@
> * scanner.findAll(Pattern.compile(patString))
> * }</pre>
> *
> + * @apiNote
> + * The pattern must always match at least one character. If the pattern
> + * can match zero characters, the result will be an infinite stream
> + * of empty matches.
> + *
> * @param patString the pattern string
> * @return a sequential stream of match results
> * @throws NullPointerException if patString is null
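A caller who cannot rule out zero-width matches could bound the stream defensively. The following is only a sketch under that assumption: the names BoundedFindAllDemo, firstMatches and maxMatches are hypothetical, and whether a given JDK build actually produces the infinite stream described in the proposed note has not been verified here. Because limit() is a short-circuiting operation, the pipeline finishes either way. It needs JDK 9 or later for Scanner.findAll().

```java
import java.util.List;
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class BoundedFindAllDemo {
    // Hypothetical helper: returns at most maxMatches matches from findAll().
    static List<String> firstMatches(String input, String regex, int maxMatches) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.findAll(Pattern.compile(regex))
                          .limit(maxMatches)          // short-circuits a possibly infinite stream
                          .map(MatchResult::group)
                          .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        // "x*" can only match zero characters in "abc", so every result is empty.
        List<String> matches = firstMatches("abc", "x*", 3);
        System.out.println(matches.size() + " match(es), all empty: "
                + matches.stream().allMatch(String::isEmpty));
    }
}
```

The cap only limits the damage; it does not tell the caller whether the stream would have ended on its own, so it is no substitute for the stepping fix suggested earlier in the thread.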