RFR JDK-8139414: java.util.Scanner hasNext() returns true, next() throws NoSuchElementException
Stuart Marks
stuart.marks at oracle.com
Tue Jun 14 20:22:35 UTC 2016
Hi Sherman,
The fix looks good.
It would be helpful if the test for 8072582 generated the string instead of
using a literal that's more than 1K long. The exact length is significant
because Scanner's default buffer size is 1024, so the delimiter has to straddle
the buffer boundary.
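
A minimal sketch of what generating that input might look like, assuming the 1024-char default buffer mentioned above and assuming the first read fills it completely. The names `BoundaryInput`, `BUFFER_SIZE`, and `straddling` are hypothetical, chosen for illustration; they are not taken from the webrev.

```java
public class BoundaryInput {
    // Scanner's default buffer size, as stated above.
    static final int BUFFER_SIZE = 1024;

    /**
     * Builds filler + delimiter + tail, placing the delimiter's first char
     * `before` chars ahead of the buffer boundary (hypothetical helper).
     */
    static String straddling(char filler, String delimiter, int before, String tail) {
        int prefixLen = BUFFER_SIZE - before;
        StringBuilder sb =
            new StringBuilder(prefixLen + delimiter.length() + tail.length());
        for (int i = 0; i < prefixLen; i++) {
            sb.append(filler);
        }
        return sb.append(delimiter).append(tail).toString();
    }

    public static void main(String[] args) {
        // A two-char delimiter starting one char before the boundary:
        // '-' lands at index 1023 (last slot of the first buffer fill),
        // ';' at index 1024 (first char beyond it).
        String input = straddling('a', "-;", 1, "b");
        System.out.println(input.length());      // 1026
        System.out.println(input.indexOf("-;")); // 1023
    }
}
```

Deriving the prefix length from the buffer size makes the significance of the length visible in the test itself, which a 1K literal hides.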
The 8139414 test generates its string, which is nicer. In this case the test is
taken from the bug report, but in my opinion the addition of the "boundary"
variable (which is the string ";") makes things more obscure. I'd suggest
inlining it.
For both test cases it might be helpful to have a little utility that appends n
copies of a char to a StringBuilder.
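
Something along these lines is what I have in mind; this is only a sketch, and the class and method names are placeholders rather than anything in the existing test.

```java
public class Repeat {
    /** Appends n copies of c to sb and returns sb to allow chaining. */
    static StringBuilder append(StringBuilder sb, char c, int n) {
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        append(sb, 'a', 3).append(';');
        append(sb, 'b', 2);
        System.out.println(sb); // prints "aaa;bb"
    }
}
```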
Thanks,
s'marks
On 6/8/16 1:57 PM, Xueming Shen wrote:
> Hi,
>
> Please help review the change for
>
> JDK-8139414: java.util.Scanner hasNext() returns true, next() throws
> NoSuchElementException
> JDK-8072582: Scanner delimits incorrectly when delimiter spans a buffer boundary
>
> issue: https://bugs.openjdk.java.net/browse/JDK-8139414
> https://bugs.openjdk.java.net/browse/JDK-8072582
> webrev: http://cr.openjdk.java.net/~sherman/8072582_8139414/webrev
>
> In both cases the delimiter pattern is a kind of "alternation" regex construct
> that can "match" the existing characters at the end of the internal buffer as
> delimiters, AND can extend to match more delimiters if more input is available.
>
> In issue JDK-8139414, hasNext() uses hasTokenInBuffer() to find the delimiters
> "-;". It does not go beyond the boundary to check whether there are more
> characters, such as "-", that can also be part of the delimiters. So hasNext()
> returns true on the assumption that there is a token, because there are more
> characters after "-;". But getCompleteTokenInBuffer() (used by the next()
> implementation) has the logic to check beyond the boundary even when the
> delimiter pattern already has a match. It matches "-;-" as the delimiters and
> then finds no "next" token (null) after that.
>
> The situation is similar for issue 8072582. This time getCompleteTokenInBuffer()
> does not use the "lookingAt() and beyond" logic for the second delimiters, which
> triggers a problem when the delimiter pattern has a different match result
> (beginning position) for the within-boundary and beyond-boundary cases.
>
> The proposed fix here is to always check if there is more input when matching
> delimiters at the internal buffer boundary.
>
> Thanks,
> Sherman