RFR(m) 2: 8072722: add stream support to Scanner
Stuart Marks
stuart.marks at oracle.com
Wed Sep 16 23:08:09 UTC 2015
On 9/16/15 8:43 AM, Xueming Shen wrote:
> I'm talking about the check "immediately" prior to the call to accept(). It
> will not function after the modCount tips over to the negative int value,
> because the "expectedCount >=0" check.
>
> Consider the use scenario that the Scanner is on top of an endless input
> stream, you have a token stream on top of it. The check before the
> "accept(token" will not be performed until the expectedCount/modCount tips
> back to positive value again from the negative, then off, then on... During
> the off period (it will take a while from negative back to positive), the
> stream will just work fine to feed the accept() the "next" token even if
> there is another thread keeps "stealing" tokens from the same scanner, if the
> timing is right. Looks like not really a "fail-fast" in this scenario.
Right, after modCount wraps around to negative, the CME checking becomes
dysfunctional. It doesn't do any harm, but it ceases to perform proper checking.
This is kind of a corner case but ... well I admit I did have to do a fair bit
of puzzling to figure out what the behavior would be and to prove to myself that
it was benign.
> This can be "easily" addressed, if you have a separate boolean field such as
> "initialized". The code can look like below in tryAdvance(...)
[edited per your subsequent message]
>
> if (!initialized) {
> expectedCount = modCount;
> initialized = true;
> }
> if (expectedCount != modCount) {
> throw new CME();
> }
> ...
>
> Well, if you think this is an unlikely use scenario and the intention of the
> check/guard here is mainly to prevent the wrong doing within the pipe
> operation, then it might not worth the extra field, and I'm fine with the
> latest webrev.
The cost of the additional field is negligible. I haven't written out the code
but I suspect that an explicit "initialized" field will be easier to reason
about, certainly easier to understand than the behavior that occurs if modCount
wraps around to negative.
Note that this also applies to Matcher, although it's less likely since
Matcher's input is a CharSequence instead of an indefinite-sized source such as
a file or an input stream. In talking to Paul Sandoz about this (author of the
streams stuff for Matcher) he felt it was important to keep the behaviors of
Matcher and Scanner consistent.
But this has dragged out somewhat and I don't really want to add Matcher changes
to this changeset. How about I do the following:
1) I'll push the latest webrev as it stands.
2) I'll file a separate bug to clean up Scanner's and Matcher's spliterators'
modCount checking to avoid the overflow issue.
3) I'll fix all the spliterators at the same time.
How does that sound?
s'marks
More information about the core-libs-dev
mailing list