RFR(m) 2: 8072722: add stream support to Scanner

Wed Sep 16 15:43:43 UTC 2015

On 9/15/15 9:48 PM, Stuart Marks wrote:
>
>
> On 9/10/15 2:12 PM, Xueming Shen wrote:
>> I think it might be a "nice to have" for a "fail-fast" effort after the
>> consumer consumed/accepted the result (the second check), but isn't it a
>> bug for the consumer to accept any result if a CME condition has already
>> occurred?
>
> I'm not sure which spliterator we're talking about at this point, but 
> the issue is similar between them. Prior to calling the consumer's 
> accept() method, in FindSpliterator, the modCount has previously been 
> asserted to be equal to expectedCount. In TokenSpliterator, the 
> expectedCount is refreshed from the modCount immediately prior to 
> calling accept(). (This is done because advancing the spliterator in 
> this case increments the modCount.)
>
> In both spliterators, then, the expectedCount should be equal to the 
> modCount immediately prior to the call to accept(). Also in both 
> spliterators, the modCount and expectedCount are compared immediately 
> after accept(), and if they aren't equal, CME is thrown.
>

For both spliterators, and particularly the one behind the tokens()
method: the check after the accept() call is fine (as you explain below,
it guards against wrongdoing by the user code inside accept()). I'm
talking about the check "immediately" prior to the call to accept(). It
will not function once the modCount tips over to a negative int value,
because of the "expectedCount >= 0" condition.

Consider the scenario where the Scanner sits on top of an endless input
stream and you have a token stream on top of it. Once the counter goes
negative, the check before "accept(token)" is not performed until
expectedCount/modCount wraps back to a positive value, then it goes off
again, then on... During the off period (it takes a while to get from
negative back to positive), the stream will happily feed accept() the
"next" token even if another thread keeps "stealing" tokens from the
same scanner, if the timing is right. That doesn't really look
"fail-fast" in this scenario.

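To make the off period concrete, here is a minimal sketch. The class and
method names are hypothetical and this is not the Scanner source; it only
assumes the guard has the "expectedCount >= 0 && expectedCount != modCount"
shape discussed above.

```java
// Hypothetical sketch, not the Scanner source: names are illustrative only.
final class SentinelGuardSketch {

    // Pre-accept check in the style discussed above: a negative
    // expectedCount is treated as "not initialized yet", so the
    // comparison is skipped entirely.
    static boolean sentinelGuardDetects(int expectedCount, int modCount) {
        return expectedCount >= 0 && expectedCount != modCount;
    }

    public static void main(String[] args) {
        int modCount = Integer.MAX_VALUE;
        modCount++;                     // int overflow: wraps to Integer.MIN_VALUE
        int expectedCount = modCount;   // snapshot taken by the spliterator
        modCount++;                     // someone else advances the scanner

        // The counts differ, but the guard is disabled while they are negative.
        System.out.println(sentinelGuardDetects(expectedCount, modCount)); // false

        // With non-negative counts the same mismatch is detected.
        System.out.println(sentinelGuardDetects(5, 6)); // true
    }
}
```

The mismatch is invisible to the pre-accept check for as long as the
snapshot stays negative.
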
This can be "easily" addressed if you have a separate boolean field
such as "initialized". The code in tryAdvance(...) could look like this:

     if (!initialized) {
         expectedCount = modCount;
         initialized = true;
     }
     if (expectedCount != modCount) {
         throw new ConcurrentModificationException();
     }
     ...
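
As a self-contained illustration of the same idea (again with hypothetical
names, and a plain int field standing in for the scanner's modCount), a
rough sketch of a flag-based guard that keeps working after the counter
has wrapped negative:

```java
import java.util.ConcurrentModificationException;

// Hypothetical sketch, not the webrev code: names are illustrative only.
final class FlagGuardSketch {
    int modCount;                 // stands in for the scanner's modification counter
    private int expectedCount;    // snapshot of modCount
    private boolean initialized;  // replaces the "expectedCount >= 0" sentinel

    // Check to run before handing an element to the consumer.
    void checkBeforeAccept() {
        if (!initialized) {
            expectedCount = modCount;
            initialized = true;
        }
        if (expectedCount != modCount) {
            throw new ConcurrentModificationException();
        }
    }

    public static void main(String[] args) {
        FlagGuardSketch guard = new FlagGuardSketch();
        guard.modCount = Integer.MIN_VALUE; // counter has already wrapped negative
        guard.checkBeforeAccept();          // first call only takes the snapshot
        guard.modCount++;                   // concurrent modification
        try {
            guard.checkBeforeAccept();
        } catch (ConcurrentModificationException e) {
            System.out.println("CME detected despite negative counts");
        }
    }
}
```

The cost is the one extra field per spliterator mentioned above.
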

Well, if you think this is an unlikely scenario and the intention of
the check/guard here is mainly to prevent wrongdoing within the
pipeline operation, then it might not be worth the extra field, and
I'm fine with the latest webrev.

-Sherman

> What this guards against is the accept() method -- really, one of the 
> application's lambdas that's been passed to a pipeline operation -- 
> modifying the state of the scanner. This only really works in a 
> sequential stream, but it's all we've got. (In a parallel stream, I 
> think the element is buffered somewhere and is handed to another 
> thread. If that other thread attempts to modify the scanner's state, 
> all bets are off because of memory visibility issues.)
>
> Anyway, at least for sequential streams, this check does properly 
> guard against the case where somebody modifies the scanner's state 
> from within a pipeline operation. There are tests for this too; see 
> ScanTest.streamComodTest().
>