RFR(m) 2: 8072722: add stream support to Scanner

Wed Sep 16 15:56:34 UTC 2015

On 9/16/15 8:43 AM, Xueming Shen wrote:
> On 9/15/15 9:48 PM, Stuart Marks wrote:
>>
>>
>> On 9/10/15 2:12 PM, Xueming Shen wrote:
>>> I think it might be a "nice to have" for a "fail-fast" effort after the
>>> consumer consumed/accepted the result (the second check), but isn't it a
>>> bug for the consumer to accept any result if a CME condition has already
>>> occurred?
>>
>> I'm not sure which spliterator we're talking about at this point, but 
>> the issue is similar between them. Prior to calling the consumer's 
>> accept() method, in FindSpliterator, the modCount has previously been 
>> asserted to be equal to expectedCount. In TokenSpliterator, the 
>> expectedCount is refreshed from the modCount immediately prior to 
>> calling accept(). (This is done because advancing the spliterator in 
>> this case increments the modCount.)
>>
>> In both spliterators, then, the expectedCount should be equal to the 
>> modCount immediately prior to the call to accept(). Also in both 
>> spliterators, the modCount and expectedCount are compared immediately 
>> after accept(), and if they aren't equal, CME is thrown.
>>
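To make the check placement described above concrete, here is a minimal sketch. CountingSource is a hypothetical stand-in, not the Scanner code from the webrev; only the ordering of "refresh expectedCount, call accept(), compare again" is meant to match the description.

```java
import java.util.ConcurrentModificationException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;

// Hypothetical scanner-like source; every state change bumps modCount.
class CountingSource {
    private final String[] items;
    private int pos = 0;
    int modCount = 0;

    CountingSource(String... items) { this.items = items; }

    boolean hasNext() { return pos < items.length; }

    String next() { modCount++; return items[pos++]; }

    Spliterator<String> spliterator() {
        return new Spliterators.AbstractSpliterator<String>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            int expectedCount = -1;

            @Override
            public boolean tryAdvance(Consumer<? super String> action) {
                if (!hasNext()) {
                    return false;
                }
                String item = next();      // advancing bumps modCount...
                expectedCount = modCount;  // ...so refresh right before accept()
                action.accept(item);
                if (expectedCount != modCount) {
                    // the consumer modified the source during accept()
                    throw new ConcurrentModificationException();
                }
                return true;
            }
        };
    }
}
```

A consumer that only reads its argument passes through untouched; one that calls next() on the source from inside accept() trips the second comparison.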
>
> This applies to both spliterators, particularly the token() method. The
> check after the accept() method is fine (as you suggested below, it guards
> against wrongdoing by the user code inside accept()). I'm talking about
> the check "immediately" prior to the call to accept(). It will not
> function after the modCount tips over to a negative int value, because of
> the "expectedCount >= 0" check.
>
> Consider the scenario where the Scanner sits on top of an endless input
> stream and you have a token stream on top of it. The check before
> "accept(token)" will not be performed until the expectedCount/modCount
> tips back from negative to a positive value again, then off, then on...
> During the off period (it will take a while to get from negative back to
> positive), the stream will happily feed accept() the "next" token even if
> another thread keeps "stealing" tokens from the same scanner, if the
> timing is right. That does not really look "fail-fast" in this scenario.
>
> This can be "easily" addressed if you have a separate boolean field such
> as "initialized". The code in tryAdvance(...) could look like this:
>
>     if (!initialized) {
>         expectedCount = modCount;
     ---> initialized = true;


>     }
>     if (expectedCount != modCount) {
>         throw new CME();
>     }
>     ...
>
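For illustration, the difference between the two pre-checks after the counter wraps can be shown in isolation. This is only a sketch: PreCheckDemo and its two helper methods are made-up names, and the flag variant is a simplified reading of the suggestion quoted above, not code from the webrev.

```java
public class PreCheckDemo {
    // Pre-check as described in the thread: a negative expectedCount
    // doubles as "not set yet", so the comparison is skipped.
    static boolean sentinelCheckFires(int expectedCount, int modCount) {
        return expectedCount >= 0 && expectedCount != modCount;
    }

    // Variant with a separate boolean flag: the sign of the counter
    // no longer matters.
    static boolean flagCheckFires(boolean initialized, int expectedCount, int modCount) {
        return initialized && expectedCount != modCount;
    }

    public static void main(String[] args) {
        int modCount = Integer.MAX_VALUE;
        modCount++;                    // wraps to Integer.MIN_VALUE
        int expectedCount = modCount;  // spliterator records the count
        modCount++;                    // another thread "steals" a token

        System.out.println(sentinelCheckFires(expectedCount, modCount));   // false
        System.out.println(flagCheckFires(true, expectedCount, modCount)); // true
    }
}
```

The sentinel form misses the modification while the count is negative; the flag form catches it, at the cost of one extra field.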
> Well, if you think this is an unlikely scenario and the intention of the
> check/guard here is mainly to prevent wrongdoing within the pipeline
> operation, then it might not be worth the extra field, and I'm fine with
> the latest webrev.
>
> -Sherman
>
>
>> What this guards against is the accept() method -- really, one of the 
>> application's lambdas that's been passed to a pipeline operation -- 
>> modifying the state of the scanner. This only really works in a 
>> sequential stream, but it's all we've got. (In a parallel stream, I 
>> think the element is buffered somewhere and is handed to another 
>> thread. If that other thread attempts to modify the scanner's state, 
>> all bets are off because of memory visibility issues.)
>>
>> Anyway, at least for sequential streams, this check does properly 
>> guard against the case where somebody modifies the scanner's state 
>> from within a pipeline operation. There are tests for this too; see 
>> ScanTest.streamComodTest().
>>
>
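As a rough illustration of the sequential case discussed above, the following sketch modifies the scanner from inside a pipeline lambda. It assumes a JDK that ships Scanner.tokens() (Java 9 or later) and that the checks behave as described in this thread; the class name TokensComodDemo is made up for the example and is not the test in ScanTest.

```java
import java.util.ConcurrentModificationException;
import java.util.Scanner;

public class TokensComodDemo {
    // Returns true if modifying the scanner from within forEach()
    // is reported as a ConcurrentModificationException.
    static boolean throwsCme() {
        try (Scanner sc = new Scanner("a b c d")) {
            // Calling next() inside the lambda advances the scanner
            // behind the stream's back.
            sc.tokens().forEach(tok -> sc.next());
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + throwsCme());
    }
}
```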