RFR(m) 2: 8072722: add stream support to Scanner
Stuart Marks
stuart.marks at oracle.com
Wed Sep 16 04:48:57 UTC 2015
On 9/10/15 2:12 PM, Xueming Shen wrote:
> I think it might be a "nice to have" for a "fail-fast" effort after the
> consumer consumed/accepted the result (the second check), but isn't it a bug
> for the consumer to accept any result if a CME condition has already
> occurred?
I'm not sure which spliterator we're talking about at this point, but the issue
is similar between them. Prior to calling the consumer's accept() method, in
FindSpliterator, the modCount has previously been asserted to be equal to
expectedCount. In TokenSpliterator, the expectedCount is refreshed from the
modCount immediately prior to calling accept(). (This is done because advancing
the spliterator in this case increments the modCount.)
In both spliterators, then, the expectedCount should be equal to the modCount
immediately prior to the call to accept(). Also in both spliterators, the
modCount and expectedCount are compared immediately after accept(), and if they
aren't equal, CME is thrown.
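
To make the check around accept() concrete, here is a minimal sketch of the TokenSpliterator-style pattern. It is not the actual Scanner code: TokenSource, its fields, and its methods are hypothetical stand-ins, and the only assumption carried over is that advancing the source bumps modCount.

```java
import java.util.ConcurrentModificationException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical stand-in for Scanner: a token source with a modification counter.
class TokenSource {
    private final String[] tokens;
    private int pos = 0;
    private int modCount = 0;   // incremented on every state change

    TokenSource(String... tokens) { this.tokens = tokens; }

    boolean hasNext() { return pos < tokens.length; }

    String next() {
        modCount++;             // advancing counts as a modification
        return tokens[pos++];
    }

    Stream<String> tokens() {
        return StreamSupport.stream(new TokenSpliterator(), false);
    }

    private class TokenSpliterator extends Spliterators.AbstractSpliterator<String> {
        private int expectedCount;

        TokenSpliterator() {
            super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
        }

        @Override
        public boolean tryAdvance(Consumer<? super String> action) {
            if (!hasNext()) {
                return false;
            }
            String token = next();       // this increments modCount
            expectedCount = modCount;    // so refresh just before accept()
            action.accept(token);
            // If the consumer touched the source, the counts now differ.
            if (expectedCount != modCount) {
                throw new ConcurrentModificationException();
            }
            return true;
        }
    }

    public static void main(String[] args) {
        TokenSource ok = new TokenSource("a", "b", "c");
        System.out.println(ok.tokens().collect(Collectors.toList()));

        TokenSource bad = new TokenSource("a", "b", "c", "d");
        try {
            bad.tokens().forEach(t -> bad.next());   // lambda modifies the source
        } catch (ConcurrentModificationException e) {
            System.out.println("CME");
        }
    }
}
```

The point of the sketch is only the ordering: refresh expectedCount, call accept(), compare again.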
What this guards against is the accept() method -- really, one of the
application's lambdas that's been passed to a pipeline operation -- modifying
the state of the scanner. This only really works in a sequential stream, but
it's all we've got. (In a parallel stream, I think the element is buffered
somewhere and is handed to another thread. If that other thread attempts to
modify the scanner's state, all bets are off because of memory visibility issues.)
Anyway, at least for sequential streams, this check does properly guard against
the case where somebody modifies the scanner's state from within a pipeline
operation. There are tests for this too; see ScanTest.streamComodTest().
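
For illustration, the kind of misuse this guards against looks roughly like the following. This is a sketch written against the tokens() API as proposed in this review, not a copy of the test in ScanTest; the class and method names are made up.

```java
import java.util.ConcurrentModificationException;
import java.util.Scanner;

class ComodDemo {
    // Reports whether modifying the scanner from inside a sequential
    // pipeline operation is detected with a CME.
    static boolean throwsCme() {
        Scanner sc = new Scanner("a b c d");
        try {
            // The lambda advances the same scanner that feeds the stream.
            sc.tokens().forEach(token -> sc.next());
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsCme());
    }
}
```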
>>>> It'd be better to initialize expectedCount to modCount in the constructor?
>>
>> That's how I had it initially, but at Paul Sandoz' suggestion I delayed the
>> initialization to the first call to tryAdvance(). This allows the Scanner's
>> state to be modified after stream creation but before stream pipeline
>> execution. This is the way that Paul's stream code in Matcher works. I'm not
>> sure how important this is. Having Scanner be gratuitously different from
>> Matcher seems like it would be irritating though.
>
> I noticed the spec says "Scanning starts upon initiation of the terminal
> stream operation, using the current state of this scanner..." I guess it means
> the "CME" enforcement starts when the "stream operation" starts (a kind of
> late initialization). But personally I feel it may create an unnecessarily
> inconsistent situation, depending on whether or not there is a state change
> between the creation of the Stream object and the start of the stream
> operation. But I'm not a stream expert :-)
Well, one of my earlier revisions basically said that you can't touch the
Scanner at all after tokens() or findAll() has been called. This works, but is
unnecessarily restrictive, and it's inconsistent with Paul's approach with
Matcher.results().
It's pretty easy to see that the less restrictive approach is safe, because the
constructors for the new spliterators simply initialize themselves; they don't
hang onto any state from the scanner. The only actual dependence on the state
of the scanner starts at the
first call to tryAdvance(), which is when the first element is actually
introduced to the stream. It's safe for the application to change the state of
the scanner any time up until that point. It does introduce a little bit of
complexity in that there's an additional state in the expectedCount checking (as
we've seen) :-). But it does allow a bit more flexibility with the caller's
handling of the scanner and a stream derived from it.
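
As a rough example of what the delayed initialization permits (again a sketch against the proposed API, with made-up class and method names), the caller can advance the scanner between creating the stream and running it, and the stream picks up from the scanner's current state:

```java
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStartDemo {
    static List<String> remainingTokens() {
        Scanner sc = new Scanner("a b c");
        Stream<String> tokens = sc.tokens();   // nothing is scanned yet
        sc.next();                             // consume "a" before the terminal operation
        // Scanning starts here, from the scanner's current state.
        return tokens.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(remainingTokens());
    }
}
```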
s'marks