About the stream deduping method

Kevin Bourrillion kevinb at google.com
Thu Jan 24 12:16:22 PST 2013


Brian's survey summary:

> "distinct" seemed preferred to "unique".

Sounds good, let this be the end of "unique" then.


> Also, most other method names are verby; I propose changing to
> "removeDuplicates" or "filterDuplicates".

*About the term "duplicate".*  If it were always the case that the *first*
occurrence of 'obj' in a stream is the one preserved, and all the rest
thereby deemed to be the "duplicates" and excluded, then there would be a
stronger reason to like the use of "duplicates" here.  Since that's not the
case, it's just "meh".

*About "remove".* It sounds very mutative.  Since Streams feel similar to
Iterators, there is some potential for confusion.
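To make the "mutative" worry concrete, here is a minimal sketch contrasting Iterator.remove(), which changes the backing collection in place, with a stream pipeline, which leaves its source alone. It assumes the java.util.stream API as it eventually shipped in Java 8, which may differ from the draft under discussion; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

public class RemoveVsStreamDemo {
    // Iterator.remove() mutates the backing list in place.
    static void removeBlanksInPlace(List<String> list) {
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().isEmpty()) {
                it.remove();
            }
        }
    }

    // A stream pipeline does not touch its source; it produces a new list.
    static List<String> withoutBlanks(List<String> list) {
        return list.stream()
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("a", "", "b"));

        List<String> result = withoutBlanks(source);
        System.out.println("after stream: source=" + source + " result=" + result);

        removeBlanksInPlace(source);
        System.out.println("after iterator remove: source=" + source);
    }
}
```

A reader who knows Iterator.remove() could reasonably expect a stream method named "remove..." to behave like the first method rather than the second.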

*About "filter".* The sense is inverted. filter() preserves the ones that
*do* match the predicate. This would have to be something like
"filterOutDuplicates" (yuck).
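A small sketch of the inverted sense, again assuming the java.util.stream API as shipped in Java 8 rather than the draft at the time. The helper filterOutDuplicates is hypothetical, named only to mirror the wording above; it relies on a stateful predicate and is meant for sequential streams only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class FilterSenseDemo {
    // filter() keeps the elements for which the predicate returns true.
    static List<Integer> keepEvens(List<Integer> input) {
        return input.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    // To drop duplicates with filter(), the predicate has to say
    // "not seen before", i.e. it matches the elements that are kept.
    // Hypothetical helper; stateful predicate, sequential use only.
    static List<String> filterOutDuplicates(List<String> input) {
        Set<String> seen = new HashSet<>();
        return input.stream()
                .filter(seen::add) // Set.add returns true only on first insertion
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(keepEvens(Arrays.asList(1, 2, 3, 4)));
        System.out.println(filterOutDuplicates(Arrays.asList("a", "b", "a", "c", "b")));
    }
}
```

In both methods the predicate describes what survives, which is why a name like "filterDuplicates" would read as "keep the duplicates".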

*About consistency*.  Right now among the chaining-style methods you've got
imperative verbs (filter, map, limit, explode, perhaps sort), you've got
nouns (substream), adjectives (parallel, sequential, sorted), and one
that's not even based in anything grammatical (tee).  (Side point: the set
of terminal operations is no more consistent than this, either.)  If
consistency seems at all attainable (?), we would have to start by choosing
which out of all this we want to be consistent *with*.

*If I had to choose.*  I would probably just go with:   distinct().   That's
all SQL needed, and I think it did fine with it.
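For illustration, this is roughly how a distinct() call reads in a pipeline, assuming the Stream API as it later shipped in Java 8 (the draft being discussed here may have differed). The class and method names are hypothetical. Note that the shipped API documents that, for ordered streams, the first occurrence in encounter order is the one kept; that guarantee is not assumed anywhere in the discussion above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DistinctDemo {
    // Reads much like SQL's SELECT DISTINCT: one element per equals()-class.
    static List<String> distinctNames(List<String> names) {
        return names.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob", "ann", "cy", "bob");
        System.out.println(distinctNames(names));
    }
}
```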


-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

