CFR - updated 8001667: Comparator combinators and extension methods

Michael Hixson michael.hixson at gmail.com
Fri Mar 8 03:36:17 PST 2013


That would get me the functionality I want.  It's the route Guava
took, with ordering.onResultOf(function).  If you do go this route, I
suggest "appliedTo" as another possibility for the name.

I do see this solution as a big improvement over what's there
currently.  It works great for most of the examples in my codebase.

The rest of this email will be more argumentative!

What are the advantages of this over my proposal?  (comparing(f,c) and
c1.thenComparing(f,c2))

One thing I don't like about this, which is an issue shared by
ordering.onResultOf (and ordering.nullsFirst too), is that it reads
"backwards".  If you look at the code and try to translate it to plain
English, it sounds strange.  "Sort the objects comparing X.  No wait!
Apply this other transformation first, and compare those results
instead."  It's not so bad in your version of my original example, but
it scales badly as the sort becomes more complex.

Here's an example from my codebase where I'm using non-natural and
null-friendly orderings in the same place.  I think if you try to
translate it to use your proposed solution, it will demonstrate the
awkwardness:

---------------------------------------------------
// Current code (variable and type names altered slightly):

final Map<Integer, Page> pagesByAuthorId = ...

// This is declared outside of the sort below to avoid excessive
object allocations.
final Comparator<Page> pageOrder = Ordering
    .from((Comparator<Page>) (a, b) ->
        CASE_INSENSITIVE_ORDER.compare(a.getTitle(), b.getTitle()))
    .nullsFirst();

authors.sort((a, b) -> ComparisonChain.start()
    .compare(pagesByAuthorId.get(a.getId()),
             pagesByAuthorId.get(b.getId()),
             pageOrder)
    .compare(a.getLastName(), b.getLastName(), CASE_INSENSITIVE_ORDER)
    .compare(a.getId(), b.getId())
    .result());
---------------------------------------------------

To be clear, I'm not unhappy with that code as it is.  It would be
neat if the JDK let write that as clearly (or more clearly?) without
third-party libraries though.

Here's my take at rewriting that using your proposed solution.  I am
assuming I will still need Guava for nulls because you seemed pretty
skeptical about my suggestion there:

---------------------------------------------------
// With c.apply(f):

authors.sort(
    Ordering.from(CASE_INSENSITIVE_ORDER.apply(Page::getTitle)).nullsFirst()
        .apply(author -> pagesByAuthorId.get(author.getId())
        .thenComparing(CASE_INSENSITIVE_ORDER.apply(Author::getLastName))
        .thenComparing(Author::getId));

// In English:
// sort the authors:
//  - ordering from case insensitive order, applied to page title with
nulls first, applied to the author's page
//  - then comparing case insensitive order applied to the author's last name
//  - then comparing the author's id
---------------------------------------------------

The first part of the sort is really weird.  It's so weird that I
think I'd keep my "pageOrder" variable from the first example, even
though hoisting it outside of the call to sort() is no longer a
performance improvement, just because this reads so poorly.

Contrast that with:

---------------------------------------------------
// With comparing(f,c) and c.thenComparing(f,c) and static nullsFirst():

authors.sort(
     comparing(author -> pagesByAuthorId.get(author.getId())
               nullsFirst().thenComparing(
                   Page::getTitle,
                   CASE_INSENSITIVE_ORDER))
         .thenComparing(Author::getLastName, CASE_INSENSITIVE_ORDER)
         .thenComparing(Author::getId));

// In English:
// sort the authors:
//  - comparing the author's page, with nulls first then comparing
page title in case insensitive order
//  - then comparing the author's last name in case insensitive order
//  - then comparing the author's id
---------------------------------------------------

Isn't that easier to read?

-Michael

On Thu, Mar 7, 2013 at 10:57 PM, Henry Jen <henry.jen at oracle.com> wrote:
> I think it all boiled down to re-use an existing Comparator to compare
> for another type.
>
> What about we added this to Comparator<T> as a default method,
>
>>     default <S> Comparator<S> apply(Function<S, ? extends T> keyExtractor) {
>>         Objects.requireNonNull(keyExtractor);
>>         return (Comparator<S> & Serializable)
>>             (c1, c2) -> compare(keyExtractor.apply(c1), keyExtractor.apply(c2));
>>     }
>
> Then the code you illustrated would be,
>
> people.sort(CASE_INSENSITIVE_ORDER.apply(Person::getLastName)
>         .thenComparing(nullsLast.thenComparing(CASE_INSENSITIVE_ORDER)
>                 .apply(Person::getEmailAddress)));
>
> byKey and byValue is actually added based on the same thought that when
> something is not a Comparable. With above, it can be replaced with
>
> cmp.apply(Map.Entry<K,V>::getKey);
>
> This is just inverse thinking of comparing, where the thoughts is mainly
> for Comparable and primitive type. For those, comparing/thenComparing is
> a more convenient and comprehensive expression.
>
> Thoughts?
>
> Cheers,
> Henry
>
>
> On 03/06/2013 03:06 PM, Michael Hixson wrote:
>> On Wed, Mar 6, 2013 at 1:01 PM, Henry Jen <henry.jen at oracle.com> wrote:
>>> On 03/06/2013 02:31 AM, Michael Hixson wrote:
>>>
>>>> 1. Why disable the following code even though it is (theoretically) safe?
>>>>
>>>>   Comparator<CharSequence> a = comparing(CharSequence::length);
>>>>   Comparator<String> b = a.thenComparing(CASE_INSENSITIVE_ORDER);
>>>>
>>>> That code would compile if the signatures of the thenComparing methods
>>>> had generics like <S extends T> instead of <T>.  The example above is
>>>> conjured up and ridiculous, but I think real code will have use for
>>>> it.  Is there any downside to narrowing the return type this way?
>>>>
>>>
>>> I think that make sense, will need to do some experiment to make sure it
>>> won't confuse type system when chaining all together, and I am not sure
>>> how much burden this has on inference engine. I prefer simplicity.
>>>
>>> Theoretically, if all function take Comparator carefully allows super
>>> type, which we have, isn't that enough? Your above example can be,
>>>
>>> Comparator<String> a = comparing<CharSequence::length);
>>> a.thenComparing(CASE_INSENSITIVE_ORDER);
>>>
>>
>> The idea is that I wanted to use both comparators.  It's important
>> that "a" remains Comparator<CharSequence> because I want to sort
>> List<CharSequence> objects with it and use it to generate other
>> CharSequence-subclass comparators in addition to "b".
>>
>> Also, min/max will need <S extends T> or else it will break Guava's
>> Ordering class.  The same thing happened a while back with
>> comparator.reverse() (which was then renamed to reverseOrder).  If the
>> only reason for the rename was to unbreak Guava, then if you use <S
>> extends T> you could change it back because the signatures will match.
>>
>> (Maybe the Guava devs have more insight about this signature?  They
>> went that route for most of Ordering's methods.  Some of the same
>> reasoning might apply here.)
>>
>>>
>>>> 2. There's a thenComparing(Function), but no thenComparing(Function,
>>>> Comparator).  This seems like a big omission.  Surely people will want
>>>> secondary orderings on fields by something other than natural
>>>> (null-unfriendly) ordering.  Also, a Comparators.comparing(Function,
>>>> Comparator) method would remove the need for all the Map-centric
>>>> methods that are currently in the Comparators class.  For instance:
>>>> comparing(Map.Entry::getValue, naturalOrder()).
>>>>
>>>
>>> Note that Function form must return a Comparable, which implies an order
>>> already. Combine with Comparator.reverseOrder() method, that would cover
>>> the ground.
>>>
>>> If the Comparable is not a fit, probably write a Comparator in lambda is
>>> needed anyway?
>>>
>>>> 3. There is no support for sorting of nullable values or fields.
>>>> Sorting of nullable values directly perhaps should not be encouraged,
>>>> but sorting values by nullable fields within a chained sort is
>>>> completely reasonable.  Please add some support for this, either as
>>>> chain methods on Comparator, or as factory methods on Comparators.
>>>>
>>>
>>> Not sure what do you propose to be added. NULL is a controversial topic,
>>> only application know what it means. Therefore, avoid try to interpret
>>> null is probably a better approach. Write a Comparator if needed.
>>>
>>
>> Regarding comparing(Function,comparator) and nulls - I'm possibly just
>> repeating old arguments but I'll give it another shot.
>>
>> There are use cases for these all over the code base I ported to Java
>> 8.  I'll repost the example from my first email since I think that may
>> answer your question about nulls:
>>
>>   // Sort a list of people by:
>>   //  1) Last name, ignoring case
>>   //  2) Email address, with no email (null) last, ignoring case
>>   people.sort(
>>       comparing(Person::getLastName, CASE_INSENSITIVE_ORDER)
>>           .thenComparing(
>>               Person::getEmailAddress,
>>               nullsLast().thenComparing(CASE_INSENSITIVE_ORDER)));
>>
>> The Comparators class itself presents four use cases for
>> comparing(Function,Comparator).  I don't think they're especially good
>> cases, but: naturalOrderKeys, naturalOrderValues, byKey, byValue could
>> all be done instead as comparing(Map.Entry::get{Key,Value},c).  It is
>> odd to me that four specialized versions of this are being offered (I
>> can't recall the last time I wanted to sort map entries) but the more
>> primitive operation behind them is not.
>>
>> -Michael
>>
>


More information about the lambda-dev mailing list