summary of point lambdafication survey
Yuval Shavit
yshavit at akiban.com
Tue Sep 27 12:51:44 PDT 2011
I think your're right about everything there.
Imho (and the h is important here!), convenience functions are a Good Thing
for a few reasons:
* they're convenient
* they encourage certain paradigms (ie, "use this map function rather than
writing a loop yourself")
* they provide a common language, which makes it easier to read other
people's code
* they provide people with example code for "free"
By the last point, I mean that someone who wants to understand how map can
be written in terms of reduce has an obvious place to look -- the source
code for map. This is especially useful when you're trying to teach people a
new concept such as functional programming.
I realize now that my original email could have been construed as
questioning whether you should include a map-written-in-terms-of-reduce
convenience. If so, my apologies -- that wasn't my intent. I was just trying
to get clarification on whether map would be "its own thing," or whether it
would be written in terms of reduce (which I think it should be, if only for
that fourth bullet point).
On Tue, Sep 27, 2011 at 1:56 PM, Stuart Marks <stuart.marks at oracle.com>wrote:
> After mulling this over for a couple days, I realized I wasn't getting
> anywhere staring at the Haskell code you wrote, inasmuch as I don't know
> Haskell. :-) So I decided to rewrite scenario #2 ("groupBy") in Java, using
> fold (which is called "reduce" in the current prototype).
>
> Map<String,List<String>> map =
> list.reduce(new HashMap<String,List<String>>()**, (map, str) ->
> {
> String initial = str.substring(0, 1);
> List<String> val = map.get(initial);
> if (val == null) {
> val = new ArrayList<String>();
> map.put(initial, val);
> }
> val.add(str);
> return map;
> });
>
> Ugh! So this is what you mean by "a bit more work." Most of this is because
> Java lacks things like tuples or a MultiMap. So let's assume the existence
> of some kind of MultiMap:
>
> MultiMap<String,String> map =
> list.reduce(new MultiMap<String,List<String>>(**), (map, str)
> -> {
> map.put(str.substring(0, 1), str);
> return map;
> });
>
> Now as you say if map.put() were to return the map itself (it doesn't in
> any of the multimaps that I've seen, but let's run with this) we could
> simply write:
>
> MultiMap<String,String> map =
> list.reduce(new MultiMap<String,List<String>>(**),
> (map, str) -> map.put(str.substring(0, 1), str));
>
> This is nice, but perhaps not quite as nice as:
>
> MultiMap<String, String> map = groupBy(list, s -> s.substring(0, 1));
>
> or even
>
> Map<String, List<String>> map = groupBy(list, s -> s.substring(0, 1));
>
> So, back to your point. It does seem that a fold ("reduce") operation is
> fairly general, and that other operations such as toMap and groupBy can be
> implemented in terms of fold. Does that mean it's not necessary to have
> these other operations in the library? Or are they common enough that
> they're worthwhile to have as a convenience?
>
> s'marks
>
>
>
>
>
>
>
> On 9/23/11 4:01 PM, Yuval Shavit wrote:
>
>> In a world where the result of inserting into a map is the map that you've
>> just
>> inserted into (Java isn't that world, but it can be made into that world
>> in a
>> couple lines), I think both scenarios you describe can be implemented with
>> a
>> fold. That said, in your #2 scenario, if you really have a Map<K,V> and
>> not a
>> MultiMap<K,V>, then the fold's function does have to do a bit more work.
>>
>> Forgive me if this is overreaching myself, but since I'm reading up on
>> Haskell,
>> I gave both of your directions a go:
>>
>> import Data.Map as M
>> toMap1 :: (Ord k) => (e->k) -> (e->v) -> Map k v -> [e] -> Map k v
>> toMap1 kf vf = foldr (\e-> M.insert (kf e) (vf e))
>>
>> toMap2 :: (Ord k) => (e->k) -> (e->v) -> Map k [v] -> [e] -> Map k [v]
>> toMap2 kf vf = foldr (doinsert kf vf)
>>
>> doinsert :: (Ord k) => (e->k) -> (e->v) -> e -> Map k [v] -> Map k
>> [v]
>> doinsert kf vf e m =
>> let k = kf e
>> v = vf e
>> m' = case (M.lookup k m) of
>> Nothing -> M.insert k [v] m
>> (Just vs) -> M.insert k (v:vs) m
>> in m'
>>
>> *Main> toMap1 show (*2) empty [1..5]
>> fromList [("1",2),("2",4),("3",6),("4",**8),("5",10)]
>>
>> *Main> toMap2 odd id empty [1..5]
>> fromList [(False,[2,4]),(True,[1,3,5])]
>>
>> -Yuval
>>
>> On Fri, Sep 23, 2011 at 5:55 PM, Stuart Marks <stuart.marks at oracle.com
>> <mailto:stuart.marks at oracle.**com <stuart.marks at oracle.com>>> wrote:
>>
>> On 9/23/11 1:39 PM, Yuval Shavit wrote:
>>
>> Collections.toMap sounds like a convenience function for a fold, is
>> that about
>> right?
>>
>>
>> I don't think so. This might be because of how heavily I had condensed
>> the
>> original submission. Let me explain.
>>
>> As it stood, the original submission was interesting but not quite
>> right in
>> my view. It seemed to lead in either of two different directions, both
>> of
>> which are potentially very useful in their own right. Perhaps the
>> submitter
>> could post here to clarify, in case I've misinterpreted something.
>>
>> 1) The first direction is, given a Set and a Mapper, return a Map:
>>
>> <K,V> Map<K,V> toMap(Set<K> set, Mapper<K,V> mapper)
>>
>> Conceptually this applies the mapper function to every element K of the
>> set, giving a value V; these (K,V) pairs are stored into a map which is
>> then returned. It's an open question whether this is done immediately,
>> done
>> lazily, whether results are cached, etc.
>>
>> 2) The other direction is, given a collection of values and a Mapper
>> that
>> generates keys, return a Map that maps each key to the values from
>> which it
>> was generated.
>>
>> I've seen this called groupBy elsewhere. The problem is, how to
>> represent
>> multiple values for a single key. Other systems use tuples or
>> multi-valued
>> maps. For this example I'll use List<V> as the value of each map entry:
>>
>> <K,V> Map<K,List<V>> groupBy(Collection<V> coll, Mapper<V,K> mapper)
>>
>> (Proper wildcards elided.)
>>
>> For each value in the collection, run the mapper on it to get a key;
>> then
>> put this key-value pair into a map. If there are multiple values
>> corresponding to the same key, the values get put into a list. For
>> uniformity you'd always put the values into a list, even if there's
>> only
>> one value.
>>
>> For example,
>>
>> List<String> list = Arrays.asList("Beulah", "Maude", "Gertrude",
>> "Beatrice", "Gladys", "Georgia", "Mildred", "Mabel");
>> Map<String, String> map = groupBy(list, s -> s.substring(0, 1));
>>
>> would result in map containing
>>
>> "B" => ["Beulah", "Beatrice"]
>> "G" => ["Gertrude", "Gladys", "Georgia"]
>> "M" => ["Maude", "Mildred", "Mabel"]
>>
>> s'marks
>>
>>
>>
More information about the lambda-dev
mailing list