DevoxxUK Lambdas Lab

Wed Apr 3 04:13:01 PDT 2013

My experience (part 2):
I found the API for the word frequency use case to be remarkably
tricky to find. The proposed "correct" solution is certainly
non-obvious, and IMO less clear than writing the code out in
non-lambda Java.

Following a relatively simple approach to trying to solve a problem -
just keep adding steps until it works - I ended up with something
pretty horrendous:

try (BufferedReader br = new BufferedReader(
    new InputStreamReader(
     CountWordFreq.class.getResourceAsStream("book.txt")))) {

  Map<String, List<String>> initial = br.lines()
    .flatMap(s -> Arrays.stream(s.split(" ")))
    .collect(Collectors.groupingBy(s -> s));

  Map<Map.Entry<String, Integer>, Integer> freq1 = initial
    .entrySet().stream()
    .map(entry -> new AbstractMap.SimpleImmutableEntry<String, Integer>(
        entry.getKey(), entry.getValue().size()))
    .collect(Collectors.toMap(entry -> entry.getValue()));

  Supplier<HashMap<String, Integer>> supplier =
    () -> new HashMap<String, Integer>();
  BiConsumer<HashMap<String, Integer>, Map.Entry<String, Integer>> accum =
    (HashMap<String, Integer> result, Map.Entry<String, Integer> entry) ->
      result.put(entry.getKey(), entry.getValue());
  BiConsumer<HashMap<String, Integer>, HashMap<String, Integer>> merger =
    HashMap::putAll;

  Map<String, Integer> freq2 = initial.entrySet().stream()
    .map(entry -> new AbstractMap.SimpleImmutableEntry<String, Integer>(
        entry.getKey(), entry.getValue().size()))
    .collect(supplier, accum, merger);
}

This is far worse in readability than it would have been just writing
it imperatively.
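
For comparison, a minimal imperative sketch of the same word count might
look like the following. The class and method names are made up for
illustration, and the input is taken as a list of lines rather than the
book.txt resource used above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class ImperativeWordFreq {
    // Counts how often each space-separated word occurs across the given lines.
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> freq = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                Integer old = freq.get(word);
                // First sighting starts at 1, otherwise increment the running count.
                freq.put(word, old == null ? 1 : old + 1);
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("a b a", "b c")));
    }
}
```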

Some key points.
The s -> s in the groupingBy looks weird, but is fairly logical.

The first and relatively easy step was producing a Map<String,
List<String>>. I suspected that I needed to reduce to get the list
into a number, but it was completely non-obvious to me as to how to do
that. The Collectors.reducing methods didn't seem to fit in anywhere,
so I tried looking at the IntStatistics stuff. Again, it wasn't
obvious as to whether they were relevant or how I'd use them. So,
rather than ask for help, I decided to continue on and see if I could
find any solution just by working with the APIs.

As the collect-groupingBy has resulted in a Map, I started from there.
The only way to get a stream was on the entrySet(), which wasn't
really what I wanted but would have to do.

The map() method to map the list to its size was easy in theory, but
very verbose in practice.

ACTION:
Add a static method to Map.Entry:
public static <K, V> Entry<K, V> of(K key, V value) {
  return new AbstractMap.SimpleImmutableEntry<K, V>(key, value);
}
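
To show what such a factory would buy, here is a small sketch in which a
hypothetical helper class, Entries, stands in for the proposed
Map.Entry.of. Neither the class nor the sizes() method is part of the API;
they only illustrate how the map() step shortens.

```java
import java.util.AbstractMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class Entries {
    // Hypothetical stand-in for the proposed Map.Entry.of(key, value).
    static <K, V> Map.Entry<K, V> of(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Maps each (word, occurrences) entry to a (word, count) entry.
    static List<Map.Entry<String, Integer>> sizes(Map<String, List<String>> grouped) {
        return grouped.entrySet().stream()
            .map(e -> Entries.of(e.getKey(), e.getValue().size()))
            .collect(Collectors.toList());
    }
}
```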

I then wanted to convert the stream of Entry back to a Map. I assumed
this would be easy. I spent lots of time going insane trying to find a
way. Eventually, I got help to use the 3-arg form of collect(). Even
then, working out what the 3 arguments meant and should be was complex
and confusing. (I ended up writing the three arguments as separate
lines, as I was getting inference errors and I had at that point low
confidence in the inference engine. It turned out that the error was a
separate problem, but I never tried putting the lines "back in", as
the pain of using the API on that day was too much by that stage).
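
For reference, a sketch of the 3-arg collect() with the three arguments
written inline rather than as separate variables. It assumes the
collect(supplier, accumulator, combiner) signature as it later shipped in
Java 8; the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

class ThreeArgCollect {
    // Collects a stream of entries into a HashMap using the 3-arg collect().
    static Map<String, Integer> toMap(Stream<Map.Entry<String, Integer>> entries) {
        return entries.collect(
            HashMap::new,                                   // supplier: creates an empty result
            (map, e) -> map.put(e.getKey(), e.getValue()),  // accumulator: adds one entry
            HashMap::putAll);                               // combiner: merges partial results
    }
}
```

The combiner only matters when the stream is processed in parallel, which
is part of why the three roles are hard to guess from the signature alone.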

ACTION:
Add a much better way to convert a stream of Entry to a Map. Perhaps:
 .collect(Collectors.toMap(entry -> entry.getKey(), entry -> entry.getValue()));
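
A sketch of how a key-mapper/value-mapper toMap would be used, assuming
the two-function form of Collectors.toMap as it later shipped in Java 8
(the build used at the lab did not have it). The class and method names
are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EntriesToMap {
    // Turns the grouped map (word -> occurrences) into word -> count,
    // without building intermediate Entry objects.
    static Map<String, Integer> wordCounts(Map<String, List<String>> grouped) {
        return grouped.entrySet().stream()
            .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue().size()));
    }
}
```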

The "correct" approach was apparently this:
 Map<String, Integer> right = br.lines()
   .flatMap(s -> Arrays.stream(s.split(" ")))
   .collect(Collectors.groupingBy(
        s -> s, Collectors.reducing(s -> 1, Integer::sum)));

This still seems pretty non-obvious. I guess it's the stage of mapping
the string to the number one which grates. (I can see what it does,
but as I've said before, reducing is much more of a trick for
non-functional programmers to move to.)
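
To spell the reduction out, here is a self-contained sketch. It assumes
the three-argument Collectors.reducing(identity, mapper, op) as it later
shipped in Java 8, which differs from the two-argument form quoted above,
and it takes the lines as a list instead of reading book.txt. The class
name is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ReducingWordFreq {
    static Map<String, Integer> count(List<String> lines) {
        return lines.stream()
            // Split every line into its space-separated words.
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .collect(Collectors.groupingBy(
                word -> word,                  // group identical words together
                Collectors.reducing(
                    0,                         // count for a group starts at zero
                    word -> 1,                 // each occurrence contributes one
                    Integer::sum)));           // contributions are added up
    }
}
```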

ACTION:
Can there be some kind of reducingSum() method? It feels like a common
use case, and its source code would give people an example to learn
from before they write more advanced reducers themselves.
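
As a sketch of what a summing collector looks like in use, the following
assumes Collectors.summingInt and Collectors.counting as they later
shipped in Java 8; neither name is the reducingSum() asked for above. The
class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SummingWordFreq {
    // Sums a constant 1 per word, giving Integer counts.
    static Map<String, Integer> count(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .collect(Collectors.groupingBy(word -> word, Collectors.summingInt(word -> 1)));
    }

    // Same idea with counting(), which yields Long counts.
    static Map<String, Long> countLong(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .collect(Collectors.groupingBy(word -> word, Collectors.counting()));
    }
}
```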

I'm sorry to say that my experience of the API wasn't that positive. I
hope things can still be tweaked, or I worry about the code we're
going to see.

Stephen

On 3 April 2013 09:12, Richard Warburton <richard.warburton at gmail.com> wrote:
>
> Hi,
>
> Last week at DevoxxUK we ran a brief lambdas hackday.  People were
> encouraged to focus on the collectors component of the API through setting
> a few problems to solve.  I appreciate it's a bit late in the game as far as
> API changes, but some of these issues are fixable through
> documentation/improved compiler error changes rather than API changes.
>
> 1. No one complained about the move from “into(new ArrayList<Foo>());” ->
> “collect(toList())” when you explained that the change had been made.
>  However, people didn’t naturally find Collectors.toList() and they did
> express frustration around that.  At least one request for an abbreviated
> toList() method on a stream - more for findability/fluency reasons rather
> than brevity of code.
>
> 2. ToIntFunction, ToDoubleFunction etc. are all usable with flatMap, but
> the naming confused people as to why.
>
> 3. Several people requested a way to transform a boxed stream into an
> unboxed stream.  It's pretty easy to go the other way around, but there
> didn’t seem to be any utility methods for making the boxed -> unboxed
> transformation.
>
> 4. People found “groupingBy” to be a hard conceptual leap.  We had set an
> exercise where people were asked to count the frequency of words in a
> document, in order to force them to use it.
>
> a. Quite a few people didn’t initially look for a function that collects a
> stream into a map.
>
> b. When you suggest that they should look for that, they didn’t look for
> something called “groupingBy”.
>
> c. They did get the concept once you bring up SQL.  With hindsight I wished
> I had enquired about how many people had used LINQ.
>
> d. People then didn’t grok that they needed to use the multiple argument
> overload of groupingBy, with a reducingBy, in order to complete the task.
>  I suspect that this method needs more documentation examples in order to
> be easily understandable by people.
>
> 5. People are beginning to get confused by old documentation on the
> internet being out of date.  I hadn’t seen this in previous hackdays.  Even
> an article in the latest Java Magazine is out of date due to the API moving
> so much recently, and so is the official tutorial:
> http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html.
>  Probably not a long term concern - but might be a concern for the first
> few months.
>
> 6. If anyone else runs this kind of thing and they are a day-to-day eclipse
> user, word of advice to make sure you know how to set the preferred JVM
> location in netbeans and intellij before you run the event!
>
> Thanks to everyone that attended, and especially to Stuart Marks, Maurice
> Naftalin, Graham Allan and John Oliver for helping out with running the lab.
>
> There’s a document with the code that people wrote and pasted, which also
> contains some more comments by people:
>
> https://docs.google.com/document/d/1riMDt_JkAX74X30lHuOiSBjSxa7ifjxaynOA4LttOCk/edit
>
> regards,
>
>   Richard Warburton
>
>   http://insightfullogic.com
>   @RichardWarburto <http://twitter.com/richardwarburto>
>