Incremental updates from Tasks

Fri Jan 6 14:54:48 PST 2012

>>> What about an API similar to how it's done in SwingWorker? There we
>>> define a 2nd type parameter T and publish() partial results. This would
>>> be called from the call() method whenever we have a partial result:
>>> 
>>> protected void publish(T chunk)
>>> 
>>> Which in turn results in a call (on the JavaFX thread) to:
>>> 
>>> protected void process(List<T> chunks)
>>> 
>>> This allows processing one or more chunks that were published
>>> earlier using publish().
>>> 
>>> I like that API and it solves the question how to handle atomic vs. list
>>> types simply by introducing a new generic type. It also has the
>>> advantage that Swing refugees feel familiar with it :-)
>>> 
>>> Anything wrong with that approach?
>> 
>> There are two things I don't like about that approach. The first is that it introduces overhead in cases where we are only interested in the "most recent intermediate value". That "chunks" list might be 10,000 elements long, and we only care about the very last item in it.
> 
> The chunks list is usually short. Whenever the background thread calls
> publish() the item is put into the chunks list, as soon as the EDT gets
> to run (hopefully often enough!) whatever is in that list gets
> process()ed. The next publish() starts in a fresh list. In other words,
> we only get the latest bunch of data. Whether the implementation keeps
> all the data around or not is up to itself.
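
To make the batching behaviour described above concrete, here is a minimal sketch in plain Java. ChunkBuffer and drain() are hypothetical names, not SwingWorker or JavaFX API, and the real SwingWorker schedules process() on the EDT itself rather than waiting for an explicit drain() call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for SwingWorker-style publish()/process() batching.
final class ChunkBuffer<T> {
    private final Object lock = new Object();
    private final Consumer<List<T>> process;
    private List<T> pending = new ArrayList<>();

    ChunkBuffer(Consumer<List<T>> process) {
        this.process = process;
    }

    // Called from the background thread for every partial result.
    void publish(T chunk) {
        synchronized (lock) {
            pending.add(chunk);
        }
    }

    // Called on the UI thread (for example once per pulse): hands over
    // everything published so far and starts a fresh list.
    // Returns the size of the batch that was processed.
    int drain() {
        List<T> batch;
        synchronized (lock) {
            if (pending.isEmpty()) {
                return 0;
            }
            batch = pending;
            pending = new ArrayList<>();
        }
        process.accept(batch);
        return batch.size();
    }
}
```

The batch size is whatever accumulated between two drain() calls, so it stays small only as long as the UI thread keeps up with the producer.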

I think in the case of something like this:

        Task<Integer> task = new Task<Integer>() {
            @Override
            protected Integer call() throws Exception {
                int iterations;
                for (iterations = 0; iterations < 10000000; iterations++) {
                    if (isCancelled()) {
                        break;
                    }
                    updateMessage("Iteration " + iterations);
                    updateProgress(iterations, 10000000);
                    updateValue(iterations);
                }
                return iterations;
            }
        };

We would see a great many items added to the chunk list. I wonder what some good practical use cases are. I suggested computing primes, or computing pi, or some other such algorithm. In those cases I suspect we would flood the event queue very quickly (remember that we only run pulses 60 times per second).

I ran an interesting test case. I used an AtomicInteger as my counter and found a Java implementation of a prime computation. I wrote a task and tracked how often updateValue was called vs. how often the value property was updated on the FX thread. Results (an intermediate sample of the counts):

7
2
6
454
27
5
4
5
4
3
6
310
3
124

So sometimes it kept up well, other times it was somewhat slower. I would guess the assertion that "there aren't that many chunks" is probably generally accurate, but not always. Interestingly, the very first counts I got were:

4178
62
23
26
26

This indicates that before HotSpot had warmed things up, a lot more data was being buffered up. In any case, with mobile and embedded devices, I'm worried about regularly populating and dumping a queue that large (garbage collection has a big impact on battery life, so I'm told).
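
For comparison, the "most recent value only" alternative, together with the kind of counting used in the experiment above, might look roughly like this. It is only a sketch under assumptions: LatestValue and deliver() are hypothetical names, this is not the actual Task implementation, and in a real test deliver() would run on the FX thread (for instance from a Platform.runLater callback) rather than being called by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "latest value wins" holder: one slot instead of a queue.
final class LatestValue<V> {
    private V latest;                 // newest undelivered value, or null
    private int callsSinceDelivery;   // updateValue calls since last delivery
    private final List<Integer> counts = new ArrayList<>();

    // Background thread: overwrite whatever has not been delivered yet.
    synchronized void updateValue(V value) {
        latest = value;
        callsSinceDelivery++;
    }

    // UI thread: take the newest value (null if nothing new) and record
    // how many updateValue calls were folded into this one delivery.
    synchronized V deliver() {
        V value = latest;
        if (value != null) {
            counts.add(callsSinceDelivery);
            callsSinceDelivery = 0;
            latest = null;
        }
        return value;
    }

    // Snapshot of the recorded counts, one entry per delivery.
    synchronized List<Integer> counts() {
        return new ArrayList<>(counts);
    }
}
```

However many updates arrive between two deliveries, only a single slot is overwritten, so nothing is allocated per update beyond the value itself.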

I need to do more experimentation. The ObservableListTask is actually kind of a mess. The problem is that the call() method returns the final result, but in an ObservableListTask you don't want to force the implementation of ObservableListTask to return a result! In fact, for such a Task you want the result set immediately in the constructor and then never set again -- only have items added to it.

It almost suggests that ObservableListTask should extend FutureTask and not Task, duplicating a chunk of the Task API. Or that there should be a TaskBase extending FutureTask, with Task and ObservableListTask extending it (I guess that is along the lines of what you were suggesting, isn't it?).
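
A rough sketch of that second shape, in plain Java so that it stands alone. All three class names are hypothetical, a java.util List stands in for an ObservableList, and the body is passed in as a lambda instead of an overridden call() purely to keep the sketch short.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.FutureTask;
import java.util.function.Consumer;

// Hypothetical shared base: the API common to both kinds of task
// (progress, message, state, ...) would live here.
abstract class TaskBaseSketch<V> extends FutureTask<V> {
    protected TaskBaseSketch(Callable<V> body) {
        super(body);
    }
}

// Ordinary task: the body computes and returns the final result.
final class ValueTaskSketch<V> extends TaskBaseSketch<V> {
    ValueTaskSketch(Callable<V> body) {
        super(body);
    }
}

// List task: the result list exists from construction onward and the
// body only adds items to it; it never has to return anything.
final class ListTaskSketch<E> extends TaskBaseSketch<List<E>> {
    private final List<E> items;

    ListTaskSketch(Consumer<List<E>> producer) {
        this(new CopyOnWriteArrayList<>(), producer);
    }

    private ListTaskSketch(List<E> items, Consumer<List<E>> producer) {
        // The lambda captures the constructor parameters, not 'this'.
        super(() -> {
            producer.accept(items);
            return items;
        });
        this.items = items;
    }

    // Available immediately, before the task has run.
    List<E> getItems() {
        return items;
    }
}
```

With this split, the list variant does not force its implementor to return a result, which is the awkward part of building it on Task directly.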

Richard