Java 8 in a JEE Environment

Tue Apr 16 12:11:07 PDT 2013

>> , but even at numbers much lower than
>> that, if any of them intend to use parallelism at all, having so many
>> threads competing for the CPUs undermines everyone's parallelism.
>
> As I understand, these pools should be fine-tuned at deployment time
> so to keep the total number of threads reasonable.

... at the cost of poor data-parallelism.  (If you have to spin up a 
thread during a computation more than infrequently, you just pushed the 
breakeven out from a few thousand to a few million.  Which brings us 
back to "better to just turn it off.")

>> This is a great argument for "just turn it off"!  This framework was
>> designed and optimized for CPU-intensive data-parallel operations; you want
>> to use it for something that is the exact opposite of that, and along the
>> way are willing to impose crushing performance penalties on those that *are*
>> using it as intended.
>
> Then we come to the most relevant part of our discussion. If we are
> talking about CPU-intensive data-parallel operations only, and if we
> don't have a SPI for controlling how they are executed so another
> implementation could focus on the kind of Java EE tasks I have
> mentioned above, the right decision might be documenting that database
> centric operations are out of scope and turn it off.

Glad to have come around to this.  It really does come down to "what is 
this API for", and, like the elephant, it is very easy to see different 
use cases as primary.

It's not that we are hostile to providing a friendly API for doing IO 
concurrently instead of serially -- that's a perfectly valid goal and I 
totally support it if it doesn't come at the expense of all the other 
use cases.  It's just that this is not the design center of FJ, and a good 
framework for IO-parallel operations would look very different from a 
good framework for data-parallel operations.  Trying to turn the good 
data-parallel framework we've got into something that it isn't, at the 
cost of crippling what it is good at, seems silly.
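
To make "data-parallel" concrete, here is a minimal sketch of the kind of 
work FJ is aimed at: pure computation over a range, no blocking, split 
across the common pool by a parallel stream.  The class and method names 
are made up for illustration.

```java
import java.util.stream.LongStream;

public class DataParallelSum {
    // CPU-only work: square each value and add the results up.
    // With parallel == true the stream is split across ForkJoinPool.commonPool().
    static long sumOfSquares(long n, boolean parallel) {
        LongStream values = LongStream.rangeClosed(1, n);
        if (parallel) {
            values = values.parallel();
        }
        return values.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("sequential: " + sumOfSquares(n, false));
        System.out.println("parallel:   " + sumOfSquares(n, true));
    }
}
```

Nothing in there ever waits on anything, so a pool sized to roughly the 
number of cores keeps every thread busy.  A task that blocks on a 
database breaks that assumption.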

> I have a question, though: do you understand I am talking about
> operations that *wait* a lot for the database server to respond
> instead of exchanging large chunks of data most of the time?

Yes, though I still claim that this is only one of multiple use cases where 
EE users might want to use parallelism, and we should be wary of 
extrapolating too much from this example.  And, even here, you may well 
want some throttling -- you don't want to hit your upstream server with 
thousands of concurrent requests and have them get rejected.  And the 
heuristics for setting pool sizes (and therefore throttling requests) 
for data-parallel operations are not really compatible with those for IO 
operations.  And thread-per-request may not be the right resource model 
for IO-parallelism anyway.
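
As a rough illustration of the throttling point, here is one way to run 
blocking calls on a dedicated pool whose size caps how many requests hit 
the upstream server at once.  This is only a sketch: fetch() is a 
hypothetical stand-in for a database call, the sleep simulates waiting, 
and in an EE container the pool would presumably come from the container 
rather than from Executors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledIo {
    // Counts simulated requests currently in flight, and the peak observed.
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxInFlight = new AtomicInteger();

    // Hypothetical blocking call: mostly waiting, almost no CPU.
    static String fetch(int id) throws InterruptedException {
        int now = inFlight.incrementAndGet();
        maxInFlight.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20);
            return "row-" + id;
        } finally {
            inFlight.decrementAndGet();
        }
    }

    // The pool size, not the core count, bounds concurrent upstream requests.
    static List<String> fetchAll(int count, int maxConcurrent) throws Exception {
        ExecutorService ioPool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int id = i;
                futures.add(ioPool.submit(() -> fetch(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // results come back in submission order
            }
            return results;
        } finally {
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = fetchAll(20, 4);
        System.out.println(rows.size() + " rows, peak concurrency " + maxInFlight.get());
    }
}
```

Note that the right value for maxConcurrent here depends on what the 
upstream server can take, which has nothing to do with how many cores 
you have.  That is the incompatibility in sizing heuristics.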

>> Then JSR-238 should define a mechanism for this -- which is a different
>> problem (and likely requires a different solution).
>
> I guess you mean JSR-236?

Yeah, can never keep these straight.