Java 8 in a JEE Environment

Michael Nascimento misterm at gmail.com
Tue Apr 16 11:51:47 PDT 2013


On Tue, Apr 16, 2013 at 2:21 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>> What tradeoffs are you willing to tolerate to enable this to work?
>>
>>
>> I expect it to be a lot slower than in Java SE itself.
>
>
> Are you willing for it to be 10x *slower* than the equivalent sequential
> code?  (Are you under the impression that there's some reason the parallel
> code can't be slower than the equivalent sequential code?)

Yes, I am willing. :-) Java EE has always been more about reliability
and scalability than speed.

>>> Are you willing to impose a 100x parallelism hit to make this work?
>>
>>
>> If that is the best the container can give me, yes. Then I will think
>> before I use parallelStream() if the code is in my control. It has to
>> at least *work*.
>
>
> The emphasized use of *work* -- as if there were only one definition of
> "work" and that one is so obvious that you have emphasize it in this way --
> is really troubling here.  It suggests that you are completely unwilling to
> consider any compromise in the programming model, even to the extent that
> this may subject every other user to a potentially crippling performance
> penalty.

There are no users of Stream in Java EE (or even SE) today. However,
if I decide to run my container on JDK 8 because it is faster and it
works, I accept that the libraries I use might give sub-optimal
performance, but I do at least expect them to work when they call back
into my code.

> It sure would be nice if you could have everything you want, and it sure
> would be nice for parallelism to be simple and always give you a result that
> is at least as fast as sequential, but that's not reality.

That is a price we have to pay one way or another; otherwise we would
have to compromise on both performance and features.

>>> Are you willing to impose that hit on *every* parallel use case, even
>>> those that just want
>>> to add up some numbers, not just those that actually need access to the
>>> application context?
>>
>>
>> If there is no easy way out, yes.
>
>
> Too bad they don't get a vote.

I believe far fewer than 5% of *Java EE* applications today do that
kind of work. You also make decisions based on the majority. If you
have such an algorithm in your code, you are probably already using
something else today, which may well be based on FJP.

>>> What about the security and isolation implications?
>>
>>
>> Each module should have its own pool instance, configurable for now
>> using container-specific features. In the future, these can be
>> standardized (maybe as early as EE 8, maybe only in 9).
>
>
> This is technically naive.  Not only is this impractical with more than a
> few tens of apps running in a container

Having more than a couple of applications running in the same JVM
instance (and, at some point, the same machine) is crazy :-) It is not
something I would do or recommend, nor something I would expect to
perform well even today (application server vendors will beat me up in
3, 2...).

>, but even at numbers much lower than
> that, if any of them intend to use parallelism at all, having so many
> threads competing for the CPUs undermines everyone's parallelism.

As I understand it, these pools should be fine-tuned at deployment
time so as to keep the total number of threads reasonable.
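To make the per-module pool idea concrete, here is a minimal sketch.
The class and method names (ModulePoolDemo, sumInPool) are mine, not
anything proposed on this list. It relies on a Java 8 implementation
detail, not a documented guarantee: a parallel stream whose terminal
operation is started from inside a ForkJoinPool task runs its subtasks
on that pool rather than on the common pool.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

public class ModulePoolDemo {

    // Runs a parallel stream computation inside the given pool. Because
    // the terminal operation is invoked from a ForkJoinPool worker, the
    // stream's subtasks fork into that same pool (Java 8 behavior).
    static int sumInPool(ForkJoinPool pool) throws Exception {
        return pool.submit(
            () -> IntStream.rangeClosed(1, 100).parallel().sum()
        ).get();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical per-module pool, sized at deployment time so the
        // total thread count across all applications stays reasonable.
        ForkJoinPool modulePool = new ForkJoinPool(4);
        try {
            System.out.println(sumInPool(modulePool)); // 5050
        } finally {
            modulePool.shutdown();
        }
    }
}
```

A container could keep one such pool per deployed module and size them
against each other, which is the kind of deployment-time tuning I mean.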

> This is a great argument for "just turn it off"!  This framework was
> designed and optimized for CPU-intensive data-parallel operations; you want
> to use it for something that is the exact opposite of that, and along the
> way are willing to impose crushing performance penalties on those that *are*
> using it as intended.

Then we come to the most relevant part of our discussion. If we are
talking about CPU-intensive data-parallel operations only, and if we
don't have an SPI for controlling how they are executed, so that
another implementation could focus on the kind of Java EE tasks I have
mentioned above, the right decision might be to document that
database-centric operations are out of scope and to turn it off.

I have a question, though: do you understand that I am talking about
operations that *wait* most of the time for the database server to
respond, rather than exchanging large chunks of data?

> Then JSR-238 should define a mechanism for this -- which is a different
> problem (and likely requires a different solution).

I guess you mean JSR-236?

> I get that the Stream API seems like a pretty way to express it so you want
> to just use that, but the underlying implementation (and no small part of the
> API design) is built around the needs of data-parallel operations, not
> IO-parallel ones.

From my experiments, I can tell you that with some small changes in
Java EE and/or in how threads are created, the current API does make
running several slow queries (lots of waits) several times faster,
which is what I would expect, with basically no setup. I wonder
whether what I was looking for is a different way of configuring the
implementation, for which, from the archives, I understand the SPI is
not yet mature enough.
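The kind of experiment I mean can be sketched as follows. This is my
own illustration, not code from the list: slowQuery stands in for a
database call that mostly waits, and the parallel run overlaps those
waits across the common pool's threads, so on a multi-core machine it
finishes several times sooner even though no extra CPU work is done.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SlowQueryDemo {

    // Stand-in for a slow database call: mostly waiting, little data moved.
    static int slowQuery(int id) {
        try {
            Thread.sleep(100); // simulates waiting on the database server
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return id * 10;
    }

    static List<Integer> runQueries(List<Integer> ids, boolean parallel) {
        return (parallel ? ids.parallelStream() : ids.stream())
                .map(SlowQueryDemo::slowQuery)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        long t0 = System.nanoTime();
        List<Integer> seq = runQueries(ids, false);
        long seqMs = (System.nanoTime() - t0) / 1_000_000;

        long t1 = System.nanoTime();
        List<Integer> par = runQueries(ids, true);
        long parMs = (System.nanoTime() - t1) / 1_000_000;

        // Results are identical and in encounter order; the parallel run
        // is faster only because the threads overlap the sleeps.
        System.out.println("sequential: " + seqMs + " ms, parallel: " + parMs + " ms");
        System.out.println("same results: " + seq.equals(par));
    }
}
```

This is exactly the "IO-parallel" shape Brian distinguishes from the
data-parallel workloads the framework was designed for.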

Great discussion, it made things way clearer to me. Thanks for your time, Brian.

Regards,
Michael

