Java 8 in a JEE Environment
misterm at gmail.com
Tue Apr 16 09:02:12 PDT 2013
On Tue, Apr 16, 2013 at 12:03 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> The hard part -- and this has been the subject of much discussion with the
> EE team -- is what exactly *does* first-class support for Java EE mean in
> this context?
> For example, consider the following code:
> list.parallelStream().forEach(e -> e.remoteEjbInvocation());
> What are your expectations about what should happen here?
That I can access Java EE resources without any problem.
> Do you expect security and transaction context to be propagated to the pool thread?
> What tradeoffs are you willing to tolerate to enable this to work?
I expect it to be a lot slower than in Java SE itself.
> Are you willing to impose a 100x parallelism hit to make this work?
If that is the best the container can give me, yes. Then I will think
before I use parallelStream() if the code is in my control. It has to
at least *work*.
> Are you willing to impose that hit on *every* parallel use case, even those that just want
> to add up some numbers, not just those that actually need access to the
> application context?
If there is no easy way out, yes.
> What about the security and isolation implications?
Each module should have its own pool instance, configurable for now
using container-specific features. In the future, these can be
standardized (maybe as early as EE 8, maybe only in 9).
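A minimal sketch of the per-module pool idea using only Java SE APIs. It relies on the observed behavior that a parallel stream started from inside a ForkJoinPool task runs its subtasks in that pool rather than in the common pool; this is an implementation detail, not a documented guarantee. The class name, the pool size of 4 and the method name are assumptions for illustration, and a real container would own and size the pool itself.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;

public class ModulePoolSketch {
    // Hypothetical per-module pool; in a container this would be created
    // and configured by the container, not by application code.
    private static final ForkJoinPool MODULE_POOL = new ForkJoinPool(4);

    public static int sumInModulePool(List<Integer> values) {
        // The terminal operation is invoked from a task running in MODULE_POOL,
        // so the stream's subtasks are forked there (observed behavior only).
        Callable<Integer> task =
            () -> values.parallelStream().mapToInt(Integer::intValue).sum();
        // join() waits for the result and rethrows failures unchecked.
        return MODULE_POOL.submit(task).join();
    }
}
```

A misbehaving module would then exhaust only its own pool, which is the isolation property asked about above.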
> Is it fair to subtly encourage ill-behaved clients to lock up the FJ pool
> threads with long-running tasks like IO, even though this is clearly far
> outside the design center of FJ?
That is what people will use it for, especially for database operations.
> The bottom line is that it's really easy to stand up and say "it should do
> the right thing", but it is far from obvious what the right thing is, and
> how much we should be willing to punish use case A in order to enable use
> case B. (If the right thing were obvious and free, it would have already
> been done.)
In Java EE applications, users' expectations would be that parallelism
should help mainly with batch operations in general or with a handful
of small operations done sequentially today just because there is no
easy way to run them in parallel.
Most of my emails here come from an experiment I have been doing of
migrating a couple of our projects to Java SE 8 (lambda & date-time)
to test the API in the wild and see how things go in terms of
performance, good/bad practices of usage etc. One of these systems is a
custom-made BI tool that generates several graphics per page.
Generating them in parallel sometimes makes it 5 times faster, and
that's because they are low contention. I have done my workarounds to
make some things get propagated already. It is not something I would
recommend people do in production; it is, as I said, an experiment
that lets me understand what I want/need from Java EE specs from now
on and also how to improve the current API.
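To illustrate the kind of workaround meant here (this is not the actual code from the experiment), the sketch below captures a thread-local value on the calling thread and reinstalls it around each element processed by a pool thread. `CURRENT_USER` is a hypothetical stand-in for container-managed context such as a security principal; `withCallerContext` and `renderCharts` are likewise made-up names.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ContextPropagationSketch {
    // Hypothetical stand-in for context a container keeps per thread.
    public static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Wraps a function so the caller's context is visible while it runs.
    public static <T, R> Function<T, R> withCallerContext(Function<T, R> body) {
        final String captured = CURRENT_USER.get(); // read on the calling thread
        return item -> {
            String previous = CURRENT_USER.get();
            CURRENT_USER.set(captured);
            try {
                return body.apply(item);
            } finally {
                // Put the worker thread back the way it was found.
                if (previous == null) {
                    CURRENT_USER.remove();
                } else {
                    CURRENT_USER.set(previous);
                }
            }
        };
    }

    public static List<String> renderCharts(List<String> chartIds) {
        Function<String, String> render =
            id -> id + " rendered for " + CURRENT_USER.get();
        return chartIds.parallelStream()
            .map(withCallerContext(render))
            .collect(Collectors.toList());
    }
}
```

The set/restore pair runs once per element, which is exactly the per-task setup cost under discussion: negligible next to rendering a chart, significant next to adding two numbers.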
But the typical use case I will still get to is to do transactional
work. The cost of waiting for database response and generating the
massive amount of logs most applications do is sometimes 95% of the
time spent in these systems. If it can be run in parallel, even with
major overhead, it still would be several times faster than today.
> EE containers have long used the standard trick of intercepting access to
> thread pools, and setting up context before the user code executes in it.
> That works great for millisecond-scale tasks like web page hits. That falls
> apart completely for nanosecond-scale tasks (because the setup cost
> effectively folds into the serial fraction of Amdahl's Law).
I think the expectation is to have something decent for the really slow
cases (80/20), not the ones one would focus on in Java SE.
> So trying to apply the usual thread-pool tricks to fork-join pools virtually guarantees
> it would be better to just turn parallelism off entirely -- which is not the
> outcome anyone wants.
The cost of processing a single item in the typical Java EE
application starts at several tens of milliseconds - when it is fast -
and goes up to minutes, so if this can take advantage of the stream
API, it would rock :-)
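A back-of-the-envelope way to see both sides of this argument, under an idealized model: n items at c ms each run sequentially in n*c, and on w workers with a fixed context-setup overhead of o ms per item in n*(c+o)/w. The model ignores the serial fraction, contention and scheduling, and the numbers in the usage note are made up for illustration.

```java
public class SpeedupEstimate {
    /**
     * Idealized speedup of parallel over sequential execution:
     * (n * c) / (n * (c + o) / w) = c * w / (c + o).
     */
    public static double speedup(double itemMillis, double overheadMillis, int workers) {
        return (itemMillis * workers) / (itemMillis + overheadMillis);
    }
}
```

With 50 ms items, 5 ms of setup per item and 8 workers, `speedup(50, 5, 8)` is about 7.3. With 0.001 ms items and 0.1 ms of setup, `speedup(0.001, 0.1, 8)` is about 0.08, which is slower than not parallelizing at all.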