Stream confusion
Archie Cobbs
archie.cobbs at gmail.com
Wed Nov 22 17:52:06 UTC 2023
On Wed, Nov 22, 2023 at 11:13 AM Nathan Reynolds <numeralnathan at gmail.com>
wrote:
> It is not only deterministic but also JIT (or an advanced version) could
> reduce this code to the following single line in main().
>
> System.out.println("Total red weight = 123");
>
> Since there are no assignments to the Widget fields after construction,
> they are effectively final. I wouldn't be surprised if JIT assumes that
> they are final. Also, there is no need to put final in the main() body
> since these values are effectively final. (You can look up effectively
> final.)
>
OK, I believe you that, in the current implementation of the JDK, the
behavior is deterministic.
But as a developer trying to write bug-free code, I can only rely on what is
published in the documented API.
Implementation details about the runtime I happen to be using are not
relevant to proving the statement "This Java program is always
deterministic".
And if I'm limited to what's published in the API for Stream, etc., how am
I supposed to prove to myself that the program is deterministic?
To do that, I would have to assume (for example) that Stream.filter() will
always execute in the current thread, because otherwise the writes to the
non-final fields are not guaranteed to be visible to other threads.
But this is not guaranteed to be true in the Stream API docs. In fact, they
explicitly state that this would be an unsafe assumption:
> Note also that attempting to access mutable state from behavioral
> parameters presents you with a bad choice with respect to safety and
> performance; if you do not synchronize access to that state, you have a
> data race and therefore your code is broken
>
To make the example program correct based on the documented API, we would
have to add synchronization around the construction of the Widgets and pair
that with synchronization around the lambdas passed to filter() and
mapToInt().
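As a rough illustration of what that pairing could look like, here is a minimal sketch. The original example program is not reproduced in this message, so the Widget class, its fields, the sample values, and the shared LOCK object are all assumptions made for illustration. The sketch covers only the pairing described above (construction and lambdas synchronizing on the same monitor); it does not address whether the source collection itself is safely visible to the pipeline.

```java
import java.util.ArrayList;
import java.util.List;

class SynchronizedRedWeight {
    // Hypothetical stand-in for the Widget class; fields are deliberately non-final.
    static class Widget {
        String color;
        int weight;

        Widget(String color, int weight) {
            this.color = color;
            this.weight = weight;
        }
    }

    // Hypothetical lock shared by the constructing code and the lambdas.
    private static final Object LOCK = new Object();

    // Field writes happen while holding LOCK.
    static List<Widget> buildWidgets() {
        synchronized (LOCK) {
            List<Widget> widgets = new ArrayList<>();
            widgets.add(new Widget("red", 100));
            widgets.add(new Widget("blue", 50));
            widgets.add(new Widget("red", 23));
            return widgets;
        }
    }

    // Each behavioral parameter reads Widget fields while holding the same LOCK,
    // so the earlier unlock happens-before these reads on whatever thread runs them.
    static int totalRedWeight(List<Widget> widgets) {
        return widgets.stream()
                .filter(w -> {
                    synchronized (LOCK) {
                        return "red".equals(w.color);
                    }
                })
                .mapToInt(w -> {
                    synchronized (LOCK) {
                        return w.weight;
                    }
                })
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("Total red weight = " + totalRedWeight(buildWidgets()));
    }
}
```

The amount of ceremony here, for a three-element list, is the reason almost nobody writes stream code this way.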
Of course in the real world nobody does that. As a result, my contention is
that there is a giant universe of code out there that, just like my
example, executes a Stream pipeline on data that is constructed/prepared in
the current thread but not guaranteed to be safely published, and where the
pipeline contains unsynchronized "behavioral parameters" that access that
data.
Any such code is therefore not guaranteed to be deterministic!
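For concreteness, the pattern in question looks roughly like the following sketch. It is a reconstruction, not the original example: the Widget class, its field names, and the sample weights are assumptions, chosen only so that the total matches the "123" quoted earlier in the thread.

```java
import java.util.List;

class RedWeightExample {
    // Hypothetical stand-in for the Widget class. The fields are non-final,
    // so the final-field guarantees of the memory model do not apply to them.
    static class Widget {
        String color;
        int weight;

        Widget(String color, int weight) {
            this.color = color;
            this.weight = weight;
        }
    }

    // Data is prepared in the calling thread with no synchronization.
    static List<Widget> sampleWidgets() {
        return List.of(
                new Widget("red", 100),
                new Widget("blue", 50),
                new Widget("red", 23));
    }

    // Unsynchronized behavioral parameters read the mutable Widget fields.
    static int totalRedWeight(List<Widget> widgets) {
        return widgets.stream()
                .filter(w -> "red".equals(w.color))
                .mapToInt(w -> w.weight)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("Total red weight = " + totalRedWeight(sampleWidgets()));
    }
}
```

Nothing in this code is unusual, which is the point: it relies on filter() and mapToInt() running in the thread that built the widgets.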
The fact that the code works *today* is nice, but it doesn't change the
fact that all of this code is just a giant ticking time bomb in case the
implementation ever changes (Project Loom, anyone?).
Or, you might say "Well, in practice non-parallel streams are always
executed in the local thread and I'm sure they'll be that way for a long
time".
Great! If that's our stance, then this should be made official and
documented in the API: "Non-parallel streams always execute in the current
thread".
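The "in practice" claim can be observed, though not proven, with a small probe like the sketch below. The class and method names are hypothetical. It records which threads run a filter() lambda on a sequential stream; a single run on one JDK says nothing about what the API guarantees, which is exactly the gap being discussed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class SequentialThreadProbe {
    // Returns the set of threads that executed the filter() lambda.
    static Set<Thread> threadsSeenByFilter() {
        // Thread-safe set, in case the lambda ever runs on more than one thread.
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        int ignoredSum = IntStream.range(0, 1_000)
                .filter(i -> {
                    seen.add(Thread.currentThread());
                    return true;
                })
                .sum(); // terminal operation that forces every element through filter()
        return seen;
    }

    public static void main(String[] args) {
        Set<Thread> seen = threadsSeenByFilter();
        System.out.println("filter() ran on " + seen.size()
                + " thread(s); calling thread included: "
                + seen.contains(Thread.currentThread()));
    }
}
```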
Right now it seems like we have the worst of both worlds...
-Archie
--
Archie L. Cobbs
More information about the amber-dev
mailing list