Stream confusion

Nathan Reynolds numeralnathan at gmail.com
Wed Nov 22 18:24:04 UTC 2023


If you use a parallel stream, I would guess the code would still be
deterministic.  The parallel stream has to safely hand off the references
from the main thread to the helper threads.  This requires some sort of
synchronization.  Because the references are handed off safely, the field
values written before the hand-off become visible to the helper threads too.
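
Here is a minimal sketch of the scenario, to make the passing of references
from the main thread to the helper threads concrete.  The original example
program is not reproduced in this message, so the Widget class, the
makeWidgets() and totalRedWeight() helpers, and the sample values are all
hypothetical stand-ins.  The fields are deliberately non-final.  It assumes
that the stream framework's hand-off to its worker threads is what makes the
constructor's writes visible; that is my reading, not something the Stream
Javadoc spells out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class RedWeight {
    // Hypothetical stand-in for the Widget of the original example.
    // The fields are non-final and are only written in the constructor.
    public static class Widget {
        String color;
        int weight;

        Widget(String color, int weight) {
            this.color = color;
            this.weight = weight;
        }
    }

    // Builds the data in the calling thread, with no explicit synchronization.
    public static List<Widget> makeWidgets() {
        List<Widget> widgets = new ArrayList<>();
        widgets.add(new Widget("red", 100));
        widgets.add(new Widget("blue", 50));
        widgets.add(new Widget("red", 23));
        return widgets;
    }

    // Runs the same pipeline either sequentially or in parallel.
    public static int totalRedWeight(List<Widget> widgets, boolean parallel) {
        Stream<Widget> stream = parallel ? widgets.parallelStream() : widgets.stream();
        return stream
                .filter(w -> "red".equals(w.color))
                .mapToInt(w -> w.weight)
                .sum();
    }

    public static void main(String[] args) {
        List<Widget> widgets = makeWidgets();
        System.out.println("Total red weight = " + totalRedWeight(widgets, false));
        System.out.println("Total red weight (parallel) = " + totalRedWeight(widgets, true));
    }
}
```

Both calls sum the two red widgets, so the sequential and parallel totals
should agree.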

I haven't played with Loom threads yet.  From what I have read, it doesn't
sound like Loom threads change the functional behavior of the program.
As far as I can tell, they only change the performance behavior.

On Wed, Nov 22, 2023 at 9:52 AM Archie Cobbs <archie.cobbs at gmail.com> wrote:

>
> On Wed, Nov 22, 2023 at 11:13 AM Nathan Reynolds <numeralnathan at gmail.com>
> wrote:
>
>> It is not only deterministic, but the JIT (or a more advanced version of
>> it) could also reduce this code to the following single line in main().
>>
>> System.out.println("Total red weight = 123");
>>
>> Since there are no assignments to the Widget fields after construction,
>> they are effectively final.  I wouldn't be surprised if the JIT assumes
>> that they are final.  Also, there is no need to put final in the main()
>> body since these values are effectively final.  (You can look up
>> "effectively final".)
>>
>
> OK I believe you that in the current implementation of the JDK, the
> behavior is deterministic.
>
> But as a developer trying to write bug-free code, I can only rely on what
> is published in the documented API.
>
> Implementation details about the runtime I happen to be using are not
> relevant to proving the statement "This Java program is always
> deterministic".
>
> And if I'm limited to what's published in the API for Stream, etc., how am
> I supposed to prove to myself that the program is deterministic?
>
> To do that, I would have to assume (for example) that Stream.filter()
> will always execute in the current thread, because otherwise the writes to
> the non-final fields are not guaranteed to be visible to other threads.
>
> But the Stream API docs do not guarantee this. In fact, they explicitly
> state that this would be an unsafe assumption:
>
>> Note also that attempting to access mutable state from behavioral
>> parameters presents you with a bad choice with respect to safety and
>> performance; if you do not synchronize access to that state, you have a
>> data race and therefore your code is broken
>>
>
> To make the example program correct based on the documented API, we would
> have to add synchronization around the construction of the Widgets and pair
> that with synchronization around the lambdas passed to filter() and
> mapToInt().
>
> Of course in the real world nobody does that. As a result, my contention
> is that there is a giant universe of code out there that, just like my
> example, executes a Stream pipeline on data that is constructed/prepared in
> the current thread but not guaranteed to be safely published, and where the
> pipeline contains unsynchronized "behavioral parameters" that access that
> data.
>
> Any such code is therefore not guaranteed to be deterministic!
>
> The fact that the code works *today* is nice, but it doesn't change the
> fact that all of this code is just a giant ticking time bomb in case the
> implementation ever changes (Project Loom, anyone?).
>
> Or, you might say "Well, in practice non-parallel streams are always
> executed in the local thread and I'm sure they'll be that way for a long
> time".
>
> Great! If that's our stance, then this should be made official and
> documented in the API: "Non-parallel streams always execute in the current
> thread".
>
> Right now it seems like we have the worst of both worlds...
>
> -Archie
>
> --
> Archie L. Cobbs
>
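
For reference, the extra synchronization described in the quoted message
(around the construction of the Widgets, paired with synchronization inside
the lambdas passed to filter() and mapToInt()) might look roughly like the
sketch below.  The Widget class, the LOCK object and the helper names are
hypothetical, since the original example is not reproduced here.  This is
only one way to read the proposal, not a recommendation.

```java
import java.util.ArrayList;
import java.util.List;

public class SynchronizedRedWeight {
    // Hypothetical stand-in for the Widget of the original example.
    public static class Widget {
        String color;
        int weight;

        Widget(String color, int weight) {
            this.color = color;
            this.weight = weight;
        }
    }

    // Hypothetical shared monitor: writers and readers must use the same one.
    private static final Object LOCK = new Object();

    // The field writes happen while holding LOCK.
    public static List<Widget> makeWidgets() {
        List<Widget> widgets = new ArrayList<>();
        synchronized (LOCK) {
            widgets.add(new Widget("red", 100));
            widgets.add(new Widget("blue", 50));
            widgets.add(new Widget("red", 23));
        }
        return widgets;
    }

    // The field reads in the behavioral parameters also hold LOCK, so the
    // earlier unlock in makeWidgets() happens-before these reads.
    static boolean isRed(Widget w) {
        synchronized (LOCK) {
            return "red".equals(w.color);
        }
    }

    static int weightOf(Widget w) {
        synchronized (LOCK) {
            return w.weight;
        }
    }

    public static int totalRedWeight(List<Widget> widgets) {
        return widgets.stream()
                .filter(SynchronizedRedWeight::isRed)
                .mapToInt(SynchronizedRedWeight::weightOf)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("Total red weight = " + totalRedWeight(makeWidgets()));
    }
}
```

Note that this sketch only covers the Widget fields.  Under the same
pessimistic reading of the API, the list's own internal state would need
equivalent treatment, which shows how awkward the approach becomes in
practice.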


More information about the amber-dev mailing list