Single Thread Continuation

robert engels rengels at ix.netcom.com
Thu Jul 6 02:35:05 UTC 2023


One final note on this: changing the reader to be a virtual thread rather than a platform thread improves the performance of the “synchronized generator” by 10x. Interestingly, it has the opposite effect on the non-synchronous version, reducing its performance to that of the synchronized version.


> On Jul 5, 2023, at 7:03 PM, robert engels <rengels at ix.netcom.com> wrote:
> 
> 
> 
>> On Jul 5, 2023, at 2:06 PM, Attila Kelemen <attila.kelemen85 at gmail.com> wrote:
>> 
>> Given that Ron Pressler wrote that there is not even a road map for generators, this will be my last email on this topic.
>> 
>> I saw in one of your other emails that "I only stated that Java does not need any language change". If that is the case, then I probably misunderstood you, because I agree with that. A language change is neither required, nor desirable. All I wish for is the JDK to provide some utility methods (and a few interfaces).
>> 
>> As for your comments below:
>> 
>> You might be misunderstanding what I'm referring to with "Iterator is unreachable -> Continuation is unreachable -> Generator is unreachable", because this is not what your code does (and there is no way it could do that). What I mean here is that when the "Iterator is unreachable", then in a JDK provided implementation the generator immediately becomes unreachable as well (simply because the assumption is that the generator is only reachable through the iterator). In your implementation that is not the case. What happens is that when the iterator becomes unreachable, then only eventually will the generator become unreachable as well, because some code needs to exit before that happens. And this makes a difference, because if the full GC happens after the iterator becomes unreachable, but before your code noticed that, then even after the full GC completes the generator will still be reachable as far as the GC is concerned, thus a potential OOM.
> 
> I understood what you are saying, but that is trivially solved. The Generator can be made to implement AutoCloseable, and then the standard try-with-resources could be used to trigger the clean-up if that was a concern - but these cases are very contrived, and it is doubtful that in any system the generator would be the object using so much memory as to cause an OOM.
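To make the AutoCloseable suggestion concrete, here is a minimal sketch of a generator iterator that can be closed from try-with-resources. Every name in it (`ClosableGenerator`, `Body`, `Emitter`) is hypothetical and is not taken from the robaho/generators repository; it also uses a `SynchronousQueue` and a platform thread purely to stay short and runnable on older JDKs, where a virtual thread could be substituted.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.SynchronousQueue;

// All names are hypothetical; this is not the API of the generators repository.
final class ClosableGenerator<T> implements Iterator<T>, AutoCloseable {
    interface Emitter<T> { void emit(T value) throws InterruptedException; }
    interface Body<T> { void run(Emitter<T> out) throws InterruptedException; }

    private static final Object DONE = new Object();   // end-of-stream marker
    private final SynchronousQueue<Object> handoff = new SynchronousQueue<>();
    private final Thread producer;
    private Object pending;      // value fetched by hasNext() but not yet returned
    private boolean finished;

    ClosableGenerator(Body<T> body) {
        // Assumption: a platform thread stands in for the virtual thread discussed in the thread.
        producer = new Thread(() -> {
            try {
                body.run(handoff::put);   // emitted values must be non-null
                handoff.put(DONE);
            } catch (InterruptedException e) {
                // close() interrupted the producer: let the thread end
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    @Override public boolean hasNext() {
        if (finished) return false;
        if (pending == null) {
            try {
                pending = handoff.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                finished = true;
                return false;
            }
            if (pending == DONE) {
                pending = null;
                finished = true;
                return false;
            }
        }
        return true;
    }

    @SuppressWarnings("unchecked")
    @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T value = (T) pending;
        pending = null;
        return value;
    }

    // Stops the producer right away instead of waiting for the GC to notice.
    @Override public void close() {
        finished = true;
        pending = null;
        producer.interrupt();
    }
}
```

A caller would wrap the generator in `try (ClosableGenerator<Integer> g = new ClosableGenerator<>(out -> { ... }))` and abandon iteration at any point; leaving the block interrupts the producer, so whatever the generator body holds becomes unreachable as soon as that thread exits.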
> 
>> 
>> As for the need for 100% synchronous generator: This is the most common case I believe, so I would expect the JDK to optimize for this case. Other cases can be implemented atop of that.
> 
> I created a branch ’synchronous’ that does this. I underestimated the work - it was 11 lines of code not 10.
> 
>> 
>> About lookahead: Your implementation doesn't have zero lookahead, but 1. And this is not your fault, but the fault of how `Iterator` is designed (it could be worked around if the work was done in `hasNext`, but that seems awful to me). So, I would actually really love a `boolean moveNext` based iterator as well for actual 0 lookahead. Also, there is no need for a generator utility to provide additional lookahead, since that is better implemented at a higher level.
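The lookahead point can be illustrated with a small sketch. The `Cursor` interface below is hypothetical (nothing like it exists in the JDK); it shows a `moveNext`-style contract where no value is computed until the caller asks for it, and an adapter that shows why `Iterator` forces the work into `hasNext`.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical interface, named here only for illustration.
interface Cursor<T> {
    boolean moveNext();   // advance; returns false once the source is exhausted
    T current();          // value produced by the last successful moveNext()
}

final class Cursors {
    private Cursors() {}

    // Counts from 0 (inclusive) to limit (exclusive); each value is computed only on demand.
    static Cursor<Integer> range(final int limit) {
        return new Cursor<Integer>() {
            private int value = -1;
            @Override public boolean moveNext() {
                if (value + 1 >= limit) return false;
                value++;
                return true;
            }
            @Override public Integer current() { return value; }
        };
    }

    // Adapting to Iterator: hasNext() cannot answer without advancing the cursor,
    // which is exactly the one-element lookahead being discussed.
    static <T> Iterator<T> asIterator(final Cursor<T> cursor) {
        return new Iterator<T>() {
            private boolean advanced;
            private boolean has;
            @Override public boolean hasNext() {
                if (!advanced) {
                    has = cursor.moveNext();
                    advanced = true;
                }
                return has;
            }
            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                advanced = false;
                return cursor.current();
            }
        };
    }
}
```

With the cursor, a consumer that stops after reading a value never triggers production of the following one; with the adapter, merely calling `hasNext()` does.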
> 
> The synchronous branch has 0 look-ahead and as expected performance drops tremendously - from 350 ms to 10+ seconds. All systems designed like this suffer ping-pong latency - which is why you don’t do it this way if you can - always better to read ahead and discard if not needed (sometimes you can’t do this easily).
> 
> Typically an expensive/complex generator would amortize the transfer cost so it isn’t as bad as it looks anyway.
> 
>> 
>> As for your implementation being 100% done: That is not really true (though it is a nicely optimized one, for sure). Also, the fact that you found a way to make your previous code more efficient just proves that it is better to have this at a common location where everybody can benefit from improvements found. Anyway, I don't want to go into this more, because if you are only against a language change, then we are probably on the same page on this already.
> 
> It is 100% functional - and it would be easy to make the synchronous behavior an optional mode, so that generators that don’t require it don’t pay a performance hit. (There are other ways to reduce the ping-pong latency, like spin loops, but they’re beyond the scope of this discussion.)
> 
> 
>> 
>> Attila
>> 
>> robert engels <rengels at ix.netcom.com> wrote (on Wed, Jul 5, 2023, 3:31):
>> 
>> 
>>> On Jul 4, 2023, at 6:28 PM, Attila Kelemen <attila.kelemen85 at gmail.com> wrote:
>>> 
>>> Robert Engels <rengels at ix.netcom.com> wrote (on Wed, Jul 5, 2023, 0:48):
>>> I don’t believe any of those statements are true. Even if the language supported generators directly - they are still subject to GC. The JVM needs a way to release the generator and its backing resources. If it did that directly - like a destructor - when it goes out of scope, it still wouldn’t be able to release the other resources.
>>> 
>>> 
>>> It is not about being subject to GC or not. My claim is the following: With JVM implementation, it could be that:
>>> 
>>> Iterator is unreachable -> Continuation is unreachable -> Generator is unreachable
>> 
>> This is exactly how my implementation works.
>> 
>>> 
>>> The point is that in a custom VT-based implementation the JVM cannot know this, because some code will have to terminate the generator loop first, while the JVM doesn't have to (similar to what is done in Kotlin, if I'm not mistaken). Potentially delaying the cleanup of some VT-related resources is not that big of a deal. Delaying the unreachability of the generator might be.
>>>  
>> 
>> Yes it can - and it does! That is how GC languages work. Even if the JVM supported generators directly - it is still going to rely on GC for cleaning up memory resources - and a generator could use try-with-resources to clean up other resources more quickly. Which is exactly what the implementation I shared does (the producer can use try-with-resources if it needs to).
>> 
>> Yes, there is a potential delay in when the system determines that the iterator/continuation/producer/generator is unreachable - but that is no different than any memory resource in Java. If you wanted to add a ‘close’ method to the iterator you could - for more immediate clean-up but it is not necessary. If you use the weak-reference queue you will also typically be notified far sooner than the finalizer method I used (since the finalizer needs to be scheduled and competes with other objects needing finalization).
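As a rough illustration of the weak-reference queue approach, the sketch below registers an iterator with a `ReferenceQueue` and runs a cleanup action once the collector has enqueued its reference. `IteratorWatcher` and its methods are hypothetical names, not code from the shared implementation, and the cleanup action must not capture the iterator itself or the iterator would never become unreachable.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: runs a cleanup action after a watched object becomes unreachable.
final class IteratorWatcher {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Holding the Reference objects strongly is required, otherwise they are never enqueued.
    private final Map<Reference<?>, Runnable> actions = new ConcurrentHashMap<>();

    // onUnreachable would typically interrupt or unpark the producer thread.
    Reference<Object> watch(Object iterator, Runnable onUnreachable) {
        Reference<Object> ref = new WeakReference<>(iterator, queue);
        actions.put(ref, onUnreachable);
        return ref;
    }

    // Runs the action for every reference enqueued so far; returns how many ran.
    // A real implementation would more likely block in queue.remove() on a dedicated thread.
    int drain() {
        int ran = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Runnable action = actions.remove(ref);
            if (action != null) {
                action.run();
                ran++;
            }
        }
        return ran;
    }
}
```

Unlike a finalizer, nothing here waits on the finalizer thread: the reference is enqueued as part of reference processing, and the watcher decides when and on which thread the cleanup runs.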
>> 
>>> 
>>> Your understanding of how OOM and GC work is not correct. 
>>> 
>>> Can you clarify what exactly you are referring to? My claim is that, if the generator retains a large object (like a large array), then in a VM-based implementation the JVM can see that the generator is unreachable if the iterator is unreachable. Thus, in case it needs a lot of memory, it can conclude that there is no need to throw an OOM. In a custom VT-based case, there is no such chance, because it can't possibly know that the generator will soon be unreachable.
>> 
>> It will know the generator is unreachable the same way it determines that any object is unreachable. If the JVM is unable to allocate memory (eg. capped) it will run full GC cycles to free memory before it triggers an OOM. If the generator (holding the memory) is still reachable then it would be in a JVM specific implementation as well - meaning the memory could not be released.
>> 
>>>  
>>> 
>>> My implementation does not require a queue (unless you consider a handoff variable a queue) or exceptions. 
>>> 
>>> 
>>> Yes, I'm referring to that. You yourself called that a "hand-off queue" as well. It doesn't matter how you implement it, it is still a queue.
>> 
>> It is doubtful that a jvm native implementation would not use a hand-off mechanism - otherwise the generator would need to be 100% synchronous with zero read ahead - no one would want generators to work that way - it is far far less efficient than a hand-off or full queue.
>> 
>>>  
>>> I think you’ll find the implementation I shared to be very efficient - and it was a super quick effort. An atomic CAS and LockSupport would make it even more so - but any complex generator will dominate the performance over the handoff infrastructure. 
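For readers wondering what a hand-off built from an atomic CAS and LockSupport might look like, here is a minimal single-slot sketch. It is an assumption-laden illustration, not the code in the repository: `HandoffSlot` is a hypothetical name, it supports exactly one producer and one consumer, it rejects nothing but silently requires non-null values, and it ignores interruption (an interrupted thread would spin rather than park).

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Hypothetical single-producer / single-consumer hand-off with a one-element slot.
final class HandoffSlot<T> {
    private final AtomicReference<T> slot = new AtomicReference<>();
    private volatile Thread producer;
    private volatile Thread consumer;

    // Called only by the producer thread; value must be non-null.
    void put(T value) {
        producer = Thread.currentThread();        // publish before checking the slot
        while (!slot.compareAndSet(null, value)) {
            LockSupport.park(this);               // slot still full; loop guards against spurious wake-ups
        }
        LockSupport.unpark(consumer);             // unpark(null) has no effect
    }

    // Called only by the consumer thread.
    T take() {
        consumer = Thread.currentThread();        // publish before checking the slot
        T value;
        while ((value = slot.getAndSet(null)) == null) {
            LockSupport.park(this);               // slot empty; wait for the producer
        }
        LockSupport.unpark(producer);
        return value;
    }

    // Usage example: hands the values 1..n across threads and returns their sum.
    static long demoSum(final int n) {
        final HandoffSlot<Integer> slot = new HandoffSlot<>();
        Thread p = new Thread(() -> {
            for (int i = 1; i <= n; i++) slot.put(i);
        });
        p.setDaemon(true);
        p.start();
        long sum = 0;
        for (int i = 0; i < n; i++) sum += slot.take();
        return sum;
    }
}
```

Each side publishes its own thread before testing the slot, so whichever side changes the slot second is guaranteed to see the other thread and wake it. The same code works unchanged whether the two threads are platform or virtual threads.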
>>>> 
>>> 
>>> 
>>> It is very efficient given your possibilities, but the JVM could do better, because it doesn't need a queue. In fact, it can just immediately change the context on the same carrier thread in `Iterator.next`, and doesn't even need to involve the queue for VT (let alone the extra queue a custom implementation needs atop of that).
>> 
>> See above. It is doubtful it wouldn’t use a queue as this would be slower and not using the power/performance of concurrency. The queue size is negligible compared with the other resources probably used by a complex generator.
>> 
>>> 
>>> As for the "super quick effort". Maybe, but your implementation is far from complete, as there are a lot of additional things to deal with in the general case (when you are not just inlining a simple counting loop), obviously needs some cleanup, and also any implementation that is used in serious code would require extensive testing. All of those are non-trivial effort, and there is a lot of chance for bugs, and there would be no point in forcing many-many people to go through that, when the JDK could just provide a good and efficient implementation.
>> 
>> The implementation is 100% complete, but to make it more obvious I created a new branch: https://github.com/robaho/generators/tree/complex
>> 
>> That has arbitrary generators and improves the synchronization.
>> 
>> It can generate and consume 1M values in < 350 ms or 350 nanos per operation.
>> 
>> As I said, the overhead in the framework is negligible. The generators themselves if they have any complexity or IO are going to dominate the performance.



More information about the loom-dev mailing list