<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Tbc, I am referring to cpu bound tasks of equal duration (cpu time).<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Jul 3, 2024, at 12:46 PM, Robert Engels <<a href="mailto:robaho@icloud.com" class="">robaho@icloud.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html; charset=utf-8" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Tbh, I didn’t quite understand this:<div class=""><br class=""></div><div class="">"<span style="caret-color: rgb(0, 0, 0);" class="">Latency is not generally improved by time sharing regardless of the number of CPUs. In some situations time sharing will make it (potentially much) better, and in others it will make it (potentially much) worse. 
</span><font class=""><span style="caret-color: rgb(0, 0, 0);" class="">“</span></font></div><div class=""><font class=""><span style="caret-color: rgb(0, 0, 0);" class=""><br class=""></span></font></div><div class=""><font class=""><span style="caret-color: rgb(0, 0, 0);" class="">Because it is referring to two different things, in my opinion.</span></font></div><div class=""><font class=""><span style="caret-color: rgb(0, 0, 0);" class=""><br class=""></span></font></div><div class=""><font class=""><span style="caret-color: rgb(0, 0, 0);" class="">I would have stated it as:</span></font></div><div class=""><font class=""><span style="caret-color: rgb(0, 0, 0);" class=""><br class=""></span></font></div><div class=""><font class="">“Tail latency is improved for cpu bound tasks by timesharing regardless of the number of CPUs”.</font></div><div class=""><font class=""><br class=""></font></div><div class=""><font class="">I don’t see how timesharing can ever make tail latency worse - as normally the context switch overhead is a very small percentage of the timeslice allotment.</font></div><div class=""><font class=""><br class=""></font></div><div class=""><font class="">Also, the statement:</font></div><div class=""><font class=""><br class=""></font></div><div class=""><font class="">"</font><span style="caret-color: rgb(0, 0, 0);" class="">IO preempts threads both in the OS and with the virtual thread scheduler even without time sharing.</span><font class=""><span style="caret-color: rgb(0, 0, 0);" class="">”</span></font></div><div class=""><font class=""><br class=""></font></div><div class=""><font class="">is not correct according to what I know about most OSes. 
An OS without timeslicing will never pre-empt a completely CPU bound task - it will run to completion or be killed - those are the only options (and the latter is close to Thread.stop(), which as we know is problematic).<br class=""></font><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Jul 3, 2024, at 12:39 PM, Attila Kelemen <<a href="mailto:attila.kelemen85@gmail.com" class="">attila.kelemen85@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div class="">I think you somewhat misunderstood Ron's comment on "same". "Same" means that they are progressing the same task. For example, you have a large composite task which is made up of 100 small chunks, and then you start these 100 chunks of work in parallel. If you are fair, then what you will see is 0%, 0%, ... and then suddenly 100% when all of them are completed (assuming total fairness). While in the non-fair case, you will see incremental progress: a few chunks done, then a few more done, etc.</div><br class=""><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Robert Engels <<a href="mailto:robaho@icloud.com" class="">robaho@icloud.com</a>> ezt írta (időpont: 2024. júl. 
3., Sze, 18:44):<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;" class="">I don't think that is correct - but I could be wrong.<div class=""><br class=""></div><div class="">With platform threads the min and max latency in a completely cpu bound scenario should be very close to the average with a completely fair scheduler (when all tasks/threads are submitted at the same time).</div><div class=""><br class=""></div><div class="">Without timesharing, the average will be the same, but the min and max latencies will be far off the average - as the tasks submitted first complete very quickly, and the tasks submitted at the end take a very long time because they need all of the tasks before them to complete.</div><div class=""><br class=""></div><div class="">In regards to the “enough cpus” comment, I only meant that if there are enough cpus and a “somewhat” balanced workload, it is unlikely that all of the cpu bound tasks could consume all of the carrier threads given a random distribution. 
If you have more active tasks than cpus and the majority of the tasks are cpu bound, the IO tasks are going to suffer in a non-timesliced scenario - they will be stalled waiting for a carrier thread - even though the amount of cpu they need will be very small.</div><div class=""><br class=""></div><div class="">This has a lot of info on the subject <a href="https://docs.kernel.org/scheduler/sched-design-CFS.html" target="_blank" class="">https://docs.kernel.org/scheduler/sched-design-CFS.html</a> including:</div><div class=""><br class=""></div><div class=""><span style="color: rgb(62, 67, 73); font-family: serif; font-size: inherit; background-color: rgb(255, 255, 255);" class="">On real hardware, we can run only a single task at once, so we have to introduce the concept of “virtual runtime.” The virtual runtime of a task specifies when its next timeslice would start execution on the ideal multi-tasking CPU described above. In practice, the virtual runtime of a task is its actual runtime normalized to the total number of running tasks.</span></div><div class=""><br class=""></div><div class=""><div class="">I recommend this <a href="https://opensource.com/article/19/2/fair-scheduling-linux" target="_blank" class="">https://opensource.com/article/19/2/fair-scheduling-linux</a> for an in-depth discussion on how the dynamic timeslices are computed.</div><div class=""><br class=""></div><div class="">The linux scheduler relies on timeslicing in order to have a “fair” system. I think most Java “server” type applications strive for fairness as well - i.e. long tail latencies in anything are VERY bad (thus the constant fight against long GC pauses - better to amortize those for consistency).</div><div class=""><br class=""></div></div></div></blockquote></div></div>
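<div class=""><br class=""></div><div class="">The fairness point above (and the CFS “virtual runtime” quote) can be sketched as a toy model - a single CPU, N equal CPU-bound tasks, comparing FIFO run-to-completion against always running the task with the lowest virtual runtime. The class and numbers are illustrative only, not from this thread:</div>

```java
// Toy model (illustrative, not from this thread): N equal CPU-bound tasks
// on a single CPU. FIFO runs each task to completion; the CFS-style
// scheduler gives one tick at a time to the task with the lowest virtual
// runtime (equal weights, so vruntime == actual runtime here).
import java.util.Arrays;

public class SchedulerSketch {

    /** FIFO run-to-completion: task i finishes at tick (i + 1) * taskLen. */
    static int[] fifo(int n, int taskLen) {
        int[] done = new int[n];
        for (int i = 0; i < n; i++) done[i] = (i + 1) * taskLen;
        return done;
    }

    /** CFS-style fair sharing: always run the unfinished task with the
     *  least runtime so far, one tick at a time. */
    static int[] fair(int n, int taskLen) {
        int[] ran = new int[n];   // runtime accumulated per task (the "vruntime")
        int[] done = new int[n];  // completion tick per task; 0 = not finished
        int finished = 0;
        for (int tick = 1; finished < n; tick++) {
            int next = -1;
            for (int i = 0; i < n; i++) {   // pick lowest vruntime among unfinished
                if (done[i] == 0 && (next == -1 || ran[i] < ran[next])) next = i;
            }
            ran[next]++;
            if (ran[next] == taskLen) { done[next] = tick; finished++; }
        }
        return done;
    }

    public static void main(String[] args) {
        int n = 100, len = 10;    // 100 tasks needing 10 ticks of CPU each
        int[] f = fifo(n, len), c = fair(n, len);
        // FIFO: completions spread from tick 10 to tick 1000.
        System.out.println("fifo min=" + Arrays.stream(f).min().getAsInt()
                + " max=" + Arrays.stream(f).max().getAsInt());
        // Fair sharing: nothing finishes until tick 901, then everything at once.
        System.out.println("fair min=" + Arrays.stream(c).min().getAsInt()
                + " max=" + Arrays.stream(c).max().getAsInt());
    }
}
```

<div class="">Both schedules finish all the work at tick 1000, but the completion profiles are opposite: FIFO trickles results out from the start (min = 10), while perfect fairness holds everything back until the very end (min = 901) - exactly the “0%, 0%, ... then suddenly 100%” behavior described above.</div>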
</div></blockquote></div><br class=""></div></div></div></blockquote></div><br class=""></body></html>