Strange interaction with hyperthreading on Intel hybrid CPU

Alan Bateman Alan.Bateman at oracle.com
Tue Oct 10 10:56:30 UTC 2023


On 10/10/2023 09:49, Michael van Acken wrote:
> Hello,
>
> I have the strange situation that *disabling* hyperthreading on a two 
> year old Intel Alder Lake system (8 performance cores, 4 efficiency 
> cores) gives me a speedup of 1.2 for one particular use case.
>
> The scenario is a compiler bootstrap, with namespaces running as 
> virtual threads delegating compilation of individual functions to
> dedicated virtual threads.  All in all very unstructured concurrency, 
> and not an experiment I would ever have contemplated without Loom's 
> virtual threads.  This attempt to go wide with the number of 
> concurrent tasks (maybe 2k in total) has a positive effect on overall 
> runtime, but it's not dramatic and even less so if hyperthreading is 
> enabled.
>
> I first noticed the runtime discrepancy a year ago, and before
> upgrading Ubuntu this morning I took the chance to compare their
> 6.2.0-34 kernel with 6.5.0-9.  Averaging bash's time over 10 runs of
> the java process and booting Linux both without and with nosmt set, I
> got this:
>
> 6.2.0-34 -- default 1.49s --> nosmt 1.21s for a speedup of 1.23
> 6.5.0-9 -- default 1.44s --> nosmt 1.20s for a speedup of 1.20
>
> The default setup reports 20 logical cpus, while nosmt reduces this to
> 12.  I doubt that this particular workload comes even close to
> utilizing 20 logical cpus, so not much upside is to be expected from
> having the 8 additional logical cpus.  But at some point in the past I
> tried this on an older machine with 4 cores/8 threads, and there the
> additional logical cores were very beneficial -- as expected.
>
> What baffles me is the significant downside when going from 12 cores
> with nosmt to 20 logical cpus with SMT.  What can make this workload so
> sensitive to hyperthreading?  Has anyone seen something similar?
>
Are the virtual threads executing compilation tasks in this usage? How 
many of them are running concurrently? I'm trying to see if this is a 
good use of virtual threads or not.

On hyperthreading: this is a good topic. We've had one report where
limiting the number of carrier threads to half the number of hardware
threads improved performance. I think we need more usage reports from
these systems before deciding whether the ergonomics and defaults should
be tuned for these processors.
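
A minimal sketch of how that experiment could be repeated, assuming the
carrier thread count is set through the jdk.virtualThreadScheduler.parallelism
system property described in the java.lang.Thread API docs. The class name
HalfCarriers and the helper halfOf are hypothetical and only illustrate the
arithmetic; the property has to be given on the command line because the
scheduler reads it at startup.

```java
public class HalfCarriers {

    // Hypothetical helper: half the number of hardware threads, never below 1.
    static int halfOf(int hardwareThreads) {
        return Math.max(1, hardwareThreads / 2);
    }

    public static void main(String[] args) {
        // Number of logical cpus visible to the JVM (20 on the machine
        // described above with hyperthreading enabled).
        int logicalCpus = Runtime.getRuntime().availableProcessors();

        // Suggested value for the scheduler's parallelism, i.e. the number
        // of carrier threads used to run virtual threads.
        int carriers = halfOf(logicalCpus);

        System.out.println("logical cpus: " + logicalCpus);
        System.out.println("try: java -Djdk.virtualThreadScheduler.parallelism="
                + carriers + " ...");
    }
}
```

On a hybrid part, half the hardware threads is not the same as the number of
physical cores: with 8 performance cores (2 threads each) and 4 efficiency
cores, half of 20 is 10, while the machine has 12 cores. It may be worth
timing both values.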

-Alan


