[ILUG] 2.6 Kernel + Hyperthreading
tony at palamon.ie
Fri Sep 10 23:19:13 IST 2004
Bryan O'Donoghue wrote:
> Hold on a second.. I'm fairly sure we decided yesterday, that
> non-threaded applications would benefit from HT. There is no way a Hyper
> Thread CPU can know if it's dealing with a kernel thread, a process or a
> user-space thread... unless I'm very, very wrong... to the CPU, it's
> just an execution context. Perhaps the kernel knows better and only
> schedules threaded applications for a Virtual CPU... but, again, I think
> we decided yesterday, that CPU=Virtual CPU, thread=process, ergo, it
> doesn't matter, if your application is threaded or not, hyperthreading
> _is_ a win.
Hyperthreading (or SMT, as the rest of the world called it before Intel got
their marketing people on the case) is by no means a guaranteed win.
What it does is allow multiple threads, instead of one, to attempt to keep the
processor pipeline full. More accurately, it allows multiple threads to issue
instructions to the re-order buffer, based on the correct assumption that
instructions from different threads won't have dependencies on each other, so
even if some instructions are held up, others from the other thread(s) can proceed.
Of course, for hyperthreading, we're talking multiple = 2.
This can be a good thing, because any kind of pointer chasing code will tend
to have great difficulty keeping the execution units busy, since following
an uncached pointer costs over 100 potential execution slots by the time
DRAM coughs it up. In fact, given a diet of that kind of code, you could
probably run 30+ threads and still barely keep the chip productive.
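
A rough back-of-the-envelope sketch of that claim, as a toy model rather than
anything measured: assume each thread does `work` slots of useful execution and
then stalls for `miss` slots waiting on DRAM, and that the core can pick up any
ready thread at no cost. The function name and the figures 3 and 100 are
illustrative assumptions, not properties of any real CPU.

```python
def utilisation(threads, work=3, miss=100):
    """Fraction of execution slots kept busy in a toy latency-hiding model.

    Each thread alternates `work` busy slots with `miss` stalled slots.
    With perfect interleaving, n threads can cover at most n * work slots
    of every (work + miss)-slot period, capped at 100%.
    """
    return min(1.0, threads * work / (work + miss))


for n in (1, 2, 30, 35):
    print(f"{n:2d} threads: {utilisation(n):.0%} busy")
```

Under these assumptions two threads only double a very small number, and it
takes a few dozen threads before the execution units stop idling.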
However, there are two downsides to it. The first is that code is sometimes
written to take advantage of the re-order buffer to speculatively load data
which isn't quite needed yet, or to run another group of operations which are
independent of others. Since the re-order buffer is now split between
2+ threads, the processor doesn't get to see so far into each one, and might
not run the speculative load (or the second bundle of instructions) in
parallel with the earlier instructions. This means that hyperthreading tends
to run any particular thread a bit slower than it would without hyperthreading,
but the net throughput is better. Hyperthreading CPUs also favour one thread
above the other to help reduce this problem. Fancy SMT implementations allow
the OS to control this.
The second downside is that threads might start fighting over the caches. This
can be a particular problem if it gets to the stage where each thread is evicting
the other's data every few instructions. This can result in a massive slowdown,
and a net throughput far lower than a single thread would get.
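
The cache-fighting case can be illustrated with a deliberately pathological
toy simulation. It assumes a single shared, fully associative LRU cache and two
threads that strictly alternate accesses while looping over their own working
sets. Real caches are set-associative and real schedules are less regular, so
treat the helper `hit_rate` and its numbers as a hypothetical sketch only.

```python
from collections import OrderedDict


def hit_rate(streams, capacity, rounds=50):
    """Hit rate of one shared, fully associative LRU cache.

    `streams` is a list of address lists, one per thread. The threads take
    turns issuing one access each, cycling through their own list.
    """
    cache = OrderedDict()
    hits = total = 0
    longest = max(len(s) for s in streams)
    for i in range(rounds * longest):
        for s in streams:
            addr = s[i % len(s)]
            total += 1
            if addr in cache:
                hits += 1
                cache.move_to_end(addr)        # mark as most recently used
            else:
                cache[addr] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return hits / total


a = [("A", n) for n in range(6)]  # thread A's working set: 6 lines
b = [("B", n) for n in range(6)]  # thread B's working set: 6 lines

print("A alone      :", hit_rate([a], capacity=8))
print("A and B share:", hit_rate([a, b], capacity=8))
```

Each working set fits in the 8-line cache on its own, but the two together do
not, so with LRU replacement each thread keeps evicting lines the other is
about to reuse.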
So in the best case, hyperthreading gives you twice the instruction throughput,
with no slowdown on either thread relative to it running alone. In the worst case,
it halves the size and associativity of your cache, and halves the size of the
re-order buffer, for a massive slowdown, even if there are two threads running.
So, basically, it depends on what the workload is.