HyperThreading Technology - Overview
Sunday, 26 March 2006
HyperThreading Technology (HTT) is Intel's trademark for its implementation of simultaneous multithreading (SMT) on the Pentium 4 microarchitecture. It is essentially a more advanced form of SuperThreading; it first debuted on Intel Xeon processors and was later added to Pentium 4 processors.

In SuperThreading, the processor can execute instructions from a different thread each cycle, so cycles left unused by one thread can be used by another that is ready to run. Still, a given thread is almost certainly not utilizing all of the execution units of a modern processor at the same time. More advanced implementations of SMT allow multiple threads to run in the same cycle, using different execution units of a superscalar processor.

The HyperThreading technology improves processor performance under certain workloads by providing useful work for execution units that would otherwise be idle, for example during a cache miss. The advantages of HyperThreading include improved support for multithreaded code (allowing multiple threads to run simultaneously), improved reaction and response times, and an increase in the number of users a server can support.

HyperThreading works by duplicating certain sections of the processor, those that store the architectural state, but not the main execution resources. When the execution resources of a non-HyperThreading-capable processor would go unused by the current task, and especially when the processor is stalled, a HyperThreading-equipped processor can use those execution resources to execute another scheduled task.

From a software or architecture perspective, this means operating systems and user programs can schedule processes or threads to logical processors just as they would on multiple physical processors. As most of today's operating systems (such as Windows and Linux) are capable of dividing their workload among multiple processors (this is called symmetric multiprocessing, or SMP), the operating system simply treats a HyperThreading processor as a pool of two logical processors.
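As a rough illustration of what scheduling threads onto logical processors looks like from user space on Linux, here is a minimal sketch (not from the article) that pins two POSIX threads to logical CPUs 0 and 1. On a single HyperThreading package those are typically the two logical processors the kernel exposes, but the exact numbering is an assumption.

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Worker that reports which logical CPU it is currently running on. */
static void *worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld running on logical CPU %d\n", id, sched_getcpu());
    return NULL;
}

int main(void)
{
    pthread_t threads[2];

    for (long i = 0; i < 2; i++) {
        pthread_attr_t attr;
        cpu_set_t cpus;

        pthread_attr_init(&attr);
        CPU_ZERO(&cpus);
        CPU_SET(i, &cpus);              /* logical CPU 0 or 1 (assumed numbering) */
        pthread_attr_setaffinity_np(&attr, sizeof(cpus), &cpus);

        pthread_create(&threads[i], &attr, worker, (void *)i);
        pthread_attr_destroy(&attr);
    }

    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);

    return 0;
}

Compile with gcc -O2 -pthread. To this program, and to the scheduler, the two logical processors of one HyperThreading package look exactly like two physical CPUs.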

Before continuing our discussion of multiprocessing, let's take a moment to unpack the term "program" a bit more. In most modern operating systems, what users normally call a program would be more technically termed a process. Associated with each process is a context, "context" being just a catch-all term that encompasses all the information that completely describes the process's current state of execution (e.g. the contents of the CPU registers, the program counter, the flags, etc.).

Processes are made up of threads; every process consists of at least one main thread and may contain several. Each thread has its own local context in addition to the process's context, which is shared by all the threads in a process. In reality, a thread is just a specific type of stripped-down process, a "lightweight process".
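To make the distinction concrete, here is a minimal POSIX-threads sketch (illustrative, not from the article): the global counter belongs to the process context shared by all threads, while each thread's stack variable belongs to its own private context.

#include <pthread.h>
#include <stdio.h>

int shared_counter = 0;                          /* process context: visible to all threads      */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    int local = 0;                               /* thread context: each thread has its own copy */

    for (int i = 0; i < 1000; i++) {
        local++;
        pthread_mutex_lock(&lock);
        shared_counter++;                        /* shared state must be synchronized            */
        pthread_mutex_unlock(&lock);
    }
    printf("thread %ld: local = %d\n", (long)arg, local);
    return NULL;
}

int main(void)
{
    pthread_t t[2];

    for (long i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("shared_counter = %d\n", shared_counter);   /* 2000: both threads updated it */
    return 0;
}

Each worker prints local = 1000 while shared_counter ends at 2000, because the local variable lives in each thread's private context and the global lives in the context shared by the whole process.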

SuperThreading processors can help alleviate some of the latency problems brought on by DRAM memory's slowness relative to the CPU. For instance, consider the case of a multithreaded processor executing two threads. If the first thread requests data from main memory and this data is not present in the cache, then this thread could stall for many CPU cycles while waiting for the data to arrive. In the meantime, however, the processor could execute the second thread while the first one is stalled, thereby keeping the pipeline full and getting useful work out of what would otherwise be dead cycles.
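A rough way to see this effect for yourself is the sketch below (not from the article): one thread pointer-chases through a large array and stalls on cache misses, while the other runs a purely arithmetic loop. The array size and iteration counts are arbitrary assumptions.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 23)                     /* 8M entries (64 MB): far larger than the caches */

static size_t next_idx[N];
static volatile double sink;

/* Memory-bound work: pointer-chase through a random cyclic permutation,
 * so nearly every step is a cache miss that stalls the pipeline.        */
static void *chase(void *arg)
{
    (void)arg;
    size_t i = 0;
    for (long step = 0; step < 20 * 1000 * 1000L; step++)
        i = next_idx[i];
    sink = (double)i;                   /* keep the result live */
    return NULL;
}

/* Compute-bound work: a dependent arithmetic chain that never touches
 * memory, so it can use execution units the other thread leaves idle
 * while it waits on DRAM.                                               */
static void *crunch(void *arg)
{
    (void)arg;
    double x = 1.0001;
    for (long step = 0; step < 1000 * 1000 * 1000L; step++)
        x = x * 1.0000001 + 0.0000001;
    sink = x;
    return NULL;
}

static double seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    /* Sattolo's algorithm: build a single-cycle permutation so the
     * chase never settles into a small, cache-resident loop.        */
    for (size_t i = 0; i < N; i++)
        next_idx[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = next_idx[i];
        next_idx[i] = next_idx[j];
        next_idx[j] = tmp;
    }

    pthread_t a, b;
    double t0 = seconds();
    pthread_create(&a, NULL, chase, NULL);
    pthread_create(&b, NULL, crunch, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("both threads together: %.2f s\n", seconds() - t0);
    return 0;
}

Commenting out one of the pthread_create calls gives the single-thread baselines; run pinned to the two logical CPUs of one HyperThreading core (e.g. taskset -c 0,1, assuming that enumeration), the combined time should land closer to the longer of the two baselines than to their sum.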

While SuperThreading can help immensely in hiding memory access latencies, it does not address the waste associated with poor instruction-level parallelism within individual threads. If the scheduler can find only two instructions in the first thread to issue in parallel to the execution core on a given cycle, then the other two issue slots will simply go unused. HyperThreading is simply SuperThreading without the restriction that all the instructions issued by the front end on each clock come from the same thread.
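To make "poor instruction-level parallelism" concrete, here is a small, purely illustrative C experiment (not from the article): summing an array with a single dependent chain versus four independent accumulator chains. The array size and repeat count are arbitrary assumptions.

#include <stdio.h>
#include <time.h>

enum { N = 4096 };                      /* small enough to stay in the L1/L2 cache */
static double a[N];

/* Low ILP: every addition depends on the previous one, so only one
 * add can be in flight at a time no matter how wide the core is.    */
static double sum_dependent(const double *v, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Higher ILP: four independent accumulator chains the core can issue
 * and execute in parallel (n is assumed to be a multiple of 4).      */
static double sum_unrolled(const double *v, int n)
{
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (int i = 0; i < n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}

int main(void)
{
    const int repeats = 100000;
    volatile double sink = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    clock_t t0 = clock();
    for (int r = 0; r < repeats; r++) {
        a[r & (N - 1)] += 1e-9;         /* tiny change so the result can't be hoisted */
        sink = sum_dependent(a, N);
    }
    clock_t t1 = clock();
    for (int r = 0; r < repeats; r++) {
        a[r & (N - 1)] += 1e-9;
        sink = sum_unrolled(a, N);
    }
    clock_t t2 = clock();

    printf("dependent chain : %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("4 indep. chains : %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    (void)sink;
    return 0;
}

Compiled with gcc -O2 (without -ffast-math, so the compiler cannot reassociate the dependent sum), the four-chain version typically runs noticeably faster because it keeps more issue slots filled; the slots left empty by the dependent version are exactly the kind of waste a second HyperThreading thread could soak up.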

HyperThreading's strength is that it allows the scheduling logic maximum flexibility to fill execution slots, thereby making more efficient use of the available execution resources by keeping the execution core busier. Compared with an SMP system, the same amount of work gets done in both, but the HyperThreaded system uses a fraction of the resources and has a fraction of the waste of the SMP system.

Shared resources are at the heart of HyperThreading. The more resources that can be shared between logical processors, the more efficient HyperThreading can be at squeezing the maximum amount of computing power out of the minimum amount of die space.

As for optimizing code for HyperThreading, Intel assures developers that only a small amount of code needs to be rewritten or optimized. What they probably mean is that it is ideal to schedule processes onto fully unused processor packages (i.e. packages with both logical processors idle). An HTT-aware scheduler implementation would therefore mean a new balancing policy in the load balancer, to better distribute load on these systems.
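As a user-space illustration of the information such a balancing policy needs, the sketch below (not from the article) reads the sibling lists that later 2.6+ kernels expose under sysfs; treat the exact paths as an assumption, since the article discusses the 2.4/2.5 kernels. A scheduler that prefers fully idle packages has to know which logical CPUs share a physical core.

#include <stdio.h>

/* Print which logical CPUs share a physical core, using the sysfs
 * topology files. A load balancer that prefers fully idle cores
 * needs exactly this sibling information.                          */
int main(void)
{
    for (int cpu = 0; ; cpu++) {
        char path[128], siblings[64];
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
                 cpu);

        FILE *f = fopen(path, "r");
        if (!f)
            break;                      /* no more logical CPUs */
        if (fgets(siblings, sizeof(siblings), f))
            printf("cpu%d shares a core with: %s", cpu, siblings);
        fclose(f);
    }
    return 0;
}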

To get a better picture of what is happening on the OS side, let's take a look at the Linux OS.

The Linux symmetric multiprocessing kernel, in both its 2.4 and 2.5 versions, has been made aware of Hyper-Threading, and performance speed-ups have been observed in multithreaded benchmarks. Results on Linux kernel 2.4.19 show that Hyper-Threading technology could improve multithreaded applications by 30%, and current work on Linux kernel 2.5.32 may provide speed-ups of as much as 51%.

The status of many resources can be monitored through the /proc file system, and the CPUs are no exception. By typing cat /proc/cpuinfo we can see that the Linux kernel thinks it is running on a dual-processor system.
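A small sketch (not from the article) that extracts the same information programmatically: it counts the logical processors the kernel reports in /proc/cpuinfo and checks for the "ht" feature flag. Note that the flag only indicates that the processor is HyperThreading-capable, not that a second logical CPU is actually enabled.

#include <stdio.h>
#include <string.h>

/* Count the logical processors listed in /proc/cpuinfo and check
 * whether the "ht" feature flag is present in the flags line.      */
int main(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (!f) {
        perror("/proc/cpuinfo");
        return 1;
    }

    char line[2048];
    int logical_cpus = 0;
    int ht_flag = 0;

    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "processor", 9) == 0)
            logical_cpus++;
        if (strncmp(line, "flags", 5) == 0 &&
            (strstr(line, " ht ") || strstr(line, " ht\n")))
            ht_flag = 1;
    }
    fclose(f);

    printf("logical processors: %d\n", logical_cpus);
    printf("ht feature flag   : %s\n", ht_flag ? "present" : "absent");
    return 0;
}

On a HyperThreading Pentium 4 or Xeon with HT enabled in the BIOS, this reports two logical processors per physical package.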

Therefore, unlike full SMP support, no major adjustments need to be applied to the kernel overall; the only significant change is tuning the scheduler so that its load balancer distributes work better across logical processors.

As a final note, I want to bring into the discussion the HTT vulnerability that allows different threads to observe certain shared cache locations. It later turned out that this threat is only theoretical and will definitely not decrease HyperThreading's popularity and usability.