Titan Cray XK7


Titan will unlock secrets of the universe from the smallest to the largest scales, including (from left), the transition of states in a quantum magnet, the ability of promising new drugs to disassemble damaging fibrils in the brains of Alzheimer’s sufferers, the confinement and dispersion of small molecules within carbon nanostructures, the behavior of neutrons in a nuclear reactor core, long-term climate forecasting, and the mechanism through which a collapsing stellar core blows the star into space.

Overview

Titan Overview

The Oak Ridge Leadership Computing Facility (OLCF) has completed the first phase of an upgrade of the Jaguar system that will result in a hybrid-architecture Cray XK7 system named Titan, with a peak theoretical performance of more than 20 petaflops. Titan will be the first major supercomputing system to use a hybrid architecture, that is, one that combines conventional 16-core AMD Opteron CPUs with unconventional NVIDIA Kepler graphics processing units (GPUs). The combination of CPUs and GPUs will allow Titan and future systems to overcome power and space limitations inherent in previous generations of high-performance computers.

In the first phase of the upgrade, completed in February 2012, each of Jaguar’s 18,688 Cray XT5 compute nodes was replaced with a new Cray XK6 compute node consisting of one 16-core AMD Opteron 6274 processor running at 2.2 GHz, 32 gigabytes of DDR3 memory, and Cray’s new high-performance Gemini network, which provides higher bandwidth, lower latency, faster collectives, and greater reliability than the network in the previous-generation XT5 nodes. The upgraded Jaguar system has a total of 299,008 AMD Opteron CPU cores and 600 terabytes of memory, and is connected to the 240 GB/s Spider file system. Phase I of this upgrade also populated 960 of these XK6 nodes with NVIDIA Fermi GPUs.
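The system totals quoted above follow directly from the per-node figures. The short Python sketch below reproduces that arithmetic; the variable names are illustrative, and it assumes decimal units (1 TB = 1,000 GB), which the source does not specify. The result, roughly 598 TB, is consistent with the rounded figure of 600 terabytes.

```python
# Per-node figures quoted in the text above.
NODES = 18_688
CORES_PER_NODE = 16
MEMORY_PER_NODE_GB = 32

# System-wide totals derived from the per-node figures.
total_cores = NODES * CORES_PER_NODE
total_memory_gb = NODES * MEMORY_PER_NODE_GB
total_memory_tb = total_memory_gb / 1000  # assumption: decimal terabytes

print(f"Total CPU cores: {total_cores:,}")
print(f"Total memory: {total_memory_gb:,} GB (about {total_memory_tb:.0f} TB)")
```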

The second phase of the upgrade began in the fall of 2012 by adding 18,688 NVIDIA Kepler K20 GPUs, each with 6 gigabytes of high-speed, directly attached memory. Because they handle hundreds of calculations simultaneously, GPUs can go through many more calculations than CPUs in a given time, yet they draw only modestly more electricity. By relying on its 299,008 CPU cores to guide simulations and allowing its Kepler GPUs to do the heavy lifting, Titan will be approximately ten times more powerful than its predecessor, Jaguar, while occupying the same space and drawing essentially the same level of power.

When complete, Titan will have a theoretical peak performance of more than 20 petaflops, or more than 20,000 trillion calculations per second. This will enable researchers across the scientific arena, from materials to climate change to astrophysics, to acquire unparalleled accuracy in their simulations and achieve research breakthroughs more rapidly than ever before.

Users of Titan will continue to have access to the Spider file system, with 240 GB/s data bandwidth and over 10 PB of storage capacity. The OLCF will upgrade Spider in 2013 to increase both bandwidth and capacity. Users will also have access to the HPSS data archive, the LENS data analysis and visualization cluster, and the newly upgraded EVEREST high-resolution visualization facility. All of these resources are available through high-performance networks, including ESnet’s recently upgraded 100-gigabit-per-second links.

More Information

For more information on the Titan project, please visit http://olcf.ornl.gov/titan.

Support

For support on Titan, please visit http://www.olcf.ornl.gov/support.

Tech Specs

Titan System Configuration
Architecture: Cray XK7
Processor: 16-Core AMD
Cabinets: 200
Nodes: 18,688 AMD Opterons
Cores/node: 16
Total cores: 299,008 Opteron Cores
Memory/node: 32 GB (CPU) + 6 GB (GPU)
Memory/core: 2GB
Interconnect: Gemini
GPUs: 18,688 K20 Keplers
Speed: 20+ PF
Square Footage: 4,352 sq feet
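
A few derived figures can be read off the configuration above. The Python sketch below is only an illustration: the names are hypothetical, "20+ PF" is treated as a lower bound of exactly 20 petaflops, and the per-node value is a simple average rather than a measured or vendor-quoted number.

```python
# Values taken from the configuration list above.
NODES = 18_688
CORES_PER_NODE = 16
CPU_MEMORY_PER_NODE_GB = 32
GPU_MEMORY_PER_NODE_GB = 6
PEAK_FLOPS = 20e15  # assumption: "20+ PF" taken as a 20-petaflop lower bound

# CPU memory available to each core on a node.
memory_per_core_gb = CPU_MEMORY_PER_NODE_GB / CORES_PER_NODE

# Aggregate GPU memory across all nodes (one GPU per node).
total_gpu_memory_gb = NODES * GPU_MEMORY_PER_NODE_GB

# Average theoretical peak per node, in teraflops.
peak_per_node_tflops = PEAK_FLOPS / NODES / 1e12

print(f"Memory per core: {memory_per_core_gb:.0f} GB")
print(f"Total GPU memory: {total_gpu_memory_gb:,} GB")
print(f"Average peak per node: at least {peak_per_node_tflops:.2f} TF")
```

The first result matches the Memory/core entry in the list; the other two are not stated in the source and follow only from the listed values.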