Biomedical Modeling at the National Cancer Institute

The Advanced Biomedical Computing Center at the National Cancer Institute (NCI) houses one of the largest Cray SV1 supercomputers in the world. Researchers from many countries use NCI's 96-processor Cray SV1 system to solve some of the most difficult problems in computational biology -- studying protein structure and function at the most detailed levels.

To model biomolecular systems accurately, scientists need leading-edge software as well as the Cray hardware at NCI. Molecular dynamics codes such as CHARMm and AMBER are used heavily at NCI to simulate large biomolecular systems. In addition, the electronic structure of chemical systems is modeled with the highly optimized Gaussian98 application from Gaussian Inc. Recently, researchers at NCI have used the SHELXL x-ray refinement code, specially designed for the Cray SV1 supercomputer, to determine the structure of large proteins to accuracies better than 1 angstrom. These applications all achieve true supercomputer performance by combining efficient vector computing, efficient parallelization (with MPI, Linda, or OpenMP), and unequaled memory bandwidth. This combination of vector and parallel performance gives researchers an advantage that can save months of effort and shorten the discovery cycle.

Cray and NCI Collaborate on More Powerful Bioinformatics Research Tools

Cray and NCI are working together to develop bioinformatics research tools substantially more powerful than those available today. In an initial demonstration project, scientists at the Advanced Biomedical Computing Center produced a comprehensive map of short tandem repeat sequences (STRs) -- often used as gene markers -- for the entire human genome. Using the Cray SV1, computations that previously took hours are being completed in seconds. This will enable biologists to do full-scale analyses that previously were impractical.

"In preliminary testing, the unique technologies available on Cray vector supercomputers have provided enormous speed-ups for full-scale analysis of some common types of bioinformatics problems," said Bill Long, Cray's chief collaborator for the NCI work. "Assuming this validation continues, we believe there is a potential to make full-scale, exhaustive analysis of many bioinformatics problems feasible for the first time." Although exhaustive analysis typically produces results that are more complete and reliable than methods based on statistical sampling, he said, to date exhaustive analysis has been too slow and expensive to use routinely.

Short tandem repeats, also known as microsatellites, are repetitive sequences of DNA that scientists have exploited for several years as tools to map new genes, study the structure of chromosomes, and compare the DNA of different species, all of which are major areas of interest in biology and medical research.
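To make the idea of an exhaustive repeat scan concrete, here is a minimal sketch in Python. It is an illustration only, not the NCI-Cray code, whose algorithm is not described here. The function name find_strs and its parameters (min_unit, max_unit, min_repeats) are assumptions chosen for this example. At every position it tries each repeat-unit length in turn and records the longest run of back-to-back copies it finds.

```python
def find_strs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Scan a DNA string for short tandem repeats.

    Returns a list of (start, unit, copies) tuples. All thresholds are
    illustrative assumptions, not values taken from the NCI-Cray work.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None  # (unit, copies) covering the most bases at position i
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this length
            copies = 1
            j = i + k
            # Count consecutive copies of the unit directly after position i.
            while seq[j:j + k] == unit:
                copies += 1
                j += k
            if copies >= min_repeats:
                if best is None or copies * k > best[1] * len(best[0]):
                    best = (unit, copies)
        if best is not None:
            hits.append((i, best[0], best[1]))
            i += len(best[0]) * best[1]  # skip past the repeat just reported
        else:
            i += 1
    return hits


if __name__ == "__main__":
    for start, unit, copies in find_strs("TTGCACACACAGGT"):
        print(start, unit, copies)
```

Because every position and every unit length is tested, the scan is exhaustive rather than sampled; this is the kind of brute-force approach that becomes expensive on a whole genome and so benefits from faster hardware.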

Other bioinformatics software tools under development in the NCI-Cray collaboration cover non-tandem repeat detection, EST cluster assembly, CpG island detection, genome assembly from BAC clones, SNP (single nucleotide polymorphism) analysis, and extensions to protein sequences for proteomic applications.



Copyright © 2003 Cray Inc.