Parallel Data Lab Receives Computing Cluster from Los Alamos National Lab

Carnegie Mellon University has received a supercomputer from Los Alamos National Lab (LANL) that will be reconstructed into a computing cluster operated by the Parallel Data Lab (PDL) and housed in the Data Center Observatory. The new cluster will augment the existing Narwhal cluster, which also came from LANL and is built from parts of the decommissioned Roadrunner supercomputer, the fastest supercomputer in the world from June 2008 to June 2009.

This new supercomputer, tentatively named Wolf, will be an important part of educating CMU’s next generation of computer science professionals, researchers and educators. The system was recently retired from LANL’s open institutional computing environment. While no longer efficient for simulation science, it still has high value as a training tool and for computer science research. Wolf is made up of 616 computing nodes, each containing two eight-core Intel Xeon Sandy Bridge processors, for a total of 9,856 processing cores across the cluster. The cluster interconnect is QDR InfiniBand, providing a network that is 30 times faster than Narwhal’s. Altogether, it will be capable of about 200 teraflops, where one teraflop represents one trillion floating-point operations per second.
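For readers who want to check the figures, the core count follows directly from the node configuration quoted above. The short Python sketch below reproduces that arithmetic and, as a rough illustration only, divides the approximate 200-teraflop figure evenly across all cores. The function names are hypothetical, and the even split is an assumption made for illustration, not a measured per-core rate.

```python
# Figures quoted in the article.
NODES = 616
SOCKETS_PER_NODE = 2      # two processors per node
CORES_PER_SOCKET = 8      # eight-core Sandy Bridge Xeon
PEAK_TERAFLOPS = 200      # "about 200 teraflops" (approximate)


def total_cores(nodes, sockets_per_node, cores_per_socket):
    """Total processing cores across the cluster."""
    return nodes * sockets_per_node * cores_per_socket


def gigaflops_per_core(peak_teraflops, cores):
    """Cluster-wide rate divided evenly per core (illustrative assumption)."""
    return peak_teraflops * 1000 / cores


cores = total_cores(NODES, SOCKETS_PER_NODE, CORES_PER_SOCKET)
print(f"Total cores: {cores:,}")
print(f"Approx. gigaflops per core: {gigaflops_per_core(PEAK_TERAFLOPS, cores):.1f}")
```

The first line of output matches the 9,856 cores stated in the article. The per-core figure is only as precise as the rounded 200-teraflop total.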

“Wolf’s processing cores are each significantly faster than the previous system, and it consists of about 50 percent more computing nodes,” said George Amvrosiadis, assistant research professor of electrical and computer engineering and the Parallel Data Lab (PDL). “We will be retiring the Narwhal nodes. Our experienced PDL team, with Jason Boles leading the installation effort, is doing this gradually to make sure everything works as expected.”

In the five years since they received Narwhal from LANL, the researchers of the Parallel Data Lab have developed several projects with the computing cluster in service of educating the world’s next thought leaders in areas of computer science including scalable storage, cloud computing, machine learning and operating systems.
