NSF Grant Boosts Data Center Fortification

Pennsylvania State University

Data centers that support banking, defense, health care and business use over 4% of the world's electricity every year, a number that will increase to over 13% in just the next three years due to the increasing prevalence of artificial intelligence (AI), according to Wangda Zuo, professor of architectural engineering.

To stay operational, these energy-intensive centers rely on immense cooling systems, which are susceptible to what Rômulo Meira-Góes, assistant professor of electrical engineering, calls "cascading failures." Tiny issues like a single motor or valve failing can quickly compound, eventually leading to catastrophic overheating of the whole facility.

To proactively identify potential issues facing data center cooling, the duo are leading a team in developing a novel fault detection and diagnostics (FDD) system. The system, the researchers said, will also be equipped to prepare for and resist unexpected events such as malicious cyberattacks long before they take data centers' cooling systems offline. The project is funded with a three-year, $340,000 grant from the U.S. National Science Foundation.

According to Meira-Góes, principal investigator on the project, FDD methods offer a streamlined way to detect and solve issues facing complex mechanical networks like data center cooling systems. Although current methods use advanced statistical modeling and AI-powered machine learning algorithms to detect potential failures, they have clear inefficiencies.

"It requires a ton of field expertise in order to determine problems, and it can be easy for a technician to miss an issue if only a single piece of hardware fails," Meira-Góes said. "Although implementing AI can help, AI cannot accurately predict an event without experiencing it, meaning it struggles to effectively forecast and prepare for rare or unanticipated events like cyberattacks."

The team's system will balance proactive prediction and real-time hardware monitoring in data center cooling, using a digital modeling tool alongside a testbed for physical hardware, Meira-Góes said. The digital tool will simulate the entire facility, identifying specific areas prone to failure as well as the potential cascading or unexpected issues that could disrupt a data center's cooling system, while the testbed will allow for proactive monitoring and replacement of physical equipment around the facility.

According to Zuo, co-principal investigator on the project, creating an FDD method that incorporates both digital predictions and hardware testing adds a needed level of redundancy to the cooling systems of these critical facilities.

"In a data center, you have such a variety of system types and operations that it is impractical to monitor and plan for cooling problems using just hardware," Zuo said. "By creating a digital model of the facility and offering hardware testing simultaneously through one FDD method, we can substantially increase prepared mitigations for a variety of situations."

In addition to increasing repair efficiency, the new method is intended to improve data center protection across the United States. The prevalence of cyberattacks is a growing concern, Meira-Góes said, so reliable cooling backups are necessary to stop data centers from overheating and frying millions of dollars' worth of equipment in the event of an attack.

"Whether a problem is an equipment failure or an outside force actively tampering with the system impacts how technicians should approach the problem," Meira-Góes said. "We are reapproaching how FDD is handled because failures and attacks are usually analyzed separately - our tool will be able to analyze and identify them both at the same time."

This project will build upon Meira-Góes's existing work analyzing and modeling systems under attack and Zuo's more than 10 years of experience researching commercial data center cooling.

"Our goal is to develop a system that will benefit commercial cooling operations for data centers for years to come," Zuo said. "This FDD method will provide an efficient and attack-resistant support system that can scale and adapt as quickly as data centers in the face of increasing demand."

Ardeshir Moftakhari, assistant professor of mechanical and aerospace engineering at Oklahoma State University, is an additional principal investigator on this work.

At Penn State, researchers are solving real problems that impact the health, safety and quality of life of people across the commonwealth, the nation and around the world.    

For decades, federal support for research has fueled innovation that makes our country safer, our industries more competitive and our economy stronger. Recent federal funding cuts threaten this progress.  

