In the largest computer systems project ever undertaken at EPFL, an international team of researchers has come up with a new way of building computers to help tackle the increasing challenges faced by data centers.
Everybody knows the feeling of having too much to think about at once, but imagine what it is like for data center servers. Very often the same machine has to cater to the demands of many different programs, at the call of many different users, all at the same time.
Not only is there a queue for the available server memory, but there are security checks: each request has to be carefully treated so that users will not see each other's data. The result? Bottlenecks. Servers run slowly and resources are not used efficiently.
To try to solve this problem, five years ago, Babak Falsafi, Full Professor at the School of Computer and Communication Sciences, formed an international, multi-talented team to create Midgard, a new approach to the use of virtual memory in current operating systems.
"The traditional computer system principles to manage memory date back to the 1960s," explained Falsafi. "But for a long time, Apple has been redesigning computer hardware and software specifically for its iPhones and Mac laptops/desktops. This is what designers of data center servers should have done years ago. Traditional computer system principles are not suited to the incredibly high demands of today's cloud services."
Computer programs need to figure out where to perform their computation, and virtual memory is the computer's way of tricking a program into believing it has a lot more space to perform computation than is actually available. Moreover, the space is shared among programs without them knowing about it. It is virtual memory's job to direct every program to the space allocated to it and to run identity and access checks whenever the space is used. This ensures that programs don't run into each other, but it makes for a slow process.
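To make the idea concrete, here is a minimal sketch in Python of what conventional virtual memory does on every access. It is a toy model for illustration, not Midgard and not how any real operating system is written: the names `AddressSpace`, `map_page` and `translate`, and the 4096-byte page size, are assumptions chosen for this example.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # bytes per page; an assumed, commonly used size


class PageFault(Exception):
    """Raised when a program touches an address it has no mapping for."""


@dataclass
class PageEntry:
    frame: int       # physical frame that backs this virtual page
    writable: bool   # whether the owning program may write to it


@dataclass
class AddressSpace:
    """One program's private view of memory: virtual page -> PageEntry."""
    pages: dict = field(default_factory=dict)

    def map_page(self, vpn, frame, writable=True):
        # Give this program a virtual page backed by a physical frame.
        self.pages[vpn] = PageEntry(frame, writable)

    def translate(self, vaddr, write=False):
        # Split the virtual address into a page number and an offset.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        entry = self.pages.get(vpn)
        if entry is None:
            raise PageFault(f"virtual page {vpn} is not mapped")
        # The access check runs on every single memory access.
        if write and not entry.writable:
            raise PermissionError(f"virtual page {vpn} is read-only")
        return entry.frame * PAGE_SIZE + offset


# Two programs use the same virtual address but land in different
# physical memory, so neither can see the other's data.
prog_a = AddressSpace()
prog_b = AddressSpace()
prog_a.map_page(vpn=1, frame=7)
prog_b.map_page(vpn=1, frame=3, writable=False)

addr_a = prog_a.translate(0x1004)
addr_b = prog_b.translate(0x1004)
```

Because the lookup and the permission check sit on the path of every access, their cost is paid constantly, which is why this step becomes a bottleneck when many programs share a server.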
To boost the performance of servers, the Midgard team set about redesigning virtual memory. They compartmentalized virtual memory so that programs can quickly find their workspace and hardware can perform the access checks with little delay or energy cost.
Midgard not only reduces the bottlenecks in virtual memory while maintaining the high levels of security required by a data center, it is also compatible with modern software development standards (for phones, laptops/desktops and servers) and transparent to application developers.
The benefits are not only faster performance but also much higher efficiency, which plays a role in the all-important battle to reduce the carbon footprint of data centers.
"In data centers, memory is a shared hardware component, and accounts for 50% of the server cost, having grown by 12 orders of magnitude in capacity since the late 1960s," outlined Falsafi.
But servers are not always utilized efficiently in data centers. Independent reports indicate that more than half of the memory that cloud providers rent to customers is not used.
"Microsoft reported recently that because of how they rent servers to cloud customers, 20% of their memory remains stranded (not rented). We're trying to create technologies to help operators, and their customers, use their hardware more efficiently," he continued.
As the five-year Midgard initiative reaches its conclusion, it can boast a raft of publications, an in-house demonstrator showing order-of-magnitude gains, a showcase presentation at Intel and, of course, talk of a sequel.
"We are now looking at rack-scale computing," explains Falsafi, "with a purpose-built collection of servers that behave as a single computer in a data center. With emerging network fabrics that connect an entire rack of servers together, cloud services can use all hardware, not just memory, more efficiently in data centers. We want to move on with what we have discovered in Midgard and develop the best possible strategy to improve the hardware utilization of future data centers. This is the goal that drives us on."
The principal investigators of Midgard are Professor Babak Falsafi, Professor Mathias Payer and Professor David Atienza, colleagues from the EPFL EcoCloud Center, as well as Professor Abhishek Bhattacharjee at Yale and Professor Boris Grot at the University of Edinburgh. They have been supported by associated EPFL faculty and researchers in Switzerland and around the world.