AI hardware needs to become more brain-like to meet the growing energy demands of real-world applications, according to researchers.
In a study published in Frontiers in Science, scientists from Purdue University and the Georgia Institute of Technology have outlined practical approaches to overcome the limitations of modern computing hardware.
Conventional computers are based on the von Neumann architecture, with separate processor and memory units. Every time data is needed, information must be shuttled back and forth between the two. This bottleneck, known as the memory wall, accounts for most of the delay and energy consumption in AI processing.
The researchers argue that integrating processing capability within or next to the memory unit would help to overcome this bottleneck. Achieving this could allow new types of algorithms that make AI applications feasible without relying on data- and energy-intensive cloud computing.
"Language processing models have grown 5,000-fold in size over the last four years. This alarmingly rapid expansion makes it crucial that AI is as efficient as possible. That means fundamentally rethinking how computers are designed," said Kaushik Roy, Professor of Electrical and Computer Engineering at Purdue University and lead author of the study.
Inspired by the brain
One way to avoid the memory wall problem and make AI computing more efficient is to take inspiration from our brains.
When a neuron receives activation signals from other neurons, it builds up an electrical charge, known as the membrane potential. If this potential reaches a certain threshold, the neuron sends its own signal onwards. In doing so, a neuron stores and processes information in the same place and only communicates when something changes.
This has inspired new AI algorithms known as spiking neural networks (SNNs), which can respond efficiently to irregular and occasional events. This contrasts with traditional AI networks, which excel at data-intensive tasks like face recognition, image classification, image analysis, and 3D reconstruction.
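The integrate-and-fire behavior described above can be sketched in a few lines of code. This is a minimal, illustrative leaky integrate-and-fire (LIF) neuron; the parameter names and values are assumptions for the example, not taken from the study.

```python
# Minimal sketch of a leaky integrate-and-fire (LIF) neuron, the basic
# unit of a spiking neural network. Parameter values are illustrative.

def lif_neuron(input_spikes, threshold=1.0, leak=0.9, weight=0.5):
    """Integrate weighted input spikes; emit a spike when the membrane
    potential crosses the threshold, then reset."""
    potential = 0.0
    output_spikes = []
    for spike in input_spikes:
        potential = leak * potential + weight * spike  # integrate with leak
        if potential >= threshold:
            output_spikes.append(1)  # fire and communicate onwards
            potential = 0.0          # reset after spiking
        else:
            output_spikes.append(0)  # stay silent; nothing is transmitted
    return output_spikes

# The neuron fires only after enough activity accumulates, so sparse
# input produces even sparser output.
print(lif_neuron([1, 1, 1, 0, 0, 1, 1, 1]))  # → [0, 0, 1, 0, 0, 0, 0, 1]
```

Note how state (the membrane potential) and computation (the threshold check) live in the same place, which is exactly the property the researchers want to mirror in hardware.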
"The capabilities of the human brain have long been an inspiration for AI systems. Machine learning algorithms came from the brain's ability to learn and generalize from input data. Now we want to take this to the next level and recreate the brain's efficient processing mechanisms," said Adarsh Kosta, co-author and researcher at Purdue University.
AI on the fly
This neuro-inspired approach could allow AI applications to expand beyond large-scale data centers.
For example, an autonomous drone in a search and rescue scenario must detect its surroundings, identify and track objects, make decisions, and plan its actions in real time. Relying on cloud-based computing introduces too much lag, so these processes must run onboard as efficiently as possible.
In such scenarios, it's essential for computer systems to be lightweight and low-power. One efficiency gain is for drones to use event-based cameras. Unlike video cameras, which record a regular stream of frames, these sensors send data only when a pixel detects a sufficient change in brightness.
Event-based cameras use less data and power; however, their intermittent and time-dependent outputs aren't well-suited to traditional processing units. SNN algorithms, just like the brain, are highly efficient at responding to sequences of events, making the most of these sparse signals.
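The pairing of sparse events with spiking computation can be sketched as follows. Event cameras typically emit (timestamp, x, y, polarity) tuples; the function below, a hypothetical illustration with made-up names and thresholds, updates neuron state only where events actually arrive, so the work scales with activity rather than with the full pixel array.

```python
# Sketch of event-driven processing: a pixel-mapped spiking layer that
# updates state only when an event arrives, skipping idle pixels
# entirely. All names and parameters are illustrative assumptions.

from collections import defaultdict

THRESHOLD = 2.0

def process_events(events):
    """events: iterable of (timestamp, x, y, polarity) tuples, the
    typical output format of an event camera. Returns spike locations."""
    potential = defaultdict(float)  # state exists only where events occur
    spikes = []
    for t, x, y, polarity in events:
        potential[(x, y)] += 1.0 if polarity else -1.0
        if potential[(x, y)] >= THRESHOLD:
            spikes.append((t, x, y))  # this "neuron" fires
            potential[(x, y)] = 0.0   # reset after spiking
    return spikes

# Three events total: the computation touches two pixels, not all
# 640x480 of a conventional frame.
events = [(0, 10, 20, 1), (1, 10, 20, 1), (2, 300, 40, 1)]
print(process_events(events))  # → [(1, 10, 20)]
```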
This approach could make a drone more capable or give it a longer range. Greater efficiency would also benefit AI applications in a wide range of areas, such as transportation and medical devices.
"AI is one of the most transformative technologies of the 21st century. However, to move it out of data centers and into the real world, we need to dramatically reduce its energy use. With less data transfer and more efficient processing, AI can fit into small, affordable devices with batteries that last longer," said Tanvi Sharma, co-author and researcher at Purdue University.
Hardware solutions
Successfully applying SNNs will require specialized hardware that overcomes the memory wall.
Compute-in-memory (CIM) systems carry out the calculations where the data is stored, reducing expensive data movement. This is ideal for SNN algorithms, which need to repeatedly refer to the memory to update and check the membrane potentials over time.
There are two main ways of achieving this. Analog methods use electrical currents flowing through the memory cells to perform calculations. Digital methods use standard digital logic (0s and 1s) inside or next to the memory array. Digital methods are more accurate but use more energy than analog ones.
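The analog approach can be understood through a simple model. In a memory crossbar, weights are stored as cell conductances; applying input voltages to the rows makes each column wire sum the resulting currents (Ohm's and Kirchhoff's laws), performing a matrix-vector multiply where the data is stored. The sketch below simulates that behavior in software purely for illustration; the values are arbitrary.

```python
# Software model of an analog compute-in-memory crossbar performing a
# matrix-vector multiply "where the data lives". Weights are stored as
# cell conductances G; input voltages V drive the rows; each column
# current is the dot product sum(G * V). Values are illustrative.

def crossbar_mvm(conductances, voltages):
    """conductances: rows x cols matrix of stored weights.
    voltages: one input value per row. Returns per-column currents."""
    cols = len(conductances[0])
    currents = [0.0] * cols
    for row, v in zip(conductances, voltages):
        for j in range(cols):
            currents[j] += row[j] * v  # I = G * V, summed on the column wire
    return currents

G = [[2, 5],
     [4, 1]]
V = [1, 2]
print(crossbar_mvm(G, V))  # → [10.0, 7.0]
```

The key point is that no weight ever leaves the memory array: the multiply-accumulate happens in place, which is what removes the data movement that dominates conventional AI workloads.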
Several technologies could deliver CIM systems, but none is a clear winner in all cases. Instead, the authors emphasize the value of combining approaches and designing the algorithms, circuits, and memory together, so each application uses the most appropriate building blocks.
"Co-designing the hardware and algorithms together is the only way to break through the memory wall and deliver fast, lightweight, low-power AI," said Roy. "This collaborative design approach could also create platforms that are far more versatile by switching between traditional AI networks and neuro-inspired networks depending on the application."