A self-driving car has a split second to determine whether the image it sees is a person and whether that person has stepped on or off a curb.
A mistake, of course, would have deadly consequences.
Such is the increasingly challenging world of artificial intelligence and machine learning, where demand is growing for computers and robots that can instantly analyze images, reason, and solve problems. One of the biggest challenges in AI is that computers simply can’t do the computations quickly and accurately enough, especially when it comes to understanding fast-changing images.
Washington State University researchers have recently developed a computer architecture that achieves accuracy similar to that of conventional graphics processing units (GPUs) but can run more than 50 times faster. Led by graduate student Biresh Joardar and Partha Pande, Boeing Centennial Chair in Computer Engineering, the researchers presented their work virtually this month at the prestigious Design, Automation and Test in Europe Conference (DATE) in Grenoble, France.
In their work, the researchers developed a heterogeneous computer architecture that combines the commonly used GPU-based platform with an emerging memory technology called resistive random-access memory (ReRAM). Their architecture aims to improve both the speed and the accuracy of a particular class of machine learning algorithm, called deep neural networks (DNNs), when used for image segmentation. Image segmentation is already used in technology for self-driving cars and could have many additional applications in medical technology, including robotic surgery.
GPUs are popular in today’s computer technology, including in phones, personal computers, and game consoles. By performing their processing operations in parallel, they can quickly complete computationally complex tasks such as image processing. However, they require a lot of power to run and have limited memory bandwidth. As the computational demands of complex artificial intelligence increase, the power and execution time that GPUs require are also increasing. Image segmentation tasks also involve heavy data communications traffic that could create new performance bottlenecks.
“Additional data traffic will put even more stress on the already limited memory bandwidth,” Joardar said.
ReRAM, meanwhile, has been used to design high-performance hardware architectures. The ReRAM-based architectures are more energy efficient and solve the memory and bandwidth issues of GPUs. However, these architectures have numerous challenges that can lead to inaccurate predictions. So, in the case of the self-driving car trying to instantly determine where several people are as it moves down the street, “if you do these computations too slowly or wrong, someone might get hurt,” Joardar said.
To address these challenges, the researchers proposed a new heterogeneous architecture that combines the GPU and ReRAM. In particular, they took advantage of a mathematical technique called stochastic rounding in their architecture to improve the accuracy of the DNN model by adding a bit of randomness.
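For readers curious about what stochastic rounding means in general, the following is a minimal Python sketch of the generic technique. It is not the researchers' implementation, and the article does not describe how the technique is applied inside their architecture. The function names and the `step` parameter are hypothetical, chosen only for illustration. The idea is that a value is rounded up with a probability equal to its fractional remainder, so that small values are preserved on average rather than always rounded away.

```python
import math
import random


def round_nearest(x, step=1.0):
    """Conventional rounding of x to the nearest multiple of step."""
    return math.floor(x / step + 0.5) * step


def stochastic_round(x, step=1.0, rng=random):
    """Round x to a multiple of step, choosing the upper neighbor with
    probability equal to the fractional remainder (illustrative sketch)."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    # Round up with probability frac, otherwise round down.
    rounded = lower + (1 if rng.random() < frac else 0)
    return rounded * step


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    n = 10_000
    update = 0.3  # a small value, below half of the rounding step

    # Conventional rounding discards the small value every single time.
    nearest_total = sum(round_nearest(update) for _ in range(n))

    # Stochastic rounding keeps it on average: about 30% of calls round up.
    stochastic_total = sum(stochastic_round(update, rng=rng) for _ in range(n))

    print("true total:      ", update * n)
    print("round-to-nearest:", nearest_total)
    print("stochastic:      ", stochastic_total)
```

In this toy example, conventional rounding loses every small value, while the randomly rounded results stay close to the true total when summed. That averaging effect is the general reason a bit of randomness can help preserve accuracy in low-precision hardware.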
In model simulations, their combined architecture achieved near-GPU levels of accuracy while running 33 times faster than conventional GPUs on average and up to 53 times faster.
The research was supported by the National Science Foundation and the U.S. Army.