Researchers from North Carolina State University have developed a new
technique that allows graphics processing units (GPUs) and central processing
units (CPUs) on a single chip to collaborate—boosting processor performance by
an average of more than 20%.
“Chip manufacturers are now creating processors that have a ‘fused
architecture,’ meaning that they include CPUs and GPUs on a single chip,” says Huiyang
Zhou, PhD, an associate professor of electrical and computer engineering who
co-authored a paper on the research. “This approach decreases manufacturing
costs and makes computers more energy efficient. However, the CPU cores and GPU
cores still work almost exclusively on separate functions. They rarely
collaborate to execute any given program, so they aren’t as efficient as they
could be. That’s the issue we’re trying to resolve.”
GPUs were initially designed to execute graphics programs, and they are
capable of executing many individual functions very quickly. CPUs, or the
“brains” of a computer, have less computational power, but are better able to
perform more complex tasks.
“Our approach is to allow the GPU cores to execute computational functions,
and have CPU cores pre-fetch the data the GPUs will need from off-chip main
memory,” Zhou says.
“This is more efficient because it allows CPUs and GPUs to do what they are
good at. GPUs are good at performing computations. CPUs are good at making
decisions and flexible data retrieval.”
In other words, CPUs and GPUs fetch data from off-chip main memory at
approximately the same speed, but GPUs can execute the functions that use that
data more quickly. So, if a CPU determines in advance what data a GPU will need
and fetches it from off-chip main memory, the GPU can focus on executing the
functions themselves, and the overall process takes less time.
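
The intuition behind overlapping memory fetches with computation can be
illustrated with a toy timing model. This is only a sketch and not the
researchers' implementation: the function names, the chunk count, and the
fetch and compute costs are all hypothetical, and the model assumes every
chunk costs the same and that a helper core can always fetch the next chunk
while the current one is being processed.

```python
def time_without_prefetch(n_chunks, fetch, compute):
    """Compute cores fetch each chunk themselves, then compute on it.

    Every chunk pays the full fetch cost plus the full compute cost.
    """
    return n_chunks * (fetch + compute)


def time_with_prefetch(n_chunks, fetch, compute):
    """A helper core fetches chunk i+1 while chunk i is being computed on.

    The first fetch cannot be hidden. After that, each step costs whichever
    of fetch or compute is slower, and the last chunk still has to be computed.
    """
    if n_chunks == 0:
        return 0
    return fetch + (n_chunks - 1) * max(fetch, compute) + compute


if __name__ == "__main__":
    # Hypothetical costs in arbitrary time units (not measured values).
    n, fetch, compute = 100, 2, 5
    base = time_without_prefetch(n, fetch, compute)
    overlapped = time_with_prefetch(n, fetch, compute)
    print("no prefetch:", base)
    print("with prefetch:", overlapped)
    print("speedup:", base / overlapped)
```

With these made-up numbers the model gives 700 time units without
prefetching and 502 with it. The size of the gain depends entirely on the
assumed fetch and compute costs, so it should not be read as a reproduction of
the figures reported by Zhou's team.
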
In preliminary testing, Zhou’s team found that its new approach improved
fused processor performance by an average of 21.4%.
This approach has not been possible in the past, Zhou adds, because CPUs and
GPUs were located on separate chips.