New research using a computational model offers insight into how the human brain’s face-recognition mechanism may work.
Researchers at the Massachusetts Institute of Technology have designed a machine-learning system that implements a model of face recognition, one that appears to capture aspects of human neurology that previous models have missed, and trained it to recognize particular faces by feeding it a battery of sample images.
The trained system included an intermediate processing step that represented a face’s degree of rotation, say, 45 degrees from straight ahead, but not the direction of rotation, left or right. This property was not built into the system; it emerged spontaneously from the training process. And it duplicates an experimentally observed feature of the primate face-processing mechanism, suggesting that the system and the brain are doing something similar.
Tomaso Poggio, a professor of brain and cognitive sciences at MIT and director of the Center for Brains, Minds and Machines (CBMM), a multi-institution research consortium funded by the National Science Foundation, cautioned against reading too much into the results.
“This is not a proof that we understand what’s going on,” Poggio said in a statement. “Models are kind of cartoons of reality, especially in biology. So I would be surprised if things turn out to be this simple. But I think it’s strong evidence that we are on the right track.”
The research also includes a mathematical proof that the particular type of machine-learning system used will inevitably yield intermediate representations that are indifferent to angle of rotation.
Poggio said he believes that the brain must produce “invariant” representations of faces and other objects, meaning representations that are indifferent to objects’ orientation in space, their distance from the viewer or their location in the visual field.
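To make the idea of invariance concrete, here is a classic toy example, not drawn from the paper: the magnitude of a signal’s Fourier spectrum is unchanged when the pattern is shifted, just as an invariant face representation should be unchanged when a face moves across the visual field.

```python
import numpy as np

# A generic illustration of an invariant representation (not the paper's):
# the Fourier magnitude spectrum ignores where a pattern sits, much as an
# invariant face representation should ignore a face's location.

rng = np.random.default_rng(3)
signal = rng.standard_normal(64)       # stand-in for an image row
shifted = np.roll(signal, 17)          # same pattern, different location

rep = np.abs(np.fft.fft(signal))       # translation-invariant representation
rep_shifted = np.abs(np.fft.fft(shifted))

assert np.allclose(rep, rep_shifted)   # identical despite the shift
```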
In the primate brain, the neurons in an intermediate face-processing region appear to be mirror symmetric: they are sensitive to the angle of a face’s rotation but not to its direction, responding the same way whether a face is turned 45 degrees to the left or 45 degrees to the right.
“It was not a model that was trying to explain mirror symmetry,” Poggio said. “This model was trying to explain invariance, and in the process, there is this other property that pops out.”
The machine-learning system is a neural network, so called because it roughly approximates the architecture of the human brain: it consists of very simple processing units, arranged into layers, that are densely connected to the units, or nodes, in the layers above and below them.
Data are fed into the bottom layer of the network, which processes them in some way and passes them to the next layer, and so on.
The output of the top layer is correlated with some classification criterion, such as correctly determining whether a given image depicts a particular person.
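That layered, densely connected flow can be sketched in a few lines of code. The layer sizes and the rectifying nonlinearity below are arbitrary stand-ins, not details of the researchers’ network:

```python
import numpy as np

# A minimal sketch of a layered neural network; all sizes are made up.
rng = np.random.default_rng(1)
layer_sizes = [256, 128, 64, 10]       # bottom layer -> hidden layers -> top

# Dense weight matrices connecting each layer to the one above it.
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    # Each layer processes its input and feeds the result to the next layer.
    for w in weights[:-1]:
        x = np.maximum(0.0, x @ w)     # dense connections plus a nonlinearity
    return x @ weights[-1]             # top layer: one score per identity

image = rng.standard_normal(256)       # stand-in for a flattened face image
scores = forward(image)
print(scores.argmax())                 # index of the best-matching identity
```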
During training, the researchers used a variation on Hebb’s rule, often summarized as “neurons that fire together wire together”: as the weights of the connections between nodes are adjusted to produce more accurate outputs, nodes that react in concert to particular stimuli end up contributing more to the final output than nodes that react independently, or not at all.
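One standard formalization of that idea is Oja’s rule, which adds a decay term to the basic “fire together, wire together” product so the weights stay bounded. The sketch below is a generic illustration of this style of update, not the paper’s specific variant:

```python
import numpy as np

# Hebbian learning via Oja's rule (a generic variant, not the paper's).
rng = np.random.default_rng(2)
inputs = rng.standard_normal((5000, 20))
inputs[:, 0] *= 3.0                    # one input direction dominates

w = rng.standard_normal(20)
w /= np.linalg.norm(w)
lr = 0.005

for x in inputs:
    y = w @ x                          # postsynaptic response
    w += lr * y * (x - y * w)          # Hebbian term y*x plus Oja's decay

# Inputs that "fire together" along the dominant direction come to dominate
# the learned weights: w aligns with the top principal component (axis 0).
print(abs(w[0]))                       # close to 1
```

A useful consequence, and the reason principal component analysis can stand in for Hebbian learning in the sketch further below, is that updates of this kind converge to the principal components of their input.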
This approach ended up yielding invariant representations, but the middle layers of the network also duplicated the mirror-symmetric responses of the intermediate visual-processing regions of the primate brain.
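That emergence can be reproduced in miniature. In the sketch below every ingredient is an illustrative assumption: random vectors stand in for faces, reversing a vector stands in for a left-right reflection, and PCA stands in for Hebbian learning. Because each training view appears alongside its mirror image, the learned units come out either symmetric or antisymmetric under reflection, and their squared responses depend only on a face’s angle of rotation, not its direction:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
angles = np.linspace(-90, 90, 37)        # symmetric grid of viewing angles

def mirror(v):
    # Stand-in for a left-right reflection: reverse the feature vector.
    return v[::-1]

def view(face, theta):
    # Toy view of a face at angle theta, constructed so that the view at
    # -theta is exactly the mirror image of the view at +theta.
    even = 0.5 * (face + mirror(face))   # reflection-symmetric part
    odd = 0.5 * (face - mirror(face))    # reflection-antisymmetric part
    t = np.radians(theta)
    return np.cos(t) * even + np.sin(t) * odd

# Training set: many faces, each seen over the full symmetric range of views.
faces = rng.standard_normal((200, dim))
X = np.array([view(f, a) for f in faces for a in angles])
X -= X.mean(axis=0)

# PCA as a stand-in for Hebbian learning, which converges to the same units.
units = np.linalg.svd(X, full_matrices=False)[2]

# Tuning curve of one learned unit, using a squared (energy-style) response.
test_face = rng.standard_normal(dim)
tuning = np.array([(units[0] @ view(test_face, a)) ** 2 for a in angles])

# Mirror symmetry: the response at -theta equals the response at +theta,
# even though the direction of rotation (left vs. right) differs.
assert np.allclose(tuning, tuning[::-1], atol=1e-6)
```

In this toy setup both ingredients matter: drop the mirrored training views or the squared response, and the symmetry of the tuning curve disappears.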
Poggio is joined on the paper by several other members of both the CBMM and the McGovern Institute: first author Joel Leibo, a researcher at Google DeepMind, who earned his Ph.D. in brain and cognitive sciences from MIT with Poggio as his advisor; Qianli Liao, an MIT graduate student in electrical engineering and computer science; Fabio Anselmi, a postdoc in the IIT@MIT Laboratory for Computational and Statistical Learning, a joint venture of MIT and the Italian Institute of Technology; and Winrich Freiwald, an associate professor at the Rockefeller University.