Competition promoted development of more powerful machine-learning algorithms for visual identification of species
At a glance, would you be able to tell the difference between a donkey and a mule? A jaguar and a leopard? Most computers can’t, at least not yet, but a contest hosted by Caltech and Cornell Tech, the engineering campus of Cornell University, aimed to change that.
The two institutions teamed up to create the iNaturalist Challenge, a competition to create the best machine-learning algorithm for identifying the world’s plant and animal species. The contest was an outgrowth of the institutions’ previous work together on Visipedia, a visual encyclopedia created by a network of people and machine-learning computers that harvest image information off the internet. The technology was developed for the encyclopedia by Pietro Perona’s Vision Group in the Division of Engineering and Applied Science at Caltech and Serge Belongie’s Computer Vision Group at Cornell Tech. It has previously been used in Registree, a piece of software that identifies and catalogs urban tree species, and Merlin Bird ID, a mobile app that can identify 650 species of North American birds from photos taken by the user.
Grant Van Horn and Oisin Mac Aodha, researchers in Perona’s lab, wanted to see how much further they could push machine-learning technology, and the idea for the contest was born. From April to July, anyone who wanted to compete was given access to a database of 650,000 images of more than 5,000 species in categories that included, among others, protozoans, fungi, plants, arachnids, and mammals. Competitors used that database to develop algorithms for automatic species identification.
By the time the competition closed in July, the organizers had received 32 entries from teams and individuals. Van Horn says the winning algorithm correctly identified species in a test database of 100,000 photos 80 percent of the time when it was allowed only one guess per photo. When the algorithm was allowed five guesses per photo, its success rate rose to 95 percent.
“That’s way better than almost every person would do on a test like this,” he says. “It’s 5,000 species, so it’s pretty hard to be an expert on all of them. This is where we’re seeing the true benefit of computer vision.”
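The one-guess and five-guess success rates described above are commonly called top-1 and top-5 accuracy. The short Python sketch below illustrates how such a score could be computed. It is not the organizers' evaluation code: the function name `top_k_accuracy`, the toy species lists, and the labels are all hypothetical and chosen only for illustration.

```python
def top_k_accuracy(ranked_guesses, true_labels, k):
    """Return the fraction of photos whose true species is among the first k guesses.

    ranked_guesses: one list per photo, ordered from most to least confident.
    true_labels: the correct species for each photo, in the same order.
    """
    if len(ranked_guesses) != len(true_labels):
        raise ValueError("need exactly one list of guesses per photo")
    if not true_labels:
        raise ValueError("need at least one photo")
    hits = sum(
        1
        for guesses, truth in zip(ranked_guesses, true_labels)
        if truth in guesses[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: four photos, three ranked guesses each.
ranked_guesses = [
    ["jaguar", "leopard", "cheetah"],
    ["donkey", "mule", "horse"],
    ["leopard", "jaguar", "cheetah"],
    ["horse", "donkey", "mule"],
]
true_labels = ["jaguar", "mule", "jaguar", "mule"]

top1 = top_k_accuracy(ranked_guesses, true_labels, k=1)  # only the first guess counts
top3 = top_k_accuracy(ranked_guesses, true_labels, k=3)  # any of three guesses counts
print(top1, top3)
```

Allowing more guesses can only keep the score the same or raise it, which is why the five-guess figure in the contest is higher than the one-guess figure.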
Still, Mac Aodha says, while impressive, the algorithms developed for the contest rely on traditional neural network technology and are far from perfect. He hopes future contests will spur something new in the machine-learning field.
“From the competition results we can see that computers perform very well when they have access to many photographs of a given species. However, to truly reach expert human ability, we need to design algorithms that are capable of understanding new species from only a small number of images,” he says.
“Building on our success in identifying birds with Visipedia, we challenged technologists to develop algorithms to recognize even more species, with incredible results,” said Serge Belongie, professor of computer science at Cornell Tech and the co-founder of Visipedia. “It’s another step forward in computer vision technologies.”